1 Introduction

In several computational nanoelectronic problems, the spatial discretization of electro-thermal (ET) coupled problems leads to a quadratic nonlinear dynamical system of the following form:

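$$\displaystyle \begin{aligned} \mathbf{E}\mathbf{x}^{\prime}(t)&=\mathbf{A}\mathbf{x}(t)+\mathbf{x}(t)^{T}\boldsymbol{\mathcal{F}}\mathbf{x}(t)+\mathbf{B}\mathbf{u}(t),\quad \mathbf{x}(0)=\mathbf{x}_0, \end{aligned} $$
(18.1a)
$$\displaystyle \begin{aligned} \mathbf{y}(t)&=\mathbf{C}\mathbf{x}(t)+\mathbf{D}\mathbf{u}(t), \end{aligned} $$
(18.1b)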

where \(\mathbf{E}\in\mathbb{R}^{n\times n}\) is singular, indicating that (18.1) is a system of differential-algebraic equations (DAEs), \(\mathbf{A}\in\mathbb{R}^{n\times n},\, \mathbf{B}\in\mathbb{R}^{n\times m},\, \mathbf{C}\in\mathbb{R}^{\ell\times n},\, \mathbf{D}\in\mathbb{R}^{\ell\times m},\) and \(\boldsymbol{\mathcal{F}}\in\mathbb{R}^{n\times n\times n}\) is a 3-way tensor. The order \(n\) of the system is usually large. A tensor is a multi-way array, and its order is the number of its dimensions, also known as ways or modes, see [7]. Here, \(\boldsymbol{\mathcal{F}}\) is a 3-way tensor consisting of \(n\) matrices \(\mathbf{F}_i\in\mathbb{R}^{n\times n}\), and each element of \(\mathbf{x}(t)^{T}\boldsymbol{\mathcal{F}}\mathbf{x}(t)\in\mathbb{R}^{n}\) is a scalar \(\mathbf{x}(t)^{T}\mathbf{F}_i\mathbf{x}(t)\in\mathbb{R},\, i=1,\ldots,n\). The state vector \(\mathbf{x}(t)=(\mathbf{x}_v(t)^T, \mathbf{x}_T(t)^T)^T\in\mathbb{R}^{n}\) includes the nodal voltages \(\mathbf{x}_v(t)\in\mathbb{R}^{n_v}\) and the nodal temperatures \(\mathbf{x}_T(t)\in\mathbb{R}^{n_T}\). The vectors \(\mathbf{u}(t)\in\mathbb{R}^{m}\) and \(\mathbf{y}(t)\in\mathbb{R}^{\ell}\) are the inputs (excitations) and the desired outputs (observations), respectively. We assume system (18.1) to be solvable, i.e., the matrix pencil \(\lambda\mathbf{E}-\mathbf{A}\) is regular, meaning that \(\det(\lambda\mathbf{E}-\mathbf{A})\neq 0\) for some \(\lambda\in\mathbb{C}\). In practice, realistic models have a very large dimension \(n\) compared to the number of inputs \(m\) and outputs \(\ell\). Despite ever-increasing computational power, simulating these systems in acceptable time is still challenging. Model order reduction (MOR) aims to reduce the computational burden by generating reduced-order models (ROMs) that are faster and cheaper to simulate, yet accurately represent the behavior of the original large-scale system. MOR replaces (18.1) by a ROM

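$$\displaystyle \begin{aligned} \mathbf{E}_r\mathbf{x}_r^{\prime}(t)&=\mathbf{A}_r\mathbf{x}_r(t)+\mathbf{x}_r(t)^{T}\boldsymbol{\mathcal{F}}_r\mathbf{x}_r(t)+\mathbf{B}_r\mathbf{u}(t),\quad \mathbf{x}_r(0)=\mathbf{x}_{r_0}, \end{aligned} $$
(18.2a)
$$\displaystyle \begin{aligned} \mathbf{y_r}(t)&=\mathbf{C}_r\mathbf{x}_r(t)+\mathbf{D}_r\mathbf{u}(t), \end{aligned} $$
(18.2b)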

where \(\mathbf{E}_r,\mathbf{A}_r\in\mathbb{R}^{r\times r},\, \mathbf{B}_r\in\mathbb{R}^{r\times m},\, \mathbf{C}_r\in\mathbb{R}^{\ell\times r},\, \mathbf{D}_r=\mathbf{D}\), \(\mathbf{x}_r(t)\in\mathbb{R}^r,\, r\ll n,\) is the reduced state vector, and \(r\) is the order of the ROM. A good ROM should have a small approximation error \(\|\mathbf{y}-\mathbf{y}_r\|\) in a suitable norm \(\|\cdot\|\) for arbitrary inputs \(\mathbf{u}(t)\). There exist many MOR methods for nonlinear (quadratic) systems, such as snapshot and implicit moment-matching methods; see [4] for a general discussion of MOR methods. Snapshot methods are not flexible for input-dependent systems as considered in this work; hence, we consider input-independent MOR methods, such as implicit moment-matching methods [4]. However, it is well known that the efficiency of moment-matching MOR methods decreases as the number of inputs increases, since the size of the ROM is proportional to the number of inputs. Moreover, they cannot be applied directly to quadratic DAEs [3]. In general, models with numerous inputs and outputs are challenging for MOR, and most MOR methods produce large and dense ROMs for such systems. In [2], the BDSM-ET and SIP-ET methods for ET coupled problems with many inputs are proposed to overcome this problem. The BDSM-ET method is more accurate and leads to much smaller ROMs than the SIP-ET method. However, the BDSM-ET ROMs have dense matrices in the electrical subsystem and a dense 3-way tensor in the thermal subsystem, which restricts their applicability to small and medium-sized ET systems. In this paper, we modify the BDSM-ET method proposed in [2]. In Sect. 18.2, we review the BDSM-ET method. Section 18.3 introduces the proposed modification of the BDSM-ET method. Finally, we present numerical experiments and conclusions. For simplicity, we omit the argument \((t)\) of time-dependent variables in the following sections.

2 BDSM-ET Method for ET Coupled Problems with Many Inputs

In this section, we discuss the BDSM-ET method proposed in [2]. We consider a structure arising naturally in nanoelectronic coupled problems with many inputs, taking the form of (18.1) with system matrices and tensor structures as below,

with \(\mathbf {A}_v\in \mathbb {R}^{n_v\times n_v}\), \(\mathbf {B}_v\in \mathbb {R}^{n_v\times \tilde {m}}\), \(\mathbf {E}_T\in \mathbb {R}^{n_T\times n_T}\), \(\mathbf {A}_T\in \mathbb {R}^{n_T\times n_T}\), \(\mathbf {B}_T\in \mathbb {R}^{n_T\times \tilde {m}}\), \(\mathbf {C}_v\in \mathbb {R}^{\ell \times n_v}\), \(\mathbf {F}_{v_i}\in \mathbb {R}^{n_v \times n_v }\), \(\mathbf {C}_T\in \mathbb {R}^{\ell \times n_T}\), \(\mathbf {D}_v\in \mathbb {R}^{\ell \times \tilde {m}},\) \(\mathbf {D}_T\in \mathbb {R}^{\ell \times \tilde {m}},\) and \(\mathbf {u}_v,\mathbf {u}_T\in \mathbb {R}^{\tilde {m}},\, \tilde {m}=m/2.\) Substituting the above matrices and the tensor into (18.1) leads to an equivalent decoupled system given by

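$$\displaystyle \begin{aligned} \mathbf{A}_v\mathbf{x}_v&=-\mathbf{B}_v\mathbf{u}_v,{} \end{aligned} $$
(18.3a)
$$\displaystyle \begin{aligned} \mathbf{E}_T\mathbf{x}^{\prime}_T&=\mathbf{A}_T\mathbf{x}_T+\mathbf{x}_v^{T}\boldsymbol{\mathcal{F}}_T\mathbf{x}_v+\mathbf{B}_T\mathbf{u}_T,\quad \mathbf{x}_T(0)=\mathbf{x}_{T_0},{} \end{aligned} $$
(18.3b)
$$\displaystyle \begin{aligned} \mathbf{y}&=\mathbf{C}_v\mathbf{x}_v+\mathbf{C}_T\mathbf{x}_T +\mathbf{D}_v\mathbf{u}_v+ \mathbf{D}_T\mathbf{u}_T,{} \end{aligned} $$
(18.3c)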

where \(\boldsymbol{\mathcal{F}}_T\) is the 3-way tensor whose slices are the matrices \(\mathbf{F}_{v_i}\) defined earlier. Equations (18.3a) and (18.3b) are the electrical and thermal subsystems, respectively. After decoupling, the system (18.3) is a one-way coupled system. Since the solutions of the electrical and thermal subsystems can be computed consecutively, we call it decoupled, in contrast to the fully coupled original system, for which the electrical and thermal subsystems must be solved simultaneously. We can observe that the nonlinear term \(\mathbf{x}_v^{T}\boldsymbol{\mathcal{F}}_T\mathbf{x}_v\) can be treated as part of the thermal input, since it is obtained by first simulating the electrical subsystem. The output can be obtained through (18.3c). Even after the above simplification, system (18.3) is still computationally expensive to simulate. Moreover, the decoupled system still has numerous inputs for both the electrical and the thermal subsystems. MOR replaces the decoupled system (18.3) with a reduced-order decoupled system

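$$\displaystyle \begin{aligned} \mathbf{A}_{v_r}\mathbf{x}_{v_r}&=-\mathbf{B}_{v_r}\mathbf{u}_v,{} \end{aligned} $$
(18.4a)
$$\displaystyle \begin{aligned} \mathbf{E}_{T_r}\mathbf{x}^{\prime}_{T_r}&=\mathbf{A}_{T_r}\mathbf{x}_{T_r}+\mathbf{x}_{v_r}^{T}\mathbf{F}_{T_r}\mathbf{x}_{v_r}+\mathbf{B}_{T_r}\mathbf{u}_T,\quad \mathbf{x}_{T_r}(0)=\mathbf{x}_{T_{r_0}},{} \end{aligned} $$
(18.4b)
$$\displaystyle \begin{aligned} \mathbf{y}_r&=\mathbf{C}_{v_r}\mathbf{x}_{v_r}+\mathbf{C}_{T_r}\mathbf{x}_{T_r} +\mathbf{D}_{v}\mathbf{u}_{v}+ \mathbf{D}_{T}\mathbf{u}_{T},{} \end{aligned} $$
(18.4c)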

where \(\mathbf {A}_{v_r}\in \mathbb {R}^{r_v\times r_v}\), \(\mathbf {B}_{v_r}\in \mathbb {R}^{r_v\times \tilde {m}}\), \(\mathbf {E}_{T_r}\in \mathbb {R}^{r_T\times r_T}\), \(\mathbf {A}_{T_r}\in \mathbb {R}^{r_T\times r_T}\), \(\mathbf {B}_{T_r}\in \mathbb {R}^{r_T\times \tilde {m}}\), \(\mathbf {C}_{v_r}\in \mathbb {R}^{\ell \times r_v}\), with the reduced order \(r=r_v+r_T\ll n\). In order to obtain the ROM (18.4), we combine MOR techniques for algebraic and differential subsystems to obtain (18.4a) and (18.4b), respectively. MOR for general algebraic systems is still underdeveloped, and the existing methods are often application specific, such as the methods based on Gaussian elimination for algebraic systems arising from circuit simulation, see [5, 6, 9, 10] for details. MOR methods based on Gaussian elimination can be applied to algebraic systems if the input matrix \(\mathbf{B}_v\) has many zero rows, see [2]. The most challenging step is to reduce the nonlinear term in the thermal subsystem. The BDSM-ET method [2] was proposed to overcome this problem for ET coupled problems which can be written in the form of (18.3). This method combines Gaussian elimination based methods, such as SIP [10], with the BDSM method [11] to reduce the electrical and thermal subsystems, respectively. It can be briefly described as follows. Assume that \(\mathbf{B}_v\) has many zero rows; then the electrical subsystem (18.3a) can be reformulated and partitioned as

$$\displaystyle \begin{aligned} \begin{aligned} \begin{pmatrix} \mathbf{A}_{v_{11}} &\mathbf{A}_{v_{12}}\\[0.5em] \mathbf{A}_{v_{12}}^{{}^T} &\mathbf{A}_{v_{22}} \end{pmatrix}\begin{pmatrix} \mathbf{x}_{v_e}\\[0.5em]\mathbf{x}_{v_I} \end{pmatrix}=- \begin{pmatrix}\mathbf{B}_{v_{e}}\\[0.5em] 0 \end{pmatrix}\mathbf{u}_v,\quad \mathbf{y}_{v}=\begin{pmatrix} \mathbf{C}_{v_{e}} &0 \end{pmatrix}\begin{pmatrix} \mathbf{x}_{v_e}\\\mathbf{x}_{v_I}\end{pmatrix}+\mathbf{D}_v\mathbf{u}_v, \end{aligned} \end{aligned} $$
(18.5)

where \(\mathbf {x}_{v_e}\in \mathbb {R}^{n_{v_e}}\) and \(\mathbf {x}_{v_I}\in \mathbb {R}^{n_{v_I}}\) represent the port and the internal nodal voltages, respectively, and \(n_v=n_{v_e}+n_{v_I}.\) Eliminating all internal nodes from (18.5) leads to the reduced-order electrical subsystem (18.4a) with matrix coefficients

$$\displaystyle \begin{aligned} \mathbf{A}_{v_{r}}&=\left[ \mathbf{A}_{v_{11}}-\mathbf{A}_{v_{12}}\mathbf{W}_{v}\right] \in \mathbb{R}^{r_{v}\times r_{v}},\, \mathbf{B}_{v_{r}} =\mathbf{B}_{v_{e}} \in \mathbb{R}^{r_{v}\times\tilde{m}}, \, \mathbf{C}_{v_{r}} =\mathbf{C}_{v_{e}} \in \mathbb{R}^{\ell \times r_{v}}, \end{aligned} $$
(18.6)

where \(\mathbf {W}_{v}=\mathbf {A}_{v_{22}}^{-1}\mathbf {A}_{v_{12}}^{T}\in \mathbb {R}^{n_{v_I} \times n_{v_e}},\,\mathbf {x}_{v_r}= \mathbf {x}_{v_e} \in \mathbb {R}^{r_v},\) and the order of the reduced electrical subsystem is \(r_{v}=n_{v_e}\ll n_v.\) The reduction is based on the assumption that the input matrix \(\mathbf{B}_v\) is very sparse in the sense that it has far fewer nonzero rows than rows in total, i.e., \(n_{v_e} \ll n_v .\) According to [11], the reduced matrix \( \mathbf {A}_{v_{r}}\) is the Schur complement of the block \(\mathbf {A}_{v_{22}}\) of the matrix \(\mathbf{A}_v\). However, the Schur complement is dense due to the large amount of fill-in. In many cases, eliminating all internal nodes at once is not advisable, because the construction of the matrix \(\mathbf {W}_{v}=\mathbf {A}_{v_{22}}^{-1}\mathbf {A}_{v_{12}}^{T}\) responsible for the reduction becomes either costly or infeasible, since \(\mathbf {A}_{v_{22}}\) can be very large due to a large number of internal nodes. It then produces a ROM (18.6) with a very dense matrix \( \mathbf {A}_{v_{r}}\), which may even be more computationally expensive to simulate than the original model. A sparse \( \mathbf {A}_{v_{r}}\) can be obtained using sparsity control algorithms such as reduceR [9], which minimizes fill-in in the reduced matrix \( \mathbf {A}_{v_{r}}\) by using fill-in reducing reordering algorithms, e.g., approximate minimum degree (AMD) [1], so that internal nodes responsible for fill-in are placed toward the end of the elimination sequence, along with the other nodes.
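The elimination of internal nodes in (18.5) and (18.6) can be illustrated with a small dense example. The following is a minimal sketch, assuming a symmetric positive definite test matrix and NumPy; the function name `eliminate_internal_nodes` and the random test data are hypothetical and not part of the method in [2]. It ignores the fill-in control discussed above.

```python
import numpy as np

def eliminate_internal_nodes(A, B, n_e):
    """Reduce A x = -B u by eliminating the internal unknowns.

    The first n_e unknowns are port nodes; the remaining rows of B are zero.
    Returns the Schur complement A_r, the reduced input matrix B_r and W.
    """
    A11, A12 = A[:n_e, :n_e], A[:n_e, n_e:]
    A21, A22 = A[n_e:, :n_e], A[n_e:, n_e:]
    W = np.linalg.solve(A22, A21)   # W = A22^{-1} A12^T when A is symmetric
    A_r = A11 - A12 @ W             # Schur complement of the block A22
    B_r = B[:n_e, :]
    return A_r, B_r, W

# Hypothetical test data: symmetric positive definite A, inputs only at ports.
rng = np.random.default_rng(0)
n, n_e, m = 8, 3, 2
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
B = np.zeros((n, m))
B[:n_e, :] = rng.standard_normal((n_e, m))
u = rng.standard_normal(m)

x_full = np.linalg.solve(A, -B @ u)       # full system
A_r, B_r, W = eliminate_internal_nodes(A, B, n_e)
x_red = np.linalg.solve(A_r, -B_r @ u)    # reduced system, port voltages only
x_int = -W @ x_red                        # internal voltages recovered from ports
```

In this sketch the reduced system reproduces the port voltages of the full system exactly (up to rounding), which is why the elimination step introduces no approximation error in the electrical subsystem.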

The reduction in the electrical subsystem induces a reduction in the thermal subsystem through the nonlinear part, leading to

(18.7)

where is a 3-way tensor. The 3-way tensors are the partitions of the tensor corresponding to the partitions in (18.5). The next step is to apply the superposition principle to (18.7). Assume that the thermal input matrix \(\mathbf{B}_T\) has no zero columns, so that it can be split into \(\mathbf {B}_T=\sum _{i=1}^{\tilde {m}} \mathbf {B}_{T_i},\) where \(\mathbf {B}_{T_i}\in \mathbb {R}^{n_T\times \tilde {m}} \) are column rank-1 matrices defined as

$$\displaystyle \begin{aligned} \mathbf{B}_{T_i}(:,j)=\begin{cases}\mathbf{ b}_{T_i} \in \mathbb{R}^{n_T}, &\mbox{if } j=i, \\ 0, & \mbox{otherwise, } \end{cases} i=1,\ldots,\tilde{m}. \end{aligned}$$

Here and below, blkdiag denotes the block-diagonal matrix defined by the input arguments. Applying the two-stage superposition principle from [2] to (18.7) leads to a block-diagonal structured system of dimension \(\tilde {m}n_T\) given by

(18.8)

where \(\mathbb {E}_T=\mathrm {blkdiag}( \mathbf {E}_T, \ldots , \mathbf {E}_T)\in \mathbb {R}^{\tilde {m}n_T \times \tilde {m}n_T},\, \mathbb {C}_{T}=(\mathbf {C}_{T},\ldots , \mathbf {C}_{T})\in \mathbb {R}^{\ell \times \tilde {m}n_T},\, \mathbb {A}_T=\mathrm {blkdiag}( \mathbf {A}_T, \ldots , \mathbf {A}_T)\in \mathbb {R}^{\tilde {m}n_T \times \tilde {m}n_T}, \, \mathbb {B}_T =({\mathbf {B}_{T_1}}^{T}, \ldots , {\mathbf {B}_{T_{\tilde {m}}}}^{T} )^{T} \in \mathbb {R}^{\tilde {m}n_T \times \tilde {m}},\) and The corresponding reduced-order thermal subsystem in the form of (18.4b) has block-diagonal structured matrices given by

$$\displaystyle \begin{aligned} \mathbf{E}_{T_{r}}= \mathbf{V}^{\mathrm{T}} \mathbb{E}_T \mathbf{V}, \quad \mathbf{A}_{T_{r}}=\mathbf{ V}^{\mathrm{T}} \mathbb{A}_T \mathbf{V},\quad \mathbf{B}_{T_{r}}=\mathbf{ V}^{\mathrm{T}} \mathbb{B}_T,\quad \mathbf{C}_{T_{r}}= \mathbb{C}_T \mathbf{V}, \end{aligned} $$
(18.9)

where \(\mathbf {V}=\mathrm {blkdiag}(\mathbf {V}^{(1)},\ldots ,\mathbf {V}^{(\tilde {m})} ).\) The projection matrices \(\mathbf{V}^{(i)}\) can be constructed from each subsystem of (18.8) as (see [2] for details)

$$\displaystyle \begin{aligned} \mathrm{range}(\mathbf{V}^{(i)})=\mathrm{span}\{\mathbf{R}_i, \mathbf{M}\mathbf{R}_i, \ldots,\mathbf{M}^{r_{T_i}-1}\mathbf{R}_i \},\quad r_{T_i}\ll n_T,{} \end{aligned} $$
(18.10)

where \(\mathbf {M}=(s_0 \mathbf {E}_T - \mathbf {A}_T )^{-1}\mathbf {E}_T \in \mathbb {R}^{n_T\times n_T},\) and \(\mathbf {R}_i=(s_0 \mathbf {E}_T-\mathbf {A}_T)^{-1}\mathbf { b}_{T_i} \in \mathbb {R}^{n_T}, \, i=1,\ldots ,\tilde {m}.\) The nonlinear term \(\mathbf {V}^{\mathrm {T}} \left (\mathbf {x}_{v_r}^{T}\mathbf {\mathbb {F}}_{T}\mathbf {x}_{v_r}\right )\) can be reformulated as a reduced-order nonlinear term using the following proposition from [3].

Proposition 18.1

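Let \(\mathbf {W}=\left ( \mathbf {w}_{ij} \right ) \in \mathbb {R}^{n\times r}\) be a matrix, \(\mathbf {x}_r\in \mathbb {R}^r,\) and \(\tilde{\boldsymbol{\mathcal{F}}}\in\mathbb{R}^{r\times r\times n}\) be a 3D tensor with slices \(\tilde{\mathbf{F}}_i\in\mathbb{R}^{r\times r},\, i=1,\ldots,n\). Then there exists a 3D tensor \(\boldsymbol{\mathcal{F}}_r\in\mathbb{R}^{r\times r\times r}\) such that

$$\displaystyle \begin{aligned} \mathbf{W}^{T}\left(\mathbf{x}_r^{T}\tilde{\boldsymbol{\mathcal{F}}}\mathbf{x}_r\right)=\mathbf{x}_r^{T}\boldsymbol{\mathcal{F}}_r\mathbf{x}_r, \end{aligned}$$

where \(\boldsymbol{\mathcal{F}}_r=\big[\mathbf{F}_{r_1}^{T},\ldots,\mathbf{F}_{r_r}^{T}\big]^{T}\) with \(\mathbf {F}_{r_j}=\displaystyle \sum _{i=1}^{n}{\mathbf w}_{ij}\tilde {\mathbf {F}}_{i}\in \mathbb {R}^{r\times r}, \, j=1,\ldots ,r.\)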
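The statement of Proposition 18.1 can be checked numerically on random data. The sketch below assumes that a 3-way tensor is stored as a NumPy array of shape `(n, r, r)`, i.e., as a stack of its `n` slices; the helper names `quad` and `reduce_tensor` are hypothetical.

```python
import numpy as np

def quad(F, x):
    """Return the vector with entries x^T F[i] x for a tensor F of shape (n, r, r)."""
    return np.einsum("a,iab,b->i", x, F, x)

def reduce_tensor(W, F):
    """Return F_r with slices F_r[j] = sum_i W[i, j] * F[i]."""
    return np.einsum("ij,iab->jab", W, F)

# Hypothetical random test data.
rng = np.random.default_rng(0)
n, r = 5, 3
F_tilde = rng.standard_normal((n, r, r))   # n slices of size r x r
W = rng.standard_normal((n, r))
x_r = rng.standard_normal(r)

F_r = reduce_tensor(W, F_tilde)            # precomputed once, independent of x_r
lhs = W.T @ quad(F_tilde, x_r)             # project after evaluating the quadratic term
rhs = quad(F_r, x_r)                       # evaluate the reduced quadratic term directly
```

The point of the reformulation is that `F_r` is computed once, so that each evaluation of the reduced nonlinear term only involves quantities of the reduced dimension.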

From Proposition 18.1, we see that the reduced tensor with slices \(\mathbf{F}_{r_j}\) in the reduced-order nonlinear term is independent of the time \(t\) and can be precomputed before simulating the ROM. Reformulating the nonlinear term therefore further improves the efficiency of simulating the ROM. Each \(\mathbf{V}^{(i)}\) depends only on the single column \(\mathbf { b}_{T_i},\) rather than on \(\mathbf{B}_T\) with its many columns, leading to a block-wise sparse ROM as compared with standard moment-matching methods, such as PRIMA [8]. Here, the expansion point \(s_0\in \mathbb C\) in (18.10) is chosen arbitrarily. Finally, the order of the reduced thermal subsystem (18.4b) is \(r_T=\sum _{i=1}^{\tilde {m}} r_{T_i}.\) From the analysis in [2, 11], the block-diagonal system (18.8) is equivalent to (18.7), so that the block-diagonal ROM of (18.8) can be considered as the ROM of (18.7). However, the matrix \(\mathbf {A}_{v_r}\) and the tensor \(\mathbf {F}_{T_r} \) in the ROM are dense, which is still a computational and storage burden. In the next section, we propose a modified BDSM-ET method which leads to sparser ROMs.
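The construction of a single projection matrix \(\mathbf{V}^{(i)}\) in (18.10) can be sketched as an Arnoldi-type iteration. This is only an illustration under simplifying assumptions: dense NumPy matrices, a real expansion point, no breakdown of the iteration, and a hypothetical function name `krylov_basis`; the implementation used in [2] may differ.

```python
import numpy as np

def krylov_basis(E, A, b, s0, r):
    """Orthonormal basis of span{R, M R, ..., M^(r-1) R},
    with M = (s0 E - A)^{-1} E and R = (s0 E - A)^{-1} b."""
    K = s0 * E - A
    V = np.zeros((b.shape[0], r))
    w = np.linalg.solve(K, b)                        # starting vector R
    for k in range(r):
        if k > 0:
            w = np.linalg.solve(K, E @ V[:, k - 1])  # apply M to the last basis vector
        for j in range(k):                           # modified Gram-Schmidt
            w = w - (V[:, j] @ w) * V[:, j]
        V[:, k] = w / np.linalg.norm(w)
    return V

# Hypothetical single-input test system E x' = A x + b u, y = c x.
rng = np.random.default_rng(1)
n, r, s0 = 30, 4, 1.0
G = rng.standard_normal((n, n))
A = -(G @ G.T) - np.eye(n)     # symmetric negative definite
E = np.eye(n)
b = rng.standard_normal(n)
c = rng.standard_normal(n)

V = krylov_basis(E, A, b, s0, r)
E_r, A_r, b_r, c_r = V.T @ E @ V, V.T @ A @ V, V.T @ b, c @ V

# Transfer function values at the expansion point s0.
H_full = c @ np.linalg.solve(s0 * E - A, b)
H_rom = c_r @ np.linalg.solve(s0 * E_r - A_r, b_r)
```

For this linear test system, the projected model matches the transfer function of the full model at \(s_0\), which is the moment-matching property the block-wise construction relies on.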

3 Proposed Modified BDSM-ET Method

In this section, we propose the modified BDSM-ET method. Its goal is to reduce the computational and storage demands of simulating the reduced electrical subsystem and the reduced nonlinear term in the thermal subsystem obtained using the BDSM-ET method. To this end, the BDSM method in [11] can be extended to the electrical subsystem in algebraic form. Assume that the electrical input matrix \(\mathbf{B}_v\) has no zero columns, so that it can be split into \(\mathbf {B}_v= \sum _{i=1}^{\tilde {m}} \mathbf {B}_{v_i},\) where \(\mathbf {B}_{v_i}\in \mathbb {R}^{n_v\times \tilde {m}}\) is a column rank-1 matrix defined as

$$\displaystyle \begin{aligned} \mathbf{B}_{v_i}(:,j)=\begin{cases}\mathbf{ b}_{v_i} \in \mathbb{R}^{n_v}, &\mbox{if } j=i, \\ 0, & \mbox{otherwise, } \end{cases}\, i=1,\ldots,\tilde{m}. \end{aligned}$$

Applying the superposition principle to the electrical subsystem in (18.3) results in an equivalent block-diagonal algebraic system

$$\displaystyle \begin{aligned} \mathbb{A}_v\xi_{v}&=-\mathbb{B}_v\mathbf{u}_v,\quad \mathbf{y}_v=\mathbb{C}_v\xi_v,{} \end{aligned} $$
(18.11)

where \(\mathbb {A}_{v}=\mathrm {blkdiag}(\mathbf {A}_{v}, \ldots , \mathbf {A}_{v}),\, \mathbb {B}_{v} =(\mathbf {B}_{v_{1}}^{T}, \ldots , \mathbf {B}_{v_{{\tilde {m}}}}^{T} )^{T},\, \mathbb {C}_{v} =(\mathbf {C}_{v}, \ldots ,\mathbf {C}_{v}), \xi _v=(\mathbf {x}_{v_1}^{T},\ldots , \mathbf {x}_{v_{\tilde {m}}}^{T})^{T}. \) The next step is to reduce the dimension of (18.11). This is done by applying reordering and elimination techniques to each subsystem of (18.11):

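$$\displaystyle \begin{aligned} \mathbf{A}_v\mathbf{x}_{v_i}&=-\mathbf{B}_{v_i}\mathbf{u}_v,\quad \mathbf{y}_{v_i}=\mathbf{C}_v\mathbf{x}_{v_i},\quad i=1,\ldots,\tilde{m}.{} \end{aligned} $$
(18.12)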
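The superposition step behind (18.11) and (18.12) can be illustrated on a small algebraic system. The sketch below assumes dense NumPy matrices and random test data; the function name `solve_by_superposition` is hypothetical.

```python
import numpy as np

def solve_by_superposition(A, B, u):
    """Solve A x_i = -B_i u for each rank-1 column part B_i of B.

    B_i keeps only the i-th column of B and is zero elsewhere,
    so that B = B_1 + ... + B_m.
    """
    parts = []
    for i in range(B.shape[1]):
        B_i = np.zeros_like(B)
        B_i[:, i] = B[:, i]
        parts.append(np.linalg.solve(A, -B_i @ u))
    return parts

# Hypothetical test data.
rng = np.random.default_rng(0)
n_v, m = 6, 3
G = rng.standard_normal((n_v, n_v))
A_v = G @ G.T + n_v * np.eye(n_v)
B_v = rng.standard_normal((n_v, m))
u_v = rng.standard_normal(m)

x_v = np.linalg.solve(A_v, -B_v @ u_v)           # original subsystem
x_parts = solve_by_superposition(A_v, B_v, u_v)  # one solve per input column
x_sum = sum(x_parts)
```

Since the algebraic subsystem is linear, the partial solutions add up to the solution of the original subsystem; each subsystem sees only one nonzero input column, which is what makes the subsequent per-subsystem elimination sparse.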

Assuming that each \(\mathbf {B}_{v_i}\) has many zero rows, each subsystem in (18.12) can be reformulated as

$$\displaystyle \begin{aligned} \begin{aligned} \begin{pmatrix} \mathbf{A}_{v_{11}}^{(i)} &\mathbf{A}_{v_{12}}^{(i)}\\[0.5em] \mathbf{A}_{v_{12}}^{(i)^T} &\mathbf{A}_{v_{22}}^{(i)} \end{pmatrix}\begin{pmatrix} \mathbf{x}_{v_e}^{(i)}\\[0.5em] \mathbf{x}_{v_I}^{(i)} \end{pmatrix}=- \begin{pmatrix}\mathbf{B}_{v_{e}}^{(i)}\\[0.5em] 0 \end{pmatrix}\mathbf{u}_v,\quad \mathbf{y}_{v_i}=\begin{pmatrix} \mathbf{C}_{v_{e}}^{(i)} &0 \end{pmatrix}\begin{pmatrix} \mathbf{x}_{v_e}^{(i)}\\[0.5em] \mathbf{x}_{v_I}^{(i)} \end{pmatrix}, \end{aligned} \end{aligned} $$
(18.13)

where \(\mathbf {x}_{v_e}^{(i)}\in \mathbb {R}^{n_{v_e}^{(i)}}\) and \(\mathbf {x}_{v_I}^{(i)}\in \mathbb {R}^{n_{v_I}^{(i)}}\) represent the port and the internal nodal voltages, respectively, and \(n_v=n_{v_e}^{(i)}+n_{v_I}^{(i)},\, i=1,\ldots ,\tilde {m}.\) Eliminating all internal nodes from (18.13) leads to the ROM of each subsystem as below

$$\displaystyle \begin{aligned} \mathbf{A}_{v_{r_i}}\mathbf{x}_{v_{r_i}}&=\mathbf{B}_{v_{r_i}} \mathbf{u}_v,\, \mathbf{y}_{v_{r_i}}=\mathbf{C}_{v_{r_i}}\mathbf{x}_{v_{r_i}},{} \end{aligned} $$
(18.14)

where \(\mathbf {A}_{v_{r_i}}=\big [ \mathbf {A}_{v_{11}}^{(i)}-\mathbf {A}_{v_{12}}^{(i)}\mathbf {W}_{v_i}\big ] \in \mathbb {R}^{r_{v_i}\times r_{v_i}},\, \mathbf {B}_{v_{r_i}} =-\mathbf {B}_{v_{e}}^{(i)} \in \mathbb {R}^{r_{v_i}\times \tilde {m}}, \, \mathbf {C}_{v_{r_i}} =\mathbf {C}_{v_{e}}^{(i)} \in \mathbb {R}^{\ell \times r_{v_i}}, \,\mathbf {W}_{v_i}=\mathbf {A}_{v_{22}}^{{(i)}^{-1}}\mathbf {A}_{v_{12}}^{{(i)}^T}\in \mathbb {R}^{n_{v_I}^{(i)} \times n_{v_e}^{(i)} },\, \mathbf {x}_{v_{r_i}}= \mathbf {x}_{v_e}^{(i)} \in \mathbb {R}^{r_{v_i}},\) and \(r_{v_i}=n_{v_e}^{(i)}\ll n_v.\) Replacing each \(\mathbf {A}_v, \mathbf {B}_{v_i},\mathbf {C}_v, \mathbf {x}_{v_i}\) in (18.11) with \(\mathbf {A}_{v_{r_i}}, \mathbf {B}_{v_{r_i}}, \mathbf {C}_{v_{r_i}}, \mathbf {x}_{v_{r_i}}\) leads to the ROM of (18.11), which is also the ROM of (18.3a) of dimension \(r_v=\sum _{i=1}^{\tilde {m}}r_{v_i}\) and with matrices

$$\displaystyle \begin{aligned} \mathbf{A}_{v_r}&=\mathrm{blkdiag}(\mathbf{A}_{v_{r_1}}, \ldots, \mathbf{A}_{v_{r_{\tilde{m}}}}),\, \mathbf{B}_{v_{r}} =(\mathbf{B}_{v_{r_1}}^{T}, \ldots, \mathbf{B}_{v_{r_{\tilde{m}}}}^{T} )^{T},\, \mathbf{C}_{v_{r}} =(\mathbf{C}_{v_{r_1}}, \ldots,\mathbf{C}_{v_{r_{\tilde{m}}}}). \end{aligned} $$

Finally, we reduce the thermal subsystem (18.3b). Here, we propose an approach which leads to a much sparser reduced 3-way tensor than that obtained using the BDSM-ET method. Applying the superposition principle to the algebraic subsystem (18.3a) affects the thermal subsystem, since \(\mathbf{x}_v\) is replaced by \(\sum _{i=1}^{\tilde {m}}\mathbf {x}_{v_i}\) in the nonlinear part. In order to obtain a sparse tensor, we approximate the nonlinear part by neglecting the cross terms between \(\mathbf{x}_{v_i}\) and \(\mathbf{x}_{v_j}\), \(i\neq j\). From numerical simulation results, we have observed that the error introduced by this approximation is very small and can be neglected for the nanoelectronic problems considered.

Thus (18.3b) can be approximated as

$$\displaystyle \begin{aligned} \mathbf{E}_T\mathbf{x}^{\prime}_T& = \mathbf{A}_T \mathbf{x}_T+\xi_v^{T}\mathbb{F}_T\xi_v+\mathbf{B}_T \mathbf{u}_T,\quad \mathbf{x}_T(0)=\mathbf{x}_{T_0}, \end{aligned} $$
(18.15a)
$$\displaystyle \begin{aligned} \mathbf{y}_{T}&=\mathbf{C}_{T}\mathbf{x}_{T} + \mathbf{D}_{T}\mathbf{u}_{T}. \end{aligned} $$
(18.15b)

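Here we have used the equality

$$\displaystyle \begin{aligned} \sum_{j=1}^{\tilde{m}}\mathbf{x}_{v_j}^{T}\mathbf{F}_{T_i}\mathbf{x}_{v_j}=\xi_v^{T}\mathbb{F}_{T_i}\xi_v,\quad i=1,\ldots,n_T, \end{aligned}$$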

where \(\mathbb {F}_T=\big [\mathbb {F}_{T_1}^T,\ldots , \mathbb {F}_{T_{n_T}}^T \big ]^T\in \mathbb {R}^{\tilde {n}_v\times \tilde {n}_v\times n_T},\, \tilde {n}_v=\tilde {m}n_v, \, \mathbb {F}_{T_i}=\mathrm {blkdiag}(\mathbf {F}_{T_i},\ldots ,\mathbf {F}_{T_i})\in \mathbb {R}^{\tilde {n}_v\times \tilde {n}_v}, \,\mathbf {F}_{T_i}\in \mathbb {R}^{n_v\times n_v}\) and ξ v is defined as in (18.11). We can see that each reduced state in (18.14) induces a reduction in (18.15) leading to

$$\displaystyle \begin{aligned} \mathbf{E}_T\mathbf{x}^{\prime}_T& = \mathbf{A}_T \mathbf{x}_T+\xi_{v_r}^{T}\mathbb{F}_{T_r}\xi_{v_r}+\mathbf{B}_T \mathbf{u}_T,\quad \mathbf{x}_T(0)=\mathbf{x}_{T_0}, \end{aligned} $$
(18.16a)
$$\displaystyle \begin{aligned} \mathbf{y}_{T}&=\mathbf{C}_{T}\mathbf{x}_{T} + \mathbf{D}_{T}\mathbf{u}_{T}, \end{aligned} $$
(18.16b)

where \(\xi _{v_r}=(\mathbf {x}_{v_{r_1}}^{T},\ldots , \mathbf {x}_{v_{r_{\tilde {m}}}}^{T})^{T},\, \mathbb {F}_{T_r}=\big [\mathbb {F}_{T_{r_1}}^T,\ldots , \mathbb {F}_{T_{r_{n_T}}}^T \big ]^T\in \mathbb {R}^{r_v \times r_v \times n_T},\)

with \( \mathbb {F}_{T_{r_i}}=\mathrm {blkdiag}(\mathbf {F}_{T_{r_i}},\ldots ,\mathbf {F}_{T_{r_i}})\in \mathbb {R}^{r_{v}\times r_{v}},\) where \(\mathbf {F}_{T_{r_i}}=\mathbf {F}_{T_{11}}^{(i)}-\mathbf {W}_{v_i}^{T}\mathbf {F}_{T_{21}}^{(i)}-\mathbf {F}_{T_{12}}^{(i)}\mathbf {W}_{v_i}+\mathbf {W}_{v_i}^{T}\mathbf {F}_{T_{22}}^{(i)}\mathbf {W}_{v_i}\in \mathbb {R}^{r_{v_i}\times r_{v_i}}.\) Here \(\mathbf {F}_{T_{11}}^{(i)}, \mathbf {F}_{T_{12}}^{(i)}, \mathbf {F}_{T_{21}}^{(i)}, \mathbf {F}_{T_{22}}^{(i)}\) are the sub-blocks of \(\mathbf {F}_{T_i}\) partitioned according to the partition of \(\mathbf{A}_v\) in (18.13). Since the nonlinear term \(\xi_{v_r}^{T}\mathbb{F}_{T_r}\xi_{v_r}\) can be considered as an extra input for the thermal subsystem, the superposition principle still applies to the thermal subsystem. Therefore, (18.16) can also be split into \(\tilde {m}\) subsystems, and the thermal state \(\mathbf{x}_T\) of (18.16) can be reduced following the steps from (18.8) to the end of Sect. 18.2. The reduced thermal system is in the form of (18.4b), with the reduced matrices defined in (18.9).
Using Proposition 18.1, the nonlinear term \(\mathbf { V}^{\mathrm {T}}\left (\xi _{v_r}^{T}\mathbf {\mathbb {\tilde {F}}}_{T}\xi _{v_r}\right ),\) where \(\mathbf {\mathbb {\tilde {F}}}_{T}=\begin {pmatrix} \mathbb {F}_{T_r}\\0 \end {pmatrix} \in \mathbb {R}^{r_v \times r_v\times \tilde {m}n_T},\, \mathbf {\mathbb {\tilde {F}}}_{T}=\big [\mathbf {\mathbb {\tilde {F}}}_{T_1}^T,\ldots , \mathbf {\mathbb {\tilde {F}}}_{T_{\tilde {m}n_T}}^T \big ]^T\) with \(\mathbf {\mathbb {\tilde {F}}}_{T_i} \in \mathbb {R}^{r_v\times r_v},\) can also be reformulated as \(\xi _{v_r}^{T}\mathbf {\mathbb {\tilde {F}}}_{T_r}\xi _{v_r},\) where \(\mathbf {\mathbb {\tilde {F}}}_{T_r}=\big [ \tilde {\mathbb {F}}_{T_{r_1}}^{T},\ldots , \tilde {\mathbb {F}}_{T_{r_{r_T}}}^{T} \big ]^{T} \in \mathbb {R}^{r_v\times r_v\times r_T} \) with \(\tilde {\mathbb {F}}_{T_{r_j}}=\displaystyle \sum _{i=1}^{\tilde {m}n_T}v_{ij}\mathbf {\mathbb {\tilde {F}}}_{T_i}\in \mathbb {R}^{r_v\times r_v},\, j=1,\ldots ,r_T ,\, \mathbf {V}=\left (\mathbf {v}_{ij} \right ) \in \mathbb {R}^{\tilde {m}n_T\times r_T}.\) Here, the reduction matrix \(\mathbf{V}\) is defined and computed as in (18.10). Instead of a dense tensor as in the previous section, here \(\mathbf {\mathbb {\tilde {F}}}_{T_r}\) is in block-diagonal form, which is sparse. Combining the above block structured reduced electrical and thermal subsystems, we obtain the modified BDSM-ET ROMs of (18.1) in the form of (18.2) with system matrices

Hence, by construction, the modified BDSM-ET method constructs sparser ROMs than the BDSM-ET method proposed in [2], since all its reduced matrices and the tensor are block-wise sparse as also illustrated in the next section.
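The formula for a reduced tensor slice \(\mathbf{F}_{T_{r_i}}\) given in Sect. 18.3 follows from substituting the eliminated internal voltages, \(\mathbf{x}_{v_I}=-\mathbf{W}\mathbf{x}_{v_e}\), into the quadratic form. The following sketch checks this on random data; it assumes dense NumPy arrays, and the function name `reduce_slice` and the test data are hypothetical.

```python
import numpy as np

def reduce_slice(F, W, n_e):
    """Return F11 - W^T F21 - F12 W + W^T F22 W for a slice F partitioned
    into port (first n_e) and internal unknowns."""
    F11, F12 = F[:n_e, :n_e], F[:n_e, n_e:]
    F21, F22 = F[n_e:, :n_e], F[n_e:, n_e:]
    return F11 - W.T @ F21 - F12 @ W + W.T @ F22 @ W

# Hypothetical random test data.
rng = np.random.default_rng(0)
n_v, n_e = 7, 3
F = rng.standard_normal((n_v, n_v))         # one slice of the thermal tensor
W = rng.standard_normal((n_v - n_e, n_e))   # stands in for A22^{-1} A12^T
x_e = rng.standard_normal(n_e)              # port voltages
x_full = np.concatenate([x_e, -W @ x_e])    # internal voltages are -W x_e

q_full = x_full @ F @ x_full                    # quadratic form with the full slice
q_red = x_e @ reduce_slice(F, W, n_e) @ x_e     # quadratic form with the reduced slice
```

Both quadratic forms agree, so the reduced slice reproduces the nonlinear contribution of the eliminated unknowns using only the port voltages.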

4 Numerical Experiments

In this section, we illustrate the efficiency of the modified BDSM-ET method by examining three ET coupled models from industrial applications, namely, a package model (\(n = 9193\), \(m = 34\), \(\ell = 68\)), a power-MOS model (\(n = 13{,}216\), \(m = 6\), \(\ell = 12\)), and a power cell model (\(n = 925{,}286\), \(m = 408\), \(\ell = 816\)), as shown in Table 18.1. The first two ET models are nonlinear quadratic DAEs of the form (18.1), while the last model is a linear DAE, i.e., its quadratic term vanishes. Simulations of the first two ET models were done in MATLAB® Version 2012b on a laptop with 6 GB RAM and a 2.00 GHz CPU. The simulation of the power cell model was done on a Unix compute server with 1 TB main memory.

Table 18.1 Dimension comparison of ROMs, \(r = r_v + r_T\)

All these models can be reformulated into an equivalent decoupled system of the form (18.3). The numerical solutions are then obtained by applying the built-in MATLAB function mldivide (\) to the electrical subsystem and the implicit Euler integration scheme to the thermal subsystem on the desired time interval. We reduce each decoupled ET model using the PRIMA-ET, BDSM-ET and the proposed modified BDSM-ET methods. The PRIMA-ET method uses Gaussian elimination and the PRIMA method to reduce the order of the electrical and thermal subsystems, respectively, without applying the superposition principle. The other two MOR methods are as discussed in Sects. 18.2 and 18.3, respectively.
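The implicit Euler scheme used for the thermal subsystem can be sketched as follows. This is not the authors' MATLAB code; it is a minimal illustration in Python, assuming dense NumPy matrices and a hypothetical function `implicit_euler` in which `g(t)` collects everything that acts as an input, i.e., \(\mathbf{B}_T\mathbf{u}_T\) plus the quadratic term evaluated with the already computed voltages.

```python
import numpy as np

def implicit_euler(E, A, g, x0, t):
    """Integrate E x' = A x + g(t) on the (possibly nonuniform) grid t.

    Each step solves (E - h A) x_new = E x_old + h g(t_new).
    E may be singular as long as E - h A is nonsingular.
    """
    x = np.array(x0, dtype=float)
    out = [x.copy()]
    for k in range(len(t) - 1):
        h = t[k + 1] - t[k]
        x = np.linalg.solve(E - h * A, E @ x + h * g(t[k + 1]))
        out.append(x.copy())
    return np.array(out)

# Hypothetical test problem with singular E:
#   x1' = -x1,  0 = x1 - x2,  x(0) = (1, 1).
E = np.array([[1.0, 0.0], [0.0, 0.0]])
A = np.array([[-1.0, 0.0], [1.0, -1.0]])
t = np.linspace(0.0, 1.0, 1001)
X = implicit_euler(E, A, lambda s: np.zeros(2), [1.0, 1.0], t)
```

In the test problem, the first component approximates \(e^{-t}\) and the second component satisfies the algebraic constraint at every step, which shows that the scheme can be applied to a system with singular \(\mathbf{E}\) without modification.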

In Table 18.1, \(n_T\) is the order of the thermal subsystem, \(n_v\) is the order of the electrical subsystem, \(r_v\) is the order of the reduced electrical subsystem, \(r_T\) is the order of the reduced thermal subsystem, \(r = r_v + r_T\) is the order of the reduced ET coupled model, and “%Red” denotes the reduction rate in % w.r.t. the original order \(n\). In Table 18.2, “Stor. (Mb)” is the storage requirement, “Error” is the maximum relative output error in the time domain, and “Speed-up” is the speed-up factor w.r.t. the time for simulating the original large model. From Table 18.1, we can see that PRIMA-ET was unable to reduce the large model of dimension 925,286 because of memory limitations. Comparing the BDSM-ET type methods with the PRIMA-ET method, we see that both produce accurate ROMs with large speed-ups, as shown in Table 18.2. The modified BDSM-ET ROMs are computationally cheaper than the BDSM-ET ROMs, yet have almost the same accuracy, especially for large models. For the power cell model, the modified BDSM-ET ROM is 170.6 times faster than the BDSM-ET ROM. This is due to the fact that the resulting reduced model is completely block-wise sparse (see Fig. 18.4), and each block is very small w.r.t. the original order \(n\), which results in a very sparse ROM. Furthermore, it requires much less storage, since it constructs sparse ROMs, as illustrated in Figs. 18.1, 18.2, 18.3 and 18.4. In Table 18.3, we compare the off-line costs, i.e., the times needed to construct the ROMs. We observe that the modified BDSM-ET ROMs are computationally more expensive to construct than the other ROMs, and that their construction cost depends on the number of inputs.

Fig. 18.1 Comparison of the sparsity of the reduced matrix \(\mathbf{E}_r\), \(n = 9193\)

Fig. 18.2 Comparison of the sparsity of the reduced matrix \(\mathbf{A}_r\), \(n = 9193\)

Fig. 18.3 Comparison of the sparsity of the first nonzero slice of the reduced tensor

Fig. 18.4 Comparison of the sparsity of the reduced power cell matrix \(\mathbf{A}_r\), \(n = 925{,}286\)

Table 18.2 Efficiency comparison of ROMs
Table 18.3 Off-line cost comparison of ROMs

In Fig. 18.5, we compare the outputs at port 611, \(y_{611}\), given by the BDSM-ET type ROMs and the original power-cell model. The power-cell model corresponds to a power-transistor design of ONN that is intended for use in smart-power ICs. The system is excited by 408 inputs defined as follows.

$$\displaystyle \begin{aligned} u_i=\left\{ \begin{array}[c]{ll} 5, & 1\leq i \leq 200, \\ 0, & i=201,\\ 0, & i=202, t\in [0, 10^{-7}),\\ 1.5( 10^7 t -1), &i=202, t\in [10^{-7}, 2\times 10^{-7}],\\ 1.5, &i=202, t\in (2\times 10^{-7}, 5\times 10^{-7}],\\ 10, & i=203,\\ 0, & i=204,\\ 26.85 & 205 \leq i \leq 408. \end{array} \right. \end{aligned}$$

The initial condition for all electrical state variables is 0 V, and the initial condition for all thermal state variables is 26.85 °C. We used the implicit Euler integration scheme on a nonuniform grid in the time interval [0, 0.002 s] to simulate the thermal subsystem.
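For reference, the piecewise input definition above can be written as a small function. This is a sketch only; the function name `u` and the 1-based indexing are choices made here. The definition above specifies \(u_{202}\) only for \(t\in[0, 5\times 10^{-7}]\), so the sketch raises an error outside that range rather than guessing a value.

```python
def u(i, t):
    """Input u_i(t) for 1 <= i <= 408, following the piecewise definition above."""
    if 1 <= i <= 200:
        return 5.0
    if i == 201:
        return 0.0
    if i == 202:
        if 0.0 <= t < 1e-7:
            return 0.0
        if 1e-7 <= t <= 2e-7:
            return 1.5 * (1e7 * t - 1.0)   # linear ramp from 0 to 1.5
        if 2e-7 < t <= 5e-7:
            return 1.5
        raise ValueError("u_202 is not specified for this t")
    if i == 203:
        return 10.0
    if i == 204:
        return 0.0
    if 205 <= i <= 408:
        return 26.85
    raise ValueError("input index out of range")
```

Only \(u_{202}\) is time dependent; all other inputs are constant, with the last 204 inputs equal to the initial temperature of 26.85.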

Fig. 18.5 Comparison of the outputs at port 611 (\(y_{611}\), \(n = 925{,}286\)). (a) The thermal flux. (b) The relative error

Both methods introduce very small relative errors as shown in Fig. 18.5b. The ROM error is defined as

$$\displaystyle \begin{aligned} \max\limits_{i \in \{1,\ldots,29\}} \|y_i-y_{r_i}\|{}_2/\|y_i\|{}_2, \end{aligned}$$

where \(y_i \in \mathbb{R}^{\ell}\) is the vector containing all output values of the original power-cell model at the \(i\)th nonuniform time step \(t_i\), \(i = 1, \ldots, 29\), in the time interval \([0, 2.0 \times 10^{-3}\,\mathrm{s}]\), and \(y_{r_i}\) is the corresponding ROM output.
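The error measure above can be computed as follows. This is a minimal sketch, assuming the outputs are stored row-wise as NumPy arrays with one row per time step; the function name `max_relative_error` and the small test data are hypothetical.

```python
import numpy as np

def max_relative_error(Y, Y_r):
    """Maximum over time steps of ||y_i - y_r_i||_2 / ||y_i||_2.

    Y and Y_r have shape (number of time steps, number of outputs).
    """
    num = np.linalg.norm(Y - Y_r, axis=1)
    den = np.linalg.norm(Y, axis=1)
    return np.max(num / den)

# Hypothetical data: two time steps, two outputs.
Y = np.array([[3.0, 4.0], [1.0, 0.0]])
Y_r = np.array([[3.0, 4.0], [1.1, 0.0]])
err = max_relative_error(Y, Y_r)
```

The measure is undefined if the original output vanishes at some time step, so such steps would need separate treatment.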

5 Conclusion

We have proposed a modified BDSM-ET method for ET coupled problems with many inputs arising from industrial applications. The modified BDSM-ET method produces sparser ROMs than the BDSM-ET method, with almost the same accuracy. Finally, the proposed method consists of independent computations for the individual subsystems, which makes it well suited to parallelization. This could be a topic of future work.