
1 Model Order Reduction of Nonlinear Network Problems

The dynamics of an electrical circuit can in general be described by a nonlinear, first order, differential-algebraic equation (DAE) system of the form:

$$\displaystyle\begin{array}{rcl} \frac{d} {dt}\mathbf{q}(\mathbf{x}(t)) + \mathbf{j}(\mathbf{x}(t)) + \mathbf{B}\mathbf{u}(t) = \mathbf{0},& &{}\end{array}$$
(6.1a)

completed with the output mapping

$$\displaystyle\begin{array}{rcl} \mathbf{y}(t) = \mathbf{h}(\mathbf{x}(t),\mathbf{u}(t)).& &{}\end{array}$$
(6.1b)

In the state equation (6.1a), which arises from applying modified nodal analysis (MNA) to the network graph, \(\mathbf{x}(t) \in \mathbb{R}^{n}\) represents the unknown vector of circuit variables at time \(t \in \mathbb{R}\); \(\mathbf{q},\mathbf{j}: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}\) describe the contribution of reactive and nonreactive elements, respectively, and \(\mathbf{B} \in \mathbb{R}^{n\times m}\) distributes the input excitation \(\mathbf{u}: \mathbb{R} \rightarrow \mathbb{R}^{m}\). The system’s response \(\mathbf{y}(t) \in \mathbb{R}^{p}\) is a possibly nonlinear function \(\mathbf{h}: \mathbb{R}^{n} \times \mathbb{R}^{m} \rightarrow \mathbb{R}^{p}\) of the system’s state \(\mathbf{x}(t)\) and inputs \(\mathbf{u}(t)\).

In circuit design, (6.1a) is often not considered to describe the overall design but rather to be a model of a subcircuit or subblock. Connection to and communication with a block’s environment is done via its terminals, i.e. external nodes. Therefore, we assume in the remainder of this document that the inputs u(t) and outputs y(t) denote terminal voltages and terminal currents, respectively, or vice versa, which are injected and extracted linearly, i.e., the output mapping is assumed to be of the form

$$\displaystyle{ \mathbf{y}(t) = \mathbf{C}\mathbf{x}(t), }$$
(6.1c)

with \(\mathbf{C} \in \mathbb{R}^{p\times n}\).

The dimension n of the unknown vector \(\mathbf{x}(t)\) is of the order of the number of elements in the circuit, which can easily reach hundreds of millions. Solving the network equations (6.1a) and (6.1c) numerically may therefore take an unreasonable amount of time.

Model order reduction (MOR) aims to replace the original model (6.1a) and (6.1c) by a system

$$\displaystyle\begin{array}{rcl} \frac{d} {dt}\hat{\mathbf{q}}(\mathbf{z}(t)) +\hat{ \mathbf{j}}(\mathbf{z}(t)) +\hat{ \mathbf{B}}\mathbf{u}(t) = \mathbf{0},& & \\ \mathbf{\hat{y}}(t) =\hat{ \mathbf{C}}\mathbf{z}(t),& &{}\end{array}$$
(6.2)

with \(\mathbf{z}(t) \in \mathbb{R}^{r}\); \(\hat{\mathbf{q}},\hat{\mathbf{j}}: \mathbb{R}^{r} \rightarrow \mathbb{R}^{r}\), \(\hat{\mathbf{B}} \in \mathbb{R}^{r\times m}\) and \(\hat{\mathbf{C}} \in \mathbb{R}^{p\times r}\), whose response \(\mathbf{\hat{y}}(t) \in \mathbb{R}^{p}\) is sufficiently close to \(\mathbf{y}(t)\) for the same input signal \(\mathbf{u}(t)\), but can be computed in much less time.

1.1 Linear Versus Nonlinear Model Order Reduction

So far, most research effort has been spent on developing and analysing MOR techniques suitable for linear problems. For an overview of these methods we refer to [1].

When trying to transfer approaches from linear MOR, fundamental differences emerge.

To see this, first consider a linear problem of the form

$$\displaystyle\begin{array}{rcl} \mathbf{E} \frac{d} {dt}\mathbf{x}(t) + \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t)& =& \mathbf{0},\quad \mbox{ with $\mathbf{E},\mathbf{A} \in \mathbb{R}^{n\times n}$,} \\ \mathbf{y}(t)& =& \mathbf{C}\mathbf{x}(t). {}\end{array}$$
(6.3)

The state x(t) is approximated in a lower dimensional space of dimension r ≪ n, spanned by basis vectors which we subsume in \(\mathbf{V} = (\mathbf{v}_{1},\mathop{\ldots },\mathbf{v}_{r}) \in \mathbb{R}^{n\times r}\):

$$\displaystyle{ \mathbf{x}(t) \approx \mathbf{V}\mathbf{z}(t),\quad \text{with}\;\;\mathbf{z}(t) \in \mathbb{R}^{r}. }$$
(6.4)

The reduced state \(\mathbf{z}(t)\), i.e., the coefficients of the expansion in the reduced space, is defined by a reduced dynamical system. Applying a Galerkin technique, this reduced system arises from projecting (6.3) onto a test space spanned by the columns of some matrix \(\mathbf{W} \in \mathbb{R}^{n\times r}\). Here, W and V are chosen such that their columns are biorthonormal, i.e., \(\mathbf{W}^{T}\mathbf{V} = \mathbf{I}_{r\times r}\). The Galerkin projection yields

$$\displaystyle\begin{array}{rcl} \hat{\mathbf{E}} \frac{d} {dt}\mathbf{z}(t) +\hat{ \mathbf{A}}\mathbf{z}(t) +\hat{ \mathbf{B}}\mathbf{u}(t)& =& \mathbf{0}, \\ \mathbf{y}(t)& =& \hat{\mathbf{C}}\mathbf{z}(t){}\end{array}$$
(6.5)

with \(\hat{\mathbf{E}} = \mathbf{W}^{T}\mathbf{E}\mathbf{V}\), \(\hat{\mathbf{A}} = \mathbf{W}^{T}\mathbf{A}\mathbf{V} \in \mathbb{R}^{r\times r}\) and \(\hat{\mathbf{B}} = \mathbf{W}^{T}\mathbf{B} \in \mathbb{R}^{r\times m}\), \(\hat{\mathbf{C}} = \mathbf{C}\mathbf{V} \in \mathbb{R}^{p\times r}\). The system matrices \(\hat{\mathbf{E}},\hat{\mathbf{A}},\hat{\mathbf{B}},\hat{\mathbf{C}}\) of this reduced substitute model are of smaller dimension and constant, i.e., need to be computed only once. However, \(\hat{\mathbf{E}},\hat{\mathbf{A}}\) are usually dense whereas the system matrices E and A are usually very sparse.
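
For illustration, the following minimal Python/NumPy sketch assembles these reduced matrices; the randomly generated placeholder system and all names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m, p = 1000, 10, 2, 2          # full order, reduced order, inputs, outputs

E, A = rng.standard_normal((n, n)), rng.standard_normal((n, n))
B, C = rng.standard_normal((n, m)), rng.standard_normal((p, n))

# biorthonormal bases W, V with W^T V = I (here simply W = V, V orthonormal)
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
W = V

# Galerkin projection (6.5): constant, small, but in general dense matrices
E_hat = W.T @ E @ V      # r x r
A_hat = W.T @ A @ V      # r x r
B_hat = W.T @ B          # r x m
C_hat = C @ V            # p x r
```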

Applying the same technique directly to the nonlinear system means obtaining the reduced formulation (6.2) by defining \(\hat{\mathbf{q}}(\mathbf{z}) = \mathbf{W}^{T}\mathbf{q}(\mathbf{V}\mathbf{z})\) and \(\hat{\mathbf{j}}(\mathbf{z}) = \mathbf{W}^{T}\mathbf{j}(\mathbf{V}\mathbf{z})\). Clearly, \(\hat{\mathbf{q}}\) and \(\hat{\mathbf{j}}\) map from \(\mathbb{R}^{r}\) to \(\mathbb{R}^{r}\).

To solve network problems of type (6.2) numerically, usually multistep methods are used. This means that at each timepoint \(t_l\) a nonlinear equation

$$\displaystyle{ \alpha \hat{\mathbf{q}}(\mathbf{z}_{l}) +\boldsymbol{\hat{\beta }} _{l} +\hat{ \mathbf{j}}(\mathbf{z}_{l}) +\hat{ \mathbf{B}}\mathbf{u}(t_{l}) = \mathbf{0} }$$
(6.6)

has to be solved for \(\mathbf{z}_l\), which is the approximation of \(\mathbf{z}(t_l)\). In the above equation α is the integration coefficient of the method and \(\boldsymbol{\hat{\beta }}_{l} \in \mathbb{R}^{r}\) contains history from previous timesteps. Newton techniques that are used to solve (6.6) usually require an update of the system’s Jacobian matrix in each iteration ν:

$$\displaystyle{ \mathbf{\hat{J}}_{l}^{(\nu )} = \left (\alpha \frac{\partial \hat{\mathbf{q}}} {\partial \mathbf{z}} + \frac{\partial \hat{\mathbf{j}}} {\partial \mathbf{z}}\right )\Big\vert _{\mathbf{z}=\mathbf{z}_{l}^{(\nu )}} = \mathbf{W}^{T}\left [\alpha \frac{\partial \mathbf{q}} {\partial \mathbf{x}}\, +\, \frac{\partial \mathbf{j}} {\partial \mathbf{x}}\right ]\Big\vert _{\mathbf{x}^{(\nu )}=\mathbf{V}\mathbf{z}_{l}^{(\nu )}}\mathbf{V}. }$$
(6.7)

The evaluation of the reduced system, i.e., of \(\hat{\mathbf{q}}\) and \(\hat{\mathbf{j}}\), necessitates in each step the back projection of the argument z to its full-order counterpart Vz, followed by the evaluation of the full system functions q and j and the projection onto the reduced space with W and V.

Consequently, with respect to computation time no reduction will be obtained unless additional measures are taken or other strategies are pursued.
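
A small sketch makes this explicit. Assuming a hypothetical component-wise nonlinearity q as a placeholder, both the reduced function and the reduced Jacobian block of (6.7) pass through full-order objects:

```python
import numpy as np

def q(x):
    """Hypothetical component-wise charge function (placeholder for the
    circuit-given q); evaluating it requires the full-order vector x."""
    return np.tanh(x)

def dq_dx(x):
    """Full n x n Jacobian of q (diagonal here, dense in general)."""
    return np.diag(1.0 - np.tanh(x) ** 2)

def q_hat(z, V, W):
    """Reduced function: lift z to the full space, evaluate the full q,
    project back - the cost is dominated by full-order operations."""
    return W.T @ q(V @ z)

def jac_q_hat(z, V, W):
    """Reduced Jacobian block of (6.7): the full Jacobian is still formed."""
    return W.T @ dq_dx(V @ z) @ V
```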

1.2 Some Nonlinear MOR Techniques

In MOR for linear systems, especially methods based on Krylov subspaces [19] and balanced realization [30] are well understood and highly elaborated. Hence, it seems natural to adapt them to nonlinear problems, too. In the following, we shortly describe these approaches and give references for further reading.

1.2.1 Krylov Subspace Methods in Nonlinear MOR

In linear MOR, Krylov subspace methods are used to construct reduced order models of systems (6.3) such that the moments, i.e., the coefficients in a Taylor expansion of the frequency domain transfer function, of the original and the reduced system match up to a certain order. The transfer function \(\mathbf{H}: \mathbb{C} \rightarrow \mathbb{C}^{p\times m}\) is defined by the linear equation \(\mathbf{H}(s) = \mathbf{C}(s\mathbf{E} + \mathbf{A})^{-1}\mathbf{B}\).

It is not straightforward to define a transfer function for the nonlinear problem (6.1a) and (6.1c). Instead, there are Krylov based techniques that deal with bilinear systems (6.8) or linear periodically time varying (LPTV) problems (6.9).

$$\displaystyle\begin{array}{rcl} \text{bilinear system:}& & \quad \frac{d} {dt}\mathbf{\hat{x}}(t) +\hat{ \mathbf{A}}\mathbf{\hat{x}}(t) + \mathbf{\hat{N}}\mathbf{\hat{x}}(t)\mathbf{u}(t) +\hat{ \mathbf{B}}\mathbf{u}(t) = \mathbf{0}{}\end{array}$$
(6.8)
$$\displaystyle\begin{array}{rcl} \text{ LPTV system:}& & \quad \frac{d} {dt}[\mathbf{E}(t)\mathbf{x}(t)] + \mathbf{A}(t)\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t) = \mathbf{0}{}\end{array}$$
(6.9)

Problems of type (6.8) arise from expanding a nonlinear problem \(\dot{\mathbf{x}}(t) + \mathbf{f}(\mathbf{x}(t)) + \mathbf{B}\mathbf{u}(t) = \mathbf{0}\) around an equilibrium point. Systems of type (6.9), with matrices E(t), A(t) that are periodic with some period T, arise when linearising the system (6.1) around a periodic steady-state solution with \(\mathbf{x}^{0}(t + T) = \mathbf{x}^{0}(t)\).

Volterra-series expansion, followed by multivariable Laplace transformation and multimoment expansion, is the key to applying Krylov subspace based MOR. For further reading we refer to [15] and the references therein.

In the case of LPTV systems, a time-varying system function H(s, t) can be defined. This plays the role of a transfer function and can be determined from a differential equation. H(s, t) has to be determined in terms of time or frequency samples on [0, T) for one s. Krylov techniques can then be applied to get a reduced system with which samples for different frequencies s can be constructed. We refer to [21] and the references therein.

Given a nonlinear problem (6.1a) and (6.1c) of dimension n, the bilinear system that is actually reduced has a dimension of \(n + n^{2} + n^{3} + \cdots \), depending on the order of the expansion. Similarly, the system in the LPTV case that is subject to reduction has dimension k ⋅ n, with k being the number of time samples in the initial determination of H(s, t). Therefore, these methods seem suitable for small to medium sized nonlinear problems only.

1.2.2 Balanced Truncation in Nonlinear MOR

The energy \(L_c(\mathbf{x}_0)\) that is needed to drive a system to a given state \(\mathbf{x}_0\) and the energy \(L_o(\mathbf{x}_0)\) the system provides to observe the state \(\mathbf{x}_0\) it is in are the main quantities in Balanced Truncation. A system is called balanced if states that are hard to reach are also hard to observe and vice versa, i.e., \(L_c(\mathbf{x})\) large implies \(L_o(\mathbf{x})\) small. Truncation, i.e., reduced order modelling, is then done by eliminating these states.

For linear problems \(L_c\) and \(L_o\) are connected directly, by means of algebraic calculation, to the reachability and observability Gramians P and Q, respectively. These can be computed from Lyapunov equations involving the system matrices E, A, B, C of the linear system (6.3). Balancing is reached by transforming the state space such that P and Q are simultaneously diagonalised:

$$\displaystyle{ \mathbf{P} = \mathbf{Q} = \text{diag}(\sigma _{1},\mathop{\ldots },\sigma _{n}) }$$

with the so-called Hankel singular values \(\sigma _{1},\mathop{\ldots },\sigma _{n}\). From the basis that arises from the transformation, only those basis vectors that correspond to large Hankel singular values are kept. The main advantage of this approach is that there exists an a priori computable error bound for the truncated system.
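
Before turning to nonlinear problems, the linear procedure just described can be summarised in a short sketch. The following Python code is a minimal, standard square-root formulation for (6.3) with E = I; it assumes a stable system with positive definite Gramians, uses SciPy's Lyapunov solver, and all function and variable names are our own illustrative choices.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Square-root Balanced Truncation for (6.3) with E = I, i.e. for
    dx/dt = -A x - B u, y = C x; assumes -A is stable and P, Q > 0."""
    # Gramians: A P + P A^T = B B^T and A^T Q + Q A = C^T C
    # (signs follow from the convention E x' + A x + B u = 0)
    P = solve_continuous_lyapunov(A, B @ B.T)
    Q = solve_continuous_lyapunov(A.T, C.T @ C)
    Lp = cholesky(P, lower=True)
    Lq = cholesky(Q, lower=True)
    U, hsv, Vt = svd(Lq.T @ Lp)            # hsv: Hankel singular values
    S = np.diag(hsv[:r] ** -0.5)
    V = Lp @ Vt[:r].T @ S                  # trial basis
    Wt = S @ U[:, :r].T @ Lq.T             # test basis transposed, Wt @ V = I
    # reduced system in the form (6.5) with E_hat = I
    return Wt @ A @ V, Wt @ B, C @ V, hsv
```

The returned matrices correspond to \(\hat{\mathbf{A}},\hat{\mathbf{B}},\hat{\mathbf{C}}\) of (6.5) with \(\hat{\mathbf{E}} = \mathbf{I}_{r\times r}\); the Hankel singular values determine the truncation order and the error bound mentioned above.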

In transferring Balanced Truncation to nonlinear problems, three main tracks can be recognized. Energy consideration is the common ground for the three directions.

In the approach suggested in [20] the energy functions arise from solving Hamilton-Jacobi differential equations. Similar to the linear case, a state-space transformation is sought such that \(L_c\) and \(L_o\) are formulated as quadratic forms with diagonal matrices. The magnitudes of the entries are then again the basis for truncation. The transformation is now state dependent, and instead of singular values we get singular value functions. As the Hamilton-Jacobi system has to be solved and the varying state-space transformations have to be computed, it is an open issue how the theory could be applied in a computer environment.

In Sliding Interval Balancing [46], the nonlinear problem is first linearised around a nominal trajectory, giving a linear time varying system like (6.9). At each state, finite time reachability and observability Gramians are defined and approximated by truncated Taylor series expansion. Analytic calculations, basically the series expansions, connect the local balancing transformations smoothly. This necessary step is the limiting factor for this approach in circuit simulation.

Finally, balancing is also applied to bilinear systems (6.8). Here the key tools are so-called algebraic Gramians arising from generalised Lyapunov equations. However, no one-to-one connection between these Gramians and the energy functions \(L_c\), \(L_o\) can be made; rather, they can serve to get approximate bounds for the former. Furthermore, convergence parameters have to be introduced to guarantee the solvability of the generalised Lyapunov equations. For further details we refer to [9, 14] and the references therein.

1.3 TPWL and POD

In view of high dimensional problems in circuit simulation and feasibility in a computational environment, Trajectory PieceWise Linearization (TPWL) and Proper Orthogonal Decomposition (POD) are amongst the most promising approaches for the time being. The basic idea of TPWL is to replace nonlinearity with a collection of linear substitute problems and apply MOR on these. The background of POD is to identify a low dimensional manifold the solution resides on and reformulate the problem in such a way that it is solved in terms of the basis of this principal manifold.

In the following we give more details on the steps done for both approaches.

1.3.1 Trajectory PieceWise Linearization

The idea of TPWL [33] is to represent the full nonlinear system (6.1a) and (6.1c) by a set of order-reduced linear models that can reproduce the typical behaviour of the system.

Since its introduction in [33, 34], TPWL has gained a lot of interest and several adaptions have been made, see e.g., [18, 39, 49]. In the following we will basically follow the lines in the original works [33, 34] and briefly mention alternatives that have been suggested.

For extracting a model, a training input \(\mathbf{\bar{u}}(t)\) for \(t \in [t_{\text{start}}, t_{\text{end}}]\) is chosen and a transient simulation is run in order to get a trajectory, i.e., a collection of points \(\mathbf{x}_{0},\mathop{\ldots },\mathbf{x}_{N}\) approximating \(\mathbf{x}(t_i)\) at timepoints \(t_{\text{start}} = t_{0} < t_{1} < \cdots < t_{N} = t_{\text{end}}\). The training input is chosen such that the trajectory it causes reflects the typical states of the system. On the trajectory, points \(\{\mathbf{x}_{0}^{\text{lin}},\mathop{\ldots },\mathbf{x}_{s}^{\text{lin}}\} \subset \{\mathbf{x}_{0},\mathop{\ldots },\mathbf{x}_{N}\}\) are chosen around which the nonlinear functions q and j are linearised:

$$\displaystyle\begin{array}{rcl} \mathbf{q}(\mathbf{x}(t)) \approx \mathbf{q}(\mathbf{x}_{i}^{\text{lin}}) + \mathbf{E}_{ i} \cdot \left (\mathbf{x}(t) -\mathbf{x}_{i}^{\text{lin}}\right );\quad \mathbf{j}(\mathbf{x}(t)) \approx \mathbf{j}(\mathbf{x}_{ i}^{\text{lin}}) + \mathbf{A}_{ i} \cdot \left (\mathbf{x}(t) -\mathbf{x}_{i}^{\text{lin}}\right ),& &{}\end{array}$$
(6.10)

with \(\mathbf{E}_{i} = \frac{\partial \mathbf{q}} {\partial \mathbf{x}}\Big\vert _{\mathbf{x}=\mathbf{x}_{i}^{\text{lin}}}\) and \(\mathbf{A}_{i} = \frac{\partial \mathbf{j}} {\partial \mathbf{x}}\Big\vert _{\mathbf{x}=\mathbf{x}_{i}^{\text{lin}}}\).

Then the nonlinear state-space equation (6.1a) can be replaced locally around \(\mathbf{x}_{i}^{\text{lin}}\), for \(i = 0,\mathop{\ldots },s\), by

$$\displaystyle{ \frac{d} {dt}\left [\mathbf{E}_{i}\mathbf{x(t)} +\boldsymbol{\delta } _{i}\right ] + \mathbf{A}_{i}\mathbf{x}(t) +\boldsymbol{\gamma } _{i} + \mathbf{B}\mathbf{u}(t) = \mathbf{0}, }$$
(6.11)

with \(\boldsymbol{\delta }_{i} = \mathbf{q}(\mathbf{x}_{i}^{\text{lin}}) -\mathbf{E}_{i}\mathbf{x}_{i}^{\text{lin}}\) and \(\boldsymbol{\gamma }_{i} = \mathbf{j}(\mathbf{x}_{i}^{\text{lin}}) -\mathbf{A}_{i}\mathbf{x}_{i}^{\text{lin}}\).

Fig. 6.1 TPWL – model extraction and usage

One approach, used by Rewieński [33], to get a model that represents the nonlinear problem on a larger range is to combine the local models (6.11) into

$$\displaystyle{ \frac{d} {dt}\left (\sum _{i=0}^{s}w_{ i}(\mathbf{x(t)})\left [\mathbf{E_{i}}\mathbf{x}(t) +\boldsymbol{\delta } _{i}\right ]\right ) +\sum _{ i=0}^{s}w_{ i}(\mathbf{x(t)})\left [\mathbf{A}_{i}\mathbf{x}(t) +\boldsymbol{\gamma } _{i}\right ] + \mathbf{B}\mathbf{u}(t) = \mathbf{0}, }$$
(6.12a)

where \(w_{i}: \mathbb{R}^{n} \rightarrow [0,1]\), for \(i = 0,\mathop{\ldots },s\), are state-dependent weight functions. The weighting functions \(w_i\) are chosen such that \(w_i(\mathbf{x}(t))\) is large for \(\mathbf{x}\) close to \(\mathbf{x}_i^{\text{lin}}\) and such that \(w_{0}(\mathbf{x}(t)) + \cdots + w_{s}(\mathbf{x}(t)) = 1\).

A different way to define a global substitute model, suggested by Voß [49], is

$$\displaystyle{ \sum _{i=0}^{s}w_{ i}(\mathbf{x}(t))\left (\mathbf{E}_{i} \frac{d} {dt}\mathbf{x}(t) + \mathbf{A}_{i}\mathbf{x}(t) +\gamma _{i}\right ) + \mathbf{B}\mathbf{u}(t) = \mathbf{0}. }$$
(6.12b)

Although different in definition, in deployment both approaches (6.12a) and (6.12b) are equivalent, as we will see later.

Figure 6.1 illustrates the idea: along a training trajectory, extracted from a full dimensional simulation, a set of locally valid linear models is created. When this model is used for simulation with a different input, the existing linear models are turned on and off, adapted to the state the system is in at each moment.

Simulation of the piecewise linearized system (6.12a) or (6.12b) may already be faster than simulation of the original nonlinear system. However, the linearized system can additionally be reduced using model order reduction techniques for linear systems to increase efficiency.

The main difference between linear MOR and TPWL is that the latter introduces, in addition to the application of a linear MOR technique, the selection of linearization points (to get linear problems) and the weighting of the linear submodels (to recover the global nonlinear behavior).

1.3.1.1 Reducing the System

Basically, any MOR technique for linear problems can be applied to the linear submodels (6.11), i.e., to \(\left (\mathbf{E}_{i},\mathbf{A}_{i},\left [\mathbf{B},\boldsymbol{\gamma }_{i}\right ],\mathbf{C}\right )\). Note that we extended the columns of B with \(\boldsymbol{\gamma }_{i}\) – thus MOR may exploit refinements for multiple terminals (see, e.g., Sect. 6.2 of this chapter). Originally Rewieński [33] proposed the usage of Krylov-based reduction. Vasilyev, Rewieński and White [40] introduced Balanced Truncation to TPWL and Voß [49] uses Poor Man’s TBR (PMTBR) as linear MOR kernel. Each of these methods creates local subspaces, spanned by the columns of projection matrices \(\mathbf{V}_{i} \in \mathbb{R}^{n\times r_{i}}\) for \(i = 0,\mathop{\ldots },s\). For some comparisons of different MOR methods used within TPWL, see [29]. For a comparison between TPWL and POD (see Sect. 6.1.3.2), see [7, 42].

In a second step, one global subspace is created from the information contained in the local subspaces. This is done by applying a singular value decomposition (SVD) to the aggregated matrix \(\mathbf{V}_{\text{agg}} = [\mathbf{V}_{0},\mathbf{x}_{0}^{\text{lin}};\mathop{\ldots };\mathbf{V}_{s},\mathbf{x}_{s}^{\text{lin}}]\). Note that the \(\mathbf{x}_j^{\text{lin}}\) are “snapshots” in time of the nonlinear solution. Their span actually forms a POD subspace (see Sect. 6.1.3.2) that is collected on-the-fly within TPWL. Their inclusion reduces the error of the solution of the reduced model [7].

The final reduced subspace is then spanned by the r dominating left singular vectors, subsumed in \(\mathbf{V} \in \mathbb{R}^{n\times r}\). Furthermore let \(\mathbf{W} \in \mathbb{R}^{n\times r}\) be the corresponding test matrix, where often we have W = V. Then a reduced order model for the piecewise-linearized system (6.12a) is

$$\displaystyle{ \frac{d} {dt}\left (\sum _{i=0}^{s}w_{ i}(\mathbf{V}\mathbf{z}(t))\left [\hat{\mathbf{E}}_{i}\mathbf{z}(t) +\boldsymbol{\hat{\delta }} _{i}\right ]\right ) +\sum _{ i=0}^{s}w_{ i}(\mathbf{V}\mathbf{z}(t))\left [\hat{\mathbf{A}}_{i}\mathbf{z}(t) +\boldsymbol{\hat{\gamma }} _{i}\right ] +\hat{ \mathbf{B}}\mathbf{u}(t) = \mathbf{0}, }$$
(6.13)

with \(\hat{\mathbf{E}}_{i} = \mathbf{W}^{T}\mathbf{E}_{i}\mathbf{V}\), \(\hat{\mathbf{A}}_{i} = \mathbf{W}^{T}\mathbf{A}_{i}\mathbf{V}\), \(\boldsymbol{\hat{\delta }}_{i} = \mathbf{W}^{T}\boldsymbol{\delta }_{i}\), \(\boldsymbol{\hat{\gamma }}_{i} = \mathbf{W}^{T}\boldsymbol{\gamma }_{i}\) and \(\hat{\mathbf{B}} = \mathbf{W}^{T}\mathbf{B}\).

1.3.1.2 Selection of Linearization Points

A crucial point in TPWL is to decide which linearization points \(\mathbf{x}_{0}^{\text{lin}},\mathop{\ldots },\mathbf{x}_{s}^{\text{lin}}\) should be chosen. With a large number of such points, we can expect to find a linear model suitable to reproduce the nonlinear behaviour locally. However, this would require storing a huge amount of data, making the final model slow. On the other hand, if too few points are chosen to linearise around, the nonlinear behaviour will not be reflected correctly. Different strategies exist to decide automatically upon adding a new linearization point, and hence a new model:

  • In the original work, Rewieński [33, 34] suggests to check, at each accepted timepoint \(t_k\) during simulation, the relative distance of the current state \(\mathbf{x}_k \approx \mathbf{x}(t_k)\) of the nonlinear problem to all i existing linearization states \(\mathbf{x}_{0}^{\text{lin}},\mathop{\ldots },\mathbf{x}_{i-1}^{\text{lin}}\). If the minimum is equal to or greater than some parameter α > 0, i.e.

    $$\displaystyle{ \min _{0\leq j\leq i-1}\left (\frac{\|\mathbf{x}_{k} -\mathbf{x}_{j}^{\text{lin}}\|_{\infty }} {\|\mathbf{x}_{j}^{\text{lin}}\|_{\infty }} \right ) \geq \alpha, }$$
    (6.14)

    \(\mathbf{x}_k\) becomes the (i + 1)st linearization point. Accordingly, a new linear model, arising from linearizing around \(\mathbf{x}_{i}^{\text{lin}} = \mathbf{x}_{k}\), is added to the collection (a minimal sketch of this criterion follows after this list). The parameter α is chosen depending on the steady state of the system (6.1a).

  • In [49] the mismatch of nonlinear and linear system motivates the creation of a new linearization point and an additional linear model: at each timepoint during training, both the nonlinear and the currently valid linear system are computed in parallel with the same stepsize. If the difference of the two approximations to the true solution at a timepoint \(t_{k+1}\) becomes too large, a new linear model is created by linearizing the nonlinear system around the state the system was in at the previous timepoint \(t_k\).

  • The strategy pursued by Dong and Roychowdhury [18] is similar to (6.14). Here, not deviations between states but function evaluations at the current approximation \(\mathbf{x}_k\) and the linearization points \(\mathbf{x}_j^{\text{lin}}\) are considered.

  • In Martinez [28] an optimization criterion is used to determine the linearization points. The technique exploits the Hessian of the system as an error bound metric.
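
As a minimal illustration of the first strategy, criterion (6.14) can be coded as follows; the function name and data layout are our own, and the linearization states are assumed to be nonzero.

```python
import numpy as np

def needs_new_model(x_k, lin_points, alpha):
    """Criterion (6.14): propose a new linearization point if the current
    state x_k is relatively far, in the max norm, from all existing
    linearization states (assumed nonzero)."""
    rel = [np.linalg.norm(x_k - x_lin, np.inf) / np.linalg.norm(x_lin, np.inf)
           for x_lin in lin_points]
    return min(rel) >= alpha
```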

1.3.1.3 Determination of the Weights

When replacing the full nonlinear problem with the TPWL model (6.13), the weights \(w_{i}: \mathbb{R}^{n} \rightarrow [0,1]\) are responsible for switching between the linear submodels, i.e., for choosing the linear model that best reflects the behaviour caused by the nonlinearity.

Besides these requirements on the desired behaviour, one aims at minimum complexity, i.e., at having to deal with a combination of just a small number of linear submodels at each timepoint. It is hence obvious that the weight functions have to be nonlinear in nature. Again, different strategies exist:

  • Both Rewieński [33] and Voß [49] use

    $$\displaystyle{ w_{i}(\mathbf{x}) = e^{- \frac{\beta }{m}\cdot d_{i}(\mathbf{x})},\quad \mbox{ with $d_{ i}(\mathbf{x}) =\| \mathbf{x} -\mathbf{x}_{i}^{\text{lin}}\|_{ 2}$ and $m =\min _{i}d_{i}(\mathbf{x})$.} }$$
    (6.15a)

    The constant β adjusts how abrupt the change of models is. A typical value is β = 25.

  • Dong and Roychowdhury [18], however, use

    $$\displaystyle{ w_{i}(\mathbf{x}) = \left ( \frac{m} {d_{i}(\mathbf{x})}e^{\frac{-d_{i}(\mathbf{x})-m} {M} }\right )^{\mu }, }$$
    (6.15b)

    where \(d_i(\mathbf{x})\) and m are the same as in (6.15a) and M is the minimum distance, taken in the 2-norm, amongst the linearization points. The parameter μ is chosen from {1, 2}.

In both cases, the weights are normalized such that \(\sum _{i}w_{i}(\mathbf{x}) = 1\).
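
For illustration, the weighting (6.15a) together with this normalization can be sketched as follows (names are ours; the degenerate case where x coincides with a linearization point is not handled):

```python
import numpy as np

def tpwl_weights(x, lin_points, beta=25.0):
    """Weights (6.15a) followed by normalization to sum 1. Assumes x does
    not coincide with a linearization point (otherwise m = 0)."""
    d = np.array([np.linalg.norm(x - x_lin) for x_lin in lin_points])
    m = d.min()
    w = np.exp(-(beta / m) * d)
    return w / w.sum()
```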

Clearly, the nonlinearity of the weights causes the TPWL model (6.13) arising from (6.12a) – and similarly the reduced model that would originate from (6.12b) – to still be nonlinear. That means that, after applying a numerical integration scheme to (6.13), a nonlinear problem still has to be solved to get an approximation \(\mathbf{z}_k \approx \mathbf{z}(t_k)\). To overcome this problem, both Rewieński [33] and Voß [49] decouple the evaluation of the weights from the time discretisation by replacing

$$\displaystyle{ w_{i}(\mathbf{V}\mathbf{z}_{k}) \rightsquigarrow w_{i}(\mathbf{V}\mathbf{\tilde{z}}_{k})\quad \mbox{ with $\mathbf{\tilde{z}}_{k} \approx \mathbf{z}_{k}$}, }$$

i.e., for calculating \(\mathbf{z}_k\) from the discretisation of (6.13) at \(t = t_k\), \(\mathbf{z}_k\) in the weighting is replaced by a cheaper approximation \(\mathbf{\tilde{z}}_{k}\). It is easy to see that with this action, (6.12a) and (6.12b) are equivalent.

Note:

The work of Dong and Roychowdhury [18] actually considers not a piecewise linear but a piecewise polynomial approach, i.e., the Taylor expansions in (6.10) contain one more coefficient, leading to the need for reducing local bilinear systems. Tiwary and Rutenbar [39] look into the details of implementing a TPWL technique in an economic way.

1.3.1.4 TPWL and Time-Domain MOR

In [53] the TPWL approach is combined with wavelet expansions that are defined directly in the time-domain. For wavelets in circuit simulation we refer to [16, 17] and for technical details to [10, 11, 51, 52]. After linearizing a differential equation

$$\displaystyle{ \frac{d} {dt}\mathbf{x}(t) = \mathbf{f}(\mathbf{x}(t)) + \mathbf{B}\mathbf{u}(t) }$$
(6.16)

at \(\mathbf{x}_{i} = \mathbf{x}(t_{i})\), we obtain that \(\mathbf{x}(t)\) is approximately given by

$$\displaystyle\begin{array}{rcl} \frac{d} {dt}\mathbf{x}(t)& =& \mathbf{f}(\mathbf{x}_{i}) + \mathbf{A}(\mathbf{x}(t) -\mathbf{x}_{i}) + \mathbf{B}\mathbf{u}(t),{}\end{array}$$
(6.17)
$$\displaystyle\begin{array}{rcl} & =& \mathbf{A}\mathbf{x}(t) + \mathbf{f}(\mathbf{x}_{i}) -\mathbf{A}\mathbf{x}_{i} + \mathbf{B}\mathbf{u}(t),{}\end{array}$$
(6.18)

where \(\mathbf{A} = \frac{\partial \mathbf{f}} {\partial \mathbf{x}}\big\vert _{\mathbf{x}=\mathbf{x}_{i}}\). With \(\tilde{\mathbf{x}}(t) = \mathbf{x}(t) -\mathbf{x}_{i}\), the output request \(\mathbf{y}(t) = \mathbf{C}\mathbf{x}(t)\) transfers to \(\mathbf{y}(t) = \mathbf{C}\mathbf{x}_{i} + \mathbf{C}\tilde{\mathbf{x}}(t)\), in which the first term is known. Thus it is sufficient to consider on an interval [0, T] the sum of the solutions of the two problems

$$\displaystyle\begin{array}{rcl} \frac{d} {dt}\mathbf{x}(t)& =& \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t),{}\end{array}$$
(6.19)
$$\displaystyle\begin{array}{rcl} \mathbf{y}(t)& =& \mathbf{C}\mathbf{x}(t){}\end{array}$$
(6.20)

and

$$\displaystyle\begin{array}{rcl} \frac{d} {dt}\mathbf{x}(t)& =& \mathbf{A}\mathbf{x}(t) + \mathbf{f}(\mathbf{x}_{i}) -\mathbf{A}\mathbf{x}_{i},{}\end{array}$$
(6.21)
$$\displaystyle\begin{array}{rcl} \mathbf{y}(t)& =& \mathbf{C}\mathbf{x}(t).{}\end{array}$$
(6.22)

Assuming that T is an integer (see [53] for the more general case), for a wavelet order J we get \(M = 2^{J} \cdot T + 3\) basis functions \(\theta_j(t)\), \(j = 1,\mathop{\ldots },M\). We can write \(\mathbf{x}(t) = \mathbf{H}_{1}\boldsymbol{\theta }(t)\) for (6.19) and \(\mathbf{x}(t) = \mathbf{H}_{2}\boldsymbol{\theta }(t)\) for (6.21), respectively, where \(\boldsymbol{\theta }(t) = (\theta _{1}(t),\mathop{\ldots },\theta _{M}(t))^{T}\) and \(\mathbf{H}_{1},\;\mathbf{H}_{2} \in \mathbb{R}^{n\times M}\). We can plug these expressions into (6.19) and into (6.21). However, note that in (6.19) the source term is time-dependent, while in (6.21) the source term is constant. Hence, rather than considering (6.19), one considers

$$\displaystyle\begin{array}{rcl} \frac{d} {dt}\mathbf{x}(t)& =& \mathbf{A}\mathbf{x}(t) + \mathbf{B}\delta (t),{}\end{array}$$
(6.23)
$$\displaystyle\begin{array}{rcl} \mathbf{y}(t)& =& \mathbf{C}\mathbf{x}(t){}\end{array}$$
(6.24)

where δ(t) is an impulse excitation with the property \(\int _{0_{-}}^{t}\delta (\tau )d\tau = 1\). Then, for (6.23), the matrix \(\mathbf{H}_1\) satisfies

$$\displaystyle{ \mathbf{H}_{1} \frac{d} {dt}\theta (t) = \mathbf{A}\mathbf{H}_{1}\theta (t) + \mathbf{B}\delta (t) }$$
(6.25)

Assuming that the wavelets have their support in [0, T] we derive

$$\displaystyle{ \mathbf{H}_{1}\theta (t) = \mathbf{A}\mathbf{H}_{1}\int _{0_{-}}^{t}\theta (\tau )d\tau + \mathbf{B}. }$$
(6.26)

In [53] one applies collocation using M collocation points. Next, \(\mathbf{H}_1\) is found by solving the resulting Sylvester equation. A similar approach is used for (6.21). Now one determines \(\mathbf{V}_i = \text{Orthog}(\mathbf{H}_1,\mathbf{H}_2,\mathbf{x}_i)\) (note that this is similar to the multiple terminal approach mentioned before for the frequency domain case). From \((\mathbf{V}_1,\mathop{\ldots },\mathbf{V}_s)\) one determines an overall orthonormal basis \(\mathbf{V} \in \mathbb{R}^{n\times r}\) that is used for the projection as before.

1.3.2 Proper Orthogonal Decomposition and Adaptions

The Proper Orthogonal Decomposition (POD) method, also known as Principal Component Analysis or Karhunen–Loève expansion, provides a technique for analysing multidimensional data [24, 27].

In this section we briefly describe some basics of POD. For a more detailed introduction to POD in MOR we refer to [31, 47]. For further studies we point to [32], which addresses error analysis for MOR with POD, and to [50], where the connection of POD to balanced model reduction can be found.

POD works on data extracted from a benchmark simulation. In a finite dimensional setting as given by (6.1a), K snapshots \(\mathbf{x}_i \approx \mathbf{x}(t_i)\) of the states the system passes through during the training interval \([t_{\text{start}}, t_{\text{end}}]\) are collected in a snapshot matrix

$$\displaystyle{ \mathbf{X} = \left (\mathbf{x}_{1},\mathop{\ldots },\mathbf{x}_{K}\right )\;\; \in \mathbb{R}^{n\times K}. }$$
(6.27)

The snapshots, i.e., the columns of X, span a space of dimension k ≤ K. We search for an orthonormal basis \(\{\mathbf{v}_{1},\mathop{\ldots },\mathbf{v}_{k}\}\) of this space that is optimal in the sense that the time-averaged error made when the snapshots are expanded to \(\mathbf{\tilde{x}}_{r,i}\) in the space spanned by just r < k basis vectors,

$$\displaystyle{ \langle \|\mathbf{x}_{i} -\mathbf{\tilde{x}}_{r,i}\|_{2}^{2}\rangle \quad \text{with the averaging operator}\quad \langle \mathbf{f}\rangle = \frac{1} {K}\sum _{i=1}^{K}\mathbf{f}_{ i} }$$
(6.28)

is minimised. This least squares problem is solved by computing the eigenvalue decomposition of the state covariance matrix \(\frac{1} {K}\mathbf{X}\mathbf{X}^{T}\) or, equivalently, by the singular value decomposition (SVD) of the snapshot matrix (assuming K > n)

$$\displaystyle{ \mathbf{X} = \mathbf{U}\mathbf{S}\mathbf{T}\quad \text{with}\quad \mathbf{U} \in \mathbb{R}^{n\times n},\mathbf{T} \in \mathbb{R}^{K\times K}\;\text{and}\;\mathbf{S} = \left (\text{diag}(\sigma _{1},\mathop{\ldots },\sigma _{n})\;\Big\vert \;\mathbf{0}_{n\times (K-n)}\right ), }$$
(6.29)

where U and T are orthogonal and the singular values satisfy \(\sigma _{1} \geq \sigma _{2} \geq \cdots \geq \sigma _{n} \geq 0\). The matrix \(\mathbf{V} \in \mathbb{R}^{n\times r}\) whose columns span the reduced subspace is now built from the first r columns of U, where the truncation order r is chosen such that

$$\displaystyle{ \frac{\sum _{i=1}^{r}\sigma _{i}^{2}} {\sum _{i=1}^{n}\sigma _{i}^{2}} \geq \frac{d} {100}, }$$
(6.30)

where d = 99 is usually a reasonable choice. For the matrix constructed in this way it holds that \(\mathbf{V}^{T}\mathbf{V} = \mathbf{I}_{r\times r}\). Therefore, Galerkin projection as described above can be applied to create a reduced system (6.2).
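
A minimal Python sketch of this construction computes the POD basis from the snapshot matrix via an SVD and truncates according to (6.30); the function name and defaults are ours.

```python
import numpy as np

def pod_basis(X, d=99.0):
    """POD basis of the snapshot matrix X (n x K): SVD plus truncation
    according to the coverage criterion (6.30)."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    coverage = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(coverage, d / 100.0)) + 1
    return U[:, :r], s
```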

However, as mentioned in Sect. 6.1.1 the cost for evaluating the nonlinear functions q, j is not reduced. In the following we describe some adaptions to POD that have been made to overcome this problem.

1.3.3 Missing Point Estimation

The Missing Point Estimation (MPE) was proposed by Astrid [2, 4] to reduce the cost of updating system information in the solution process of time varying systems arising in computational fluid dynamics. Verhoeven and Astrid [3] brought the MPE approach forward to circuit simulation.

Once a POD basis is constructed, no Galerkin projection is deployed. Instead, a numerical integration scheme is applied, which in general leads to a system of n nonlinear equations, analogous to (6.6), for the r-dimensional unknown \(\mathbf{z}_l\) that approximates \(\mathbf{z}(t_l)\). In MPE this system is reduced to dimension g, with r ≤ g < n, by discarding n − g equations. Formally this can be described by multiplying the system with a selection matrix \(\mathbf{P}_g \in \{0,1\}^{g\times n}\), stating a g-dimensional overdetermined problem

$$\displaystyle{ \alpha \mathbf{\bar{q}}(\mathbf{V}\mathbf{z}_{l}) + \mathbf{P}_{g}\boldsymbol{\beta }_{l} + \mathbf{\bar{j}}(\mathbf{V}\mathbf{z}_{l}) + \mathbf{P}_{g}\mathbf{B}\mathbf{u}(t_{l}) = \mathbf{0}, }$$
(6.31)

with \(\mathbf{\bar{q}}(\mathbf{V}\mathbf{z}_{l}) = \mathbf{P}_{g}\mathbf{q}(\mathbf{V}\mathbf{z}_{l})\) and \(\mathbf{\bar{j}}(\mathbf{V}\mathbf{z}_{l}) = \mathbf{P}_{g}\mathbf{j}(\mathbf{V}\mathbf{z}_{l})\). The system (6.31) is solved at each timepoint \(t_l\) for \(\mathbf{z}_l\) in the least-squares sense [3, 41, 44, 45].

The effect of \(\mathbf{P}_g\) operating on q(⋅) and j(⋅) is the same as evaluating only the g ≪ n components of q and j corresponding to the columns in which \(\mathbf{P}_g\) has a 1.

The choice of \(\mathbf{P}_g\) is motivated by identifying the g most dominant state variables, i.e., components of x. In terms of the POD basis, this is connected to restricting the orthogonal V to \(\mathbf{\tilde{V}} = \mathbf{P}_{g}\mathbf{V} \in \mathbb{R}^{g\times r}\) in an optimal way. This in turn comes down to

$$\displaystyle{ \min _{\mathbf{P}_{g}}\|\left (\mathbf{\tilde{V}}^{T}\mathbf{\tilde{V}}\right )^{-1} -\mathbf{I}_{ r\times r}\|. }$$
(6.32)

Details on reasoning and solving (6.32) can be found in [4].
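
For illustration only, the quality measure in (6.32) for one given candidate selection can be evaluated as in the following sketch; the search over selections itself, discussed in [4], is not reproduced.

```python
import numpy as np

def mpe_criterion(V, rows):
    """Evaluate the selection quality of (6.32) for a candidate set of
    retained rows (i.e. for a given P_g): how far (V~^T V~)^{-1} is from
    the identity. Smaller values indicate a better selection."""
    V_tilde = V[rows, :]                       # V~ = P_g V
    M = np.linalg.inv(V_tilde.T @ V_tilde)
    return np.linalg.norm(M - np.eye(V.shape[1]))
```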

1.3.4 Adapted POD

A second approach to reduce the work of evaluating the nonlinear functions, Adapted POD, was proposed in [41, 43–45]. Having computed an SVD (6.29) of the snapshot matrix, a projection matrix V is not defined directly from the singular values and vectors. Instead, the matrix \(\mathbf{L} = \mathbf{U}\boldsymbol{\varSigma } \in \mathbb{R}^{n\times n}\), with \(\boldsymbol{\varSigma } = \text{diag}(\sigma _{1},\mathop{\ldots },\sigma _{n})\), is defined. Hence, L arises from scaling the left singular vectors with the corresponding singular values. Although L is not orthogonal, its columns are mutually orthogonal. Next we transform the original system (6.1a) by writing \(\mathbf{x}(t) = \mathbf{L}\mathbf{w}(t)\) with \(\mathbf{w}(t) \in \mathbb{R}^{n}\) and using the Galerkin approach:

$$\displaystyle{ \frac{d} {dt}\left [\mathbf{L}^{T}\mathbf{q}(\mathbf{L}\mathbf{w}(t))\right ] + \mathbf{L}^{T}\mathbf{j}(\mathbf{L}\mathbf{w}(t)) + \mathbf{L}^{T}\mathbf{B}\mathbf{u}(t) = \mathbf{0}. }$$
(6.33)

At this point, L and \(\mathbf{L}^T\) are treated as two different matrices, one acting on the argument of the functions, the other on their values. For L and \(\mathbf{L}^T\) we identify the r and g most dominant columns, respectively. A measure for the significance of a column vector \(v \in \mathbb{R}^{n}\) is its 2-norm \(\|v\|_{2}\).

As the columns of L are ordered according to the singular values, we pick the first r columns in this case. Now L and \(\mathbf{L}^T\) are approximated by matrices that agree with the respective matrix in the r and g selected columns but have the remaining n − r and n − g columns, respectively, set to \(\mathbf{0} \in \mathbb{R}^{n}\). This can be expressed with the help of selection matrices \(\mathbf{P}_r \in \{0,1\}^{r\times n}\) and \(\mathbf{P}_g \in \{0,1\}^{g\times n}\), respectively:

$$\displaystyle{ \mathbf{L} \approx \mathbf{L}\mathbf{P}_{r}^{T}\mathbf{P}_{ r}\quad \text{and}\quad \mathbf{L}^{T} \approx \mathbf{L}^{T}\mathbf{P}_{ g}^{T}\mathbf{P}_{ g}. }$$
(6.34)

We may conclude \(\mathbf{L}^{T} \approx \mathbf{P}_{r}^{T}\mathbf{P}_{r}\mathbf{L}^{T}\mathbf{P}_{g}^{T}\mathbf{P}_{g}\), insert these approximations in (6.33) and multiply with \(\mathbf{P}_r\), bearing in mind that \(\mathbf{P}_{r}\mathbf{P}_{r}^{T} = \mathbf{I}_{r\times r}\):

$$\displaystyle{ \frac{d} {dt}\left [\mathbf{P}_{r}\mathbf{L}^{T}\mathbf{P}_{ g}^{T}\mathbf{P}_{ g}\mathbf{q}(\mathbf{L}\mathbf{P}_{r}^{T}\mathbf{P}_{ r}\mathbf{\tilde{w}})\right ]+\mathbf{P}_{r}\mathbf{L}^{T}\mathbf{P}_{ g}^{T}\mathbf{P}_{ g}\mathbf{j}(\mathbf{L}\mathbf{P}_{r}^{T}\mathbf{P}_{ r}\mathbf{\tilde{w}})+\mathbf{P}_{r}\mathbf{L}^{T}\mathbf{B}\mathbf{u} = \mathbf{0}. }$$
(6.35)

Note that, due to the approximations of L and \(\mathbf{L}^T\), in the above equation w has changed to \(\mathbf{\tilde{w}}\), which can merely be an approximation of the former. We introduce \(\boldsymbol{\varSigma }_{r} = \text{diag}(\sigma _{1},\mathop{\ldots },\sigma _{r})\) and let \(\mathbf{V} \in \mathbb{R}^{n\times r}\) be the first r columns of U. In this way we have \(\mathbf{L}\mathbf{P}_{r}^{T} = \mathbf{V}\boldsymbol{\varSigma }_{r}\). Finally we scale (6.35) with \(\boldsymbol{\varSigma }_{r}^{-1}\) and introduce a new unknown \(\mathbf{z} =\boldsymbol{\varSigma } _{r}\mathbf{P}_{r}\mathbf{\tilde{w}} \in \mathbb{R}^{r}\), from which we can reconstruct the full state by the approximation \(\mathbf{x} \approx \mathbf{V}\mathbf{z}\). We end up with

$$\displaystyle{ \frac{d} {dt}\left [\mathbf{W}_{r,g}\mathbf{\bar{q}}(\mathbf{V}\mathbf{z})\right ] + \mathbf{W}_{r,g}\mathbf{\bar{j}}(\mathbf{V}\mathbf{z}) + \mathbf{\tilde{B}}\mathbf{u}(t) = \mathbf{0}, }$$
(6.36)

with \(\mathbf{\bar{q}}(\mathbf{V}\mathbf{z}) = \mathbf{P}_{g}\mathbf{q}(\mathbf{V}\mathbf{z})\), \(\mathbf{\bar{j}}(\mathbf{V}\mathbf{z}) = \mathbf{P}_{g}\mathbf{j}(\mathbf{V}\mathbf{z})\), \(\mathbf{W}_{r,g} = \mathbf{V}^{T}\mathbf{P}_{g}^{T} \in \mathbb{R}^{r\times g}\) and \(\mathbf{\tilde{B}} = \mathbf{V}^{T}\mathbf{B}\).

Here \(\mathbf{P}_g\) has the same effect as noted in the previous subsection: not the full nonlinear functions q and j have to be evaluated, but only g components.

1.3.5 Discrete Empirical Interpolation

Recently, Chaturantabut and Sorensen [12, 13] presented the Discrete Empirical Interpolation Method (DEIM) as a further modification of POD. It originates from partial differential equations (PDEs) where the nonlinearities exhibit a special structure. It can, however, be applied to general nonlinearities as well. We give a brief introduction of how this may look in circuit simulation problems.

Given a nonlinear function \(\mathbf{f}: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}\), the essential idea of DEIM is to approximate f(x) by projecting it on a subspace, spanned by the basis \(\{\mathbf{u}_{1},\mathop{\ldots },\mathbf{u}_{g}\} \subset \mathbb{R}^{n}\):

$$\displaystyle{ \mathbf{f}(\mathbf{x}) \approx \mathbf{U}\mathbf{c}(\mathbf{x}), }$$
(6.37)

where \(\mathbf{U} = (\mathbf{u}_{1},\mathop{\ldots },\mathbf{u}_{g}) \in \mathbb{R}^{n\times g}\) and \(\mathbf{c}(\mathbf{x}) \in \mathbb{R}^{g}\) is the coefficient vector. Forcing equality in (6.37) would state an overdetermined system for the g < n coefficients \(\mathbf{c}(\mathbf{x})\). Instead, accordance in g rows is required, which can be expressed by

$$\displaystyle{ \mathbf{P}_{g}\mathbf{f}(\mathbf{x}) = (\mathbf{P}_{g}\mathbf{U})\mathbf{c}(\mathbf{x}), }$$
(6.38)

with a selection matrix \(\mathbf{P}_g \in \{0,1\}^{g\times n}\). If \(\mathbf{P}_g\mathbf{U}\) is non-singular, (6.38) has a unique solution \(\mathbf{c}(\mathbf{x})\) and, hence, \(\mathbf{f}(\mathbf{x})\) can be approximated by

$$\displaystyle{ \mathbf{f}(\mathbf{x}) \approx \mathbf{U}\left (\mathbf{P}_{g}\mathbf{U}\right )^{-1}\mathbf{P}_{ g}\mathbf{f}(\mathbf{x}), }$$
(6.39)

which means that \(\mathbf{f}(\mathbf{x})\) is interpolated at the entries specified by \(\mathbf{P}_g\).

In (6.39), \(\mathbf{U}\left (\mathbf{P}_{g}\mathbf{U}\right )^{-1}\) can be computed in advance and, again, the multiplication \(\mathbf{P}_g\mathbf{f}(\mathbf{x})\) corresponds to evaluating only those entries of f addressed by \(\mathbf{P}_g\).

Using the notations introduced before, POD with the DEIM modification yields a reduced model (6.2) with

$$\displaystyle\begin{array}{rcl} \hat{\mathbf{q}}(\mathbf{z}(t)) = \mathbf{\hat{W}}\;\mathbf{\bar{q}}(\mathbf{z}(t)),\quad \hat{\mathbf{j}}(\mathbf{z}(t)) = \mathbf{\hat{W}}\;\mathbf{\bar{j}}(\mathbf{z}(t)),\quad \hat{\mathbf{B}} = \mathbf{V}^{T}\mathbf{B},\quad \hat{\mathbf{C}} = \mathbf{C}\mathbf{V},& &{}\end{array}$$
(6.40)

with \(\mathbf{\hat{W}} = \mathbf{V}^{T}\mathbf{U}\left (\mathbf{P}_{g}\mathbf{U}\right )^{-1}\), \(\mathbf{\bar{q}}(\cdot ) = \mathbf{P}_{g}\mathbf{q}(\cdot )\) and \(\mathbf{\bar{j}}(\cdot ) = \mathbf{P}_{g}\mathbf{j}(\cdot )\). Here POD provides the state-space part of the reduction, i.e., V, and DEIM determines the subspace onto which q and j are projected, hence the columns of the matrix \(\mathbf{U} \in \mathbb{R}^{n\times g}\) and the selection \(\mathbf{P}_g\).

The reduced subspace suitable for representing a nonlinear function f is constructed from an SVD of a matrix \(\mathbf{F} = (\mathbf{f}(\mathbf{x}_{1}),\mathop{\ldots },\mathbf{f}(\mathbf{x}_{K})) \in \mathbb{R}^{n\times K}\) whose columns are snapshots of the function evaluations. The matrix U in (6.39) then consists of the g most dominant left singular vectors of F.

The core of DEIM is the construction of the selection \(\mathbf{P}_g \in \{0,1\}^{g\times n}\). A set of indices \(\{\rho _{1},\mathop{\ldots },\rho _{g}\} \subset \{ 1,\mathop{\ldots },n\}\), determined by the DEIM algorithm, defines the selection matrix, meaning that \(\mathbf{P}_g\) has a 1 in the ith row and \(\rho_i\)th column (for \(i = 1,\mathop{\ldots },g\)) and 0 elsewhere.

The first index \(\rho_1\) is chosen to be the index of the largest (in absolute value) entry of \(\mathbf{u}_1\). In step \(l = 1,\mathop{\ldots },g - 1\) the residual

$$\displaystyle{ \mathbf{r}_{l+1} = \mathbf{u}_{l+1} -\mathbf{U}_{l}\left (\mathbf{P}_{l}\mathbf{U}_{l}\right )^{-1}\mathbf{P}_{ l}\mathbf{u}_{l+1} }$$

is computed, where \(\mathbf{U}_{l} = (\mathbf{u}_{1},\mathop{\ldots },\mathbf{u}_{l})\) and \(\mathbf{P}_l \in \{0,1\}^{l\times n}\) is constructed from the indices \(\rho _{1},\mathop{\ldots },\rho _{l}\) (cp. (6.39)). Then the index of the entry of the residual \(\mathbf{r}_{l+1}\) with the largest magnitude is taken as index \(\rho _{l+1}\).

Setting up the selection matrix with this algorithm, \(\mathbf{P}_g\mathbf{U}\) in (6.37) is guaranteed to be regular. For a detailed description and discussion, including error estimates, we refer to [12, 13].
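
A compact sketch of the greedy index selection just described, together with the resulting approximation (6.39), could look as follows; variable names are ours and no attempt at numerical robustness is made.

```python
import numpy as np

def deim_indices(U):
    """DEIM greedy selection of interpolation indices from the
    left singular vectors U = (u_1, ..., u_g)."""
    n, g = U.shape
    rho = [int(np.argmax(np.abs(U[:, 0])))]
    for l in range(1, g):
        # residual of u_{l+1} w.r.t. interpolation on the current indices
        c = np.linalg.solve(U[rho, :l], U[rho, l])
        r = U[:, l] - U[:, :l] @ c
        rho.append(int(np.argmax(np.abs(r))))
    return np.array(rho)

def deim_approx(f_selected, U, rho):
    """Approximation (6.39): f(x) ~ U (P_g U)^{-1} P_g f(x); only the
    g entries of f at the indices rho have to be evaluated."""
    return U @ np.linalg.solve(U[rho, :], f_selected)
```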

Note:

Originally, DEIM was constructed in the context of discretisation and approximation of PDEs with a special structure of the nonlinearity involved. Considering the network problem (6.1a) that leads to the reduced problem (6.40), we constructed a uniform DEIM approximation, i.e., one U and \(\mathbf{P}_g\) for both nonlinearities q and j involved. This could probably be approached in a different way, too.

1.4 Other Approaches

We shortly address some other approaches. In [5, 6, 8, 48] Krylov-subspace methods are applied to bilinear and quadratic-bilinear ODE systems. One exploits the observation that several nonlinear functions can be generated by first extending the system with additional unknowns for which simple differential equations are introduced. In [48] also the application to DAEs is discussed. In [22] a transformation from a set of nonlinear differential equations to another, equivalent set of nonlinear differential equations that involve only quadratic terms of the state variables is described, to which Volterra analysis is applied to derive a reduced model.

We already mentioned [20] for nonlinear balancing in which the energy functions arise from solving Hamilton-Jacobi differential equations. Related work is on cross Gramians for dissipative and symmetric nonlinear systems [25, 26].

In [37, 38] interpolating input-output behavior of nonlinear systems is studied. This is related to table modelling.

Fig. 6.2 Nonlinear transmission line

1.5 Numerical Experiments

For testing purposes, a time-domain simulator has been implemented in Octave. The underlying DAE integration scheme used here is CHORAL [23], a Rosenbrock-Wanner type method adapted to circuitry problems. Besides performing transient analysis, TPWL and POD models can be extracted and reused in simulations.

To show the performance of TPWL and POD when applied to an example from circuit design, the nonlinear transmission line in Fig. 6.2, taken from [33], is chosen. Only the diodes introduce the designated nonlinearity to the circuit, as the current \(\imath_d\) traversing a diode is modeled by \(\imath _{d}(v) =\exp (40 \cdot v) - 1\), where v is the voltage drop between the diode’s terminals. The resistors and capacitors contained in the model have unit resistance and capacitance \((R = C = 1)\), respectively. The current source between node 1 and ground marks the input to the system, \(u(t) = \imath(t)\), and the output of the system is chosen to be the voltage at node 1: \(y(t) = v_1(t)\).

Introducing the state vector \(\mathbf{x}(t) = (v_{1}(t),\mathop{\ldots },v_{N}(t))^{T} \in \mathbb{R}^{N}\), where \(v_i(t)\) denotes the voltage at node \(i \in \{ 1,\mathop{\ldots },N\}\), modified nodal analysis yields:

$$\displaystyle\begin{array}{rcl} \frac{d} {dt}\mathbf{x}(t) + \mathbf{j}(\mathbf{x}(t)) + \mathbf{B}u(t)& =& \mathbf{0} \\ y(t)& =& \mathbf{C}\mathbf{x}(t),{}\end{array}$$
(6.41)

where \(\mathbf{B} = \mathbf{C}^{T} = (1,0,\mathop{\ldots },0)^{T} \in \mathbb{R}^{N}\) and \(\mathbf{j}: \mathbb{R}^{N} \rightarrow \mathbb{R}^{N}\) with

$$\displaystyle{ \mathbf{j}(\mathbf{x}) = \left (\begin{array}{*{10}c} 2 &-1\\ -1 & 2&-1 \\ & \ddots & \ddots & \ddots\\ & &-1 & 2 &-1 \\ & & &-1& 1 \end{array} \right )\cdot \mathbf{x}-\left (\begin{array}{*{10}c} 2 - e^{40x_{1}} - e^{40(x_{1}-x_{2})} \\ e^{40(x_{1}-x_{2})} - e^{40(x_{2}-x_{3})}\\ \vdots \\ e^{40(x_{N-2}-x_{N-1})} - e^{40(x_{N-1}-x_{N})} \\ e^{40(x_{N-1}-x_{N})} - 1 \end{array} \right ) }$$

We choose N = 100, causing a problem of dimension n = 100.
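
For concreteness, the nonlinear function j of (6.41) can be transcribed directly from the formula above. The following Python sketch is our own transcription (the experiments reported here were run with an Octave implementation):

```python
import numpy as np

def j_func(x):
    """Nonlinear function j of the transmission line model (6.41)."""
    N = x.size
    # tridiagonal conductance part: diagonal 2 (last entry 1), off-diagonals -1
    Kx = 2.0 * x
    Kx[-1] = x[-1]
    Kx[:-1] -= x[1:]
    Kx[1:] -= x[:-1]
    # diode contributions i_d(v) = exp(40 v) - 1 between neighbouring nodes
    e = np.exp(40.0 * (x[:-1] - x[1:]))      # e_k = exp(40 (x_k - x_{k+1}))
    g = np.empty(N)
    g[0] = 2.0 - np.exp(40.0 * x[0]) - e[0]
    g[1:-1] = e[:-1] - e[1:]
    g[-1] = e[-1] - 1.0
    return Kx - g
```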

For extracting a model, a shifted Heaviside function was used as training input. Resimulation was done both with the training input and with a cosine function on the interval \([t_{\text{start}}, t_{\text{end}}] = [0, 10]\):

$$\displaystyle\begin{array}{rcl} u_{\text{train}}(t) = H(t - 3) = \left \{\begin{array}{@{}l@{\quad }l@{}} 0\quad &t < 3\\ 1\quad &t \geq 3 \end{array} \right.\quad \quad u_{\text{resim}}(t) = \frac{1} {2}\left (1 +\cos \left ( \frac{2\pi } {10}t\right )\right ).& & {}\\ \end{array}$$

The TPWL model was extracted with the Arnoldi method as suggested in [33], leading to an order reduced model of dimension 10. For choosing linearization points, the strategy proposed by Rewieński with α = 0.0167 in (6.14) has been tested. With this setting, 27 linear models are constructed. Also the extended strategy described in Voß [49] is implemented, but it does not show much different results for the transmission line. A more detailed discussion of the model extraction and statistics on which models are chosen can be found in [35, 36].

For the transmission line, also a plain POD model as well as a POD model modified with the Discrete Empirical Interpolation Method (DEIM) were constructed. Choosing d = 99.9 in (6.30) yields a reduced model of dimension 4. Applying the DEIM algorithm, the nonlinear functions q and j were reduced to order 5. Figure 6.3 displays the singular values of the snapshots collected during a training run and the behaviour of the coverage function (6.30). Note that only 38 singular values are shown, although the full system is of dimension 100. This is caused by the time domain simulation: with the tolerances specified for the timestepping mechanism, only 38 time steps were necessary to resolve the system. However, also with more snapshots, the decay of the singular values does not change remarkably.

Fig. 6.3 Transmission line: singular values (+) & coverage (∗)

Fig. 6.4 Nonlinear transmission line: resimulation results

Figures 6.4 and 6.5 show the trajectories, i.e., the behaviour in time, of the voltages at nodes 1 and 10 when the training signal and when the cosine-like signal is applied at the input, respectively. The plots show the signals reproduced by using the full model, the TPWL model and the plain POD and DEIM-adapted POD models. Slight deviations from the reference solution are visible but, in total, a good match is observable. However, the TPWL model seems to have problems following the reference solution when a signal different from the training input is applied. This indicates that there are still improvements possible.

Fig. 6.5 Transmission line: different input

Finally, Table 6.1 gathers the performance of the models, measured in time consumption. Clearly, simulation with the TPWL model is cheaper than using the full network, as the full nonlinearity does not have to be evaluated. Still, POD adapted with DEIM is superior, as no decision has to be made about which model to use. Furthermore, as predicted in Sect. 6.1.1, applying only projection without taking care of the nonlinearity does not guarantee a model that is cheaper to evaluate: the plain POD model, used for simulation, causes equal or even increased computational expenses.

Table 6.1 Transmission line: performance of nonlinear MOR techniques

2 Model Order Reduction for Multi-terminal Circuits

Analysis of effects due to parasitics is of vital importance during the design of large-scale integrated circuits and derived products. One way to model parasitics is by means of parasitic extraction, which results in large linear RCL(k) networks. In ESD analysis [65, 75], for instance, the interconnect network is modeled by resistors with resistances that are based on the metal properties. In other (RF) applications one needs RC or even RCLk extractions to deal accurately with higher frequencies as well.

The resulting parasitic networks may contain up to millions of resistors, capacitors, and inductors, hundreds of thousands of internal nodes, and thousands of external nodes (nodes with connections to active elements such as transistors). Simulation of such large networks within reasonable time is often not possible [62, 63], and including such networks in full system simulations may even be infeasible. Hence, there is a need for much smaller networks that accurately or even exactly describe the behavior of the original network, but allow for fast analysis.

In this section we describe recently developed methods for the reduction of large R networks and present a new approach for the reduction of large RC networks. We show how insights from graph theory, numerical linear algebra, and matrix reordering algorithms can be used to construct a reduced network that preserves sparsity, especially for circuits with many terminals (ports). Hence it allows for the same number of external nodes, but needs much fewer internal nodes and circuit elements (resistors and capacitors). Circuit synthesis is applied after model reduction, and the resulting reduced netlists are tested with industrial circuit simulators. For related literature we refer to [55–57].

The section is organized as follows. Section 6.2.1 revisits recent work on the reduction of R networks [83, 84]. It provides the basis for understanding how graph theoretical tools can be used to significantly improve the sparsity of the reduced models, which are later synthesized [70] into reduced netlists. Section 6.2.2 deals with the reduction of RC networks. Section 6.2.2.1 first reviews an existing method which employs Pole Analysis via Congruence Transformations (PACT) [73] to reduce RC netlists with many terminals. In Sect. 6.2.2.2 the new method Sparse Modal Approximation (SparseMA) is presented, where graph-theoretical tools are brought in to enhance sparsity preservation for the reduced models. The numerical results for both R and RC netlist reduction are presented in Sect. 6.2.3. Section 6.2.4 concludes.

2.1 Reduction of R Networks

In this section we review the approach for reducing R networks as developed in [83, 84]. Reduction of R networks, i.e., networks that consist of resistors only, is needed in electro-static discharge (ESD) analysis, where large extracted R networks are used to model the interconnect. Accurate modeling of interconnect is required here, since the costs involved may vary from a few cents to millions if, due to interconnect failures, a respin of the chip is needed. An example of a damaged piece of interconnect that was too small to conduct the amount of current is shown in Fig. 6.6.

Fig. 6.6 Example of a piece of interconnect that was damaged because it was too small to conduct the amount of current caused by a peak charge

2.1.1 Circuit Equations and Matrices

Kirchhoff’s Current Law and Ohm’s Law for resistors lead to the following system of equations for a resistor network with N resistors (resistor i having resistance r i ) and n nodes (n < N):

$$\displaystyle{ \left [\begin{array}{*{10}c} R &P\\ -P^{T } & 0 \end{array} \right ]\left [\begin{array}{*{10}c} \mathbf{i}_{b}\\ \mathbf{v} \end{array} \right ] = \left [\begin{array}{*{10}c} \mathbf{0}\\ \mathbf{i} _{n} \end{array} \right ], }$$
(6.42)

where \(R = \mbox{ diag}(r_{1},\ldots,r_{N}) \in \mathbb{R}^{N\times N}\) is the resistor matrix, P ∈ {−1, 0, 1}N×n is the incidence matrix, \(\mathbf{i}_{b} \in \mathbb{R}^{N}\) are the resistor currents, \(\mathbf{i}_{n} \in \mathbb{R}^{n}\) are the injected node currents, and \(\mathbf{v} \in \mathbb{R}^{n}\) are the node voltages.

The MNA (modified nodal analysis) formulation [60, 76] can be derived from (6.42) by eliminating the resistor currents \(\mathbf{i}_{b} = -R^{-1}P\mathbf{v}\):

$$\displaystyle{ G\mathbf{v} = \mathbf{i}_{n}, }$$
(6.43)

where \(G = P^{T}R^{-1}P \in \mathbb{R}^{n\times n}\) is symmetric positive semidefinite. Since currents can only be injected in external nodes, and not in internal nodes of the network, system (6.43) has the following structure:

$$\displaystyle{ \left [\begin{array}{*{10}c} G_{11} & G_{12} \\ G_{12}^{T}&G_{22} \end{array} \right ]\left [\begin{array}{*{10}c} \mathbf{v}_{e} \\ \mathbf{v}_{i}\end{array} \right ] = \left [\begin{array}{*{10}c} B\\ 0 \end{array} \right ]\mathbf{i}_{e}, }$$
(6.44)

where \(\mathbf{v}_{e} \in \mathbb{R}^{n_{e}}\) and \(\mathbf{v}_{i} \in \mathbb{R}^{n_{i}}\) are the voltages at external and internal nodes, respectively (\(n = n_{e} + n_{i}\)), \(\mathbf{i}_{e} \in \mathbb{R}^{n_{e}}\) are the currents injected in external nodes, \(B \in \{-1,0,1\}^{n_{e}\times n_{e}}\) is the incidence matrix for the current injections, and \(G_{11} = G_{11}^{T} \in \mathbb{R}^{n_{e}\times n_{e}}\), \(G_{12} \in \mathbb{R}^{n_{e}\times n_{i}}\), and \(G_{22} = G_{22}^{T} \in \mathbb{R}^{n_{i}\times n_{i}}\). The block \(G_{11}\) is also referred to as the terminal block.

A current source (with index s) between terminals a and b with current j results in contributions \(B_{a,s} = 1\), \(B_{b,s} = -1\), and \(\mathbf{i}_e(s) = j\). If current is only injected in a terminal a (for instance if a connects the network to the top-level circuit), the contributions are \(B_{a,s} = 1\) and \(\mathbf{i}_e(s) = j\).

Finally, systems (6.42)–(6.44) must be made consistent by grounding a node gnd, i.e., setting v(gnd) = 0 and removing the corresponding equations. In the following we will still use the notation G for the grounded system matrix, if this does not lead to confusion.
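
As a small illustration, the (grounded) conductance matrix \(G = P^{T}R^{-1}P\) of (6.43) can be assembled by stamping the individual resistors; the dense storage and all names in this sketch are ours, while real extracted networks require sparse data structures.

```python
import numpy as np

def conductance_matrix(n, resistors, gnd=0):
    """Assemble G = P^T R^{-1} P of (6.43) by stamping each resistor
    (a, b, r): conductance 1/r is added on the diagonal entries (a, a),
    (b, b) and subtracted on (a, b), (b, a). The ground row/column is
    removed to make the system consistent."""
    G = np.zeros((n, n))
    for a, b, r in resistors:
        g = 1.0 / r
        G[a, a] += g
        G[b, b] += g
        G[a, b] -= g
        G[b, a] -= g
    keep = [i for i in range(n) if i != gnd]
    return G[np.ix_(keep, keep)]
```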

2.1.2 Problem Formulation

The problem is: given a very large resistor network described by (6.42), find an equivalent network with (a) the same external nodes, (b) exactly the same path resistances between external nodes, (c) \(\hat{n} \ll n\) internal nodes, and (d) \(\hat{r} \ll r\) resistors. Additionally, (e) the reduced network must be realizable as a netlist so that it can be (re)used in the design flow as a subcircuit of large systems.

Simply eliminating all internal nodes leads to an equivalent network that satisfies conditions (a)–(c), but violates (d) and (e): for a large number m of external nodes, the number of resistors \(\hat{r} = (m^{2} - m)/2\) in the dense reduced network is in general much larger than the number of resistors in the sparse original network (r of O(n)), leading to increased memory and CPU requirements.

2.1.3 Existing Approaches

There are several approaches to deal with large resistor networks. In some cases the need for an equivalent reduced network can be circumvented: due to the sparsity of the original network, memory usage and computational complexity are in principle not an issue, since solving linear systems with the related conductance matrices is typically of complexity \(O(n^{\alpha})\), where 1 < α ≤ 2, instead of the traditional \(O(n^{3})\) [79]. Of course, α depends on the sparsity and will rapidly increase as sparsity decreases. This also explains why eliminating all internal nodes does not work in practice: the large reduction in unknowns is easily undone by the enormous increase in the number of resistors, mutually connecting all external nodes.

However, if we want to (re)use the network in full system simulations, a reduced equivalent network is needed to limit simulation times or to make simulation possible at all. In [77], approaches based on large-scale graph partitioning packages such as (h)METIS [72] are described, but they are only applied to small networks. Structure-preserving projection methods for model reduction [66, 86], finally, have the disadvantage that they lead to dense reduced-order models if the number of terminals is large. Commercial software [59, 64] is also available for the reduction of parasitic networks.

2.1.4 Improved Approach

Knowing that eliminating all internal nodes is not an option and that projection methods lead to dense reduced-order models, we use concepts from matrix reordering algorithms such as AMD [54] and BBBD [88], usually used as a preprocessing step for (parallel) LU or Cholesky factorization, to determine which nodes to eliminate. The fill-in reducing properties of these methods also guarantee sparsity of the reduced network. Similar ideas have been used in [77, 89].

Our main motivation for this approach is that large resistor networks in ESD (electrostatic discharge) analysis typically are extracted networks with a structure that is related to the underlying (interconnect) layout. Unfortunately, extracted networks are usually produced by extraction software whose algorithms are unknown, and hence the structure of the extracted network is difficult to recover. Standard tools from graph theory, however, can be used to recover at least part of the structure.

Our approach can be summarized as follows:

1. The first step is to compute the strongly connected components [61] of the network. Strongly connected components arise naturally in extracted networks: a piece of interconnect connecting two other elements such as diodes or transistors, for instance, results in an extracted subnetwork with two terminals, disconnected from the rest of the extracted circuit. Splitting the network into its connected components simplifies the reduction problem, because the components can be dealt with one by one.

2. The second step is to selectively eliminate internal nodes in the individual connected components. For resistor networks this can be done using the Schur complement [67], and no approximation error is made. The key here is to eliminate those internal nodes that give the least fill-in. First, (Constrained) AMD [62] is used to reorder the unknowns such that the terminal nodes are among the last to be eliminated. To find the optimal reduction, internal nodes are eliminated one by one in the order computed by AMD, while keeping track of the reduced system with the fewest resistors (a Matlab sketch of this elimination loop is given after this list).

Since the ordering is chosen to minimize fill-in, the resulting reduced matrix is sparse. Note that all operations are exact, i.e., no approximations are made. As a result, the path resistances between external nodes remain equal to the path resistances in the original network.

3. Finally, the reduced conductance matrix can be realized as a reduced resistor network that is equivalent to the original network. This is done easily by unstamping the values of the G matrix into the corresponding resistor values and their node connections in the netlist [69]. Since the number of resistors (and the number of nodes) is smaller than in the original network, the resulting netlist is also smaller in size.
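The elimination loop of step 2 can be sketched in Matlab as follows. This is a minimal sketch under assumed inputs (the grounded sparse conductance matrix G and the index vector ext of external nodes); plain AMD on the internal block stands in here for Constrained AMD, and the resistor count is taken as the number of off-diagonal nonzeros divided by two.

int  = setdiff(1:size(G,1), ext);        % internal nodes
ord  = int(amd(G(int, int)));            % fill-reducing elimination order
perm = [ord, ext(:)'];                   % internals first, terminals last
Gp   = G(perm, perm);
nres = @(M) (nnz(M) - size(M,1)) / 2;    % #resistors ~ off-diagonal nnz / 2
best = Gp; bestCount = nres(Gp);
for j = 1:numel(ord)                     % exact Schur complement elimination
    Gp = Gp(2:end, 2:end) - Gp(2:end, 1) * (Gp(1, 2:end) / Gp(1, 1));
    if nres(Gp) < bestCount              % remember the sparsest reduction
        best = Gp; bestCount = nres(Gp);
    end
end

Since every step is an exact Schur complement, any intermediate Gp (in particular best) still reproduces the original path resistances between the external nodes.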

An additional reduction could be obtained by removing relatively large resistors from the resulting reduced network. However, this introduces an approximation error that might be hard to control a priori, since no sharp upper bounds on the error are available [87]. Another issue, subject to further research, is that the optimal ratio of the number of (internal) nodes to resistors (sparsity) may also depend on the ratio of external to internal nodes, and on the type of simulation in which the network will be used.

In the following sections we will describe how strongly connected components and fill-in minimizing reorderings can be used for the reduction of RC networks as well.

2.2 Reduction of RC Networks

This section presents the developments for RC netlist reduction, first by reviewing an existing approach called PACT (Pole Analysis via Congruence Transformations). Then, graph-based tools are brought in to enhance sparsity preservation with the novel reduction method, SparseMA (Sparse Modal Approximation).

Following the problem description in [73], consider the modified nodal analysis (MNA) description of an input impedance type RC circuit, driven by input currents:

$$\displaystyle{ (\mathbf{G} + s\mathbf{C})\mathbf{x}(s) = \mathbf{B}\mathbf{u}(s), }$$
(6.45)

where x denotes the node voltages and u the currents injected into the terminals (also called ports or external nodes). The number of internal nodes is n and the number of terminals is p; thus \(\mathbf{G} \in \mathbb{R}^{(p+n)\times (p+n)}\), \(\mathbf{C} \in \mathbb{R}^{(p+n)\times (p+n)}\) and \(\mathbf{B} \in \mathbb{R}^{(p+n)\times p}\). A natural choice for the system outputs is the voltage drops at the terminal nodes, i.e., \(\mathbf{y}(s) = \mathbf{B}^{T}\mathbf{x}(s)\). Thus the transfer function of (6.45) is the input impedance:

$$\displaystyle{ \mathbf{Z}(s) = \frac{\mathbf{y}(s)} {\mathbf{u}(s)} = \mathbf{B}^{T}(\mathbf{G} + s\mathbf{C})^{-1}\mathbf{B}. }$$
(6.46)

Modal approximation is a method to reduce (6.45) by preserving its most dominant eigenmodes. The dominant eigenmodes are a subset of the poles of Z(s) (i.e., of the generalized eigenvalues Λ(−G, C)) and can be computed using specialized eigenvalue solvers (SADPA [80] or SAMDP [82, 85]). For the complete discussion of modal approximation and its implementation we refer to [80, 81, 85]. Here, we emphasize that applying modal approximation to reduce (6.45) directly is unsuitable, especially when the underlying RC circuit has many terminals (inputs). This is because modal approximation does not preserve the structure of B and \(\mathbf{B}^{T}\) during reduction (for ease of understanding we denote this loss of input-output structure as non-preservation of terminals) [69]. Modeling the input-output connectivity of the reduced model would require synthesis via controlled sources at the circuit terminals and, furthermore, would connect all terminals with one another [69]. In this chapter we present several alternatives for reducing RC netlists in which not only the terminals are preserved, but also the sparsity of the reduced models.

Grouping the node voltages so that \(\mathbf{x}_{P} \in \mathbb{R}^{p}\) are the voltages measured at the terminal nodes (ports), and \(\mathbf{x}_{I} \in \mathbb{R}^{n}\) are the voltages at the internal nodes, we can partition (6.45) as follows:

$$\displaystyle\begin{array}{rcl} \left (\left [\!\begin{array}{cc} \mathbf{G}_{P}&\mathbf{G}_{C}^{T} \\ \mathbf{G}_{C}& \mathbf{G}_{I} \end{array} \!\right ] + s\left [\!\begin{array}{cc} \mathbf{C}_{P}&\mathbf{C}_{C}^{T} \\ \mathbf{C}_{C}& \mathbf{C}_{I} \end{array} \!\right ]\right )\left [\!\begin{array}{c} \mathbf{x}_{P} \\ \mathbf{x}_{I} \end{array} \!\right ] = \left [\!\begin{array}{c} \mathbf{B}_{P} \\ \mathbf{0}\end{array} \!\right ]\mathbf{u}.& &{}\end{array}$$
(6.47)

Since no current is injected into internal nodes, the non-zero contribution from the input is \(\mathbf{B}_{P} \in \mathbb{R}^{p\times p}\). After eliminating \(\mathbf{x}_{I}\), system (6.47) is equivalent to:

$$\displaystyle\begin{array}{rcl} \mathop{\underbrace{\mathop{[(\mathbf{G}_{P} + s\mathbf{C}_{P})}}\limits }_{\mathbf{Y}_{P}(s)}& -& \mathop{\underbrace{\mathop{(\mathbf{G}_{C} + s\mathbf{C}_{C})^{T}(\mathbf{G}_{I} + s\mathbf{C}_{I})^{-1}(\mathbf{G}_{C} + s\mathbf{C}_{C})]}}\limits }_{\mathbf{Y}_{I}(s)}\mathbf{x}_{P} = \mathbf{B}_{P}\mathbf{u}{}\end{array}$$
(6.48)
$$\displaystyle\begin{array}{rcl} \mathbf{Y}(s)& =& \mathbf{Y}_{P}(s) -\mathbf{Y}_{I}(s){}\end{array}$$
(6.49)

In (6.48) the matrix blocks \((\mathbf{G}_{P} + s\mathbf{C}_{P})\) corresponding to the circuit terminals are isolated. Applying modal approximation to \(\mathbf{Y}_{I}(s)\) would reduce the system and preserve the location of the terminals. This would involve, for instance, computing the dominant eigenmodes of \((-\mathbf{G}_{I},\mathbf{C}_{I})\) via a variant of SAMDP (called here frequency-dependent SAMDP, because the input-output matrices \((\mathbf{G}_{C} + s\mathbf{C}_{C})\) depend on the frequency s). We have implemented this approach, but it turns out that a large number of dominant eigenmodes of \((-\mathbf{G}_{I},\mathbf{C}_{I})\) is needed to capture the DC behavior and offset of the full system Y(s). Instead, two alternatives that improve the quality of the approximation are presented: an existing method called PACT (Pole Analysis via Congruence Transformations) [73] and a novel graph-based reduction called SparseMA (Sparse Modal Approximation).
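For reference, evaluating the exact port admittance (6.48)–(6.49) at a single frequency point takes only a few Matlab lines. This is a sketch; the partitioned blocks GP, GC, GI, CP, CC, CI of (6.47) are assumed given under those (assumed) variable names.

s   = 2i * pi * 1e9;                     % example frequency point: 1 GHz
YP  = GP + s * CP;                       % terminal part Y_P(s)
GCs = GC + s * CC;
YI  = GCs.' * ((GI + s * CI) \ GCs);     % internal part Y_I(s)
Y   = YP - YI;                           % p x p port admittance, eq. (6.49)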

2.2.1 Existing Method: PACT

In [73] the authors propose to capture the DC behavior and offset of Y(s) via a congruence transformation which reveals the first two moments of Y(s), as follows. Since \(\mathbf{G}_{I}\) is symmetric positive definite, the Cholesky factorization \(\mathbf{L}\mathbf{L}^{T} = \mathbf{G}_{I}\) exists. Using the following congruence transformation:

$$\displaystyle\begin{array}{rcl} \mathbf{X}= \left [\!\begin{array}{cc} \mathbf{I} & \ \ \mathbf{0}\\ -\! \mathbf{G} _{I }^{-1 }\mathbf{G} _{C } &\ \ \mathbf{L} ^{-T} \end{array} \!\right ],\ \ \mathbf{G}^{{\prime}} = \mathbf{X}^{T}\mathbf{G}\mathbf{X} = \left [\!\begin{array}{cc} \mathbf{G}^{{\prime}}_{P}&\mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{array} \!\right ],\ \ \mathbf{C}^{{\prime}} = \mathbf{X}^{T}\mathbf{C}\mathbf{X} = \left [\!\begin{array}{cc} \mathbf{C}^{{\prime}}_{P}&\mathbf{C}^{{\prime}T}_{C} \\ \mathbf{C}^{{\prime}}_{C}& \mathbf{C}^{{\prime}}_{I} \end{array} \!\right ]& &{}\end{array}$$
(6.50)

Eqs. (6.48) and (6.49) are rewritten as:

$$\displaystyle\begin{array}{rcl} \mathop{\underbrace{\mathop{[(\mathbf{G}^{{\prime}}_{P} + s\mathbf{C}^{{\prime}}_{P})}}\limits }_{\mathbf{Y}^{{\prime}}_{P}(s)}& -& \mathop{\underbrace{\mathop{s^{2}\mathbf{C}^{{\prime}T}_{C}(\mathbf{I} + s\mathbf{C}^{{\prime}}_{I})^{-1}\mathbf{C}^{{\prime}}_{C}]}}\limits }_{\mathbf{Y}^{{\prime}}_{I}(s)}\mathbf{x}^{{\prime}}_{ P} = \mathbf{B}_{P}\mathbf{u}{}\end{array}$$
(6.51)
$$\displaystyle\begin{array}{rcl} \mathbf{Y}^{{\prime}}(s)& =& \mathbf{Y}^{{\prime}}_{ P}(s) -\mathbf{Y}^{{\prime}}_{ I}(s),{}\end{array}$$
(6.52)

where:

$$\displaystyle\begin{array}{rcl} \mathbf{G}^{{\prime}}_{ P}& =& \mathbf{G}_{P} -\mathbf{G}_{C}^{T}\mathbf{M},\ \ \ \mathbf{M} = \mathbf{G}_{ I}^{-1}\mathbf{G}_{ C}{}\end{array}$$
(6.53)
$$\displaystyle\begin{array}{rcl} \mathbf{C}^{{\prime}}_{ P}& =& \mathbf{C}_{P} -\mathbf{N}^{T}\mathbf{M} -\mathbf{M}^{T}\mathbf{C}_{ C},\ \ \ \mathbf{N} = \mathbf{C}_{C} -\mathbf{C}_{I}\mathbf{M}{}\end{array}$$
(6.54)
$$\displaystyle\begin{array}{rcl} \mathbf{C}^{{\prime}}_{ C}& =& \mathbf{L}^{-1}\mathbf{N},\ \ \ \mathbf{C}^{{\prime}}_{ I} = \mathbf{L}^{-1}\mathbf{C}_{ I}\mathbf{L}^{-T}.{}\end{array}$$
(6.55)
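In Matlab, the transformed blocks (6.53)–(6.55) can be computed as sketched below; the variable names GP, GC, GI, CP, CC, CI for the blocks of (6.47) are assumptions of this sketch, not the authors' code.

L   = chol(GI, 'lower');       % Cholesky factor: G_I = L * L'
M   = GI \ GC;                 % M = G_I^{-1} G_C
GPp = GP - GC' * M;            % G'_P, eq. (6.53)
N   = CC - CI * M;
CPp = CP - N' * M - M' * CC;   % C'_P, eq. (6.54)
CCp = L \ N;                   % C'_C, eq. (6.55)
CIp = L \ (CI / L');           % C'_I = L^{-1} C_I L^{-T}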

In (6.51), the term \(\mathbf{Y}'_{P}(s)\) captures the first two moments of \(\mathbf{Y}'(s)\) and is preserved in the reduced model. The reduction is performed on \(\mathbf{Y}'_{I}(s)\) only. In [73] this is done via modal approximation, as described next. Using the symmetric eigendecomposition \(\mathbf{C}^{{\prime}}_{I} = \mathbf{U}\varLambda _{I}^{{\prime}}\mathbf{U}^{T}\), \(\mathbf{U}^{T}\mathbf{U} = \mathbf{I}\), the system matrices (6.50) are block-diagonalized as follows:

$$\displaystyle\begin{array}{rcl} \mathbf{X}^{{\prime}}& =& \left [\!\begin{array}{cc} \mathbf{I} & \ \ \mathbf{0} \\ \mathbf{0}&\ \ \mathbf{U} \end{array} \!\right ],\ \ \mathbf{G}^{{\prime\prime}} = \mathbf{X}^{{\prime}T\!}\mathbf{G}^{{\prime}}\mathbf{X}^{{\prime}} = \left [\!\begin{array}{cc} \mathbf{G}^{{\prime}}_{P}&\mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{array} \!\right ] = \mathbf{G}^{{\prime}}{}\end{array}$$
(6.56)
$$\displaystyle\begin{array}{rcl} \mathbf{C}^{{\prime\prime}}& =& \mathbf{X}^{{\prime}T\!}\mathbf{C}^{{\prime}}\mathbf{X}^{{\prime}} = \left [\!\begin{array}{cc} \mathbf{C}^{{\prime}}_{P} & \mathbf{C}^{{\prime}T}_{C} \mathbf{U} \\ \mathbf{U}^{T}\mathbf{C}^{{\prime}}_{C}&\mathbf{U}^{T}\mathbf{C}^{{\prime}}_{I}\mathbf{U} \end{array} \!\right ] = \left [\!\begin{array}{cc} \mathbf{C}^{{\prime}}_{ P} &\mathbf{C}^{{\prime\prime}T}_{ C} \\ \mathbf{C}^{{\prime\prime}}_{C}& \varLambda _{I}^{{\prime}} \end{array} \!\right ]{}\end{array}$$
(6.57)
$$\displaystyle\begin{array}{rcl} \mathbf{Y}^{{\prime\prime}}(s)& =& \mathbf{Y}_{ P}^{{\prime}}(s) - s^{2}[\mathbf{C}^{{\prime\prime}T}_{ C}(\mathbf{I} + s\varLambda _{ I}^{{\prime}})^{\!-\!1}\mathbf{C}^{{\prime\prime}}_{ C}]{}\end{array}$$
(6.58)

The reduced model is obtained by selecting only k of the n eigenvalues from \(\varLambda'_{I}\):

$$\displaystyle\begin{array}{rcl} \mathbf{Y}_{k}^{{\prime\prime}}(s) = \mathbf{Y}_{ P}^{{\prime}}(s) - s^{2}\sum _{ i=1}^{k} \frac{\mathbf{r}_{i}^{T}\mathbf{r}_{ i}} {1 + s\lambda _{i}^{{\prime}}}\ \,\ \ \mathbf{r}_{i}^{T} = \mathbf{C}_{ C}^{{\prime}T\!}\mathbf{U}_{ [:,1:k]},\ \ \lambda _{i}^{{\prime}} = \varLambda ^{{\prime}}_{I [i,i]}.& &{}\end{array}$$
(6.59)

In [73], a selection criterion for the \(\lambda'_{i}\), \(i = 1,\ldots,k\), is proposed, based on a user-specified error and a maximum frequency. These eigenmodes are computed in [73] via the Lanczos algorithm. The criterion proposed in [81, 85] can also be used to compute the dominant eigenmodes \(\lambda'_{i}\) via SAMDP.
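Continuing the earlier sketch, the truncation (6.59) amounts to an eigendecomposition of \(\mathbf{C}'_{I}\) followed by the retention of k modes. In the lines below, a plain magnitude sort stands in for the dominance criteria of [73, 81, 85], and k is user-chosen.

[U, LamI] = eig(full(CIp));              % C'_I = U * LamI * U', U'*U = I
[lam, ix] = sort(diag(LamI), 'descend'); % placeholder mode selection
Uk  = U(:, ix(1:k));
CCk = Uk' * CCp;                         % rows r_i of (6.59)
Yk  = @(s) (GPp + s*CPp) - s^2 * (CCk.' * ((eye(k) + s*diag(lam(1:k))) \ CCk));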

The advantage of the PACT reduction method is the preservation of the first two moments of Y(s) in \(\mathbf{Y}'_{P}(s)\). This ensures that the DC behavior and offset of the response are approximated well in the reduced model. The main costs of the approach are: (1) performing a Cholesky factorization of \(\mathbf{G}_{I}\) (which becomes expensive when n is very large), (2) solving an eigenvalue problem for a dense \(\mathbf{C}'_{I}\) matrix and, most importantly, (3) the fill-in in the port block matrices \(\mathbf{G}'_{P}\), \(\mathbf{C}'_{P}\) and in \(\mathbf{C}'_{C}\). It turns out that (2) can be solved more efficiently by keeping \(\mathbf{C}'_{I}\) as a product of sparse matrices during the computation; this will be addressed elsewhere. Avoiding problems (1) and (3), however, requires new strategies to improve sparsity, which are presented in Sect. 6.2.2.2. The fill-in introduced in \(\mathbf{G}'_{P}\), \(\mathbf{C}'_{P}\) becomes especially important for RC netlists with many terminals [\(p \sim O(10^{3})\)]. Compared to the original model, where the port blocks \(\mathbf{G}_{P}\) and \(\mathbf{C}_{P}\) were sparse, the dense \(\mathbf{G}'_{P}\), \(\mathbf{C}'_{P}\) yield many R and C components during synthesis, resulting in a reduced netlist where almost all nodes are interconnected. Simulating such netlists might take longer than simulating the original circuit; hence sparser reduced models (and netlists) are desired. Next, we present several ideas for improving the sparsity of reduced RC models via a combination of tools: netlist partitioning, graph-based node reordering strategies, and efficient algorithms for modal approximation.

2.2.2 Improved Graph-Based Method: SparseMA

In this section we present an improved model reduction method for RC circuits, which overcomes the disadvantages of PACT: it requires no matrix factorizations prior to reduction, performs all numerical computations on sparse matrices, and most importantly, preserves the sparsity of the matrix blocks corresponding to the external nodes. The method is called sparse modal approximation (SparseMA) and uses tools from graph theory to identify a partitioning and reordering of nodes that, when applied prior to the model reduction step, can significantly improve the sparsity of the reduced model.

The idea is to reorder the nodes in the RC netlist so that m of the internal nodes are promoted to external nodes, alongside the p circuit terminals. We denote by selected nodes the collection of p + m terminals and promoted internal nodes; the remaining n − m nodes stay internal. Supposing such a partitioning of nodes has been identified, the following structure is revealed, where without loss of generality we assume the selected nodes appear in the border of the G and C matrices:

$$\displaystyle\begin{array}{rcl} \left (\left [\!\begin{array}{cc} \mathbf{G}_{R} &\mathbf{G}_{K} \\ \mathbf{G}_{K}^{T}& \mathbf{G}_{S} \end{array} \!\right ] + s\left [\!\begin{array}{cc} \mathbf{C}_{R} &\mathbf{C}_{K} \\ \mathbf{C}_{K}^{T}& \mathbf{C}_{S} \end{array} \!\right ]\right )\left [\!\begin{array}{c} \mathbf{x}_{R} \\ \mathbf{x}_{S} \end{array} \!\right ] = \left [\!\begin{array}{c} \mathbf{0}\\ \mathbf{B} _{ S} \end{array} \!\right ]\mathbf{u}.& &{}\end{array}$$
(6.60)

Note that in \(\mathbf{B}_{S}\) the rows corresponding to the m promoted internal nodes are still zero. Similarly to (6.48), the admittance is expressed as:

$$\displaystyle\begin{array}{rcl} \mathop{\underbrace{\mathop{[(\mathbf{G}_{S} + s\mathbf{C}_{S})}}\limits }_{\mathbf{Y}_{S}(s)}& -& \mathop{\underbrace{\mathop{(\mathbf{G}_{K} + s\mathbf{C}_{K})^{T}(\mathbf{G}_{R} + s\mathbf{C}_{R})^{-1}(\mathbf{G}_{K} + s\mathbf{C}_{K})]}}\limits }_{\mathbf{Y}_{R}(s)}\mathbf{x}_{S} = \mathbf{B}_{S}\mathbf{u}{}\end{array}$$
(6.61)
$$\displaystyle\begin{array}{rcl} \mathbf{Y}(s)& =& \mathbf{Y}_{S}(s) -\mathbf{Y}_{R}(s).{}\end{array}$$
(6.62)

Recall that reducing \(\mathbf{Y}_{I}(s)\) directly from the simple partitioning (6.47) and (6.48) is not the method of choice, because by preserving \(\mathbf{Y}_{P}(s)\) only, the DC behavior and offset of Y(s) would not be accurately matched. Using instead the improved partitioning (6.60) and (6.61), one aims at better approximating the DC behavior and offset of Y(s) by preserving \(\mathbf{Y}_{S}(s)\) (which now captures not only the external nodes but also a subset of the internal nodes). Finding the partitioning (6.60) only requires a reordering of nodes; thus no Cholesky factorization or fill-introducing congruence transformation is needed prior to the MOR step. One can reduce \(\mathbf{Y}_{R}(s)\) directly with modal approximation (via frequency-dependent SAMDP) and preserve the sparsity of the extended port blocks in \(\mathbf{Y}_{S}(s)\).

By interpolating k dominant eigenmodes from the symmetric eigendecomposition \([\varLambda _{R},\mathbf{V}] = \mathit{eig}(-\mathbf{G}_{R},\mathbf{C}_{R})\), the reduced model is obtained:

$$\displaystyle\begin{array}{rcl} \mathbf{Y}_{k}(s) = \mathbf{Y}_{S}(s) -\sum _{i=1}^{k}\frac{\mathbf{q}_{i}^{T}\mathbf{q}_{ i}} {1 + s\lambda _{i}}\ \,\ \ \mathbf{q}_{i}^{T} = (\mathbf{G}_{ K} + s\mathbf{C}_{K})^{T}\mathbf{V}_{ [:,1:k]},\ \ \lambda _{i} = \varLambda _{R[i,i]}.& &{}\end{array}$$
(6.63)

In matrix terms, the reduced model is easily constructed by re-connecting the preserved selected matrix blocks to the reduced blocks:

$$\displaystyle\begin{array}{rcl} & & \left (\left [\!\begin{array}{cc} \hat{\mathbf{G}}_{R} &\hat{\mathbf{G}}_{K} \\ \hat{\mathbf{G}}_{K}^{T}& \mathbf{G}_{S} \end{array} \!\right ] + s\left [\!\begin{array}{cc} \hat{\mathbf{C}}_{R} &\hat{\mathbf{C}}_{K} \\ \hat{\mathbf{C}}_{K}^{T}& \mathbf{C}_{S} \end{array} \!\right ]\right )\left [\!\begin{array}{c} \hat{\mathbf{x}}_{R} \\ \mathbf{x}_{S} \end{array} \!\right ] = \left [\!\begin{array}{c} \mathbf{0}\\ \mathbf{B} _{ S} \end{array} \!\right ]\mathbf{u},{}\end{array}$$
(6.64)

where:

$$\displaystyle\begin{array}{rcl} \hat{\mathbf{G}}_{R}& =& \mathbf{V}_{[:,1:k]}^{T}\mathbf{G}_{ R}\mathbf{V}_{[:,1:k]} \rightarrow \mathrm{ diagonal},\ \ \hat{\mathbf{G}}_{K} = \mathbf{V}_{[:,1:k]}^{T}\mathbf{G}_{ K},\ \ \mathbf{G}_{S} \rightarrow \mathrm{ sparse}{}\end{array}$$
(6.65)
$$\displaystyle\begin{array}{rcl} \hat{\mathbf{C}}_{R}& =& \mathbf{V}_{[:,1:k]}^{T}\mathbf{C}_{ R}\mathbf{V}_{[:,1:k]} \rightarrow \mathrm{ diagonal},\ \ \hat{\mathbf{C}}_{K} = \mathbf{V}_{[:,1:k]}^{T}\mathbf{C}_{ K},\ \ \mathbf{C}_{S} \rightarrow \mathrm{ sparse}.{}\end{array}$$
(6.66)

The remaining problem is how to determine the selected nodes and the partitioning (6.60). Inspired by the results obtained for R networks, we propose to first find the permutation P which identifies the strongly connected components (sccs) of G. Both G and C are reordered according to P, revealing the structure (6.60). With this permutation, the circuit terminals are redistributed according to the sccs of G, and several clusters of nodes can be identified: a large component consisting of internal nodes and very few (or no) terminals, and clusters formed each by internal nodes plus some terminals. We propose to leave all clusters consisting of internal nodes and terminals intact, and denote these nodes as the selected nodes mentioned above. If there are still terminals outside these clusters, they are added to the selected nodes and complete the blocks \(\mathbf{G}_{S}\), \(\mathbf{C}_{S}\). The remaining cluster of internal nodes forms \(\mathbf{G}_{R}\) and \(\mathbf{C}_{R}\). The model reduction step is performed on \(\mathbf{G}_{R}\) and \(\mathbf{C}_{R}\) (and implicitly on \(\mathbf{G}_{K}\) and \(\mathbf{C}_{K}\)). We also note that the matrices \(\mathbf{G}_{K}\) and \(\mathbf{C}_{K}\) resulting from this partitioning usually have many zero columns; thus \(\hat{\mathbf{G}}_{K}\) and \(\hat{\mathbf{C}}_{K}\) preserve these zero columns.

The procedure is illustrated in Sect. 6.2.3 through a medium-sized example. Larger netlists can be treated via a similar reordering and partitioning strategy, possibly in a recursive manner (for instance, when after an initial reordering the number of selected nodes is too large, the same partitioning strategy can be re-applied to \(\mathbf{G}_{S}\) and \(\mathbf{C}_{S}\) to further reduce these blocks). Certainly, other reorderings of G and C could be exploited, for instance according to a permutation which identifies the sccs of C instead of G. The choice of either G or C to determine the permutation P is made according to the structure of the underlying system and may depend on the application. We also emphasize that the reduced models for both PACT and SparseMA are passive [74] and therefore also stable. Passivity is ensured by the fact that all transformations applied throughout are congruence transformations on symmetric positive definite matrices; thus the reduced system matrices remain symmetric positive definite.
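A minimal Matlab sketch of the SparseMA partitioning and reduction follows. Sparse symmetric G and C, the terminal index vector term, and the reduction order k are assumed inputs; a simple magnitude sort of the eigenvalues is used as a crude stand-in for the SAMDP dominance criterion, and the largest connected component is assumed to be the internal cluster.

n    = size(G, 1);
comp = conncomp(graph(G, 'omitselfloops'));  % connected components, scc(G)
big  = mode(comp);                           % assumption: largest component
R    = setdiff(find(comp == big), term);     %   holds the internal cluster
S    = setdiff(1:n, R);                      % selected nodes (G_S, C_S)
[V, Lam] = eig(full(-G(R,R)), full(C(R,R))); % eigenmodes of (-G_R, C_R)
[~, ix]  = sort(abs(diag(Lam)), 'descend');  % crude mode selection
Vk   = V(:, ix(1:k));
GK   = Vk' * G(R,S);  CK = Vk' * C(R,S);     % reduced coupling blocks
Gred = [Vk'*G(R,R)*Vk, GK; GK', G(S,S)];     % reduced model, as in (6.64)
Cred = [Vk'*C(R,R)*Vk, CK; CK', C(S,S)];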

2.3 Numerical Results

The graph-based reduction procedures were applied on several networks resulting from parasitic extraction. We present results for both R and RC networks.

2.3.1 R Network Reduction

Table 6.2 shows results for three resistor networks of realistic interconnect layouts. The number of nodes is reduced by a factor > 10 and the number of resistors by a factor > 3. As a result, the time for calculating path resistances (in simulations that also include nonlinear elements such as diodes) is reduced by a factor of 10.

Table 6.2 Results of reduction algorithm
Fig. 6.7: Original G matrix

2.3.2 RC Network Reduction

Fig. 6.8: Original C matrix

Fig. 6.9: Permuted G according to scc(G)

We reduce an RC netlist with n = 3,231 internal nodes and p = 22 terminals (external nodes). The structure of the original G and C matrices is shown in Figs. 6.7 and 6.8, where the p = 22 terminals correspond to the first 22 rows and columns.

The permutation revealing the strongly connected components of G reorders the matrices as shown in Figs. 6.9 and 6.10. The reordering is especially visible in the "arrow-form" capacitance matrix. There, the p = 22 terminal nodes together with m = 40 internal nodes are promoted to the border, revealing the 62 selected nodes that will be preserved in the reduced model (i.e., the \(\mathbf{G}_{S}\) and \(\mathbf{C}_{S}\) blocks in (6.60)). The first \(n - m = 3,191\) nodes are the remaining internal nodes and form the \(\mathbf{G}_{R}\) and \(\mathbf{C}_{R}\) blocks in (6.60). The \(\mathbf{G}_{K}\) block has only one non-zero column, and in \(\mathbf{C}_{K}\), too, many zero columns can be identified.

Fig. 6.10: Permuted C according to scc(G)

Fig. 6.11: Reduced G matrix with SparseMA

The reduced SparseMA model is obtained according to (6.63) and (6.64) and is shown in Figs. 6.11 and 6.12. The internal blocks \(\mathbf{G}_{R}\) and \(\mathbf{C}_{R}\) were reduced from dimension 3,191 to \(\hat{\mathbf{G}}_{R}\) and \(\hat{\mathbf{C}}_{R}\) of dimension k = 7, by interpolating the 7 most dominant eigenmodes of \([\varLambda _{R},\mathbf{V}] = \mathit{eig}(-\mathbf{G}_{R},\mathbf{C}_{R})\). Note that \(\hat{\mathbf{G}}_{R}\) and \(\hat{\mathbf{C}}_{R}\) are diagonal. The 62 selected nodes corresponding to the \(\mathbf{G}_{S}\) and \(\mathbf{C}_{S}\) blocks are preserved, evidently preserving sparsity. The only fill-in introduced by the proposed reduction procedure is in the non-zero columns of \(\hat{\mathbf{G}}_{K}\) and \(\hat{\mathbf{C}}_{K}\). It is worth noticing that \(\hat{\mathbf{G}}_{K}\) has only one non-zero column and thus remains sparse.

Fig. 6.12: Reduced C matrix with SparseMA

Fig. 6.13: Reduced G matrix with PACT

The sparsity structure of the PACT reduced model (6.59) is shown in Figs. 6.13 and 6.14. The blocks corresponding to the first 22 nodes (the preserved external nodes) are full, as are the capacitive connection blocks to the reduced internal part. Only the reduced internal blocks remain sparse (diagonal).

Fig. 6.14: Reduced C matrix with PACT

Fig. 6.15: AC simulation 1: original, reduced (SparseMA) and synthesized model

Aside from sparsity preservation, one is interested in the quality of the approximation of the reduced model. In Fig. 6.15 we show that the SparseMA model accurately matches the original response over a wide frequency range (1 Hz to 10 THz). The Pstar [78] simulations of the synthesized model are identical to the Matlab simulations (the synthesized model was obtained via the RLCSYN unstamping procedure [71, 87]). In Fig. 6.16, the relative errors between the original model and three reduced models are presented: SparseMA, PACT and the commercial software Jivaro [64]. The SparseMA model is the most accurate over the entire frequency range.

Fig. 6.16: AC simulation 1: relative error between original and reduced models (SparseMA, PACT, Jivaro)

Figure 6.17 shows a different AC circuit simulation, where the SparseMA model performs comparably to the reduced model obtained with the commercial software Jivaro [64]. Finally, the transient simulation in Fig. 6.18 confirms that the SparseMA model is both accurate and stable.

Fig. 6.17: AC simulation 2: original, reduced (SparseMA) and reduced (Jivaro)

Fig. 6.18: Transient simulation 1: all external nodes grounded and voltage measured at node 2. Original and reduced (SparseMA, synthesized)

Table 6.3 Results with SparseMA reduction on RC netlist

Table 6.3 shows the reduction results for the RC network. For the three reduced models (SparseMA, PACT and Jivaro) we assess the effect of the reduction by means of several factors. With all methods, both the number of nodes and the number of circuit elements were reduced significantly, resulting in at least a 68× speed-up in AC simulation time. It should be noted that the SparseMA and Jivaro models have lower \(\frac{\#\mathit{elements}} {\#\mathit{unknowns}}\) and \(\frac{\#\mathit{elements}} {\#\mathit{int.nodes}}\) ratios than the PACT model. Even though the Jivaro and PACT models are faster to simulate for this network, the SparseMA model gives a good trade-off between approximation quality, sparsity preservation and CPU speed-up. Recall that the matrix blocks corresponding to the circuit terminals become dense with PACT, but remain sparse with SparseMA. As the corresponding matrix blocks become larger for circuits with more terminals [\(\sim O(10^{3})\)], preserving their sparsity via SparseMA is an additional advantage. Hence, the improvement in simulation time could be greater when SparseMA is applied to larger models with many terminals.

2.4 Concluding Remarks

New approaches were presented for reducing R and RC circuits with many terminals, using tools from graph theory. It was shown how netlist partitioning and node reordering strategies can be combined with existing model reduction techniques to improve the sparsity of the reduced RC models and, implicitly, their simulation time. The proposed sparsity-preserving method, SparseMA, performs comparably to the commercial tool Jivaro. Future work will investigate how similar strategies can be applied to RC models with many more terminals [\(\sim O(10^{3})\)] and to RLCk netlists.

3 Simulation of Mutually Coupled Oscillators Using Nonlinear Phase Macromodels and Model Order Reduction Techniques

The design of modern RF (radio frequency) integrated circuits becomes increasingly complicated because more functionality needs to be integrated on a smaller physical area.Footnote 5 In the design process, floor planning, i.e., determining the locations of the functional blocks, is one of the most challenging tasks. Modern RF chips for mobile devices, for instance, typically have an FM radio, Bluetooth, and GPS on one chip. These functionalities are implemented with Voltage Controlled Oscillators (VCOs) that are designed to oscillate at certain different frequencies. In the ideal case the oscillators operate independently, i.e., they are not perturbed by each other or by any signal other than their input signal. In practice, however, the oscillators are influenced by unintended (parasitic) signals coming from other blocks (such as power amplifiers) or from other oscillators, for instance via (unintended) inductive coupling through the substrate. A possibly undesired consequence of the perturbation is that the oscillators lock to a frequency different from the one they were designed for, or show pulling, in which case the oscillators are perturbed from their free-running orbit without locking.

The locking effect was first observed by the Dutch scientist Christiaan Huygens in the seventeenth century: he noticed that the pendulums of two nearby clocks hanging on the same wall after some time moved in unison [120] (in other words, they locked to the same frequency). Similar effects also occur for electrical oscillators. When an oscillator is locked to a different frequency, the frequency of the oscillator physically changes, and as a result the oscillator operates at the new frequency. In this case we observe a single peak in the spectrum of the oscillator, corresponding to its new frequency. Contrary to the locking case, frequency pulling occurs when the interfering frequency source is not strong enough to cause frequency locking (e.g., weak substrate coupling). In this case we observe several sidebands around the carrier frequency in the spectrum of the pulled oscillator. In Sect. 6.3.9 we discuss several practical examples of locking and pulling effects.

Oscillators appear in many physical systems and interaction between oscillators has been of interest in many applications. Our main motivation comes from the design of RF systems, where oscillators play an important role [95, 100, 107, 120] in, for instance, high-frequency Phase Locked Loops (PLLs). Oscillators are also used in the modeling of circadian rhythm mechanisms, one of the most fundamental physiological processes [91]. Another application area is the simulation of large-scale biochemical processes [114].

Although the use of oscillators is widely spread over several disciplines, their intrinsic nonlinear behavior is similar, and, moreover, the need for fast and accurate simulation of their dynamics is universal. These dynamics include changes in the frequency spectrum of the oscillator due to small noise signals (an effect known as jitter [100]), which may lead to pulling or locking of the oscillator to a different frequency and may cause the oscillator to malfunction. The main difficulty in simulating these effects is that both phase and amplitude dynamics are strongly nonlinear and spread over separated time scales [113]. Hence, accurate simulation requires very small time steps during time integration, resulting in unacceptable simulation times that block the design flow. Even if computationally feasible, transient simulation only gives limited understanding of the causes and mechanisms of the pulling and locking effects.

To some extent one can describe the relation between the locking range of an oscillator and the amplitude of the injected signal (these terms will be explained in more detail in Sect. 6.3.1). Adler [90] shows that this relation is linear, but it is now well known that this is only the case for small injection levels and that the modeling fails for higher injection levels [111]. Also other linearized modeling techniques [120] suffer, despite their simplicity, from the fact that they cannot model nonlinear effects such as injection locking [111, 127].

In this section we use the nonlinear phase macromodel introduced in [100] and further developed and analyzed in [104–106, 111, 113, 115, 116, 127]. Contrary to linear macromodels, the nonlinear phase macromodel is able to capture nonlinear effects such as injection locking. Moreover, since the macromodel replaces the original oscillator system by a single scalar equation, simulation times are decreased while the nonlinear oscillator effects can still be studied without loss of accuracy. One of the contributions of this work is that we show how such macromodels can be used in industrial practice to predict the behavior of inductively coupled oscillators.

Returning to our motivation, during floor planning, it is of crucial importance that the blocks are located in such a way that the effects of any perturbing signals are minimized. A practical difficulty here is that transient simulation of the full system is very expensive and usually unfeasible during the early design stages. One way to get insight in the effects of inductive coupling and injected perturbation signals is to apply the phase shift analysis [100]. In this section we will explain how this technique can be used to estimate the effects for perturbed individual and coupled oscillators, and how this can be of help during floor planning. We will consider perturbations caused by oscillators and by other components such as balanced/unbalanced transformers (baluns).

In some applications oscillators are coupled via transmission lines, for instance to reduce clock skew (clock signals becoming out of phase) [102]. Since accurate models for transmission lines can be large, this may lead to increased simulation times. We show how model order reduction techniques [94, 96, 97, 124] can be used to decrease simulation times without unacceptable loss of accuracy.

The section is organized as follows. In Sect. 6.3.1 we summarize the phase noise theory. A practical oscillator model and an example application are described in Sect. 6.3.2. Inductively coupled oscillators are discussed in detail in Sect. 6.3.3. In Sect. 6.3.4 we give an overview of existing methods to model injection locking of individual and resistively/capacitively coupled oscillators. In Sect. 6.3.5 we consider small parameter variations for mutually coupled oscillators. In Sects. 6.3.6 and 6.3.7 we show how the phase noise theory can be used to analyze oscillator-balun coupling and oscillator-transmission line coupling, respectively. In Sect. 6.3.8 we give a brief introduction to model order reduction and present a Matlab script used in our implementations. Numerical results are presented in Sect. 6.3.9 and the conclusions are drawn in Sect. 6.3.10.

3.1 Phase Noise Analysis of Oscillator

A general free-running oscillator can be expressed as an autonomous system of differential (algebraic) equations:

$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}\mathbf{q}(\mathbf{x})} {\mathrm{d}t} + \mathbf{j}(\mathbf{x})& =& 0,{}\end{array}$$
(6.67a)
$$\displaystyle\begin{array}{rcl} \mathbf{x}(0)& = \mathbf{x}(T),&{}\end{array}$$
(6.67b)

where \(\mathbf{x}(t) \in \mathbb{R}^{n}\) are the state variables, T is the period of the free-running oscillator, which is in general unknown, \(\mathbf{q},\mathbf{j}:\ \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}\) are (nonlinear) functions describing the oscillator's behavior and n is the system size. The solution of (6.67) is called the Periodic Steady State (PSS) and is denoted by \(\mathbf{x}_{\mathit{pss}}\). Although finding the PSS solution can be a challenging task in itself, we will not discuss it here and refer the interested reader to, for example, [105, 108–110, 122, 123, 126].

A general oscillator under perturbation can be expressed as a system of differential equations

$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}\mathbf{q}(\mathbf{x})} {\mathrm{d}t} + \mathbf{j}(\mathbf{x})& =& \mathbf{b}(t),{}\end{array}$$
(6.68)

where \(\mathbf{b}(t) \in \mathbb{R}^{n}\) are perturbations to the free running oscillator. For small perturbations b(t) it can be shown [100] that the solution of (6.68) can be approximated by

$$\displaystyle{ \mathbf{x}_{p}(t) = \mathbf{x}_{pss}(t +\alpha (t))+\mathbf{y}(t), }$$
(6.69)

where y(t) is the orbital deviation and \(\alpha (t) \in \mathbb{R}\) is the phase shift, which satisfies the following scalar nonlinear differential equation:

$$\displaystyle\begin{array}{rcl} \dot{\alpha }(t)& =& \mathbf{V}^{T}(t +\alpha (t)) \cdot \mathbf{b}(t),{}\end{array}$$
(6.70a)
$$\displaystyle\begin{array}{rcl} \alpha (0)& =& 0,{}\end{array}$$
(6.70b)

where \(\mathbf{V}(t) \in \mathbb{R}^{n}\) is called the Perturbation Projection Vector (PPV) of (6.68). It is a special projection vector of the perturbations and is computed based on Floquet theory [99, 100, 115]. The PPV is a periodic function with the same period as the oscillator and can be computed efficiently directly from the PSS solution, see for example [101]. Using this simple and numerically cheap method one can perform many kinds of analysis for oscillators, e.g., injection locking, pulling, and a priori estimation of the locking range [100, 111].

For small perturbations the orbital deviation y(t) can be ignored [100] and the response of the perturbed oscillator is computed by

$$\displaystyle{ \mathbf{x}_{p}(t) = \mathbf{x}_{\mathit{pss}}(t +\alpha (t)). }$$
(6.71)
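The practical appeal of (6.70) is that it is a single scalar ODE, whatever the size n of the oscillator. A minimal sketch of a forward Euler integration in Matlab follows, assuming the period T and function handles Vppv(t) (the T-periodic PPV) and b(t) (the perturbation), both returning n-vectors, are available.

tau   = T / 1000;                        % time step (T = oscillator period)
nstep = 50000;
alpha = zeros(1, nstep + 1);             % alpha(1) = 0, cf. (6.70b)
for m = 1:nstep
    t = (m - 1) * tau;
    alpha(m+1) = alpha(m) + tau * ( Vppv(t + alpha(m))' * b(t) );  % (6.70a)
end
% perturbed response, cf. (6.71): x_p(t) = x_pss(t + alpha(t))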

3.2 LC Oscillator

For many applications oscillators can be modeled as an LC tank with a nonlinear resistor as shown in Fig. 6.19. This circuit is governed by the following differential equations for the unknowns (v, i):

$$\displaystyle\begin{array}{rcl} C\frac{\mathrm{d}v(t)} {\mathrm{d}t} + \frac{v(t)} {R} + i(t) + S\tanh (\frac{G_{n}} {S} v(t))& =& b(t),{}\end{array}$$
(6.72a)
$$\displaystyle\begin{array}{rcl} L\frac{\mathrm{d}i(t)} {\mathrm{d}t} - v(t)& =& 0,{}\end{array}$$
(6.72b)

where C, L and R are the capacitance, inductance and resistance, respectively. The nodal voltage is denoted by v and the branch current of the inductor by i. The voltage-controlled nonlinear resistor is defined by the parameters S and \(G_{n}\), where S influences the oscillation amplitude and \(G_{n}\) is the gain [111].
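As a small illustration, the unperturbed system (6.72) (with b = 0) can be integrated directly with a stiff ODE solver. The sketch below uses the parameter values of the example discussed later in this section; the initial condition is an arbitrary small kick to start the oscillation.

R = 1000; L = 930e-12; C = 1.145e-12;    % example values, see below
S = 1/R; Gn = -1.1/R;
f = @(t, x) [ (-x(1)/R - x(2) - S*tanh(Gn/S * x(1))) / C ;  % dv/dt, (6.72a)
               x(1) / L ];                                   % di/dt, (6.72b)
[t, x] = ode15s(f, [0, 50e-9], [1e-3; 0]);  % x = [v; i]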

Fig. 6.19: Voltage controlled oscillator; the current of the nonlinear resistor is given by \(f(v) = S\tanh (\frac{G_{n}} {S} v(t))\)

A lot of work [111, 120] has been done on the simulation of this type of oscillator. Here we give an example that can be of practical use for designers. During the design process, early insight in the behavior of system components is of crucial importance. In particular, for perturbed oscillators it is very convenient to have a direct relationship between the injection amplitude and the sideband level.

For the given RLC circuit with parameters \(L = 930 \cdot 10^{-12}\) H, \(C = 1.145 \cdot 10^{-12}\) F, R = 1,000 Ω, \(S = 1/R\), \(G_{n} = -1.1/R\) and injected signal \(b(t) = A_{\mathit{inj}}\sin (2\pi ft)\), we plot the sideband level of the voltage response versus the amplitude \(A_{\mathit{inj}}\) of the injected signal for different offset frequencies, see Fig. 6.20. The results in Fig. 6.20 can be seen as a simplified representation of Arnol'd tongues [98] that is helpful in engineering practice. We see, for instance, that the oscillator locks to a perturbation signal with an offset of 10 MHz if the corresponding amplitude is larger than \(\sim 10^{-4}\) A (when the signal is locked, the sideband level becomes 0 dB). This information is useful when designing the floor plan of a chip, since it may put additional requirements on the placement (and shielding) of components that generate, or are sensitive to, perturbing signals.

As an example, consider the floor plan in Fig. 6.21. The analysis described above and in Fig. 6.20 first helped to identify and quantify the unintended pulling and locking effects due to the coupling of the inductors (note that the potential causes (inductors) of pulling and locking effects first have to be identified; in practice, designers usually have an idea of potential coupling issues, for instance when there are multiple oscillators in a design). The outcome of this analysis indicated that there were unintended pulling effects in the original floorplan and hence some components were relocated (and shielded) to reduce unintended pulling effects. Finally, the same macromodels, but with different coupling factors due to the relocation of components, were used to verify the improved floorplan.

Fig. 6.20: Side band level of the voltage response versus the injected current amplitude for different offset frequencies

Fig. 6.21: Floor plan with relocation option that was considered after nonlinear phase noise analysis showed an intolerable pulling due to unintended coupling. Additionally, shielding was used to limit coupling effects even further

Although the LC tank model is relatively simple, it can be of high value especially in the early stages of the design process (schematic level), since it can be used to estimate the effects of perturbation and (unintended) coupling on the behavior of oscillators. As explained before, this may be of help during floor planning. In later stages, one typically validates the design via layout simulations, which can be much more complex due to the inclusion of parasitic elements. In general one has to deal with larger dynamical systems when parasitics are included, but the phase noise theory still applies. Therefore, in this paper we do not consider extracted parasitics. However, the values for L, C, R and coupling factors are typically based on measurement data and layout simulations of real designs.

Fig. 6.22: Two inductively coupled LC oscillators

3.3 Mutual Inductive Coupling

Next we consider the two mutually coupled LC oscillators shown in Fig. 6.22. The inductive coupling between these two oscillators can be modeled as

$$\displaystyle\begin{array}{rcl} L_{1}\frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} + M \frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t} & =& v_{1}(t),{}\end{array}$$
(6.73a)
$$\displaystyle\begin{array}{rcl} L_{2}\frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t} + M \frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} & =& v_{2}(t),{}\end{array}$$
(6.73b)

where \(M = k\sqrt{L_{1}L_{2}}\) is the mutual inductance and \(\vert k\vert < 1\) is the coupling factor. This makes the matrix

$$\displaystyle{\left (\begin{array}{c c} L_{1} & M \\ M &L_{2} \end{array} \right )}$$

positive definite, which ensures that the problem is well posed. In this section all parameters with a subscript refer to the oscillator with the same subscript. If we combine the mathematical model (6.72) of each oscillator with (6.73), the two inductively coupled oscillators are described by the following differential equations

$$\displaystyle\begin{array}{rcl} & & C_{1}\frac{\mathrm{d}v_{1}(t)} {\mathrm{d}t} + \frac{v_{1}(t)} {R_{1}} + i_{1}(t) + S\tanh (\frac{G_{n}} {S} v_{1}(t)) = 0,{}\end{array}$$
(6.74a)
$$\displaystyle\begin{array}{rcl} & & L_{1}\frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} - v_{1}(t) = -M \frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t},{}\end{array}$$
(6.74b)
$$\displaystyle\begin{array}{rcl} & & C_{2}\frac{\mathrm{d}v_{2}(t)} {\mathrm{d}t} + \frac{v_{2}(t)} {R_{2}} + i_{2}(t) + S\tanh (\frac{G_{n}} {S} v_{2}(t)) = 0,{}\end{array}$$
(6.74c)
$$\displaystyle\begin{array}{rcl} & & L_{2}\frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t} - v_{2}(t) = -M \frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t}.{}\end{array}$$
(6.74d)

For small values of the coupling factor k, the right-hand sides of (6.74b) and (6.74d) can be considered as small perturbations to the corresponding oscillators, and we can apply the phase shift theory described in Sect. 6.3.1. We then obtain the following simple nonlinear equations for the phase shift of each oscillator:

$$\displaystyle\begin{array}{rcl} \dot{\alpha _{1}}(t)& =& \mathbf{V}_{1}^{T}(t +\alpha _{ 1}(t)) \cdot \left (\begin{array}{c} 0 \\ - M \frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t} \end{array} \right ),{}\end{array}$$
(6.75a)
$$\displaystyle\begin{array}{rcl} \dot{\alpha _{2}}(t)& =& \mathbf{V}_{2}^{T}(t +\alpha _{ 2}(t)) \cdot \left (\begin{array}{c} 0 \\ - M \frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} \end{array} \right ),{}\end{array}$$
(6.75b)

where the currents and voltages are evaluated by using (6.71):

$$\displaystyle\begin{array}{rcl} [v_{1}(t),\,i_{1}(t)]^{T}& =& \mathbf{x}_{ pss}^{1}(t +\alpha _{ 1}(t)),{}\end{array}$$
(6.75c)
$$\displaystyle\begin{array}{rcl} [v_{2}(t),\,i_{2}(t)]^{T}& =& \mathbf{x}_{ pss}^{2}(t +\alpha _{ 2}(t)).{}\end{array}$$
(6.75d)

Small parameter variations have also been studied in the literature by Volterra analysis, see e.g. [92, 93].

3.3.1 Time Discretization

System (6.75) is solved using the implicit backward Euler method for the time discretization; Newton's method is applied to the resulting two-dimensional system of nonlinear equations (6.76a) and (6.76b), i.e.

$$\displaystyle\begin{array}{rcl} & & \alpha _{1}^{m+1} =\alpha _{ 1}^{m} +\tau \mathbf{V}_{ 1}^{T}(t^{m+1} +\alpha _{ 1}^{m+1})\cdot {}\end{array}$$
(6.76a)
$$\displaystyle\begin{array}{rcl} & & \qquad \qquad \left (\begin{array}{c} 0 \\ - M \frac{i_{2}(t^{m+1}) - i_{2}(t^{m})} {\tau } \end{array} \right ), \\ & & \alpha _{2}^{m+1} =\alpha _{ 2}^{m} +\tau \mathbf{V}_{ 2}^{T}(t^{m+1} +\alpha _{ 2}^{m+1})\cdot {}\end{array}$$
(6.76b)
$$\displaystyle\begin{array}{rcl} & & \qquad \qquad \left (\begin{array}{c} 0 \\ - M \frac{i_{1}(t^{m+1}) - i_{1}(t^{m})} {\tau } \end{array} \right ), \\ & & [v_{1}(t^{m+1}),\,i_{ 1}(t^{m+1})]^{T} = \mathbf{x}_{ pss}^{1}(t^{m+1} +\alpha _{ 1}^{m+1}), {}\end{array}$$
(6.76c)
$$\displaystyle\begin{array}{rcl} & & [v_{2}(t^{m+1}),\,i_{ 2}(t^{m+1})]^{T} = \mathbf{x}_{ pss}^{2}(t^{m+1} +\alpha _{ 2}^{m+1}), \\ & & \alpha _{1}^{1} = 0,\;\alpha _{ 2}^{1} = 0,\;m = 1,\ldots, {}\end{array}$$
(6.76d)

where \(\tau = t^{m+1} - t^{m}\) denotes the time step. For the Newton iterations in (6.76a) and (6.76b) we take \((\alpha _{1}^{m},\alpha _{2}^{m})\) as initial guess at time level m + 1. This provides very fast convergence (in our applications within around four Newton iterations). See [123] and the references therein for more details on the time integration of electric circuits.
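A sketch of one such backward Euler step for (6.76a), solved with a scalar Newton iteration in Matlab, is given below. Vppv1 (the PPV of the first oscillator), the previous phase shift a_old, the new time t_new, and the frozen perturbation vector w (with second entry -M*(i2_new - i2_old)/tau) are assumed given; the derivative of the residual is approximated by a finite difference.

g = @(a) a - a_old - tau * ( Vppv1(t_new + a)' * w );  % residual of (6.76a)
a = a_old;                        % initial guess: phase shift at level m
for it = 1:10                     % typically ~4 iterations suffice
    h  = 1e-8;
    dg = (g(a + h) - g(a)) / h;   % finite-difference derivative of g
    a  = a - g(a) / dg;           % Newton update
    if abs(g(a)) < 1e-12, break; end
end
alpha1_new = a;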

3.4 Resistive and Capacitive Coupling

For completeness in this section we describe how the phase noise theory applies to two oscillators coupled by a resistor or a capacitor.

3.4.1 Resistive Coupling

Resistive coupling is modeled by connecting two oscillators by a single resistor, see Fig. 6.23. The current \(i_{R_{0}}\) flowing through the resistor \(R_{0}\) satisfies the following relation

Fig. 6.23: Two resistively coupled LC oscillators

Fig. 6.24: Two capacitively coupled LC oscillators

$$\displaystyle{ i_{R_{0}} = \frac{v_{1} - v_{2}} {R_{0}}, }$$
(6.77)

where \(R_{0}\) is the coupling resistance. Then the phase macromodel is given by

$$\displaystyle\begin{array}{rcl} \dot{\alpha _{1}}(t)& =& \mathbf{V}_{1}^{T}(t +\alpha _{ 1}(t)) \cdot \left (\begin{array}{c} (v_{1} - v_{2})/R_{0} \\ 0 \end{array} \right ),{}\end{array}$$
(6.78a)
$$\displaystyle\begin{array}{rcl} \dot{\alpha _{2}}(t)& =& \mathbf{V}_{2}^{T}(t +\alpha _{ 2}(t)) \cdot \left (\begin{array}{c} - (v_{1} - v_{2})/R_{0} \\ 0 \end{array} \right ),{}\end{array}$$
(6.78b)

where the voltages are updated by using (6.71). More details on resistively coupled oscillators can be found in [113].

3.4.2 Capacitive Coupling

When two oscillators are coupled via a single capacitor with capacitance \(C_{0}\) (see Fig. 6.24), the current \(i_{C_{0}}\) through the capacitor satisfies

$$\displaystyle{ i_{C_{0}} = C_{0}\frac{d(v_{1} - v_{2})} {dt}. }$$
(6.79)

In this case the phase macromodel is given by

$$\displaystyle\begin{array}{rcl} \dot{\alpha _{1}}(t)& =& \mathbf{V}_{1}^{T}(t +\alpha _{ 1}(t)) \cdot \left (\begin{array}{c} C_{0}\frac{d(v_{1} - v_{2})} {dt}\\ 0 \end{array} \right ),{}\end{array}$$
(6.80a)
$$\displaystyle\begin{array}{rcl} \dot{\alpha _{2}}(t)& =& \mathbf{V}_{2}^{T}(t +\alpha _{ 2}(t)) \cdot \left (\begin{array}{c} - C_{0}\frac{d(v_{1} - v_{2})} {dt}\\ 0 \end{array} \right ),{}\end{array}$$
(6.80b)

where the voltages are updated by using (6.71).

Time discretization of (6.78) and (6.80) is done according to (6.76).

3.5 Small Parameter Variation Model for Oscillators

For many applications, performing simulations with nominal design parameters only is no longer sufficient, and it is necessary to simulate around the nominal parameters as well. In practice, designers use Monte Carlo type simulation techniques to gain insight into the device performance under small parameter variations. However, these methods can be very time consuming and are not applicable to large problems. For analyzing small parameter variations one can use the polynomial chaos approach described in [119]. Here, instead, we apply the technique described in [128] to mutually coupled oscillators; we briefly sketch the ideas of the method and refer to [128] for details.

Consider an oscillator under a perturbation b(t) described by a set of ODE’s:

$$\displaystyle{ \frac{\mathrm{d}\mathbf{x}} {\mathrm{d}t} + f(\mathbf{x},p) = \mathbf{b}(t), }$$
(6.81)

where f describes the nonlinearity in the oscillator and is a function of the state variables x and the parameter p. Let us consider a parameter variation

$$\displaystyle{ \mathbf{p} = \mathbf{p}_{0} +\varDelta \mathbf{p}, }$$
(6.82)

where \(\mathbf{p}_{0}\) is the nominal parameter and \(\varDelta \mathbf{p}\) is the parameter deviation from \(\mathbf{p}_{0}\). Then for small parameter deviations the phase shift equation for (6.81) reads

$$\displaystyle\begin{array}{rcl} \dot{\alpha }(t)& =& \mathbf{V}^{T}(t +\alpha (t)) \cdot (\mathbf{b}(t) - F_{ P}(t +\alpha (t))\varDelta \mathbf{p}),{}\end{array}$$
(6.83a)
$$\displaystyle\begin{array}{rcl} \alpha (0)& =& 0,{}\end{array}$$
(6.83b)

where V(t) is the perturbation projection vector of the oscillator with nominal parameters and

$$\displaystyle{ F_{P}(t +\alpha (t)) = \frac{\partial f} {\partial \mathbf{p}}\Big\vert _{\mathbf{x}_{\text{pss}}(t+\alpha (t)),\mathbf{p}_{0}}, }$$
(6.84)

where x pss is the PSS of (6.81) with nominal parameters.

In Sect. 6.3.9.1 we show numerical experiments of two inductively coupled oscillators using small parameter variations.

3.6 Oscillator Coupling with Balun

In this section we analyze inductive coupling effects between an oscillator and a balun. A balun is an electrical transformer that transforms balanced signals to unbalanced signals and vice versa; baluns are typically used to change impedance (with applications in RF radio). The (unintended) coupling between an oscillator and a balun typically occurs on chips that integrate several oscillators for, for instance, FM radio, Bluetooth and GPS, and hence it is important to understand possible coupling effects during the design. In Fig. 6.25 a schematic view is given of an oscillator which is coupled with a balun via mutual inductors.

Fig. 6.25: Oscillator coupled with a balun

The following mathematical model is used for oscillator and balun coupling (see Fig. 6.25):

$$\displaystyle\begin{array}{rcl} & & C_{1}\frac{\mathit{dv}_{1}(t)} {\mathit{dt}} + \frac{v_{1}(t)} {R_{1}} + i_{1}(t) + S\tanh (\frac{Gn} {S} v_{1}(t)) = 0,{}\end{array}$$
(6.85a)
$$\displaystyle\begin{array}{rcl} & & L_{1}\frac{\mathit{di}_{1}(t)} {\mathit{dt}} + M_{12}\frac{\mathit{di}_{2}(t)} {dt} + M_{13}\frac{\mathit{di}_{3}(t)} {dt} - v_{1}(t) = 0,{}\end{array}$$
(6.85b)
$$\displaystyle\begin{array}{rcl} & & C_{2}\frac{\mathit{dv}_{2}(t)} {\mathit{dt}} + \frac{v_{2}(t)} {R_{2}} + i_{2}(t) + I(t) = 0,{}\end{array}$$
(6.85c)
$$\displaystyle\begin{array}{rcl} & & L_{2}\frac{\mathit{di}_{2}(t)} {\mathit{dt}} + M_{12}\frac{\mathit{di}_{1}(t)} {dt} + M_{23}\frac{\mathit{di}_{3}(t)} {dt} - v_{2}(t) = 0,{}\end{array}$$
(6.85d)
$$\displaystyle\begin{array}{rcl} & & C_{3}\frac{\mathit{dv}_{3}(t)} {\mathit{dt}} + \frac{v_{3}(t)} {R_{3}} + i_{3}(t) = 0,{}\end{array}$$
(6.85e)
$$\displaystyle\begin{array}{rcl} & & L_{3}\frac{\mathit{di}_{3}(t)} {\mathit{dt}} + M_{13}\frac{\mathit{di}_{1}(t)} {\mathit{dt}} + M_{23}\frac{\mathit{di}_{2}(t)} {\mathit{dt}} - v_{3}(t) = 0,{}\end{array}$$
(6.85f)

where \(M_{\mathit{ij}} = k_{\mathit{ij}}\sqrt{L_{i}L_{j}}\), \(i,j = 1,2,3\), \(i < j\), are the mutual inductances and \(k_{\mathit{ij}}\) are the coupling factors. The parameters of the nonlinear resistor are \(S = 1/R_{1}\) and \(G_{n} = -1.1/R_{1}\), and the current injection in the primary balun is denoted by I(t).

For small coupling factors we can consider \(M_{12}\frac{di_{2}(t)} {dt} + M_{13}\frac{di_{3}(t)} {dt}\) in (6.85b) as a small perturbation to the oscillator. Then, similarly to (6.75), we can apply the phase shift macromodel to (6.85a)–(6.85b). The reduced model corresponding to (6.85a)–(6.85b) is

$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}\alpha (t)} {\mathrm{d}t} & =& \mathbf{V}^{T}(t +\alpha (t)) \cdot \left (\begin{array}{c} 0 \\ - M_{12}\frac{\mathit{di}_{2}(t)} {\mathit{dt}} - M_{13}\frac{\mathit{di}_{3}(t)} {\mathit{dt}} \end{array} \right ).{}\end{array}$$
(6.86)

The balun is described by a linear circuit (6.85c)–(6.85f) which can be written in a more compact form:

$$\displaystyle{ E\frac{\mathrm{d}\mathbf{x}(t)} {\mathrm{d}t} + A\mathbf{x}(t) + B\frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} + C = 0, }$$
(6.87)

where

$$\displaystyle\begin{array}{rcl} E& =& \left (\begin{array}{cccc} C_{2} & 0 & 0 & 0 \\ 0 & L_{2} & 0 &M_{23} \\ 0 & 0 &C_{3} & 0 \\ 0 &M_{23} & 0 & L_{3}\\ \end{array} \right ),{}\end{array}$$
(6.88a)
$$\displaystyle\begin{array}{rcl} A& =& \left (\begin{array}{cccc} 1/R_{2} & 1& 0 &0 \\ - 1 &0& 0 &0 \\ 0 &0&1/R_{3} & 0 \\ 0 &0& - 1 &0\\ \end{array} \right ),{}\end{array}$$
(6.88b)
$$\displaystyle\begin{array}{rcl} B^{T}& =& \left (\begin{array}{c c c c} 0&M_{ 12} & 0&M_{13}\\ \end{array} \right ),{}\end{array}$$
(6.88c)
$$\displaystyle\begin{array}{rcl} C^{T}& =& \left (\begin{array}{c c c c} I(t)&0&0&0\\ \end{array} \right ),{}\end{array}$$
(6.88d)
$$\displaystyle\begin{array}{rcl} \mathbf{x}^{T}& =& \left (\begin{array}{c c c c} v_{ 2}(t)&i_{2}(t)&v_{3}(t)&i_{3}(t)\\ \end{array} \right ).{}\end{array}$$
(6.88e)

With these notations (6.86) and (6.87) can be written in the following form

$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}\alpha (t)} {\mathrm{d}t} = \mathbf{V}^{T}(t +\alpha (t)) \cdot \left (\begin{array}{c} 0 \\ - B^{T}\frac{\mathrm{d}\mathbf{x}(t)} {\mathrm{d}t} \end{array} \right ),{}\end{array}$$
(6.89)
$$\displaystyle\begin{array}{rcl} & & E\frac{\mathrm{d}\mathbf{x}(t)} {\mathrm{d}t} + A\mathbf{x}(t) + B\frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} + C = 0,{}\end{array}$$
(6.90)

where i 1(t) is computed by using (6.71). This system can be solved by using a finite difference method.

3.7 Oscillator Coupling to a Transmission Line

In some applications oscillators are coupled via transmission lines. By coupling oscillators via transmission lines one can, for instance, reduce the clock skew in clock distribution networks [102]. Accurate models for transmission lines may contain up to thousands or millions of RLC components [129]. Furthermore, the oscillators or the components that perturb (couple to) the oscillators can consist of many RLC components, for instance when one takes parasitic effects into account. Since simulation times usually increase with the number of elements, one would like to limit the number of (parasitic) components as much as possible without losing accuracy.

Fig. 6.26: Oscillator coupled to a transmission line

The schematic view of an oscillator coupled to a transmission line is given in Fig. 6.26. Using the phase macromodel for the oscillator and applying Kirchhoff's current law to the transmission line circuit, we obtain the following set of differential equations:

$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}\alpha (t)} {\mathrm{d}t} = \mathbf{V}^{T}(t +\alpha (t)) \cdot \left (\begin{array}{c} \frac{y(t) - v(t)} {R_{1}} \\ 0 \end{array} \right ){}\end{array}$$
(6.91a)
$$\displaystyle\begin{array}{rcl} & & E\frac{\mathrm{d}\mathbf{x}(t)} {\mathrm{d}t} = A\mathbf{x}(t) + B\mathbf{u}(t),{}\end{array}$$
(6.91b)
$$\displaystyle\begin{array}{rcl} & & y(t) = \mathcal{C}^{T}\mathbf{x},{}\end{array}$$
(6.91c)

where

$$\displaystyle\begin{array}{rcl} & & E = \text{diag}(C_{1},C_{2},\ldots,C_{n}),\,A = \text{tridiag}( \frac{1} {R_{i}},- \frac{1} {R_{i}} - \frac{1} {R_{i+1}}, \frac{1} {R_{i+1}}),{}\end{array}$$
(6.92a)
$$\displaystyle\begin{array}{rcl} & & B = \left (\begin{array}{c c } \frac{1} {R_{1}} & 0 \\ 0 &0\\ \vdots & \vdots\\ 0 &1\\ \end{array} \right ),\,\mathbf{x} = \left (\begin{array}{c } v_{1}(t) \\ v_{2}(t)\\ \vdots \\ v_{n}(t)\\ \end{array} \right ),\,\mathbf{u}(t) = \left (\begin{array}{c } v(t)\\ I(t) \\ \end{array} \right ),\,\mathcal{C} = \left (\begin{array}{c } 1\\ 0\\ \vdots \\ 0 \end{array} \right ).{}\end{array}$$
(6.92b)
Fig. 6.27 Two oscillators coupled via a transmission line

In a similar way the phase macromodel of two oscillators coupled via a transmission line, see Fig. 6.27, is given by the following equations:

$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}\alpha _{1}(t)} {\mathrm{d}t} = \mathbf{V}_{1}^{T}(t +\alpha _{ 1}(t)) \cdot \left (\begin{array}{c} \frac{v_{1}(t) - v(t)} {R_{1}} \\ 0 \end{array} \right ){}\end{array}$$
(6.93a)
$$\displaystyle\begin{array}{rcl} & & E\frac{\mathrm{d}\mathbf{x}(t)} {\mathrm{d}t} = A\mathbf{x}(t) + B\mathbf{u}(t),{}\end{array}$$
(6.93b)
$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}\alpha _{2}(t)} {\mathrm{d}t} = \mathbf{V}_{2}^{T}(t +\alpha _{ 2}(t)) \cdot \left (\begin{array}{c} \frac{v_{n}(t) - v_{0}(t)} {R_{n+1}} \\ 0 \end{array} \right ),{}\end{array}$$
(6.93c)

where \(\alpha_{1}(t)\) and \(\alpha_{2}(t)\) (\(\mathbf{V}_{1}\) and \(\mathbf{V}_{2}\)) are the phase shifts (PPVs) of the corresponding oscillators. The matrices E and A and the vector x are given by (6.92), and

$$\displaystyle{ B = \left (\begin{array}{c c } \frac{1} {R_{1}} & 0 \\ 0 & 0\\ \vdots & \vdots \\ 0 & \frac{1} {R_{n+1}}\\ \end{array} \right ),\,\mathbf{u}(t) = \left (\begin{array}{c } v(t)\\ v_{ 0}(t)\\ \end{array} \right ). }$$
(6.94)
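
For concreteness, the matrices in (6.92) and (6.94) can be assembled as follows; the element values are those used later in Sect. 6.3.9.5.2, and the variable names are ours.

% Assembling E, A, B of (6.92)/(6.94) for the transmission line;
% values as in Sect. 6.3.9.5.2.
n  = 100;
Cl = 1e-14*ones(n,1);                    % C_1 = ... = C_n = 10^-2 pF
R  = [4e3; 1e-3*ones(n-1,1); 4e3];       % R_1 = R_{n+1} = 4 kOhm, R_2..R_n = 0.001 Ohm
E  = diag(Cl);
dg = -1./R(1:n) - 1./R(2:n+1);           % diagonal:       -1/R_i - 1/R_{i+1}
lo = 1./R(2:n);                          % sub-diagonal:    1/R_i,     i = 2..n
up = 1./R(2:n);                          % super-diagonal:  1/R_{i+1}, i = 1..n-1
A  = diag(dg) + diag(lo,-1) + diag(up,1);
B  = zeros(n,2);
B(1,1) = 1/R(1); B(n,2) = 1/R(n+1);      % cf. (6.94)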

3.8 Model Order Reduction

Model order reduction (MOR) techniques [94, 96, 97, 124] can be used to reduce the number of elements significantly. Here we show how model order reduction can be used for the analysis of oscillator perturbation effects as well. Since the main focus is to show how MOR techniques can be used (and not which technique is the most suitable), we limit the discussion here to Balanced Truncation [118]. For other methods, see, e.g., [94, 96, 97, 124].

Given a dynamical system (A, B, C) (assume E = I), balanced truncation [118] consists of first computing a balancing transformation \(V \in \mathbb{R}^{n\times n}\). The balanced system \((V^{T}AV,V^{T}B,V^{T}C)\) has the nice property that the Hankel Singular ValuesFootnote 6 are easily available. A reduced order model can be constructed by selecting the columns of V that correspond to the k < n largest Hankel Singular Values. With \(V_{k} \in \mathbb{R}^{n\times k}\) having these k columns as its columns, the reduced order model (of order k) becomes \((V_{k}^{T}AV_{k},V_{k}^{T}B,V_{k}^{T}C)\). If \(E\neq I\) is nonsingular, balanced truncation can be applied to \((E^{-1}A,E^{-1}B,C)\). For more details on balanced truncation, see [96, 97, 118, 124].
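
As a side check (our own addition, not from the original text), the classical balanced truncation error bound \(\|H -\hat{H}\|_{\infty }\leq 2\sum _{i>k}\sigma _{i}\) can be verified numerically on a random stable model, using only standard Control System Toolbox calls:

% Balanced truncation error vs. the Hankel singular value bound.
rng(0);
sys = rss(20, 1, 1);                 % random stable 20th-order SISO model
[hsv, baldata] = hsvd(sys);          % Hankel singular values
k = 5;
rsys = balred(sys, k, 'Elimination', 'Truncate', 'Balancing', baldata);
err = norm(sys - rsys, inf);         % H-infinity norm of the error system
bnd = 2*sum(hsv(k+1:end));           % theoretical upper bound
fprintf('error %.3e <= bound %.3e\n', err, bnd);

With the 'Truncate' elimination option the reduced model matches the full model at high frequencies rather than at DC; the error bound holds in either case.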

In this section we apply model order reduction to linear circuits that are coupled to oscillators; the relevant equations describing the linear circuits have the form of (6.91b)–(6.91c). For each problem the corresponding matrices A, E, B, and C can be identified readily, see (6.88), (6.92), (6.94), and note \(C \equiv \mathcal{C}\). We use the Matlab [117] implementation of balanced truncation to obtain reduced order models:

sys = ss(-E\A, -E\B, C', 0);
[hsv, baldata] = hsvd(sys);                   % Hankel singular values
mor_dim = nnz(hsv > 1e-10);                   % keep the largest singular values;
                                              % mor_dim is the dimension
                                              % of the reduced system
rsys = balred(sys, mor_dim, 'Elimination', 'Truncate', ...
              'Balancing', baldata);          % truncate

Note that we can apply balanced truncation because E is nonsingular. It is well known that in many cases in circuit simulation the system is a descriptor system and hence E is singular. Although generalizations of balanced truncation to descriptor systems exist [124, 125], other MOR techniques such as Krylov subspace methods and modal approximation might be more appropriate. We refer the reader to [94, 96, 97, 124] for a good introduction to such techniques and MOR in general.

3.9 Numerical Experiments

It is known that a perturbed oscillator either locks to the injected signal or is pulled, in which case side band frequencies appear on one side of the injected signal only, see, e.g., [111]. We will see that, contrary to this single oscillator case, (weakly) coupled oscillators develop a double-sided spectrum.

In Sects. 6.3.9.1–6.3.9.3 we consider two LC oscillators with different kinds of coupling and injection. The inductance and resistance in both oscillators are \(L_{1} = L_{2} = 0.64\,\) nH and \(R_{1} = R_{2} = 50\,\varOmega\), respectively. The first oscillator is designed to have a free running frequency \(f_{1} = 4.8\) GHz with capacitance \(C_{1} = 1/(4L_{1}\pi ^{2}f_{1}^{2}) = 1.7178\,\) pF. Then the inductor current in the first oscillator is \(A_{1} = 0.0303\) A and the capacitor voltage is \(V_{1} = 0.5844\) V. In a similar way the second oscillator is designed to have a free running frequency \(f_{2} = 4.6\) GHz with the inductor current \(A_{2} = 0.0316\) A and the capacitor voltage \(V_{2} = 0.5844\) V. For both oscillators we choose \(S_{i} = 1/R_{i}\), \(G_{n} = -1.1/R_{i}\) with i = 1, 2. In Sect. 6.3.9.4 we describe experiments for an oscillator coupled to a balun.
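
These design values follow from the LC tank relation \(f_{i} = 1/(2\pi \sqrt{L_{i}C_{i}})\); a quick check (our own, assuming the same design formula for the second oscillator):

% Reproducing the tank capacitances from the free running frequencies.
L1 = 0.64e-9;  f1 = 4.8e9;  f2 = 4.6e9;
C1 = 1/(4*L1*pi^2*f1^2)   % 1.7178e-12 F = 1.7178 pF, as quoted above
C2 = 1/(4*L1*pi^2*f2^2)   % 1.8706e-12 F, consistent with C_mean = 1.794 pF
                          % in Sect. 6.3.9.2 and C_0 = 1.87 pF in Sect. 6.3.9.5.2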

The values for L, C, R and (mutual) coupling factors are based on measurement data and layout simulations of real designs.

In all the numerical experiments the simulations are run until \(T_{\text{final}} = 6 \cdot 10^{-7}\,\) s with a fixed time step \(\tau = 10^{-11}\,\) s. Simulation results with the phase shift macromodel are compared with simulations of the full circuit using the CHORAL [103, 121] one-step time integration algorithm, hereafter referred to as full simulation. All experiments have been carried out in Matlab 7.3. We remark that in all experiments simulations with the macromodels were typically ten times faster than the full circuit simulations.

In all experiments, for a given oscillator or balun we use the nodal voltage response to plot the spectrum (composed of discrete harmonics) of the signal.
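
For reference, a spectrum of this kind can be computed from a sampled nodal voltage along the following lines; the stand-in signal, the windowing and the dB scaling below are our own illustrative choices, not taken from the text.

% Sketch: discrete-harmonic spectrum of a simulated nodal voltage v(t).
tau = 1e-11; Tfinal = 6e-7;
t = 0:tau:Tfinal;
v = 0.5844*cos(2*pi*4.8e9*t);              % stand-in for a simulated v(t)
N = numel(v);
w = 0.5 - 0.5*cos(2*pi*(0:N-1)/(N-1));     % Hann window, written out
V = fft(v.*w);
f = (0:N-1)/(N*tau);                       % frequency axis in Hz
P = 10*log10(abs(V).^2 + eps);             % spectrum in dB (unnormalized)
plot(f(1:floor(N/2))/1e9, P(1:floor(N/2)));
xlabel('f (GHz)'); ylabel('PSD (dB)');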

3.9.1 Inductively Coupled Oscillators

Numerical simulation results of two inductively coupled oscillators, see Fig. 6.22, for different coupling factors k are shown in Fig. 6.28, where the power spectral density (PSDFootnote 7) is plotted versus frequency. In Fig. 6.28 we present results for the first oscillator; similar results are obtained for the second oscillator around its own carrier frequency. For small values of the coupling factor we observe a very good agreement with the full simulation results. As the coupling factor grows, small deviations in the frequency occur, see Fig. 6.28d. Because of the mutual pulling between the two oscillators a double-sided spectrum is formed around each oscillator’s carrier frequency. The additional sidebands are equally spaced by the frequency difference of the two oscillators.

The phase shift \(\alpha_{1}(t)\) of the first oscillator for a certain time interval is given in Fig. 6.29. We note that it has a sinusoidal behavior. For a single oscillator under perturbation a completely different behavior is observed: in the locked condition the phase shift changes linearly, whereas in the unlocked case the phase shift has a nonlinear, non-sinusoidal behavior, see for example [112].

Fig. 6.28 Inductive coupling. Comparison of the output spectrum of the first oscillator obtained by the phase macromodel and by the full simulation for different coupling factors k. (a) k = 0.0005. (b) k = 0.001. (c) k = 0.005. (d) k = 0.01

Fig. 6.29 Inductive coupling. Phase shift \(\alpha_{1}(t)\) of the first oscillator with k = 0.001

3.9.1.1 Parameter Variation in Two Inductively Coupled Oscillators

Let us consider two inductively coupled oscillators with the nominal parameters given in Sect. 6.3.9 and a small variation \(\varDelta L\) in the inductance of the second oscillator. Then the corresponding model is:

$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}v_{1}(t)} {\mathrm{d}t} + \frac{v_{1}(t)} {C_{1}R_{1}} + i_{1}(t) + \frac{S} {C_{1}}\tanh (\frac{G_{n}} {S} v_{1}(t)) = 0,{}\end{array}$$
(6.95a)
$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} -\frac{v_{1}(t)} {L_{1}} = -\frac{M} {L_{1}} \frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t},{}\end{array}$$
(6.95b)
$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}v_{2}(t)} {\mathrm{d}t} + \frac{v_{2}(t)} {C_{2}R_{2}} + i_{2}(t) + \frac{S} {C_{2}}\tanh (\frac{G_{n}} {S} v_{2}(t)) = 0,{}\end{array}$$
(6.95c)
$$\displaystyle\begin{array}{rcl} & & \frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t} - \frac{v_{2}(t)} {L_{2} +\varDelta L} = - \frac{M} {L_{2} +\varDelta L} \frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t}.{}\end{array}$$
(6.95d)

By using the small parameter variation model given in Sect. 6.3.5 we obtain the corresponding phase shift macromodel for (6.95):

$$\displaystyle\begin{array}{rcl} \dot{\alpha _{1}}(t)& =& \mathbf{V}_{1}^{T}(t +\alpha _{ 1}(t)) \cdot \left (\begin{array}{c} 0 \\ -\frac{M} {L_{1}} \frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t} \end{array} \right ),{}\end{array}$$
(6.96a)
$$\displaystyle\begin{array}{rcl} \dot{\alpha _{2}}(t)& =& \mathbf{V}_{2}^{T}(t +\alpha _{ 2}(t)) \cdot \left (\begin{array}{c} 0 \\ - \frac{M} {L_{2} +\varDelta L} \frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} -\frac{v_{2}(t)} {L_{2}^{2}} \varDelta L \end{array} \right ),{}\end{array}$$
(6.96b)

where the currents and voltages are evaluated by using (6.75c)–(6.75d).

For this numerical experiment we take the coupling factor k = 0.0005. Furthermore, let us denote by \(f_{2}^{\text{full},\varDelta L}\) and \(f_{2}^{\text{phase},\varDelta L}\) the new frequency of the second oscillator obtained by the full simulation and by the phase macromodel, respectively, for a given parameter variation \(\varDelta L\). Then we define

$$\displaystyle{\varDelta f = f_{2}^{\text{full},\varDelta L} - f_{ 2}^{\text{phase},\varDelta L}.}$$

In Fig. 6.30 we show the frequency difference \(\varDelta f\) versus the parameter variation \(\varDelta L\). We note that for small parameter variations (\(\varDelta L/L_{2} \leq 0.01\)) the phase macromodel provides a good approximation to the full simulation results.

Fig. 6.30 Frequency difference versus parameter variation

Fig. 6.31 Output spectrum of the second oscillator for several parameter variations \(\varDelta L\). (a) \(\varDelta L/L_{2} = 0.005\). (b) \(\varDelta L/L_{2} = 0.01\). (c) \(\varDelta L/L_{2} = 0.02\). (d) \(\varDelta L/L_{2} = 0.03\)

In Fig. 6.31 we show the output spectrum of the second oscillator for several values of the parameter variation \(\varDelta L\).

3.9.2 Capacitively Coupled Oscillators

The coupling capacitance in Fig. 6.24 is chosen as \(C_{0} = k \cdot C_{\text{mean}}\), where \(C_{\text{mean}} = (C_{1} + C_{2})/2 = 1.794 \cdot 10^{-12}\,\) F, and we call k the capacitive coupling factor. Simulation results for the first oscillator for different capacitive coupling factors k are given in Fig. 6.32 (similar results are obtained for the second oscillator around its own carrier frequency).

For the larger coupling factor k = 0.01 the phase shift macromodel shows small deviations from the full simulation results, see Fig. 6.32d.

Fig. 6.32 Capacitive coupling. Comparison of the output spectrum of the first oscillator obtained by the phase macromodel and by the full simulation for different coupling factors k. (a) k = 0.0005. (b) k = 0.001. (c) k = 0.005. (d) k = 0.01

Fig. 6.33 Capacitive coupling. Phase shift of the first oscillator with k = 0.001

The phase shift \(\alpha_{1}(t)\) of the first oscillator and a zoomed section for some interval are given in Fig. 6.33. In the long run the phase shift changes essentially linearly, with a slope of \(a = -0.00052179\). The linear change in the phase shift is a clear indication that the frequency of the first oscillator has changed and is locked to a new frequency, equal to \((1 + a)f_{1}\). The change of the frequency can be explained as follows: as noted in [114], capacitive coupling may change the free running frequency because this kind of coupling changes the equivalent tank capacitance. From a mathematical point of view it can be explained in the following way. For the capacitively coupled oscillators the governing equations can be written as:

$$\displaystyle\begin{array}{rcl} (C_{1} + C_{0})\frac{\mathrm{d}v_{1}(t)} {\mathrm{d}t} + \frac{v_{1}(t)} {R} + i_{1}(t) + S\tanh (\frac{G_{n}} {S} v_{1}(t))& =& C_{0}\frac{\mathrm{d}v_{2}(t)} {\mathrm{d}t},{}\end{array}$$
(6.97a)
$$\displaystyle\begin{array}{rcl} L_{1}\frac{\mathrm{d}i_{1}(t)} {\mathrm{d}t} - v_{1}(t)& =& 0,{}\end{array}$$
(6.97b)
$$\displaystyle\begin{array}{rcl} (C_{2} + C_{0})\frac{\mathrm{d}v_{2}(t)} {\mathrm{d}t} + \frac{v_{2}(t)} {R} + i_{2}(t) + S\tanh (\frac{G_{n}} {S} v_{2}(t))& =& C_{0}\frac{\mathrm{d}v_{1}(t)} {\mathrm{d}t},{}\end{array}$$
(6.97c)
$$\displaystyle\begin{array}{rcl} L_{2}\frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t} - v_{2}(t)& =& 0.{}\end{array}$$
(6.97d)

This shows that the capacitance in each oscillator is changed by \(C_{0}\) and that the new frequency of each oscillator is

$$\displaystyle{\tilde{f}_{i} = \frac{1} {2\pi \sqrt{L_{i}(C_{i} + C_{0})}},\;i = 1,2.}$$

In the zoomed section of Fig. 6.33 we note that the phase shift is not exactly linear but shows small wiggles. Numerical experiments show that these small wiggles are caused by a small sinusoidal contribution on top of the linear part of the phase shift. As in the case of mutually coupled inductors, the small sinusoidal contributions are caused by mutual pulling of the oscillators (the right-hand side terms in (6.97a) and (6.97c)).
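
The fitted slope is consistent with the detuning formula above: to first order \(\tilde{f}_{1}/f_{1} - 1 \approx -C_{0}/(2C_{1})\). A quick numerical check (our own):

% Checking the observed phase-shift slope against the detuning formula.
C1 = 1.7178e-12; C2 = 1.8706e-12;
Cmean = (C1 + C2)/2;                 % 1.794e-12 F, as in the text
C0 = 0.001*Cmean;                    % capacitive coupling factor k = 0.001
a_pred = sqrt(C1/(C1 + C0)) - 1      % about -5.22e-4, close to a = -5.2179e-4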

3.9.3 Inductively Coupled Oscillators Under Injection

As a next example, let us consider two inductively coupled oscillators where an injected current is applied to one of the oscillators. Let us consider the case where a sinusoidal current of the form

$$\displaystyle{ I(t) = A_{\text{inj}}\sin (2\pi (f_{1} - f_{\text{off}})t) }$$
(6.98)

is injected into the first oscillator. Then (6.75a) is modified to

$$\displaystyle\begin{array}{rcl} \dot{\alpha _{1}}(t)& =& \mathbf{V}_{1}^{T}(t +\alpha _{ 1}(t)) \cdot \left (\begin{array}{c} - I(t) \\ - M\frac{\mathrm{d}i_{2}(t)} {\mathrm{d}t} \end{array} \right ).{}\end{array}$$
(6.99)
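
Inside a time-stepping loop this only changes the perturbation vector; a minimal self-contained sketch follows (the PPV, \(\mathrm{d}i_{2}/\mathrm{d}t\) and the value of M below are illustrative stand-ins, not taken from the text):

% Phase update (6.99) with the injected current (6.98).
A_inj = 10e-6; f1 = 4.8e9; f_off = 20e6;       % values as in the experiment below
I     = @(t) A_inj*sin(2*pi*(f1 - f_off)*t);   % injected current (6.98)
M     = 0.001*0.64e-9;                         % M = k*sqrt(L1*L2) with k = 0.001
di2dt = @(t) 2*pi*f1*0.0316*cos(2*pi*f1*t);    % stand-in for di_2/dt
ppv1  = @(t) [sin(2*pi*f1*t); 0];              % stand-in PPV V_1
tau = 1e-11; alpha1 = 0;
for n = 1:1000
    t = n*tau;
    pert   = [-I(t); -M*di2dt(t)];             % right-hand side vector of (6.99)
    alpha1 = alpha1 + tau*(ppv1(t + alpha1).'*pert);
end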

For a small current injection with \(A_{\text{inj}} = 10\,\mu\)A and an offset frequency \(f_{\text{off}} = 20\) MHz, the spectra and phase shifts of both oscillators with coupling factor k = 0.001 are given in Fig. 6.34. It is clear from Fig. 6.34a, b that the phase shifts of both oscillators do not change linearly, which implies that the oscillators are not in steady state. As a result, in Fig. 6.34c, d we observe spectral widening in the spectra of both oscillators. We note that the phase macromodel simulations are good approximations of the full simulation results.

Fig. 6.34 Inductive coupling with injection and k = 0.001. Top: phase shift. Bottom: comparison of the output spectrum obtained by the phase macromodel and by the full simulation with a small current injection. (a) Oscillator 1. (b) Oscillator 2. (c) Oscillator 1. (d) Oscillator 2

3.9.4 Oscillator Coupled to a Balun

Finally, consider an oscillator coupled to a balun as shown in Fig. 6.25, with the parameter values given in Table 4. The coupling factors in (6.85) are chosen to be

$$\displaystyle{ k_{12} = 10^{-3},\,k_{13} = 5.96 \cdot 10^{-3},\,k_{23} = 9.33 \cdot 10^{-3}. }$$
(6.100)

The injected current in the primary balun is of the form

$$\displaystyle{ I(t) = A_{\text{inj}}\sin (2\pi (f_{0} - f_{\text{off}})t), }$$
(6.101)

where \(f_{0} = 4.8\) GHz is the oscillator’s free running frequency and \(f_{\text{off}} = 20\) MHz is the offset frequency.

Table 4

Fig. 6.35 Comparison of the output spectrum of the oscillator coupled to a balun obtained by the phase macromodel and by the full simulation for an increasing injected current amplitude \(A_{\text{inj}}\) and an offset frequency \(f_{\text{off}} = 20\) MHz. (a) oscillator. (b) primary balun. (c) oscillator. (d) primary balun. (e) oscillator. (f) primary balun. (g) oscillator. (h) primary balun

Results of numerical experiments done with the phase macromodel and the full simulations are shown in Fig. 6.35. We note that for a small current injection (\(A_{\text{inj}} = 10^{-4}\)–\(10^{-2}\) A) the oscillator and the balun pull each other. When the injected current is weak (\(A_{\text{inj}} = 10^{-4}\) A) the oscillator is pulled only slightly, and in its spectrum (Fig. 6.35a) we observe a spectral widening with two spikes around −60 dB (weak “disturbance” of the oscillator). By gradually increasing the injected current, the oscillator becomes more disturbed and in the spectrum we observe widening with higher side band levels, cf. Fig. 6.35c–f. When the injected current is strong enough (\(A_{\text{inj}} = 10^{-1}\,\) A) to lock the oscillator to the frequency of the injected signal, we observe a single spike at the new frequency. Similar results are also obtained for the secondary balun.

3.9.4.1 Oscillator Coupled to a Balun

Consider an oscillator coupled to a balun as shown in Fig. 6.25 with the following parameter values:

Table 5

The coefficients of the mutual inductive couplings are \(k_{12} = 10^{-3},\,k_{13} = 5.96 \cdot 10^{-3},\,k_{23} = 9.33 \cdot 10^{-3}\). The injected current in the primary balun is of the form

$$\displaystyle{ I(t) = A_{\text{inj}}\sin (2\pi (f_{0} - f_{\text{off}})t), }$$
(6.102)

where \(f_{0} = 4.8\) GHz is the oscillator’s free running frequency, \(f_{\text{off}}\) is the offset frequency and \(A_{\text{inj}}\) is the current amplitude.

Results of the numerical experiments are shown in Fig. 6.36; the results obtained by the macromodel-MOR technique with mor_dim = 2 provide a good approximation to the full-simulation results. We note that for the injected current with \(A_{\text{inj}} = 10^{-1}\,\) A the oscillator is locked to the injected signal. Similar results are also obtained for the balun.

Fig. 6.36 Comparison of the output spectrum of the oscillator coupled to a balun obtained by the macromodel-full and the macromodel-MOR simulations for an increasing injected current amplitude \(A_{\text{inj}}\) and an offset frequency \(f_{\text{off}} = 20\) MHz

3.9.5 Oscillators Coupled with Transmission Lines

In this section we consider two academic examples, where transmission lines are modeled with RC components.

3.9.5.1 Single Oscillator Coupled to a Transmission Line

Let us consider the same oscillator as in the previous section, now coupled to a transmission line, see Fig. 6.26. The size of the transmission line is n = 100 with the following parameters: \(C_{1} =\ldots = C_{n} = 10^{-2}\,\text{pF},R_{1} = 40\,\text{k}\varOmega,R_{2} =\ldots = R_{n} = 1\,\varOmega\). The injected current has the form (6.102) with \(A_{\text{inj}} = 10^{-2}\) A and \(f_{\text{off}} = 20\) MHz. The dimension of the reduced system is mor_dim = 18. Simulation results around the first and third harmonics (this oscillator does not have a second harmonic) are shown in Fig. 6.37. The macromodel-MOR method, using the techniques described in Sect. 6.3.8, gives a good approximation to the full simulation results.

Fig. 6.37 Comparison of the output spectrum around the first and third harmonics of the oscillator coupled to a transmission line, cf. Fig. 6.26. (a) first harmonic. (b) third harmonic

3.9.5.2 Two LC Oscillators Coupled via a Transmission Line

For this experiment we consider two LC oscillators coupled via a transmission line, with the mathematical model given by (6.93). The first oscillator has a free running frequency \(f_{1} = 4.8\) GHz and is described in Sect. 6.3.9.4. The second LC oscillator has the following parameter values: \(R_{0} = 50\,\varOmega\), \(L_{0} = 0.64\) nH, \(C_{0} = 1.87\) pF and a free running frequency \(f_{2} = 4.6\) GHz. The size of the transmission line is n = 100 with the following parameters: \(C_{1} =\ldots = C_{n} = 10^{-2}\,\text{pF},R_{1} = R_{n+1} = 4\,\text{k}\varOmega,R_{2} =\ldots = R_{n} = 0.001\,\varOmega\). The dimension of the reduced system is mor_dim = 16. Numerical simulation results are given in Fig. 6.38. We note that the macromodel-MOR approach gives a very good approximation to the full-simulation results.

Fig. 6.38 Comparison of the output spectrum around the first and third harmonics of two oscillators coupled via a transmission line. (a) first harmonic. (b) third harmonic. (c) first harmonic. (d) third harmonic

3.10 Conclusion

In this section we have shown how nonlinear phase macromodels can be used to accurately predict the behavior of individual or mutually coupled voltage controlled oscillators under perturbation, and how they can be used during the design process. Several types of coupling (resistive, capacitive, and inductive) have been described, and for small perturbations the nonlinear phase macromodels produce results with accuracy comparable to full circuit simulations, but at much lower computational cost. Furthermore, we have studied the (unintended) coupling between an oscillator and a balun, a case which typically arises during design and floor planning of RF circuits. For the coupling of oscillators with transmission lines we showed how the phase macromodel can be combined with model order reduction techniques to provide an accurate and efficient method.