1 Introduction

Process, voltage, and temperature (PVT) variations have a significant impact on circuit performance, yield, and reliability [1]. Circuit parameters are no longer deterministic; they are instead treated as probability distributions over a continuous parameter space. Predicting circuit behavior and performance over the entire parameter space is therefore necessary to catch most undesired behavior prior to fabrication.

Traditional corner-case verification methods are not accurate and cannot guarantee that a circuit will always behave according to its specification. Methods that compute circuit performance bounds in the presence of parameter variability using affine interval arithmetic [2] or global optimization [3, 4] are expensive, scale poorly with circuit complexity, and often lead to over-conservative results. Sampling-based methods such as Monte Carlo (MC) simulation [5] are easy to implement but are computationally expensive. Enhancing the MC sampling scheme by importance sampling [6] or by reducing the sampling discrepancy [7–9] does not work for all circuits and scales poorly with circuit size. Stochastic spectral methods [10–12], which model parameters as stochastic processes and avoid repeated simulations, require sophisticated solvers and quickly hit computational limits for nonlinear circuits with correlated parameters.

Model order reduction (MOR) [13] is a promising technique that reduces the size and complexity of large mathematical models. It builds compact models that reproduce the simulated behavior of an original model in a smaller amount of time. MOR methods for circuit simulation [14] can be a key to address the challenging problem of alleviating the computational cost of MC methods. They can be effectively used to reduce the number of differential equations which have to be solved in order to estimate circuit performances. The preliminary application of MOR methods to address process variation for linear networks [12, 15, 16] proves their potential. Unfortunately, the key challenge, which has been preventing further progress in this direction, resides in the limitations of MOR methods for the case of nonlinear circuits.

In this paper, we propose a novel simulation method that makes use of recently developed MOR methods for nonlinear circuits [17–19] to reduce the computational cost of statistical circuit analysis under process variation. The parameter space is sampled as in the MC method, and the resulting nonlinear models are reduced simultaneously to a small macromodel capable of reproducing the statistical behavior of all repeated MC simulations in a single simulation run. The efficiency of the proposed method is demonstrated on three applications: a current mirror, a three-stage inverter chain, and an operational transconductance amplifier. We analyze the performance of each circuit under the effect of device mismatch and show that the reduced models are accurate in terms of mean and standard deviation compared to an MC simulation of the original circuit model. We also show that the simulation traces retain the required accuracy (less than \(5\,\%\) relative error).

The remainder of this paper is organized as follows. Related work is briefly reviewed in Sect. 2. Preliminaries are introduced in Sect. 3. The proposed approach is described in detail in Sect. 4. Section 5 discusses experimental results and Sect. 6 concludes the paper.

2 Related work

In order to address the problem of circuit parameters uncertainty and circuit yield estimation, a few methods have been used in practice.

MC simulation is a leading method that consists in N repetitive simulations of a circuit model for randomly generated parameter values. Then, the statistical distributions of circuit performance metrics are predicted from the N obtained simulation traces. While this method is straightforward to implement, its run time scales poorly with the number of parameter samples and the length of transient simulations [5].

To accelerate the convergence of the MC method, several techniques have been developed, for example, Latin hypercube sampling (LHS) based methods [7, 8]. The LHS-based method controls the generation of the random samples, which reduces the number of required samples and provides superior convergence rates over the MC method. However, the LHS-based method may require preprocessing time and a large amount of memory, and its performance becomes comparable to that of the MC method as the number of random parameters increases. The quasi-MC method [9], a.k.a. low-discrepancy sampling, is a generalization of the LHS method for the multivariate case. The LHS and quasi-MC methods do not always reduce the simulation cost even though they simulate a smaller number of experiments.

Stochastic spectral methods [10–12], which have the same objective as the MC method, model the circuit parameters as continuous stochastic processes and compute the circuit response in terms of polynomial chaos (PC), in a Hilbert space. The stochastic circuit behavior is obtained by solving the obtained stochastic circuit model. The PC based methods show remarkable speedup over the MC method for RC or RLC interconnect model analysis. However, they result in extremely large models when random parameters are correlated and they require robust stochastic solvers especially for nonlinear circuit models. MC methods are typically more feasible in these situations.

In [20], the authors presented an intrusive-type stochastic solver, named ST, to quantify the uncertainties in transistor-level circuit analysis. The simulator is based on generalized PC (gPC) expansions and therefore can handle Gaussian and non-Gaussian random parameters. The efficiency of the ST method is enhanced by allowing decoupled numerical simulation and adaptive step size control.

The use of stochastic spectral circuit simulators based on gPC to handle parameters uncertainties has many limitations. For example, the number of gPC expansions scales poorly with the parameter space size. Also, it is difficult to transform correlated non Gaussian parameters to uncorrelated ones in order to easily construct gPC basis, as discussed in [21]. However, it is important to notice that in [22], an efficient framework to reduce the computational cost associated with the stochastic simulation based on gPC of complex systems with a large number of parameters, such as MEMS problems, is proposed. As detailed in [23], complex systems are decomposed hierarchically into subsystems which are simulated using a sparse stochastic testing simulator based on the adaptive anchored analysis of the variance method [24]. Then, the system level stochastic simulation is accelerated by the use of the tensor-train decomposition [25].

In [26], the authors extended the DC sensitivity-based mismatch analysis [27, 28] for analyzing mismatch effects on transient characteristics. The accuracy of the pseudo noise based mismatch analysis, which only computes the variance of a Gaussian performance variation, relies on the assumptions of a linear perturbation model and small mismatches.

Recently, statistical simulation methods started to benefit from linear circuits MOR methods. For example, in [29] the authors predicted the impact of process variation using linear MOR methods over intervals of parameters (the asymptotic waveform evaluation method [30] and the passive reduced order interconnect macromodeling algorithm [31]). However, these methods are limited to linear interconnect circuits, may have more than \(5\,\%\) estimation error for some cases and require robust numerical methods to avoid the interval range estimation error explosion. Also, MOR and stochastic spectral methods have been employed to enhance the extraction and the simulation of linear interconnect models [15, 16] and mismatch analysis [12]. However, these methods inherit the complexity and limitations of stochastic methods.

In [32], the authors proposed to first represent the performance bound of analog circuits under the effect of multiple interval valued parameter variations. They then generate a reduced differential model using the polytope representations [33] of circuits uncertain states and the nonlinear systems MOR method [34]. The models are solved using computational geometry. Unfortunately, due to the computational cost, this method is impractical for large nonlinear circuit models with correlated state variables.

In this work, we are taking advantage of the recent progress in the MOR methods which apply to the case of nonlinear circuit models [17, 18] and using them to reduce the computational cost of solving the large number of nonlinear equations needed to analyze process variation effects. We aim to use a single simulation run of a reduced model to approximate a large number of MC simulations without degrading the analysis accuracy. Therefore, our method computes the statistical circuit performances in a much smaller amount of time and can make them converge faster to their theoretical values by increasing the number of samples of the parameters space.

3 Preliminaries

3.1 Circuit model formulation in presence of process variation

Analog circuit differential models are often obtained using the modified nodal analysis [35] method. In general this method leads to the model given in Eq. (1)

$$\dot{x}=f(x, u, p)$$
(1)

where \(\dot{x}\) is the time derivative of the vector x, which represents the circuit voltage and current state variables, and f is a nonlinear vector function of the state vector x, the input u, and the circuit device model parameters p. The accuracy of the model in Eq. (1) is directly related to the accuracy of the circuit device models, which take into consideration the performance variation due to PVT variations. The effect of process variations (PV) on a circuit device is the sum of the effects of two variation types: (1) the inter-die variation, which affects all the devices similarly; and (2) the intra-die variation, which affects different devices differently [36]. PV leads to variations in device attributes (length, width, oxide thickness, etc.) when integrated circuits are fabricated. It affects the yield and performance (bandwidth, gain, rise time, delay, etc.) of the produced circuits, and its effect becomes more pronounced at smaller technology nodes and lower supply voltages. For example, device mismatch, which refers to the small random variations in the characteristics of identically designed devices, is a major concern in the design of analog circuits such as digitally controlled analog circuits, oscillators, current mirrors, and amplifiers. Pelgrom’s model [37] for MOS transistors relates the local mismatch variance of an electrical device parameter \(\sigma ^2(\Delta p)\) to the device width W and length L and to the technology constant \(A_{p}\), as given in Eq. (2). It is widely used to express the mismatch of the threshold voltage \(v_t\) and of the current factor \(\beta =\mu C_{ox}\frac{W}{L}\).

$$\sigma ^2\left( \Delta p\right) =\frac{A^2_{p}}{W \cdot L}+S^2_p \cdot D^2$$
(2)

where D is the distance between two transistors, \(S_p\) describes the variation of the parameter p with spacing. In [38], a different mismatch model, which is proposed for semiconductor devices (diodes, bipolar, etc.), is based on the propagation of variance in Eq. (3), where e is an electrical property and \(p_l\) are the process and geometry parameters.

$$\sigma ^2_{\delta e} = \sum _l \left( \frac{\partial e}{\partial p_l }\right) ^2 \sigma ^2_{\delta p_l}$$
(3)
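
As a minimal sketch of how the propagation of variance in Eq. (3) can be evaluated numerically, the Python snippet below estimates the sensitivities \(\partial e/\partial p_l\) by central differences and sums the weighted parameter variances. The square-law drain current used as the electrical property e, the function names, and all numerical values are illustrative assumptions and are not taken from [38].

```python
def propagated_variance(sensitivities, param_variances):
    """Eq. (3): sum over l of (de/dp_l)^2 * var(p_l)."""
    return sum(s ** 2 * v for s, v in zip(sensitivities, param_variances))


def central_difference(e, p, l, h=1e-6):
    """Central-difference estimate of de/dp_l at the parameter point p."""
    up, down = list(p), list(p)
    up[l] += h
    down[l] -= h
    return (e(up) - e(down)) / (2.0 * h)


def drain_current(p, vgs=1.0):
    """Assumed electrical property: I = 0.5 * beta * (vgs - vt)^2."""
    beta, vt = p
    return 0.5 * beta * (vgs - vt) ** 2


# Hypothetical nominal point (beta, vt) and parameter variances.
nominal = [2e-4, 0.5]
variances = [(1e-5) ** 2, (0.01) ** 2]

sens = [central_difference(drain_current, nominal, l) for l in range(2)]
var_current = propagated_variance(sens, variances)
```

The estimate is first order only: it is reliable when the parameter deviations are small enough for e to be close to linear around the nominal point.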

Except for very small circuits, it is difficult to analytically predict the behavior of a circuit due to the combination of mismatches of individual devices [1]. The impact of these random parameter variations on circuit behavior is rather studied with MC simulation [5] by repeating circuit simulations for randomly varied devices.

3.2 Sampling-based statistical methods

In sampling-based statistical simulation methods, the statistical characteristics of a state variable x due to a variation of the parameter p are obtained by solving the model in Eq. (1) N times \((x_{p_1} ,x_{p_2},\ldots ,x_{p_N})\) for randomly generated values of the parameter p \((p_1,p_2,\ldots , p_N)\). In order to accurately capture the effect of the variation of p, the parameter space has to be covered with a large number of samples N, chosen based on the variance of the output of interest and the required accuracy. For example, an MC simulation may require 1000 to 10,000 randomly generated values to reach a good confidence level on the simulation results [5]. The main challenges of sampling-based methods are: (1) how to efficiently sample the high-dimensional parameter space for a high coverage of the model behavior and (2) how to efficiently or simultaneously solve the resulting large number of nonlinear dynamical models. Methods like quasi-MC [9], LHS MC [7, 8], sparse sampling grids [39] and importance sampling algorithms [6] address the first challenge. The second challenge has been approached in different ways but remains problematic given the increasing complexity and size of integrated circuits. These approaches differ mainly in how the PV effect is modeled. The stochastic methods model PV as a stochastic process and employ stochastic solvers to estimate the statistical behavior of circuits [11, 12, 20, 21]. MOR methods can be used to address both challenges, by reducing the size of the parameter space or by reducing the models that have to be solved iteratively for each parameter sample, respectively. The application of MOR to address PV for linear networks [12, 15, 16] proves their potential.

In this paper, our contribution lies in taking advantage of the recent progress in MOR methods for nonlinear circuits and applying them to reduce the computational cost of solving the large number of nonlinear equations needed to analyze PV effects, mainly the mismatch of identically designed devices.
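
The sampling-based procedure described above can be sketched in a few lines of Python. The one-state model \(\dot{x}=-p\,x+u\), the forward Euler integrator, the parameter distribution, and the sample count are all assumptions chosen for illustration; a real flow would call a circuit simulator in place of `simulate`.

```python
import random
import statistics


def simulate(p, u=1.0, x0=0.0, dt=1e-3, steps=5000):
    """Forward-Euler solution of the toy model dx/dt = -p*x + u; returns x(T)."""
    x = x0
    for _ in range(steps):
        x += dt * (-p * x + u)
    return x


def monte_carlo(n_samples, p_mean=2.0, p_sigma=0.1, seed=0):
    """Solve the model once per random parameter sample and summarize x(T)."""
    rng = random.Random(seed)
    samples = [rng.gauss(p_mean, p_sigma) for _ in range(n_samples)]
    finals = [simulate(p) for p in samples]
    return statistics.mean(finals), statistics.stdev(finals)


mean_x, std_x = monte_carlo(200)
```

The cost grows linearly with the number of samples and with the length of each transient run, which is the bottleneck that the second challenge refers to.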

3.3 Nonlinear projection based model order reduction

In the literature, model order reduction (MOR) is the transformation of a large dynamical model described by Eq. (1) into a smaller model that mimics its behavior and can be simulated in a considerably smaller amount of time. MOR has previously been applied to nonlinear circuit models and has led to faster circuit models with acceptable accuracy levels [18, 19, 40, 41]. We consider in this paper the method proposed in [17], which is summarized in Algorithm 1. This MOR method requires a differential model (Line 1) and constructs reduced models iteratively until the target speedup and accuracy requirements are met (Line 11). The original model is simulated (DC and transient simulations in Line 4), which results in a collection of trajectories for different inputs and initial conditions. These trajectories are captured in the form of three matrices: the state variable X, its time derivative F, and the input U. The obtained behavior snapshot X is clustered into a set of k centroids C in Line 5 using the k-means algorithm [42]. Then, the model in Eq. (1) is linearized at each element of the set of centroids C. The number k of clusters is initially set to a minimal value and is increased iteratively until the linearized model becomes accurate. The linearization matrices are used to compute an orthonormal Krylov subspace projection matrix V (\({V}^t{V}={I}_q\), where q is the reduced order) using either the block Arnoldi or block Lanczos algorithms [13]. The linearized model in Line 6 is reduced via projection in Line 8, which leads to Eq. (4), where \(\hat{F}=V^t\cdot F\), \(\hat{J}_{z}=V^t \cdot \frac{\partial f}{\partial x}_{|x}\cdot V\), \(\hat{J}_{u}=V^t \cdot \frac{\partial f}{\partial u}_{|u}\), and \(Z=V^t \cdot C\). The matrices and vectors are dynamically evaluated using the weights \(w(i)= \frac{\Vert z-Z(i)\Vert _2^{-1}}{\left( \sum _{j=1}^m\Vert z-Z(j)\Vert _2\right) ^{-1}}, i=1,\ldots , k\) in order to approximate the behavior of the original model.

$$\dot{z}=\sum _{i=1}^{m\le k} w(i)\cdot \left( \hat{F}(i)+\hat{J_z}_i\cdot \left( z-Z(i)\right) +\hat{J_u}_i\cdot \left( u-U(i)\right) \right)$$
(4)

The simulation of the reduced model in Line 9 is required to check that it meets the speedup and accuracy criteria. The speedup is evaluated as the simulation time ratio \(S=T(x)/T(z)\), where T(x) and T(z) are the simulation times of the original and the reduced models, respectively. The accuracy of the reduced model is checked by measuring the relative error between the state variable x and its approximation \(\hat{x}\). If the speedup and accuracy goals are not met in Line 11, the MOR process is iteratively restarted with a refinement of the parameters until the reduced model is accepted.
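
To make the structure of Eq. (4) concrete, the following Python sketch evaluates the right-hand side of a reduced model as a weighted sum of local affine models. It is an illustration under stated assumptions rather than the implementation of [17]: the weights are inverse distances normalized to sum to one, a small `eps` guards against division by zero at a centroid, and the dictionary layout of each local model is hypothetical.

```python
import math


def inverse_distance_weights(z, centroids, eps=1e-12):
    """Inverse-distance weights, normalized to sum to one (assumption)."""
    inv = [1.0 / (math.dist(z, c) + eps) for c in centroids]
    total = sum(inv)
    return [v / total for v in inv]


def matvec(a, v):
    """Product of a matrix stored as a list of rows with a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def reduced_rhs(z, u, models):
    """Weighted sum of local affine models, in the spirit of Eq. (4).

    Each model is a dict with keys "Z" (reduced centroid), "U" (input at the
    centroid), "F" (reduced f at the centroid), "Jz" and "Ju" (reduced Jacobians).
    """
    w = inverse_distance_weights(z, [m["Z"] for m in models])
    dz = [0.0] * len(z)
    for w_i, m in zip(w, models):
        state_term = matvec(m["Jz"], [a - b for a, b in zip(z, m["Z"])])
        input_term = matvec(m["Ju"], [a - b for a, b in zip(u, m["U"])])
        for k in range(len(z)):
            dz[k] += w_i * (m["F"][k] + state_term[k] + input_term[k])
    return dz
```

For a model that is already linear, every local model reproduces it exactly, so the weighted sum returns the true derivative regardless of the weights; this gives a simple sanity check for an implementation.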

Algorithm 1

4 Proposed methodology

We propose to transform the problem of statistical simulation into a problem of reducing the size of a large nonlinear dynamical model using MOR methods for nonlinear circuits [17]. Instead of the traditional statistical analysis approach of repeating simulations of a circuit model for a large number of samples of the uncertain parameters, we propose to reduce a larger differential model built from different instances of the circuit model, each of which corresponds to a different sample of the uncertain parameters. Then, the obtained reduced model is simulated only once to perform the job of an N-point MC simulation. Figure 1 depicts the four main steps of the methodology. First is the model replication step, where we build a large differential model out of N instances of the circuit model \(Model(p,x)\) for the randomly generated parameter samples (\(Model(x,u,p_1,p_2,\ldots ,p_N)\)). Then, we reduce the obtained large model using the MOR method described in Sect. 3.3. After that, in the reduced model simulation step, we simulate the resulting reduced model. Finally, we perform a backward projection of the reduced model simulation traces into the state space of the N circuit instances and use them in the statistics generation step to compute the statistical behavior of \(Model(p,x)\). The details of each step are provided in the sequel.

Fig. 1 Fast statistical simulation method

4.1 Model replication

In this first step of the statistical simulation methodology, we build the large differential model \(Model(x,u,p_1,p_2,\ldots ,p_N)\) out of N instances of the circuit model \(Model(p,x)\). First, N parameter samples (\(p_1,p_2,\ldots ,p_N\)) are generated according to the circuit technology specification or a PV estimation formula such as Pelgrom’s model [37] in Eq. (2). Each parameter sample is then assigned to one of the N instances of the circuit model, as shown in Eq. (5). This system of differential models can be viewed as a single differential model with a large state vector formed by the state vectors of all N instances of the circuit model.

$$\begin{array}{rccc} \dot{x}_{p_1}&=& f(x_{p_1},u,p_1) \\ \dot{x}_{p_2}&=& f(x_{p_2},u,p_2) \\ \vdots&\vdots&\ \ \ \vdots \\ \dot{x}_{p_N}&= & f(x_{p_N},u,p_{N}) \end{array}$$
(5)

where \(x_{p_i}, i=1,\ldots , N\) is the state vector of the original circuit \(Model(x,u,p)\) when the parameter p is set to the sample \(p_i\). The model replication step is implemented using a script that copies the original circuit model N times and sets the corresponding parameter sample in each copy according to the previously generated random parameter distribution.
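
The replication step can be pictured with the short Python sketch below, which stacks N copies of a model right-hand side into a single function operating on one long state vector, as in Eq. (5). The helper name `replicate` and the one-state model `f` are hypothetical and only illustrate the data layout; they are not the script used in this work.

```python
def replicate(f, params, n_states):
    """Return the right-hand side of the stacked model in Eq. (5).

    f(x, u, p) returns the derivative of one circuit instance as a list;
    the stacked state stores the instances one after another.
    """
    def stacked_rhs(x_all, u):
        out = []
        for i, p in enumerate(params):
            x_i = x_all[i * n_states:(i + 1) * n_states]
            out.extend(f(x_i, u, p))
        return out
    return stacked_rhs


def f(x, u, p):
    """Hypothetical one-state circuit model: dx/dt = -p*x + u."""
    return [-p * x[0] + u]


rhs = replicate(f, params=[1.0, 2.0, 3.0], n_states=1)
dx = rhs([1.0, 1.0, 1.0], u=0.5)  # one derivative block per parameter sample
```

All instances share the same input u and differ only in their parameter sample, which is why their trajectories are strongly correlated and amenable to reduction.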

4.2 Model order reduction

The MOR method described in Algorithm 1 is modified and customized for an efficient reduction of the large model in Eq. (5), as described in the following three steps.

4.2.1 Linearization points generation

The main steps for generating the linearization points for the N circuit model instances are summarized in Algorithm 2. In Algorithm 1, the linearization points are selected by clustering a snapshot of the original circuit model simulation. Here, simulating all N circuit instances to generate such a snapshot is computationally expensive, and avoiding it is the main objective of this work. Consequently, as shown in Algorithm 2, we first simulate only one instance of the original circuit model in Line 3 and use clustering to generate the necessary linearization points of the simulated model in Line 4. The snapshot of the circuit model simulation (DC and transient simulations) with the parameter p set to the mean value \(\mu _p=\frac{1}{N}\sum _{i=1}^{N}p_i\) is divided into clusters using the agglomerative hierarchical clustering method in MATLAB [43]. As a result, k clusters are obtained, which leads to an accurate piecewise linear approximation of \(Model(x, \mu _p)\) in each cluster \(c=1, 2, \ldots , k\), as given in Eq. (6).

$$\begin{aligned} \dot{x }&= \,f(x(c), u(c), \mu _p)+ J_{x} \cdot (x -x(c))\\&\quad +\, J_{u}\cdot (u-u(c)) \end{aligned}$$
(6)

where x(c) is the cluster centroid, u(c) is the input that corresponds to x(c), \(J_{x}= \frac{\partial f}{\partial x}\), and \(J_{u}=\frac{\partial f}{\partial u}\). Then, in Lines 5–14, we employ a small perturbation model around each cluster centroid in Line 9, that is given in Eq. (7), to find the linearization points needed for all the N instances of the model subject to PV. Finally, it is also possible that the parameter variation changes the DC operating behavior and in this case we must verify that the linear approximation is still valid. Basically, we solve the DC equation of the circuit model again with the new parameter value as shown in Lines 10–12. Therefore, any DC behavior change of the circuit instances due to PV is captured into the set of clusters.

$$x(c , \mu _p+\delta p)=x(c, \mu _p)+\delta p \frac{\partial x}{\partial p}$$
(7)
Algorithm 2
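
The perturbation model of Eq. (7) can be sketched as follows in Python. The sketch assumes that the sensitivities \(\partial x/\partial p\) at each nominal centroid are already available (for instance from a sensitivity analysis of the nominal model); the function name and the data layout are hypothetical, and the DC re-solve mentioned above is not included.

```python
def perturbed_centroids(centroids, sensitivities, p_samples, p_mean):
    """Apply Eq. (7) to every centroid for every parameter sample.

    centroids[c] is the nominal centroid x(c, mu_p) and sensitivities[c] is
    dx/dp evaluated at that centroid (assumed to be given).
    """
    result = []
    for p in p_samples:
        dp = p - p_mean
        result.append([
            [x + dp * s for x, s in zip(centroid, dxdp)]
            for centroid, dxdp in zip(centroids, sensitivities)
        ])
    return result  # result[i][c] is centroid c of circuit instance i
```

With k nominal centroids and N parameter samples this produces k times N linearization points from a single nominal simulation.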

Figure 2 illustrates the cluster generation step for a circuit with two state variables (a tunnel diode oscillator). It shows six clusters (\(k = 6\)) and their perturbed centroid points, which together form the centroids of twenty circuit instances (\(N = 20\)). The perturbation effect appears as a deviation from the cluster centroids of the nominal-parameter circuit model, while the perturbed points always remain within the real trajectories of the perturbed circuit models.

Fig. 2 Example of perturbed cluster centroids

4.2.2 Linearization of system of equations

In this step the set of differential models in Eq. (5) is reformulated as follows:

$$\dot{y}=f^*(y, u, p)$$
(8)

where \(y=[x_{S_1}, \ldots , x_{S_m}]\) is the new state variable, which consists of groups of state variables. The state variables in each group exhibit almost the same dynamical behavior range and have the same order of magnitude. For example, currents are grouped together, and voltages are divided into groups based on their range and the sign of \(\frac{\partial x_i}{\partial u}\). As a result, the state variables of each group have a similar behavior and can be reduced efficiently using the proper orthogonal decomposition (POD) reduction method described in [13]. In this work, the grouping of state variables is performed manually; however, a classification algorithm [43] can perform it automatically based on the circuit simulation traces.
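
One simple way to automate the grouping is sketched below in Python: state variables are binned by the decade of their peak absolute value in the simulation traces. This heuristic is an assumption made for illustration; it ignores the sign of \(\frac{\partial x_i}{\partial u}\) and is not the manual procedure used in this work.

```python
import math
from collections import defaultdict


def group_by_magnitude(traces):
    """Group state indices by the decade of their peak absolute value.

    traces[i] is the simulated waveform of state variable i.
    """
    groups = defaultdict(list)
    for idx, trace in enumerate(traces):
        peak = max(abs(v) for v in trace)
        decade = math.floor(math.log10(peak)) if peak > 0 else None
        groups[decade].append(idx)
    return dict(groups)
```

States whose traces are identically zero are collected under the key `None` so that they can be handled separately.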

The function \(f^*\) in Eq. (8) represents m systems of equations labeled \(S_1, \ldots , S_m\). Each system \(S_i\) is represented by equations that are functions of the state variable \(x_{S_i}\), the new input \(u_{S_i}\) (the original circuit input u as well as other state variables from the remaining systems \(S_1, \ldots , S_m\)), and a subset of the randomly generated parameters \(p_1, \ldots , p_N\). The new model in Eq. (8) is illustrated in Fig. 3(a). Then, each system \(S_i\) is locally linearized using the linearization points generated in the previous step. As a result, each linearized system \(S_i\) is described by Eq. (9).

$$\begin{aligned} \dot{x}& = \sum _{c=1}^{l} W(x)\cdot [f^*_{S_i}(x(c), u_{S_i}(c), p) \\&\quad +\,A_{S_i}\cdot (x-x_{S_i}(c)) + B_{S_i} \cdot (u_{S_i}-u_{S_i}(c))] \end{aligned}$$
(9)

where \(A_{S_i}=\frac{\partial f^*}{\partial x_{S_i}}\), \(B_{S_i}=\frac{\partial f^*}{\partial u_{S_i}}\), and W(x) are the weights computed using Eq. (10) that enable the aggregation of \(l \le k\) linearized models over the clusters boundaries.

$$W(x)= \frac{\Vert x -x_{S_i}(c)\Vert ^{-1}}{\left( \sum _{c=1}^{l}\Vert x -x_{S_i}(c)\Vert \right) ^{-1}}$$
(10)
Fig. 3 Block subdivision. (a) Original model. (b) Reduced model

4.2.3 Reduction of system of equations

A reduction basis \(V_i\) of size \(n_{S_i}\times q_i\) is computed using the POD reduction method described in [13]. This step corresponds to Line 7 of Algorithm 1 where \(q_i \ll n_{S_i}\). Then, \(V_i\) is used to reduce the matrices that appear in the multiple input linear systems in Eq. (9). It results in the reduced systems \(\hat{S}_i\) for \(i=1\ldots m\), as given in Eq. (11).

$$\begin{aligned} \dot{z}&= \sum _{c=1}^{l} W(z)\cdot [\hat{f}^*_{S_i}(x(c), u_{S_i}(c), p) \\&\quad + \, \hat{A}_{S_i}\cdot (z-z(c)) + \hat{B}_{S_i} \cdot (u_{S_i}-u_{S_i}(c))] \end{aligned}$$
(11)

where z is the state variable of the reduced system \(\hat{S}_i\), \(\hat{f}^*_{S_i}(x(c), u_{S_i}(c), p)=V_i^t \cdot f^*_{S_i}(x(c), u_{S_i}(c), p)\), \(\hat{A}_{S_i}=V_i^t \cdot A_{S_i}\cdot V_i\), \(\hat{B}_{S_i}= V_i^t \cdot B_{S_i}\), and \(z(c)=V_i^t \cdot x(c)\). The local reduced linear models are also weighted, using the weight function in Eq. (10), to enable model aggregation in the reduced state space.

Figure 3(b) depicts the reduced systems of equations \(\hat{S}_i\) and how the backward projection of their state variables is used to form the state variable y of the original problem in Fig. 3(a). The reduced model has a size \(q=\sum _{i=1}^m q_i\) and a state variable \([z_{S_1}, z_{S_2}, \ldots , z_{S_m}]\). The full-order state variable that approximates the state vector y is \(\hat{y}=[\hat{x}_{S_1}, \hat{x}_{S_2}, \ldots , \hat{x}_{S_m}]\), where \(\hat{x}_{S_i}= V_i \cdot z_{S_i}, \quad i=1, \ldots , m\).
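
The backward projection \(\hat{x}_{S_i}= V_i \cdot z_{S_i}\) amounts to one matrix-vector product per group followed by a concatenation. The Python sketch below shows this with plain lists; the function name and the row-wise storage of each basis are assumptions made for illustration.

```python
def back_project(bases, reduced_states):
    """Reconstruct y_hat = [V_1 z_1, ..., V_m z_m] from the reduced states.

    bases[i] is an n_i x q_i matrix stored as a list of rows and
    reduced_states[i] is the corresponding reduced vector of length q_i.
    """
    y_hat = []
    for v, z in zip(bases, reduced_states):
        y_hat.extend(sum(v_ij * z_j for v_ij, z_j in zip(row, z)) for row in v)
    return y_hat
```

The length of the result is the sum of the group sizes \(n_{S_i}\), so each entry can be mapped back to one state variable of one circuit instance.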

4.3 Reduced model simulation and statistics generation

Algorithm 3 describes the steps for the reduced model simulation, the statistics generation of a circuit performance \(P_f\), and the comparison with the MC simulation method. In Lines 3–8, the reduced model is simulated and the state vector \(\hat{y}\), which can be compared with the MC simulation traces, is reconstructed via the backward projection \(\hat{y}=[V_1\cdot z_{S_1}, V_2\cdot z_{S_2}, \ldots , V_m\cdot z_{S_m}]\). The statistics of the circuit performance \(P_f\) obtained with the reduced model are generated in Line 6 and the runtime \(T_{RM}\) is saved in Line 7. In Lines 8–14, the MC simulation is conducted for the original circuit model \(Model(x, u, p_1, p_2, \ldots , p_N)\) in Eq. (5). The statistics of \(P_f\) obtained with the MC method are generated in Line 13 and the runtime \(T_{MC}\) is saved in Line 14. Finally, the reduced model speedup over the MC method and its accuracy are evaluated in Lines 15 and 16, respectively.
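
The final comparison step can be sketched in Python as follows. The function name, the dictionary keys, and the choice of relative errors on the mean and standard deviation as accuracy metrics are assumptions made for illustration; the performance values and runtimes are expected to come from the two simulation flows described above.

```python
import statistics


def compare(perf_rm, perf_mc, t_rm, t_mc):
    """Summarize a reduced-model run against a Monte Carlo reference.

    perf_rm and perf_mc hold one performance value per parameter sample;
    t_rm and t_mc are the measured runtimes of the two flows.
    """
    mean_rm, mean_mc = statistics.mean(perf_rm), statistics.mean(perf_mc)
    std_rm, std_mc = statistics.stdev(perf_rm), statistics.stdev(perf_mc)
    return {
        "speedup": t_mc / t_rm,
        "mean_rel_err": abs(mean_rm - mean_mc) / abs(mean_mc),
        "std_rel_err": abs(std_rm - std_mc) / abs(std_mc),
    }
```

The sketch assumes a nonzero reference mean and standard deviation; a performance metric centered on zero, such as an offset voltage, would need an absolute error instead.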

Algorithm 3

5 Applications

We apply the proposed statistical simulation method to three circuits: a current mirror, an operational transconductance amplifier, and a three-inverter chain, under current factor (\(\beta\)) and threshold voltage (\(v_t\)) process variation. For all applications, N values of the variations \(\delta \beta\) and \(\delta v_t\) are generated based on Pelgrom’s simplified model [37]. The mean values of \(\beta\) and \(v_t\) are set to the 180 nm technology nominal values and their standard deviations are computed using Eq. (12).

$$\begin{aligned} \sigma^{2}(\Delta v_t)&= \frac{A^{2}_{v_t}}{W \cdot L}\\ \frac{\sigma^{2}(\Delta \beta )}{\beta^{2}}&= \frac{A^{2}_{\beta }}{W \cdot L} \end{aligned}$$
(12)

where the terms \(A_{v_t}\) and \(A_{\beta }\) are proportionality constants for the 180 nm technology, taken from [44], and W and L refer to the width and the length of the transistors, respectively.
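
A minimal Python sketch of the sample generation is given below. It draws zero-mean Gaussian mismatch samples whose standard deviation follows the area law \(A_p/\sqrt{W \cdot L}\). The constant `A_VT_MV_UM` is a placeholder and not a value from [44], and the device dimensions assume a width of 0.36 µm and a length of 0.18 µm.

```python
import math
import random


def sample_mismatch(a_p, width_um, length_um, n, seed=0):
    """Draw n zero-mean Gaussian mismatch samples with sigma = A_p / sqrt(W*L)."""
    sigma = a_p / math.sqrt(width_um * length_um)
    rng = random.Random(seed)
    return sigma, [rng.gauss(0.0, sigma) for _ in range(n)]


# Placeholder matching constant in mV*um (assumption, not a technology value).
A_VT_MV_UM = 5.0
sigma_vt, dvt = sample_mismatch(A_VT_MV_UM, 0.36, 0.18, n=1000)
```

Fixing the seed makes the parameter samples reproducible, so that the same set can be reused for both the MC reference and the reduced model.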

Figure 4 provides an example that uses \(N=1000\) Gaussian distributed samples of the current factor and the threshold voltage mismatch (\(\delta \beta\) and \(\delta v_t\) for NMOS (\(\frac{W}{L}=\frac{360}{180}\)) and PMOS (\(\frac{W}{L}=\frac{720}{180}\)) transistors) using the proportionality constant values provided in Table 1.

Fig. 4 Threshold voltage and current factor variation distributions for NMOS and PMOS transistors

Table 1 180 nm matching proportionality constants for size dependence

In what follows, we describe and compare the statistical circuit performance obtained from the N points MC simulations and the ones obtained through the application of the proposed method. All simulations were performed in the MATLAB environment [43] on a Windows 7 operating system with a 2.8 GHz Intel Core i7 CPU and 24 GB of RAM.

5.1 Current mirror

We consider the current mirror shown in Fig. 5, which functions by replicating the current produced in one active device into a second active device. The main feature of a current mirror is the high output impedance, which guarantees a stable output current regardless of the load conditions. The output currents \(I_2,\ldots ,I_M\) are proportional to \(I_{1}\) as shown in Eq. (13). The current ratios depend on the transistor sizes, their drain-source voltages and the Early voltage. When the transistors are not perfectly matched, there is always a systematic current gain error \(\varepsilon\) [45].

$$I_{j}=\frac{(W/L)_{j}}{(W/L)_{1}}I_{1}(1+\varepsilon )\quad \text{for}\quad j=2,\ldots ,M$$
(13)
Fig. 5 Current mirror circuit

In this application, we apply our method to analyze the effect of threshold voltage mismatch on the copying capability of the current \(I_1\), where the transistors \(M_1\) and \(M_2\) have the same width and length while the rest of the transistors have different sizes. The size of the original problem is \(n \times N=4 \times 1000\): 4 state variables and 1000 sample points of a Gaussian threshold voltage distribution. The state variables are divided into 4 systems of 1000 equations based on their order of magnitude, which differs because the mirroring capability of the transistors is different. The reduction procedure is performed using 4 reduction bases of size \(5 \times 1000\), which makes the total reduction size \(q=20\).

Figure 6 illustrates the statistical distribution of the currents \(I_1\) and \(I_2\) for the 4-state original model (left column), obtained through a 1000-point MC simulation, and for the 20-state reduced model (right column), obtained through the proposed method. The currents \(I_1\) and \(I_2\) in the left column have similar distributions since the transistors \(M_1\) and \(M_2\) have equal sizes. The slight difference in their mean and standard deviation is due to the effect of their mismatch. The comparison of the currents in the left and right columns shows that the distributions are the same, which illustrates that the 20-state reduced model provides the same statistics as the 1000-point MC simulation.

Fig. 6 Current mirror current distributions

Table 2 provides the numerical values of the mean and the standard deviation of currents \(I_1\) and \(I_2\) for MC simulations (column 2) and the 20 state reduced model obtained using a different number of clusters, i.e., linearization points (columns 3, 4, and 5). The last row of Table 2 shows that the simulation speedup when using the reduced model ranges from 211 to 456 compared with the MC simulation while their statistical behaviors are almost the same.

Table 2 Comparison of the simulated current mirror performance using the MC method \((n\times N= 4 \times 1000)\) and the reduced model \((q=20)\)

5.2 Operational transconductance amplifier

In this application, we consider the Operational Transconductance Amplifier (OTA) shown in Fig. 7. It is one of the most basic and versatile circuits in analog IC design, and its performance is affected by PV. If the symmetrical devices of the OTA circuit are not identical, the differential gain, common mode rejection ratio, and offset voltage are affected [1].

Fig. 7 Fully differential operational transconductance amplifier circuit

We use our method to analyze the effect of threshold voltage and current factor mismatches on the differential gain \(A_d= \frac{vop-von}{ vip-vin}\) and the output offset voltage \(V_{os}\), i.e., the differential output (\(vop-von\)) when the inputs are tied together (\(vip-vin=0V\)). The input common mode voltage of the OTA is set to 0.5 V. The size of the original problem is \(n \times N=5 \times 1000\): 5 state variables and 1000 sample points for the threshold voltage \(v_t\) and the current factor \(\beta\), both assumed to have Gaussian distributions. The state variables are divided into 2 systems: the first system has 1000 equations and corresponds to the state \(x_1\); the second system has 4000 equations and corresponds to the states \(x_2,x_3,x_4,x_5\), as shown in Fig. 7. The reduction procedure is performed using 2 projection bases of size \(5 \times 1000\) and \(20 \times 4000\), which makes the total reduction size \(q=25\).

Figure 8 shows that the DC behaviors under PV obtained with the reduced OTA model and with the MC simulation overlap. The continuous line corresponds to one parameter sample s from the 1000 point MC simulation. The circle marked line was generated by solving only the 25 state DC equations, projecting the resulting solution vector back to the original state space using the transpose of the projection matrix, \(V^t\), and selecting the DC solution that corresponds to the same parameter sample s. Figure 9 compares the OTA transient behavior under PV obtained with the reduced OTA model and with the MC simulation. The continuous line corresponds to one parameter sample s from the 1000 point MC simulation. The circle marked line was generated by solving only the 25 state differential equations and reconstructing the original state space transient behavior that corresponds to the same parameter sample s.

Figure 10 shows the statistical distributions of the OTA differential gain \(A_d\) and offset voltage \(V_{os}\) for the 1000 point MC simulation of the 5 state original OTA model (left column) and for the simulation of the 25 state reduced OTA model obtained through the proposed method (right column). The two sets of distributions are very close.
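The backward projection and sample selection used to obtain the circle marked lines in Figs. 8 and 9 can be sketched as follows. This is a minimal illustration, assuming that a basis \(V\) is stored as q rows of length \(k \times N\) (matching the \(20 \times 4000\) shape quoted for the second OTA system) and that the stacked state vector is ordered state by state, with the N sample entries of each state stored contiguously. That ordering, the function names, and the toy numbers are assumptions, not details given in the text.

```python
def back_project(basis, reduced_state):
    """Return V^t z for a basis V stored as q rows, each of length k*N."""
    q = len(basis)
    length = len(basis[0])
    return [sum(basis[r][j] * reduced_state[r] for r in range(q))
            for j in range(length)]


def select_sample(full_state, n_samples, sample):
    """Extract the k state values of one MC sample from the stacked vector.

    Assumes a state-major layout: entry i * n_samples + sample holds
    state i of the requested parameter sample.
    """
    n_states = len(full_state) // n_samples
    return [full_state[i * n_samples + sample] for i in range(n_states)]


# Toy example: k = 2 states, N = 3 samples, q = 2 reduced states.
toy_basis = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # first basis row
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],   # second basis row
]
toy_reduced = [0.7, -0.2]

toy_full = back_project(toy_basis, toy_reduced)
print(select_sample(toy_full, n_samples=3, sample=0))
print(select_sample(toy_full, n_samples=3, sample=1))
```

With this layout, only the reduced system is solved; recovering the trace of any single sample s is a matrix-vector product followed by indexing.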

Fig. 8 OTA DC characteristic sample

Fig. 9 OTA transient simulation sample

Fig. 10 OTA differential gain and offset voltage distributions

Table 3 compares the numerical values of the mean and the standard deviation of the offset voltage \(V_{os}\) and the differential gain \(A_d\) using different numbers of clusters (columns 3, 4, and 5) and shows that they have almost the same characteristics. The relative errors of the state space vectors \(\frac{\Vert x-\hat{x}\Vert }{\Vert x \Vert }\) and the output vectors \(\frac{\Vert y-\hat{y}\Vert }{\Vert y \Vert }\) are also shown in rows 7 and 8 of Table 3, respectively. The state space vector x corresponds to the MC simulation traces, and the state vector \(\hat{x}\) is obtained by backward projection of the simulation traces of the 25 state reduced OTA model. The last row of Table 3 shows that the simulation speedup of the reduced model over the MC simulation ranges from 89 to 220. With 10 clusters, the reduced OTA model runs 220 times faster than the MC simulation while providing very close statistical behavior.

Table 3 Comparison of the simulated OTA Performance using the MC method \((n\times N= 5 \times 1000)\) and the reduced model \((q=25)\)
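The relative error measure \(\frac{\Vert x-\hat{x}\Vert }{\Vert x \Vert }\) reported in Table 3 can be computed as in the short sketch below. The text does not state which norm is used, so the Euclidean norm is an assumption here, as are the function name and the toy vectors.

```python
import math


def relative_error(reference, approximation):
    """Return ||x - x_hat|| / ||x|| using the Euclidean norm (assumed)."""
    if len(reference) != len(approximation):
        raise ValueError("vectors must have the same length")
    diff_norm = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(reference, approximation)))
    ref_norm = math.sqrt(sum(a * a for a in reference))
    if ref_norm == 0.0:
        raise ValueError("reference vector has zero norm")
    return diff_norm / ref_norm


# Toy vectors standing in for an MC trace x and its reconstruction x_hat.
x_mc = [3.0, 4.0]
x_hat = [3.0, 4.5]
print(f"relative error = {100.0 * relative_error(x_mc, x_hat):.2f} %")
```

The same function applies to the output vectors y and \(\hat{y}\).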

5.3 Three inverter chain

In this application, we consider a chain of three CMOS inverters, as shown in Fig. 11, which is a basic cell of many integrated circuits such as oscillators and transmission lines. We apply our method to analyze the effect of threshold voltage and current factor mismatches due to PV on the gain (G) and the rise time (\(t_r\)) of the three inverter chain. The gain is computed as the steepest slope of the input-output DC transfer curve when the input is swept from 0 to 1.8 V. The rise time is computed as the time required for the output \(x_3\) to increase from 10 to \(90\,\%\) of its maximal value when the input u is a sharp step \(u(t)=1.8 \cdot H(t-2.5 \times 10^{-9})\). The capacitance at each output node is C = 100 fF. The size of the original problem is \(n \times N=3 \times 1000\), which corresponds to three state variables \(x_1,x_2, x_3\) and 1000 sample points drawn from Gaussian distributions of the threshold voltage \(v_t\) and the current factor \(\beta\). The state variables are divided into two systems: the first system has 2000 equations and corresponds to the states \(x_1\) and \(x_3\), as they have the same derivative sign; the second system has 1000 equations and corresponds to the state \(x_2\). The two systems of equations are reduced using two projection bases of size \(10 \times 2000\) and \(5 \times 1000\), respectively, which makes the total reduction size \(q=15\).
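The two performance metrics defined above can be extracted from sampled curves as in the following sketch. It assumes finite-difference slopes, takes the steepest slope in absolute value, and locates the 10 and \(90\,\%\) crossings by linear interpolation. The function names and the synthetic test curves (a tanh-shaped transfer curve and a first-order step response) are hypothetical stand-ins, not simulation data from the inverter chain.

```python
import math


def steepest_slope(v_in, v_out):
    """Largest absolute finite-difference slope of a sampled DC transfer curve."""
    return max(abs((v_out[k + 1] - v_out[k]) / (v_in[k + 1] - v_in[k]))
               for k in range(len(v_in) - 1))


def crossing_time(t, y, level):
    """First time at which y rises through level (linear interpolation)."""
    for k in range(1, len(t)):
        if y[k - 1] < level <= y[k]:
            frac = (level - y[k - 1]) / (y[k] - y[k - 1])
            return t[k - 1] + frac * (t[k] - t[k - 1])
    raise ValueError("signal never rises through the requested level")


def rise_time(t, y):
    """Time for y to go from 10 % to 90 % of its maximal value."""
    y_max = max(y)
    return crossing_time(t, y, 0.9 * y_max) - crossing_time(t, y, 0.1 * y_max)


# Synthetic transfer curve with a known maximum slope of 0.9 * 10 = 9.
v_in = [k * 1.8 / 1800 for k in range(1801)]
v_out = [0.9 - 0.9 * math.tanh(10.0 * (v - 0.9)) for v in v_in]
gain = steepest_slope(v_in, v_out)

# Synthetic first-order step response; its 10-90 % rise time is tau * ln(9).
tau = 1e-9
t = [k * 10.0 * tau / 2000 for k in range(2001)]
y = [1.8 * (1.0 - math.exp(-tk / tau)) for tk in t]
t_r = rise_time(t, y)

print(f"gain = {gain:.3f}, rise time = {t_r:.3e} s")
```

Applying these functions to each of the 1000 reconstructed traces gives the samples from which gain and rise time histograms can be built.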

Fig. 11 Three inverter chain circuit

Figure 12 compares the transient behavior of the three inverter chain states using the 1000 points MC simulation (bottom curves) and the backward projection of the reduced model generated by our method (top curves). The variation of the transient behavior due to process variation is the same in both graphs.

Fig. 12 Inverter chain transient behavior

Figure 13 compares the statistical distributions of the gain G and the rise time \(t_r\) for 1000 points MC simulation of the 3 state original inverter chain model, shown in the left column, and the simulation of the 15 state reduced inverter chain model, shown in the right column, obtained through the proposed method.

Fig. 13 Three inverter chain gain and rise time distributions

Table 4 compares the numerical values of the mean and the standard deviation of the gain G and the rise time \(t_r\) using different numbers of clusters (columns 3, 4, and 5) and shows that they have almost the same characteristics. The relative errors of the state space vectors \(\frac{\Vert x-\hat{x}\Vert }{\Vert x \Vert }\) and the output vectors \(\frac{\Vert y-\hat{y}\Vert }{\Vert y \Vert }\) are also shown in rows 7 and 8 of Table 4, respectively. The state space vector x corresponds to the MC simulation traces, and the state vector \(\hat{x}\) is obtained by backward projection of the simulation traces of the 15 state reduced three inverter chain model. The largest relative errors, \(\frac{\Vert x-\hat{x}\Vert }{\Vert x \Vert }=3.27 \,\%\) and \(\frac{\Vert y-\hat{y}\Vert }{\Vert y \Vert }=3.98 \,\%\), are observed only when five linearization points are used to build the reduced model and are still in the acceptable range (\(\le\)5 \(\%\)). The last row of Table 4 shows that the simulation speedup of the reduced model over the MC simulation ranges from 97 to 215.

Table 4 Comparison of the simulated three inverter chain performance using the MC method \((n\times N= 3 \times 1000)\) and the reduced model \((q=15)\)

5.4 Discussion

We have demonstrated the efficiency of the proposed fast statistical simulation method on three nonlinear circuits: a current mirror, an operational transconductance amplifier, and a three inverter chain. The main challenge was to obtain the same statistical behavior as the MC method in a smaller amount of time. As shown by the numerical experiments in this section, the reduced models of the considered applications reproduce the transistor mismatch effects that were also simulated with the MC method. They yield speedup values between 89 and 456, accurate statistical properties, and relative errors below \(5\,\%\) when compared with 1000 sample MC simulations of the original circuit models. We expect that the method can yield much higher speedup values, with accurate results, when applied to larger circuits such as oscillators, analog filters, mixers, and voltage references.

The distributions of the circuit performances shown in Figs. 6, 10, and 13 have almost the same mean and standard deviation values as the MC results, but they display slight variations in the histogram bins. These variations are related to the accuracy of the reduced model and can be decreased by increasing the number of clusters used by the reduced model, at the cost of smaller speedup values. For example, by using 30 clusters for all three applications, we can achieve a speedup very close to 100 with the same behavior as the MC method.

These clusters are expanded using the small perturbation method in order to cover the behavior of the multiple circuit instances. We have therefore assumed that the linearization points required for the accurate simulation of the perturbed parameters are either in the same cluster or can be obtained from the remaining clusters. However, this assumption might be unsafe for strongly nonlinear circuits whose DC behavior can change completely under PVT variation. In this case, it might be worthwhile to use local polynomial models instead of linearized models for better accuracy [18]. The use of numerical continuation methods for dynamical systems [46] instead of the cluster perturbation model might also improve accuracy.

6 Conclusion

In this paper, we described a new method for increasing the computational efficiency of sampling based statistical methods, such as the MC method. The proposed method is based on nonlinear model order reduction techniques. We first transformed a dynamical system model of n state variables that needs to be solved N times into a larger model of \(n \times N\) state variables. We then used state space clustering, linearization, and proper orthogonal decomposition techniques to reduce the order of the model. The reduced model size q tends to be much smaller than \(n \times N\), so the reduced model can be solved in a more reasonable simulation time. The behavior of the N original models can be retrieved via a simple state space projection. Finally, we examined the method in detail and demonstrated the validity of our approach using three case studies: a current mirror (\(n \times N = 4000\)), an operational transconductance amplifier (\(n \times N = 5000\)), and a three-stage inverter chain (\(n \times N = 3000\)) subject to threshold voltage and current factor process variations. The experimental results showed that the reduced models are accurate and provide the same statistics as the MC simulation of the original models while leading to high simulation speedup values.
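The problem sizes quoted for the three case studies follow from simple bookkeeping over the state groupings reported in Sect. 5, which the sketch below reproduces. The `SubSystem` class and the `problem_sizes` function are illustrative names introduced here; the groupings and basis sizes are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubSystem:
    n_states: int     # original state variables grouped in this system
    basis_rows: int   # reduced states kept for this system


def problem_sizes(systems, n_samples):
    """Return (n * N, q): the stacked original size and the reduced size."""
    full_size = sum(s.n_states for s in systems) * n_samples
    reduced_size = sum(s.basis_rows for s in systems)
    return full_size, reduced_size


N_SAMPLES = 1000
CASE_STUDIES = {
    # four systems of one state each, each with a 5 x 1000 basis
    "current mirror": [SubSystem(1, 5)] * 4,
    # x_1 alone (5 x 1000 basis) and x_2..x_5 together (20 x 4000 basis)
    "OTA": [SubSystem(1, 5), SubSystem(4, 20)],
    # x_1 and x_3 together (10 x 2000 basis) and x_2 alone (5 x 1000 basis)
    "three inverter chain": [SubSystem(2, 10), SubSystem(1, 5)],
}

for name, systems in CASE_STUDIES.items():
    full_size, q = problem_sizes(systems, N_SAMPLES)
    print(f"{name}: n x N = {full_size}, q = {q}")
```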

Although one of the main advantages of the presented method is the quick estimation of the statistical variations of dynamic system performances, it has some limitations inherited from the underlying MOR method. The first limitation is that the obtained statistical simulation results cannot be used directly for other MC parameter samples: the process of generating and simulating reduced models has to be repeated for new parameter samples. The second limitation is related to the large dimensions of the matrices required while generating reduced models. An algorithm that transforms them into sparse matrices and deletes those no longer needed on the fly could be implemented to reduce the memory usage. The third limitation results from the cluster perturbation step in the case where the circuit behavior changes completely under the effect of process variation. A different method for cluster expansion, such as the numerical continuation method for dynamical systems [46], may then be employed. Finally, the presented method still has to be applied in the presence of more severe process variation to make sure that it always yields accurate reduced models.

As future work, we plan to extend our approach to include other MOR methods, which approximate the nonlinear behavior using polynomial representations or employ symbolic transformations of the nonlinear representation, and to compare them with the current MOR method based on piecewise linear representations. For example, general purpose nonlinear model order reduction using piecewise polynomial representations [19] and the projection based nonlinear model order reduction approach using quadratic-linear representations of nonlinear systems [40] deserve further investigation in this context. The presented method can also serve as a basis for the analysis of circuit behavior in the presence of noise and for the analysis of biological systems, which can be similarly formulated as nonlinear circuit analysis problems. Another potential application of MOR methods, and specifically of projection based methods, is the reduction of the infinite dimension of the circuit parameter space, which represents a bottleneck for circuit verification, optimization, sizing, and synthesis approaches.