
1 Introduction

We study the problem of estimating the yield of an Integrated Circuit (IC) under variability constraints due to the manufacturing process, through electrical-level simulations performed with the Eldo® simulator from Mentor Graphics®. To ensure a high yield of an IC, the failure probabilities of its components must be very small. For example, consider a 256 Mbit SRAM circuit, consisting of 256 million “identical” bitcells. To guarantee a yield of 99% for the SRAM, the failure probability of a single bitcell must be less than 3.9 × 10−11. For details, see [9].

We consider Monte Carlo (MC) techniques for estimating the failure probability of a circuit. For complex systems one can usually afford only a limited number (say, hundreds) of simulations; therefore the standard MC method is not directly applicable, and even a variance-reducing Importance Sampling MC (ISMC) technique remains unattractive. For this reason, an effective hybrid ISMC (HISMC) approach was introduced in [9]. The main drawback of HISMC is that the authors of [9] do not prove that the resulting probability estimator is unbiased. In this paper, we propose an Unbiased HISMC (UHISMC) approach which is identical to HISMC in the exploration phase but differs in the estimation phase, where it uses controlled stratified sampling as in [2].

The outline of the paper is as follows: in Sect. 2, we give a brief summary of ISMC. Then we introduce our UHISMC approach in Sect. 3. The numerical results are presented in Sect. 4. Finally, we end with a conclusion in Sect. 5.

2 The Importance Sampling Monte Carlo Technique

2.1 Basic Idea

ISMC is a well-known variance reduction technique for rare-event simulation. For literature on ISMC techniques we refer to [1]. ISMC steers the basic MC towards the area of the parameter space from which the rare events are generated, i.e., one samples the random variables from a different distribution, called the Importance Sampling (IS) distribution, rather than from the original distribution.

Figure 1 illustrates the ISMC technique in a two-dimensional space, i.e., x = (x1, x2) is a vector of two input variables. The samples drawn from g(x) are assumed to be distributed within the red ellipse centered at the origin. In this case no failure would occur, since the failure region lies outside the red ellipse. ISMC, on the other hand, samples from a new pdf (say ψ(x)) that generates samples (assumed to fall within the green ellipse) around the limit state surface \(\mathscr {L}_{\gamma }\).

Fig. 1 Illustration of the ISMC technique

2.2 Mathematical Formulation

Let \(\mathbf {X}\in \mathbb {R}^d\) be the input vector of the circuit under study and x a realization of X, with probability density function (pdf) g(x), and let H(X) be the corresponding response of the circuit. Then the failure probability \(p_{\mathrm {fail}}=\mathbb {P}(H(\mathbf {X})\geq \gamma )\) of the circuit is defined as

$$\displaystyle \begin{aligned} p_{\mathrm{fail}} = \mathbb{E}_{g}\left[\mathbb{1}_{\{H(\mathbf{X})\geq \gamma\}}\right] \end{aligned} $$
(1)

where the subscript g means that the expectation is taken with respect to the pdf g(x), γ is a given failure threshold and \(\mathbb{1}_{\{H(\mathbf{x})\geq \gamma\}}\) is the indicator function that takes the value 1 if H(x) ≥ γ, and 0 otherwise.
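
To see why Eq. (1) is impractical for rare events, consider a minimal crude-MC sketch in Python. The toy response H below is a hypothetical stand-in for the circuit simulator, chosen so that \(p_{\mathrm{fail}}\approx 10^{-10}\); with 10^6 samples the failure indicator is then almost surely never 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def H(x):
    # Hypothetical toy response (sum of the inputs); in the paper, H(x)
    # is the result of an Eldo circuit simulation.
    return x.sum(axis=-1)

d, gamma, N = 2, 9.0, 10**6
X = rng.standard_normal((N, d))     # X_j ~ g = N(0, I)
p_hat = np.mean(H(X) >= gamma)      # crude-MC estimate of Eq. (1)
print(p_hat)                        # 0.0: here p_fail ~ 1e-10 << 1/N
```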

In ISMC, we sample from another distribution (from which the rare-events are generated) rather than the original. Let ψ(x) be the new density function. Then probability equation (1) can be written as

$$\displaystyle \begin{aligned} p_{\mathrm{fail}} = \mathbb{E}_{\psi}\left[\mathbb{1}_{\{H(\mathbf{Y})\geq \gamma\}}\,\dfrac{g(\mathbf{Y})}{\psi(\mathbf{Y})}\right] \end{aligned} $$
(2)

where Y is a random vector with pdf ψ(x).

Following [7, 9], we consider the particular case where g(x) is the standard Gaussian density, i.e., \(g(\mathbf {x})\sim \mathcal {N}(\mathbf {0},\mathbf {I})\). We define \(\psi(\mathbf{x}) = g^{\boldsymbol{\theta}}(\mathbf{x})\) with \(g^{\boldsymbol {\theta }}(\mathbf {x})\sim \mathcal {N}(\boldsymbol {\theta },\mathbf {I})\), parameterized by its mean \(\boldsymbol {\theta }\in \mathbb {R}^d\). Changing the density from g(x) to \(g^{\boldsymbol{\theta}}(\mathbf{x})\) amounts to a translation by the mean-shift θ, so that \(g^{\boldsymbol{\theta}}(\mathbf{x}) = g(\mathbf{x}-\boldsymbol{\theta})\). Therefore, substituting Y = X + θ in (2), the ISMC estimator is given by

$$\displaystyle \begin{aligned} \hat{p}_{\mathrm{fail}}^{\mathrm{IS}} = \dfrac{1}{N}\sum_{j=1}^N J({\mathbf{X}}_j) \end{aligned} $$
(3)

where \(J(\mathbf{x}) = \mathbb{1}_{\{H(\mathbf{x}+\boldsymbol{\theta})\geq \gamma\}}\,g(\mathbf{x}+\boldsymbol{\theta})/g(\mathbf{x})\), which for the standard Gaussian g reduces to \(\mathbb{1}_{\{H(\mathbf{x}+\boldsymbol{\theta})\geq \gamma\}}\, e^{-\boldsymbol{\theta}^{\mathsf{T}}\mathbf{x} - \|\boldsymbol{\theta}\|^2/2}\), and the \({\mathbf{X}}_j\)'s are N independent and identically distributed random vectors with density g(x); thus the \(J({\mathbf{X}}_j)\)'s are N independent random variables. For details, we refer to [7, Section 3.5.2].
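
As an illustration, the following sketch evaluates the estimator (3) on the same hypothetical toy response used above, with the mean-shift θ hand-picked at the failure point closest to the origin (its systematic estimation is the subject of Sect. 3.1):

```python
import numpy as np

rng = np.random.default_rng(1)

def H(x):
    return x.sum(axis=-1)           # same hypothetical toy response

d, gamma, N = 2, 9.0, 10**5
theta = np.array([4.5, 4.5])        # hand-picked mean-shift (illustration)

X = rng.standard_normal((N, d))     # X_j ~ g = N(0, I)
# J(x) = 1_{H(x+theta) >= gamma} * exp(-theta'x - ||theta||^2 / 2)
lr = np.exp(-X @ theta - theta @ theta / 2.0)
J = (H(X + theta) >= gamma) * lr
p_is = J.mean()                              # Eq. (3)
se = J.std(ddof=1) / np.sqrt(N)              # square root of Eq. (4)
print(f"{p_is:.2e} +/- {1.96 * se:.1e}")     # true value ~ 1.0e-10
```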

It is easy to prove that \(\hat {p}_{\mathrm {fail}}^{\mathrm {IS}}\) is an unbiased estimator of \(p_{\mathrm{fail}}\) and its variance is given by

$$\displaystyle \begin{aligned} \mathrm{Var}_{\psi}(\hat{p}_{\mathrm{fail}}^{\mathrm{IS}}) = \dfrac{\sigma_{IS}^2}{N} \end{aligned} $$
(4)

where

$$\displaystyle \begin{aligned} \sigma_{IS}^2 = \mathrm{Var}_{\psi}\left[J(\mathbf{X})\right] = \mathbb{E}_{\psi}\left[\left(J(\mathbf{X})\right)^2\right] - p_{\mathrm{fail}}^2 \end{aligned} $$
(5)

Notice that the accuracy of the estimator \(\hat {p}_{\mathrm {fail}}^{\mathrm {IS}}\) depends on the unknown mean-shift θ, which is estimated by minimizing the variance \(\sigma _{IS}^2\). Only the second moment \(\mathbb {E}_{\psi }\left [\left (J(\mathbf {X})\right )^2\right ]\) in Eq. (5) depends on θ. Thus, we find the mean-shift θ by minimizing

$$\displaystyle \begin{aligned} v(\boldsymbol{\theta}) = \mathbb{E}_{\psi}\left[\left(J(\mathbf{X})\right)^2\right] = \mathbb{E}_{g}\left[\mathbb{1}_{\{H(\mathbf{X})\geq \gamma\}}\, e^{-\boldsymbol{\theta}^{\mathsf{T}}\mathbf{X} + \|\boldsymbol{\theta}\|^2/2}\right] \end{aligned} $$
(6)

For details we refer to [4, 6,7,8,9].

3 An Unbiased Hybrid Importance Sampling Monte Carlo Approach

In this section, we propose the UHISMC approach, a hybrid of ISMC, surrogate models and a stratified sampling technique [4]. Moreover, it provides an unbiased probability estimator.

In UHISMC, we split the ISMC technique into an exploration and an estimation phase. In the exploration phase, we directly substitute the circuit response by its surrogate model. In the estimation phase, however, the surrogate model is only used to stratify the sampling region and to assign samples to each stratum by an accept/reject criterion. In the end, the full circuit response is evaluated only for the accepted samples. For details, see Sect. 3.2.

3.1 The Exploration Phase

Let \(\widehat {H}(\mathbf {x})\) be a surrogate prediction for the circuit response H(x) at some input x. Then, we can approximate v(θ) in Eq. (6) by

$$\displaystyle \begin{aligned} \widehat{v}(\boldsymbol{\theta}) = \mathbb{E}_{g}\left[\mathbb{1}_{\{\widehat{H}(\mathbf{X})\geq \gamma\}}\, e^{-\boldsymbol{\theta}^{\mathsf{T}}\mathbf{X} + \|\boldsymbol{\theta}\|^2/2}\right] \end{aligned} $$
(7)

The estimate \(\widehat {\boldsymbol {\theta }}\) of the mean shift θ can be computed with the Newton algorithm by solving the following optimization problem

$$\displaystyle \begin{aligned} \widehat{\boldsymbol{\theta}} = \operatorname*{arg\,min}_{\boldsymbol{\theta}}\,\hat{v}_m(\boldsymbol{\theta}) \end{aligned} $$
(8)

with

$$\displaystyle \begin{aligned} \hat{v}_m(\boldsymbol{\theta}) = \dfrac{1}{m}\sum_{j=1}^{m} \mathbb{1}_{\{\widehat{H}({\mathbf{X}}_j)\geq \gamma\}}\, e^{-\boldsymbol{\theta}^{\mathsf{T}}{\mathbf{X}}_j + \|\boldsymbol{\theta}\|^2/2} \end{aligned} $$
(9)

the MC approximation of the second moment \(\widehat {v}(\boldsymbol {\theta })\), where the \({\mathbf{X}}_j\)'s are m independent samples drawn from g(x).
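
For completeness (this standard calculation is not spelled out in the cited works), writing \(e_j(\boldsymbol{\theta}) := \mathbb{1}_{\{\widehat{H}({\mathbf{X}}_j)\geq \gamma\}}\, e^{-\boldsymbol{\theta}^{\mathsf{T}}{\mathbf{X}}_j + \|\boldsymbol{\theta}\|^2/2}\) for the j-th term of (9), the gradient and Hessian needed by Newton's method are

$$\displaystyle \begin{aligned} \nabla \hat{v}_m(\boldsymbol{\theta}) = \dfrac{1}{m}\sum_{j=1}^{m} (\boldsymbol{\theta}-{\mathbf{X}}_j)\, e_j(\boldsymbol{\theta}), \qquad \nabla^2 \hat{v}_m(\boldsymbol{\theta}) = \dfrac{1}{m}\sum_{j=1}^{m} \left[\mathbf{I} + (\boldsymbol{\theta}-{\mathbf{X}}_j)(\boldsymbol{\theta}-{\mathbf{X}}_j)^{\mathsf{T}}\right] e_j(\boldsymbol{\theta}) \end{aligned} $$

In particular, the Hessian is positive semi-definite, so (8) is a convex problem.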

Notice that there must be at least one \({\mathbf{X}}_j\) with \(\widehat{H}({\mathbf{X}}_j)\geq \gamma\) to have \(\hat {v}_m\neq 0\). However, this condition may fail in a rare-event context. To overcome this problem, a multilevel approach is used. We refer to [4, 5, 9] for more details.
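
The sketch below runs this phase end-to-end on the hypothetical toy model from Sect. 2. It uses scipy's BFGS in place of a hand-rolled Newton iteration, and a simple re-centre-and-raise-the-threshold loop that is only a stand-in in the spirit of the multilevel approach of [4, 5, 9], not the exact scheme; the re-weighting factor keeps the expectation in (9) with respect to g.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

rng = np.random.default_rng(2)

def H_hat(x):
    # Hypothetical cheap surrogate of the circuit response (toy model).
    return x.sum(axis=-1)

d, gamma, m = 2, 9.0, 10**5
theta_hat = np.zeros(d)
for level in (3.0, 6.0, 9.0):       # raise the failure threshold gradually
    X = theta_hat + rng.standard_normal((m, d))   # X_j ~ N(theta_hat, I)
    # indicator times the weight g(x)/g^theta_hat(x) =
    # exp(-theta_hat'x + ||theta_hat||^2/2), so the sum below still
    # approximates Eq. (9), i.e. an expectation with respect to g
    b = (H_hat(X) >= level) * np.exp(-X @ theta_hat
                                     + theta_hat @ theta_hat / 2.0)

    def log_v_hat(theta):
        # log of Eq. (9); same minimizer, numerically better behaved
        return theta @ theta / 2.0 + logsumexp(-X @ theta, b=b) - np.log(m)

    theta_hat = minimize(log_v_hat, theta_hat, method="BFGS").x  # Eq. (8)

print(theta_hat)    # ~ (4.5, 4.5): the dominant failure point of the toy H
```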

3.2 The Estimation Phase

In the estimation phase we estimate the failure probability \(p_{\mathrm{fail}}\), with the mean-shift θ known from the exploration phase. Moreover, we use the so-called controlled stratified sampling (CSS) technique proposed in [2].

We partition the input space \(\mathbb {R}^d\) into I mutually exclusive and exhaustive regions \(D_1,\dots,D_I\) called strata. To use stratified sampling we must know how to sample in each \(D_i\). To this end we use a surrogate model to define the strata \(D_i = \{\mathbf {X}+\boldsymbol {\theta }:\hat {h}_{\rho _{i-1}} < \widehat {H}(\mathbf {X}+\boldsymbol {\theta })\leq \hat {h}_{\rho _{i}}\}\), where \(\widehat {H}(\mathbf {X}+\boldsymbol {\theta })\) is a surrogate prediction and \(\hat {h}_{\rho _{i-1}}\) and \(\hat {h}_{\rho _{i}}\) are quantiles of \(\widehat {H}(\mathbf {X}+\boldsymbol {\theta })\) computed for 0 ≤ ρ i−1 < ρ i ≤ 1. Now let N be the total number of runs of the circuit simulator. We fix an allocation \(N_1,\dots,N_I\) of positive integers with \(\sum _{i=1}^{I}N_i=N\). For each stratum i, we generate realizations of the r.v. X ∼ g(x) and accept \(N_i\) realizations \(\left ({\mathbf {X}}_j^{(i)}\right )_{j=1,\dots ,N_i}\) among them such that the surrogate predictions \(\widehat{H}\left ({\mathbf {X}}_j^{(i)}+\boldsymbol {\theta }\right )\) lie in the interval \((\hat{h}_{\rho _{i-1}}, \hat{h}_{\rho _{i}}]\). The true response \({H}\left ({\mathbf {X}}_j^{(i)}+\boldsymbol {\theta }\right )\) is computed only for the accepted realizations. The conditional expectation \(P_i=\mathbb {E}_g\left [J({\mathbf {X}}^{(i)})\mid {\mathbf {X}}^{(i)}+\boldsymbol {\theta }\in D_i\right ]\) is estimated for each i:

$$\displaystyle \begin{aligned} \hat{P}_i = \dfrac{1}{N_i}\sum_{j=1}^{N_i}J({\mathbf{X}}_j^{(i)}) \end{aligned} $$
(10a)

with its conditional variance:

$$\displaystyle \begin{aligned} \mathrm{Var}_{\psi}(\hat{P}_i) = \dfrac{\mathrm{Var}\left(J({\mathbf{X}}_j^{(i)})\right)}{N_i}=:\dfrac{\sigma_i^2}{N_i} \end{aligned} $$
(10b)

Finally, the Importance Sampling Controlled Stratified (ISCS) probability estimator of \(p_{\mathrm{fail}}\) is given by:

$$\displaystyle \begin{aligned} \hat{p}_{\mathrm{fail}}^{\mathrm{ISCS}} = \sum_{i=1}^{I} w_i\hat{P}_i \end{aligned} $$
(11)

with variance

$$\displaystyle \begin{aligned} \mathrm{Var}_{\psi}(\hat{p}_{\mathrm{fail}}^{\mathrm{ISCS}}) = \sum_{i=1}^{I} w_i^2\dfrac{\sigma_i^2}{N_i} \end{aligned} $$
(12)

where \(w_i = \mathbb {P}(\mathbf {X}+\boldsymbol {\theta }\in D_i)\). For an optimal mean-shift θ, the estimator \(\hat {p}_{\mathrm {fail}}^{\mathrm {ISCS}}\) is unbiased and satisfies a central limit theorem [2].
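
The following sketch assembles the whole estimation phase for the hypothetical toy model (for brevity the surrogate is taken to be exact and the allocation proportional, N_i = N/I; the optimal allocation is discussed below). Note that only N = 1000 evaluations of the true response H are used, matching the limited simulation budget of Sect. 1.

```python
import numpy as np

rng = np.random.default_rng(3)

def H(x):                 # true response: toy stand-in for the Eldo runs
    return x.sum(axis=-1)

H_hat = H                 # hypothetical surrogate, taken exact for brevity

d, gamma, I, N = 2, 9.0, 10, 1000
theta = np.array([4.5, 4.5])        # mean-shift from the exploration phase

# Equally probable strata (w_i = 1/I) from quantiles of H_hat(X + theta)
Z = rng.standard_normal((10**5, d))
h = np.quantile(H_hat(Z + theta), np.linspace(0.0, 1.0, I + 1))
h[0], h[-1] = -np.inf, np.inf

Ni = N // I                         # proportional allocation, for brevity
p_hat = 0.0
for i in range(I):
    acc = []                        # accept/reject sampling in stratum i
    while len(acc) < Ni:
        x = rng.standard_normal(d)
        if h[i] < H_hat(x + theta) <= h[i + 1]:
            acc.append(x)
    Xi = np.array(acc)
    # true response only for the accepted realizations
    lr = np.exp(-Xi @ theta - theta @ theta / 2.0)
    P_i = np.mean((H(Xi + theta) >= gamma) * lr)      # Eq. (10a)
    p_hat += P_i / I                                  # Eq. (11), w_i = 1/I

print(f"p_ISCS ~ {p_hat:.2e}")      # true value ~ 1.0e-10
```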

From the above it should be clear that, when using the controlled stratification approach, the following issues need careful consideration:

  • The choice of the number I of strata

    To the authors' knowledge, there is no exact rule for choosing the number I of strata. As a rule of thumb we use I = 10 “equally probable” strata, meaning that all \(w_i\) are equal for i = 1, …, I.

  • The estimation of the allocation \(N_1,\dots,N_I\)

    For choosing the allocation we first need to find the allocation fractions \(q_i\). The allocation is then given by \(N_i = \lceil q_iN\rceil\), where ⌈x⌉ denotes the smallest integer greater than or equal to x. Substituting this in Eq. (12), we get

    $$\displaystyle \begin{aligned} \mathrm{Var}_{\psi}(\hat{p}_{\mathrm{fail}}^{\mathrm{ISCS}}) \approx \dfrac{1}{N}\sum_{i=1}^{I} w_i^2\dfrac{\sigma_i^2}{q_i} \end{aligned} $$
    (13)

    The fractions \(q_i\) can be found by solving the following minimization problem

    $$\displaystyle \begin{aligned} &\min_{q_i}\left\{\sum_{i=1}^{I}\dfrac{w_i^2}{q_i}\sigma_i^2\right\},\, \mbox{with the constraints}\, 0<q_i<1,\, \sum_{i=1}^{I}q_i = 1 \end{aligned} $$
    (14)

    which gives the unique solution

    $$\displaystyle \begin{aligned} q_i = \dfrac{w_i\sigma_i}{\sum_{k=1}^{I}w_k\sigma_k},\quad i=1,\dots, I \end{aligned} $$
    (15)

    In practice, the \(\sigma_i\) are unknown, so the optimal fractions \(q_i\) are not directly applicable. Nevertheless, it is often beneficial to use pilot runs to obtain estimates \(\hat {\sigma }_i\) of \(\sigma_i\). The estimated allocation fractions \(\hat {q}_i\) can then be used to allocate samples to the strata in a second (typically larger) set of runs, as sketched below. For details, we refer to [2].
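
A small sketch of this two-stage procedure, with hypothetical pilot estimates \(\hat{\sigma}_i\) for I = 10 equally probable strata (in practice the conditional variance concentrates in the strata adjacent to the failure region):

```python
import numpy as np

def allocate(w, sigma_hat, N):
    # q_i from Eq. (15), then N_i = ceil(q_i * N)
    q = w * sigma_hat / np.sum(w * sigma_hat)
    return np.ceil(q * N).astype(int)

w = np.full(10, 0.1)                       # equally probable strata
sigma_hat = np.array([0, 0, 0, 0, 0, 0,    # hypothetical pilot estimates
                      1e-12, 5e-11, 2e-10, 4e-10])
print(allocate(w, sigma_hat, N=1000))
# -> [0 0 0 0 0 0 2 77 308 615]: strata that never failed in the pilot
#    receive no further simulator runs
```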

4 Numerical Results

Here we present results for two realistic circuits. The first is a Voltage Controlled Oscillator (VCO) with 1500 stochastic input parameters and the scalar response ‘oscillation frequency’; the second is a memory cell with 2096 stochastic input parameters and the scalar response ‘read delay’.

Note that to compare the empirical results of the ISMC and UHISMC methods, we need a reference probability \({p}_{\mathrm {fail}}^{\mathrm {ref}}\). A simple estimate \(\hat {p}_{\mathrm {fail}}^{\mathrm {ref}}\) of \({p}_{\mathrm {fail}}^{\mathrm {ref}}\) can be obtained by using the ISMC estimator of Eq. (3) with a small (say, less than 1%) Coefficient of Variation (CV)

$$\displaystyle \begin{aligned} \mathrm{CV}=z_{\alpha/2}\dfrac{\sqrt{\mathrm{Var}(\hat{p}_{\mathrm{fail}}^{\mathrm{M}})}}{\hat{p}_{\mathrm{fail}}^{\mathrm{M}}} \end{aligned}$$

where \(\hat {p}_{\mathrm {fail}}^{\mathrm {M}}\) denotes the estimate of \(p_{\mathrm{fail}}\) computed with method M, and \(z_{\alpha/2} = 1.96\) for the 95% confidence interval of the estimated probability [9].
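
As a sketch, this stopping rule can be evaluated directly from the sampled values J(X_j) of the estimator (3) (hypothetical helper function):

```python
import numpy as np

def coefficient_of_variation(J, z=1.96):
    # CV of the ISMC estimate from the sampled J(X_j); Var(p_hat) =
    # Var(J)/N by Eq. (4), and z = 1.96 gives the 95% confidence level.
    return z * J.std(ddof=1) / np.sqrt(len(J)) / J.mean()

# e.g. enlarge the sample until coefficient_of_variation(J) < 0.01
```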

4.1 The VCO

For the VCO, our goal is to estimate the probability that the oscillation frequency is larger than the given threshold γ = 1900. The reference probability is \(\hat {p}_{\mathrm {fail}}^{\mathrm {ref}}=1.10\times 10^{-10}\), estimated using the ISMC technique with CV = 0.99%.

The left and right plots in Fig. 2 show the probability distributions from \(N_{\mathrm{rep}} = 100\) experiments (repetitions) of the ISMC and UHISMC methods, respectively. The vertical solid ‘black’ line in the center represents the mean of the distribution. The dotted line in the center is the reference probability \(\hat {p}_{\mathrm {fail}}^{\mathrm {ref}}\). The two dashed lines around the center represent the 95% confidence interval. Notice that the distribution mean is equal (for ISMC) or close (for UHISMC) to the reference probability.

Fig. 2 VCO: Empirical probability distributions of ISMC and UHISMC from \(N_{\mathrm{rep}} = 100\) experiments with means 1.10 × 10−10 and 1.09 × 10−10, respectively

The mean results of the methods are given in Table 1. ‘\(\hat {p}_{\mathrm {fail}}\)’, ‘CV’ and ‘#Runs’ denote the estimated probability, the coefficient of variation and the number of simulator runs, respectively. ‘MSE’ is the mean squared error \(\frac{1}{N_{\mathrm{rep}}}\sum _{i=1}^{N_{\mathrm{rep}}}(\hat {p}_{\mathrm {fail},i}-\hat {p}_{\mathrm {fail}}^{\mathrm {ref}})^2\). The efficiency of the UHISMC method with respect to ISMC is computed as

$$\displaystyle \begin{aligned} \mathrm{Eff}(\mathrm{ISMC}, \mathrm{UHISMC}) = \dfrac{\mathrm{MSE}_{\mathrm{ISMC}}\times\#\mathrm{Runs}_{\mathrm{ISMC}}}{\mathrm{MSE}_{\mathrm{UHISMC}}\times\#\mathrm{Runs}_{\mathrm{UHISMC}}} \approx 6 \end{aligned} $$

So, ISMC requires about 6 times more simulations than UHISMC to achieve the same accuracy. Thus, UHISMC is preferred.

Table 1 VCO: ISMC versus UHISMC probability estimation

4.2 The Memory Cell

Our goal for the memory cell is to estimate the probability that the read delay exceeds the given threshold γ = 902. The reference probability is \(\hat {p}_{\mathrm {fail}}^{\mathrm {ref}} = 6.01 \times 10^{-8}\), estimated using the ISMC technique with CV = 0.99%.

Similar to the VCO case, Fig. 3 shows the probability distributions and Table 2 gives the mean results over \(N_{\mathrm{rep}} = 100\) experiments of ISMC and UHISMC for the memory cell. It can be seen that the probabilities from both methods are close to the reference probability. The efficiency of the UHISMC method with respect to ISMC is computed as

$$\displaystyle \begin{aligned} \mathrm{Eff}(\mathrm{ISMC}, \mathrm{UHISMC}) = \dfrac{3.65\times10^{-18} \times 11000}{6.5\times10^{-18}\times 3347} \approx 2 \end{aligned} $$

So, ISMC requires about twice as many simulations as UHISMC to achieve the same accuracy.

Fig. 3 Memory Cell: Empirical probability distributions of ISMC and UHISMC from \(N_{\mathrm{rep}} = 100\) experiments with means 6.02 × 10−8 and 6.03 × 10−8, respectively

Table 2 Memory Cell: ISMC versus UHISMC probability estimation

5 Conclusion

It is clear from both examples above that UHISMC is more efficient than ISMC. Moreover, UHISMC provides an unbiased estimate of the failure probability. Thus, we prefer UHISMC over ISMC. As future work, we will compare our UHISMC with the HISMC of [9] in terms of efficiency and robustness.