1 Introduction

In modern industrial practice, considerable effort is directed toward incorporating uncertainties into the engineering design process in a rigorous and cost-effective way. As part of this effort, reliability analysis, which evaluates the failure probability or safety level of a mechanical system under input variable uncertainty at the design stage, has been actively studied. A great number of studies have been conducted to handle these problems efficiently; they can be classified into sampling-based methods such as Monte Carlo Simulation (MCS), Most Probable Point (MPP) based methods including the First/Second Order Reliability Method (FORM/SORM) (Haldar and Mahadevan 2000), and moment-based integration methods, with the Dimension Reduction Method (DRM) being the most noteworthy (Rahman and Xu 2004; Xu and Rahman 2004, 2005). Within the DRM, more recent works include the polynomial dimensional decomposition method (Rahman 2008, 2009), which exploits the smoothness of a stochastic response by orthogonal polynomials consistent with arbitrary probability distributions of the random input, and the MPP-based DRM for reliability-based design optimization (Lee et al. 2008, 2010), developed to overcome the inaccuracy of the MPP-based FORM.

In these previous developments, the uncertainty has mostly been treated as aleatory uncertainty, which is irreducible, arises from inherent physical randomness, and is completely described by a suitable probability model. Epistemic uncertainty, however, is prevalent in the real industrial world; because it results from a lack of data or from subjective knowledge, it renders the existing methods less useful (Der Kiureghian and Ditlevsen 2009). Recent studies have handled this uncertainty with non-probabilistic methods, which include interval analysis (Qiu et al. 2004), possibility theory (Du et al. 2006; Zhou and Mourelatos 2008) and evidence theory (Bae et al. 2006; Mourelatos and Zhou 2006). The weakness of these methods, however, is that the uncertainty is modeled more or less on the basis of subjective expert opinion. In engineering design practice, the uncertainty is often given by a small number of samples from historical data or actual experiments, too few to infer the probability distributions; this is called statistical uncertainty. The Bayesian approach is useful in this case for several reasons: it naturally represents the insufficiency of the data in terms of probability; it treats aleatory and epistemic uncertainty in a single unified framework; and it conveniently updates the degree of uncertainty as more data are added to the prior information. Very recently, Noh et al. (2011) addressed the same issue in the course of reliability-based design optimization.

In the Bayesian approach, the probability itself is treated as a random variable that quantifies our degree of belief in light of the observed data. In the recent literature, a two-stage nested (double-loop) reliability analysis based on a full Bayesian procedure was proposed to implement this approach, in which the distribution parameters are treated as unknown random variables. In this procedure, the outer loop determines the CDF of the probability, whereas the inner loop solves the conventional reliability analysis problem given the distribution parameters. Several methods, e.g., FORM (Gunawan and Papalambros 2006), Markov Chain Monte Carlo (MCMC) (Cruse and Brown 2007) and the DRM (Youn and Wang 2008; Choi et al. 2010; Rahman and Wei 2008), have been used for this purpose. The full Bayesian procedure, however, calls for a substantial increase in computation and is not tractable for practical design purposes. In this study, a posterior prediction method is employed to resolve this problem, requiring only a single reliability analysis step.

In evaluating the structural response function, common practice is to employ a metamodel that approximates the original response in order to save computational cost. This introduces another uncertainty, which we call metamodel uncertainty, since the model is constructed from only a finite number of responses and hence the true response is unknown at untried points (O’Hagan 2006). In this paper, the Gaussian process model, also known as Kriging, is employed for the response approximation. Within this model, the correlation parameter plays an important role, determining the smoothness of the metamodel. In previous approaches, this parameter was either assigned arbitrarily, which is ad hoc, or determined by a costly sub-optimization known as Maximum Likelihood Estimation (MLE) (Sacks et al. 1989). Furthermore, even when given by the MLE, the parameter often fails to provide a smooth metamodel in practice, as will be shown in Section 5. In this paper, this parameter is treated as uncertain along with the other parameters, and their posterior distributions are determined conditional on the finite number of responses at the computer experiment points. A similar treatment has been addressed in a number of publications (e.g., see Kennedy and O’Hagan 2001; Rasmussen and Williams 2006).

The final goal of this study is the integration of all these uncertainties in a single Bayesian framework. The posterior distributions, whether of the model parameters of the input variables or of the parameters of the metamodel, are obtained by MCMC simulation, a modern computational method that draws a random sequence of parameter values following a given distribution (Andrieu et al. 2003). Once the posterior samples of the parameters are available, predictive samples are drawn from the associated distribution given each value of the parameters. The uncertainty of the response is then estimated from the drawn data. Mathematical examples and engineering problems are given to demonstrate that the proposed method is feasible and practical.

2 Reliability analysis under input uncertainty

Reliability analysis is a necessary step in the design under uncertainty, which is to evaluate failure probability or safety levels of mechanical systems.

The problem is typically given in the form

$$ \begin{array}{rll} p_g &=& P\left[ {G<0} \right]\\ &=& \int_{g\left( {\rm {\bf x}} \right)<0} {f_{\rm {\bf X}} \left( {\rm {\bf x}} \right)d{\rm {\bf x}}} \quad \mbox{or}\quad \int_{g<0} {f_G \left( g \right)dg} \end{array} $$
(1)

where X is the vector of random input variables, \(f_{\rm {\bf X}}({\rm {\bf x}})\) is the joint PDF of X, g is the response function, g = 0 is the limit state function, \(f_G(g)\) is the PDF of g, and G is the probabilistic representation of g. The event g(X) < 0 can denote either failure or safety depending on the problem. Each random input variable \(X_i\) has its own statistical distribution described by a set of model parameters \(\boldsymbol\theta \). If all the members of X are aleatory, i.e., if the values of the model parameters \(\boldsymbol\theta \) are completely known either through an infinite amount of data or through well-established knowledge, then the problem becomes an ordinary reliability prediction, from which a fixed value of the probability \(p_g\) is calculated using existing methods such as the MCS, FORM or DRM. Figure 1 gives a schematic picture of the analysis procedure.

Fig. 1 Conventional reliability analysis procedure
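For illustration, the following Python sketch carries out this conventional MCS estimate of the probability, borrowing the two-variable response (19) of Section 5.2 for concreteness; the sample size and seed are arbitrary choices.

```python
# A minimal sketch of conventional (aleatory) reliability analysis by MCS,
# estimating p_g = P[g(X) < 0] in (1); g is the limit state (19) of
# Section 5.2, for which the paper reports P[G < 0] = 0.8412.
import numpy as np

rng = np.random.default_rng(0)

def g(x1, x2):
    # two-variable limit state of example (19), borrowed for concreteness
    return 1.0 - x1**2 * x2 / 20.0

N = 10**6
x1 = rng.normal(2.9, 0.2, N)    # X1 ~ N(2.9, 0.2), parameters fully known
x2 = rng.normal(2.8, 0.2, N)    # X2 ~ N(2.8, 0.2)
p_g = np.mean(g(x1, x2) < 0.0)  # sample fraction estimates P[G < 0]
print(p_g)                      # approx. 0.84
```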

Suppose that the input variable X shows statistical uncertainty, i.e., only a small number of data \({\bf x}^e=\big\{ {\bf x}_1^e ,{\bf x}_2^e \), \(..., {\bf x}_{ne}^e \big\}\) from experiments or past experience are available for part or all of the input variables. The corresponding model parameters \(\boldsymbol\theta \) then become uncertain, which leads to uncertainty in the reliability prediction. In this case, the model parameters are assumed to be random and are denoted by the capital letter \(\boldsymbol\Theta \), and Bayes’ theorem is used to estimate their probabilistic behavior (Gelman et al. 2004):

$$ f\left( {\boldsymbol\theta \vert {\rm {\bf x}}^e} \right)\propto f\left( {{\rm {\bf x}}^e\vert \boldsymbol\theta } \right)f\left( \boldsymbol\theta \right) $$
(2)

where \(f\left( \boldsymbol\theta | {\bf x}^e \right)\) is the posterior PDF of \(\boldsymbol\Theta \) conditional on the observed data \({\bf x}^e\), \(f\left( {\bf x}^e |\boldsymbol\theta \right)\) is the likelihood of the observed data \({\bf x}^e\) given the parameters \(\boldsymbol\theta \), and f(\(\boldsymbol\theta \)) is the prior PDF of \(\boldsymbol\Theta \).

Since the model parameters are not constant but follow a distribution, the probability \(p_g\) as defined in (1) is no longer deterministic but behaves as a random variable, denoted by \(P_g\). This leads to the outer-loop reliability analysis problem, in which the random input variables are \(\boldsymbol\Theta \) and the random output is the probability \(P_g\). Within this step, the inner-loop reliability analysis is conducted to obtain each individual realization of \(P_g\). This nested, double-loop analysis is called the full Bayesian approach. The probability distribution of \(P_g\) can then be expressed in the form

$$ F_{P_g } \left( p \right)=P\left[ {P_g \le p} \right]=\int_{p_g \le p} {f_{P_g } \left( {p_g } \right)dp_g } $$
(3)

where \(F_{P_g } \left( p \right)\) and \(f_{P_g } \left( {p_g } \right)\) are the CDF and PDF of \(P_g\), respectively. The procedure is summarized in Fig. 2. In this figure, the model parameters \(\boldsymbol\Theta \) are expressed by the posterior PDF \(f\left( \boldsymbol\theta | {\bf x}^e \right)\) conditional on the given samples \({\bf x}^e\). Once a set of \(\boldsymbol\theta \) values is drawn from this distribution, the probability distribution of X is established and the probability \(p_g\) is evaluated for each drawn value of \(\boldsymbol\Theta \). The resulting PDF of the probability represents a degree of belief about the probability conditional on the sample data. From the full PDF of \(P_g\), the mean and the lower and upper confidence bounds can be estimated easily.

Fig. 2 Bayesian reliability analysis procedure

In the Bayesian procedure, the posterior PDF given by Bayes’ theorem often takes an arbitrary functional form, which makes conventional methods such as numerical integration or inverse-CDF sampling less useful. In this study, the Markov Chain Monte Carlo (MCMC) method is employed, an efficient simulation method that draws random samples following a target distribution of any complexity, including those with no closed-form expression (Andrieu et al. 2003). In the full Bayesian approach, the MCMC must be implemented in a nested process, which can increase the computational cost greatly. In this study, this is avoided by introducing the posterior predictive approach, which requires only a single loop of reliability analysis. The posterior predictive distribution, which defines the predictive variable \({\bf X}^p\), is given in the following form (Gelman et al. 2004)

$$ f\left( {\left. {{\rm {\bf x}}^p} \right|{\rm {\bf x}}^e} \right)=\int {f\left( {\left. {{\rm {\bf x}}^p} \right|\boldsymbol\theta } \right)f\left( {\left. \boldsymbol\theta \right|{\rm {\bf x}}^e} \right)d\boldsymbol\theta } $$
(4)

where the superscript p denotes the prediction, \(f\left( \boldsymbol\theta | {\bf x}^e \right)\) is the posterior distribution obtained by (2), and \(f\left( {\bf x}^p | \boldsymbol\theta \right)\) is the probability distribution of the prediction conditional on the parameters \(\boldsymbol\theta \). The predictive distribution is obtained by integrating the product of the two terms on the right over \(\boldsymbol\theta \). In practice, however, predictive samples of \({\bf X}^p\) are drawn from the conditional probability distribution \(f\left( {\bf x}^p | \boldsymbol\theta \right)\) at the values of \(\boldsymbol\theta \) obtained via the MCMC method from the posterior distribution \(f\left( \boldsymbol\theta | {\bf x}^e \right)\). Once the samples of \({\bf X}^p\) are available, one can proceed to generate response function values by evaluating the function at each sample point. The probability of the event given by (1) is then calculated from the resulting data.
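A minimal end-to-end sketch of this single-loop predictive analysis follows, using the single-variable example (14) of Section 5.1. The synthetic data, the proposal step sizes and the seed are illustrative assumptions; only the sample statistics \(\bar{{x}}=3.5\), s = 1.2 are taken from the paper.

```python
# Single-loop posterior predictive analysis per (4) for X ~ N(mu, sigma^2)
# with unknown (mu, sigma) and non-informative prior f(mu, sigma^2) ∝ sigma^-2,
# sampled by random-walk Metropolis-Hastings.
import numpy as np

rng = np.random.default_rng(1)

# hypothetical data of size ne, rescaled to the reported sample statistics
ne = 5
x_e = rng.normal(3.5, 1.2, ne)
x_e = 3.5 + 1.2 * (x_e - x_e.mean()) / x_e.std(ddof=1)

def log_post(mu, sigma):
    # posterior in (mu, sigma): proportional to
    # sigma^-(ne+1) * exp(-sum((x - mu)^2) / (2 sigma^2))
    if sigma <= 0.0:
        return -np.inf
    return -(ne + 1) * np.log(sigma) - np.sum((x_e - mu) ** 2) / (2 * sigma**2)

# random-walk M-H over theta = (mu, sigma); for brevity, no burn-in/thinning
N = 10_000
theta = np.array([3.5, 1.2])
lp = log_post(*theta)
samples = np.empty((N, 2))
for i in range(N):
    prop = theta + rng.normal(0.0, 0.5, 2)
    lp_prop = log_post(*prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    samples[i] = theta

# one predictive draw X^p per posterior sample (mu_i, sigma_i), then the
# probability of the failure event as in (1)
x_p = rng.normal(samples[:, 0], samples[:, 1])
g = x_p + 3.0 * np.sin(x_p / 2.0)   # response function (14)
print("P[G < 3.5] ≈", np.mean(g < 3.5))
```

As ne grows, this estimate converges to the aleatory value reported in Section 5.1.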

3 Reliability analysis under metamodel uncertainty

Metamodels are commonly exploited in modern simulation-based engineering analysis during the design stage. The purpose is to reduce computational cost by approximating the original response with a surrogate function constructed from a finite set of samples. One popular choice receiving great attention is the Kriging model. Until recently, however, Kriging was studied mostly from a deterministic viewpoint, i.e., used simply for fitting or interpolation while ignoring the metamodel uncertainty that arises because the true response is unknown except at the sample points. Numerous efforts have been made in the statistical community to quantify this uncertainty. In one of the most popular approaches, the Kriging model is viewed as a realization of a Gaussian process, and Bayesian methods quantify the associated uncertainty by calculating the posterior distribution of the unknown parameters given the finite set of response values (Kennedy and O’Hagan 2001).

Let us now consider the case in which the response function is approximately interpolated from a finite number nc of computed outputs \({\bf g}^c=\left\{ {g_1^c ,...,g_{nc}^c } \right\}\) at a set of DOE points \({\bf x}^c=\left\{ {\bf x}_1^c ,\ldots , {\bf x}_{nc}^c \right\}\). For this purpose, a Gaussian random function is introduced as follows (Kennedy and O’Hagan 2001).

$$\begin{array}{rll}\hat{{G}}\left( {\rm {\bf x}} \right)&=&{\rm {\bf f}}\left( {\rm {\bf x}} \right)\boldsymbol\beta +Z\left( {\rm {\bf x}} \right),\;\;Z\sim N\left( {{\bf 0},\sigma^2{\bf R}} \right),\\{\bf R}&=&\left[ {R\left( {{\rm {\bf x}}_i ,{\rm {\bf x}}_j } \right)} \right],\;\;i,j=1,\ldots ,nc \end{array} $$
(5)

where \(\hat{~}\) denotes the surrogate representation, \({\bf f}\left( {\bf x} \right)\boldsymbol\beta \) is the normal linear model, \({\bf f}=\left[ {f_1 ,\ldots ,f_m } \right]\) and \(\boldsymbol\beta =\left[ {\beta_1 ,\ldots ,\beta_m } \right]^T\) are the m trial functions and associated parameters, respectively, Z is a Gaussian stochastic process with zero mean and variance \(\sigma^2\), and R is the correlation matrix whose entry \(R\left( {{\rm {\bf x}}_i ,{\rm {\bf x}}_j } \right)\) is the correlation function between \({\bf x}_i\) and \({\bf x}_j\), represented by

$$ R\left( {{\rm {\bf x}}_i ,{\rm {\bf x}}_j } \right)=\exp \left\{ {-\left( {\frac{\left\| {{\rm {\bf x}}_i -{\rm {\bf x}}_j } \right\|}{h}} \right)^2} \right\} $$
(6)

where h is a correlation parameter that controls the degree of smoothness of the function. As h becomes larger the model becomes smoother, but the correlation matrix becomes singular if h is too large. In most studies, h is determined by the method of maximum likelihood estimation (MLE). According to Etman (1994) and Sasena et al. (2002), however, the MLE method is not only computationally expensive, requiring an additional optimization process, but the quality of the obtained parameter is also questionable. In this study, h is instead treated as an unknown parameter to avoid these difficulties.
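A small helper implementing (6) is sketched below and reused in the later sketches; storing the DOE points row-wise in an (n, d) array is an assumption of the sketch, not a convention of the paper.

```python
# Gaussian correlation function (6): R[i, j] = exp(-(||xa_i - xb_j|| / h)^2)
import numpy as np

def corr_matrix(xa, xb, h):
    d = xa[:, None, :] - xb[None, :, :]           # pairwise differences
    return np.exp(-np.sum(d**2, axis=-1) / h**2)  # squared distances over h^2

# usage: for a 1D DOE, a larger h drives the entries toward 1 (a smoother
# model), but the matrix becomes nearly singular when h is too large
x_c = np.linspace(0.0, 15.0, 4).reshape(-1, 1)
print(np.linalg.cond(corr_matrix(x_c, x_c, h=5.0)))  # conditioning worsens with h
```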

Based on (5), the computer outputs \({\bf g}^c\) follow the multivariate normal distribution:

$$ f\left( {\left. {{\rm {\bf g}}^c} \right|\boldsymbol\beta ,\sigma ,h} \right)=N\left( {{\rm {\bf F}}\boldsymbol\beta ,\,\,\sigma^2{\rm {\bf R}}_{\left( {{\rm {\bf x}}^c} \right)} } \right) $$
(7)

where

$$ {\rm {\bf F}}=\left[ \begin{array}{c} {{\rm {\bf f}}\left( {{\rm {\bf x}}_1^c } \right)} \\ \vdots \\ {{\rm {\bf f}}\left( {{\rm {\bf x}}_{nc}^c } \right)} \end{array} \right] $$
(8)

is the matrix collecting the trial functions \({\bf f}\) at the nc DOE points \({\bf x}^c\), and the subscript \(({\bf x}^c)\) in (7) denotes the correlation matrix evaluated at \({\bf x}^c\). In this procedure, the parameters \(\boldsymbol\beta \), σ and h are the unknowns to be determined. In the Bayesian procedure, the uncertainties of these parameters are characterized by their joint posterior distribution conditional on the finite number of computed outputs. In view of Bayes’ rule (2), this distribution is given by multiplying the likelihood of the computer outputs given the parameters by the prior distribution of the parameters, which represents our prior knowledge. Assuming a non-informative prior in this study, the prior distribution is defined as

$$ f\left( {\boldsymbol\beta ,\sigma ,h} \right)\propto \sigma^{-2} $$
(9)

The likelihood to obtain the computer outputs is given by

$$ f\left( {\left. {{\rm {\bf g}}^c} \right|\boldsymbol\beta ,\sigma ,h} \right) \propto \sigma^{-nc}\left| {{\bf R}_{\left( {{\rm {\bf x}}^c} \right)} } \right|^{-1/2} \exp \left( {-\frac{1}{2\sigma^2}\left( {{\rm {\bf g}}^c-{\rm {\bf F}}\boldsymbol\beta } \right)^T{\bf R}_{\left( {{\rm {\bf x}}^c} \right)}^{-1}\left( {{\rm {\bf g}}^c-{\rm {\bf F}}\boldsymbol\beta } \right)} \right) $$
(10)

Then the posterior PDF of the parameters becomes

$$ f\left( {\left. {\boldsymbol\beta ,\sigma ,h} \right|{\rm {\bf g}}^c} \right)\propto f\left( {\left. {{\rm {\bf g}}^c} \right|\boldsymbol\beta ,\sigma ,h} \right)f\left( {\boldsymbol\beta ,\sigma ,h} \right) $$
(11)

As in the previous section, the MCMC method is employed to draw samples of the parameters that follow this distribution.
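For concreteness, a sketch of the unnormalized log posterior (11) follows; it combines (9) and (10) and reuses the corr_matrix helper sketched after (6), and any M-H type sampler, such as the loop sketched in Section 2, can then draw samples of (\(\boldsymbol\beta \), σ, h) from it.

```python
# Unnormalized log posterior (11) of the Kriging parameters: log-likelihood
# (10) plus the log of the non-informative prior (9), f(beta, sigma, h) ∝ sigma^-2.
import numpy as np

def log_post_kriging(beta, sigma, h, g_c, F, x_c):
    if sigma <= 0.0 or h <= 0.0:
        return -np.inf
    nc = len(g_c)
    R = corr_matrix(x_c, x_c, h)      # correlation matrix from (6)
    sign, logdet = np.linalg.slogdet(R)
    if sign <= 0:                     # numerically singular R: reject the point
        return -np.inf
    r = g_c - F @ beta                # residual of the linear trend, (8)
    quad = r @ np.linalg.solve(R, r)
    return (-nc * np.log(sigma) - 0.5 * logdet
            - quad / (2.0 * sigma**2) - 2.0 * np.log(sigma))
```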

Once the distributions of the parameters have been obtained in the form of samples, we can proceed to the posterior predictive distribution of the surrogate \(\hat{{G}}^p\) at an untried point \({\bf x}^p\) in terms of (4), in which \({\bf x}^p\), \({\bf x}^e\) and \(\boldsymbol\theta \) are replaced by \(\hat{{g}}^p\), \({\bf g}^c\) and \(\boldsymbol\beta \), σ, h, respectively. The two functions on the right of (4) become the conditional PDF of \(\hat{{g}}^p\), which is (7), and the posterior PDF of \(\boldsymbol\beta \), σ, h, which is (11), respectively. In this case, following the procedure of Kennedy and O’Hagan (2001), the predictive distribution of \(\hat{{G}}^p\) conditional on the parameters is exactly normal, with mean given by

$$ E\left( {\left. {\hat{{G}}^p} \right|{\rm {\bf g}}^c,\boldsymbol\beta ,\sigma ,h} \right) = {\rm {\bf f}}\left( {{\rm {\bf x}}^p} \right)\boldsymbol\beta +{\bf R}_{\left( {{\bf x}^p,{\rm {\bf x}}^c} \right)} {\bf R}_{\left( {{\rm {\bf x}}^c} \right)}^{-1}\left( {{\rm {\bf g}}^c-{\rm {\bf F}}\boldsymbol\beta } \right) $$
(12)

and the variance is given by

$$ {\rm var}\left( {\left. {\hat{{G}}^p} \right|{\rm {\bf g}}^c,\boldsymbol\beta ,\sigma ,h} \right) = \left( {{\bf R}_{\left( {{\rm {\bf x}}^p} \right)} -{\bf R}_{\left( {{\bf x}^p,{\rm {\bf x}}^c} \right)} {\bf R}_{\left( {{\rm {\bf x}}^c} \right)}^{-1}{\bf R}_{\left( {{\rm {\bf x}}^c,{\bf x}^p} \right)} } \right)\sigma^2 $$
(13)

As in the previous section, since the samples of the parameters are at hand, predictive samples of \(\hat{{G}}^p\) at \({\bf x}^p\) are drawn at random from this normal distribution conditional on each sample set of the parameters. Quantities such as the mean and the 90% prediction bounds of \(\hat{{G}}^p\) can then be computed from the obtained samples. Note that this is the result at a single point \({\bf x}^p\); repeating it at a number of untried points, e.g., an equally spaced grid over the range of interest, one obtains the mean and prediction bounds of the metamodel over the whole range.
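A sketch of this predictive draw, implementing (12) and (13) for one posterior sample of the parameters, is given below; the trial functions (here f = [1, x], as in the example of Section 5.1) and the numerical jitter are illustrative choices of the sketch, and corr_matrix is the helper sketched after (6).

```python
# One predictive draw of G^p at untried points x_p per (12)-(13), conditional
# on a single posterior sample (beta, sigma, h).
import numpy as np

rng = np.random.default_rng(2)

def f_trial(x):
    return np.column_stack([np.ones(len(x)), x[:, 0]])  # trial functions [1, x]

def draw_g_pred(x_p, x_c, g_c, beta, sigma, h):
    F = f_trial(x_c)
    R_cc = corr_matrix(x_c, x_c, h)                     # R_(x^c)
    R_pc = corr_matrix(x_p, x_c, h)                     # R_(x^p, x^c)
    R_pp = corr_matrix(x_p, x_p, h)                     # R_(x^p)
    A = np.linalg.solve(R_cc, np.column_stack([g_c - F @ beta, R_pc.T]))
    mean = f_trial(x_p) @ beta + R_pc @ A[:, 0]         # mean (12)
    cov = (R_pp - R_pc @ A[:, 1:]) * sigma**2           # variance (13)
    cov += 1e-10 * np.eye(len(mean))                    # jitter for stability
    return rng.multivariate_normal(mean, cov)

# repeating the draw over all posterior samples of (beta, sigma, h) yields the
# mean and the 90% prediction bounds of the metamodel at the untried points
```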

4 Integrated reliability analysis using Bayesian approach

The Bayesian procedure holds for both the input uncertainty and the metamodel uncertainty, which is why the approach can be generalized to integrate both uncertainties in a single Bayesian framework. The steps are summarized in Table 1 and explained as follows.

  1. Step 1:

    Collect data.

    In the case of statistical uncertainty, collect ne data \({\bf x}^e=\left\{ {\bf x}_1^e , {\bf x}_2^e ,..., {\bf x}_{ne}^e \right\}\) for the input variables, from either previous information or direct measurements.

    In the case of costly computation of the response function, collect nc computer outputs \({\bf g}^c=\left\{ {g_1^c ,\ldots ,g_{nc}^c } \right\}\) at a set of DOE points by repeated simulations; these will be used to build the metamodel.

  2. Step 2:

    Establish posterior distribution.

    Identify the unknown model parameters \(\boldsymbol\theta \) of the input variable distributions and the parameters \(\boldsymbol\beta \), σ, h of the metamodel, respectively. Upon introducing suitable priors for each group of parameters, obtain the expressions for the posterior distributions conditional on the provided data in each case.

  3. Step 3:

    Draw samples of the unknown parameters.

    Draw random samples of the unknown parameters, with a sufficient number \(N \approx 5\cdot 10^3\) to \(10^4\), following their corresponding posterior distributions. The MCMC simulation method is well suited to this task, since the target distribution is complex or implicit in the parameters and is difficult to integrate in the traditional way. Among the variants of the MCMC, the Metropolis-Hastings (M-H) algorithm is employed (Andrieu et al. 2003).

  4. Step 4:

    Establish posterior predictive distribution.

    In both cases, the posterior predictive distributions of \({\bf X}^p\) and \(\hat{{G}}^p\) are obtained by integrating the product of the conditional distribution and the posterior distribution over the parameters, as indicated in (4). In the case of input uncertainty, the conditional distribution \(f\left( {\bf x}^p | \boldsymbol\theta \right)\) is simply the assumed probability model itself; if, for instance, a single variable X follows a normal distribution, the parameters \(\boldsymbol\theta \) are its mean μ and standard deviation σ. In the case of metamodel uncertainty, the surrogate \(\hat{{G}}^p\) follows a normal distribution with mean (12) and variance (13).

  5. Step 5:

    Draw samples of posterior predictive distribution.

    Although the predictive distribution is given in integral form, the integration is not performed in practice. Instead, predictive sampling is exploited, which draws samples using the posterior samples already obtained in Step 3. For the input uncertainty, the samples \({\bf X}^p=\left\{ {x_i^p ,\,i=1,\ldots ,N} \right\}\) are drawn from the conditional probability model using each sample of the parameters \(\boldsymbol\theta \); in the case of a normal distribution, for instance, \(x_i^p\) is drawn from \(N\left( {\mu_i ,\sigma_i^2 } \right)\). For the metamodel uncertainty, samples \(\hat{{G}}^p=\left\{ {\hat{{g}}_i^p ,\,i=1,\ldots ,N} \right\}\) at an untried point are drawn from the normal distribution using each sample set of the parameters \(\boldsymbol\beta \), σ, h. The obtained random samples of \({\bf X}^p\) and \(\hat{{G}}^p\) represent the distribution of x due to the statistical uncertainty caused by the limited data, and the distribution of \(\hat{{g}}\) due to the metamodel uncertainty caused by the limited number of computer outputs, respectively.

  6. Step 6:

    Conduct integrated reliability analysis.

    When both uncertainties co-exist, the predictive distribution of the surrogate response should be obtained at the random input variable \({\bf X}^p\), which carries the statistical uncertainty. In this case, predictive samples of \(\hat{{G}}^p\) are drawn from the normal distribution conditional on the two sets of samples \(\boldsymbol\beta_{i}\), \(\sigma_i\), \(h_i\) and \({\bf x}_i^p ,\,\,i=1,\ldots ,N\), where the former are the posterior samples of the metamodel parameters from Step 3 and the latter are the predictive samples of the input variables from Step 5. Consequently, P [G < 0] is calculated from the N drawn values of \(\hat{{g}}^p\); a minimal sketch of this step is given after Table 1.

Table 1 Procedure of integrated Bayesian reliability analysis
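The following minimal sketch of Step 6 assumes the posterior samples of Step 3 and the predictive input samples of Step 5 are available as arrays, and reuses the draw_g_pred helper sketched in Section 3.

```python
# Integrated reliability analysis (Step 6): one predictive surrogate draw per
# combined sample, pairing (beta_i, sigma_i, h_i) with x_i^p.
import numpy as np

def integrated_probability(x_p_samples, theta_samples, x_c, g_c, threshold=0.0):
    """Estimate P[G < threshold] under input plus metamodel uncertainty.

    theta_samples is assumed to be a sequence of (beta_i, sigma_i, h_i) tuples.
    """
    g_p = np.empty(len(x_p_samples))
    for i, (x_i, (beta_i, sigma_i, h_i)) in enumerate(zip(x_p_samples,
                                                          theta_samples)):
        # one draw of the surrogate at one predictive input sample
        g_p[i] = draw_g_pred(x_i[None, :], x_c, g_c, beta_i, sigma_i, h_i)[0]
    return np.mean(g_p < threshold)
```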

5 Mathematical examples

In this section, two mathematical examples are studied to examine the effects of the input and metamodel uncertainties. In each example, the proposed Bayesian framework is applied to the input uncertainty, the metamodel uncertainty, and the integrated reliability analysis.

5.1 Single variable example

The first is a function of a single variable (O’Hagan 2006):

$$ g\left( X \right)=X+3\sin \left( {X/2} \right) $$
(14)

The variable X is assumed to be aleatory, following the normal distribution with mean μ = 3.5 and standard deviation σ = 1.2. The PDF of g(X) due to the randomness of X, obtained by classical MCS with \(N = 10^6\), is plotted in Fig. 3. Taking g(X) < 3.5 as the failure event, the probability P[G < 3.5] is calculated as 0.0462.

Fig. 3 PDF of response function g(X) of example 1

Now suppose the input variable is subject to statistical uncertainty due to a limited amount of data of size ne. The model parameters \(\boldsymbol\theta \), namely the mean μ and standard deviation σ, accordingly become unknown. Let us assume that the same values are still observed as the sample mean and standard deviation of the data, i.e., \(\bar{{x}}=3.5\) and s = 1.2. Using Bayes’ theorem under a non-informative prior, the joint posterior PDF of the model parameters is given by (Gelman et al. 2004):

$$ p\left( {\mu ,\sigma^2\left| {{\rm {\bf x}}^e} \right.} \right) \propto \sigma^{-ne-2} \exp \left( {-\frac{1}{2\sigma^2}\left[ {\left( {ne-1} \right)s^2+ne\left( {\bar{{x}}-\mu } \right)^2} \right]} \right) $$
(15)

where the prior for μ and σ is based on the non-informative assumption and is given by (Martz and Waller 1982)

$$ p\left( {\mu ,\sigma^2} \right)\propto \left( {\sigma^2} \right)^{-1} $$
(16)

If a specific prior for \(\boldsymbol\Theta \) exists from previous experience, a non-conjugate prior may have to be employed; accommodating this is one of the key features of the Bayesian approach. In that case, as the posterior PDF takes an arbitrary form, the MCMC is better suited than the other conventional methods.

In the nested reliability analysis, the outer loop is implemented to determine the posterior PDF of \(\boldsymbol\Theta \) and the inner loop to determine the probability P[G < 3.5]. The number of samples is \(10^4\) in each simulation, so that the total amounts to \(10^8\). In Fig. 4, the sampling results of \(\boldsymbol\theta \) for the case ne = 5 are presented, in which (a) is the trace of the MCMC sampling and (b) is the scatter plot of the drawn samples. As a result of the Bayesian reliability analysis, the full PDF of the probability is given in Fig. 5a, which represents the degree of belief about the probability conditional on the provided data. As the number of data increases, the PDF gets narrower and converges to the single value obtained under aleatory uncertainty. In Table 2, the 90% confidence bounds are listed for each ne. In design practice, the natural choice is the upper bound of the probability, for the sake of safety.

Fig. 4 Posterior distribution of μ, σ obtained by MCMC

Fig. 5 Bayesian reliability analysis results under statistical uncertainty

Table 2 Bayesian reliability analysis results of example 1

In the predictive reliability analysis, predictive samples of \(X^p\) are drawn from the following conditional probability distribution at each value of \(\boldsymbol\theta \) drawn from the posterior distribution (15):

$$ X^p\vert \mu ,\sigma^2\sim N\left( {\mu ,\sigma^2} \right) $$
(17)

In this case, the analytic form of the posterior predictive distribution of \(X^p\) is available: a t-distribution with location \(\bar{{x}}\), scale \(s\sqrt {1+1/ne} \) and ne − 1 degrees of freedom. Predictive samples following this distribution are drawn with \(N = 10^4\). Finally, the response function is evaluated at each sample of \(X^p\), from which the probability P [G < 3.5] is calculated. The results are listed in Table 2 and plotted in Fig. 5b. The probability converges to the aleatory value as in the nested case.
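This analytic form permits a quick numerical check; the following sketch draws the predictive samples directly from the stated scaled and shifted t-distribution and evaluates the probability.

```python
# Predictive samples from the analytic posterior predictive distribution:
# X^p = x_bar + s*sqrt(1 + 1/ne) * T with T ~ t(ne - 1).
import numpy as np

rng = np.random.default_rng(3)
ne, x_bar, s = 5, 3.5, 1.2

x_p = x_bar + s * np.sqrt(1.0 + 1.0 / ne) * rng.standard_t(ne - 1, size=10**4)

g = x_p + 3.0 * np.sin(x_p / 2.0)   # response function (14)
print("predictive P[G < 3.5] ≈", np.mean(g < 3.5))
```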

Compared with the upper bound values of the nested approach, the predictive values are much lower and hence less conservative. Recall that (4) represents the marginal distribution of \(X^p\) conditional on \({\bf x}^e\), i.e., the average of the conditional prediction over the posterior distribution of \(\boldsymbol\Theta \). This means that the predictive values of the probability agree with the mean values of the full PDF of the probability in the nested approach, as can be seen by comparing the predictive values to the mean values in Table 2.

In this example, although the function is simple to evaluate directly, it is approximated by a Kriging model in order to illustrate the metamodel uncertainty, using a finite set of responses at equally spaced points. Two cases with nc = 4 and 6 points are considered in the range from 0 to 15. The trial functions and associated parameters are f = [1, x] and \(\boldsymbol\beta = [\beta_{1}, \beta_{2}]\), respectively. As noted previously, the conventional way to determine the correlation parameter h is to use the MLE. The obtained h, however, often fails to provide the best Kriging model. In order to illustrate this, h is determined by minimizing the equivalent likelihood function (Sacks et al. 1989):

$$ \begin{array}{ll} \mbox{Minimize} & \sigma^2\left| {\bf R} \right|^{1/nc} \\ \mbox{Subject to} & h>0 \end{array} $$
(18)

The other parameters \(\beta_1\), \(\beta_2\) are then determined by the corresponding estimation equations using the obtained h, and the Kriging model is constructed with these parameters. The equivalent likelihood functions in terms of h are plotted in Fig. 6a and c for the cases nc = 4 and 6, respectively. The minimizing h is found using the function ‘fmincon’ in the Matlab Optimization Toolbox. In the case nc = 4, the function turns out to be minimal over the whole range from 0 to 1.5; Kriging curves are constructed using several optimal h values, 0.5, 1 and 1.5, together with another value of 5 for comparison, and the results are plotted in Fig. 6b. In the case nc = 6, the optimum is found at 5.1, although the function appears almost flat below that point; Kriging curves are drawn using the optimal h = 5.1 along with the other values 0.5, 2 and 12, as shown in Fig. 6d. As seen in the two cases, the optimal h is neither easy to obtain nor guaranteed to provide the best model. In the case nc = 4, the best curve with sufficient smoothness is found at h = 5; in the case nc = 6, all the curves with h from 5 to 12 are close to the true model. From this study, it is found that the larger the value of h, the smoother the Kriging model and the closer it gets to the true model. Too large an h, however, induces singularity in the correlation matrix, which still makes assigning h difficult.
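For reference, the sub-optimization (18) can be sketched as follows for the nc = 4 case; corr_matrix is the helper sketched after (6), and the closed-form estimates of \(\boldsymbol\beta \) and \(\sigma^2\) for a given h are the standard generalized-least-squares Kriging formulas, an assumption here since the text does not spell them out.

```python
# MLE-type determination of h by minimizing the equivalent likelihood (18).
import numpy as np
from scipy.optimize import minimize_scalar

x_c = np.linspace(0.0, 15.0, 4).reshape(-1, 1)          # nc = 4 DOE points
g_c = (x_c + 3.0 * np.sin(x_c / 2.0)).ravel()           # responses of (14)
F = np.column_stack([np.ones(len(x_c)), x_c.ravel()])   # trial functions [1, x]

def objective(h):
    nc = len(g_c)
    R = corr_matrix(x_c, x_c, h)
    Ri_F = np.linalg.solve(R, F)
    Ri_g = np.linalg.solve(R, g_c)
    beta = np.linalg.solve(F.T @ Ri_F, F.T @ Ri_g)      # GLS estimate of beta
    r = g_c - F @ beta
    sigma2 = r @ np.linalg.solve(R, r) / nc             # MLE of sigma^2
    return sigma2 * np.linalg.det(R) ** (1.0 / nc)      # objective of (18)

res = minimize_scalar(objective, bounds=(0.1, 15.0), method="bounded")
print("h_opt ≈", res.x)
```

As the text above observes, the objective can be nearly flat in h, so the reported minimizer should not be taken as a reliable optimum.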

Fig. 6 Equivalent likelihood function and Kriging model

In order to avoid this problem, h is included as an uncertain parameter. The unknown parameters are then \(\beta_1\), \(\beta_2\), σ, and h, as given in (5) and (6). The posterior distributions of these parameters are obtained in the form of samples with N = 30,000 using the MCMC technique. Note that h is implicitly expressed within the correlation matrix R in the posterior distribution (11), which justifies the use of the MCMC. The results for the cases nc = 4 and 6 are shown in Fig. 7. The distributions represent the uncertainty of the parameters due to the employment of the metamodel. From the histograms of h, the medians are computed as 2.77 and 5.18 for nc = 4 and 6, respectively. After obtaining the posterior distribution of the parameters as given by (11), the predictive distribution is obtained by drawing samples from the normal distribution with mean (12) and variance (13) at each sample of the parameters. The 90% prediction intervals are shown in Fig. 8 for the various cases. In the figure, the upper and lower rows show the results for nc = 4 and 6, respectively; the first, second and third columns show the results for two constant values of h, 0.5 and 5, and for uncertain h, respectively. The solid curve is the actual model, the dashed curve is the estimated mean of the metamodel, and the dotted curves are the 90% prediction interval. It is observed that, overall, the prediction interval narrows as nc increases. In terms of h, the uncertainty is reduced as the value becomes higher, which makes the function smoother; in particular, in Fig. 8e with nc = 6 and h = 5, the metamodel is very close to the actual model. The results with uncertain h show intervals a little wider than those with h = 5; in fact, they can be interpreted as the average of the predictions over the posterior distribution of h. Recall that, in Fig. 8, the better curves with smaller intervals obtained at h = 5 are just the result of an arbitrary trial, not of optimization; as found in Fig. 6, determination of h by optimization is not always likely to provide the ‘true’ optimum, and the value was instead found by increasing it arbitrarily until singularity was encountered. By treating h as uncertain, this arbitrary procedure is avoided, at the expense of a somewhat wider interval. From these observations, it is concluded that the Bayesian approach efficiently quantifies the uncertainty of the Kriging metamodel in the form of a prediction interval. The effect of uncertain h is also incorporated in the procedure, which is a further advantage since it does not suffer from the arbitrary assignment of a constant h.

Fig. 7 Posterior distribution of metamodel parameters of example 1

Fig. 8 90% prediction intervals due to metamodel uncertainty for example 1

The final goal is to implement reliability analysis under the integrated input variable and metamodel uncertainty. As noted in the general procedure, the predictive distribution of the surrogate response \(\hat{{G}}^p\) is obtained by drawing samples from the normal distribution at each sample of the two sets: the posterior samples of \(\boldsymbol\beta \), \(\sigma^2\), h and the predictive samples of \(X^p\). The probability of the event for g is then calculated from the drawn samples. The predicted probabilities P[G < 3.5] under various cases are shown in Fig. 9a with the parameter h taken as uncertain. At each number of DOE points nc, the probability converges from above to a fixed value as the number of data ne increases. As nc increases, the probability also converges, but much more rapidly. Ultimately, the probability converges to the aleatory value 0.0462. In Fig. 9b, the results under a constant h = 5 are also given for comparison; incidentally, these values are lower than those with uncertain h. With a different h, however, the results may vary arbitrarily, i.e., they can be higher or lower depending on the value of h. This arbitrariness is avoided by employing an uncertain h in the procedure.

Fig. 9 Integrated reliability analysis results of example 1

5.2 Two variables example

Consider next a function of two variables (Youn and Wang 2008):

$$ g\left( {\rm {\bf X}} \right)=1-X_1^2 X_2 /20 $$
(19)

where both variables are assumed aleatory with normal distributions \(X_1 \sim N(2.9, 0.2)\) and \(X_2 \sim N(2.8, 0.2)\). The probability P[G < 0] is then calculated as 0.8412 using classical MCS. In this example, statistical uncertainty is introduced to \(X_1\), i.e., only a small number of data are available for \(X_1\), with \(\bar{{x}}_1 =2.9\) and \(s_1 = 0.2\), while \(X_2\) remains aleatory. The results are listed in Table 3. The same features are observed as in the previous example: the confidence bounds of the nested approach as well as the predictive values converge toward the aleatory value, and the predictive values are close to the mean values of the PDF, as found previously. The probability in this example can be regarded as a reliability, in which case the lower bound of the nested approach may be favored.

Table 3 Bayesian reliability analysis results of example 2

Comparing the nested and predictive approaches, it can be concluded that the nested approach provides much more information, i.e., the full PDF of the probability, at the expense of increased computation. The predictive approach, on the other hand, provides only a point estimate, which is the mean of the PDF. Nevertheless, the predictive approach still accounts for the statistical uncertainty in a single-loop analysis, which is more advantageous in design practice. The fewer the data provided, the higher the statistical uncertainty, which means that a higher failure probability (or lower reliability) is assigned.

For the metamodel uncertainty, the trial functions and associated parameters are f = [1, x 1, x 2] and \(\boldsymbol\beta \) = [β 1, β 2, β 3], respectively, so a total of five parameters are considered including σ and h. In this two-variable problem, Latin Hypercube Sampling (LHS) (Iman 1999) is used to generate 10 DOE points in the range [1, 5] for each variable. A finite number of responses are calculated at these points to construct the metamodel. Given the response values, the posterior distributions of the unknown parameters obtained via the MCMC technique are shown in Fig. 10. The upper and lower bounds of the prediction interval of the metamodel incorporating the uncertain h are given as surface plots in Fig. 11, where the dots in the bottom plane denote the LHS points.

Fig. 10 Posterior distribution of metamodel parameters of example 2

Fig. 11 90% prediction interval due to metamodel uncertainty for example 2

As in the previous example, the probabilities under the integrated uncertainty are computed for various cases. The results are given in Fig. 12a in terms of nc and ne. Overall, a similar pattern is observed. In this case, the probability, representing the reliability, converges from below to 0.8412, as opposed to the 1D results, where the small probability, representing the failure probability, converges from above. Either way, the difference from the converged value represents the safety margin due to the uncertainties arising from the lack of data, which is reduced as both numbers increase. Noteworthy is the significant jump of the probability from nc = 6 to 8 in Fig. 12a and c, which indicates that there may be a minimum necessary number of DOE points to obtain a plausible result, and suggests that the metamodel be constructed in an adaptive manner. In the case nc = 8 in Fig. 12a, some of the probabilities are higher than the aleatory value even under the epistemic uncertainties, which contradicts our expectation. The reason can be attributed to the poor quality of the DOE points around the reliability analysis point, as shown in Fig. 12b, in which the dots and × denote the DOE points and the analysis point, respectively. As a remedy, LHS with a proper normal distribution is employed in order to place more DOE points near the reliability analysis point. The result of applying this method is given in Fig. 12c, using the DOE points shown in Fig. 12d.

Fig. 12 Integrated reliability analysis results of example 2

6 Engineering application

Consider a helical coil spring assembled in a McPherson strut suspension, as shown in Fig. 13. In this problem, the maximum shear stress \(\tau_{\max}\) is calculated under a prescribed displacement load. To obtain \(\tau_{\max}\), nonlinear FEA is carried out, which includes large deformation and contact between the coil and the upper/lower seats as well as between adjacent wires. Using the FEA model with 1664 elements depicted in Fig. 13, one computation takes about 30 s on a CPU operating at 2.52 GHz. The wire diameter, denoted by \(X_1\) in Fig. 13, and Young’s modulus, denoted by \(X_2\), are treated as uncertain input parameters. Reliability analysis is conducted to evaluate the probabilistic behavior of \(\tau_{\max}\) due to these uncertainties using the proposed integrated method. Despite the high computational cost, classical MCS with \(10^4\) samples of the actual model is also employed to verify the accuracy of the method.

Fig. 13 Suspension coil spring illustration

In the case of aleatory uncertainty, the variables \(X_1\) and \(X_2\) are assumed normally distributed as N(14, 0.14) mm and N(206, 6.18) GPa, respectively. In the case of statistical uncertainty, it is assumed that only 10 data are available for \(X_2\), with the sample mean and standard deviation equal to the above values. In contrast with the simple mathematical examples, a huge number of FEA runs would be needed to implement the MCMC sampling procedure directly. To avoid this, a metamodel is introduced, with DOE points generated by LHS in the domain μ ± 6σ, that is, 13.16 ≤ x 1 ≤ 14.84 and 168.92 ≤ x 2 ≤ 243.08. The metamodel for \(\tau_{\max}\) is then constructed from the computed responses at the DOE points. The unknown parameters are the model parameters (μ, σ) of the input variable \(X_2\) and the metamodel parameters \(\boldsymbol\beta \), σ, h. MCMC is implemented to obtain \(10^4\) samples of the posterior distribution of these parameters. In Table 4, the reliability \(P\left[ {\tau _{\max } <1{,}000~\mathrm{MPa}} \right]\) is calculated for three cases: input uncertainty only, and input plus metamodel uncertainty with nc = 10 and nc = 5, respectively. Predictive samples of \(\hat{{\tau }}_{\max } \) are obtained from the metamodel using each sample set of the parameters, which would be prohibitive with the actual FEA model. As shown in Table 4, the reliability increases as the associated uncertainties are reduced, the highest reliability being 0.8525, the value under the aleatory assumption. Compared with the MCS on the actual model, the metamodel-based method yields similar reliability results at a tremendously lower computational cost, and the saving will only grow for more complex problems. The integrated reliability method can therefore be used effectively.

Table 4 Integrated reliability analysis results of suspension coil spring example

7 Summary and conclusions

In this paper, an integrated reliability analysis procedure is developed based on a Bayesian framework, which addresses both the statistical uncertainty arising from the limited data of the input variables and the uncertainty arising from the construction of the metamodel. For the input uncertainty, a posterior prediction method is proposed to efficiently evaluate the failure probability or reliability. It is found that the probability given by the predictive approach corresponds to the mean of the PDF of the probability given by the full Bayesian approach. Though the full PDF information is lost in the predictive approach, it is a more cost-effective method since it captures the influence of the input uncertainties in only a single pass of reliability analysis. For the metamodel uncertainty, the uncertainty due to the metamodel construction from a finite number of DOE points is quantified in the form of a prediction interval by employing the Gaussian process model. During the procedure, the uncertainty of the correlation parameter is also incorporated, whereas it has been taken as a constant in most of the previous engineering literature. A general procedure to integrate these uncertainties is presented based on the Bayesian framework. For efficient evaluation of the posterior distributions in the procedure, the Markov Chain Monte Carlo (MCMC) method is employed. The feasibility of the proposed method is validated through mathematical and engineering problems.