Conservative reliability-based design optimization method with insufficient input data

Cho, Hyunkyoo; Choi, K. K.; Gaul, Nicholas J.; Lee, Ikjin; Lamb, David; Gorsich, David

doi:10.1007/s00158-016-1492-4

Conservative reliability-based design optimization method with insufficient input data

RESEARCH PAPER
Published: 06 June 2016

Volume 54, pages 1609–1630, (2016)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Structural and Multidisciplinary Optimization Aims and scope Submit manuscript

Conservative reliability-based design optimization method with insufficient input data

Download PDF

Hyunkyoo Cho¹,
K. K. Choi¹,
Nicholas J. Gaul²,
Ikjin Lee³,
David Lamb⁴ &
…
David Gorsich⁴

1030 Accesses
42 Citations
Explore all metrics

Abstract

Reliability analysis and reliability-based design optimization (RBDO) require an exact input probabilistic model to obtain accurate probability of failure (PoF) and RBDO optimum design. However, often only limited input data is available to generate the input probabilistic model in practical engineering problems. The insufficient input data induces uncertainty in the input probabilistic model, and this uncertainty forces the PoF to be uncertain. Therefore, it is necessary to consider the PoF to follow a probability distribution. In this paper, the probability of the PoF is obtained with consecutive conditional probabilities of input distribution types and parameters using the Bayesian approach. The approximate conditional probabilities are obtained under reasonable assumptions, and Monte Carlo simulation is applied to calculate the probability of the PoF. The probability of the PoF at a user-specified target PoF is defined as the conservativeness level of the PoF. The conservativeness level, in addition to the target PoF, will be used as a probabilistic constraint in an RBDO process to obtain a conservative optimum design, for limited input data. Thus, the design sensitivity of the conservativeness level is derived to support an efficient optimization process. Using numerical examples, it is demonstrated that the conservativeness level should be involved in RBDO when input data is limited. The accuracy and efficiency of the proposed design sensitivity method is verified. Finally, conservative RBDO optimum designs are obtained using the developed methods for limited input data problems.

A novel probabilistic feasible region method for reliability-based design optimization with varying standard deviation

Article 05 September 2023

Reliability measure approach for confidence-based design optimization under insufficient input data

Article 11 June 2019

Extending SORA method for reliability-based design optimization using probability and convex set mixed models

Article 31 October 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Output variability, which is the variability of performance measures, impedes a design’s ability to sustain performance during its life cycle. To obtain a safe and reliable design under output variability, reliability-based design optimization (RBDO) has been developed using the first-order reliability method (FORM) (Hasofer and Lind 1974; Ditlevsen and Madsen 1996; Tu et al. 1999; Haldar and Mahadevan 2000; Tu et al. 2001; Gumbert et al. 2003; Hou 2004), the second-order reliability method (SORM) (Hohenbichler and Rackwitz 1988; Breitung 1984; Lee et al. 2012; Lim et al. 2014), the dimension reduction method (DRM) (Rahman and Wei 2006; Rahman and Wei 2008; Lee et al. 2010), and Monte Carlo simulation (MCS) (Rubinstein and Kroese 2008; Lee et al. 2011a, b). Output variability is induced by input variability – i.e., the variability of the input random variable. In RBDO, an input probabilistic model, which is a statistical representation of the input variability, is used to obtain the output variability. Therefore, the accuracy of the input probabilistic model is necessary to obtain correct reliability of the RBDO optimum design.

The aforementioned RBDO methods require an accurate input probabilistic model – i.e., “true” input probabilistic model. Obtaining the true input probabilistic model is very difficult because it requires a very large number of test data for all subjects in the system. Unfortunately, due to cost and time constraints, it is highly probable that insufficient input data will be available for an input probabilistic model in a practical problem. Then, the input probabilistic model generated using the limited number of data becomes uncertain. As a result, the uncertainty in the input probabilistic model forces probability of failure (PoF), a measure of the reliability, to be uncertain. Consequently, new methods need to be developed to obtain more conservativeness when there is insufficient input data.

A safety factor approach could be an intuitive start to consider uncertainty in the input probabilistic model (Elishakoff 2004). P-boxes and probability bounds, which are essentially a new input probabilistic model at a certain confidence level based on the input data, have been developed to capture the uncertainty in the input probabilistic model (Tucker and Ferson 2003; Aughenbaugh and Paredis 2006; Utkin and Destercke 2009). The uncertainty in the input probabilistic model and the variability of the input random variables can be combined in a modified input probabilistic model by using intentionally enlarged input variances (Noh et al. 2011a, b). All of these methods adjust the input probabilistic model to reflect the uncertainty in it. However, the uncertainty in the input probabilistic model transfers to the PoF through performance measures. When the performance measure is nonlinear, it is hardly possible to estimate the uncertainty of the PoF accurately by altering the input probabilistic model. Moreover, modifying the input probabilistic model may mix the effect of input uncertainty (the uncertainty in the input probabilistic model due to insufficient data) and input variability (variability of the input random variables), which are essentially two different sources of output uncertainty and variability.

The Bayesian approach would be better for directly accessing the PoF and separating the effect of the input uncertainty and variability. In one study, the probability of fatigue failure of a steel bridge was estimated by combining several input probabilistic models and two crack propagation models with the Bayesian method and nondestructive inspection (NDI) data (Zhang and Mahadevan 2000), and the PoF was updated as more NDI data became available. The mean of the simulation output was qualified in the presence of the input uncertainty using the Bayesian model average (BMA) approach (Chick 2001), but the two sources were not clearly distinguished. Later, Gunawan and Papalambros successfully separated the two sources and assumed that the PoF follows beta distribution (Gunawan and Papalambros 2006). That is, the PoF, which quantifies the output variability induced by the input variability, also follows another probability distribution (the beta distribution, in this case) due to the input uncertainty. The cumulative distribution function (CDF) of the beta distribution at a certain PoF is the conservativeness level of the PoF. Using these observations, an RBDO problem of minimizing cost and maximizing conservativeness level was performed. Youn and Wang obtained an extreme case of the beta distribution using extreme distribution theory, and the median value of the extreme case was used as new probabilistic constraints for RBDO (Youn and Wang 2008). In addition, the design sensitivity of the probabilistic constraints was developed. However, the probability of the PoF still has not been fully utilized. Once the probability is obtained, the conservativeness level of the PoF is directly accessible. Then, the input uncertainty induced by insufficient input data is measured by the conservativeness level, while the input variability is captured by the PoF.

In this paper, a new method to estimate the conservativeness level of the PoF is presented. The new method directly accesses the probability of the PoF using the Bayesian approach and distinguishes the input uncertainty and the input variability. By separating the two sources, users can specify separate target values for both the input uncertainty and variability in the new RBDO process. Moreover, a design sensitivity for the conservativeness level is developed to ensure the effectiveness and efficiency of the new RBDO process. In Section 2, the relationship between the PoF and insufficient input data is shown. In addition, how to deal with the input data in the new RBDO process is explained. In Section 3, the probability of the PoF is obtained using the Bayesian method, and the estimation method of conservativeness level is presented. In Section 4, the design sensitivity of the conservativeness level is derived. Then numerical examples are used to show the effectiveness and efficiency of the estimation of the conservativeness level, the design sensitivity method, and the new conservative RBDO process in Section 5. In Section 6, an 11-dimensional problem is tested to see the performance of developed method in high-dimensional applications. Finally, the conclusion is presented in Section 7.

2 Probability of failure and insufficient input data

Probability of failure is revisited in this section to associate it with input distribution types and parameters. Through this association, propagation of the input uncertainty due to insufficient input data to probability of the PoF is characterized in the following sections. Therefore, it is worth discussing the PoF before moving on to the main discoveries of this paper. In addition, how to treat input data in the RBDO process is explained in this section as well.

2.1 Probability of failure

The PoF $ {p}_F $ is defined using a multi-dimensional integral and an indicator function as

$$ {p}_F\left(\boldsymbol{\upzeta}, \boldsymbol{\uppsi} \right)={\displaystyle {\int}_{{\mathrm{\mathbb{R}}}^N}{I}_{\Omega_F}\left(\mathbf{x}\right){f}_{\mathbf{X}}\left(\mathbf{x};\boldsymbol{\upzeta}, \boldsymbol{\uppsi} \right)\ d\mathbf{x}} $$

(1)

where $ {\Omega}_F $ is the failure domain such that a performance measure $ G\left(\mathbf{x}\right) $ is larger than zero (i.e., $ G\left(\mathbf{x}\right)>0 $), $ {f}_{\mathbf{X}}\left(\mathbf{x};\boldsymbol{\upzeta}, \boldsymbol{\uppsi} \right) $ is a joint probability density function (PDF) of input random variables X with input distribution types ζ and input distribution parameters ψ, x is realization of X, N is the number of input random variables, and $ {I}_{\Omega_F}\left(\mathbf{x}\right) $ is an indicator function defined as

$$ {I}_{\Omega_F}\left(\mathbf{x}\right)\equiv \left\{\begin{array}{cc}\hfill 1,\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ \mathbf{x}\in {\Omega}_F\hfill \\ {}\hfill 0,\hfill & \hfill \mathrm{otherwise}.\hfill \end{array}\right. $$

(2)

In this paper, it is assumed that each input random variable $ {X}_i $ is statistically independent and has marginal distribution with two parameters. Under the assumptions, the joint PDF in (1) can be expressed using marginal PDF as

$$ {f}_{\mathbf{X}}\left(\mathbf{x};\boldsymbol{\upzeta}, \boldsymbol{\uppsi} \right)={\displaystyle \prod_{i=1}^N}{f}_{X_i}\left({x}_i;{\zeta}_i,{\mu}_i,{\sigma}_i^2\right) $$

(3)

where $ {f}_{X_i} $, $ {\zeta}_i $, $ {\mu}_i $, and $ {\sigma}_i^2 $ are marginal PDF, marginal distribution type, and mean and variance of input random variable $ {X}_i $, respectively. Input distribution types and parameters can be defined as ζ = {ζ ₁, …, ζ _N} and ψ = {μ ₁, σ ²₁ …, μ _N, σ ²_N }, respectively. It is noted that mean ($ {\mu}_i $) and variance ($ {\sigma}_i^2 $) are used in (3) instead of the two parameters of the marginal distribution because they are invariants of the marginal distribution type and the two parameters can be uniquely determined using the mean and variance.

As a specific example, consider an input joint PDF with three input random variables of X = [X ₁,X ₂, X ₃]^T, which follow the Normal, Lognormal, and Gamma marginal distribution types, respectively. Each input random variable $ {X}_i $ has mean and variance of $ {\mu}_i $ and $ {\sigma}_i^2 $. Then, the input joint PDF of the three input random variables can be expressed using (3) as

$$ {f}_{\mathbf{X}}\left(\mathbf{x};\ \boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)={f}_{X_1}\left({x}_1;{\zeta}_1,{\mu}_1,{\sigma}_1^2\right)\ {f}_{X_2}\left({x}_2;{\zeta}_2,{\mu}_2,{\sigma}_2^2\right){f}_{X_3}\left({x}_3;{\zeta}_3,{\mu}_3,{\sigma}_3^2\right) $$

(4)

with input distribution types ζ = {ζ ₁,ζ ₂, ζ ₃}, input distribution parameters ψ = {μ ₁, σ ²₁ , μ ₂, σ ²₂ , μ ₃, σ ²₃ }, and $ {\zeta}_1 $ =Normal, $ {\zeta}_2 $ =Lognormal, $ {\zeta}_3 $ =Gamma. As marginal distribution types and the distribution parameters are specified, the joint PDF in (4) is a specific PDF that can produce one value of the PoF in (1).

If the population data, which is a complete set of data for input random variables, is available, the true input distribution types ζ and the true input distribution parameters ψ can be obtained. If this is the case, as explained earlier, (1) produces a fixed PoF value. However, in practical engineering problems, only limited data is available, which makes ζ and ψ follow probability distributions instead of being fixed types or values. Therefore, capital characters Z and Ψ will be used to represent their randomness in the presence of limited data with ζ and ψ being the corresponding realizations, respectively. Z, Ψ, and the amount of data cause the PoF to follow a probabilistic distribution.

The conventional RBDO methods that find optimum design using the fixed target PoF value based on a realization set ζ and ψ can no longer produce reliable design when only limited data is available. Thus, a new RBDO method needs to be developed in order to produce a conservative optimum design that we can rely on, when only insufficient data is available, by considering the probability distributions of Z and Ψ and, eventually, the probability distribution of the PoF.

2.2 Input data decomposition

Let *x be the given input data set. For simplicity of explanation, the number of data for each input random variable is set to ND. This can be easily extended to a case in which the numbers of data are not the same. The input data set *x could contain the following data subsets:

$$ *\mathbf{x}=\left\{*{\mathbf{x}}_1,\dots, *{\mathbf{x}}_N\right\}. $$

(5)

The data subset *x _i for the i-th input random variable $ {X}_i $ is a column vector of size ND as

$$ *{\mathbf{x}}_i={\left[\begin{array}{cc}\hfill \begin{array}{cc}\hfill *{x}_i^{(1)}\hfill & \hfill *{x}_i^{(2)}\hfill \end{array}\hfill & \hfill \begin{array}{cc}\hfill \cdots \hfill & \hfill *{x}_i^{(ND)}\hfill \end{array}\hfill \end{array}\right]}^{\mathrm{T}} $$

(6)

where *x ^(j)_i is the j-th data for $ {X}_i $. The data subset *x _i can be decomposed into two parts as

$$ {}^{*}{\mathbf{x}}_i{=}^{*}{\overline{\mathbf{x}}}_i{+}^{*}{\tilde{\mathbf{x}}}_i $$

(7)

where $ {}^{*}{\overline{\mathbf{x}}}_i $ is a column vector of size ND, the entities of which are the sample mean of the i-th data subset *x _i as

$$ {}^{*}{\overline{\mathbf{x}}}_i={\left[{}^{*}{\overline{x}}_i*{\overline{x}}_i\kern0.75em \cdots {\kern0.5em }^{*}{\overline{x}}_i\right]}^{\mathrm{T}}\kern0.5em \mathrm{such}\ \mathrm{t}\mathrm{h}\mathrm{a}{\mathrm{t}}^{*}{\overline{x}}_i=\frac{1}{ND}{\displaystyle {\sum_{j=1}^{ND}}^{*}{x}_i^{(j)}}. $$

(8)

Some of the input random variables are related to design variables. If $ {X}_i $ is related to a design variable $ {d}_i $, the i-th data subset *x _i is assumed in the RBDO process as

$$ {}^{*}{\mathbf{x}}_i={\mathbf{d}}_i{+}^{*}{\tilde{\mathbf{x}}}_i $$

(9)

where d _i is the i-th design variable vector defined as

$$ {\mathbf{d}}_i={\left[{d}_i\kern0.75em {d}_i\kern0.75em \cdots \kern0.75em {d}_i\right]}^T. $$

(10)

In (9), the input data in the RBDO process are changed to be centered at the current design point d. Hence, as the design optimization proceeds, d moves according to the optimization process, and the data *x follows the design movement. However, $ {}^{*}{\tilde{\mathbf{x}}}_i $, which is the dispersion of the input data with respect to the design point, is maintained in the RBDO process. An example is shown in Fig. 1. A pair of input random variable has five data pairs. The mean of data pairs $ {}^{*}\overline{\mathbf{x}}=\left\{{}^{*}{\overline{\mathbf{x}}}_1{,}^{*}{\overline{\mathbf{x}}}_2\right\} $ has been moved to a design point d = {d ₁, d ₂}. However, the dispersion of data pairs $ {}^{*}\tilde{\mathbf{x}}=\left\{{}^{*}{\tilde{\mathbf{x}}}_1{,}^{*}{\tilde{\mathbf{x}}}_2\right\} $ with respect to the center points $ {}^{*}\overline{\mathbf{x}} $ and d is maintained. The data decomposition in (9) is a usual practice in the conventional RBDO process. In the process, the design variable, which is the mean of the corresponding input random variable, changes as design iteration proceeds, while the variance of the input random variable is constant. The same concept is applied to the RBDO with insufficient input data in this paper by decomposing input data and maintaining $ {}^{*}{\tilde{\mathbf{x}}}_i $ in the RBDO process. It is noted that $ {}^{*}{\tilde{\mathbf{x}}}_i $ contains the input uncertainty due to insufficient data and that it remains the same while the design is changed during the RBDO process.

3 Probability of PoF

In this section, the probability of the PoF is obtained using the given input data, the general expression of the joint PDF, and the Bayesian method. We can obtain a conservative RBDO optimal design, even with limited input data, by securing a certain probability of the PoF that is larger than a user-specified conservativeness level.

3.1 Probability of PoF

Consider a given input data set *x. As explained earlier, input distribution types Z and parameters Ψ follow probability distributions if the input data set *x, not true input distribution, is provided. In this paper, it is assumed that the probability distributions of Z and Ψ can be analogized from the *x. Using Bayes’ theorem and the given *x, a joint PDF of the PoF P _F, input distribution types Z, and input distribution parameters Ψ is obtained as

$$ f\left({p}_F,\ \boldsymbol{\upzeta},\ \left.\boldsymbol{\uppsi} \right|*\mathbf{x}\right)=f\left(\left.{p}_F\right|\boldsymbol{\upzeta}, \boldsymbol{\uppsi} {,}^{*}\mathbf{x}\right)\ P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} {,}^{*}\mathbf{x}\right)\ f\left({\left.\boldsymbol{\uppsi} \right|}^{*}\mathbf{x}\right). $$

(11)

In (11), the joint PDF is a product of three successive conditional probabilities. If all conditional probabilities on the right side of (11) are available, the PDF of $ {P}_F $ can be obtained by integrating Z and Ψ in (11) as

$$ {f}_{P_F}\left(\left.{p}_F\right|*\mathbf{x}\right)={\displaystyle \sum_{\mathbf{Z}}}{\displaystyle {\int}_{\Omega_{\boldsymbol{\Psi}}}f\left({p}_F,\ \boldsymbol{\upzeta},\ \left.\boldsymbol{\uppsi} \right|*\mathbf{x}\right)\ d\boldsymbol{\uppsi}},\kern1.25em {p}_F\in \left[0,\ 1\right] $$

(12)

Furthermore, the CDF of $ {P}_F $ is obtained by integrating (12) with respect to the PoF as

$$ {F}_{P_F}\left(\left.{p}_F\right|*\mathbf{x}\right)={\displaystyle {\int}_0^{p_F}{\displaystyle \sum_{\mathbf{Z}}}}{\displaystyle {\int}_{\Omega_{\boldsymbol{\Psi}}}f\left(\phi,\ \boldsymbol{\upzeta},\ {\left.\boldsymbol{\uppsi} \right|}^{*}\mathbf{x}\right)\ d\boldsymbol{\uppsi} d\phi },\kern1.25em {p}_F\in \left[0,1\right] $$

(13)

where ϕ is the variable that corresponds to $ {P}_F $. The value of the CDF of $ {P}_F $ in (13) represents the probability that $ {P}_F $ of a design with the input data *x is less than a specified value $ {p}_F $. In other words, the CDF value is the probability that a design is more conservative and safer than $ {p}_F $. Hence, in this paper, the CDF value of $ {P}_F $ at the specified $ {p}_F $ is designated as the “conservativeness level” of $ {p}_F $.

Among the three conditional probabilities on the right side of (11), the first term f(p _F|ζ, ψ, *x) is the probability of $ {P}_F $ with the given input distribution types ζ, parameters ψ, and data *x. As explained earlier, the PoF in (1) is determined by ζ and ψ. Consequently, when ζ and ψ are given, the PoF is a deterministic value, and the probability becomes a Dirac-delta measure as

$$ f\left(\left.{p}_F\right|\boldsymbol{\upzeta}, \boldsymbol{\uppsi}, *\mathbf{x}\right)=\updelta \left[{p}_F-{p}_F\left(\boldsymbol{\upzeta}, \boldsymbol{\uppsi} \right)\right]. $$

(14)

The second and third conditional probabilities are obtained in Sections 3.2 and 3.3.

3.2 Joint PDF of input distribution parameters

The last conditional probability in (11) is the joint PDF of input distribution parameters Ψ with given input data set *x. The exact joint PDF of Ψ is not known unless population data, which is the test data of all subjects, of the input random variable is provided. In fact, if input data set *x is population data, we can obtain the exact value of Ψ, so that the joint PDF becomes a Dirac delta measure. If not, all we can obtain is an approximated joint PDF of Ψ. Hence, the approximated joint PDF of Ψ is obtained in this section. As indicated earlier, input mean $ {\mu}_i $ and variance $ {\sigma}_i^2 $ are used as input distribution parameters. In addition, they are expressed using capital symbols $ {M}_i $ and $ {\varSigma}_i^2 $, respectively, because of their random features.

The joint PDF of input distribution parameters Ψ is a product of joint PDFs of input mean $ {M}_i $ and input variance $ {\varSigma}_i^2 $. $ {M}_i $ and $ {\varSigma}_i^2 $ of each input random variable have separate joint PDFs. Therefore, the joint PDF of Ψ can be expressed as

$$ f\left({\left.\boldsymbol{\uppsi} \right|}^{*}\mathbf{x}\right)={\displaystyle \prod_{i=1}^N}f\left(\left.{\mu}_i,{\sigma}_i^2\right|*{\mathbf{x}}_i\right). $$

(15)

The central limit theorem is a widely used method for obtaining the PDF of the input mean $ {M}_i $ with the given input data *x. Though the central limit theorem produces the PDF of the input mean under the assumption that the input data follow Normal distribution, it produces a well-approximated PDF of the input mean when the input data follow other distributions. In the same sense, the joint PDF of the input mean $ {M}_i $ and variance $ {\varSigma}_i^2 $ are obtained using Bayes’ theorem under the assumption that the given input data *x follows Normal distribution in this paper. This does not mean that the input distribution types Z are always Normal distributions; this is only an intermediate assumption to find the approximate joint PDF of the input mean $ {M}_i $ and variance $ {\varSigma}_i^2 $. Also, the non-informative prior, which means that there is no information except the given input data *x, is used for Bayes’ theorem. It will be shown that the result of Bayes’ theorem is the same as the one from the central limit theorem.

Under the Normality assumption described above and with the non-informative prior, the input variance $ {\varSigma}_i^2 $, for the i-th independent random variable $ {X}_i $ and the given data subset *x _i, follows inverse-gamma distribution as (Gelman et al. 2004)

$$ {\left.{\varSigma}_i^2\right|}^{*}{\mathbf{x}}_i \sim \mathrm{I}\mathrm{n}\mathrm{v}-{\chi}^2\left(ND-1,{s}_i^2\right)=\mathrm{I}\mathrm{G}\left(\frac{ND-1}{2},\ \frac{\left(ND-1\right){s}_i^2}{2}\right) $$

(16)

where the sample variance $ {s}_i^2 $ can be calculated as

$$ {s}_i^2=\frac{1}{ND-1}{\displaystyle \sum_{m=1}^{ND}}{\left({}^{*}{\overset{\sim }{x}}_i^{(m)}\right)}^2=\frac{1}{ND-1}*{\overset{\sim }{\mathbf{x}}}_i^{\mathrm{T}}*{\overset{\sim }{\mathbf{x}}}_i. $$

(17)

$ {s}_i^2 $ is constant in the RBDO process because the amount of data $ ND $ and the dispersion of input data $ {}^{*}{\tilde{\mathbf{x}}}_i $ are invariant. Therefore, the inverse-gamma distribution of $ {\varSigma}_i^2 $ in (16) does not change during the RBDO process because the parameters for the distribution are $ {s}_i^2 $ and $ ND $. The distribution has larger uncertainty when input data with smaller $ ND $ are provided. The larger uncertainty makes the PoF more uncertain. Eventually, the enlarged uncertainty of the PoF reduces the conservativeness level in (13).

The input mean $ {M}_i $ of the i-th independent variable $ {X}_i $ follows Normal distribution based on the non-informative prior, given input variance $ {\sigma}_i^2 $ and data *x _i as (Gelman et al. 2004)

$$ \left.{M}_i\right|{\sigma}_i^2{,}^{*}{\mathbf{x}}_i \sim \mathcal{N}\left({}^{*}{\overline{x}}_i,{\sigma}_i^2/ND\right) $$

(18)

where $ {}^{*}{\overline{x}}_i $ is mean of data subset *x _i as defined in (8). In the distribution of $ {M}_i $, the realization $ {\sigma}_i^2 $ of input variance $ {\varSigma}_i^2 $ is required, which means the distribution of $ {\varSigma}_i^2 $ in (16) is also used to derive the distribution of $ {M}_i $ in (18). The distribution of $ {M}_i $ is the same as the distribution from the central limit theorem; hence the distribution of $ {M}_i $ as well as the distribution of $ {\varSigma}_i^2 $, which affects the distribution of $ {M}_i $, are reasonable and trustworthy. If $ {X}_i $ is related to a design variable $ {d}_i $, (18) can be expressed as

$$ \left.{M}_i\right|{\sigma}_i^2{,}^{*}{\mathbf{x}}_i \sim \mathcal{N}\left({d}_i,{\sigma}_i^2/ND\right) $$

(19)

because the sample mean of data set $ {}^{*}{\overline{x}}_i $ changes to $ {d}_i $. It is noted that the design variable $ {d}_i $ is deterministic for the purpose of design optimization, and the input uncertainty is considered in $ {M}_i $ by treating it as a random variable. It can be seen that in (18) and (19), smaller ND makes the input mean $ {M}_i $ have larger variability, so the conservativeness level of the PoF decreases.

Finally, the joint PDF of input random variable and variance can be derived using the distributions obtained as

$$ f\left({\left.{\mu}_i,{\sigma}_i^2\right|}^{*}{\mathbf{x}}_i\right)=f\left(\left.{\mu}_i\right|{\sigma}_i^2{,}^{*}{\mathbf{x}}_i\right)f\left(\left.{\sigma}_i^2\right|*{\mathbf{x}}_i\right) $$

(20)

where f(σ ²_i |*x _i) and f(μ _i|σ ²_i , *x _i) are the PDF forms of (16) and (19), respectively. Finally, (20) can be used to obtain the joint PDF of input distribution parameters Ψ in (15).

3.3 Probability mass function of input distribution types

The probability mass function of an input distribution types Z with the given input data *x and given parameters ψ is obtained using Bayes’ theorem as

$$ P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} {,}^{*}\mathbf{x}\right)=\frac{P\left({}^{*}\left.\mathbf{x}\right|\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)\ P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} \right)}{{\displaystyle {\sum}_{\mathbf{Z}}}P\left({}^{*}\left.\mathbf{x}\right|\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)\ P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} \right)}=\frac{L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)\ P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} \right)}{{\displaystyle {\sum}_{\mathbf{Z}}}L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)\ P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} \right)} $$

(21)

where the likelihood function L(*x; ζ, ψ) is a product of the PDF value at each input data point as

$$ L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)={\displaystyle \prod_{i=1}^N}{\displaystyle \prod_{m=1}^{ND}}{f}_{X_i}\left(\left.{}^{*}{x}_i^{(m)}\right|{\zeta}_i;{\mu}_j;{\sigma}_j^2\right). $$

(22)

The term $ P\left(\boldsymbol{\upzeta} |\boldsymbol{\uppsi} \right) $ in (21) is a constant under the assumption that there is no prior information. This assumption means that all candidate distribution types are equally probable before the analysis using the given input data *x. Then, (21) can be simplified as

$$ P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} {,}^{*}\mathbf{x}\right)=\frac{L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)}{{\displaystyle {\sum}_{\mathbf{Z}}}L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)}. $$

(23)

There are many marginal distributions; however, it is impossible to cover all the types in the evaluation of (23). Hence, it is reasonable to set combinations of marginal distribution types and then evaluate the probability of each combination. In this paper, seven marginal distribution types with two distribution parameters are used. Probability density functions of the selected types are listed in Appendix A.

As explained earlier, there could be a case in which each input data subset has a different amount of data. In this case, equations can be generalized by replacing $ ND $ with $ {ND}_i $ for the i-th data subset *x _i.

3.4 Calculation of conservativeness level of PoF

Since all terms to evaluate the probability of the PoF in (11) are now available in (14), (15), and (23), the conservativeness level of the PoF in (13) at a PoF value $ {p}_F $ can be calculated. When $ {p}_F $ is given, (14) needs to be evaluated to calculate the conservativeness level. However, it is too complicated to solve (14) analytically because it involves the Dirac-delta measure. In addition, as the probability of the PoF is very likely not a standard distribution type, FORM, SORM or DRM is not applicable to the conservativeness level estimation. Therefore, the conservativeness level is calculated numerically using MCS as (Rubinstein and Kroese 2008)

$$ \begin{array}{l}{F}_{P_F}\left({\left.{p}_F\right|}^{*}\mathbf{x}\right)\hfill \\ {}\cong \frac{1}{NMC{S}_{\mathbf{Z}}\ NMC{S}_{\boldsymbol{\Psi}}}{\displaystyle {\int}_0^{p_F}{\displaystyle \sum_{n=1}^{NMC{S}_{\boldsymbol{\Psi}}}}}{\displaystyle \sum_{m=1}^{NMC{S}_{\mathbf{Z}}}}f\left(\left.\phi \right|{\boldsymbol{\upzeta}}^{(m)},{\boldsymbol{\uppsi}}^{(n)}{,}^{*}\mathbf{x}\right)d\phi \hfill \\ {}=\frac{1}{NMC{S}_{\mathbf{Z}}\ NMC{S}_{\boldsymbol{\Psi}}}{\displaystyle {\int}_0^1{\displaystyle \sum_{n=1}^{NMC{S}_{\boldsymbol{\Psi}}}}}{\displaystyle \sum_{m=1}^{NMC{S}_{\mathbf{Z}}}}{I}_{\left[0,{p}_F\right]}\left(\phi \right)\ \delta \left[\phi -{p}_F\left({\boldsymbol{\upzeta}}^{(m)},{\boldsymbol{\uppsi}}^{(n)}\right)\right]d\phi \hfill \\ {}=\frac{1}{NMC{S}_{\mathbf{Z}}\ NMC{S}_{\boldsymbol{\Psi}}}{\displaystyle \sum_{n=1}^{NMC{S}_{\boldsymbol{\Psi}}}}{\displaystyle \sum_{m=1}^{NMC{S}_{\mathbf{Z}}}}{I}_{\left[0,{p}_F\right]}\left[{p}_F\left({\boldsymbol{\upzeta}}^{(m)},{\boldsymbol{\uppsi}}^{(n)}\right)\right]\hfill \end{array} $$

(24)

where $ {I}_{\left[0,{p}_F\right]}\left(\phi \right) $ is an indicator function, the value of which is 1 when ϕ is between 0 and $ {p}_F $, and 0 otherwise. ζ ^(m) and ψ ⁽ⁿ⁾ are the m-th realization of (Z│$ \boldsymbol{\uppsi} $ , ^* x) and the n-th realization of (Ψ|*x), respectively. $ NMC{S}_{\boldsymbol{\Psi}} $ and $ {NMCS}_{\mathbf{Z}} $ are the MCS sample sizes for Ψ and Z, respectively. The overall procedure to evaluate (24) is shown in Fig. 2.

4 Design sensitivity of conservativeness level

The conservativeness level calculated in Section 3.4 can be used as a constraint in the RBDO process. The RBDO process is called confidence-based RBDO (C-RBDO) because we can have confidence that its optimum design has a certain amount of conservativeness even when there is limited input data. The constraint can be expressed as

$$ {F}_{P_F}\left(\left.{p}_F^{Tar}\right|*\mathbf{x}\right)\ge C{L}^{Tar} $$

(25)

where $ {p}_F^{Tar} $ and $ C{L}^{Tar} $ are the target PoF and the target conservativeness level, respectively, for the constraint. By using the two target values, C-RBDO is able to secure user-specified conservativeness even with a finite amount of data.

4.1 Design sensitivity

The design sensitivity of the conservativeness level is developed to provide an accurate and efficient direction for the design search in the C-RBDO process. The finite difference method (FDM) could be used to calculate the design sensitivity; however, it requires a great deal of computational time to calculate accurate design sensitivity. Hence, an analytical design sensitivity is necessary to perform C-RBDO efficiently.

The derivative of (13) with respect to a design variable $ {d}_i $ yields

$$ \begin{array}{l}\frac{\partial }{\partial {d}_i}{F}_{P_F}\left({p}_F\Big|{}^{*}\mathbf{x}\right)=\frac{\partial }{\partial {d}_i}{\displaystyle {\int}_0^{p_F}{\displaystyle \sum_{\mathbf{Z}}}}{\displaystyle {\int}_{\Omega_{\boldsymbol{\Psi}}}f\left({\left.\phi,\ \boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right|}^{*}\mathbf{x}\right)}\ d\boldsymbol{\uppsi} d\phi \hfill \\ {}={\displaystyle {\int}_0^{p_F}{\displaystyle \sum_{\mathbf{Z}}}}\left[{\displaystyle {\int}_{\Omega_{\boldsymbol{\Psi}}}f}\left(\phi,\ \boldsymbol{\upzeta},\ {\left.\boldsymbol{\uppsi} \right|}^{*}\mathbf{x}\right)\frac{\partial }{\partial {d}_i}\left\{ \ln P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} {,}^{*}\mathbf{x}\right)+ \ln\ f\left({\left.\boldsymbol{\uppsi} \right|}^{*}\mathbf{x}\right)\right\}\ d\boldsymbol{\uppsi} \right]d\phi .\hfill \end{array} $$

(26)

Compared with (13), there are two additional terms in (26). The first additional term is the log-derivative of the probability mass function of the input distribution types. This term is derived analytically in Section 4.2 and is defined for now as

$$ \frac{\partial }{\partial {d}_i} \ln P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} {,}^{*}\mathbf{x}\right)\equiv {S}_{\mathbf{Z}}\left(\boldsymbol{\upzeta}, \boldsymbol{\uppsi} {,}^{*}\mathbf{x},{d}_i\right) $$

(27)

As discussed earlier, a design change in the optimization process does not affect the probability of the input variance; therefore, the probability is independent of the design variable $ {d}_i $. Hence, the second additional term in (26), which is the log-derivative of the joint PDF of input distribution parameters, is derived when $ {d}_i $ is the design variable that corresponds to $ {X}_i $:

$$ \begin{array}{l}\frac{\partial }{\partial {d}_i} \ln f\left({\left.\boldsymbol{\uppsi} \right|}^{*}\mathbf{x}\right)=\frac{\partial }{\partial {d}_i} \ln f\left(\left.{\mu}_i\right|{\sigma}_i^2{,}^{*}\mathbf{x}\right)\hfill \\ {}\begin{array}{cc}\hfill \kern0.72em \hfill & \hfill \begin{array}{ccc}\hfill \begin{array}{cc}\hfill \hfill & \hfill \hfill \end{array}\hfill & \hfill \hfill & \hfill =\frac{ND\left({\mu}_i-{d}_i\right)}{\sigma_i^2}\equiv {S}_{\boldsymbol{\Psi}}\left({\mu}_i,{\sigma}_i^2,{d}_i,ND\right).\hfill \end{array}\hfill \end{array}\hfill \end{array} $$

(28)

Although the additional terms in (26) have analytical expression in (27) and (28), the design sensitivity cannot be calculated directly. The reason is the same as why (13) is evaluated using the MCS method in (24) of Section 3.4. Hence, the design sensitivity in (26) is calculated using the MCS method as well. The design sensitivity for the design variable $ {d}_i $ is

$$ \begin{array}{l}\frac{\partial }{\partial {d}_i}{F}_{P_F}\left({\left.{p}_F\right|}^{*}\mathbf{x}\right)\hfill \\ {}\cong \frac{1}{NMC{S}_{\mathbf{Z}}\ NMC{S}_{\boldsymbol{\Psi}}}{\displaystyle \sum_{n=1}^{NMC{S}_{\boldsymbol{\Psi}}}}{\displaystyle \sum_{m=1}^{NMC{S}_{\mathbf{Z}}}}\left\{{I}_{\left[0,{p}_F\right]}\left[{p}_F\left({\boldsymbol{\upzeta}}^{(m)},{\boldsymbol{\uppsi}}^{(n)}\right)\right]\right.\hfill \\ {}\left.\times \left[{S}_{\mathbf{Z}}\left({\boldsymbol{\upzeta}}^{(m)},{\boldsymbol{\uppsi}}^{(n)}{,}^{*}\mathbf{x},{d}_i\right)+{S}_{\boldsymbol{\Psi}}\left({\mu}_i^{(n)},{\sigma}_i^{2(n)},{d}_i,ND\right)\right]\right\}.\hfill \end{array} $$

(29)

(29) is quite similar to the conservativeness level of the PoF in (24). Only the additional terms in (27) and (28) have to be calculated at each MCS sample, and they are computationally inexpensive. Hence, the design sensitivity can be calculated with little additional effort during the calculation of the conservativeness level of the PoF. It is noted that the equations in this section can be easily generalized by replacing $ ND $ with $ {ND}_i $ for the i-th data subset *x _i in a case where each input data subset has a different amount of data.

4.2 Log-derivative of probability mass function of input distribution types

In Section 4.1, (27), which is the log-derivative of the probability mass function of the input distribution types to the design variable $ {d}_i $, is required for the design sensitivity of the conservativeness level in (29). The easiest way to calculate (27) is with the FDM because it only requires evaluations of the probability mass function of the input distribution types in (23) at the perturbed design and the current design. However, the FDM could be inaccurate when appropriate perturbation size is not provided. Moreover, there may be no unique perturbation size that is appropriate for all candidate distribution types. Hence, determining perturbation size could cause unnecessary difficulty and inaccuracy when calculating (27) using the FDM.

If analytical expressions of marginal PDFs are available, (27) could be derived analytically by taking the log-derivative of (23) with respect to the design variable $ {d}_i $. First, the expression of data in (6), (9) and (10) are recalled:

$$ {}^{*}{\mathbf{x}}_i={\mathbf{d}}_i{+}^{*}{\tilde{\mathbf{x}}}_i $$

(9)

where

$$ {}^{*}{\mathbf{x}}_i={\left[\begin{array}{cc}\hfill \begin{array}{cc}\hfill {}^{*}{x}_i^{(1)}\hfill & \hfill {}^{*}{x}_i^{(2)}\hfill \end{array}\hfill & \hfill \begin{array}{cc}\hfill \cdots \hfill & \hfill {}^{*}{x}_i^{(ND)}\hfill \end{array}\hfill \end{array}\right]}^T, $$

(6)

$$ {\mathbf{d}}_i={\left[\begin{array}{cc}\hfill \begin{array}{cc}\hfill {d}_i\hfill & \hfill {d}_i\hfill \end{array}\hfill & \hfill \begin{array}{cc}\hfill \cdots \hfill & \hfill {d}_i\hfill \end{array}\hfill \end{array}\right]}^T. $$

(10)

(9) indicates that each data point *x ^(j)_i is a function of the design variable $ {d}_i $ because the input data subset *x _i moves exactly the same amount as $ {d}_i $ moves while $ {}^{*}{\tilde{\mathbf{x}}}_i $ is invariant in an optimization process. Let h(*x) be a general function of the input data *x. Then, h(*x) contains the input data points *x ⁽¹⁾_i , …, *x ^(ND)_i in its expression. Therefore, the derivative of h(*x) with respect to $ {d}_i $ is the summation of the derivative of the function with respect to data *x ^(j)_i :

$$ \frac{\partial }{\partial {d}_i}h\left({}^{*}\mathbf{x}\right)={\displaystyle \sum_{j=1}^{ND}}\frac{\partial }{\partial^{*}{x}_i^{(j)}}h\left({}^{*}\mathbf{x}\right). $$

(30)

The log-derivative of the probability mass function of the input distribution types in (23) yields

$$ \begin{array}{l}\frac{\partial }{\partial {d}_i} \ln P\left(\left.\boldsymbol{\upzeta} \right|\boldsymbol{\uppsi} {,}^{*}\mathbf{x}\right)\hfill \\ {}=\frac{\partial }{\partial {d}_i}\left[ \ln L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)- \ln {\displaystyle {\sum}_{\mathbf{Z}}L}\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)\right]\hfill \\ {}=\frac{\partial }{\partial {d}_i} \ln L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)-\frac{1}{{\displaystyle {\sum}_{\mathbf{Z}}}L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)}{\displaystyle {\sum}_{\mathbf{Z}}\left[L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)\frac{\partial }{\partial {d}_i} \ln L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)\right]}.\hfill \end{array} $$

(31)

Therefore, the log-derivative of the likelihood function is necessary for (31). The log-derivative is derived using the relationship in (30) as

$$ \begin{array}{l}\frac{\partial }{\partial {d}_i} \ln L\left({}^{*}\mathbf{x};\boldsymbol{\upzeta},\ \boldsymbol{\uppsi} \right)={\displaystyle \sum_{m=1}^{ND}}\frac{\partial }{\partial^{*}{x}_i^{(m)}} \ln {f}_{X_i}\left({}^{*}\left.{x}_i^{(m)}\right|{\zeta}_i,{\mu}_j;,{\sigma}_j^2\right)\hfill \\ {}\begin{array}{cccc}\hfill \begin{array}{ccc}\hfill \hfill & \hfill \hfill & \hfill \hfill \end{array}\hfill & \hfill \hfill & \hfill \hfill & \hfill ={\displaystyle \sum_{m=1}^{ND}}\frac{1}{f_{X_i}\left({}^{*}\left.{x}_i^{(m)}\right|{\zeta}_i,{\mu}_j;,{\sigma}_j^2\right)}\frac{\partial {f}_{X_i}\left({}^{*}\left.{x}_i^{(m)}\right|{\zeta}_i,{\mu}_j;,{\sigma}_j^2\right)}{\partial^{*}{x}_i^{(m)}}.\hfill \end{array}\hfill \end{array} $$

(32)

In (32), the derivatives of the marginal PDF $ {f}_{X_i} $ are required. The derivatives of commonly used marginal PDFs are derived in Table 1 using the original expressions of marginal PDFs in Appendix A, Table 18. Using the derivatives, (32), the log-derivative of the probability mass function of the input distribution types can be obtained.

Table 1 Derivative of PDF

Conservative reliability-based design optimization method with insufficient input data

Abstract

Similar content being viewed by others

A novel probabilistic feasible region method for reliability-based design optimization with varying standard deviation

Reliability measure approach for confidence-based design optimization under insufficient input data

Extending SORA method for reliability-based design optimization using probability and convex set mixed models

1 Introduction

2 Probability of failure and insufficient input data

2.1 Probability of failure

2.2 Input data decomposition

3 Probability of PoF

3.1 Probability of PoF

3.2 Joint PDF of input distribution parameters

3.3 Probability mass function of input distribution types

3.4 Calculation of conservativeness level of PoF

4 Design sensitivity of conservativeness level

4.1 Design sensitivity

4.2 Log-derivative of probability mass function of input distribution types

5 Numerical example: 2-dimensional mathematical example

5.1 Conservativeness level calculation

5.2 Accuracy of design sensitivity of conservativeness level

5.3 Confidence-based RBDO

5.4 Convergence test of C-RBDO

5.5 Repeated test of C-RBDO

6 Numerical example: 11-dimensional vehicle side impact problem

6.1 Test cases

6.2 Candidate distribution type selection

6.3 Confidence-based RBDO result

7 Conclusion

Abbreviations

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendix A

Appendix A

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation