1 Introduction

Reliability analysis theory has been widely used in many engineering problems [19, 22]. Its most basic requirement is to calculate the failure probability. Presently, reliability methods mainly include numerical simulation methods and moment estimation methods. The most commonly used numerical simulation method is the crude Monte Carlo (MC) method [18]. However, since a large number of function calls are required, its efficiency is quite low. Moment estimation methods use the performance function values at some feature points to calculate moment information, through which the failure probability is approximated. The first- and second-order reliability methods are two classical examples [16]. These methods use a linear or quadratic function at the design point to approximate the real performance function, so their accuracy depends on the degree of nonlinearity of the performance function. For highly nonlinear problems, the accuracy might be low.
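To make this cost concrete, the following minimal sketch shows the crude MC estimator and its coefficient of variation; the performance function \(g\), the sample size, and all names are illustrative placeholders, not taken from any specific reference.

```python
import numpy as np

def crude_mc_pf(g, n_vars, n_samples=10**6, seed=0):
    """Estimate P_f = P{g(x) <= 0} for independent standard normal inputs."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, n_vars))
    pf = np.mean(g(x) <= 0.0)          # fraction of failed samples
    # The estimator's COV grows as P_f shrinks, which is why crude MC
    # needs so many performance-function calls for rare failure events.
    cov = np.sqrt((1.0 - pf) / (n_samples * pf)) if pf > 0 else np.inf
    return pf, cov
```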

To increase the efficiency and reduce the number of required function calls of the MC method, researchers often use surrogate models to fit the real performance function, such as Kriging [10, 35, 39, 40, 48, 52], neural networks [3, 23], radial basis functions [9, 26, 51] and support/relevance vector machines [19, 20, 24]. Presently, the most popular surrogate model might be the Kriging model, owing to its built-in error and uncertainty measures [18, 30, 38]. As one of the classical methods, AK-MCS [6] combines the U learning function and the MC method. Based on the idea of AK-MCS, different learning functions [25, 29, 39, 41, 49] and stopping criteria [14, 33] have also been proposed. However, MC-based surrogate methods have some shortcomings. First, the candidate sample size of MC is usually quite large. As the learning function must be evaluated for all candidate samples, MC may consume a large amount of computer memory for small failure probability problems [13]. Second, the candidate samples in MC are randomly generated in the whole sample space, which results in many unnecessary training points with a low contribution to the failure probability.

To solve the problems of crude MC, researchers often combine improved sampling strategies with the Kriging model, such as importance sampling (IS) [31, 50], subset simulation (SS) [2, 17, 32], line sampling [21, 27], region partition sampling methods [11, 27], directional sampling [47] and importance directional sampling [12]. These methods all have their own advantages. In this paper, we mainly focus on the IS methods. Table 1 lists the current IS-based surrogate model methods.

Table 1 Relevant IS-based surrogate model methods

As listed in Table 1, a commonly used strategy is to establish the IS density function at the design point. However, this strategy has some limitations. First, it is not suitable for reliability problems with multiple failure modes. Second, the design point calculation is usually a constrained optimization problem, and for performance functions with high nonlinearity, design points may not be easily obtained. Another widely used strategy is to establish the IS density function with a clustering method. This approach is suitable for multiple failure modes, but it requires an artificial assumption on the number of clusters, which is quite hard to make when the shape of the performance function is unknown. In addition, the U function is often adopted for the adaptive Kriging-based IS method. Its limitation is that the stopping criterion of the U function is too conservative, and it is established only from the Kriging prediction variance without the distribution information of the random samples. These shortcomings result in the selection of many unnecessary training samples with a low contribution to the failure probability and in low computational efficiency. Wen [34] and Xiao [36] have verified that, by considering the probability density function of the original distribution, the computational cost of active learning in MC-based Kriging methods is greatly reduced. In the IS method, as random samples are generated from the IS density function, the information of the IS density function should also be considered in active learning to select the most informative samples. Presently, the learning functions designed for the MC method are often applied directly to IS, and there is no dedicated research on adaptive learning functions for the IS method.

Since Meta-IS-AK combines the advantages of both Meta-IS and AK-IS, the purpose of this paper is to propose an improved Meta-IS-AK method for reliability analysis to further increase efficiency. First, the k-means clustering in the IS density function construction is improved by the silhouette plot. Through the mean silhouette value, the silhouette plot can identify the optimal number of clusters, thus solving the problem of an arbitrarily given number of clusters. Second, a novel learning function considering the characteristics of the IS density function is proposed for Kriging model updating. Then a novel stopping criterion is adopted to increase the efficiency of active learning. The proposed active learning strategy is established based on the variance of the failure probability caused by the Kriging prediction uncertainty, and it is specially designed for the IS method.

The rest of the paper is organized as follows: Sect. 2 introduces the concept of Meta-IS-AK. Section 3 introduces the proposed active learning Kriging model for IS. Section 4 introduces the proposed improvements for Meta-IS-AK. Section 5 summarizes the steps of the proposed method. Section 6 uses the numerical models to illustrate the proposed method. Section 7 draws the conclusions.

2 Reliability analysis based on Meta-IS-AK

Based on the Kriging model \(g_K \left( {{\varvec{x}}} \right)\), the IS density function \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) in Meta-IS-AK is defined as:

$$h_{{\varvec{X}}} \left( {{\varvec{x}}} \right) = {{\pi \left( {{\varvec{x}}} \right)f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)} / {P_{f\varepsilon } }}$$
(1)

where \(f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) is the joint probability density function of the input variables \({{\varvec{x}}} = \left( {x_1 ,x_2 , \cdots ,x_n } \right)\). \(\pi \left( {{\varvec{x}}} \right)\) is the probability classification function:

$$\pi \left( {{\varvec{x}}} \right) = P\left\{ {g_K \left( {{\varvec{x}}} \right) \le 0} \right\} = \phi \left( { - \frac{{\mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)$$
(2)

where \(\mu_{g_K } \left( {{\varvec{x}}} \right)\) and \(\sigma_{g_K } \left( {{\varvec{x}}} \right)\) are the mean and standard deviation of the Kriging predicted value, respectively. \(\phi \left(\bullet \right)\) is the cumulative distribution function of the standard normal distribution. \(P_{f\varepsilon }\) is the augmented failure probability, which is defined as:

$$P_{f\varepsilon } = \int_{R^n } {\pi \left( {{\varvec{x}}} \right)f_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}}}$$
(3)

The failure probability \(P_f\) based on Eq. (1) is calculated by:

$$P_f = \int_{R^n } {I_F \left( {{\varvec{x}}} \right)\frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}} h_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}} = \int_{R^n } {I_F \left( {{\varvec{x}}} \right)\frac{{P_{f\varepsilon } f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{\pi \left( {{\varvec{x}}} \right)f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}} h_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}} = P_{f\varepsilon } \alpha_{corr}$$
(4)
$$\alpha_{corr} = \int_{R^n } {\frac{{I_F \left( {{\varvec{x}}} \right)}}{{\pi \left( {{\varvec{x}}} \right)}}} h_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}}$$
(5)

where \(I_F \left( {{\varvec{x}}} \right)\) is the indicator function: if \(g_K \left( {{\varvec{x}}} \right) \le 0\), \(I_F \left( {{\varvec{x}}} \right) = 1\); otherwise \(I_F \left( {{\varvec{x}}} \right) = 0\). \(\alpha_{corr}\) is the correction factor. The estimates \({\mathop{P}\limits^{\frown}}_{f\varepsilon }\) and \({\mathop{\alpha }\limits^{\frown}}_{corr}\) and the corresponding coefficients of variation (COV) are calculated by:

$${\mathop{P}\limits^{\frown}}_{f\varepsilon } = \frac{1}{N_\varepsilon }\sum_{i = 1}^{N_\varepsilon } {\pi \left( {{{\varvec{x}}}_f^{\left( i \right)} } \right)}$$
(6-1)
$${\text{Var}}\left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right) = \frac{1}{N_\varepsilon - 1}\left( {\frac{1}{N_\varepsilon }\sum_{i = 1}^{N_\varepsilon } {\pi^2 \left( {{{\varvec{x}}}_f^{\left( i \right)} } \right) - {\mathop{P}\limits^{\frown}}_{f\varepsilon }^2 } } \right)$$
(6-2)
$${\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right) = {{\sqrt {{{\text{Var}}\left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right)}} } / {{\mathop{P}\limits^{\frown}}_{f\varepsilon } }}$$
(6-3)
$${\mathop{\alpha }\limits^{\frown}}_{corr} = \frac{1}{{N_{corr} }}\sum_{j = 1}^{N_{corr} } {\frac{{{\mathop{I}\limits^{\frown}}_F \left( {{{\varvec{x}}}_h^{\left( j \right)} } \right)}}{{\pi \left( {{{\varvec{x}}}_h^{\left( j \right)} } \right)}}}$$
(7-1)
$${\text{Var}}\left( {{\mathop{\alpha }\limits^{\frown}}_{corr} } \right) \approx \frac{1}{{N_{corr} - 1}}\left( {\frac{1}{{N_{corr} }}\sum_{j = 1}^{N_{corr} } {\frac{{{\mathop{I}\limits^{\frown}}_F \left( {{{\varvec{x}}}_h^{\left( j \right)} } \right)}}{{\pi^2 \left( {{{\varvec{x}}}_h^{\left( j \right)} } \right)}} - {\mathop{\alpha }\limits^{\frown}}_{corr}^2 } } \right)$$
(7-2)
$${\text{Cov}}\left( {{\mathop{\alpha }\limits^{\frown}}_{corr} } \right) = {{\sqrt {{{\text{Var}}\left( {{\mathop{\alpha }\limits^{\frown}}_{corr} } \right)}} } / {{\mathop{\alpha }\limits^{\frown}}_{corr} }}$$
(7-3)

where \({{\varvec{x}}}_f^{\left( i \right)}\) and \({{\varvec{x}}}_h^{\left( j \right)}\) are the random samples generated from \(f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) and \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\), respectively. \(N_\varepsilon\) and \(N_{corr}\) are the numbers of random samples and IS samples, respectively.
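As a minimal sketch, the estimators of Eqs. (6) and (7) can be written as follows; the Kriging mean and standard deviation arrays are assumed to come from whatever Kriging implementation is used, and all names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def pi_func(mu, sigma):
    """Probability classification function pi(x) = Phi(-mu/sigma), Eq. (2)."""
    return norm.cdf(-mu / sigma)

def estimate_pf_eps(mu_f, sigma_f):
    """Eq. (6): augmented failure probability and its COV, from the Kriging
    mean/std evaluated at N_eps samples drawn from f_X(x)."""
    pi = pi_func(np.asarray(mu_f), np.asarray(sigma_f))
    n = pi.size
    pf_eps = pi.mean()
    var = (np.mean(pi**2) - pf_eps**2) / (n - 1)
    return pf_eps, np.sqrt(var) / pf_eps

def estimate_alpha_corr(mu_h, sigma_h):
    """Eq. (7): correction factor and its COV from N_corr IS samples, using
    the Kriging-predicted indicator I_F (mean <= 0) and pi(x)."""
    mu_h, sigma_h = np.asarray(mu_h), np.asarray(sigma_h)
    pi = pi_func(mu_h, sigma_h)
    ratio = (mu_h <= 0.0) / pi     # I_F(x)/pi(x); note I_F^2 = I_F
    n = ratio.size
    alpha = ratio.mean()
    var = (np.mean(ratio**2) - alpha**2) / (n - 1)
    return alpha, np.sqrt(var) / alpha
```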

The estimated failure probability \({\mathop{P}\limits^{\frown}}_f\) and its COV are defined as:

$${\mathop{P}\limits^{\frown}}_f = {\mathop{P}\limits^{\frown}}_{f\varepsilon } {\mathop{\alpha }\limits^{\frown}}_{corr} ,{\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_f } \right) \approx \sqrt {{{\text{Cov}}^2 \left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right) + {\text{Cov}}^2 \left( {{\mathop{\alpha }\limits^{\frown}}_{corr} } \right)}}$$
(8)

\({\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_f } \right)\) should be less than a small constant \(\lambda_{Pf}\), usually taken as \(\lambda_{Pf} = 5\%\), to ensure the robustness of \({\mathop{P}\limits^{\frown}}_f\). The failure probability in Meta-IS-AK is calculated in two stages. In the first stage, the updating strategy of Meta-IS based on the k-means algorithm is adopted to approximate \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) and obtain IS samples. By setting the number of clusters \(K\), the clustering centers are taken as the added samples to update the Kriging model. When the leave-one-out estimate \({\mathop{\alpha }\limits^{\frown}}_{corrLOO}\) meets the convergence criterion, the first-stage Kriging model is obtained. \({\mathop{\alpha }\limits^{\frown}}_{corrLOO}\) is defined as:

$${\mathop{\alpha }\limits^{\frown}}_{corrLOO} = \frac{1}{m}\sum_{i = 1}^m {\frac{{I_F \left( {{{\varvec{x}}}_T^{\left( i \right)} } \right)}}{{P\left\{ {g_{K\left( {T/x_T^{\left( i \right)} } \right)} \left( {{{\varvec{x}}}_T^{\left( i \right)} } \right) \le 0} \right\}}}} = \frac{1}{m}\sum_{i = 1}^m {\frac{{I_F \left( {{{\varvec{x}}}_T^{\left( i \right)} } \right)}}{{\pi_{\left( {T/x_T^{\left( i \right)} } \right)} \left( {{{\varvec{x}}}_T^{\left( i \right)} } \right)}}}$$
(9)

where \(m\) is the current size of the training set. \(\pi_{\left( {T/x_T^{\left( i \right)} } \right)} \left( {{{\varvec{x}}}_T^{\left( i \right)} } \right)\) is the probability classification function constructed by the Kriging model trained without the \(i\)th training sample. As suggested by Zhu [53], the convergence criterion for \({\mathop{\alpha }\limits^{\frown}}_{corrLOO}\) and \(m\) is \({\mathop{\alpha }\limits^{\frown}}_{corrLOO} \in \left[ {0.1,10} \right] \cap m > 30\). If this condition is satisfied, \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) is considered convergent, and \({\mathop{P}\limits^{\frown}}_{f\varepsilon }\) can be calculated through Eq. (6) by generating random samples from \(f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\). The second stage updates the Kriging model based on the idea of AK-IS, using the generated IS samples and the U learning function. When the second-stage Kriging model has sufficient accuracy, \({\mathop{\alpha }\limits^{\frown}}_{corr}\) can be calculated through Eq. (7). Once \({\mathop{P}\limits^{\frown}}_{f\varepsilon }\) and \({\mathop{\alpha }\limits^{\frown}}_{corr}\) are calculated, \({\mathop{P}\limits^{\frown}}_f\) is easily obtained.
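As a hedged sketch of this first-stage convergence check, Eq. (9) can be computed as follows; `refit_kriging` is a placeholder callable standing in for whatever Kriging implementation is used.

```python
import numpy as np
from scipy.stats import norm

def alpha_corr_loo(x_train, g_train, refit_kriging):
    """Eq. (9): leave-one-out estimate of the correction factor.
    refit_kriging(x, y, x_new) -> (mu, sigma) is a placeholder."""
    m = len(g_train)
    terms = []
    for i in range(m):
        keep = np.arange(m) != i          # drop the ith training sample
        mu, sigma = refit_kriging(x_train[keep], g_train[keep], x_train[i])
        pi_i = norm.cdf(-mu / sigma)      # pi built without the ith point
        terms.append(float(g_train[i] <= 0.0) / pi_i)
    return float(np.mean(terms))

def first_stage_converged(alpha_loo, m):
    """First-stage criterion: alpha_corrLOO in [0.1, 10] and m > 30."""
    return 0.1 <= alpha_loo <= 10.0 and m > 30
```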

3 The proposed adaptive Kriging model for Meta-IS-AK

This paper proposes a novel active learning strategy considering the influence of \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) for Meta-IS-AK to select the most informative samples. The proposed method consists of two parts: the learning function and the stopping criterion. The two parts will be introduced below.

3.1 The learning function

Based on Eq. (2), \(I_F \left( {{\varvec{x}}} \right)\) predicted by the Kriging model follows a Bernoulli distribution with mean \(E\left[ {I_F \left( {{\varvec{x}}} \right)} \right]\) and variance \(V\left[ {I_F \left( {{\varvec{x}}} \right)} \right]\):

$$E\left[ {I_F \left( {{\varvec{x}}} \right)} \right] = \phi \left( {\frac{{ - \mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right),V\left[ {I_F \left( {{\varvec{x}}} \right)} \right] = \phi \left( {\frac{{ - \mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)\phi \left( {\frac{{\mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)$$
(10)

Based on Eq. (10), the IS-based failure probability \(P_f^r\) is also a random variable, which is caused by the prediction uncertainty of the Kriging model. The corresponding mean \(E\left[ {P_f^r } \right]\) and variance \(V\left[ {P_f^r } \right]\) can be calculated by:

$$\begin{aligned} E\left[ {P_f^r } \right] & = E\left[ {\int_{R^n } {I_F \left( {{\varvec{x}}} \right)\frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}h_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}}} } \right] = \int_{R^n } {E\left[ {I_F \left( {{\varvec{x}}} \right)} \right]\frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}h_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}}} \\ & = \int_{R^n } {\phi \left( {\frac{{ - \mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)\frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}h_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}}} = E\left[ {\phi \left( {\frac{{ - \mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)\frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}} \right] \, \\ \end{aligned}$$
(11)
$$\begin{aligned} V\left[ {P_f^r } \right] & = E\left[ {\left( {P_f^r - E\left[ {P_f^r } \right]} \right)^2 } \right] \\ & = E\left[ {\left( {\int_{R^n } {\left( {I_F \left( {{\varvec{x}}} \right) - E\left[ {I_F \left( {{\varvec{x}}} \right)} \right]} \right)\frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}h_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}}} } \right)\left( {\int_{R^n } {\left( {I_F \left( {{{\varvec{x}}}^{\prime}} \right) - E\left[ {I_F \left( {{{\varvec{x}}}^{\prime}} \right)} \right]} \right)\frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right)}}h_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right){\text{d}}{{\varvec{x}}}^{\prime}} } \right)} \right] \\ & = \int_{R^n } {\int_{R^n } {E\left[ {\left( {I_F \left( {{\varvec{x}}} \right) - E\left[ {I_F \left( {{\varvec{x}}} \right)} \right]} \right)\left( {I_F \left( {{{\varvec{x}}}^{\prime}} \right) - E\left[ {I_F \left( {{{\varvec{x}}}^{\prime}} \right)} \right]} \right)} \right]} } \frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right)}}h_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right){\text{d}}{{\varvec{x}}}{\text{d}}{{\varvec{x}}}^{\prime} \\ & = \int_{R^n } {\int_{R^n } {CV\left[ {I_F \left( {{\varvec{x}}} \right),I_F \left( {{{\varvec{x}}}^{\prime}} \right)} \right]\frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right)}}h_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right){\text{d}}{{\varvec{x}}}{\text{d}}{{\varvec{x}}}^{\prime}} } \\ \end{aligned}$$
(12)

where \(CV\left[ {I_F \left( {{\varvec{x}}} \right),I_F \left( {{{\varvec{x}}}^{\prime}} \right)} \right]\) is the covariance between \(I_F \left( {{\varvec{x}}} \right)\) and \(I_F \left( {{{\varvec{x}}}^{\prime}} \right)\). As suggested by Dang [4], the Cauchy–Schwarz inequality can be adopted to bound \(CV\left[ {I_F \left( {{\varvec{x}}} \right),I_F \left( {{{\varvec{x}}}^{\prime}} \right)} \right]\), that is:

$$CV\left[ {I_F \left( {{\varvec{x}}} \right),I_F \left( {{{\varvec{x}}}^{\prime}} \right)} \right] \le \sqrt {{V\left[ {I_F \left( {{\varvec{x}}} \right)} \right]}} \sqrt {{V\left[ {I_F \left( {{{\varvec{x}}}^{\prime}} \right)} \right]}}$$
(13)

Then Eq. (12) is changed to:

$$\begin{aligned} V\left[ {P_f^r } \right] & \le \int_{R^n } {\int_{R^n } {\sqrt {{V\left[ {I_F \left( {{\varvec{x}}} \right)} \right]}} \sqrt {{V\left[ {I_F \left( {{{\varvec{x}}}^{\prime}} \right)} \right]}} } \frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right)}}h_{{\varvec{X}}} \left( {{{\varvec{x}}}^{\prime}} \right){\text{d}}{{\varvec{x}}}{\text{d}}{{\varvec{x}}}^{\prime}} \\ & = \left( {\int_{R^n } {\sqrt {{V\left[ {I_F \left( {{\varvec{x}}} \right)} \right]}} \frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}h_{{\varvec{X}}} \left( {{\varvec{x}}} \right){\text{d}}{{\varvec{x}}}} } \right)^2 = \left( {E\left[ {\sqrt {{\phi \left( {\frac{{ - \mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)\phi \left( {\frac{{\mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)}} \frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}} \right]} \right)^2 \\ \end{aligned}$$
(14)

Based on Eq. (14), the proposed novel variance-based learning function \(VL\left( {{\varvec{x}}} \right)\) is defined as:

$$VL\left( {{\varvec{x}}} \right) = \sqrt {{\phi \left( {\frac{{ - \mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)\phi \left( {\frac{{\mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right)}} \frac{{f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}{{h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)}}$$
(15)

\(VL\left( {{\varvec{x}}} \right)\) reflects the uncertainty of the Kriging prediction at the sample point \({{\varvec{x}}}\): a larger \(VL\left( {{\varvec{x}}} \right)\) means a greater variance and hence greater uncertainty. Therefore, the adding-point criterion is defined as \({{\varvec{x}}}^* = \arg \max \left[ {VL\left( {{\varvec{x}}} \right)} \right]\). Compared with the existing learning functions, the greatest advantage of \(VL\left( {{\varvec{x}}} \right)\) is that \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) is included, which more fully reflects the distribution characteristics of the IS samples. In Meta-IS-AK, one of the key steps is to calculate \(\alpha_{corr}\). As \(\alpha_{corr}\) is related to \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\), the proposed \(VL\left( {{\varvec{x}}} \right)\) learning function is particularly suitable for Meta-IS-AK.
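A minimal sketch of Eq. (15) and the adding-point rule follows; the Kriging mean/standard deviation and the densities \(f_X\) and \(h_X\) are assumed to be pre-evaluated at the candidate IS samples, and all names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def vl(mu, sigma, f_pdf, h_pdf):
    """Eq. (15): variance-based learning function at the candidate
    IS samples; f_pdf and h_pdf are f_X(x) and h_X(x) there."""
    u = mu / sigma
    return np.sqrt(norm.cdf(-u) * norm.cdf(u)) * f_pdf / h_pdf

def next_training_point(candidates, mu, sigma, f_pdf, h_pdf):
    """Adding-point rule: x* = argmax VL(x) over the candidate pool."""
    return candidates[np.argmax(vl(mu, sigma, f_pdf, h_pdf))]
```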

3.2 The stopping criterion

The stopping criterion of the U learning function is adopted in the Meta-IS-AK methods. However, this criterion is too conservative, which leads to many redundant training samples. To solve this problem, this paper adopts two different stopping criteria for Meta-IS-AK. The first is the proposed COV-based stopping criterion, which considers the mean and standard deviation of \(P_f^r\). Based on Eqs. (11) and (14), the estimates \(E\left[ {{\mathop{P}\limits^{\frown}}_f^r } \right]\) and \(V\left[ {{\mathop{P}\limits^{\frown}}_f^r } \right]\) can be calculated by:

$$E\left[ {{\mathop{P}\limits^{\frown}}_f^r } \right] = \frac{1}{{N_{IS} }}\sum_{i = 1}^{N_{IS} } {\phi \left( {\frac{{ - \mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right)\frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}}$$
(16)
$$V\left[ {{\mathop{P}\limits^{\frown}}_f^r } \right] = \left[ {\frac{1}{{N_{IS} }}\sum_{i = 1}^{N_{IS} } {\sqrt {{\phi \left( {\frac{{ - \mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right)\phi \left( {\frac{{\mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right)}} \frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}} } \right]^2$$
(17)

where \(N_{IS}\) is the number of generated IS samples. Then the estimated COV of \(P_f^r\) can be calculated by:

$${\text{COV}}\left[ {{\mathop{P}\limits^{\frown}}_f^r } \right] = \frac{{\sum_{i = 1}^{N_{IS} } {\sqrt {{\phi \left( {\frac{{ - \mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right)\phi \left( {\frac{{\mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right)}} \frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}} }}{{\sum_{i = 1}^{N_{IS} } {\phi \left( {\frac{{ - \mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right)\frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}} }}$$
(18)

\({\text{COV}}\left[ {{\mathop{P}\limits^{\frown}}_f^r } \right]\) should be sufficiently small to ensure accuracy. The COV-based stopping criterion is thus defined as \({\text{COV}}\left[ {{\mathop{P}\limits^{\frown}}_f^r } \right] < \lambda_{thr}\), where \(\lambda_{thr}\) is a small constant defining the allowable \({\text{COV}}\left[ {{\mathop{P}\limits^{\frown}}_f^r } \right]\).
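The criterion of Eq. (18) reduces to a ratio of two weighted sums over the IS samples, as in this sketch; the default threshold 0.01 reflects the value favored by the examples in Sect. 6 and is otherwise an assumption.

```python
import numpy as np
from scipy.stats import norm

def cov_pf_r(mu, sigma, f_pdf, h_pdf):
    """Eq. (18): estimated COV of P_f^r over the N_IS importance samples."""
    u = mu / sigma
    w = f_pdf / h_pdf                    # importance weights f_X/h_X
    num = np.sum(np.sqrt(norm.cdf(-u) * norm.cdf(u)) * w)
    den = np.sum(norm.cdf(-u) * w)
    return num / den

def cov_criterion_met(mu, sigma, f_pdf, h_pdf, lam_thr=0.01):
    """Stopping rule COV[P_f^r] < lambda_thr."""
    return cov_pf_r(mu, sigma, f_pdf, h_pdf) < lam_thr
```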

The second is the existing error-based stopping criterion, first proposed by Wang [33] for AK-MCS. To make it more suitable for Meta-IS-AK, this paper adopts the improved error-based stopping criterion proposed by Yun [44], which considers the effect of \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\). For a given IS sample, the probability that the predicted sign of \(g_K \left( {{\varvec{x}}} \right)\) is wrong is written as:

$$P\left( {I_F^w = 1} \right) = \phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right|} \right)$$
(19)

where \(I_F^w = 1\) denotes the event that the predicted sign is wrong. Based on Eq. (19), \(I_F^w\) is also a Bernoulli random variable with mean \(E\left[ {I_F^w = 1} \right]\) and variance \(V\left[ {I_F^w = 1} \right]\):

$$E\left[ {I_F^w = 1} \right] = \phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right|} \right),V\left[ {I_F^w = 1} \right] = \phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right|} \right)\left[ {1 - \phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{\varvec{x}}} \right)}}{{\sigma_{g_K } \left( {{\varvec{x}}} \right)}}} \right|} \right)} \right]$$
(20)

As suggested by Ref. [40], the number of samples \({\mathop{S}\limits^{\frown}}_s\) that are predicted to be safe by the Kriging model but actually fail follows a normal distribution with mean \(\mu_{{\mathop{S}\limits^{\frown}}_s }\) and standard deviation \(\sigma_{{\mathop{S}\limits^{\frown}}_s }\), that is:

$$\begin{aligned} & {\mathop{S}\limits^{\frown}}_s \sim N\left( {\mu_{{\mathop{S}\limits^{\frown}}_s } ,\sigma_{{\mathop{S}\limits^{\frown}}_s } } \right) \\ & \mu_{{\mathop{S}\limits^{\frown}}_s } = \sum_{i = 1}^{{\mathop{N}\limits^{\frown}}_s } {\frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}\phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right|} \right),} \, \sigma_{{\mathop{S}\limits^{\frown}}_s } = \sqrt {{\sum_{i = 1}^{{\mathop{N}\limits^{\frown}}_s } {\frac{{f_{{\varvec{X}}}^2 \left( {{{\varvec{x}}}_i } \right)}}{{h_{{\varvec{X}}}^2 \left( {{{\varvec{x}}}_i } \right)}}\phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right|} \right)\left[ {1 - \phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right|} \right)} \right]} }} \\ \end{aligned}$$
(21)

where \({\mathop{N}\limits^{\frown}}_s\) is the number of samples predicted to be safe. Reference [33] suggested that the number of samples \({\mathop{S}\limits^{\frown}}_f\) that are predicted to fail by the Kriging model but are actually safe can be approximately represented by a Poisson distribution. As the number of failed samples generated by Meta-IS-AK is significantly larger than that generated by the MC method, \({\mathop{S}\limits^{\frown}}_f\) is assumed in this paper to follow a normal distribution with mean \(\mu_{{\mathop{S}\limits^{\frown}}_f }\) and standard deviation \(\sigma_{{\mathop{S}\limits^{\frown}}_f }\), that is:

$$\begin{aligned} & {\mathop{S}\limits^{\frown}}_f \sim N\left( {\mu_{{\mathop{S}\limits^{\frown}}_f } ,\sigma_{{\mathop{S}\limits^{\frown}}_f } } \right) \\ & \mu_{{\mathop{S}\limits^{\frown}}_f } = \sum_{i = 1}^{{\mathop{N}\limits^{\frown}}_f } {\frac{{f_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}{{h_{{\varvec{X}}} \left( {{{\varvec{x}}}_i } \right)}}\phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right|} \right),} \, \sigma_{{\mathop{S}\limits^{\frown}}_f } = \sqrt {{\sum_{i = 1}^{{\mathop{N}\limits^{\frown}}_f } {\frac{{f_{{\varvec{X}}}^2 \left( {{{\varvec{x}}}_i } \right)}}{{h_{{\varvec{X}}}^2 \left( {{{\varvec{x}}}_i } \right)}}\phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right|} \right)\left[ {1 - \phi \left( { - \left| {\frac{{\mu_{g_K } \left( {{{\varvec{x}}}_i } \right)}}{{\sigma_{g_K } \left( {{{\varvec{x}}}_i } \right)}}} \right|} \right)} \right]} }} \\ \end{aligned}$$
(22)

where \({\mathop{N}\limits^{\frown}}_f\) is the number of samples predicted to fail.

To ensure the accuracy of \(g_K \left( {{\varvec{x}}} \right)\), the difference between \({\mathop{N}\limits^{\frown}}_f\) and \(N_f\), where \(N_f\) is the number of samples that actually fail, should be sufficiently small. The maximum relative error \(MRE\) should be less than a small constant \(\lambda_{thr}\); it is bounded by:

$$MRE = \left| {\frac{{{\mathop{N}\limits^{\frown}}_f }}{{N_f }} - 1} \right| \le \max \left( {\left| {\frac{{{\mathop{N}\limits^{\frown}}_f }}{{{\mathop{N}\limits^{\frown}}_f - {\mathop{S}\limits^{\frown}}_f^u }} - 1} \right|,\left| {\frac{{{\mathop{N}\limits^{\frown}}_f }}{{{\mathop{N}\limits^{\frown}}_f + {\mathop{S}\limits^{\frown}}_s^u }} - 1} \right|} \right)$$
(23)

where \({\mathop{S}\limits^{\frown}}_f^u\) and \({\mathop{S}\limits^{\frown}}_s^u\) are the upper bounds of \({\mathop{S}\limits^{\frown}}_f\) and \({\mathop{S}\limits^{\frown}}_s\), respectively, which are calculated by [46]:

$${\mathop{S}\limits^{\frown}}_f^u = \mu_{{\mathop{S}\limits^{\frown}}_f } + \delta \sigma_{{\mathop{S}\limits^{\frown}}_f } ,{\mathop{S}\limits^{\frown}}_s^u = \mu_{{\mathop{S}\limits^{\frown}}_s } + \delta \sigma_{{\mathop{S}\limits^{\frown}}_s }$$
(24)

where \(\delta\) is a constant reflecting the confidence interval of \({\mathop{S}\limits^{\frown}}_f\) and \({\mathop{S}\limits^{\frown}}_s\). If \(\delta = 3\), the likelihood that \({\mathop{S}\limits^{\frown}}_f\) and \({\mathop{S}\limits^{\frown}}_s\) lie in the regions \(\left[ {\mu_{{\mathop{S}\limits^{\frown}}_f } - \delta \sigma_{{\mathop{S}\limits^{\frown}}_f } ,\mu_{{\mathop{S}\limits^{\frown}}_f } + \delta \sigma_{{\mathop{S}\limits^{\frown}}_f } } \right]\) and \(\left[ {\mu_{{\mathop{S}\limits^{\frown}}_s } - \delta \sigma_{{\mathop{S}\limits^{\frown}}_s } ,\mu_{{\mathop{S}\limits^{\frown}}_s } + \delta \sigma_{{\mathop{S}\limits^{\frown}}_s } } \right]\) reaches 99.73% for each. When \(\lambda_{thr}\) is small enough, the Kriging model has sufficient accuracy. Therefore, the stopping criterion is defined as \(MRE < \lambda_{thr}\), with \(MRE\) estimated by the upper bound in Eq. (23).
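A sketch of the whole error-based check, Eqs. (19)-(24), is given below; treating \({\mathop{N}\limits^{\frown}}_f\) as the importance-weighted count of predicted failures is an assumption of this sketch, made so that it is consistent with the weighted sums in Eqs. (21) and (22).

```python
import numpy as np
from scipy.stats import norm

def mre_bound(mu, sigma, f_pdf, h_pdf, delta=3.0):
    """Upper bound on the maximum relative error over the IS samples."""
    u = mu / sigma
    p_wrong = norm.cdf(-np.abs(u))     # wrong-sign probability, Eq. (19)
    w = f_pdf / h_pdf                  # importance weights f_X/h_X
    pred_fail = mu <= 0.0

    def upper(mask):                   # mean + delta*std, Eqs. (21)-(24)
        mean = np.sum(w[mask] * p_wrong[mask])
        std = np.sqrt(np.sum(w[mask]**2 * p_wrong[mask] * (1.0 - p_wrong[mask])))
        return mean + delta * std

    s_f_u = upper(pred_fail)           # predicted failed, actually safe
    s_s_u = upper(~pred_fail)          # predicted safe, actually failed
    n_f = np.sum(w[pred_fail])         # weighted count of predicted failures
    return max(abs(n_f / (n_f - s_f_u) - 1.0),
               abs(n_f / (n_f + s_s_u) - 1.0))
```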

4 Silhouette plot method

In the Meta-IS-AK method, the k-means clustering algorithm is adopted to cluster the important samples and calculate \({\mathop{\alpha }\limits^{\frown}}_{corrLOO}\). To overcome the shortcoming of choosing \(K\) artificially, the silhouette plot method is introduced. Based on the silhouette value, the silhouette plot can determine whether the cluster assignment of each sample is reasonable. The silhouette value is defined as:

$$S\left( i \right) = \frac{\min \left( b \right) - a}{{\max \left[ {a,\min \left( b \right)} \right]}},\quad i = 1,2, \ldots m$$
(25)

where \(S\left( i \right)\) is the silhouette value of the \(i\)th point, \(a\) is the average distance between the \(i\)th point and the other points in the same cluster, and \(b\) is a vector whose elements are the average distances between the \(i\)th point and the points in each of the other clusters. A larger \(S\left( i \right)\) indicates a more reasonable clustering. In this paper, the mean silhouette value is used to determine \(K\): the clustering corresponding to the largest \(\frac{1}{m}\sum\nolimits_{i = 1}^m {S\left( i \right)}\) is selected, and the corresponding cluster centers are added to the training set.

It is noted that the silhouette plot is applied after the candidate cluster centers are obtained; once the important sample set is generated, the optimal clustering is determined and \({\mathop{\alpha }\limits^{\frown}}_{corrLOO}\) is calculated. This is repeated until the condition \({\mathop{\alpha }\limits^{\frown}}_{corrLOO} \in \left[ {0.1,10} \right] \cap m > 30\) is satisfied, so in the proposed method \(K\) changes continuously as the Kriging model is updated. The recommended range for the number of clusters in this article is \(\left[ {m_f ,\max \left( {2m_f ,n} \right)} \right]\), where \(m_f\) is the number of failure modes and \(n\) is the number of input variables. The feasibility of this strategy will be verified through the numerical examples, and a minimal sketch of the selection procedure is given below.
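The sketch uses scikit-learn's k-means and mean silhouette value; the lower bound of 2 clusters is enforced here because the silhouette value is undefined for a single cluster, and all names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def select_k(is_samples, m_f, n_vars, seed=0):
    """Try K in [m_f, max(2*m_f, n_vars)] and keep the clustering with the
    largest mean silhouette value, Eq. (25)."""
    best_k, best_score, best_centers = None, -np.inf, None
    for k in range(max(2, m_f), max(2 * m_f, n_vars) + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(is_samples)
        score = silhouette_score(is_samples, km.labels_)   # mean S(i)
        if score > best_score:
            best_k, best_score, best_centers = k, score, km.cluster_centers_
    return best_k, best_centers   # centers become new training points
```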

5 Summary of the proposed method

Based on the previous sections, this paper proposes an improved Meta-IS-AK method for reliability analysis with the proposed variance-based learning function and COV-based stopping criterion, which is called the Meta-IS-VL method. The specific steps are summarized as follows:

  • Step 1: Transform the random variables into the standard normal space and establish the initial Kriging model. As suggested by Zhang [49], an initial training set of \(N_{ini} = \max \left( {12,n} \right)\) samples is generated by the Sobol sequence in the interval \([-5,5]\). The real performance function values of these samples are calculated, and the initial Kriging model is established.

  • Step 2: Generate IS samples and update the Kriging model. Based on Eq. (1), \(N_{corr}\) IS samples under the current \(g_K \left( {{\varvec{x}}} \right)\) can be generated through the MCMC method.

  • Step 3: Based on the k-means algorithm, the candidate centroids are obtained. Calculate \(S\left( i \right)\) for all IS samples. The \(K\) cluster centers corresponding to the largest \(\frac{1}{m}\sum_{i = 1}^m {S\left( i \right)}\) are added to the training set to update the Kriging model.

  • Step 4: Calculate \({\mathop{\alpha }\limits^{\frown}}_{corrLOO}\) and judge the convergence criterion. If \({\mathop{\alpha }\limits^{\frown}}_{corrLOO} \in \left[ {0.1,10} \right] \cap m > 30\), output \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) and turn to Step 5. Otherwise, return to Step 2.

  • Step 5: Generate \(N_\varepsilon\) samples based on \(f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\). Calculate \({\mathop{P}\limits^{\frown}}_{f\varepsilon }\) and \({\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right)\) based on the first-stage Kriging model. If \({\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right) < \lambda_{f\varepsilon }\), output \({\mathop{P}\limits^{\frown}}_{f\varepsilon }\) and \({\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right)\). Otherwise, expand the size of \(N_\varepsilon\) and repeat Step 5.

  • Step 6: Update the Kriging model. Based on the previous steps, the first-stage Kriging model and the IS samples are obtained. Based on the proposed variance-based learning function, the Kriging model is updated through the selected optimal training samples.

  • Step 7: Judge the stopping criterion. If the stopping criterion defined by Eq. (18) or Eq. (23) is satisfied, output the second-stage Kriging model and turn to Step 8. Otherwise, return to Step 6 and re-update the Kriging model.

  • Step 8: Calculate \({\mathop{\alpha }\limits^{\frown}}_{corr}\) based on the second-stage Kriging model. If \({\text{Cov}}\left( {{\mathop{\alpha }\limits^{\frown}}_{corr} } \right) < \lambda_{corr}\), output \({\mathop{\alpha }\limits^{\frown}}_{corr}\) and turn to Step 9. Otherwise, return to Step 2 and expand the size of \(N_{corr}\).

  • Step 9: Calculate \({\mathop{P}\limits^{\frown}}_f\) using Eq. (8).

In Step 5 and Step 8, \(\lambda_{f\varepsilon }\) and \(\lambda_{corr}\) are both small constants that define the allowable COV. In this paper, \(\lambda_{corr} = 2\%\) and \(\lambda_{f\varepsilon } = 4\%\) are suggested, for the following reasons. First, this choice guarantees that \({\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_f } \right) < 5\%\), and 5% is usually defined as the maximum allowable COV of the failure probability. Second, \({\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right)\) is derived from the random samples generated by \(f_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\). For problems with a small failure probability, \(N_\varepsilon\) is much larger than \(N_{corr}\). Although this does not require additional function calls, it still requires a large data storage space. To reduce the required computer memory, the allowable value of \({\text{Cov}}\left( {{\mathop{P}\limits^{\frown}}_{f\varepsilon } } \right)\) can be larger than that of \({\text{Cov}}\left( {{\mathop{\alpha }\limits^{\frown}}_{corr} } \right)\).

The proposed method is a further development of Ref. [53]. The following improvements are made: (1) The k-means algorithm combined with the silhouette plot is adopted to calculate \({\mathop{\alpha }\limits^{\frown}}_{corrLOO}\) in the first-stage Kriging model construction. Compared with k-means alone, the proposed strategy can determine the optimal number of centroids without an arbitrary assumption, which gives it stronger data robustness. (2) A novel learning function considering the characteristics of \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\) is proposed to update the Kriging model in the second stage. Compared with the U function, the proposed learning function considers the characteristics of the IS density function, making it more suitable for Meta-IS-AK. (3) The COV-based stopping criterion is introduced considering the distribution characteristics of \(h_{{\varvec{X}}} \left( {{\varvec{x}}} \right)\), which reduces the number of training samples and improves the efficiency of active learning in the second stage of Kriging model construction.

6 Numerical examples

In this section, five numerical examples are used to illustrate the performance of the proposed Meta-IS-VL method. The MC method is used as the benchmark, and several existing methods are used for comparison. The error-based and the proposed COV-based stopping criteria are both adopted to compare performance; the corresponding variants are called Meta-IS-VL-ESC and Meta-IS-VL-COV, respectively. Each method is run 10 times, and the mean values are taken as the final results. Considering that \(\lambda_{thr}\) affects convergence and accuracy, \(\lambda_{thr} = 0.01, 0.05,{\text{ and }}0.1\) are tested to obtain the optimal \(\lambda_{thr}\). In this work, the efficiency of each method is measured by the number of required function calls. The robustness is evaluated through the ranges of the failure probability estimates and of the numbers of required function calls.

6.1 A classical series system

This section studies a classical series system [53], as shown in Eq. (26). \(x_1\) and \(x_2\) are independent standard normal variables. This example is used to elaborate on the intermediate calculation process of the method.

$$G_1 = \min \left\{ \begin{gathered} 3 + {{\left( {x_1 - x_2 } \right)^2 } / {10}} - {{\left( {x_1 + x_2 } \right)} / {\sqrt {2} }} \hfill \\ 3 + {{\left( {x_1 - x_2 } \right)^2 } / {10}} + {{\left( {x_1 + x_2 } \right)} / {\sqrt {2} }} \hfill \\ \left( {x_1 - x_2 } \right) + {7 / {\sqrt {2} }} \hfill \\ \left( {x_2 - x_1 } \right) + {7 / {\sqrt {2} }} \hfill \\ \end{gathered} \right.$$
(26)
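For reference, Eq. (26) can be implemented directly, e.g. in Python; such a function is what the surrogate-based methods call only at the selected training points, while the MC benchmark evaluates it for every sample.

```python
import numpy as np

def g1(x):
    """Series system of Eq. (26); x is an (N, 2) array of samples."""
    x1, x2 = x[:, 0], x[:, 1]
    b1 = 3 + (x1 - x2)**2 / 10 - (x1 + x2) / np.sqrt(2)
    b2 = 3 + (x1 - x2)**2 / 10 + (x1 + x2) / np.sqrt(2)
    b3 = (x1 - x2) + 7 / np.sqrt(2)
    b4 = (x2 - x1) + 7 / np.sqrt(2)
    return np.minimum.reduce([b1, b2, b3, b4])  # series system: weakest branch
```

Feeding `g1` to a crude MC routine with a large sample, as in the sketch in Sect. 1, gives the benchmark estimate against which Table 2 compares.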

The results of the different methods are listed in Table 2. It can be seen that for the proposed Meta-IS-VL, as \(\lambda_{thr}\) increases, the number of required function calls decreases, but the relative error increases. When \(\lambda_{thr} = 0.1\), the relative errors of both the error- and COV-based stopping criteria are greater than 2%. Therefore, the suitable \(\lambda_{thr}\) for the proposed Meta-IS-VL should be smaller than 0.05. The required function calls of Meta-IS-AK and Meta-AK-IS2 are 88.2 and 128.6, respectively. The required function calls of Meta-IS-VL-ESC and Meta-IS-VL-COV are 51.4 and 72.3, respectively, with relative errors of 1.64% and 1.48%. The results show that the proposed method can significantly improve efficiency without losing accuracy. In addition, the mean number of required function calls of Meta-IS-VL-ESC is smaller than that of Meta-IS-VL-COV, so in terms of mean values, the efficiency of the COV-based stopping criterion is lower than that of the error-based stopping criterion. The boxplots of the failure probability and the required function calls of the proposed Meta-IS-VL are shown in Fig. 1. The failure probability calculated by Meta-IS-VL-ESC ranges between 2.1e−3 and 2.4e−3, while that of Meta-IS-VL-COV ranges between 2.18e−3 and 2.3e−3. The results show that the robustness of Meta-IS-VL-COV is higher than that of Meta-IS-VL-ESC. Table 3 shows the calculated parameters of the proposed Meta-IS-VL when \(\lambda_{thr} = 0.01\).

Table 2 Results of failure probability under different methods in Example 1
Fig. 1 Boxplots of failure probability and numbers of required function calls

Table 3 Results of calculation parameters in Example 1

When the Kriging model finishes updating in the first stage, that is, when the condition \({\mathop{\alpha }\limits^{\frown}}_{corrLOO} \in \left[ {0.1,10} \right] \cap m > 30\) is satisfied and \(N_{corr}\) important samples are generated, the silhouette plots under different \(K\) are shown in Fig. 2, and the mean \(S\left( i \right)\) values are listed in Table 4. When \(K > 4\), \(S\left( i \right)\) takes negative values; when \(K = 4\), the mean \(S\left( i \right)\) is the largest. Therefore, the optimal \(K\) is 4. Thus, the optimal number of clusters can be determined with the help of the silhouette plot, effectively solving the problem of assuming \(K\) arbitrarily in Meta-IS-AK.

Fig. 2 Silhouette plots of different \(K\)

Table 4 Mean silhouette value in Example 1

Figure 3 illustrates the fitting accuracy of the limit state boundary (LSB) and the distribution of the samples. It can be seen that when the condition \({\mathop{\alpha }\limits^{\frown}}_{corrLOO} \in \left[ {0.1,10} \right] \cap m > 30\) is satisfied, the accuracy of the fitted LSB is still quite low. When the Kriging updating driven by the learning function is finished, the fitting accuracy of Meta-IS-VL-COV is higher than that of Meta-IS-VL-ESC, as Meta-IS-VL-ESC fails to identify a corner area. From the boxplots of the failure probability, it can be seen that although the relative error of the mean failure probability is small, this phenomenon may lead to less robust results. Figure 4 illustrates the iteration curves of the performance function values of the added training samples and the convergence behavior during the second-stage construction of the Kriging model. It can be seen that once the number of training samples exceeds 30, the second-stage Kriging model starts to be built through the learning function. Compared with the initial and first-stage training samples, the performance function values of the second-stage training samples are much closer to 0. The results show that the proposed learning function can obtain training samples with a large contribution to the failure probability.

Fig. 3 Fitting of the LSB and the distribution of training samples

Fig. 4 Performance function values of the training samples and the convergence performance in Example 1

In addition, the influence of the initial sample size on the results is studied. \(\lambda_{thr} = 0.01\) is used, as it corresponds to the highest accuracy and robustness. The results are shown in Table 5. It can be seen that when \(N_{ini} = 6\), the number of function calls is the smallest, while the relative error is the largest. When \(N_{ini} = 12\) or 18, the relative errors and the numbers of function calls show little difference. When \(N_{ini} = 24\), the relative error differs little from that when \(N_{ini} = 12\) or 18; however, the number of function calls is much higher. This conclusion holds for both stopping criteria. Therefore, the adopted strategy for the initial sample size is reasonable in this example.

Table 5 Results of the proposed Meta-IS-VL method with different initial sample size

6.2 An oscillator system

An oscillator system from Song [28] is used in this section, as shown in Fig. 5. The performance function is defined in Eq. (27), and the random variables are listed in Table 6. This example is used to verify the applicability of the method to a nonlinear performance function.

$$G_2 = 3r - \left| {\frac{2F_1 }{{c_1 + c_2 }}\sin \left( {\sqrt {{\frac{{c_1 + c_2 }}{m}}} \frac{t_1 }{2}} \right)} \right|$$
(27)
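As a sketch, Eq. (27), written with the natural frequency \(\omega_0 = \sqrt{(c_1+c_2)/m}\) as in the standard form of this benchmark, can be implemented directly; the column order of the sample array is an assumption here, and the actual distributions are those of Table 6.

```python
import numpy as np

def g2(x):
    """Oscillator performance function, Eq. (27); assumed column order
    (m, c1, c2, r, F1, t1) with distributions given in Table 6."""
    m, c1, c2, r, F1, t1 = (x[:, i] for i in range(6))
    w0 = np.sqrt((c1 + c2) / m)        # natural frequency
    return 3 * r - np.abs(2 * F1 / (c1 + c2) * np.sin(w0 * t1 / 2))
```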
Fig. 5 An oscillator system

Table 6 Random variables parameters of Example 2

The results are shown in Table 7. The boxplots of the failure probability and the function calls of the proposed method are shown in Fig. 6. It can be seen that the required function calls of Meta-IS-VL-ESC and Meta-IS-VL-COV when \(\lambda_{thr} = 0.01\) are 57.4 and 70.5, respectively, with relative errors of 0.36% and 0.57%. In terms of mean values, the efficiency of the proposed COV-based stopping criterion is thus lower than that of the error-based stopping criterion. However, the boxplots show that the failure probability range of the error-based stopping criterion is wider than that of the COV-based stopping criterion, which illustrates that the robustness of the proposed COV-based stopping criterion is higher. The required function calls of Meta-IS-AK are 78.9. Compared with the traditional Meta-IS-AK, the proposed method can significantly improve efficiency without losing accuracy. In addition, when \(\lambda_{thr} = 0.01\), the relative errors of the two stopping criteria are both smaller than 1%; when \(\lambda_{thr} = 0.05\), the relative errors are all greater than 3%; and when \(\lambda_{thr} = 0.1\), the relative errors are larger than 8%. The results show that the suitable \(\lambda_{thr}\) in this example should be 0.01.

Table 7 Results of failure probability under different methods in Example 2
Fig. 6 Boxplots of failure probability and required function calls

Table 8 shows the calculated parameters of the proposed Meta-IS-VL method under the different stopping criteria. Figure 7 shows the performance function values of the training samples and the convergence behavior during the second-stage Kriging model updating. It can be seen that the performance function values of the training samples added in the second stage are much closer to 0 than those of the initial samples. Therefore, the proposed variance-based learning function can effectively select training samples around the LSB.

Table 8 Results of calculation parameters in Example 2
Fig. 7 Performance function values of the training samples and the convergence performance in Example 2

When the first-stage updating of the Kriging model is finished, the mean \(S\left( i \right)\) values are listed in Table 9. It can be seen that the mean \(S\left( i \right)\) decreases as \(K\) increases, and \(K = 2\) corresponds to the largest mean \(S\left( i \right)\). Therefore, the optimal \(K\) in this example is 2. The silhouette plots under different \(K\) are shown in Fig. 8.

Table 9 Mean silhouette value in Example 2
Fig. 8 Silhouette plots of different \(K\)

The influence of the initial sample size when \(\lambda_{thr} = 0.01\) is studied, as shown in Table 10. It can be seen that when \(N_{ini} = 6\), the number of function calls is the smallest, while the relative error is the largest. When \(N_{ini} = 12\) or 18, the relative errors and the numbers of function calls show little difference. When \(N_{ini} = 24\), the relative error is the smallest; however, the number of function calls is also the highest. This conclusion holds for both stopping criteria. Therefore, the adopted initial sample size strategy is suitable in this example.

Table 10 Results of the proposed Meta-IS-VL method with different initial sample size

6.3 A cantilever beam-bar system

A cantilever beam-bar system from Yun [45] is studied in this section, as shown in Fig. 9. The performance function is defined as:

$$G_3 = \min \left\{ \begin{gathered} \max \left\{ \begin{gathered} T - {{5X} / {16}} \hfill \\ M - {L / X} \hfill \\ \end{gathered} \right. \hfill \\ \max \left\{ \begin{gathered} M - {{3LX} / 8} \hfill \\ M - {{LX} / 8} \hfill \\ \end{gathered} \right. \hfill \\ \max \left\{ \begin{gathered} M - {{3LX} / 8} \hfill \\ M + 2LT - LX \hfill \\ \end{gathered} \right. \hfill \\ \end{gathered} \right.$$
(28)

where \(L = 5\). \(M\), \(T\), and \(X\) are normal variables with means \(\mu_M = 1000, \mu_T = 110, \mu_X = 150\) and standard deviations \(\sigma_M = 300, \sigma_T = 20, \sigma_X = 30\), respectively. This example is used to evaluate the applicability of the proposed method to a series-parallel hybrid system. The results are listed in Table 11. It can be seen that when \(\lambda_{thr} = 0.01\), the relative errors of Meta-IS-VL-ESC and Meta-IS-VL-COV are 1.29% and 1.27%, respectively, and the required function calls are 125.8 and 127.4, respectively. The results show that the proposed method has high accuracy and that the efficiency of the two stopping criteria is basically the same. The number of required function calls of Meta-IS-AK is 177.3, which is much larger than that of the proposed method. This illustrates that the traditional U learning function and its stopping criterion lead to many redundant training samples. The required function calls of Meta-AK-IS2 are 169.2, so its efficiency is also much lower than that of the proposed method. In addition, when \(\lambda_{thr} = 0.05\), the relative errors of Meta-IS-VL under the two stopping criteria are all greater than 3%; when \(\lambda_{thr} = 0.1\), they are larger than 10%. The results show that the suitable \(\lambda_{thr}\) in this example should be 0.01.
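To make the series-parallel structure concrete, a direct sketch of Eq. (28) follows; the column order (M, T, X) of the sample array is an assumption.

```python
import numpy as np

def g3(x, L=5.0):
    """Series-parallel system of Eq. (28); x holds (M, T, X) samples."""
    M, T, X = x[:, 0], x[:, 1], x[:, 2]
    p1 = np.maximum(T - 5 * X / 16, M - L / X)
    p2 = np.maximum(M - 3 * L * X / 8, M - L * X / 8)
    p3 = np.maximum(M - 3 * L * X / 8, M + 2 * L * T - L * X)
    return np.minimum.reduce([p1, p2, p3])  # series of three parallel blocks
```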

Fig. 9 A cantilever beam-bar system

Table 11 Results of failure probability under different methods in Example 3

The boxplots of the failure probability and the number of required function calls are shown in Fig. 10. It can be seen that the ranges of the failure probability for the two stopping criteria show little difference. The mean number of required function calls of the error-based stopping criterion is smaller than that of the COV-based stopping criterion. However, the range widths of the function calls under the COV- and error-based stopping criteria are 48 and 95, respectively. Therefore, the robustness of the proposed COV-based stopping criterion is higher than that of the error-based stopping criterion. Table 12 lists the calculated parameters of the proposed Meta-IS-VL method under the different stopping criteria. Figure 11 shows the performance function values of the training samples and the convergence behavior of the second-stage Kriging model.

Fig. 10 Boxplots of failure probability and required function calls

Table 12 Results of calculation parameters in Example 3
Fig. 11 Performance function values of the training samples and the convergence performance of the second-stage Kriging model in Example 3

This example has four basic failure modes, which means that the candidate \(K\) is between 4 and 8. The silhouette plots are shown in Fig. 12, and the mean \(S\left( i \right)\) values are listed in Table 13. It can be seen that \(K = 7\) corresponds to the largest mean \(S\left( i \right)\). Therefore, the optimal \(K\) in this example is 7.

Fig. 12 Silhouette plots of different \(K\)

Table 13 Mean silhouette value in Example 3

The results for different initial sample sizes when \(\lambda_{thr} = 0.01\) are shown in Table 14. It can be seen that when \(N_{ini} = 6\), the number of function calls is the smallest, while the relative error is the largest. When \(N_{ini} = 12\) or 18, the relative errors and the numbers of function calls show little difference. When \(N_{ini} = 24\), the relative error differs little from that when \(N_{ini} = 12\) or 18; however, the number of function calls is higher. This conclusion holds for both stopping criteria. Therefore, the applicability of the initial sample size strategy is verified in this example.

Table 14 Results of the proposed Meta-IS-VL method with different initial sample size

6.4 Latch lock mechanism of hatch

The latch lock mechanism from Ling [15] is studied in this section. The structure is shown in Fig. 13, and the information on the random variables is listed in Table 15. As reported in Ling [15], the MC result is 2.1880e−7. This example is used to verify the performance of the proposed method for a problem with a small failure probability.

Fig. 13 A latch lock mechanism of hatch

Table 15 Random variables parameters of Example 4

All variables follow independent normal distributions, and the performance function is defined as:

$$G_4 = r\cos \left( {\alpha_1 } \right) + \sqrt {{L_1^2 - \left( {e - r\sin \left( {\alpha_1 } \right)} \right)^2 }} + L_2 - 270$$
(29)
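A direct sketch of Eq. (29) follows; the column order of the sample array and the use of radians for \(\alpha_1\) are assumptions, with the distribution parameters given in Table 15.

```python
import numpy as np

def g4(x):
    """Latch lock performance function, Eq. (29); assumed column order
    (r, alpha_1, L1, e, L2), alpha_1 in radians (see Table 15)."""
    r, a1, L1, e, L2 = (x[:, i] for i in range(5))
    return r * np.cos(a1) + np.sqrt(L1**2 - (e - r * np.sin(a1))**2) + L2 - 270.0
```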

The results are listed in Table 16. The relative errors of Meta-IS-AK and Meta-AK-IS2 are 1.77% and 3.45%, respectively. When \(\lambda_{thr} = 0.01\) or 0.05, the relative errors of Meta-IS-VL-COV and Meta-IS-VL-ESC are all lower than 3%; when \(\lambda_{thr} = 0.1\), the relative errors are greater than 5%. The results show that the suitable \(\lambda_{thr}\) for the proposed method should be smaller than 0.05. The accuracy of Meta-IS-VL-ESC, Meta-IS-VL-COV, and Meta-IS-AK is basically the same. However, the required function calls of Meta-IS-VL-ESC and Meta-IS-VL-COV when \(\lambda_{thr} = 0.01\) are only 40.4 and 39.4, respectively. The efficiency of the proposed method is thus much higher than that of Meta-IS-AK and Meta-AK-IS2, and the efficiency of the two stopping criteria is basically the same. This example illustrates that for small failure probability problems, the proposed method can further improve the efficiency of Meta-IS-AK without losing accuracy. The boxplots of the failure probability and the required function calls are shown in Fig. 14. It can be seen that the range of the failure probability calculated by the COV-based stopping criterion is narrower than that by the error-based stopping criterion. The parameters calculated by Meta-IS-VL are listed in Table 17.

Table 16 Results of failure probability under different methods in Example 4
Fig. 14 Boxplots of failure probability and required function calls

Table 17 Results of calculation parameters in Example 4

The mean \(S\left( i \right)\) values obtained when the first-stage updating of the Kriging model is finished are listed in Table 18, and the silhouette plots are shown in Fig. 15. The maximum mean silhouette value corresponds to \(K = 3\). The calculation parameters of Meta-IS-VL-ESC and Meta-IS-VL-COV are shown in Table 17, and the performance function values of the training samples and the convergence performance are shown in Fig. 16.

Table 18 Mean silhouette value in Example 4
Fig. 15 Silhouette plots of different \(K\)

Table 19 Results of the proposed Meta-IS-VL method with different initial sample size
Fig. 16 Performance function values of the training samples and the convergence performance in Example 4

The results corresponding to different initial sample sizes are shown in Table 19. When \(N_{ini} = 6\), the relative error is the largest, while the number of required function calls is the smallest. The numbers of required function calls when \(N_{ini} = 18\) and \(N_{ini} = 12\) show little difference, and the relative errors are all lower than 2%. When \(N_{ini} = 24\), the number of required function calls is much higher than when \(N_{ini} = 18\) or 12, while the relative errors in the three cases show little difference.

6.5 A high dimensional problem

This section studies a classical high-dimensional problem [30], as shown in Eq. (30).

$$G_{5} = n + 3\sigma \sqrt {n} - \sum_{i = 1}^n {x_i }$$
(30)

where \(x_i \sim N\left( {0,\sigma } \right)\). This example is used to study the applicability of the method to high-dimensional problems. Different values of \(n\) are studied, including 10, 20, and 30. The results are shown in Tables 20, 21 and 22. It can be seen that when \(n = 10\), Meta-IS-VL still has high accuracy under \(\lambda_{thr} = 0.01\), as the relative errors are all lower than 2%. The numbers of function calls of Meta-IS-VL-COV and Meta-IS-VL-ESC are both lower than that of Meta-IS-AK, which means the efficiency is also higher. In addition, with the increase of \(\lambda_{thr}\), the number of required function calls decreases, while the relative error increases. The above conclusions are the same as in the previous examples. However, when \(n = 20\), the relative errors of the methods listed in Table 21 are larger than 10%. When \(n = 30\), the methods fail to converge entirely. The results show that the proposed method may not be suitable for high-dimensional problems.

Table 20 Results of failure probability under different methods in Example 5 when \(n = 10\)
Table 21 Results of failure probability under different methods in Example 5 when \(n = 20\)
Table 22 Results of failure probability under different methods in Example 5 when \(n = 30\)

7 Conclusions

An improved Meta-IS-AK method is proposed to further improve efficiency, which is called the Meta-IS-VL method. First, the k-means clustering algorithm in the IS density function construction is improved by the silhouette plot to solve the problem of assuming the number of clusters arbitrarily. Second, a novel learning function is proposed that considers the distribution characteristics of the IS density function and thus fully reflects the impact of the constructed optimal IS density function. The COV information is adopted to define a novel stopping criterion for active learning, and the traditional error-based stopping criterion is also adopted to validate the performance. The conclusions are summarized as follows:

  1. From the iteration curves, it can be seen that the performance function values of the samples added in the second stage of Kriging model updating are much closer to 0 than those of the initial samples. This means that the proposed learning function can effectively select the optimal training samples around the LSB.

  2. The influence of the convergence criterion and the initial sample size on the failure probability is studied. The results show that the suitable \(\lambda_{thr}\) should be smaller than 0.05, and the adopted initial sample size strategy is suitable for problems with variable dimensions less than 10.

  3. In Examples 1-4, the mean required function calls of Meta-IS-VL-ESC are 51.4, 57.4, 125.8, and 40.4, respectively, and those of Meta-IS-VL-COV are 72.3, 70.2, 127.4, and 39.4, respectively. The relative errors of the proposed method under the two stopping criteria are all lower than 2%. Compared with the traditional Meta-IS-AK, the proposed Meta-IS-VL can significantly improve efficiency without losing accuracy.

  4. Analysis of the mean required function calls indicates that the efficiency of the proposed COV-based stopping criterion is somewhat lower than that of the existing error-based stopping criterion. However, the boxplots across Examples 1-4 reveal a wider range of failure probability estimates for the error-based criterion than for the COV-based criterion, and in Examples 2-3 the error-based criterion also shows a larger range in the number of function calls. Hence, the proposed stopping criterion exhibits higher robustness.

  5. While effective in various scenarios, the proposed method exhibits limitations in handling high-dimensional problems. As demonstrated in Example 5, the method shows commendable accuracy in estimating failure probabilities with 10 variables. However, once the number of variables reaches 20, a notable increase in relative error becomes apparent. Consequently, the method is primarily useful in scenarios where the variable dimension is below 10.