1 Introduction

In the simulation process, there are many sources of uncertainty in material properties, loads, laboratory tests, the simulation model and other respects. All of these uncertain factors have a significant impact on the safety of slope engineering, so it is difficult to reliably assess the performance of a slope with a factor of safety alone. To quantify the possibility of slope failure, reliability analyses based on probability theory and mathematical statistics, in which many of the uncertainties are treated as random variables, are adopted to reasonably reflect the actual safety of the slope [1].

According to the geotechnical models and the properties of the calculation methods, slope reliability analysis methods can be divided into traditional methods and intelligent algorithms. The traditional methods of slope reliability analysis include the mean-value first-order second-moment (MVFOSM) method [2, 3], advanced first-order second-moment (AFOSM) method [4], second-order reliability methods [5, 6], response surface methods [7], Monte Carlo simulations (MCSs) [8, 9], and other categories [10]. Generally, the main shortcomings of the traditional methods are the need to differentiate the performance function and the large number of calculations required. Therefore, in recent decades, combining new intelligent algorithms with traditional methods has helped to improve the efficiency and accuracy of slope reliability analysis. The artificial neural network method has been applied to slope reliability analysis by many scholars [11,12,13], but it has some limitations, such as the large number of training samples required, unstable results, local minima, and overfitting problems, as well as the inability of its performance function to be explicitly expressed.

As the key and most rapidly developing class of intelligent algorithms, machine learning is also widely used in slope reliability analysis. Some scholars have attempted to improve the support vector machine (SVM) method and to explore its applications [14,15,16,17,18,19]. Tan [20] studied the failure probability using the SVM and radial basis function neural network methods with an MCS. To obtain the reliability index of a slope, He [21] considered the SVM result as a response surface, and the SVM method exhibited good results compared with those of the first-order second-moment (FOSM), second-order reliability, and MCS methods. Kang [22] studied slope stability as a system reliability problem using Gaussian process regression and Latin hypercube sampling. In addition, Zhao [23, 24] employed the SVM to explicitly express the performance function and derive its partial derivative, and a slope reliability analysis was then carried out with the MVFOSM, providing a new approach for slope reliability analysis. Recently, the relevance vector machine (RVM) has been introduced for fitting and prediction problems in various fields; compared with the SVM, it has a sparse model structure, relatively low computational complexity, and fewer parameters [9, 25]. Subsequently, the RVM was combined with the MVFOSM as a modification of the traditional method to analyze slope reliability [25]. The abovementioned research proved that the factor of safety estimated by the RVM is feasible, which offers some important insights into slope reliability analysis. However, there is still an obvious error in the result of the RVM–MVFOSM: the MVFOSM takes the mean point as its expansion point, and this center point can deviate significantly from the original limit state surface.
In general, the AFOSM is more reliable than the MVFOSM, but its calculation process requires the partial derivative of the performance function or the finite-difference method. Machine learning has made it possible to derive this partial derivative analytically, so the combination of machine learning and the AFOSM deserves to be studied in slope reliability analysis.

To increase the efficiency and accuracy of reliability analysis, the AFOSM is combined with the RVM by deriving the partial derivative of the RVM in slope reliability analysis. On the one hand, a multi-kernel function is introduced to establish the multi-kernel relevance vector machine (MKRVM) to calculate the factor of safety, and the multi-kernel parameters are optimized by harmony search (HS). On the other hand, the AFOSM is adopted in place of the MVFOSM to improve the reliability of the calculation. Moreover, the partial derivative of the multi-kernel function is derived for the first time in this paper, which plays a vital role in applying the AFOSM to the reliability analysis. Therefore, the MKRVM–AFOSM is established, which exploits the advantages of both the MKRVM and the AFOSM to calculate slope reliability. Finally, the stabilities of a single-layer slope and a multilayer slope are calculated to verify the results of the MKRVM–AFOSM.

2 Basic principles

2.1 Principle of MKRVM algorithm

The RVM, introduced by Tipping [26], is a probabilistic learning model based on a Bayesian framework. Compared with the SVM, the advantages of the RVM are its sparse model structure, relatively low computational complexity, and fewer parameters [27]. In addition, the kernel function of the RVM does not need to satisfy Mercer’s condition, and the RVM outputs not only a mean value, namely, a quantitative prediction, but also its variance. Compared with the single-kernel RVM, the MKRVM combines the advantages of different types of kernel functions, so it has higher accuracy.

Given a training dataset \(\left\{ {{\varvec{x}}_{n} ,t_{n} } \right\}_{n = 1}^{N}\), where N is the total number of samples and \({\varvec{x}}_{n}\) and \(t_{n}\) are the input vector and the output target value of the nth sample, respectively, the output \(t_{n}\) can be written as follows:

$$t_{n} = y({\varvec{x}}_{n} ;{\varvec{w}}) + \varepsilon_{n} ,\quad y({\varvec{x}};{\varvec{w}}) = \sum\limits_{i = 1}^{N} {\omega_{i} K({\varvec{x}},{\varvec{x}}_{i} )} + \omega_{0} ,$$
(1)

where y is the model output. \({\varvec{w}} = (\omega_{0} ,\omega_{1} , \ldots ,\omega_{N} )^{T}\) is the parameter vector, in which \(\omega_{0}\) is the bias and \(\omega_{n} \, (n = 1,2, \ldots ,N)\) are the weights. \(\varepsilon_{n}\) is the error and obeys a Gaussian distribution with mean 0 and variance \(\sigma^{2}\). \(K({\varvec{x}},{\varvec{x}}_{n} )\) is a kernel function. If the \(t_{n}\) are independently distributed, the likelihood of the complete dataset can be expressed as follows:

$$p({\varvec{t}}\left| {{\varvec{w}},\sigma^{2} ) = (2\uppi \sigma^{2} )^{ - N/2} } \right.{\rm exp}\left( { - \frac{{\left\| {{\varvec{t}} -{\varvec{\varPhi}}^{T} {\varvec{w}}} \right\|^{2} }}{{2\sigma^{2} }}} \right),$$
(2)

where \({\varvec{t}} = (t_{1} ,t_{2} , \ldots ,t_{N} )^{T}\) is the output target vector and \({\varvec{\varPhi}}\) is the basis function matrix: \({\varvec{\varPhi}}= [{\varvec{\phi}}({\varvec{x}}_{1} ),{\varvec{\phi}}({\varvec{x}}_{2} ), \ldots ,{\varvec{\phi}}({\varvec{x}}_{N} )]\), where \({\varvec{\phi}}({\varvec{x}}_{n} ) = [1,K({\varvec{x}}_{n} ,{\varvec{x}}_{1} ),K({\varvec{x}}_{n} ,{\varvec{x}}_{2} ), \ldots ,K({\varvec{x}}_{n} ,{\varvec{x}}_{N} )]^{T} .\)

In the process of using the maximum likelihood function to estimate w, to prevent overfitting, \(\omega_{n}\) is assumed to obey the Gaussian distribution G with mean 0 and variance \(\alpha_{n}^{ - 1}\):

$$p\left( {{\varvec{w}}\left| \alpha \right.} \right) = \prod\limits_{n = 0}^{N} {G\left( {\omega_{n} \left| {0,\alpha_{n}^{ - 1} } \right.} \right)} ,$$
(3)

in which \(\alpha\), which relates only to \({\varvec{w}}\), is a hyperparameter vector that controls the values of \({\varvec{w}}\).

It is assumed that \(\alpha\) and \(\sigma^{2}\) obey the gamma prior probability, so the posterior distribution of w can be given according to the defining prior distribution and likelihood distribution:

$$\begin{aligned} p({\varvec{w}}\left| {{\varvec{t}},\alpha ,\sigma^{2} )} \right. & = \frac{{p({\varvec{t}}\left| {{\varvec{w}},\sigma^{2} )p({\varvec{w}}\left| {\alpha )} \right.} \right.}}{{p({\varvec{t}}\left| {\alpha ,\sigma^{2} )} \right.}} \\ & = (2\uppi )^{ - M/2} \left|{\varvec{\varSigma}}\right|^{ - 1/2} \exp \left( { - \frac{1}{2}({\varvec{w}} - {\varvec{\mu}})^{\rm{T}}{\varvec{\varSigma}}^{ - 1} ({\varvec{w}} - {\varvec{\mu}})} \right), \\ \end{aligned}$$
(4)

where M is the number of basis functions (the dimension of \({\varvec{w}}\)), \({\varvec{\varSigma}}= (\sigma^{ - 2}{\varvec{\varPhi}}^{\rm{T}}{\varvec{\varPhi}}+ {\varvec{A}})^{ - 1}\) and \({\varvec{\mu}} = \sigma^{ - 2} {{\varvec{\varSigma}}{\varvec{\varPhi}}}^{\rm{T}} {\varvec{t}}\) are the posterior covariance matrix and mean, respectively, and \({\varvec{A}} = {\text{diag}}(\alpha_{0} ,\alpha_{1} , \ldots ,\alpha_{N} )\) is a diagonal matrix.

The marginal likelihood, which is controlled by the hyperparameters \(\alpha\) and \(\sigma^{2}\), can be obtained by integrating out w in the likelihood of the training samples:

$$p\left( {{\varvec{t}}\left| {\alpha ,\sigma^{2} } \right.} \right) = (2\uppi )^{ - N/2} \left|{\varvec{\varOmega}}\right|^{ - 1/2} \exp \left( {\frac{{ - {\varvec{t}}^{\rm{T}}{\varvec{\varOmega}}^{ - 1} {\varvec{t}}}}{2}} \right),$$
(5)

where \({\varvec{\varOmega}} = \sigma^{2} {\varvec{E}} + {\varvec{\varPhi}}{\varvec{A}}^{ - 1} {\varvec{\varPhi}}^{T}\) and E is the identity matrix.

To solve for \(\alpha\), the partial derivative of expression (5) is set to zero and solved by an iterative method; during this process, many \(\alpha_{n} \to \infty\). The corresponding \(\omega_{n}\) are then equal to zero, and the corresponding basis vectors are deleted. Hence, a sparse RVM is constructed. To calculate the hyperparameters and the noise variance, the fast sequential sparse Bayesian learning algorithm is adopted to calculate the covariance matrix \({\varvec{\varSigma}}\) [28, 29].
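
The paper relies on the fast sequential algorithm of [28, 29]; as a simpler illustration of how \(\alpha\) and \(\sigma^{2}\) can be re-estimated, the classic update rules from Tipping's original formulation can be sketched as follows (an illustrative Python sketch with plain NumPy; the function name and interface are our own, not from the paper):

```python
import numpy as np

def rvm_update(Phi, t, alpha, sigma2):
    """One iteration of the classic RVM hyperparameter re-estimation.

    Phi: N x M design matrix, t: N targets, alpha: M hyperparameters,
    sigma2: noise variance. (The paper itself uses the faster sequential
    sparse Bayesian learning algorithm [28, 29]; this is the simpler
    fixed-point variant for illustration.)
    """
    A = np.diag(alpha)
    # Posterior covariance and mean of w, as in Eq. (4).
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + A)
    mu = Sigma @ Phi.T @ t / sigma2
    # gamma_n measures how well-determined each weight is.
    gamma = 1.0 - alpha * np.diag(Sigma)
    alpha_new = gamma / (mu ** 2)
    sigma2_new = np.sum((t - Phi @ mu) ** 2) / (len(t) - np.sum(gamma))
    return alpha_new, sigma2_new, mu, Sigma
```

Iterating these updates drives many \(\alpha_{n}\) toward infinity, which is exactly the pruning mechanism that produces the sparse set of relevance vectors.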

After the values of the hyperparameters \(\alpha\) and \(\sigma^{2}\) have been obtained, for an arbitrary input \({\varvec{x}}^{ * }\), the predictive mean \(y^{ * }\) and variance \(\sigma^{2 * }\) can be evaluated to describe the prediction uncertainty. With \(\sigma_{{{\text{MP}}}}\) denoting the optimal value of the hyperparameter \(\sigma\), the mean and the variance are defined as follows:

$$y^{ * } = {\varvec{\mu}}^{T} {\varvec{\phi}}\left( {{\varvec{x}}^{ * } } \right),$$
(6)
$$\sigma^{2 * } = \sigma_{{{\text{MP}}}}^{2} + {\varvec{\phi}}\left( {{\varvec{x}}^{ * } } \right)^{T}{\varvec{\varSigma}}{\varvec{\phi}}\left( {{\varvec{x}}^{ * } } \right).$$
(7)

For the RVM, the kernel function type determines how the samples are mapped from the low-dimensional space to the high-dimensional space, and the values of the kernel parameters have a great influence on the calculation results. At present, the following kernel functions are commonly used: (1) local kernel functions with a strong local interpolation ability, such as the Gauss kernel function shown in Eq. (8), and (2) global kernel functions with a strong generalization ability, such as the polynomial kernel function shown in Eq. (9) [30]. Considering the characteristics of slope stability analysis, the calculation model needs a very strong local interpolation ability because the factor of safety can change greatly with small changes in one variable. At the same time, the range of each variable is large, so the calculation model should also have some generalization ability. Therefore, the multi-kernel function shown in Eq. (10) is introduced to establish the MKRVM, which combines the advantages of the abovementioned kernel functions:

$$K({\varvec{x}},{\varvec{x}}_{n} ) = \exp \left( { - \left| {{\varvec{x}} - {\varvec{x}}_{n} } \right|^{2} /d^{2} } \right),$$
(8)
$$K({\varvec{x}},{\varvec{x}}_{n} ) = (\eta ({\varvec{x}}\, \cdot \,{\varvec{x}}_{n} ) + r)^{q} ,$$
(9)
$$K({\varvec{x}},{\varvec{x}}_{n} ) = m \times \exp \left( { - \left| {{\varvec{x}} - {\varvec{x}}_{n} } \right|^{2} /d^{2} } \right) + (1 - m) \times (\eta ({\varvec{x}}\, \cdot \,{\varvec{x}}_{n} ) + r)^{q} ,$$
(10)

where d, the bandwidth parameter, is the Gauss kernel parameter; \(\eta\), r and q are the polynomial kernel parameters; and m is the ratio parameter of the multi-kernel function. Compared with a single kernel function, the multi-kernel function does not significantly increase the computational workload, so the computational efficiency of the MKRVM remains stable. The multi-kernel function can maintain a good balance between local interpolation and generalization, and there are no prescribed values for the five kernel parameters d, \(\eta\), r, q, and m, so they must be determined by optimization.
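
Under the notation above, the three kernels of Eqs. (8)–(10) can be sketched directly (an illustrative Python sketch; the function names are our own):

```python
import numpy as np

def gauss_kernel(x, xn, d):
    """Local Gauss kernel, Eq. (8): strong local interpolation ability."""
    return np.exp(-np.sum((x - xn) ** 2) / d ** 2)

def poly_kernel(x, xn, eta, r, q):
    """Global polynomial kernel, Eq. (9): strong generalization ability."""
    return (eta * np.dot(x, xn) + r) ** q

def multi_kernel(x, xn, m, d, eta, r, q):
    """Multi-kernel function, Eq. (10): a convex combination of the Gauss
    and polynomial kernels weighted by the ratio parameter m."""
    return m * gauss_kernel(x, xn, d) + (1 - m) * poly_kernel(x, xn, eta, r, q)
```

Setting m = 1 recovers the pure Gauss kernel and m = 0 the pure polynomial kernel, which is why the fitted value of m later indicates which component dominates.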

2.2 Principle of HS algorithm

All the kernel parameters have a significant influence on the MKRVM, so the HS algorithm is used to find the global optimum of the five kernel parameters. The HS is a heuristic algorithm with a global random searching ability [31] that simulates the process by which musicians repeatedly adjust the tones of their instruments to produce a harmony. The flow chart of the HS is shown in Fig. 1:

Fig. 1 The flow chart of the HS

The implementation steps of the HS are as follows:

  1.

    The main parameters of the HS are initialized: the number of variables L, the maximum number of iterations Tmax, the harmony memory size HMS, the rate of a new tone from harmony memory HMCR, the readjustment rate of the tone PAR, and the readjustment bandwidth of the tone bw.

  2.

    The harmony memory HM is initialized: the initial harmonies are randomly generated and then saved. The HM stores the harmonies and their fitness values \(f({\varvec{Z}}^{i} )\quad (i = 1,2, \ldots ,{\text{HMS}})\), where the number of harmonies is always equal to the HMS. \({\varvec{Z}}^{i} = \left( {z_{1}^{i} ,z_{2}^{i} , \ldots ,z_{k}^{i} } \right)\) is a harmony, \(z_{j}^{i} \, (j = 1,2, \ldots ,k)\) are its tones, and k is the number of input variables. The HM is expressed as follows:

    $$\user2{HM = }\left[ {\begin{array}{*{20}c} {{\varvec{Z}}^{1} } &\mid & {f({\varvec{Z}}^{1} )} \\ {{\varvec{Z}}^{2} } &\mid & {f({\varvec{Z}}^{2} )} \\ \vdots &\mid & \vdots \\ {{\varvec{Z}}^{{{\text{HMS}}}} } &\mid & {f({\varvec{Z}}^{{{\text{HMS}}}} )} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {z_{1}^{1} } & {z_{2}^{1} } & \cdots & {z_{k}^{1} } &\mid & {f({\varvec{Z}}^{1} )} \\ {z_{1}^{2} } & {z_{2}^{2} } & \cdots & {z_{k}^{2} } &\mid & {f({\varvec{Z}}^{2} )} \\ \vdots & \vdots & \ddots & \vdots &\mid & \vdots \\ {z_{1}^{{{\text{HMS}}}} } & {z_{2}^{{{\text{HMS}}}} } & \cdots & {z_{k}^{{{\text{HMS}}}} } &\mid & {f({\varvec{Z}}^{{{\text{HMS}}}} )} \\ \end{array} } \right].$$
    (11)
  3.

    When all the new tones have been constructed, they are combined into a new harmony. Each tone \(z_{j}^{i}\) of the new harmony \({\varvec{Z}}^{i}\) is generated through the following process: (1) with probability HMCR, the new tone is taken from the HM; otherwise, with probability (1 − HMCR), it is generated randomly. (2) If the new tone is taken from the HM, it is readjusted with probability PAR within the bandwidth bw.

  4.

    The HM is updated. If the fitness of the new harmony is better than the worst harmony in the HM, the new harmony will replace the worst harmony in the HM.

  5.

    The number of iterations is checked to determine whether the maximum number of iterations \(T_{\max }\) has been reached. If not, steps (3) and (4) are repeated until \(T_{\max }\) is reached.
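
The five steps above can be sketched as a minimal harmony search minimizing a fitness function f over box bounds (an illustrative Python sketch, not the authors' MATLAB implementation; the default parameter values are placeholders):

```python
import numpy as np

def harmony_search(f, lower, upper, hms=30, hmcr=0.9, par=0.3, bw=0.01,
                   t_max=1000, rng=None):
    """Minimal harmony search minimizing f over box bounds [lower, upper].
    Parameter names follow the text: HMS, HMCR, PAR, bw, Tmax."""
    if rng is None:
        rng = np.random.default_rng(0)
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    k = len(lower)
    # Step (2): initialize the harmony memory HM with random harmonies.
    hm = rng.uniform(lower, upper, size=(hms, k))
    fit = np.array([f(z) for z in hm])
    for _ in range(t_max):
        # Step (3): build a new harmony tone by tone.
        z = np.empty(k)
        for j in range(k):
            if rng.random() < hmcr:                  # tone taken from the HM ...
                z[j] = hm[rng.integers(hms), j]
                if rng.random() < par:               # ... possibly readjusted by bw
                    z[j] += bw * rng.uniform(-1, 1) * (upper[j] - lower[j])
            else:                                    # otherwise a random new tone
                z[j] = rng.uniform(lower[j], upper[j])
        z = np.clip(z, lower, upper)
        # Step (4): replace the worst harmony if the new one is better.
        worst = np.argmax(fit)
        fz = f(z)
        if fz < fit[worst]:
            hm[worst], fit[worst] = z, fz
    best = np.argmin(fit)                            # step (5): best harmony found
    return hm[best], fit[best]
```

In the paper the fitness is the testing-set MAE of the MKRVM (Eq. (20)) and the harmonies are the five kernel parameters; here any callable f can be plugged in.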

2.3 Principle of AFOSM

Compared with the center point of the MVFOSM, the design point of the AFOSM is closer to the limit state surface of the structure, so its calculation accuracy is clearly improved. The performance function of the slope can be expressed as follows [23]:

$$Z = g\left( {X_{1} ,X_{2} , \ldots ,X_{k} } \right) = f\left( {X_{1} ,X_{2} , \ldots ,X_{k} } \right) - 1,$$
(12)

where Z is the value of the performance function and \(g\left( {X_{1} ,X_{2} , \ldots ,X_{k} } \right)\) is the performance function. \(X_{i} \left( {i = 1,2, \ldots ,k} \right)\) are random variables affecting the slope stability, and k is the number of input variables. f is the factor of safety.

Z > 0 and Z < 0 indicate that the slope is stable and unstable, respectively, and Z = 0 indicates that the slope is on the boundary between stability and instability. Setting \({\varvec{X}}^{ * } = \left( {X_{1}^{ * } {,}X_{2}^{ * } {,} \ldots {,}X_{k}^{ * } } \right)\) as a point on the limit state surface, a Taylor expansion can be performed about this point and truncated at the first-order term, giving the linear function \(Z_{\rm {L}}\) as follows [10]:

$$g({\varvec{X}}^{ * } ) = 0,$$
(13)
$$Z_{\rm {L}} = g({\varvec{X}}^{ * } ) + \sum\limits_{i = 1}^{k} {\frac{{\partial g({\varvec{X}}^{ * } )}}{{\partial X_{i} }}} \left( {X_{i} - X_{i}^{ * } } \right).$$
(14)

If the variables are assumed to be noncorrelated and to obey normal distributions \((\mu _{{X_{i} }} ,\sigma _{{X_{i} }}^{2} )\), the reliability index \(\beta\) of the slope reliability analysis with the AFOSM is defined according to the mean value \(\mu_{{Z_{\rm {L}} }}\) and the standard deviation \(\sigma_{{Z_{\rm {L}} }}\) of \(Z_{\rm {L}}\):

$$\beta = \frac{{\mu_{{Z_{\rm{L}} }} }}{{\sigma_{{Z_{\rm{L}} }} }} = \left[ {g({\varvec{X}}^{ * } ) + \sum\limits_{i = 1}^{k} {\frac{{\partial g({\varvec{X}}^{ * } )}}{{\partial X_{i} }}} (\mu_{{X_{i} }} - X_{i}^{ * } )} \right]\Bigg/\sqrt {\sum\limits_{i = 1}^{k} {\left[ {\frac{{\partial g({\varvec{X}}^{*} )}}{{\partial X_{i} }}} \right]^{2} \sigma_{{X_{i} }}^{2} } } .$$
(15)

The sensitivity vector \({\varvec{\alpha}}_{X} = \left\{ {\alpha_{{X_{i} }} {,}\left( {i = 1,2, \ldots ,k} \right)} \right\}\), whose components are called sensitivity coefficients [10], can be calculated as follows:

$$\alpha_{{X_{i} }} = - \frac{{\partial g({\varvec{X}}^{*} )}}{{\partial X_{i} }}\sigma_{{X_{i} }} \Bigg/\sqrt {\sum\limits_{i = 1}^{k} {\left[ {\frac{{\partial g({\varvec{X}}^{*} )}}{{\partial X_{i} }}} \right]^{2} \sigma_{{X_{i} }}^{2} } } .$$
(16)

Then, the new design points \({\varvec{p}}^{ * } = \left\{ {p_{i}^{ * } {,}\left( {i = 1,2, \ldots ,k} \right)} \right\}\) can be acquired by the following equation:

$$p_{i}^{ * } = \mu_{{X_{i} }} + \beta \sigma_{{X_{i} }} \alpha_{{X_{i} }} .$$
(17)

Since a direct solution is very difficult to obtain, an iterative method is usually adopted for Eq. (13). Then, using Eqs. (13), (15), (16) and (17), the reliability index \(\beta\) and the new design point \({\varvec{p}}^{ * }\) can be obtained. The main steps of the AFOSM can be described as follows: first, the initial design point is generally taken to be the mean point. Then, \({\varvec{\alpha}}_{X}\), \(\beta\), and a new design point are calculated with the above equations. If the termination condition is satisfied, the reliability analysis is finished; if not, the process is repeated.

In this process, all the limit-equilibrium methods or the finite-element method can be used to compute the factor of safety f [23]. The Bishop method [32], one of the limit-equilibrium methods, is chosen to calculate the factor of safety in this paper. Therefore, it is convenient to compare with other reliability calculation methods. In the past, it is difficult to derive the derivative of the performance function directly for the AFOSM, so the finite-difference method is implemented as an approximate method for reliability analysis. All the sampling points of the finite-difference method are selected in the range of \(\mu_{{X_{i} }} \pm b\sigma_{{X_{i} }}\), where b is the step control factor. The step scheme is determined according to Rajashekhar [33], b = 3 in the initial iteration, and then b = 1 in the following iterations. The finite-difference method is also adopted in the AFOSM without the MKRVM.

3 Establishment of MKRVM–AFOSM

3.1 Initialization of the HS by Latin hypercube sampling

To optimize the kernel parameters of the MKRVM, the variation range of kernel parameters are provided for the HS, and all the kernel parameters samples are generated in this variation range. The search result significantly depends on the quantity and quality of the samples. Therefore, Latin hypercube sampling is adopted in the initialization process of the HS to quickly generate representative samples. Compared with the random sampling method, uniform design method and orthogonal design method, the Latin hypercube sampling method can better reflect the probability distribution of variables with fewer samples. As a multidimensional and stratified sampling method, Latin hypercube sampling divides the distribution interval of each variable into several subintervals according to the probability. Then, the samples of each variable, which are extracted as subintervals by the inverse transformation, are randomly combined to the samples of Latin hypercube sampling. As a result, all the samples of the HS are created, and the initialization HS is completed.

3.2 Derivation of the partial derivative of the MKRVM

After training the MKRVM, the Bishop method can be replaced by the MKRVM to evaluate the performance function. Therefore, the performance function value Z can be rewritten as follows:

$$Z = \sum\limits_{n = 1}^{M} {w_{n} \left[ {m \times \exp \left( { - \left| {x - x_{n} } \right|^{2} /d^{2} } \right) + (1 - m) \times (\eta (x\, \cdot \,x_{n} ) + r)^{q} } \right]} .$$
(18)

To apply the AFOSM in the reliability analysis, the partial derivative of multi-kernel function must be derived for the first time in this paper. The partial derivative of the MKRVM is derived as follows:

$$\frac{\partial Z}{{\partial x_{i} }} = \sum\limits_{n = 1}^{M} {w_{n} \left[ { - 2m \times \left( {\frac{{x - x_{n} }}{{d^{2} }}} \right)\exp \left( { - \frac{{\left| {x - x_{n} } \right|^{2} }}{{d^{2} }}} \right) + q\eta x_{n} \left( {1 - m} \right) \times (\eta (x\, \cdot \,x_{n} ) + r)^{q - 1} } \right]} .$$
(19)

3.3 Steps of the MKRVM–AFOSM

The above equation is then transformed into the AFOSM to solve the reliability index. Hence, the MKRVM–AFOSM flow chart is shown in Fig. 2:

Fig. 2
figure 2

The MKRVM–AFOSM flow chart

The main calculation steps of the MKRVM–AFOSM can be described as follows:

Step 1: The distribution forms of each variable are determined, and \(X_{i} \, \left( {i = 1,2, \ldots ,k} \right)\) is constructed as the input of the samples. Then, a traditional method, such as the Bishop method, is used to calculate the factor of safety as the output of these samples. The samples are divided into a training dataset and a testing dataset.

Step 2: The HS is initialized by the Latin hypercube sampling, and then, the HS is used to optimize the kernel parameters of the MKRVM. The training dataset and testing dataset are used to train and estimate the MKRVM, respectively. In the optimizing process, the mean absolute error (MAE) of the factor of safety is calculated by the following equation as the fitness fMAE:

$$f_{{{\text{MAE}}}} = \frac{1}{N}\sum\limits_{n = 1}^{N} {\left| {t_{n} - y^{ * }_{n} } \right|} ,$$
(20)

where tn and \(y^{ * }_{n}\) are the factor of safety calculated by the Bishop and the MKRVM of sample n, respectively.

Step 3: To simplify the complexity and accelerate the calculation speed, the samples are directly into training and test samples to show the fitting and protection effect. According to the minimum MAE of the testing dataset, it is determined whether the training target is achieved. If not, the kernel parameters are changed, and step 2 is repeated. When the target is achieved, the MKRVM training is completed.

Step 4: With the AFOSM, the mean point is assumed to be the initial design point in the first iteration. In each iteration, the MKRVM is used instead of the traditional method to calculate the factor of safety and the first-order derivative. As a result, a new design point can be found.

Step 5: If the difference value of each variable of the design point is less than the allowed error for the last two iterations, the iteration calculation of the AFOSM is finished, and the reliability index \(\beta\) is obtained. Otherwise, the process returns to step 4.

Otherwise, to increase efficiency, the program of the MKRVM–AFOSM was developed in MATLAB according to the abovementioned steps. Meantime, Geostudio 2007 is adopted to calculate the factor of safety, and the reliability analysis using the MCS method also archives by the Geostudio 2007. Notably, the factor of safety should be subtracted from one, which is the theoretical output value of the MKRVM. The five parameter values of the HS are conventional, so their values are set in Table 1. Meanwhile, the ranges of the variation in the multi-kernel parameters are provided in Table 1.

Table 1 Parameters of the HS and the MKRVM

After the abovementioned steps are completed, the slope reliability analysis model based on the MKRVM–AFOSM can be established. The MKRVM–AFOSM can make use of the MKRVM to estimate the factor of safety in a less time-consuming manner than that of traditional methods. Then, the trained MKRVM is applied in the AFOSM to quickly obtain the first-order derivative of the design point.

4 Application

4.1 Example 1: single-layer slope

To compare this approach with other algorithms, the reliability analysis of a single-layer slope commonly studied in academic papers is discussed for the cross-section shown in Fig. 3 [25], and the slope angle of this single-layer slope is 18.43°. The cohesion c, internal friction coefficient \(\varphi\) and density \(\gamma\) are considered in the reliability analysis, and their mean values are \(c = 12.00{\text{ kPa}}\), \(\varphi = 16.68 \, ^{{\text{o}}}\), and \(\gamma = 19.06{\text{ kN/m}}^{{3}}\), respectively. It is usually assumed that those random variables obey the noncorrelated normal distribution, and the slope reliability index is calculated when the coefficients of variation in all the parameters are 5%, 10%, and 15%. From the 40 samples evaluated by the Bishop method with circular slip surfaces in Samui [25], the first 28 are adopted to train the MKRVM, and the remaining 12 are used to estimate the fitting effect of the MKRVM. The material parameters and the factor of safety are input and output of samples to train the MKRVM, respectively. Then, the AFOSM iteratively calculates the reliability index of the slope engineering.

Fig. 3
figure 3

Single-layer slope geometry

The multi-kernel parameters optimized by the HS are \(m = 0.9999955\), \(d = 9.3329415\), \(\eta = 0.4669322\), \(r = 7.0304817\), and \(q = 1.5597846\), and the number of relevance vectors is 6. Figure 4 shows the factors of safety calculated by the Bishop and the MKRVM methods about the training and testing dataset. For the training datasets, the MAE of the factor of safety is 0.0071, and the mean relative error (MRE) is only 0.47% calculated as the Eq. (21). The correlation coefficient \(\gamma\) of the training dataset is 0.9965, according to the Eq. (22):

$${\text{MRE}} = \frac{1}{N}\sum\limits_{n = 1}^{N} {\frac{{\left| {t_{n} - y^{ * }_{n} } \right|}}{{t_{n} }}} ,$$
(21)
$$\gamma \left( {t_{n} ,y^{ * }_{n} } \right) = \frac{{{\text{Cov}}\left( {t_{n} ,y^{ * }_{n} } \right)}}{{\sqrt {{\text{Var[}}t_{n} {\text{]Var[}}y^{ * }_{n} {]}} }},$$
(22)

where Cov and Var are the covariation and variance. The MAE and the MRE of the testing dataset are only 0.0292 and 1.90%, respectively, and the correlation coefficient is 0.9956. Thus, the MKRVM has a good predictive ability and can reliably estimate the factor of safety, providing an appropriate method for slope reliability analysis.

Fig. 4
figure 4

Performance of the training and testing datasets of the single-layer slope

The slope reliability analysis results, with varying coefficients of variation, are shown in Fig. 5. Among them, the results of the MVFOSM and the RVM-MVFOSM are obtained from Samui [25]. The “Geostudio/MCS” results, which were calculated 106 times, were achieved using the software Geostudio 2007. The results of the AFOSM and the MKRVM–AFOSM were calculated using MATLAB 2010a.

Fig. 5
figure 5

Slope reliability results of the single-layer slope using different methods

The design point with different coefficients of variation is \(c = 10.36\;{\text{kN/m}}^{{2}}\), \(\varphi = 12.83^{^\circ }\), and \(\gamma = 24.87\;{\text{kN/m}}^{{3}}\), and the factor of safety is 1.042, indicating that the design point is very close to the limit state surface of the slope stability. Hence, the factor of safety of design point proves the correctness of the MKRVM–AFOSM, which is feasible for the single-layer slope. These values more represent the mathematical meaning than the actual value of the slope, so there is a great chance that the values of the design point are inappropriate for the empirical about the soil properties.

In each iteration process of the AFOSM without the MKRVM, the finite-difference method is used as an approximate method for reliability analysis. Although it is very time-consuming, the reliability result determined with the AFOSM is reliable. Meanwhile, the MCS usually provides unbiased estimates for the failure probability [22]. Consequently, the AFOSM and MCS are considered to estimate the reliability of other algorithms. The following can be concluded from Fig. 5:

  1. 1.

    The results of the MVFOSM and the RVM–MVFOSM are similar, but they exhibit a significant deviation from the AFOSM and MCS results. The main reason for this deviation is that the MVFOSM and the RVM–MVFOSM both rely on the MVFOSM. The center point of the MVFOSM is generally not on the limit state surface, and its Taylor expansion surface will significantly deviate from the original limit state surface. Therefore, there is a clear deviation of the reliability result of the MVFOSM.

  2. 2.

    The results of the Geostudio/MCS, the AFOSM and the MKRVM–AFOSM are close to each other, and the results of the Geostudio/MCS and the MKRVM–AFOSM are especially similar. Compared with the AFOSM, the MAE and the MRE of the MKRVM–AFOSM are 0.093 and 1.88%, respectively, which are smaller than other algorithms. About the MCS, although the calculation with 106 cycles is time-consuming, the results of Geostudio are reliable. Therefore, the reliability analysis of the single-layer slope present high precision by the MKRVM–AFOSM, benefiting from the accuracy factor of safety the MKRVM and the precision reliability analysis by the AFOSM.

  3. 3.

    Through the above three experiments, the average calculation times of the MCS and the AFOSM are more than 5 min with Geostudio 2007. The average calculation times of the MKRVM–AFOSM is 14.3 s with MATLAB 2010a when the MKRVM has been trained, significantly faster than the Geostudio/MCS or the AFOSM calculations. Therefore, for the single-layer slope, the MKRVM–AFOSM has the advantages of high precision and speed that can satisfy the requirements of the reliability analysis.

When the RVM uses either the Gauss kernel function or the multi-kernel function, the reliability index of the single-layer slope is as shown in Table 2. The reliability index obtained with the multi-kernel function is closer to that of the AFOSM than the one obtained with the Gauss kernel function, and the ratio kernel parameter \(m = 0.9999955\) of the multi-kernel function shows that the Gauss kernel is still dominant. The calculation accuracy and efficiency of the reliability analysis are clearly improved by the MKRVM proposed in this paper owing to the introduction of the polynomial kernel.
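
The role of the ratio parameter \(m\) can be sketched as a convex combination of the Gauss and polynomial kernels. The mapping of the reported parameters (\(d\), \(\eta\), \(r\), \(q\)) onto the two kernels below is an assumption for illustration; the exact form of the multi-kernel is the one defined earlier in the paper.

```python
import numpy as np

# Illustrative multi-kernel: convex combination of a Gauss kernel and a
# polynomial kernel. Parameter names mirror those reported in the paper,
# but this parameterization is an assumption made for the sketch.
def gauss_kernel(x, y, d):
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * d ** 2))

def poly_kernel(x, y, eta, r, q):
    # With non-integer q, the base eta * <x, y> + r must stay positive.
    return (eta * np.dot(x, y) + r) ** q

def multi_kernel(x, y, m, d, eta, r, q):
    # m close to 1 means the Gauss kernel dominates, as observed for the
    # single-layer slope (m = 0.9999955).
    return m * gauss_kernel(x, y, d) + (1.0 - m) * poly_kernel(x, y, eta, r, q)

x = np.array([10.0, 25.0])   # e.g. (c, phi) of one sample, illustrative values
y = np.array([11.0, 24.0])
k = multi_kernel(x, y, m=0.9999955, d=15.333, eta=0.738, r=7.03, q=1.56)
```

Even with \(m\) so close to 1, the polynomial term contributes materially, because its raw magnitude is orders of magnitude larger than the bounded Gauss term.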

Table 2 The kernel parameters and reliability index of the single-layer slope by different kernel functions

4.2 Example 2: multilayer slope

To compare this approach with other algorithms, the reliability analysis of a multilayer slope commonly studied in the literature is discussed for the cross-section in Fig. 6 [23]; the slope angle of this multilayer slope is 24.23°. This slope is complex, with three different soil layers, and its material parameters are shown in Table 3. It is assumed that the cohesion \(c\) and the internal friction angle \(\varphi\) obey uncorrelated normal distributions, and the density \(\gamma\) is assumed to be constant [23]. Through the reliability analysis of this multilayer slope, the advantages of the MKRVM–AFOSM can be well demonstrated.

Fig. 6
figure 6

Multilayer slope geometry

Table 3 Material parameters of the multilayer slope

Zhao [23] does not provide detailed data, so Latin hypercube sampling is used to generate random samples in the range of \(\mu_{{X_{i} }} \pm 3\sigma_{{X_{i} }}\) for the HS. For comparison with Zhao [23], the factor of safety is evaluated by the Bishop method assuming circular slip surfaces. The first 36 of the 48 samples are used for training, and the remaining 12 samples are used for testing. The calculation process for the multilayer slope is the same as that for the single-layer slope. The multi-kernel parameters optimized by the HS are \(m = 0.9999955\), \(d = 15.3330115\), \(\eta = 0.7380245\), \(r = 7.0304838\), and \(q = 1.5597849\), and the number of relevance vectors is 11. The fitting and prediction results of the multilayer slope are shown in Fig. 7.
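
The sampling step above can be sketched as follows. The means and standard deviations are placeholders, not the values of Table 3; the stratified draw per variable is the standard Latin hypercube construction over \(\mu \pm 3\sigma\).

```python
import numpy as np

# Latin hypercube sampling in [mu - 3*sigma, mu + 3*sigma] for each variable,
# as used to generate the 48 training/testing samples. mu and sigma below are
# illustrative placeholders, not the Table 3 statistics.
def lhs_uniform(n_samples, mu, sigma, seed=None):
    rng = np.random.default_rng(seed)
    n_vars = len(mu)
    samples = np.empty((n_samples, n_vars))
    edges = np.linspace(0.0, 1.0, n_samples + 1)
    for j in range(n_vars):
        # one uniform draw per stratum, then shuffle the strata
        u = rng.uniform(edges[:-1], edges[1:])
        rng.shuffle(u)
        samples[:, j] = (mu[j] - 3.0 * sigma[j]) + u * 6.0 * sigma[j]
    return samples

mu = np.array([5.0, 33.0, 7.5, 18.0, 7.0])     # assumed means of the 5 variables
sigma = np.array([1.0, 2.0, 1.5, 2.5, 1.2])    # assumed standard deviations
X = lhs_uniform(48, mu, sigma, seed=0)         # 48 samples: 36 train, 12 test
```

Stratifying each marginal guarantees that even 48 samples cover the full \(\pm 3\sigma\) range of every variable, which plain random sampling does not.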

Fig. 7
figure 7

Performance of the training and testing datasets of the multilayer slope

For the training dataset, the MAE of the factor of safety between the Bishop and MKRVM methods is 0.016, the MRE is 1.33%, and the correlation coefficient is 0.9982. In the training process, the kernel parameters of the MKRVM are optimized by the HS, and the evolution of the MAE value is shown in Fig. 8. Compared with the Gauss kernel, the MKRVM reaches the optimal value after 48 iterations, showing high convergence speed and accuracy. Therefore, the MKRVM has clear advantages in terms of calculation accuracy and speed for the factor of safety. The MAE and the MRE for the testing dataset are 0.110 and 8.052%, respectively, and the correlation coefficient of the testing dataset is 0.9719. The fitting and prediction precision for the multilayer slope are lower than those for the single-layer slope, mainly owing to the complexity of the multilayer slope and the limited representativeness of the training dataset, which requires further research.
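
For reference, the three error measures used throughout these comparisons can be written out in a short sketch; the factors of safety below are made-up illustrative values, not the paper's data.

```python
import numpy as np

# Mean absolute error (MAE), mean relative error (MRE), and Pearson
# correlation coefficient, as used to compare Bishop and MKRVM factors
# of safety. The arrays below are illustrative, not the paper's samples.
def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def mre(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred) / np.abs(y_true))

def corr(y_true, y_pred):
    return np.corrcoef(y_true, y_pred)[0, 1]

fs_bishop = np.array([1.21, 1.05, 1.33, 0.98, 1.17])   # reference (Bishop) values
fs_mkrvm  = np.array([1.19, 1.07, 1.30, 1.00, 1.18])   # surrogate predictions
print(mae(fs_bishop, fs_mkrvm),
      100.0 * mre(fs_bishop, fs_mkrvm),   # MRE reported as a percentage
      corr(fs_bishop, fs_mkrvm))
```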

Fig. 8
figure 8

The evolution of the MAE value during the training process

The design points of the MKRVM–AFOSM are \(\varphi_{{\text{I}}} = 32.95^{^\circ }\), \(c_{{\text{II}}} = 5.12\;{\text{kN/m}}^{{2}}\), \(\varphi_{{{\text{II}}}} = 17.90^{^\circ }\), \(c_{{\text{III}}} = 7.20\;{\text{kN/m}}^{{2}}\), and \(\varphi_{{{\text{III}}}} = 10.69^{^\circ }\), and the corresponding factor of safety is 1.044. It is clear that the design point is very close to the limit state surface of slope stability. Therefore, the factor of safety at the design point confirms the correctness of the MKRVM–AFOSM, which is feasible for the multilayer slope. According to the previous analysis, the AFOSM and the MCS are used as benchmarks to evaluate the other algorithms. The reliability index of the multilayer slope obtained with different methods is shown in Table 4.
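
Given a converged design point, the reliability index follows directly from its distance to the mean in standard normal space. A minimal sketch (the means and standard deviations below are placeholders, not the Table 3 statistics):

```python
import numpy as np

# Reliability index from a converged design point x_star: beta is the distance
# of u_star = (x_star - mu) / sigma from the origin in standard normal space.
# mu and sigma are placeholders; the actual statistics are those of Table 3.
mu     = np.array([35.0, 6.0, 20.0, 8.5, 12.0])
sigma  = np.array([2.0, 0.8, 1.5, 1.0, 1.1])
x_star = np.array([32.95, 5.12, 17.90, 7.20, 10.69])  # design point from the paper

u_star = (x_star - mu) / sigma
beta = np.linalg.norm(u_star)

# Sanity check for any such run: the factor of safety at the design point
# should be close to 1 (the paper reports Fs = 1.044), i.e. the point lies
# near the limit state surface.
```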

Table 4 Slope reliability index of a multilayer slope with different methods

The following can be concluded from Table 4:

  1.

    The results of the point estimation method and the SVM–MVFOSM reported by Zhao [23] are similar, but the SVM–MVFOSM result deviates clearly from those of the AFOSM and the MCS. The main reason for this difference is that the center point of the MVFOSM is generally not close to the limit state surface.

  2.

    Compared to the results of the SVM–MVFOSM and point estimation method, the results from the Geostudio software are closer to those of the AFOSM.

  3.

    The results of the AFOSM and the MKRVM–AFOSM are similar. Compared with the AFOSM results, the MAE and the relative error of the MKRVM–AFOSM are 0.0029 and 0.077%, respectively, which are much lower than those of the other tested algorithms. Therefore, the reliability analysis of the multilayer slope also achieves high precision with the MKRVM–AFOSM, benefiting from the accurate factor of safety provided by the MKRVM and the precise reliability analysis of the AFOSM. The computing time of the MKRVM–AFOSM is 16.8 s once the MKRVM has been trained. Compared with the SVM, the MKRVM increases the calculation time by approximately 2.4 s, which we consider acceptable for slope reliability analysis. The reliability index of the MKRVM–AFOSM is slightly lower than that of the AFOSM, so the MKRVM–AFOSM errs on the side of safety. Therefore, for the reliability analysis of a multilayer slope, the MKRVM–AFOSM also offers high precision and speed that can satisfy the requirements of an actual project.

When the RVM uses either the Gauss kernel function or the multi-kernel function, the reliability index of the multilayer slope is as shown in Table 5. The calculation accuracy and efficiency of the reliability analysis are clearly improved by the multi-kernel proposed in this paper owing to the introduction of the polynomial kernel. Meanwhile, the computing time of the trained RVM with the Gauss kernel function is 16.6 s, which is similar to that of the MKRVM. Therefore, the computational efficiency of the MKRVM is stable compared with that of the traditional RVM.

Table 5 The kernel parameters and reliability index of the multilayer slope using different kernel functions

5 Conclusions

To increase the efficiency and accuracy of reliability analysis, the MKRVM and the AFOSM are combined to analyze slope reliability in this paper. In this work, a multi-kernel function is introduced to establish the MKRVM, which further improves the accuracy of the safety factor calculation. In addition, the HS offers high precision and speed in the optimization of the multi-kernel parameters. Furthermore, the partial derivative of the MKRVM was derived, so the AFOSM can be solved directly without the finite-difference method. Based on the above methods, the center point determined by the MKRVM–AFOSM can be brought close to the limit state surface, so the calculated slope reliability is improved. Therefore, the MKRVM–AFOSM has the advantages of high precision, fast convergence, and ease of use. A comparison of the results for two examples, a single-layer slope and a multilayer slope, with the results of other methods shows that the reliability and efficiency of the reliability index calculation are clearly improved by utilizing the multi-kernel instead of a Gauss kernel. Thus, slope reliability analysis with the MKRVM–AFOSM is feasible and efficient. Despite these advantages, the assumed distribution laws and correlations of the material parameters in this paper differ from those of actual soils, and further studies are needed to examine these assumptions in slope reliability analysis. In short, the findings of this study provide a combined machine learning method for reliability analysis, which has good application prospects in slope and other practical engineering.