1 Introduction

Uncertainties are widely encountered in geotechnical engineering analysis and design. In general, these uncertainties are associated with the following sources: (1) the inherent randomness of natural processes; (2) model uncertainty, reflecting the inability of a simulation model, design technique, or empirical formula to represent the true physical behavior of the system, such as the calculation of the safety factor of slopes using limit equilibrium methods; (3) model parameter uncertainty, resulting from the inability to accurately quantify the model input parameters; and (4) data uncertainty, including measurement errors, data inconsistency or non-homogeneity, and data-handling errors. In slope stability analysis, uncertainties may be attributed to the loss of detailed geological information in the exploration program and to the inaccurate estimation of soil and rock properties, which are difficult to quantify in the laboratory. For example, the spatial variability of the field cannot be reproduced accurately, and the same applies to fluctuations in pore water pressure, testing errors, and many other relevant factors.

The conventional approach for assessing the performance of a slope is to calculate the factor of safety (FOS). A major shortcoming of this approach is that the uncertainties in the material parameters, pore water pressures, and loads are not explicitly reflected in the FOS. In fact, deterministic analysis may cause the slope design to become over-conservative. To avoid such misleading results, reliability analysis is usually applied to slope stability analysis. Over the past few decades, a number of methods for slope reliability analysis have been presented and have stimulated interest in the probabilistic design of slopes. The probabilistic approach was applied to slope stability analysis using the first-order second-moment method (Wu and Kraft 1970; Cornell 1971; Alonso 1976; Tang et al. 1976; Vanmarcke 1977; Li and Lumb 1984; Wolff 1985; Barabosa et al. 1989; Sung 2009; Suchomel and Masin 2010). In previous studies, a number of applications of slope reliability analysis using other methods, such as Monte Carlo simulation or the point estimate method, were reported (Dai et al. 1993; Chowdhury and Xu 1993). Tobutt (1982) used the Monte Carlo method both as a sensitivity-testing tool for slope stability and as a method for calculating the reliability of a given slope. In addition, artificial neural networks (ANN) have also been applied to reliability analysis (Deng et al. 2005; Deng 2006; Gomes and Awruch 2004).

As a simulation method, Monte Carlo simulation (MCS) possesses the following advantages: it may be applied to many practical problems, allowing the direct consideration of any type of probability distribution for the random variables, it is able to compute the probability of failure with the desired precision, and it is easy to implement. Despite these advantages, this method has not been widely used in reliability analysis due to the fact that MCS requires a large amount of stability analyses performed through the limit equilibrium or finite elements methods, which could result in high computing costs, especially for large and complex geotechnical problems.

Support vector machine (SVM) appears to be a promising technique to overcome this difficulty. In recent years, SVM has been rapidly developed for universal function approximation (Vapnik et al. 1996). Feng et al. (2004) and Feng and Hudson (2004) employed SVM to express the complex nonlinear relationship between the displacement of geotechnical structures and the mechanical parameters of geomaterials. Similarly, in the process of slope reliability analysis by MCS, SVM may be employed to approximate the safety factor of the slope, replacing the cumbersome limit equilibrium and finite element methods. This allows the slope reliability to be determined with only a very small number of time-consuming stability analyses.

Nevertheless, the general SVM has a major weakness in accuracy when dealing with complex slope stability analysis problems. In order to overcome this difficulty, an updated SVM (Zhao et al. 2007) may be adopted by incorporating the particle swarm optimization (PSO) method to search the optimal SVM parameters.

Therefore, the main concept of the proposed method (SVM-based MCS) may be described as follows: First, the parameters of SVM are selected by PSO. Next, an SVM model is constructed to substitute the modeling of slope stability analysis. Then, the index of slope reliability is computed by the SVM-based MCS method. Finally, to investigate the performance and suitability of this approach, the results obtained by the proposed method are compared to those obtained by the point estimation method (PEM), as well as several other numerical analysis methods.

2 Methods

2.1 Monte Carlo simulation

In reliability analysis of geotechnical engineering, the MCS method is particularly appropriate when an analytical solution is not attainable, and the limit state function cannot be expressed or approximated in direct formulation. This is mainly the case in problems of complex nature with a large number of basic variables where no other reliability analysis methods are applicable. Despite the fact that the mathematical formulation of the MCS is relatively simple and the method has the capability of handling virtually every possible case regardless of its complexity, this approach has not received extensive acceptance due to the excessive computational effort involved.

A reliability problem is usually formulated by a performance function, g(X 1, X 2, …, X k ), where X 1, X 2, …, X k are random variables. The performance function of slope reliability analysis may be established as follows:

$$ Z = g\left( {X_{1} ,X_{2} ,X_{3} ,\, \ldots ,\,X_{k} } \right) = F\left( {X_{1} ,X_{2} ,X_{3} ,\, \ldots ,\,X_{k} } \right) - 1 $$
(1)

where X i (i = 1, 2, …, k) are the random variables in the slope reliability analysis and \( g\left( {X_{1} ,X_{2} ,X_{3} ,\, \ldots ,\,X_{k} } \right) \) is the performance function; Z > 0 indicates that the slope is stable, Z < 0 indicates that the slope has failed, and Z = 0 defines the limit state, i.e., the boundary between stability and failure. \( F\left( {X_{1} ,X_{2} ,X_{3} ,\, \ldots ,\,X_{k} } \right) \) is the FOS. In order to calculate the reliability index, an adequate number n of independent random samples are generated based on the probability distribution of each random variable. The value of the performance function is computed for each random sample X i, and the Monte Carlo estimates of the mean value and the standard deviation of the performance function are given by the following formulas:

$$ \mu_{z} = \frac{1}{n}\sum\limits_{i = 1}^{n} {z_{i} } $$
(2)
$$ \sigma_{z} = \sqrt{\frac{1}{n - 1}\sum\limits_{i = 1}^{n} {\left( {z_{i} - \mu_{z} } \right)^{2} } } $$
(3)

Then, the reliability index can be determined as follows:

$$ \beta = \frac{{\mu_{z} }}{{\sigma_{z} }} = \frac{{\mu_{\text{F}} - 1}}{{\sigma_{\text{F}} }} $$
(4)

where β is the reliability index, and \( \mu_{\text{F}} \) and \( \sigma_{\text{F}} \) are the mean value and the standard deviation of the safety factor \( F\left( {X_{1} ,X_{2} ,X_{3} ,\, \ldots ,\,X_{k} } \right) \), respectively.

The Monte Carlo method allows the determination of an estimate of the probability of failure, given by the following:

$$ n_{\text{f}} = \sum\limits_{i = 1}^{n} {I\left( {X_{1} ,X_{2} ,\, \ldots ,\,X_{k} } \right)} $$
(5)

where \( I\left( {X_{1} ,X_{2} ,\, \ldots ,\,X_{k} } \right) \) is a function defined as follows:

$$ I\left( {X_{1} ,X_{2} ,\, \ldots ,\,X_{k} } \right) = \left\{ {\begin{array}{*{20}c} 1 & {\text{if}} & {g\left( {X_{1} ,X_{2} ,\, \ldots ,\,X_{k} } \right) \le 0} \\ 0 & {\text{if}} & {g\left( {X_{1} ,X_{2} ,\, \ldots ,\,X_{k} } \right) > 0} \\ \end{array} } \right. $$
(6)

According to Eq. (5), n independent sets of values X 1, X 2, …, X k are obtained based on the probability distribution for each random variable, and the performance function is computed for each sample. Using MCS, an estimation of the probability of slope failure is obtained by the following:

$$ p_{\text{f}} = \frac{{n_{\text{f}} }}{n} $$
(7)

where \( n_{\text{f}} \) is the total number of cases where failure has occurred, and n is the total number of simulations.
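To make Eqs. (1)–(7) concrete, the following Python sketch estimates β and p_f by direct MCS for a hypothetical slope whose FOS is a simple linear function of two random variables; the FOS model and the distribution parameters are illustrative assumptions, not values from this study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # number of Monte Carlo samples

# Hypothetical random variables (illustrative means and std devs only)
cohesion = rng.normal(20.0, 4.0, n)   # X1, kPa
friction = rng.normal(30.0, 3.0, n)   # X2, degrees

# Placeholder linear FOS model standing in for a limit-equilibrium analysis
F = 0.03 * cohesion + 0.02 * friction  # F(X1, X2), illustrative
Z = F - 1.0                            # performance function, Eq. (1)

mu_z = Z.mean()                        # Eq. (2)
sigma_z = Z.std(ddof=1)                # Eq. (3)
beta = mu_z / sigma_z                  # reliability index, Eq. (4)

n_f = np.count_nonzero(Z <= 0)         # failure count, Eqs. (5)-(6)
p_f = n_f / n                          # probability of failure, Eq. (7)
print(beta, p_f)
```

Because this toy F is linear in normal variables, the exact values (β ≈ 1.49, p_f ≈ 0.068) are available in closed form, so the sampled estimates can be checked directly.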

2.2 Updated support vector machine

In this section, only the basic concepts of SVM which are essential in clearly explaining the integration method of SVM and MCS are discussed. SVM was originally proposed by Vapnik et al. (1996) and has been used for practical applications in many research fields. Reliability analysis of geotechnical engineering using MCS is a highly intensive computational problem which renders conventional approaches incapable of treating real large-scale complex problems. The idea presented in this paper is to train SVMs to provide computationally inexpensive estimation of slope FOS which is required in the reliability analysis. The major advantage of an SVM over the conventional numerical process, under the provision that the predicted results fall within acceptable tolerances, is that the results may be produced in only a few clock cycles, requiring orders of magnitude less computational effort than the conventional computational process.

According to support vector machine theory, suppose we are given a set of n observed training samples (X 1, y 1), (X 2, y 2), …, (X n , y n ), where each \( X_{i} \) is an input vector and \( y_{i} \in R \). For the regression problem, the regression function of the SVM is obtained using the following equation:

$$ f\left( X \right) = \sum\limits_{i = 1}^{n} {\left( {\alpha_{i} - \alpha_{i}^{*} } \right)\,} \left( {X \cdot X_{i} } \right) + b $$
(8)

where \( \alpha_{i} ,\,\alpha_{i}^{*} \) are the Lagrange multipliers and b is the bias term of the SVM. The values of \( \alpha_{i} ,\,\alpha_{i}^{*} \) and b may be obtained by solving the following optimization problem.

maximize

$$ W\left( {\alpha ,\,\alpha^{*} } \right) = - \frac{1}{2}\sum\limits_{i,j = 1}^{n} {\left( {\alpha_{i} - \alpha_{i}^{*} } \right)} \,\left( {\alpha_{j} - \alpha_{j}^{*} } \right)\,K\left( {X_{i} \cdot X_{j} } \right) + \sum\limits_{i = 1}^{n} {y_{i} \left( {\alpha_{i} - \alpha_{i}^{*} } \right)} - \varepsilon \sum\limits_{i = 1}^{n} {\left( {\alpha_{i} + \alpha_{i}^{*} } \right)} $$
(9)

subject to

$$ \left\{ {\begin{array}{*{20}c} {\sum\limits_{i = 1}^{n} {\left( {\alpha_{i} - \alpha_{i}^{*} } \right)} = 0,} & {} \\ {0 \le \alpha_{i} ,\,\alpha_{i}^{*} \le C,} & {i = 1,2,\, \ldots ,\,n} \\ \end{array} } \right. $$
(10)
Replacing the inner product \( \left( {X \cdot X_{i} } \right) \) in Eq. (8) with a kernel function yields the final form of the regression function:

$$ f\left( X \right) = \sum\limits_{i = 1}^{n} {\left( {\alpha_{i} - \alpha_{i}^{*} } \right)\,} K\left( {X \cdot X_{i} } \right) + b $$
(11)

where \( K\left( {X_{i} \cdot X_{j} } \right) \) is the kernel function, and the penalty factor C is a constant that controls the trade-off between model complexity and training error. In this paper, sequential minimal optimization (SMO) was used to solve the optimization problem described in Eqs. 9 and 10 (Platt 1998; Smola and Schoelkopf 1998).
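As an illustration of the ε-SVR formulation in Eqs. (8)–(11), the sketch below fits an RBF-kernel SVR with scikit-learn, whose internal solver handles the dual problem of Eqs. (9)–(10); the data are synthetic and the parameter values (C, ε) are arbitrary assumptions, not the ones used in this paper.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(40, 2))     # training inputs X_i in R^2
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]     # target values y_i

# C is the penalty factor of Eq. (10); epsilon is the insensitive-tube
# width of Eq. (9); an RBF kernel plays the role of K(X_i, X_j)
model = SVR(kernel="rbf", C=100.0, epsilon=0.01).fit(X, y)

# model.dual_coef_ holds (alpha_i - alpha_i*) for the support vectors,
# i.e., the coefficients appearing in Eq. (11)
print(model.support_vectors_.shape, model.predict(X[:5]))
```

The fitted model is exactly the kernel expansion of Eq. (11): only the support vectors (samples with nonzero \( \alpha_{i} - \alpha_{i}^{*} \)) contribute to predictions.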

The selection of SVM parameters has much influence on the performance of SVM. In this paper, PSO was used to solve this problem. Compared to genetic algorithms, PSO is more efficient in seeking optimal or near-optimal solutions in large search spaces (Shi and Eberhart 1999; Kennedy and Eberhart 1995; Feng et al. 2006).

In this algorithm, f(X) is assumed to be the objective function, X i  = (x i1, x i2, …, x in ) is the current position of particle i, V i  = (v i1, v i2, …, v in ) is its current velocity, and P i  = (p i1, p i2, …, p in ) is the best position that particle i has visited. The personal best position of particle i is then updated based on the following formula:

$$ P_{i} \left( {t + 1} \right) = \left\{ {\begin{array}{*{20}c} {P_{i} \left( t \right)} & {\text{if}} & {f\left( {X_{i} \left( {t + 1} \right)} \right) \ge f\left( {P_{i} \left( t \right)} \right)} \\ {X_{i} \left( {t + 1} \right)} & {\text{if}} & {f\left( {X_{i} \left( {t + 1} \right)} \right) < f\left( {P_{i} \left( t \right)} \right)} \\ \end{array} } \right. $$
(12)

If the population size is s and P g(t) is the best position found by all particles in the swarm, then

$$ P_{\text{g}} \left( t \right) \in \left\{ {P_{0} \left( t \right),P_{1} \left( t \right),\, \ldots ,\,P_{\text{s}} \left( t \right)} \right\}|f\left( {P_{\text{g}} \left( t \right)} \right) = \hbox{Min} \left\{ {f\left( {P_{0} \left( t \right)} \right),f\left( {P_{1} \left( t \right)} \right), \ldots ,f\left( {P_{\text{s}} \left( t \right)} \right)} \right\} $$
(13)

According to the theory of PSO, the following equation represents the evolutionary process.

$$ v_{ij} \left( {t + 1} \right) = wv_{ij} \left( t \right) + c_{1} r_{1} \left( t \right)\left( {p_{ij} \left( t \right) - x_{ij} \left( t \right)} \right) + c_{2} r_{2} \left( t \right)\left( {p_{{{\text{g}}j}} \left( t \right) - x_{ij} \left( t \right)} \right) $$
(14)
$$ x_{ij} \left( {t + 1} \right) = x_{ij} \left( t \right) + v_{ij} \left( {t + 1} \right) $$
(15)

where v i is the velocity for particle i, which represents the distance to be traveled by this particle from its current position; x ij represents the position of particle i; p ij represents the best previous position of particle i; p g represents the best position among all particles in the population; r 1 and r 2 are two independently uniformly distributed random variables with range [0, 1]; c 1 and c 2 are positive constant parameters known as acceleration coefficients, which control the maximum step size; and w is the inertia weight, which controls the impact of the previous velocity of the particle on its current one. In PSO algorithms, Eq. (14) is used to calculate the new velocity based on its previous velocity, as well as the distance of its current position from both its own best historical position and its neighbors’ best position. In general, the value of each component in v i may be clamped to the range [−v max, v max] in order to control excessive roaming of particles outside the search space. Then, the particle flies toward a new position, as shown in Eq. (15). This process is repeated until a user-defined stopping criterion is reached.
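The update rules of Eqs. (12)–(15), including the velocity clamping to [−v max, v max] described above, can be sketched as follows; the quadratic objective function and all parameter settings here are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: np.sum((x - 3.0) ** 2, axis=-1)  # objective to minimize

s, dim = 20, 2                  # population size, dimensions
w, c1, c2 = 0.7, 2.0, 2.0       # inertia weight, acceleration coefficients
v_max = 1.0                     # velocity clamp

x = rng.uniform(-10, 10, (s, dim))   # current positions x_ij
v = np.zeros((s, dim))               # current velocities v_ij
p = x.copy()                         # personal bests, Eq. (12)
g = p[np.argmin(f(p))]               # global best, Eq. (13)

for _ in range(200):
    r1, r2 = rng.random((s, dim)), rng.random((s, dim))
    v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)  # Eq. (14)
    v = np.clip(v, -v_max, v_max)                      # clamp to [-v_max, v_max]
    x = x + v                                          # Eq. (15)
    better = f(x) < f(p)                               # update personal bests
    p[better] = x[better]
    g = p[np.argmin(f(p))]                             # update global best

print(g)  # the swarm converges toward the optimum at (3, 3)
```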

2.3 Integration method of SVM and MCS to slope reliability analysis

2.3.1 Representation of FOS and random variables by SVM

The nonlinear relationship between slope FOS and random variables may be described using a support vector machine SVM(X) as

$$ {\text{SVM}}\left( \varvec{X} \right):R^{n} \to R $$
(16)
$$ y = {\text{SVM}}\left( \varvec{X} \right) $$
(17)
$$ X = \left( {x_{1} ,x_{2} ,\, \ldots ,\,x_{n} } \right) $$
(18)

where x i (i = 1, 2, …, n) are random variables in MCS, for example, Young’s modulus, friction angle, geo-stress coefficients, etc., and y is the slope FOS.

In order to obtain SVM(X), a training process based on the known data set is required. The training of SVM includes creation of training samples using numerical analysis and the determination of training parameters of SVM. The former is performed by using numerical analysis for the given set of tentative random variables to obtain the corresponding slope FOS.

2.3.2 Determination of the parameters of SVM by PSO

Considering the influence of training parameters on the generalization performance of SVM, PSO is adopted to search the training parameters in a global space. The algorithm is described below.

Step 1: Estimate the valuing ranges of random variables to be recognized. A set of tentative random variables is given in their valuing ranges. Numerical analysis is used to calculate the corresponding slope FOS for every set of tentative random variables. Each set of random variables with the corresponding FOS is considered as a training sample set. In order to obtain the best generalization performance of SVMs, both for training samples and new samples with similar conditions, another set of samples should be created to test the applicability of the SVMs. This set of samples is known as the testing sample set.

Step 2: Initialize parameters of PSO such as number of evolutionary generation, population size, inertia weight, acceleration coefficients, range of kernel function, and its parameters including C and σ.

Step 3: Randomly select a kernel function of the SVM from common examples such as the polynomial, Gaussian radial basis, and sigmoid kernels. Randomly produce a set of C and σ within the given ranges. Each selected kernel function, together with its parameters C and σ, is regarded as an individual of the SVM population.

Step 4: Use SMO algorithm to solve the quadratic programming problems including each individual to obtain the values of the Lagrange multipliers and their support vectors.

Step 5: Use the selected parameters and the obtained support vectors to represent an SVM model. The testing samples are used to test the prediction ability of the SVMs. Applicability of the model is measured by fitness as follows:

$$ {\text{Fitness}} = {\text{Max}}\,(\left| {y_{i} - y^{\prime }_{i} } \right|/y_{i} ) $$
(19)

where \( y_{i} \) is the slope FOS calculated by numerical analysis and \( y^{\prime}_{i} \) is the FOS estimated by the tentative SVM.

Step 6: If the fitness is acceptable, the training procedure of the SVM ends and the best SVM is determined. Otherwise, produce new particles of PSO as shown in Eqs. (14) and (15).

Step 7: If all new individuals of the population have been generated in the PSO algorithm, return to Step 4. Otherwise, return to Step 6.
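The steps above can be compressed into the following sketch: PSO searches the pair (C, σ) of an RBF-kernel SVR, with the fitness of Eq. (19) evaluated as the worst relative error on a testing sample set. The synthetic data stand in for FOS values obtained from numerical analysis, and the search ranges, data, and PSO settings are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)

# Step 1: tentative random variables -> FOS samples (synthetic stand-in for
# limit-equilibrium runs), split into training and testing sample sets
X = rng.uniform(0, 1, (60, 2))
y = 1.0 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + 0.05 * np.sin(5 * X[:, 0])
X_tr, y_tr, X_te, y_te = X[:40], y[:40], X[40:], y[40:]

def fitness(params):               # Eq. (19): Max(|y_i - y'_i| / y_i)
    C, sigma = params
    model = SVR(kernel="rbf", C=C, gamma=1.0 / (2 * sigma**2),
                epsilon=0.001).fit(X_tr, y_tr)
    return np.max(np.abs(y_te - model.predict(X_te)) / y_te)

# Steps 2-7: a small PSO over (C, sigma) within assumed ranges
s, w, c1, c2 = 10, 0.7, 2.0, 2.0
lo, hi = np.array([1e-2, 1e-2]), np.array([1e4, 1e4])
x = rng.uniform(lo, hi, (s, 2))
v = np.zeros((s, 2))
p, p_fit = x.copy(), np.array([fitness(xi) for xi in x])
g = p[np.argmin(p_fit)]

for _ in range(15):
    r1, r2 = rng.random((s, 2)), rng.random((s, 2))
    v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)   # Eq. (14)
    x = np.clip(x + v, lo, hi)                          # Eq. (15), kept in range
    for i in range(s):                                  # Steps 4-6 per particle
        fi = fitness(x[i])
        if fi < p_fit[i]:
            p[i], p_fit[i] = x[i], fi
    g = p[np.argmin(p_fit)]

print(g, p_fit.min())   # best (C, sigma) found and its fitness
```

Each fitness evaluation retrains an SVR, which mirrors Step 4's per-individual SMO solve; in practice the kernel type could also be encoded as a search dimension, as Step 3 describes.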

2.3.3 SVM-based MCS for slope reliability analysis

The slope stability analysis required by the MCS is performed by a support vector machine prediction of the slope FOS. Using this method, the efficiency of slope reliability analysis using MCS is greatly enhanced. The basic concept of the algorithm is shown in Fig. 1.

Fig. 1
figure 1

Flow chart of slope reliability analysis based on updated SVM and MCS

The algorithm of the proposed method consists of three main parts: the first prepares the parameters and learning samples via numerical analysis; the second is the PSO-based SVM algorithm presented above; and the third uses the updated SVM to approximate the performance function of slope reliability analysis, enhancing the efficiency and extending the feasible problem scale of MCS.
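The overall integration can be sketched as follows: a small number of "expensive" FOS analyses train an SVR surrogate, which then replaces the stability analysis inside a large Monte Carlo loop. The FOS function, sample sizes, distributions, and SVR settings below are illustrative placeholders, not the values used in this study.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(4)

def fos(c, phi):                 # stand-in for a numerical FOS analysis
    return 0.03 * c + 0.02 * phi

# Small training set from the "expensive" analysis (e.g., orthogonal design)
c_tr = rng.uniform(10, 30, 30)
phi_tr = rng.uniform(20, 40, 30)
X_tr = np.column_stack([c_tr, phi_tr])
model = SVR(kernel="rbf", C=1e3, epsilon=1e-3).fit(X_tr, fos(c_tr, phi_tr))

# Large MCS run entirely through the cheap surrogate
n = 50_000
Xs = np.column_stack([rng.normal(20, 4, n), rng.normal(30, 3, n)])
F = model.predict(Xs)                      # surrogate FOS predictions
beta = (F.mean() - 1.0) / F.std(ddof=1)    # Eq. (4)
p_f = np.mean(F - 1.0 <= 0)                # Eq. (7)
print(beta, p_f)
```

Only 30 calls to the expensive analysis are needed here, versus 50,000 for plain MCS, which is the efficiency gain the proposed method relies on.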

3 Examples and results

In order to verify the proposed method of slope reliability analysis, it was applied to a classic slope consisting of three different soil layers (Chen 2003). The cross-section of the slope is shown in Fig. 2, and the parameters of the soil are listed in Table 1. First, learning samples for the updated SVM were built using the orthogonal design method, and the slope FOS of every sample was calculated using the Bishop and Spencer methods, respectively (see Table 2). The position of the potential critical slip surface was searched using the Spencer method, as shown in Fig. 2. Then, the parameters of the SVM were optimally selected by PSO, with the relevant parameters of PSO set as follows:

Fig. 2
figure 2

The cross-section of the classic slope

Table 1 The parameters of soil and standard deviation of random variables (for each soil layer: γ = 19.5 kN/m3, E = 1.0E4, ν = 0.25, K 0 = 0.65)
Table 2 Learning samples

The population size is 50, c 1 = c 2 = 2, and the inertia weight w is initialized at 1 and then linearly decreased to 0.4. The search range of the SVM parameters is from 0 to 10,000.

Based on the algorithms described in the previous sections, the SVM model was finally determined through the search using PSO. The convergence process of PSO is shown in Fig. 3 and indicates that the PSO is capable of obtaining the expected results in a very time-effective manner. The support vectors are obtained accordingly and listed in Table 3, and the parameters of SVM are shown in Table 4. The performance function of slope reliability analysis was determined in this way. The comparisons of slope FOS between the predicted value by SVM model and that calculated by the slope stability analysis method (Bishop and Spencer method) are shown in Figs. 4 and 5. In addition, the mean value and standard deviation of the random variables are shown in Table 5. Then, the reliability index was determined. The results of the slope reliability analysis are shown in Table 5, and the comprehensive results of the number of MCS, reliability index, mean value and the standard deviation are listed in Table 6.

Fig. 3
figure 3

The convergence process of parameters search of SVM using PSO

Table 3 Support vector and the value of α and α*
Table 4 The learning parameters of SVM (for each numerical method, penalty factor C = 10,000.0000)
Fig. 4
figure 4

Comparison of slope FOS between calculated by Bishop method and predicted by SVM

Fig. 5
figure 5

Comparison of slope FOS between calculated by Spencer method and predicted by SVM

Table 5 Results of slope reliability analysis
Table 6 The relationship between reliability index and the numbers of MCS (for each line, SD is the same: σ 2 F  = 0.01238)

In the point estimation method (PEM) (Rosenblueth 1975), the design point was taken at the mean values. When many random variables are involved, calculating the reliability index using PEM becomes difficult; the proposed method is faster and more widely applicable. It should also be noted that the SVM model expresses the performance function of the slope very accurately, as shown in Figs. 4 and 5. The results indicate practically no difference in slope FOS between the Bishop method and the SVM, while a slight difference is observed between the Spencer method and the SVM (see Table 5). These results show that model uncertainty affects the slope reliability index.

4 Discussions

4.1 Model performance

The parameters of the SVM are acquired by PSO, and the convergence process is shown to be satisfactory (Fig. 3). The SVM model effectively represents the complex nonlinear relationship between the random variables and the slope FOS (Figs. 4, 5). The SVM model may be used to predict the FOS in slope reliability analysis with the MCS method, substantially enhancing the efficiency of MCS. The reliability index calculated by SVM-based MCS agrees well with that of the PEM, as shown in Table 5. This result indicates that the proposed method is feasible and effective. When the number of simulations is 500, the result is already satisfactory, and when the number of simulations reaches 5,000, the value of the reliability index becomes stable, as shown in Table 6. Therefore, the proposed method may be adopted for use in slope reliability analysis.

4.2 Effects of the parameter of random variables on reliability index

The coefficient of variation of the random variables influences the slope reliability index. In this paper, different coefficients of variation and distributions of the random variables are analyzed based on the Spencer and Bishop methods. The results show that the reliability index decreases as the coefficient of variation of the random variables increases, as shown in Fig. 6. In order to examine the effect of a non-normal distribution on the reliability index, the lognormal distribution is also analyzed; the mean and standard deviation of the random variables are kept the same as in the normal-distribution case, and the results are shown in Fig. 7.

Fig. 6
figure 6

Effect of variation coefficient of random variables on reliability index

Fig. 7
figure 7

Effect of lognormal distribution of random variables on reliability index

4.3 Effects of the methods of slope stability analysis on reliability index

In slope stability analysis, the slope FOS differs when different analysis methods are adopted, and the choice of analysis method has a large influence on the resulting reliability index (Fig. 6). The reliability index obtained using the Bishop method is more conservative than that obtained using the Spencer method under the normal distribution (Fig. 6). However, the reliability index remains almost the same for the Bishop and Spencer methods under the lognormal distribution (Fig. 7). The probability density functions are shown in Figs. 8 and 9. There is a large difference between the Bishop and Spencer methods under the normal distribution (Fig. 8), while they are very similar under the lognormal distribution (Fig. 9). This result shows that the selection of the distribution function, which depends on the experimental or test data, is important to the reliability analysis.

Fig. 8
figure 8

The probability density functions obtained from SVM-based MCS method with normal distribution

Fig. 9
figure 9

The probability density functions obtained from SVM-based MCS method with lognormal distribution

4.4 Effects of learning samples

In the proposed method, learning samples play an important role in model building. The range of samples relevant to experiment or test data should agree with the variable range of random variables, and the samples should cover the maximum and minimum value of soil mechanical parameters. In addition, as many learning samples as possible should be collected, in order to obtain more accurate solutions.

5 Conclusions

This paper presents a new methodology for slope reliability analysis. The approximate concepts inherent in reliability analysis, together with the time-consuming repeated slope stability analyses involved in Monte Carlo simulation, motivated the use of the support vector machine. Due to the enormous sample size and the computing time required for each Monte Carlo run, the computational effort of conventional Monte Carlo simulation in slope reliability analysis becomes excessive. The application of the support vector machine effectively removes the limitations on the problem scale and the sample size of Monte Carlo simulation, provided that the predicted slope FOS for the different simulations falls within acceptable tolerances.

SVM was successfully adopted to produce the approximate estimation of slope FOS, regardless of the size or complexity of the problem, leading to very accurate predictions. PSO was effectively verified to search the training parameters of SVM in a global space, resulting in a satisfactory SVM model and high generalization performance. The proposed methodology not only absorbs the merit of MCS in reliability analysis, but it also takes advantage of the updated SVM to enhance the computing efficiency. The results of this study indicate that the method integrating MCS and updated SVM is powerful and effective in dealing with the problem of slope reliability analysis.