1 Introduction

Embankments are important earthworks for flood prevention. An embankment failure may cause huge losses of life and property downstream. Owing to inevitable construction defects and geological deposition processes, embankment problems are fraught with various kinds of uncertainty, including the inherent spatial variability of earth materials and measurement and transformation errors (e.g., [9, 23, 24, 27, 34, 38]). Traditional deterministic methods for safety assessment of embankments cannot rationally account for these uncertainties (e.g., [3, 18, 28]).

To tackle these uncertainties, reliability and risk analysis approaches have been extensively applied in the safety assessment of embankments (e.g., [16, 20, 30, 32]). Random field theory is well suited to describing the inherent spatial variation of soil properties [40]. Many attempts have been made at reliability analysis of embankments. Table 1 summarizes the probabilistic analyses of embankment seepage and slope stability, conducted in the past 20 years, that account for the spatial variation of soil properties. As shown in Table 1, most studies merely investigated the statistics of seepage responses or seismic deformation of embankments. Few studies focused on the probability of failure of embankment slopes under unsaturated seepage. Additionally, the spatial variabilities of the saturated hydraulic conductivity and/or shear strength parameters were usually considered separately, while the uncertainty in the soil–water characteristic curve (SWCC) was generally not accounted for in the reliability analysis.

Table 1 Summary of reliability analyses of embankments considering the spatial variability of soil parameters in the past 20 years

Numerous studies indicated that the variability of SWCC had an important impact on the unsaturated seepage and stability of slopes (e.g., [31, 36, 42]). For example, Chiu et al. [7] found that the uncertainties of the fitting parameters of SWCC affected the slope stability analysis results significantly. Tan et al. [38] revealed that ignoring the spatial variability in the fitting parameters of an SWCC model would overestimate the flow rate of an earth dam. Nguyen and Likitlersuang [33] indicated that the spatial variabilities of the fitting parameters of an SWCC model were more important than that of the saturated hydraulic conductivity in the probabilistic analysis of rainfall-induced unsaturated soil slope failure. If the spatial variability of the SWCC is ignored, the embankment safety assessment will deviate from geotechnical practice (e.g., [33, 38]). It should be pointed out that if the spatial variabilities of the hydraulic and shear strength parameters are considered simultaneously in the reliability analyses, a large number of random variables is required to characterize the spatial variability. Besides, direct Monte Carlo simulation (MCS) is frequently employed for reliability analysis of embankment seepage or slope stability. Although MCS is simple and easy to implement, it requires numerous evaluations of the deterministic model, particularly for reliability problems with low probabilities of failure. As a result, the orders of magnitude of the studied probabilities of failure are relatively high (\(10^{-3}\)–\(10^{-1}\)), as shown in Table 1. How to characterize the spatial variabilities of hydraulic parameters (including the saturated hydraulic conductivity and the fitting parameters of SWCC) and shear strength parameters at the same time, and account explicitly for their effects on the embankment slope reliability at low probabilities of failure, remains an open question.

In this study, a non-intrusive approach for reliability analysis of unsaturated embankment slopes accounting for the spatial variabilities of hydraulic and shear strength parameters simultaneously is proposed. The paper is organized as follows. In Sect. 2, the proposed approach is presented in four parts, namely deterministic unsaturated seepage and slope stability analyses for embankments, discretization of cross-correlated lognormal random fields, construction of surrogate models, and estimation of probability of failure for embankment slopes. In Sect. 3, a hypothetical embankment under unsaturated seepage is investigated to evaluate the effectiveness of the proposed approach. Parametric sensitivity studies are then performed to investigate the influences of the spatial variabilities of soil parameters, including the fitting parameters of SWCC, on the reliability of embankment slopes.

2 Methodologies

2.1 Deterministic unsaturated seepage and slope stability analyses for embankments

Unsaturated seepage analysis of embankments is a complicated process since the water content and hydraulic conductivity depend strongly on the matric suction of unsaturated soils. The SWCC is usually utilized to depict the relationship between the water content and the matric suction [13]. However, the laboratory measurement of SWCC is costly and time-consuming. In this study, a widely used SWCC model, the van Genuchten model [39], is adopted, as defined by

$$S_{\text{e}} = \frac{{\theta - \theta_{\text{r}} }}{{\theta_{\text{s}} - \theta_{\text{r}} }} = \frac{1}{{[1 + (\psi /a)^{n} ]^{m} }},$$
(1)

where Se is the effective degree of saturation within the [0, 1] interval; \(\psi\) is the matric suction; \(\theta_{\text{s}}\) and \(\theta_{\text{r}}\) are the saturated and residual volumetric water contents, respectively; a, n and m are the non-negative curve fitting parameters. The physical meanings of a, n and m are explained as follows: a denotes the air-entry value of SWCC which relates with the inflection point separating the unsaturated state from the saturated state; n is related to the pore size distribution which reflects the variation rate of volumetric water content with matric suction in the initial intake stage; m is a soil parameter corresponding to the residual volumetric water content.

As indicated by Chiu et al. [7], the parameter m is usually set to be dependent on n, namely m = 1 − 1/n. In this way, only two fitting parameters (a and n) in Eq. (1) need to be determined. To ensure that the SWCC satisfies the condition of 0 ≤ Se ≤ 1.0, a lower bound of 1.05 for n is specified so that m always takes a positive value [38]. Then, the hydraulic conductivity for unsaturated soils can be estimated as [9]

$$k = k_{\text{s}} S_{\text{e}}^{1/2} \left[ {1 - \left( {1 - S_{\text{e}}^{1/m} } \right)^{m} } \right]^{2} ,$$
(2)

where ks is the saturated hydraulic conductivity. With Eqs. (1) and (2), the uncertainties of the fitting parameters (a, n) of SWCC and ks can be propagated to that of soil hydraulic conductivity of the embankment.
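As an illustration of Eqs. (1) and (2), the following minimal Python sketch evaluates the effective degree of saturation and the unsaturated hydraulic conductivity with m = 1 − 1/n. The function names and the numerical values of ks, a and n are hypothetical and are chosen only for demonstration; they are not the parameters used in this study.

```python
import numpy as np

def effective_saturation(psi, a, n):
    """Eq. (1) with m = 1 - 1/n; psi and a must share the same units."""
    m = 1.0 - 1.0 / n
    psi = np.maximum(np.asarray(psi, dtype=float), 0.0)  # Se = 1 at zero suction
    return 1.0 / (1.0 + (psi / a) ** n) ** m

def unsaturated_conductivity(psi, ks, a, n):
    """Eq. (2): k = ks * Se^(1/2) * [1 - (1 - Se^(1/m))^m]^2."""
    m = 1.0 - 1.0 / n
    se = effective_saturation(psi, a, n)
    return ks * np.sqrt(se) * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2

# Hypothetical values (assumptions for illustration only)
ks, a, n = 1.0e-6, 10.0, 1.5                  # m/s, kPa, dimensionless
suction = np.array([0.0, 5.0, 10.0, 50.0, 100.0])  # kPa

se = effective_saturation(suction, a, n)
k = unsaturated_conductivity(suction, ks, a, n)
for s, se_i, k_i in zip(suction, se, k):
    print(f"psi = {s:6.1f} kPa  Se = {se_i:.4f}  k = {k_i:.3e} m/s")
```

Both Se and k decrease monotonically with suction, and k reduces to ks at zero suction, which is how uncertainty in (a, n, ks) carries over to the conductivity field.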

Once the soil hydraulic conductivity is determined, the distribution of pore water pressure and volumetric water content within the embankment can be evaluated through a finite element seepage analysis. The results can be further utilized to calculate the factor of safety for embankment slopes by employing Bishop’s simplified method (e.g., [16]):

$${\text{FS}} = \frac{{\sum\limits_{i = 1}^{{n_{\text{s}} }} {\frac{1}{{m_{\alpha } }}\left[ {c^{\prime}_{i} b_{i} + \left( {W_{i} - u_{\text{a}} b_{i} } \right)\tan \phi^{\prime}_{i} + \left( {u_{\text{a}} - u_{\text{w}} } \right)_{i} \tan \phi^{b} b_{i} } \right]} }}{{\sum\limits_{i = 1}^{{n_{\text{s}} }} {\left( {W_{i} \sin \alpha_{i} } \right)} }},$$
(3)
$$m_{\alpha } = \left[ {1 + \frac{{\tan \phi^{\prime}_{i} \tan \alpha_{i} }}{{\text{FS}}}} \right]\cos \alpha_{i} ,$$
(4)

where FS is the factor of safety; ns is the number of soil slices; \(c^{\prime}_{i}\) and \(\phi^{\prime}_{i}\) are the effective cohesion and effective friction angle over the base of slice i, respectively; \(\alpha_{i}\) is the angle between the base of slice i and the horizontal; Wi is the weight of slice i, which equals \(W_{i} = \gamma_{i} h_{i} b_{i}\), wherein \(\gamma_{i}\), hi and bi are the average unit weight, height and width over slice i, respectively; (ua − uw)i is the matric suction at the base of slice i, in which ua and uw are the air pressure and pore-water pressure, respectively; \(\phi^{b}\) is the angle of shearing resistance with respect to matric suction, which is about one-third to two-thirds of \(\phi^{\prime}\) based on the statistical analysis results in Fredlund and Rahardjo [13]. Note that the FS is closely related with the hydraulic parameters (ks, a, n) and shear strength parameters (\(c^{\prime}\) and \(\phi^{\prime}\)).
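Because FS appears on both sides of Eqs. (3) and (4), it has to be found iteratively. The sketch below uses a plain fixed-point iteration as one possible way to do this; it is not the SLOPE/W implementation, and the slice data are hypothetical values chosen for illustration, with \(\phi^{b}\) assumed to be two-thirds of \(\phi^{\prime}\).

```python
import numpy as np

def bishop_fs(c, phi, phi_b, W, b, alpha, suction, u_a=0.0, tol=1e-8, max_iter=200):
    """Fixed-point iteration on Eqs. (3)-(4).

    All slice quantities are 1-D arrays of equal length; angles in radians.
    suction is (u_a - u_w) at the base of each slice.
    """
    resisting = c * b + (W - u_a * b) * np.tan(phi) + suction * np.tan(phi_b) * b
    driving = np.sum(W * np.sin(alpha))
    fs = 1.0  # initial guess
    for _ in range(max_iter):
        m_alpha = (1.0 + np.tan(phi) * np.tan(alpha) / fs) * np.cos(alpha)  # Eq. (4)
        fs_new = np.sum(resisting / m_alpha) / driving                      # Eq. (3)
        if abs(fs_new - fs) < tol:
            return fs_new
        fs = fs_new
    raise RuntimeError("Bishop iteration did not converge")

# Hypothetical five-slice example (assumed values, not the embankment in Sect. 3)
alpha = np.radians([-5.0, 5.0, 15.0, 25.0, 35.0])   # base inclinations
W = np.array([30.0, 80.0, 110.0, 90.0, 40.0])       # slice weights, kN/m
b = np.full(5, 2.0)                                  # slice widths, m
c = np.full(5, 8.0)                                  # effective cohesion, kPa
phi = np.radians(np.full(5, 24.0))                   # effective friction angle
phi_b = 2.0 * phi / 3.0                              # assumed two-thirds of phi'
suction = np.array([20.0, 15.0, 10.0, 5.0, 0.0])     # matric suction, kPa

fs = bishop_fs(c, phi, phi_b, W, b, alpha, suction)
print(f"FS = {fs:.3f}")
```

In the random field setting, c, phi and suction would simply be replaced by the slice-wise values taken from one realization.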

2.2 Discretization of cross-correlated lognormal random fields

It is widely accepted that many soil parameters such as the effective cohesion and effective friction angle are negatively cross-correlated in geotechnical practice (e.g., [12]). This means the uncertainty in the calculated shear strength will be smaller than the total uncertainty in the effective cohesion and effective friction angle [8]. Besides, as stated in the literature (e.g., [4, 33, 36]), the fitting parameter a of SWCC is also negatively correlated with the fitting parameter n. A small value of a is typically related to a large value of n and vice versa. Additionally, lognormal distribution is commonly used for describing the probability distribution of these soil parameters which cannot take negative values (e.g., [2]). Thus, the effective cohesion, effective friction angle and fitting parameters (a, n) of SWCC are regarded to obey the lognormal distributions with means of \(\mu_{{X_{i} }}\) and standard deviations of \(\sigma_{{X_{i} }}\), respectively. Cross-correlated lognormal random fields \(\underline{{X_{i} }}\) (e.g., \(X_{i}\) = ks1, a, n, \(c^{\prime}_{1}\) and \(\phi^{\prime}_{1}\)) will be involved for modeling the spatial variation of earthen material parameters.

For the discretization of the cross-correlated lognormal random fields \(\underline{{X_{i} }}\), Karhunen–Loève (KL) expansion is utilized since it requires the minimum number of random variables for a given level of accuracy (e.g., [22, 25, 35, 41]). The series expansion of the underlying independent Gaussian random fields \(\ln \underline{{X_{i} }}\) can be obtained by KL expansion method as follows:

$$\ln \underline{{X_{i} }} (x,y) = \mu_{{\ln X_{i} }} + \sum\limits_{j = 1}^{M} {\sigma_{{\ln X_{i} }} \sqrt {\lambda {}_{j}} f_{j} (x,y)\xi_{{X_{i} ,j}} } ,$$
(5)

where \(\ln \underline{{X_{i} }} (x,y)\) are the realizations of the random fields \(\ln \underline{{X_{i} }}\) at a given location \((x,y)\); \(\mu_{{\ln \,X_{i} }}\) and \(\sigma_{{\ln \,X_{i} }}\) are the mean and standard deviation of \(\ln \underline{{X_{i} }}\), respectively, which are given by

$$\left\{ \begin{aligned} \mu_{{\ln X_{i} }} &= \ln \mu_{{X_{i} }} - \frac{{\sigma_{{\ln X_{i} }}^{2} }}{2} \\ \sigma_{{\ln X_{i} }} &= \sqrt {\ln \left( {1 + {\text{COV}}_{{X_{i} }}^{2} } \right)} \end{aligned} \right.,$$
(6)

where \({\text{COV}}_{{X_{i} }}\) is the coefficient of variation (COV) of the original random field \(\underline{{X_{i} }}\), \({\text{COV}}_{{X_{i} }} = \sigma_{{X_{i} }} /\mu_{{X_{i} }}\). \(f_{j}(\cdot)\) and \(\lambda_{j}\) in Eq. (5) are the eigenfunctions and eigenvalues of the autocorrelation function, respectively. Specifically, \(f_{j}(\cdot)\) and \(\lambda_{j}\) can be determined by solving the homogeneous Fredholm integral equation of the second kind over a bounded domain \(\varOmega\), \(\varOmega\) = {0 ≤ x ≤ Lx; 0 ≤ y ≤ Ly} [35]:

$$\int_{\varOmega } {\rho_{\ln } \left[ {(x_{1} ,y_{1} ),(x_{2} ,y_{2} )} \right]} \, f_{j} \left( {x_{2} ,y_{2} } \right){\text{d}}x_{2} {\text{d}}y_{2} = \lambda {}_{j}f_{j} \left( {x_{1} ,y_{1} } \right),$$
(7)

where (x1, y1) and (x2, y2) are two arbitrary locations in a two-dimensional (2D) space; \(\rho_{\ln } \left[ {(x_{1} ,y_{1} ),(x_{2} ,y_{2} )} \right]\) is a bounded, symmetric and positive definite autocorrelation function. A 2D squared exponential autocorrelation function is utilized for modeling the spatial correlation of soil parameters because it can produce smoother random field realizations (e.g., [22, 40]). The autocorrelation between two arbitrary locations (x1, y1) and (x2, y2) in the underlying Gaussian space is given by

$$\rho_{\ln } \left[ {(x_{1} ,y_{1} ),(x_{2} ,y_{2} )} \right] = \exp \left( { - \left[ {\left( {\frac{{\left| {x_{1} - x_{2} } \right|}}{{\theta_{\ln ,h} }}} \right)^{2} + \left( {\frac{{\left| {y_{1} - y_{2} } \right|}}{{\theta_{\ln ,\,v} }}} \right)^{2} } \right]} \right),$$
(8)

where \(\theta_{\ln ,\,h}\) and \(\theta_{\ln ,\,v}\) are the horizontal and vertical autocorrelation distances for the underlying normal random fields \(\ln \underline{{X_{i} }}\), respectively. \(\xi_{{X_{i} ,j}}\) in Eq. (5) is an independent standard normal random variable, in which j = 1, 2, …, M, and M is the number of KL expansion terms to be retained (i.e., number of eigenmodes). Following Laloy et al. [25] and Jiang et al. [22], the criterion whether the ratio of the expected energy, \(\varepsilon\), approaches 1.0 is adopted to determine the value of M. The \(\varepsilon\) is evaluated as [25]

$$\varepsilon = \frac{{\sum\limits_{j = 1}^{M} {\lambda_{j} } }}{{\sum\limits_{j = 1}^{\infty } {\lambda_{j} } }} = \frac{{\sum\limits_{j = 1}^{M} {\lambda_{j} } }}{{L_{x} L_{y} }},$$
(9)

where the eigenvalues \(\lambda_{j}\) are sorted in a descending order. Note that a large \(\varepsilon\) will result in a high accuracy of the truncated series.
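To make Eqs. (7)–(9) concrete, the sketch below approximates the eigenpairs of the squared exponential autocorrelation function on a regular grid of cell midpoints and evaluates the ratio \(\varepsilon\) as a function of M. This simple midpoint-rule discretization is an assumption made for illustration and is not the wavelet-Galerkin scheme cited in this study; the rectangular domain size (Lx = 60 m, Ly = 10 m) and the grid resolution are likewise hypothetical, so the printed ratios need not reproduce those reported in Sect. 3.2.

```python
import numpy as np

def kl_eigen(Lx, Ly, theta_h, theta_v, nx=40, ny=20):
    """Approximate eigenpairs of Eq. (7) on a grid of cell midpoints."""
    dx, dy = Lx / nx, Ly / ny
    x = (np.arange(nx) + 0.5) * dx
    y = (np.arange(ny) + 0.5) * dy
    X, Y = np.meshgrid(x, y, indexing="ij")
    px, py = X.ravel(), Y.ravel()
    # Squared exponential autocorrelation, Eq. (8)
    rho = np.exp(-(((px[:, None] - px[None, :]) / theta_h) ** 2
                   + ((py[:, None] - py[None, :]) / theta_v) ** 2))
    w = dx * dy                              # quadrature weight of each cell
    lam, vec = np.linalg.eigh(rho * w)       # symmetric eigenproblem
    order = np.argsort(lam)[::-1]            # sort in descending order
    lam = np.clip(lam[order], 0.0, None)     # remove tiny negative round-off
    f = vec[:, order] / np.sqrt(w)           # normalize so that sum(f^2) * w = 1
    return lam, f

def energy_ratio(lam, Lx, Ly):
    """Eq. (9): cumulative sum of eigenvalues divided by the domain area."""
    return np.cumsum(lam) / (Lx * Ly)

Lx, Ly = 60.0, 10.0   # hypothetical rectangular domain
results = {}
for theta_h, theta_v in [(20.0, 2.0), (30.0, 3.0), (40.0, 4.0)]:
    lam, _ = kl_eigen(Lx, Ly, theta_h, theta_v)
    results[(theta_h, theta_v)] = energy_ratio(lam, Lx, Ly)
    print(f"theta = ({theta_h:.0f}, {theta_v:.0f}) m: "
          f"epsilon(M = 10) = {results[(theta_h, theta_v)][9]:.3f}")
```

Because the correlation matrix has a unit diagonal, the discrete eigenvalues sum to the domain area, which mirrors the denominator in Eq. (9).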

For illustrative purpose, the discretization of the cross-correlated lognormal random fields of two hydraulic parameters [i.e., a and n in Eq. (1)] is briefly explained as below:

  1.

    The fj(.) and \(\lambda {}_{j}\) in Eq. (5) are numerically solved using a wavelet-Galerkin scheme [35].

  2.

    An independent standard normal random sample matrix \(\underline{\varvec{\xi}}\) with a dimension of (Nf× M) × Np is generated by adopting the Latin hypercube sampling (LHS) technique (e.g., [22]). Nf is the number of random fields, Nf = 2 herein; Np is the sampling size that equals the number of deterministic model evaluations. Each of the Np columns of the matrix \(\underline{\varvec{\xi}}\) is partitioned into Nf vectors with a dimension M.

    The kth realization of each of the independent Gaussian random fields \(\ln \underline{{X_{i} }}\) can be generated by treating the k-th column of the matrix \(\underline{\varvec{\xi}}\) as the basis, \(\underline{\varvec{\xi} }^{k} = \left\{ {\varvec{\xi}_{a}^{k} = \left( {\varvec{\xi}_{a,1}^{k} ,\xi_{a,2}^{k} , \ldots ,\xi_{a,M}^{k} } \right),\varvec{\xi}_{n}^{k} = \left( {\xi_{n,1}^{k} ,\xi_{n,2}^{k} , \ldots ,\xi_{n,M}^{k} } \right)} \right\}^{T}\), k = 1, 2, …, Np:

    $$\ln \underline{{X_{i} }}^{k} (x,y) = \mu_{{\ln X_{i} }} + \sum\limits_{j = 1}^{M} {\sigma_{{\ln X_{i} }} \sqrt {\lambda {}_{j}} f_{j} (x,y)\xi_{{_{{X_{i} ,j}} }}^{k} } \, \left( {{\text{for }}X_{i} = a,n} \right).$$
    (10)
  3.

    A cross-correlation coefficient matrix is constructed as \(\varvec{R} = \left[ {\begin{array}{*{20}c} {1.0} & {\rho_{\ln a,\ln n} } \\ {\rho_{\ln a,\ln n} } & {1.0} \\ \end{array} } \right]\), in which \(\rho_{\ln a,\ln n}\) is the cross-correlation coefficient between \(\ln a\) and \(\ln n\). A lower triangular matrix L is then estimated by factoring the matrix R using standard Cholesky decomposition algorithm. Then, the k-th realization of each of the cross-correlated Gaussian random fields \(\ln \underline{{X_{i}^{D} }}\) can be obtained as follows:

    $$\ln \underline{{X_{i} }}^{D,k} (x,y) = \varvec{L}\ln \underline{{X_{i} }}^{k} (x,y) = \left[ {\ln \underline{a}^{k} (x,y),\ln \underline{a}^{k} (x,y)\rho_{{\text{ln}\,a,\text{ln}\,n}} + \ln \underline{n}^{k} (x,y)\sqrt {1 - \rho_{{\text{ln}\,a,\text{ln}\,n}}^{2} } } \right].$$
    (11)
  4.

    The k-th realization of each of the cross-correlated lognormal random fields \(\underline{{X_{i} }}\) can be obtained as follows:

    $$\underline{{X_{i} }}^{D,k} (x,y) = \exp \left[ {\ln \underline{{X_{i} }}^{D,k} (x,y)} \right] \, \left( {{\text{for }}X_{i} = a,n} \right).$$
    (12)
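The four steps above can be sketched in Python as follows. Several simplifications are assumed: the eigenpairs come from a midpoint-rule discretization on a hypothetical 60 m × 10 m rectangle rather than a wavelet-Galerkin scheme, the LHS routine is hand-written, and the means and COVs of a and n are hypothetical. The Cholesky factor is applied here to the zero-mean Gaussian fields before the mean and standard deviation of Eq. (6) are introduced, which is one reading of Eq. (11) that keeps the marginal statistics unchanged.

```python
import numpy as np
from statistics import NormalDist

def lognormal_params(mean, cov):
    """Eq. (6): mean and standard deviation of ln X."""
    s = np.sqrt(np.log(1.0 + cov ** 2))
    return np.log(mean) - 0.5 * s ** 2, s

def kl_modes(px, py, cell_area, theta_h, theta_v, M):
    """Step 1 (simplified): leading M eigenpairs of Eq. (7) on a point grid."""
    rho = np.exp(-(((px[:, None] - px[None, :]) / theta_h) ** 2
                   + ((py[:, None] - py[None, :]) / theta_v) ** 2))
    lam, vec = np.linalg.eigh(rho * cell_area)
    idx = np.argsort(lam)[::-1][:M]
    return lam[idx], vec[:, idx] / np.sqrt(cell_area)

def lhs_standard_normal(n_dim, n_samples, rng):
    """Step 2: (n_dim x n_samples) Latin hypercube sample mapped to N(0, 1)."""
    strata = np.array([rng.permutation(n_samples) for _ in range(n_dim)])
    u = (strata + rng.random((n_dim, n_samples))) / n_samples
    u = np.clip(u, 1e-12, 1.0 - 1e-12)
    return np.vectorize(NormalDist().inv_cdf)(u)

rng = np.random.default_rng(0)
M, Np, rho_an = 10, 1000, -0.25          # M and rho follow the text; Np is illustrative
nx, ny, Lx, Ly = 20, 10, 60.0, 10.0      # hypothetical grid and domain
X, Y = np.meshgrid((np.arange(nx) + 0.5) * Lx / nx,
                   (np.arange(ny) + 0.5) * Ly / ny, indexing="ij")
px, py = X.ravel(), Y.ravel()

# Step 1: eigenvalues and eigenfunctions
lam, f = kl_modes(px, py, (Lx / nx) * (Ly / ny), 30.0, 3.0, M)

# Step 2: sample matrix of size (Nf * M) x Np with Nf = 2, split per field
xi = lhs_standard_normal(2 * M, Np, rng)
xi_a, xi_n = xi[:M], xi[M:]
modes = f * np.sqrt(lam)                  # sqrt(lambda_j) * f_j at every point
g_a, g_n = modes @ xi_a, modes @ xi_n     # independent zero-mean Gaussian fields

# Step 3: impose the cross-correlation through the Cholesky factor of R
L = np.linalg.cholesky(np.array([[1.0, rho_an], [rho_an, 1.0]]))
g_a_d = L[0, 0] * g_a
g_n_d = L[1, 0] * g_a + L[1, 1] * g_n

# Step 4: back-transform to lognormal fields (hypothetical means and COVs)
mu_a, s_a = lognormal_params(mean=10.0, cov=0.3)
mu_n, s_n = lognormal_params(mean=2.0, cov=0.1)
a_fields = np.exp(mu_a + s_a * g_a_d)     # shape (n_points, Np)
n_fields = np.exp(mu_n + s_n * g_n_d)

r = np.corrcoef(np.log(a_fields[100]), np.log(n_fields[100]))[0, 1]
print(f"sample correlation of ln a and ln n at one point: {r:.3f}")
```

Each column of a_fields and n_fields is one realization that could be mapped onto the random field mesh of the embankment.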

2.3 Construction of surrogate models

Once the realizations of random fields of the hydraulic parameters (ks1, a, n) and shear strength parameters (\(c^{\prime}_{1}\), \(\phi^{\prime}_{1}\)) are generated, they are assigned to the embankment model. Finite element analysis of unsaturated seepage for the embankment is first undertaken to obtain the distribution of pore water pressure and volumetric water content. The pore water pressure uw,i at the i-th position within the embankment can be expressed as the function of the discretized random variables of ks1, a and n:

$$u_{{{\text{w}},i}} = f_{1} \left( {k_{{{\text{s}}1,1}} ,k_{{{\text{s}}1,2}} , \ldots ,k_{{{\text{s}}1,N_{\text{e}} }} ,a_{1} ,a_{2} , \ldots ,a_{{N_{\text{e}} }} ,n_{1} ,n_{2} , \ldots ,n_{{N_{\text{e}} }} } \right),$$
(13)

where i = 1, 2, …, Ne, in which Ne is the number of discretized random field elements. Based on the pore water pressures and the realizations of shear strength parameters (\(c^{\prime}_{1}\), \(\phi^{\prime}_{1}\)) at different positions within the embankment, slope stability analyses are then performed to calculate the factors of safety of the embankment slopes using the method in Sect. 2.1. The factors of safety can be expressed as the function of the pore water pressures at different positions and the discretized random variables of \(c^{\prime}_{1}\) and \(\phi^{\prime}_{1}\):

$${\text{FS}} = f_{2} \left( {u_{{{\text{w}},1}} ,u_{{{\text{w}},2}} , \ldots ,u_{{{\text{w}},N_{\text{e}} }} ,c^{\prime}_{1,1} ,c^{\prime}_{1,2} , \ldots ,c^{\prime}_{{1,N_{\text{e}} }} ,\phi^{\prime}_{1,1} ,\phi^{\prime}_{1,2} , \ldots ,\phi^{\prime}_{{1,N_{\text{e}} }} } \right).$$
(14)

By integrating Eqs. (13) and (14), the factors of safety can be further expressed as

$${\text{FS}} = f\left( {k_{\text{s1,1}} ,k_{\text{s1,2}} , \ldots ,k_{{{\text{s}}1,N_{\text{e}} }} ,a_{1} ,a_{2} , \ldots ,a_{{N_{\text{e}} }} ,n_{1} ,n_{2} , \ldots ,n_{{N_{\text{e}} }} ,c^{\prime}_{1,1} ,c^{\prime}_{1,2} , \ldots ,c^{\prime}_{{1,N_{\text{e}} }} ,\phi^{\prime}_{1,1} ,\phi^{\prime}_{1,2} , \ldots ,\phi^{\prime}_{{1,N_{\text{e}} }} } \right).$$
(15)

The spatial correlated ks1, cross-correlated cohesion and friction angle, and cross-correlated a and n in Eq. (15) can be further expressed as the functions of the underlying independent standard normal random variables via the discretization of random fields using the method in Sect. 2.2.

Obviously, the output responses of embankments including the pore water pressures in Eq. (13) and the factor of safety in Eq. (14) cannot be explicitly expressed as the functions of input parameters because analytical solutions of these output responses generally do not exist. One has to resort to finite element method to evaluate these responses. However, the reliability analyses usually require the computation of these responses for many sets of inputs, which is quite time-consuming. To reduce the computational cost, the polynomial chaos expansion, Kriging, support vector machine and Gaussian process regression are often adopted to construct the surrogate models of these responses (e.g., [17, 37]). Because this study is focused on the reliability analysis of unsaturated embankment slopes, a Hermite polynomial chaos expansion (HPCE) is utilized to establish the surrogate model between the factor of safety for embankment downstream slope and the independent standard normal random variables as follows (e.g., [22]):

$${\text{FS}}\left( {\underline{\xi } } \right) = a_{0} \varGamma_{0} + \sum\limits_{{i_{1} = 1}}^{N} {a_{{i_{1} }} \varGamma_{1} \left( {\xi_{{i_{1} }} } \right)} + \sum\limits_{{i_{1} = 1}}^{N} {\sum\limits_{{i_{2} = 1}}^{{i_{1} }} {a_{{i_{1} i_{2} }} \varGamma_{2} \left( {\xi_{{i_{1} }} ,\xi_{{i_{2} }} } \right)} } + \sum\limits_{{i_{1} = 1}}^{N} {\sum\limits_{{i_{2} = 1}}^{{i_{1} }} {\sum\limits_{{i_{3} = 1}}^{{i_{2} }} {a_{{i_{1} i_{2} i_{3} }} \varGamma_{3} \left( {\xi_{{i_{1} }} ,\xi_{{i_{2} }} ,\xi_{{i_{3} }} } \right)} } } + \cdots ,$$
(16)

where N is the number of all involved random variables; \(a_{0} ,a_{{i_{1} }} ,a_{{i_{1} i_{2} }} ,a_{{i_{1} i_{2} i_{3} }} , \ldots\) are the unknown coefficients which need to be determined; \(\underline{\varvec{\xi}} = \left( {\xi_{1} ,\xi_{2} , \ldots ,\xi_{N} } \right)\) is the vector of independent standard normal random variables, which corresponds to the random variables \(\xi_{{X_{i} ,j}}\) in Eq. (5). \(\varGamma_{p} (.)\) are the Hermite polynomials of degree p. For the pth order HPCE, the number \(N_{\text{c}}\) of unknown coefficients is determined by [22]:

$$N_{\text{c}} = \frac{(N + p)!}{N!p!}.$$
(17)

Then, the unknown coefficients in the HPCE are evaluated by solving a system of linear equations given by Eq. (16), the left-hand side of which consists of the factors of safety obtained via deterministic embankment slope stability analyses. After that, the surrogate model for the factor of safety is constructed. Note that the finite element analysis of unsaturated seepage, which is carried out in SEEP/W, does not use the surrogate model in this study. The obtained unsaturated seepage results are directly imported into the slope stability analysis via a coupled analysis between the SEEP/W and SLOPE/W modules [14, 15].
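As a small illustration of Eqs. (16) and (17), the sketch below counts the coefficients of a p-th order expansion and fits a 2nd-order HPCE by ordinary least squares. The least-squares solver and the toy response function, which stands in for a coupled SEEP/W and SLOPE/W run, are assumptions made for demonstration; the helper names are hypothetical.

```python
import numpy as np
from math import comb
from itertools import combinations

def n_coefficients(N, p):
    """Eq. (17): number of unknown coefficients of a p-th order HPCE."""
    return comb(N + p, p)

def hermite2_design(xi):
    """2nd-order Hermite basis for xi of shape (n_samples, N).

    Columns: 1, xi_i, xi_i^2 - 1, and xi_i * xi_j for i < j.
    """
    n, N = xi.shape
    cols = [np.ones(n)]
    cols += [xi[:, i] for i in range(N)]
    cols += [xi[:, i] ** 2 - 1.0 for i in range(N)]
    cols += [xi[:, i] * xi[:, j] for i, j in combinations(range(N), 2)]
    return np.column_stack(cols)

def toy_model(xi):
    """Hypothetical stand-in for the deterministic factor-of-safety analysis."""
    return (1.5 + 0.2 * xi[:, 0] - 0.1 * xi[:, 1]
            + 0.05 * xi[:, 0] * xi[:, 1] + 0.03 * (xi[:, 2] ** 2 - 1.0))

print("N = 53, p = 2 ->", n_coefficients(53, 2), "coefficients")

rng = np.random.default_rng(0)
N, n_samples = 4, 60                       # small illustrative problem
xi_train = rng.standard_normal((n_samples, N))
fs_train = toy_model(xi_train)             # "deterministic" responses

design = hermite2_design(xi_train)
coef, *_ = np.linalg.lstsq(design, fs_train, rcond=None)

xi_test = rng.standard_normal((100, N))
max_err = np.max(np.abs(hermite2_design(xi_test) @ coef - toy_model(xi_test)))
print(f"a0 = {coef[0]:.3f}, max validation error = {max_err:.2e}")
```

Using more samples than unknown coefficients, as in this sketch, keeps the linear system overdetermined and the fit stable.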

2.4 Estimation of probability of failure for embankment slopes

Having obtained the HPCE-based surrogate model of factor of safety, the limit state function for the reliability analysis of embankment slope stability can be expressed as

$$G\left( {\underline{\xi } } \right) = {\text{FS}}\left( {\underline{\xi } } \right) - 1.0,$$
(18)

where \({\text{FS}}\left( {\underline{\varvec{\xi}} } \right)\) is the factor of safety explicitly expressed using Eq. (16). For the explicit limit state function in Eq. (18), the direct MCS can be readily used to estimate the probability of failure for embankment slopes even though a lot of random variables are involved:

$$p_{\text{f}} = \frac{1}{{N_{\text{MCS}} }}\sum\limits_{k = 1}^{{N_{\text{MCS}} }} {I\left[ {G\left( {\underline{\xi }^{k} } \right)} \right]} ,$$
(19)

where NMCS is the number of the samples; \(\underline{\varvec{\xi}}^{k}\) is the kth independent standard normal random sample; I(.) is the indicative function of the failure region, which is defined by:

$$I\left[ {G\left( {\underline{\xi }^{k} } \right)} \right] = \begin{cases} 1, & G\left( {\underline{\xi }^{k} } \right) \le 0 \\ 0, & G\left( {\underline{\xi }^{k} } \right) > 0 \end{cases}.$$
(20)

Note that estimating the probability of failure based on the surrogate models does not require performing deterministic analyses again. It only requires evaluating the algebraic expressions in Eq. (16), which is much more computationally efficient.
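Equations (18)–(20) amount to counting how often the surrogate predicts FS ≤ 1.0. The sketch below does this in batches for a hypothetical linear surrogate, FS = 1.5 + 0.2ξ1, chosen only because its exact probability of failure, Φ(−2.5), is known and can be compared with the estimate. It is not the surrogate built in this study.

```python
import numpy as np
from statistics import NormalDist

def failure_probability(surrogate, n_dim, n_mcs, rng, batch=100_000):
    """Direct MCS on an explicit surrogate, Eqs. (18)-(20)."""
    n_fail, done = 0, 0
    while done < n_mcs:
        size = min(batch, n_mcs - done)
        xi = rng.standard_normal((size, n_dim))   # independent N(0, 1) samples
        g = surrogate(xi) - 1.0                    # limit state function, Eq. (18)
        n_fail += np.count_nonzero(g <= 0.0)       # indicator function, Eq. (20)
        done += size
    return n_fail / n_mcs                          # Eq. (19)

def toy_surrogate(xi):
    """Hypothetical surrogate of the factor of safety (assumption)."""
    return 1.5 + 0.2 * xi[:, 0]

rng = np.random.default_rng(1)
pf = failure_probability(toy_surrogate, n_dim=3, n_mcs=1_000_000, rng=rng)
pf_exact = NormalDist().cdf(-2.5)                  # exact value for the toy case
print(f"pf (MCS) = {pf:.3e}, pf (exact) = {pf_exact:.3e}")
```

Processing the samples in batches keeps memory use modest even when many random variables are involved.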

3 Illustrative example

A hypothetical embankment which was studied by Tan et al. [38] is investigated to illustrate the proposed approach. As shown in Fig. 1, the embankment with a height of 10 m is constructed on a foundation with a thickness of 8 m. The embankment upstream and downstream slopes are 3H:1V and 2.5H:1V, respectively. The width of the embankment top is 4 m. The upstream and downstream water levels are H = 8.0 m and h = 1.0 m above the foundation, respectively. The unit weights of the embankment and foundation are 15.4 and 15.5 kN/m³, respectively.

Fig. 1 Illustration of geometry and boundary conditions for embankment

The SEEP/W module is utilized to conduct finite element analysis of unsaturated seepage under rainfall infiltration. The studied domain is discretized into 4999 3-node triangular and 4-node quadrilateral finite elements with a total of 4786 nodes. Steady-state seepage through the soil body is considered. The boundary conditions are set as follows: (1) a potential seepage face with zero flux is imposed at the boundary (C–D–E–F) above the water levels; (2) the hydraulic heads on the two sides (A–B–C, F–G–H) are constant and equal to 16 and 9 m, respectively; (3) a constant vertical infiltration flux q = 1.0 × \(10^{-7}\) m/s (0.36 mm/h) is applied to the embankment surface (C–D–E–F), which simulates a long rainfall period with an average annual rainfall condition before the rainy season. Note that a zero flux and a rainfall flux q are applied to the embankment surface (C–D–E–F) at the same time. This can be done in SEEP/W by defining a constant vertical infiltration flux q at the surface (C–D–E–F) while ticking the option “Potential Seepage Face Review”.

3.1 Statistical information of uncertain soil parameters

For simplicity, the hydraulic parameters (ks1, a, n) are treated as statistically independent of the shear strength parameters (c1′, ϕ1′) of the embankment, whereas the pairs (a, n) and (c1′, ϕ1′) are each modeled as cross-correlated lognormal random fields. The statistical properties of soil parameters for the embankment are listed in Table 2. Generally, the statistical properties of soil parameters should be determined with in situ and/or laboratory test data on soil samples. In this study, the mean values of all soil parameters and the COVs of ks1, a and n are adopted from Tan et al. [38]. The cross-correlation coefficients, \(\rho_{a,n}\) = − 0.25 and \(\rho_{{c^{\prime}_{1} ,\phi^{\prime}_{1} }}\) = − 0.5, are determined according to Phoon et al. [36] and Cho [8], respectively. The generic ranges of COV of c1′ and ϕ1′ for clay are 0.1–0.7 and 0.1–0.5, respectively, as reported in Cherubini [5]. Thus, the random field of c1′ with a mean value of 24 kPa and COV of 0.25, and the random field of ϕ1′ with a mean value of 8° and COV of 0.15 are considered in this study.

Table 2 Statistical properties of soil parameters for the embankment

As for the foundation, the differences are generally small for the same soil parameter at different spatial locations because the foundation underwent strict excavation, backfilling and other processes during the embankment construction [38]. To this end, the soil parameters (\(k_{\text{s2}}\), \(c^{\prime}_{2}\) and \(\phi^{\prime}_{2}\)) of the foundation that is treated to be saturated are modeled by three lognormal random variables to save the computational time. The statistical properties of soil parameters of the foundation are listed in Table 3. Also, the cross-correlation coefficient between \(c^{\prime}_{2}\) and \(\phi^{\prime}_{2}\), \(\rho_{{c^{\prime}_{2} ,\phi^{\prime}_{2} }}\) = − 0.5, is considered.

Table 3 Statistical properties of soil parameters for the foundation

As mentioned in Sect. 2.2, the cross-correlation coefficient in the underlying standard normal space should be adopted for generating the realizations of uncertain soil parameters. However, the differences between the cross-correlation coefficients \(\rho_{{\text{ln}\,a,\text{ln}\,n}}\), \(\rho_{{\ln c^{\prime}_{1} ,\ln \phi^{\prime}_{1} }}\), \(\rho_{{\ln c^{\prime}_{2} ,\ln \phi^{\prime}_{2} }}\) and the original cross-correlation coefficients \(\rho_{a,n}\), \(\rho_{{c^{\prime}_{1} ,\phi^{\prime}_{1} }}\), \(\rho_{{c^{\prime}_{2} ,\phi^{\prime}_{2} }}\) for the given COVs in Tables 2 and 3 are minor, which do not significantly affect the reliability assessment results of embankment slopes. For simplicity, \(\rho_{a,n}\) = − 0.25, \(\rho_{{c^{\prime}_{1} ,\phi^{\prime}_{1} }}\) = − 0.5, \(\rho_{{c^{\prime}_{2} ,\phi^{\prime}_{2} }}\) = − 0.5 are directly used in following reliability analyses.

3.2 Determination of autocorrelation distances and random field element size

As mentioned previously, the accuracy of discretization of random fields using the KL expansion relies greatly on the number of eigenmodes to be retained, M. Generally, both the computational accuracy and effort increase with an increase of M. Figure 2 presents the decaying trends of the eigenvalues \(\lambda_{j}\) corresponding to different autocorrelation distances. It is evident that the eigenvalues \(\lambda_{j}\) decrease sharply as the number of KL terms increases. The rate of decay increases as the autocorrelation distances increase. When M = 10 is selected, the \(\varepsilon\) in Eq. (9) is estimated as 80.1, 95 and 98.7% for (\(\theta_{\ln ,h}\) = 20 m, \(\theta_{\ln ,\,v}\) = 2 m), (\(\theta_{\ln ,h}\) = 30 m, \(\theta_{\ln ,\,v}\) = 3 m) and (\(\theta_{\ln ,h}\) = 40 m, \(\theta_{\ln ,\,v}\)= 4 m), respectively. For the case of (\(\theta_{\ln ,h}\) = 30 m, \(\theta_{\ln ,\,v}\)= 3 m), \(\varepsilon\) only increases from 95 to 98.5% when M varies from 10 to 15. To balance the computational accuracy and efficiency, M = 10 is chosen. In this case, a total of 50 random variables are needed for the discretization of these five lognormal random fields of (ks1, a, n, c1′ and ϕ1′). It should be mentioned that the autocorrelation distances (i.e., \(\theta_{\ln ,\,h}\) = 30 m, \(\theta_{\ln ,\,v}\) = 3 m) for different soil parameters are kept the same since the autocorrelation distance is also a basic property of soil [12]. \(\phi^{b}\) is also treated as a spatially distributed variable with its random values being two-thirds of those of \(\phi^{\prime}\).

Fig. 2 Variation of the eigenvalues of squared exponential autocorrelation function

The embankment is discretized into Ne = 233 random field elements as shown in Fig. 3. Each element has side lengths of 3.0 m (horizontal) and 0.5 m (vertical). The random field mesh mainly consists of 3-noded triangular and 4-noded quadrilateral elements. For the case of (\(\theta_{\ln ,\,h}\) = 30 m and \(\theta_{\ln ,\,v}\) = 3 m), the ratios of the horizontal and vertical element sizes to the corresponding autocorrelation distances, \(3.0/30 = 0.1\) and \(0.5/3 = 0.167\), are relatively small. This satisfies the criterion proposed by Ching and Phoon [6] that the random field element size should be smaller than 0.23–0.30 times the autocorrelation distance when employing a squared exponential autocorrelation function. A careful comparison of the finite element mesh and the random field mesh shows that a random field element covers six square finite elements with a size of 0.5 m, as shown in Fig. 3. Namely, these six finite elements share the same random values of soil parameters.

Fig. 3 A typical realization of the random fields of fitting parameters a and n and slope stability analysis results

3.3 Reliability analysis results

Modeling of the three lognormal random variables of the foundation is straightforward. Modeling of the five lognormal random fields of the embankment is presented in Sect. 2.2. For the purpose of illustration, Fig. 3 presents one typical realization of the random fields of a and n at \(\rho_{a,n}\) = − 0.25. The dark and light colors indicate large and small values of a and n, respectively. As expected, the simulated spatially distributed values of a and n exhibit a clearly negative correlation. Based on the typical realization of a and n in Fig. 3 and the corresponding realization of ks1, \(c^{\prime}_{1}\) and \(\phi^{\prime}_{1}\), the contours of pore water pressure within the embankment for different cases can be obtained through deterministic finite element seepage analyses, as shown in Fig. 4. Then, the factor of safety for the embankment downstream slope for Case 1 (reference case) is calculated as 1.508 by employing Bishop’s simplified method. In addition, the critical slip surface of downstream slope is shown in Fig. 3. Thereafter, the 2nd-order HPCE-based surrogate model between the factor of safety and a total of (N = 50 + 3) independent standard normal random variables is established using the method in Sect. 2.3. Np = 3500 samples are generated using the LHS technique to solve a number of \(N_{\text{c}}\) = 1485 unknown coefficients underlying the 2nd order HPCE.

Fig. 4 Contours of pore water pressure within the embankment for different cases

To show the effectiveness of the 2nd-order HPCE-based surrogate model, the values of FS obtained from the surrogate model are compared with those determined from the deterministic finite element embankment seepage and slope stability analyses. Figure 5 compares the values of FS determined from the surrogate model with those determined from the deterministic analyses at 100 direct MCS samples. As shown in Fig. 5, the values of FS determined from the two approaches agree well with each other over the entire sampling space (the coefficient of determination is \(R^{2}\) = 95.4%). This indicates that the 2nd-order HPCE-based surrogate model approximates the deterministic embankment model well, and thus can replace the deterministic analyses in the following reliability analysis.
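The text does not state how the coefficient of determination is computed. The sketch below assumes the usual definition, \(R^{2} = 1 - SS_{\text{res}}/SS_{\text{tot}}\), with the deterministic results as the reference. The five FS values are made-up placeholders for illustration only; they are not data from the paper.

```python
def r_squared(fs_model, fs_surrogate):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    fs_model:     FS values from the deterministic analyses (reference).
    fs_surrogate: FS values predicted by the surrogate at the same samples.
    """
    mean_fs = sum(fs_model) / len(fs_model)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(fs_model, fs_surrogate))
    ss_tot = sum((y - mean_fs) ** 2 for y in fs_model)
    return 1.0 - ss_res / ss_tot


# Hypothetical validation pairs (placeholders, not results from the paper).
fs_model = [1.20, 1.35, 1.51, 1.62, 1.80]
fs_surrogate = [1.22, 1.33, 1.50, 1.65, 1.78]

print(round(r_squared(fs_model, fs_surrogate), 3))
```

In a validation like the one described, the two lists would hold the 100 paired FS values from the deterministic analyses and the surrogate.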

Fig. 5 Validation of the Hermite polynomial expansion-based surrogate model

Then, the probability of failure for the downstream slope is efficiently estimated by running direct MCS on the surrogate model, even though one million samples are adopted. It is \(5.864 \times 10^{-3}\), which is consistent with the value (i.e., \(5.429 \times 10^{-3}\)) estimated directly from the 3500 LHS samples used to construct the surrogate model. Moreover, the probability of failure is reduced to \(1.3 \times 10^{-4}\) when the soil parameters of the foundation are treated as deterministic quantities, which is of the same order as the value (i.e., \(3.3 \times 10^{-4}\)) estimated directly from the 3000 LHS samples used to construct a new surrogate model. These consistencies imply the effectiveness of the proposed approach. It is also apparent that the probability of failure is reduced by more than one order of magnitude when the uncertainties in the soil parameters of the foundation are ignored. This indicates that the uncertain soil parameters of the foundation also affect the reliability of the embankment slopes significantly.

It is worth noting that the proposed approach can evaluate the embankment slope reliability at probability-of-failure levels of \(10^{-4}\)–\(10^{-3}\) by performing only around 3000 evaluations of the deterministic model. In contrast, the direct MCS method needs more than 100,000 deterministic analyses to obtain similar results with the same accuracy. This comparison confirms that the proposed approach can act as a practical and rigorous tool for reliability analysis of embankments in spatially variable soils at small levels of probability of failure. Compared with the non-intrusive stochastic finite element method previously developed by Jiang et al. [22], the proposed approach can take random fields and random variables as inputs of the surrogate model at the same time, and can be efficiently executed in MATLAB by directly calling the “seep2.exe” and “slope2.exe” programs in the DOS environment for an integrated finite element seepage and slope stability analysis.
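The order of magnitude of the direct MCS cost can be illustrated with the common rule of thumb for the coefficient of variation of a Monte Carlo estimate of \(p_{f}\), \(\text{COV}_{p_f} = \sqrt{(1 - p_f)/(N p_f)}\). This rule and the 10% target below are assumptions made for illustration; the paper does not state which accuracy criterion it uses.

```python
from math import ceil


def mcs_samples_required(pf: float, target_cov: float) -> int:
    """Samples needed so that the MCS estimate of pf has the target COV.

    Rearranges COV = sqrt((1 - pf) / (N * pf)) for N.
    """
    return ceil((1.0 - pf) / (pf * target_cov ** 2))


target_cov = 0.10  # assumed accuracy target, not taken from the paper
for pf in (1e-3, 1e-4):
    print(pf, mcs_samples_required(pf, target_cov))
```

Under these assumptions, roughly \(10^{5}\) deterministic runs are needed at \(p_f = 10^{-3}\) and roughly \(10^{6}\) at \(p_f = 10^{-4}\), which is why replacing the deterministic model with a surrogate trained on a few thousand runs is attractive.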

3.4 Parametric sensitivity study

To show the influences of the COVs of the soil parameters on the embankment slope reliability, Figs. 6–10 present the probabilities of failure of the embankment downstream slope for different COVs of the soil parameters \(c^{\prime}_{1}\), \(\phi^{\prime}_{1}\), \(k_{\text{s1}}\), a and n, respectively. In the sensitivity study, the ranges of \(\text{COV}_{{c^{\prime}_{1} }}\), \(\text{COV}_{{\phi^{\prime}_{1} }}\), \(\text{COV}_{{k_{\text{s1}} }}\), \(\text{COV}_{a}\) and \(\text{COV}_{n}\) are set as [0.2, 0.3], [0.1, 0.2], [0.6, 0.9], [0.2, 0.6] and [0.2, 0.6], respectively, according to Cherubini [5] and Tan et al. [38]. When the COV of one parameter is varied, the COVs of the remaining parameters are kept constant at the base values listed in Table 2.

Fig. 6 Variation of the probability of failure for downstream slope with the COV of effective cohesion

Fig. 7 Variation of the probability of failure for downstream slope with the COV of effective friction angle

Fig. 8 Variations of the probability of failure for downstream slope with the COV of saturated hydraulic conductivity for different cases

Fig. 9 Variations of the probability of failure for downstream slope with the COV of fitting parameter a for different cases

Fig. 10 Variations of the probability of failure for downstream slope with the COV of fitting parameter n for different cases

Fig. 11 Variation of the hydraulic conductivities corresponding to different values of n with the matric suction

As observed from Figs. 6 and 7, the probability of failure for the embankment downstream slope increases as either \(\text{COV}_{{c^{\prime}_{1} }}\) or \(\text{COV}_{{\phi^{\prime}_{1} }}\) increases. The variability of the effective cohesion affects the probability of failure the most. The probability of failure increases by about one order of magnitude (from \(1.467 \times 10^{-3}\) to \(1.416 \times 10^{-2}\)) as \(\text{COV}_{{c^{\prime}_{1} }}\) varies from 0.2 to 0.3. In contrast, the probability of failure increases only marginally, from \(5.59 \times 10^{-3}\) to \(5.69 \times 10^{-3}\), as \(\text{COV}_{{k_{\text{s1}} }}\) varies from 0.6 to 0.9 for Case 1, as shown in Fig. 8. This result agrees with the observations in Dou et al. [11] and Nguyen and Likitlersuang [33]. Additionally, the variability of a hardly affects the probability of failure of the embankment downstream slope (see Fig. 9), which is in line with the observation in Calamak [4]. It is interesting to note that the probability of failure decreases as \(\text{COV}_{n}\) increases (see Fig. 10). In general, the spatial variabilities of the hydraulic parameters (\(k_{\text{s1}}\), a, n) have minor effects on the embankment slope reliability in comparison with those of the shear strength parameters (\(c^{\prime}_{1}\), \(\phi^{\prime}_{1}\)).

To further explore the influences of the spatial variabilities of the hydraulic parameters on the embankment slope reliability, three additional cases (Cases 2–4) are investigated. The four cases are defined as follows: (1) Case 1 (base case): \(k_{\text{s1}}\) = \(6 \times 10^{-7}\) m/s, H = 8 m, q = \(1 \times 10^{-7}\) m/s, a = 50 kPa, n = 1.5; (2) Case 2: \(k_{\text{s1}}\) = \(6 \times 10^{-7}\) m/s, H = 4 m, q = \(1 \times 10^{-7}\) m/s, a = 50 kPa, n = 1.5; (3) Case 3: \(k_{\text{s1}}\) = \(6 \times 10^{-7}\) m/s, H = 8 m, q = \(3 \times 10^{-6}\) m/s, a = 50 kPa, n = 1.5; (4) Case 4: \(k_{\text{s1}}\) = \(3 \times 10^{-6}\) m/s, H = 8 m, q = \(1 \times 10^{-7}\) m/s, a = 6.711 kPa, n = 1.289. Cases 2 and 3 are set to reflect the effects of the upstream water level above the foundation and the vertical infiltration flux, respectively. In Case 4, \(k_{\text{s1}}\) = \(3 \times 10^{-6}\) m/s is adopted, which is five times the base saturated hydraulic conductivity. According to Cho [10], the fitting parameters a and n corresponding to this new soil with \(k_{\text{s1}}\) = \(3 \times 10^{-6}\) m/s are equal to 6.711 kPa and 1.289, respectively.

Figures 8–10 further show the variations of the probability of slope failure with \(\text{COV}_{{k_{\text{s1}} }}\), \(\text{COV}_{a}\) and \(\text{COV}_{n}\), respectively, for the different cases. As observed from Figs. 8–10, the overall trends of the probability of slope failure with \(\text{COV}_{{k_{\text{s1}} }}\), \(\text{COV}_{a}\) and \(\text{COV}_{n}\) for Cases 2–4 are the same as those for Case 1. In particular, Fig. 10 shows that the probability of slope failure also decreases with \(\text{COV}_{n}\) for Cases 2–4. The impacts of \(\text{COV}_{{k_{\text{s1}} }}\), \(\text{COV}_{a}\) and \(\text{COV}_{n}\) on the probability of slope failure for Cases 2 and 3 are still insignificant. This is because the permeability of the embankment, with a mean value of \(k_{\text{s1}}\) = \(6 \times 10^{-7}\) m/s, is still quite small, so the unsaturated seepage analysis results do not change much compared with Case 1 even though a lower upstream water level above the foundation or a larger vertical infiltration flux is applied. It also implies that the spatial variabilities of the hydraulic parameters have only slight effects on the probability of slope failure even when a relatively large infiltration flux is adopted, as in Case 3, where the surface of the embankment has reached a saturated condition (see Fig. 4c). In contrast, the influences of \(\text{COV}_{{k_{\text{s1}} }}\) and \(\text{COV}_{n}\) become significant for Case 4, in which the mean value of \(k_{\text{s1}}\) is increased to \(3 \times 10^{-6}\) m/s. This indicates that the permeability of the embankment controls how strongly the spatial variabilities of the hydraulic parameters affect the slope reliability. Note that \(\text{COV}_{a}\) does not significantly affect the probability of slope failure even for an embankment with high permeability (see Fig. 9).

To explain the phenomenon in Fig. 10, Fig. 11 shows the variation of the hydraulic conductivity of the embankment earth material with the matric suction. In the figure, the hydraulic conductivity is estimated using Eqs. (1) and (2). It is apparent that n has a great influence on the hydraulic conductivity. The hydraulic conductivity decreases rapidly at a small suction when n takes a relatively small value close to 1.05. In contrast, when n exceeds 1.3, the hydraulic conductivity is relatively larger and the curves of hydraulic conductivity versus matric suction differ only slightly from one another. As shown in Fig. 11, the differences among the curves for n in the range of [1.3, 2.0] are significantly smaller than those for n in the range of [1.05, 1.3]. The hydraulic conductivity generally decreases as n decreases, and it reaches its minimum when n approaches its lower bound of 1.05.
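The effect of n described above can be explored numerically. Eqs. (1) and (2) are not reproduced in this section, so the sketch below assumes, as a stand-in, the widely used van Genuchten–Mualem form with \(m = 1 - 1/n\) and with a playing the role of the reference suction; the actual equations of the paper may differ. The function names and the chosen suction of 5 kPa are illustrative.

```python
def effective_saturation(suction_kpa: float, a_kpa: float, n: float) -> float:
    """Effective saturation, assumed van Genuchten form with m = 1 - 1/n."""
    m = 1.0 - 1.0 / n
    return (1.0 + (suction_kpa / a_kpa) ** n) ** (-m)


def relative_conductivity(suction_kpa: float, a_kpa: float, n: float) -> float:
    """Ratio k / ks, assumed Mualem form applied to the saturation above."""
    m = 1.0 - 1.0 / n
    se = effective_saturation(suction_kpa, a_kpa, n)
    return se ** 0.5 * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2


a_kpa = 50.0       # base value of a quoted for Case 1
suction_kpa = 5.0  # illustrative small suction (assumption)
for n in (1.05, 1.3, 1.5, 2.0):
    print(n, relative_conductivity(suction_kpa, a_kpa, n))
```

Under this assumed model, the relative conductivity at a small suction is far lower for n = 1.05 than for n of 1.3 or more, which is in line with the trend described for Fig. 11.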

Figure 12 further compares the cumulative distribution functions (CDFs) of n associated with different values of \(\text{COV}_{n}\). As seen from Fig. 12, the fitting parameter n roughly follows a lognormal distribution with a lower bound of 1.05. The probabilities of n falling in the interval [1.05, 1.3] are 0.26, 0.48 and 0.58 when \(\text{COV}_{n}\) equals 0.2, 0.4 and 0.6, respectively. This means that the larger \(\text{COV}_{n}\) is, the more realizations of n approach the lower bound of 1.05, so that more zones with relatively small hydraulic conductivities are spatially distributed within the embankment. Consequently, less seepage flows through the embankment and the downstream slope becomes safer (i.e., the probability of failure is smaller).
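The trend in Fig. 12 can be illustrated with a simple closed-form model. The sketch below assumes that "lognormal with a lower bound of 1.05" can be read as a shifted lognormal, i.e. that n − 1.05 is lognormal with the mean of n equal to the base value 1.5 and its standard deviation equal to \(\text{COV}_{n} \times 1.5\). This interpretation is an assumption, not the paper's stated model, and the function names are illustrative.

```python
from math import erf, log, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_n_below(threshold, mean_n, cov_n, lower_bound):
    """P(n <= threshold), assuming (n - lower_bound) is lognormal."""
    shifted_mean = mean_n - lower_bound
    std_n = cov_n * mean_n  # a shift leaves the standard deviation unchanged
    zeta = sqrt(log(1.0 + (std_n / shifted_mean) ** 2))
    lam = log(shifted_mean) - 0.5 * zeta ** 2
    return normal_cdf((log(threshold - lower_bound) - lam) / zeta)


for cov_n in (0.2, 0.4, 0.6):
    print(cov_n, round(prob_n_below(1.3, 1.5, cov_n, 1.05), 2))
```

Under this assumed model the probabilities are roughly 0.25, 0.47 and 0.57, close to but not identical with the values read from Fig. 12. The point of the sketch is the trend: a larger \(\text{COV}_{n}\) places more probability mass near the lower bound.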

Fig. 12 Comparison of the cumulative distribution functions of fitting parameter n

Additionally, to account for the effect of the cross-correlation between the hydraulic parameters a and n on the probability of failure of the embankment slope, Fig. 13 compares the probabilities of failure for different values of \(\rho_{a,n}\) under Case 4. In this study, the range of \(- 0.5 \le \rho_{a,n} \le 0.5\) is considered. Similar to the influence of the cross-correlation between the cohesion and friction angle (e.g., [22]), the probability of slope failure increases as \(\rho_{a,n}\) increases. The probability of slope failure obtained under the assumption of independence between a and n is therefore biased if the actual cross-correlation is positive or negative. The bias is small, however, because the cross-correlation between a and n barely affects the probability of slope failure: it increases only slightly, from \(6.88 \times 10^{-3}\) to \(7.47 \times 10^{-3}\), as \(\rho_{a,n}\) varies from −0.5 to 0.5.

Fig. 13 Effect of the cross-correlation coefficient between a and n on the probability of failure for Case 4

4 Conclusions

In this paper, a non-intrusive approach is proposed for reliability analysis of unsaturated embankment slopes that accounts for the spatial variabilities of hydraulic and shear strength parameters simultaneously. A hypothetical embankment under unsaturated seepage is investigated to illustrate the proposed approach. A series of parametric sensitivity studies is performed to show the influence of the spatial variation of soil parameters, including the fitting parameters (a, n) of the SWCC, on the embankment slope reliability. The following conclusions are drawn:

1. The proposed non-intrusive approach can act as a practical and rigorous tool for evaluating the reliability of embankment slopes with small levels of probability of failure (i.e., \(10^{-4}\)–\(10^{-3}\)). The reason is that a Hermite polynomial chaos expansion-based surrogate model is adopted instead of the deterministic analysis to compute the factor of safety. Compared with the direct MCS and LHS methods, the proposed approach achieves much higher computational efficiency in computing the probability of slope failure.

2. Ignoring the uncertainties of the soil parameters of the foundation results in an underestimation of the probability of failure for embankment slopes. For the embankment in this study, the probability of failure is reduced by more than one order of magnitude if the uncertainties in the soil parameters of the foundation are ignored. In practice, it is necessary to strictly control the filling and rolling quality of the embankment and foundation materials in order to reduce the uncertainties of the soil parameters.

3. The spatial variability of the effective cohesion affects the probability of failure of the embankment slope the most, followed by the spatial variability of the friction angle, whereas those of the hydraulic parameters affect the probability of failure only marginally when the embankment is subjected to unsaturated seepage. The influences of the spatial variabilities of the hydraulic parameters on the probability of failure become important when the permeability of the embankment is sufficiently increased.

4. Interestingly, the probability of failure of the embankment slope decreases as the coefficient of variation of the fitting parameter n increases. This is because more realizations of n fall in the interval [1.05, 1.3] for a larger COV of n, which results in more zones with relatively small hydraulic conductivities spatially distributed within the embankment, and consequently in less seepage through the embankment and a safer slope. Additionally, the spatial variability of the fitting parameter a has only a tiny effect on the probability of slope failure, even for an embankment with high permeability. The statistics of the parameters (a, n) need to be further determined based on real test data.