Prediction of landslide displacement with an ensemble-based extreme learning machine and copula models

Li, Huajin; Xu, Qiang; He, Yusen; Deng, Jiahao

doi:10.1007/s10346-018-1020-2

Original Paper
Published: 29 May 2018

Volume 15, pages 2047–2059, (2018)
Landslides

Abstract

Research on the dynamics of landslide displacement forms the basis for landslide hazard prevention. This paper proposes a novel data-driven approach to monitor and predict the landslide displacement. In the first part, autoregressive moving average time series models are constructed to analyze the autocorrelation of landslide triggering factors. A linear ensemble-based extreme learning machine using the least absolute shrinkage and selection operator is applied in predicting the displacement of landslides. Five benchmarking data-driven models, the support vector machine, neural network, random forest, k-nearest neighbor, and the classical extreme learning machine, are considered as baseline models for validating the ensemble-based extreme learning machines. Numerical experiments demonstrated that the proposed prediction model produces the smallest prediction errors among all the algorithms tested. In the second part, parametric copula models are fitted on the predicted displacement, to investigate the relationship between the triggering factors and landslide displacement values. The Gumbel-Hougaard copula model performs best, which indicates strong upper tail correlation between the triggering factors and displacement values. Thresholds for the triggering factors can be obtained by monitoring the landslide moving patterns with large displacement values. The effectiveness and utility of the proposed data-driven approach have been confirmed with the landslide case study in the region of the Three Gorges Reservoir.

Introduction

Landslides are a recurrent geological phenomenon that poses serious threats to local communities. In the Three Gorges Reservoir region in central China, more than 4200 landslides have occurred, which are attributed to the complex geological environment and to heavy rainfall (Yin et al. 2010). This region has the highest frequency of landslide occurrence in China (Zhou et al. 2016), and the majority of landslides in this area show multiple reactivations with large displacement (Cao et al. 2016). Hence, predictive modeling of landslide displacement is essential for the prevention of landslide hazards.

Methods for predicting the cumulative displacements of landslides have been studied by researchers in previous years. The methods can be briefly categorized into two groups: physical models and data-driven models (Huang et al. 2017; Li et al. 2012). Due to the complexity of geological parameters for reservoir landslides, physical models are less used in practice (Huang et al. 2017).

Recently, data-driven models, which take the triggering factors as input and the moving response of the landslide as output, have been widely applied in displacement prediction. The inputs are earthquakes, precipitation, and reservoir water level fluctuations, while the output response is the landslide displacement. Xu et al. (2011) introduced the autoregressive time series model to predict landslide displacement. Krkač et al. (2017) presented a methodology for prediction of landslide movements based on random forest (RF). Du et al. (2013) also applied the time series model to decompose the cumulative displacement into a trend component and a seasonal component. Lian et al. (2015) developed an artificial neural network (ANN) to model the displacement with highly accurate results. Zhou et al. (2016) used particle swarm optimization (PSO) to tune the parameters of the support vector machine (SVM) and achieved promising results as well. Among data-driven models, machine learning algorithms provide more accurate results than classical statistical models.

However, classical machine learning algorithms such as ANN and SVM have two major constraints: (1) the appropriateness of the parameter initialization (e.g., number of hidden layers, C values) influences the performance of the algorithm, and (2) overfitting is a common problem when little training data is available. At the same time, the relationship between the triggering factors and landslide displacement has not been studied sufficiently. Under extreme conditions such as heavy precipitation, a single factor may easily trigger landslide movement. Therefore, it is necessary to develop a data-driven prediction framework to address these problems.

Extreme learning machine (ELM) is one of the most widely applied algorithms for predicting time series data. As a single-hidden layer feedforward neural network, ELM overcomes the challenge of appropriate parameter initialization. Its advantages include a simple theoretical basis, global minimum optimization, and powerful generalization, strengthening the capacity in prediction modeling (Huang 2003; Huang et al. 2006a, b). ELM has been successfully applied in many other fields with promising prediction results (Cao et al. 2016; Lian et al. 2013; Miche et al. 2015; Ouammi et al. 2010; Zong and Huang 2011).

A single extreme learning machine may produce larger prediction errors when a small dataset is used for training. An ensemble-based method is a commonly used solution to mitigate the problem (He and Kusiak 2018; Tramèr et al. 2017). A linear ensemble-based method using the least absolute shrinkage and selection operator (LASSO) has been proposed to resolve this problem (Tibshirani 1996). This ensemble-based method minimizes the sum of squared errors while placing a bound on the sum of the absolute values of the coefficients. By assigning different weights to different ELMs based on their prediction performance, LASSO is capable of reducing prediction errors. It also exhibits the stability of ridge regression (Sun et al. 2017). Assembling different ELMs with LASSO may therefore improve the prediction accuracy in comparison with any single ELM.

Besides displacement prediction, advanced monitoring and identification of landslides with seasonal reactivations are also in high demand. Predicting the slope failures associated with such seasonal reactivations can prevent severe damage to local communities. Previous research indicates that tail correlation exists between the triggering factors and displacement values (Cao et al. 2016). Therefore, in this research, parametric copula models are constructed to model the relationship between the triggering factors and landslide displacement values. Thresholds for the triggering factors can then be computed to predict landslide reactivations.

In this paper, a novel data-driven framework that integrates an extreme learning machine with copula models is proposed to monitor and predict the uncertainties of landslide displacement. Considering the computational expense and lower accuracy of the classical algorithms, the ELM is selected in this study. To overcome the deficiencies of the single ELM, a LASSO algorithm is utilized to assemble the ELMs. To monitor landslide reactivations with large displacement, tail correlations between displacement values and triggering factors, including precipitation and reservoir water level fluctuations, are investigated through parametric copula models.

Methodology

In this research, a novel data-driven predictive modeling and monitoring framework is proposed. Two datasets are selected: time series of monthly precipitation and of monthly average reservoir water level (Fig. 1). First, the autoregressive moving average (ARMA) time series model is selected to analyze the trend and seasonality of the average monthly reservoir water level and monthly precipitation. Autocorrelation functions (ACFs) of the ARMA model are computed to illustrate their trend and seasonality. Second, Pearson’s correlation coefficients are used to map the relationship between the instant displacements and historic precipitation or reservoir water level fluctuations. All historic values of precipitation or reservoir water level with significant positive correlations (p values < 0.05) are considered indicators of landslide displacement. Hence, they are selected as inputs to the prediction model based on LASSO-ELM. Next, four parametric copula models are constructed to fit the predicted displacement data. Parameters for each model are derived by maximum likelihood estimation. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) are calculated to evaluate the performance of the four parametric copula models; the copula model with the smallest AIC and BIC values performs best. Last, thresholds indicating large displacements are extracted from the best copula model. The Value-at-Risk (VaR) of each triggering factor (e.g., precipitation and reservoir water level fluctuations) is obtained as a threshold for monitoring landslides and preventing landslide hazards.

Time series analysis

Autoregressive moving average models are widely utilized in analyzing time series data. The ARMA models are a generic class of time series models capable of predicting the current values of a variable from its past values and the past error terms. The parameters of an ARMA model produce general statistical inferences regarding the temporal dynamics and long-term memory of a variable (McLeod and Li 1983). A classical ARMA time series model is expressed in (1).

$$ {x}_k=\sum \limits_{i=1}^p{\varphi}_i{x}_{k-i}+{\alpha}_k+\sum \limits_{j=1}^q{\theta}_j{\alpha}_{k-j} $$

(1)

where x_k denotes the current value of the target variable, α_k is the white noise, φ_i denotes the autoregressive parameters, θ_j denotes the moving average parameters, and p and q are the order parameters. In this research, x_k represents the instant value of a triggering factor.

Based on the constructed ARMA model, the correlation between the current values of the variable and its past values may be computed. The autocorrelation function is widely employed to display the correlation. The ACF based on an ARMA model is expressed in (2) (Bustos and Yohai 1986).

$$ {r}_l=\frac{\frac{1}{N-1}\sum \limits_{i=1}^{N-l}\left({x}_i-\overline{x}\right)\left({x}_{i+l}-\overline{x}\right)}{\frac{1}{N}\sum \limits_{i=1}^N{\left({x}_i-\overline{x}\right)}^2} $$

(2)

where r_l is the ACF of the lth lag of the variable, x_i is the ith data sample, $ \overline{x} $ is the mean of all data samples, and N is the total number of samples in the dataset.
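
As an illustration of (2), the sample ACF can be computed directly from a series. The sketch below is not the authors' code: it reads the lagged term as x_{i+l}, keeps the 1/(N−1) and 1/N scaling of (2), and uses a synthetic 12-month sinusoid as a stand-in for a monthly triggering factor. The function name `sample_acf` and the test series are assumptions.

```python
import math

def sample_acf(x, lag):
    """Lag-`lag` autocorrelation following Eq. (2), lagged term read as x_{i+lag}."""
    n = len(x)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    mean = sum(x) / n
    # Numerator: lagged cross-products, scaled by 1/(N-1) as in Eq. (2)
    num = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n - 1)
    # Denominator: sum of squared deviations, scaled by 1/N
    den = sum((v - mean) ** 2 for v in x) / n
    return num / den

# Synthetic monthly series with a 12-month cycle (illustrative, not the paper's data)
series = [math.sin(2 * math.pi * k / 12) for k in range(120)]
acf_12 = sample_acf(series, 12)  # one full cycle apart: strongly positive
acf_6 = sample_acf(series, 6)    # half a cycle apart: strongly negative
```

A seasonal factor such as reservoir water level would be expected to show the same alternating pattern, with peaks at multiples of the cycle length.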

Extreme learning machine and LASSO regularization

ELM in Fig. 2 is a single-hidden layer feedforward neural network (SLFN). The basic ELM algorithm consists of three layers: the input layer, the hidden layer, and the output layer. The input layer weight matrix and the hidden layer biases are randomly assigned to compute the hidden layer output matrix. Based on that, the output layer weight matrix can be computed via least-square linear regression by using the hidden layer output matrix and target output. The learning model expressed in (3)–(4) contains the training set (x_i, t_i), the hidden node output function G(w,b,x), and the number of hidden nodes L (Huang et al. 2006a, b).

$$ {f}_L\left({x}_j\right)={y}_j,\forall j $$

(3)

$$ \sum \limits_{i=1}^L{\beta}_iG\left({w}_i,{b}_i,{x}_j\right)={t}_j,j=1,2,\dots, N $$

(4)

where x_j represents the input parameters, w_i is the weight vector connecting the ith hidden node and the input nodes, b_i is the bias of the ith hidden node, and β_i is the weight vector connecting the ith hidden node and the output nodes.

The training strategy of an ELM includes three steps. First, the hidden node parameters w_i and b_i are randomly assigned. Second, the hidden layer output matrix H is computed from (5). Last, the output weight β is computed as β = H⁺T, where H⁺ is the Moore-Penrose generalized inverse of the hidden layer output matrix H and T is the target output.

$$ H\left({w}_1,\dots, {w}_L;{b}_1,\dots, {b}_L;{x}_1,\dots, {x}_N\right)={\left[\begin{array}{ccc}G\left({w}_1,{b}_1,{x}_1\right)& \cdots & G\left({w}_L,{b}_L,{x}_1\right)\\ {}\vdots & \ddots & \vdots \\ {}G\left({w}_1,{b}_1,{x}_N\right)& \cdots & G\left({w}_L,{b}_L,{x}_N\right)\end{array}\right]}_{N\times L} $$

(5)
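
The three training steps can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid hidden node function, the Gaussian random initialization, the synthetic data, and the names `train_elm` and `predict_elm` are all chosen here for the example.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden layer output matrix H (N x L) with a sigmoid G (an assumption)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, T, n_hidden, rng):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    # Step 1: randomly assign input weights w_i and biases b_i (never updated)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    # Step 2: compute the hidden layer output matrix H as in Eq. (5)
    H = hidden_output(X, W, b)
    # Step 3: beta = H^+ T via the Moore-Penrose pseudoinverse
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def predict_elm(X, W, b, beta):
    return hidden_output(X, W, b) @ beta

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))  # hypothetical inputs
T = np.sin(X[:, 0]) + 0.5 * X[:, 1]        # hypothetical smooth target
W, b, beta = train_elm(X, T, n_hidden=30, rng=rng)
train_rmse = float(np.sqrt(np.mean((predict_elm(X, W, b, beta) - T) ** 2)))
```

Because only β is fitted, training reduces to one linear solve, which is why no iterative tuning of the hidden layer is needed.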

The single basic extreme learning machine randomly assigns hidden nodes and incrementally updates the output weights of the hidden layer nodes. With a small training dataset, the prediction error may be large. A linear ensemble-based ELM can overcome this challenge. Yu et al. (2013) proposed an ensemble of ELMs with least absolute shrinkage and selection operator (LASSO) regularization, which produces smaller prediction errors than any single ELM. The ensemble of ELMs is expressed in (6) and the LASSO regularization is expressed in (7).

$$ {\overline{y}}_i={B}_0+\sum \limits_{P=1}^8{y}_i^P{B}_P $$

(6)

$$ \underset{B_0,B}{\min}\left\{\sum \limits_{i=1}^N{\left({\overline{y}}_i-{B}_0-\sum \limits_{P=1}^8{y}_i^P{B}_P\right)}^2\right\} $$

(7)

where B₀ and B_P are the estimated intercept and weights of the linear ensemble regression model, $ \overline{y_i} $ denotes the target output vector, and y_i^P represents the vector of the predicted output from the Pth ELM. In this study, our ensemble model is a regularized least-squares regression model using LASSO. With LASSO regularization, which additionally bounds the sum of the absolute values of the weights B_P, we can estimate the intercept B₀ and the weights B_P of the linear ensemble model. The scheme of the linear ensemble-based ELMs is illustrated in Fig. 3.
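
A minimal sketch of how eight base predictions might be combined with an L1-penalized linear model is given below. The penalized form (1/2N)·RSS + λ‖B‖₁, the coordinate-descent solver, the value of `lam`, and the synthetic base predictions are assumptions made for illustration; the source does not state which solver or penalty level was used.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_ensemble(Y, t, lam, n_iter=500):
    """Y: (N, P) predictions of P base models; t: (N,) targets.
    Minimises (1/2N)||t - B0 - Y B||^2 + lam * ||B||_1 by coordinate descent."""
    n, p = Y.shape
    y_mean, t_mean = Y.mean(axis=0), t.mean()
    Yc, tc = Y - y_mean, t - t_mean      # centring; intercept recovered at the end
    B = np.zeros(p)
    col_sq = (Yc ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual that leaves out the contribution of model j
            r = tc - Yc @ B + Yc[:, j] * B[j]
            rho = Yc[:, j] @ r / n
            B[j] = soft_threshold(rho, lam) / col_sq[j] if col_sq[j] > 0 else 0.0
    B0 = t_mean - y_mean @ B
    return B0, B

# Synthetic stand-ins for the outputs of eight ELMs: the target plus independent noise
rng = np.random.default_rng(1)
t = rng.normal(size=300)
Y = t[:, None] + rng.normal(scale=0.5, size=(300, 8))
B0, B = lasso_ensemble(Y, t, lam=1e-3)
ensemble_mse = float(np.mean((B0 + Y @ B - t) ** 2))
best_single_mse = float(np.min(np.mean((Y - t[:, None]) ** 2, axis=0)))
```

In this toy setting the weighted combination averages out the independent errors of the base models, which is the effect the ensemble is intended to exploit.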

Benchmarking data-driven models

In this research, five benchmarking data-driven models including the support vector machine (SVM), neural network (NN), random forest (RF), k-nearest neighbor (kNN), and classical extreme learning machine (ELM) are compared with the proposed LASSO-ELM.

The support vector machine model (Abdi and Giveki 2013; Tong and Koller 2001) is applied in this study with the Gaussian kernel function expressed in (8).

$$ K\left(X,{X}^{\prime}\right)=\exp \left(-\frac{{\left\Vert X-{X}^{\prime}\right\Vert}^2}{2{\sigma}^2}\right) $$

(8)

where X is the vector of the input data and σ denotes the standard deviation of the input data. The capacity factor C of the SVM model with values 1, 10, 100, and 1000 is considered in the training of the SVM model. In addition, the parameter γ = 1/(2σ²) with values of 0.001, 0.01, 0.1, and 1 is also considered during the training process. The optimal parameter settings of SVM are evaluated through 10-fold cross validation.
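
For reference, the Gaussian kernel in its usual squared-distance form can be written in a few lines. The function name and sample vectors below are illustrative assumptions; the sketch only shows how σ and γ = 1/(2σ²) relate.

```python
import math

def gaussian_kernel(x, x_prime, sigma):
    """Gaussian (RBF) kernel exp(-gamma * ||x - x'||^2) with gamma = 1 / (2 sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    gamma = 1.0 / (2.0 * sigma ** 2)
    return math.exp(-gamma * sq_dist)

k_same = gaussian_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0)  # identical points
k_far = gaussian_kernel([0.0, 0.0], [3.0, 4.0], sigma=1.0)   # squared distance 25
```

Larger γ (smaller σ) makes the kernel decay faster with distance, which is why γ is tuned jointly with C.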

The neural network model (LeCun et al. 1990) in this research applies back-propagation (BP) to optimize its performance by adjusting the weight of each neuron. The structure of the NN model includes the input layer, the hidden layer, and the output layer. The Sigmoid function is selected as the activation function in this study. The number of hidden neurons in each hidden layer with values of 10, 20, 30, 40, and 50 and the number of hidden layers with values of 1, 2, 3, 4, and 5 are all evaluated via 10-fold cross validation to determine the optimal parameter setting.

The random forest model (Breiman 2001) assembles multiple decision trees, which are generated based on the values of an independent set of random variables. For classification and regression tasks, the best split is used among a subset of randomly selected predictors at the split node. The maximum depth of random forest with values of 10, 30, 50, 70, and 90 and the number of trees with values of 1000, 1500, 2000, and 2500 are all evaluated via 10-fold cross validation.

The k-nearest neighbor model (Denoeux 1995) for predicting the landslide displacement is also presented in this research. The Euclidean distance is used in the kNN model. Various values of k from 1 to 20 are evaluated through a 10-fold cross validation to determine the optimal k value.

The classical extreme learning machine model introduced in the “Extreme learning machine and LASSO regularization” section has been applied with the radial basis function (RBF) as the activation function. The numbers of hidden neurons within the hidden layer with values of 2, 4, 6, 8, ..., 98, and 100 are all evaluated via 10-fold cross validation to determine the optimal parameter setting.

Model evaluation

To assess the landslide displacement prediction performance of the data-driven models, four widely applied metrics, namely mean absolute error (MAE (9)), mean absolute percentage error (MAPE (10)), mean square error (MSE (11)), and root mean square error (RMSE (12)), are selected in this study.

$$ \mathrm{MAE}=\frac{1}{n}{\sum}_{i=1}^n\left|{\overline{y}}_i-{y}_i\right| $$

(9)

$$ \mathrm{MAPE}=\frac{1}{n}{\sum}_{i=1}^n\left|\frac{{\overline{y}}_i-{y}_i}{y_i}\right| $$

(10)

$$ \mathrm{MSE}=\frac{1}{n}\sum \limits_{i=1}^n{\left({\overline{y}}_i-{y}_i\right)}^2 $$

(11)

$$ \mathrm{RMSE}=\sqrt{\frac{1}{n}\sum \limits_{i=1}^n{\left({\overline{y}}_i-{y}_i\right)}^2} $$

(12)

where y_i denotes the actual displacement value and $ \overline{y_i} $ represents the predicted displacement value.
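
The four metrics in (9)–(12) can be computed together, as in the following sketch. The function name and the three sample values are hypothetical and serve only to show the arithmetic.

```python
import math

def regression_errors(actual, predicted):
    """Return MAE, MAPE, MSE and RMSE as defined in Eqs. (9)-(12)."""
    n = len(actual)
    residuals = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    # MAPE divides by the actual value, so it is undefined if any actual value is 0
    mape = sum(abs(r / a) for r, a in zip(residuals, actual)) / n
    mse = sum(r * r for r in residuals) / n
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": math.sqrt(mse)}

actual = [10.0, 20.0, 40.0]     # hypothetical displacements
predicted = [12.0, 18.0, 44.0]  # hypothetical predictions
errors = regression_errors(actual, predicted)
```

Here the residuals are 2, −2, and 4, so MAE = 8/3, MAPE = 0.4/3, MSE = 8, and RMSE = √8.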

Copula theory

The triggering factors of landslides (e.g., reservoir water level fluctuations, precipitation) are uncertain and dynamic. Analyzing the correlation structure between the triggering factors and the landslide displacement is essential for landslide hazard monitoring and prevention. The thresholds of these triggering factors are valuable for monitoring of reactivation of landslides. Applying copula models for tail correlation analysis, the thresholds for precipitation and reservoir water level can be computed. Hence, in this research, copula models are constructed to study the relationship between the triggering factors and displacement values.

Copula theory was first proposed by Sklar (1996) for nonlinear and asymmetric multivariate analysis. Nonlinearity and tail correlation between the variables can be fully investigated through copula models. The general formula of the copula model expressed in (13) is an N-dimensional joint distribution function composed of N univariate marginal distribution functions.

$$ F\left({x}_1,{x}_2,\dots, {x}_N\right)=C\left[{F}_{x_1}\left({x}_1\right),{F}_{x_2}\left({x}_2\right),\dots, {F}_{x_N}\left({x}_N\right)\right] $$

(13)

where x_N represents the Nth variable; $ {F}_{x_N}\left({x}_N\right) $ denotes the marginal cumulative distribution function of the Nth variable; F(x₁,…,x_N) is the N-dimensional joint distribution; and $ C\left[{F}_{x_1}\left({x}_1\right),\dots, {F}_{x_N}\left({x}_N\right)\right] $ is the copula function.

In practice, two copula families, namely the Archimedean copula family and the elliptical copula family, are widely used. In the elliptical copula family, the Gaussian copula and the Student t copula are most frequently used. In the Archimedean copula family, the Frank copula, the Joe-Clayton copula, the Gumbel-Hougaard copula, and the Ali-Mikhail-Haq copula are frequently used in practice (Reboredo 2011). All copula models in the elliptical family are symmetric and are less sensitive to tail correlations between variables. In the Archimedean copula family, the Frank copula evaluates the co-movement of highly associated variables (Hao and AghaKouchak 2013). The Gumbel-Hougaard copula is sensitive to upper tail correlation, and the Joe-Clayton copula fits lower tail-correlated variables well. The Ali-Mikhail-Haq copula is a modified form of the Gumbel-Hougaard copula and also performs well with upper tail-correlated variables (Onken et al. 2009).
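
To make the notion of upper tail sensitivity concrete, the bivariate Gumbel-Hougaard copula and its upper tail dependence coefficient can be evaluated as below. The formulas are the standard textbook ones, C(u, v) = exp(−[(−ln u)^θ + (−ln v)^θ]^{1/θ}) and λ_U = 2 − 2^{1/θ}; the function names and the chosen θ values are illustrative assumptions, not parameters fitted in this study.

```python
import math

def gumbel_hougaard_cdf(u, v, theta):
    """Bivariate Gumbel-Hougaard copula C(u, v) for 0 < u, v < 1 and theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    if not (0.0 < u < 1.0 and 0.0 < v < 1.0):
        raise ValueError("u and v must lie strictly between 0 and 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))

def gumbel_upper_tail_dependence(theta):
    """Upper tail dependence coefficient: 0 at theta = 1, approaching 1 as theta grows."""
    return 2.0 - 2.0 ** (1.0 / theta)

c_indep = gumbel_hougaard_cdf(0.3, 0.7, theta=1.0)  # theta = 1 gives independence, u * v
c_dep = gumbel_hougaard_cdf(0.9, 0.9, theta=2.0)    # exceeds 0.81 under positive dependence
lam_u = gumbel_upper_tail_dependence(2.0)
```

A larger fitted θ therefore implies that extreme values of a triggering factor and of displacement are more likely to occur together.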

Previous research indicates that the landslides were mostly triggered by heavy precipitation and large fluctuation of reservoir water level (Du et al. 2013; Keefer et al. 1987; Yao et al. 2015). In this study, the Archimedean copula models, which are more sensitive to tail-correlated data, are applied to model the data. To select the copula model that fits the data best, two performance evaluation metrics, namely Akaike information criterion (AIC (14)) and Bayesian information criterion (BIC (15)) (Chen and Gopalakrishnan 1998; Posada and Buckley 2004), are computed in this study.

$$ \mathrm{AIC}=-2\ln (L)+2m $$

(14)

$$ \mathrm{BIC}=-2\ln (L)+m\ln (n) $$

(15)

where m is the number of estimated parameters, n is the number of data samples, and L represents the maximized value of the likelihood function. In this research, the copula model with the minimum AIC and BIC values is considered the best.
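
Model selection with (14) and (15) amounts to a simple comparison once each candidate has been fitted. In the sketch below, the candidate names, their maximized log-likelihoods, and the sample size are hypothetical placeholders, not results from this study.

```python
import math

def aic(log_likelihood, m):
    """Eq. (14): AIC = -2 ln(L) + 2m, taking ln(L) as input."""
    return -2.0 * log_likelihood + 2.0 * m

def bic(log_likelihood, m, n):
    """Eq. (15): BIC = -2 ln(L) + m ln(n), taking ln(L) as input."""
    return -2.0 * log_likelihood + m * math.log(n)

# Hypothetical maximised log-likelihoods of three one-parameter candidate copulas
candidates = {"copula_A": 35.0, "copula_B": 42.5, "copula_C": 38.1}
n_samples = 144  # hypothetical number of monthly observations
scores = {
    name: (aic(ll, m=1), bic(ll, m=1, n=n_samples))
    for name, ll in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

When all candidates have the same number of parameters, as here, AIC and BIC rank them identically and both reduce to choosing the largest log-likelihood.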

For landslide hazard prevention, thresholds of triggering factors can be obtained as the indicator of large landslide displacements. Since the copula model studies the correlation between triggering factors (e.g., reservoir water level fluctuations, precipitation) and displacement values, the thresholds can be computed through Value-at-Risk (VaR) (He et al. 2017) expressed in (16).

$$ {\mathrm{VaR}}_p\left({x}_1,\dots, {x}_N\right)={C}^{-1}\left({x}_1,\dots, {x}_N\right) $$

(16)

where x_N denotes the Nth triggering factor, C⁻¹ represents the inverse function of the copula function, and p is the confidence level used to compute the thresholds. In this paper, p is set to 0.95 and only two-dimensional copula models are applied.

Field investigation

Study area

The case study focuses on the Baishuihe landslide, which is located on the south side of the Yangtze River within Zigui County, China. Its exact location is presented in Fig. 4. This landslide has a maximum length of 780 m from north to south and a maximum width of 700 m from east to west. The total area of the Baishuihe landslide is 0.42 km² and it contains a total volume of 1260 × 10⁴ m³. The current slide ranges in elevation from 75 m at the toe to 390 m at the main scarp (Du et al. 2013). The central part of the Baishuihe landslide is relatively flat, while the gradients in the upper and lower parts of the landslide are larger. The bedrock geology of the landslide area consists mainly of low-strength sandstones and mudstones (Lian et al. 2013). In recent years, the slope deformation of the Baishuihe landslide has become more intense, which is attributed to fluctuations of the reservoir water level and heavy precipitation in the flood season. The Baishuihe landslide is a retrogressive landslide whose failure surface propagates from the Yangtze River in the direction of the slope. The cumulative displacement of the landslide exceeded 2500 mm between 2003 and 2015.

Data collection

The Baishuihe landslide has shown repeated reactivations since its first activation in 2003. After the first reactivation, slope deformation monitoring was conducted using 11 global positioning system (GPS) points on the landslide surface, with a 1-month data collection frequency, as illustrated in Fig. 5. The deformation rate in the bottom part is significantly higher than that in the upper part. Thus, more GPS monitoring points were assigned in this area. Among those, ZG93 and ZG118 have a complete displacement dataset over several years and are hence utilized for detailed numerical analysis in this study. Previous research indicated that the main triggering factors of landslides in the Three Gorges Reservoir are precipitation and reservoir water level fluctuations (Cao et al. 2016; Du et al. 2013; Zhou et al. 2016). Hence, the historic data of these two factors are also utilized in this study. The dataset of reservoir water level was collected through on-site investigation, and the dataset of precipitation was obtained from a monitoring site 9.5 km away from the landslide location. The raw dataset is presented in Fig. 6.

In this study area, based on the GPS time series data (see Fig. 6), two significant types of displacements occurred: seasonal faster displacement and slower displacement. The seasonal faster displacement which illustrates “step-like” patterns has steep positive gradients or steps in cumulative horizontal displacement plots (Massey et al. 2013). In addition, the average cycle length for seasonal faster displacement is approximately 1 year. The long period of slower motion comprises semi-constant displacement rates over several months or most parts of the year. Considering the seasonal patterns of reservoir water level fluctuations and precipitation illustrated in Fig. 6, the collected dataset suggests a complex relationship between them and the landslide displacements.

Experimental results

The landslide displacement data utilized in this paper were collected between 2003 and 2015 from the GPS monitoring points on the Baishuihe landslide in the Three Gorges Reservoir, China. Several reactivations of the landslide are recorded during this monitoring period. Before analyses of landslide displacement and triggers, all monitoring data are pre-processed, including outlier removal and missing value imputation.