GIS-based landslide susceptibility mapping using ensemble methods for Fengjie County in the Three Gorges Reservoir Region, China

Gong, W.; Hu, M.; Zhang, Y.; Tang, H.; Liu, D.; Song, Q.

doi:10.1007/s13762-021-03572-z

GIS-based landslide susceptibility mapping using ensemble methods for Fengjie County in the Three Gorges Reservoir Region, China

Original Paper
Published: 03 August 2021

Volume 19, pages 7803–7820, (2022)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

International Journal of Environmental Science and Technology Aims and scope Submit manuscript

GIS-based landslide susceptibility mapping using ensemble methods for Fengjie County in the Three Gorges Reservoir Region, China

Download PDF

W. Gong¹,
M. Hu¹,
Y. Zhang ORCID: orcid.org/0000-0003-2587-1112¹,
H. Tang¹,
D. Liu² &
…
Q. Song^3,4

970 Accesses
13 Citations
Explore all metrics

Abstract

Various landslide susceptibility models can be available in the literature, and each model has its unique advantages and limitations. Previous studies have shown that no single model performs best across diverse geoenvironmental conditions. To seek better prediction accuracy and reliability, this study proposes three different ensemble methods to take advantage of multiple landslide susceptibility models: qualitative matrix ensemble method, semi-quantitative partition ensemble method, and quantitative probability-weighted ensemble method. To illustrate the effectiveness of the three ensemble methods proposed, a case study is carried out in Fengjie County in the Three Gorges Reservoir Region, China. First, with the support of geographic information system, a total of 1550 historical landslides and the associated 12 conditioning factors are compiled, which are used for the training and validation of four selected single landslide susceptibility models, including two statistical approaches (i.e. frequency ratio and fuzzy assessment) and two machine learning approaches (i.e. backpropagation neural network and support vector machine). Then, the three ensemble methods are applied to integrate the outcomes of the four single models. Finally, an extensive comparative analysis is performed between the ensemble methods and single models using the receiver operating characteristics curve and information entropy. The results demonstrate that all the three ensemble methods achieve higher overall prediction accuracy (> 80%) than the four single models (< 80%), and the matrix ensemble method provides the best improvement. Besides, the ensemble methods can also enhance reliability by reducing the statistical discrepancy between distinct single models.

Landslide susceptibility modeling based on GIS and ensemble techniques

Article 11 April 2022

Comparisons of several methods for landslide susceptibility mapping: case of the Benzilan and Waka Towns, Southwest China

Article 01 August 2021

GIS-based ensemble modelling of fuzzy system and bivariate statistics as a tool to improve the accuracy of landslide susceptibility mapping

Article 11 March 2021

Discover the latest articles, news and stories from top researchers in related subjects.

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Landslides are one of the major geological disasters. The frequent occurrence of landslides has caused huge economic losses and casualties to society (Pourghasemi et al. 2018; Tang et al. 2019; Gong et al. 2021). Landslide susceptibility assessment deals with “where” landslides are most likely to occur (Guzzetti et al. 2005), which is a preliminary and imperative task for landslide prevention and control. The key content of this task is to select suitable landslide susceptibility models. It is known that landslide susceptibility assessment is based on the general assumption that slope failure in the future will be more likely to occur under those conditions which led to landslides in the past (Gemitzi et al. 2011). Thus, in landslide susceptibility modelling, the input should be a set of landslide conditioning factors which are important attributes to slope instability, while the output should be a landslide susceptibility zonation map that delineates the spatial incidence of landslides.

There are extensive studies on landslide susceptibility modelling and mapping in the past few decades. According to the difference in model principles and evaluation bases, landslide susceptibility models in the existing studies could be categorized into empirical approaches, statistical approaches, machine learning approaches, and others (Melchiorre et al. 2008). The empirical models usually combine engineering information with disaster prevention experience, based on which the evaluation standard of landslide susceptibility is formulated (Pourghasemi et al. 2012). The statistical models carry out mathematical and statistical analysis on regional landslide database, and then establish the relationship between landslide susceptibility levels and conditioning factors (Goetz et al. 2015). The most common statistical models include information value model (Achour et al. 2017), weight evidence analysis model (Ilia and Tsangaratos 2016), fuzzy assessment model (Gemitzi et al. 2011), and frequency ratio model (Mohammady et al. 2012; Ozdemir and Altural 2013). Machine learning models input relevant geological data into computer programs and adopt algorithms to derive landslide susceptibility maps (Pham et al. 2016). In recent years, there is an increasing preference towards machine learning models for landslide susceptibility analysis. The well-established machine learning models include the support vector machine model (Micheletti et al. 2014; Huang and Zhao 2018), decision tree model (Pradhan 2013), logistic regression model (Budimir et al. 2015), and artificial neural network model (Pradhan and Lee 2010).

It is noted that all of the above landslide susceptibility models have their specific advantages and limitations (Reichenbach et al. 2018). Although empirical models take advantage of prior knowledge and experience, the analysis results might be quite different from the actual situations (Goetz et al. 2011). Statistical models are able to reflect the relationship between the input geoenvironmental factors and the output evaluation results, but the model structure is relatively simple and prone to deviations (Shahabi and Hashim 2015; Juliev et al. 2019). Machine learning models have the advantages of strong adaptability and generalization, but are subjected to the selection of model parameters and the size of the training dataset (Goetz et al. 2015; Zhou et al. 2018). As a result, it can be found from previous studies that the performance of landslide susceptibility models varies in different geoenvironmental settings, and no individual model proves to be superior across all conditions (Reichenbach et al. 2018; Pourghasemi et al. 2018). Nonetheless, the majority of previous studies related to landslide susceptibility assessment implement only one single model, which might lead to unsatisfactory results (Vorpahl et al. 2012).

It is known that multi-model ensemble methods provide an efficient and practical tool to improve the results of single classification models by combining their outcomes. The trait of an ensemble method is the harnessing of collective intelligence of diverse individual models (referred to hereinafter as base models). Thus, the effect of the ensemble method can surpass that of the best base model. In recent years, diverse ensemble approaches have been developed and shown great effectiveness in the spatial evaluation of various environmental hazards, such as flood assessment (Mojaddadi et al. 2017; Shahabi et al. 2020), groundwater contamination mapping (Barzegar et al. 2018), gully erosion assessment (Hembram et al. 2020), and landslide susceptibility mapping (Aghdam et al. 2016; Saha et al. 2020).

The focus of this study is placed on the use of multi-model ensemble methods to combine the predictions made by individual base models to achieve high-accuracy and reliable prediction of landslide susceptibility. In this study, three ensemble methods are introduced and modified for landslide susceptibility mapping: matrix ensemble method (Wei et al. 2018; Ahmed et al. 2018), partition ensemble method (Hong et al. 2018b; Martinello et al. 2020), and probability-weighted ensemble method (Baldassarre et al. 2009; Zhang et al. 2013). These ensemble methods adopt different ensemble strategies. To illustrate the effectiveness of the proposed ensemble methods, four of the most commonly used landslide susceptibility models are selected as base models for performing the ensemble methods through a case study of Fengjie County located in the Three Gorges Reservoir Region, China. This study was carried out in the Faculty of Engineering, China University of Geosciences, Wuhan, during July 2020 to May 2021.

Materials and methods

Framework for implementation of ensemble methods

This study proposes three different types of ensemble methods to seek better prediction accuracy and reliability of landslide susceptibility mapping, including the qualitative matrix ensemble method, semi-quantitative partition ensemble method, and quantitative probability-weighted ensemble method. With the support of remote sensing and geographic information system (GIS) (Awais et al. 2021a, 2021b; Cheng et al. 2021; Shao et al. 2021; You et al. 2021), the historical landslides and associated landslide conditioning factors in the study area are derived as basic inputs for landslide susceptibility modelling.

The four base models selected for ensembles in the case study include two typical statistical models, i.e. frequency ratio (FR) model and fuzzy assessment (FA) model, and two prevailing machine learning models, i.e. backpropagation neural network (BPNN) model and support vector machine (SVM) model. These four base models are chosen mainly for the following two reasons. Firstly, the quality of base models directly affects the prediction performance of ensembles, and the four base models have proved to provide reasonably good performance in previous studies (Gemitzi et al. 2011; Bui et al. 2012; Reichenbach et al. 2018; He et al. 2019; Shano et al. 2021). Secondly, ensembles tend to yield better prediction performance when there is a significant diversity among the constituent base models (Kuncheva and Whitaker 2003), and the four base models are built on different modelling principles. It should be noted that the base models are not limited to the above four landslide susceptibility models, and the proposed three ensemble methods are also suitable for other well-established landslide susceptibility models.

Figure 1 shows a systematic framework for the application process of the three ensemble methods. Firstly, based on the geological and geomorphological settings of the study area (i.e. Fengjie County), a preliminary analysis of landslide conditioning factors is conducted. Secondly, the selected four base models are well trained and validated to evaluate the landslide susceptibility in the study area within a GIS environment. Thirdly, the three ensemble methods are employed to combine the outcomes of the four base models. Finally, an extensive comparative analysis of the prediction accuracy and reliability between the proposed three ensemble methods and the four base models is performed to validate the effects of the ensemble methods.

Study area and database

This section first introduces the geological and geomorphological settings of the study area, then presents a landslide inventory database consisting of historical landslides and landslide conditioning factors, and finally performs a sensitivity analysis of landslide conditioning factors to highlight their importance in causing landslides.

Geological and geomorphological settings in Fengjie County

Fengjie County is located in the east of Chongqing, China, as well as in the hinterland of the Three Gorges Reservoir Region, as shown in Fig. 2. It covers an area of 4098 km² between latitudes 30°29′19″ to 31°22′33″N, and longitudes 109°1′17″ to 109°45′58″E. Fengjie County belongs to the mountainous landform in the east of Sichuan Basin with undulating mountains and gullies. The elevation of the study area ranges from 175 to 2123 m above mean sea level. The terrain of Fengjie County is higher in the south-east, and lower in the middle and north-west. The river valley near the Yangtze River system in the middle of the county is relatively developed. The topography and geomorphology are crucial to the distribution of land coverage. Affected by topography and geomorphology, land coverage presents spatial differences in Fengjie County. With the increase in the elevation and slope degree, the exposed surface area gradually decreases and the vegetation coverage gradually increases.

Since Fengjie County is situated at the intersection of several fold belts, there exists complex geological structures mainly composed of folds. It has a humid subtropical monsoon climate with an average annual precipitation of 1132 mm, and the temperature drops sharply due to the great height of the mountains and the immense depth of the valleys. The Yangtze River runs through Fengjie County, and there are 49 reservoirs in the study area. Affected by the complex geological structures, continuous rainfalls, and periodic changes in reservoir water level, Fengjie County is saddled with frequent geological disasters such as landslides and collapses.

Establishment and preliminary analysis of landslide database

The description of the spatial relationship between historical landslides and their conditioning factors is essential to landslide susceptibility modelling. Firstly, it is necessary to prepare the landslide inventory database regarding landslides and associated landslide conditioning factors. A total of 1,550 historical landslides occurred from 1950 to 2019 in Fengjie County have been well documented, each annotated with crucial information on occurrence time, geographical coordinates of landslide location, geological and geomorphic conditions, triggering factors, type, scale, and consequences. Rainfall is observed to be the predominant triggering factor of landslides in Fengjie County. For spatial landslide susceptibility mapping, this study emphasizes the effect of the conditioning factors which are the intrinsic attributes that govern the stability condition of slopes, regardless of the role of rainfall which is the external triggering factor with a relative yearly recurrence pattern.

Meanwhile, to assist in the training of landslide susceptibility models, an equal number of non-landslide sample points are generated using the seed cell sampling strategy (Dagdelenler et al. 2016). To be specific, a total of 1550 non-landslide sample points are randomly extracted from the undisturbed areas that are over 2000 m away from the landslide locations. The 1550 historical landslides and 1550 non-landslide sample points are mapped out in the study area, which are, respectively, represented by the red and black dots as illustrated in Fig. 2.

Moreover, based on previous studies (Budimir et al. 2015; Wen et al. 2017; Hong et al. 2018a; Zhao and Chen 2020; Sun et al. 2020) and the local geoenvironmental conditions, a total of 12 key conditioning factors are determined: elevation, slope degree, slope aspect, terrain curvature, terrain roughness index (TRI), lithology, distance to fold, distance to river, stream power index (SPI), topographic wetness index (TWI), normalized difference vegetation index (NDVI), and distance to road. These factors basically capture the overall intrinsic features of the slope with respect to topography, geology, hydrology, land use, and human activities.

Elevation, slope degree, slope aspect, terrain curvature, and TRI characterize the terrain features in the study area. Elevation is an important topographic factor, which has significant influence on slope degree, terrain curvature, and other factors, and further affects slope stability (Pourghasemi et al. 2012; Zhou et al. 2018). Slope degree is a significant geomorphological feature of the slope, and varying slopes lead to spatial differences in stress distribution, loose material accumulation, and weathering degree of rock strata (Moayedi et al. 2019). Slope aspect has a certain influence on illumination, soil texture, wind speed, and vegetation development, which could affect the spatial distribution of landslides (Bui et al. 2011). Terrain curvature reflects terrain structure and shape (Kalantar et al. 2018). TRI describes surface fluctuations and reflects the topographic and geomorphic conditions (Hong et al. 2018a). The distribution maps for elevation, slope degree, slope aspect, terrain curvature, and TRI are produced from the digital elevation model (DEM) of Fengjie County (Fig. 3a–e). The DEM data, with a spatial resolution of 30 m, are derived from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) launched on NASA’s Terra satellite.

Lithology and distance to fold characterize geological conditions. Lithology determines rainfall erosion resistance and rock mass stability, and hence can strongly influence landslide occurrence (Roodposhti et al. 2014). The lithology data of the study area are extracted from the geological map of China Geological Survey (CGS) in 1:1 million scale (Fig. 3f). The presence of folds exerts significant influence on the distribution characteristics of rock strata, groundwater, and geological activities such as collapse and earthquake (Kayastha et al. 2013). The distribution map of distance to fold is generated by the Euclidean distance function in ArcGIS software (Fig. 3g).

Distance to river, SPI, and TWI characterize hydrogeological conditions. Distance to river directly affects the rock erosion process due to water level change and rainfall in the Three Gorges Reservoir Region. Based on the river network database in Fengjie County, the distribution map of distance to river is generated by the Euclidean distance function in ArcGIS software (Fig. 3h). SPI represents the degree of regional erosion by water flow, and TWI indicates ground humidity (Park et al. 2018). SPI and TWI are jointly determined by the river network database and the DEM data (Fig. 3i–j).

NDVI reflects regional land use and land cover in terms of the degree of vegetation coverage, which has great influence on the soil and hydrological conditions, and further affects slope stability (Jaafari et al. 2014). Using the Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite data, the NDVI map is generated from the Environment for Visualizing Images (ENVI) remote sensing image processing platform (Fig. 3k). Distance to road shows the influence of human engineering activities, as the modification of slopes in the process of road construction could influence landslide occurrence to some extent (Sujatha et al. 2012; Meneses et al. 2019). According to the road network database, the distribution map of distance to road is generated by the Euclidean distance function in ArcGIS software (Fig. 3l).

In Fig. 3, the whole study area is appropriately discretized into 4,861,338 square grids of pixels, and each pixel contains the data information of the 12 conditioning factors. Moreover, each conditioning factor is reclassified into five classes by the Jenks optimization method (Jenks 1967), as illustrated in Table 1. The Jenks optimization method classifies the conditioning factors using natural breaks in data values. The best arrangement of values into various classes is determined by minimizing the squared deviations of the class means. Consequently, the spatial distribution maps and statistical database for all of the 12 conditioning factors are constructed with their subclasses.

Table 1 Statistics of landslide conditioning factors

Full size table

To establish the spatial relationship between the landslides and the conditioning factors, for each class of every conditioning factor, the corresponding percentage of landslides and percentage of covered area are calculated and summarized in Table 1. The results demonstrate that landslide occurrence decreases with increasing elevation, terrain curvature, distance to river, distance to fold, distance to road, SPI, TWI, and TRI. The historical landslides in Fengjie County predominantly occur in elevation of 130–530 m, which is in line with the findings of Ercanoglu and Gokceoglu (2002) and Gemitzi et al. (2011). In addition, most landslides take place in slope degree between 12° and 30° in Fengjie County, which is similar to the resulting 15°–35° of the study area in Sujatha et al. (2012).

Sensitivity analysis of conditioning factors

A quantitative sensitivity analysis is performed to rank the importance of the 12 conditioning factors in causing landslides. The certainty factor is used to measure the relationship between the landslide density of a subarea and that of the whole study area (Sujatha et al. 2012). For each class of every conditioning factor, certainty factor is quantified by Eq. (1), and the result is given in Table 1. A positive value of certainty factor means the landslide density of the class is greater than the average value of the whole study area. The greater the certainty factor, the higher the probability of landslide occurrence.

$${\text{CF}}_{i} = \left\{ {\begin{array}{*{20}l} {\frac{{D_{i} - D}}{{D_{i} (1 - D)}},} \hfill & {D_{i} \ge D} \hfill \\ {\frac{{D_{i} - D}}{{D(1 - D_{i} )}},} \hfill & {D_{i} < D} \hfill \\ \end{array} } \right.$$

(1)

where CF_i and D_i, respectively, denote the certainty factor and landslide density in the ith class of a specified conditioning factor; and D denotes the average landslide density in the study area.

The sensitivity index of a conditioning factor is defined by the maximum difference between the certainty factor values in various classes of this conditioning factor, as seen in Eq. (2). A large value of the sensitivity index is indicated as a strong correlation with landslide susceptibility.

$${\text{SI}} = {\text{CF}}_{\max } - {\text{CF}}_{\min }$$

(2)

where SI stands for the sensitivity index of a conditioning factor; and CF_max and CF_min, respectively, represent the maximum value and minimum value of certainty factor among the five classes of this conditioning factor.

The obtained sensitivity indexes of the 12 landslide conditioning factors are then sorted in descending order to highlight their roles in causing slope failures in the study area, as shown in Fig. 4. The result shows that elevation is of utmost importance in landslide susceptibility, followed by lithology, distance to fold, and terrain curvature. As the result suggests, terrain features and geological formations and structures have dominant influence on slope stability. In contrast, topographic wetness index, aspect, and stream power index have lower sensitivity index values, indicating these three factors exert relatively less influence on landslide occurrence in the study area.

Four base models selected for ensembles

The above analysis of the spatial correlations between landslide conditioning factors and landslide distribution provides the basis for landslide susceptibility modelling and mapping. In this section, four of the most commonly used landslide susceptibility models (i.e. FR model, FA model, BPNN model, and SVM model) are introduced as the base models for ensembles. Each one of the four base models has been well trained and validated before performing ensemble methods.

Frequency ratio (FR) model

Frequency ratio is simply defined by the ratio of the landslide occurrence percentage to the area occupation percentage for various classes of every conditioning factor, as formulated in Eq. (3) (Samanta et al. 2018). A higher frequency ratio implies greater chances of landslides.

$${\text{FR}}_{i} = \frac{{{{N_{i} } \mathord{\left/ {\vphantom {{N_{i} } N}} \right. \kern-\nulldelimiterspace} N}}}{{{{S_{i} } \mathord{\left/ {\vphantom {{S_{i} } S}} \right. \kern-\nulldelimiterspace} S}}}$$

(3)

where FR_i stands for the frequency ratio of the ith class of a conditioning factor; N_i and S_i, respectively, denote the number of landslides and the area size within the ith class of this conditioning factor; N is the total number of landslides in the study area; and S is the total study area.

The FR values of the 12 conditioning factors (each with five classes) are calculated and presented in Table 1. For every pixel in the study area, the overall FR value is obtained by aggregating the FR values of the 12 conditioning factors. The overall FR values of all pixels are then grouped into five intervals by the Jenks optimization method. The study area is subsequently demarcated into five landslide susceptibility zones for visual interpretation: very low, low, medium, high, and very high, as shown in Fig. 5a. Comparing the produced landslide susceptibility map to the landslide/non-landslide points, it is found that the high- and very high-susceptibility classes agree well with the historical landslides along the Yangtze River, and the low- and very low-susceptibility classes agree well with the non-landslide sample points at the southern end of the study area.

Fuzzy assessment (FA) model

The key of the fuzzy assessment model is to establish appropriate membership functions for assigning membership values of landslide susceptibility to each pixel for every individual conditioning factor (Gemitzi et al. 2011). In this study, the membership function is expressed by Eq. (4). The membership values range between 0 and 1, reflecting the degree of certainty of membership.

$$\begin{aligned} \, M(T_{1} ) & = \left\{ \begin{gathered} 0, \, V > T_{2} \hfill \\ - (V - T_{2} )/(T_{2} - T_{1} ), \, T_{1} < V \le T_{2} \hfill \\ 1, \, V \le T_{1} \hfill \\ \end{gathered} \right. \\ M(T_{i = 2,3,4} ) & = \left\{ \begin{gathered} 0, \, V > T_{i + 1} \hfill \\ - (V - T_{i + 1} )/(T_{i + 1} - T_{i} ), \, T_{i} < V \le T_{i + 1} \hfill \\ (V - T_{i} )/(T_{i} - T_{i - 1} ), \, T_{i - 1} < V \le T_{i} \hfill \\ 0, \, V \le T_{i - 1} \hfill \\ \end{gathered} \right. \\ M(T_{5} ) & = \left\{ \begin{gathered} 1, \, V \ge T_{5} \hfill \\ (V - T_{4} )/(T_{5} - T_{4} ), \, T_{4} \le V \le T_{5} \hfill \\ 0, \, V < T_{4} \hfill \\ \end{gathered} \right. \\ \end{aligned}$$

(4)

where M(T_i) is the membership value; V is the value of a conditioning factor; and T₁–T₅ are the five control points of this conditioning factor derived by the natural breakpoint method. T_1–T₅ are corresponded to the five susceptibility levels (i.e. very low, low, medium, high, and very high) according to the certainty factor value of the five classes to which T₁–T₅ belong (as seen in Table 1). Thus, for every conditioning factor in each pixel, the membership values corresponding to the five susceptibility levels are determined. Subsequently, a 5 × 12 membership matrix is obtained for the 12 conditioning factors in each pixel.

In addition, the importance weight vector is also obtained as the normalized sensitivity index values of the 12 conditioning factors. Then through the product of the membership matrix and the importance weight vector, the weighted membership values of the five susceptibility levels are derived for each pixel. Finally, the susceptibility level of each pixel is recognized using the maximum membership principle. As shown in Fig. 5b, the resulting susceptibility map visually shows that most of the study area is predicted to be highly susceptible to landslides, and even some non-landslide sample points fall into the high- and very high-susceptibility classes.

Backpropagation neural network (BPNN) model

Backpropagation neural network is a multi-layer feedforward network trained by error backward propagation which is an optimization algorithm using gradient descent for supervised learning of artificial neural networks (Pradhan and Lee 2010). In this study, 1050 out of 1550 historical landslides and 1050 out of 1550 non-landslide sample points are randomly selected as the training dataset, and the remaining 500 landslides and 500 non-landslide sample points as the validation dataset. Then, the Bayesian regularization algorithm is applied to the BPNN model training, since this algorithm can achieve good generalization, fast convergence speed, and high learning accuracy (Bui et al. 2012). Meanwhile, the error backpropagation process is adjusted using the weight decay method to reduce the verification error.

Using the trained BPNN model, 72.14% of the training dataset and 85.10% of the validation dataset are successfully classified into the corresponding landslide susceptibility levels. It is noted that correct classifications include two circumstances: (1) historical landslides fall into the high- and very high-susceptibility classes, and (2) non-landslide sample points fall into the low- and very low-susceptibility classes. Finally, the landslide susceptibility zonation map of the study area is generated by the optimized BPNN model, as shown in Fig. 5c. It visually shows that the BPNN model has good performance in the north-west of the study area (where the densely distributed landslides fall into the very high-susceptibility classes) and at the south-east end of the study area (where the densely distributed non-landslide sample points fall into the very low-susceptibility classes).

Support vector machine (SVM) model

Support vector machine is a supervised nonlinear machine learning algorithm, which differentiates the classes with an optimal hyper-plane that maximizes the margin between the classes, and the data points closest to the hyper-plane are called support vectors (Vapnik 1995; Christianini and Shawe-Taylor 2000). Same as the dataset division for the BPNN model training, 1050 out of 1550 historical landslides and 1050 out of 1550 non-landslide sample points are randomly selected as the training dataset, and the remaining 500 landslides and 500 non-landslide sample points as the validation dataset. Then, the radial basis function (RBF) is employed as the kernel function to train the SVM model, since RBF kernel has been proved to produce the best prediction results in many studies (Pradhan 2013; Pourghasemi et al. 2013; He et al. 2019). The detailed description of underlying mathematics of applying the SVM model with RBF kernel to landslide prediction is presented in Pourghasemi et al. (2013).

Using the trained SVM model, 83.38% of the training dataset and 90.10% of the validation dataset are successfully classified into the corresponding landslide susceptibility levels. The resulting spatial prediction of landslide susceptibility in the study area is shown in Fig. 5d. It is found that the landslide and non-landslide points show satisfactory consistency with the detected very high- and very low-susceptibility classes, respectively.

Three types of ensemble methods proposed

Three different ensemble approaches, i.e. the matrix ensemble method, partition ensemble method, and probability-weighted ensemble method, are proposed to integrate the outcomes of the selected four base landslide susceptibility models mentioned above. The ensemble strategies and processes are described in detail below.

Matrix ensemble method

The matrix ensemble method is a qualitative decision support tool that uses a decision matrix to integrate two sources of information and make a comprehensive decision (Wei et al. 2018; Ahmed et al. 2018). In this study, a decision matrix is created to integrate landslide susceptibility results of different models, as shown in Table 2. The decision matrix is a two-dimensional symmetric matrix listing all possible combinations of landslide susceptibility levels (obtained from Model A and Model B) and the corresponding final susceptibility classifications. According to this decision matrix, for example, if the susceptibility levels of a pixel derived by Model A and Model B are “very high” and “medium”, respectively, then the ensemble result should be “high”; if the two models result in the same susceptibility level for a pixel, then the ensemble result will remain unchanged.

Table 2 Decision matrix for ensembling two landslide susceptibility models

Full size table

The process of applying the decision matrix to ensemble the four base models includes two steps, as illustrated in Fig. 6. First, the two statistical models (i.e. FR model and FA model) and the two machine learning models (i.e. BPNN model and SVM model) are ensembled separately according to the decision matrix. Then, the two sets of ensemble results obtained from the first step are ensembled again by the decision matrix. Consequently, the final landslide susceptibility zonation map is produced as seen in Fig. 5e. The map shows that most of the historical landslides fall into the predicted high- and very high-landslide-susceptibility areas such as the north-west region and the areas on both sides of the Yangtze River. The non-landslide sample points also agree well with the predicted low- and very low-landslide-susceptibility areas.

Partition ensemble method

The partition ensemble method firstly divides the study area into several subareas based on the regional topographic distribution characteristics, then selects the best evaluation results in each subarea, and assembles them together to generate a new landslide susceptibility map for the whole study area (Hong et al. 2018b; Martinello et al. 2020). Therefore, it is a semi-quantitative method. According to the presence of the main fold belts, the study area is partitioned into five subareas I–V as shown in Fig. 7. To compare the performance of the four base models in each subarea, the success rate involving both historical landslides and non-landslide sample points is calculated as below.

$${\text{SR}}_{i} = \frac{{n_{{i{\text{\_L}}}} + n_{{i{\text{\_NL}}}} }}{{N_{{i{\text{\_L}}}} + N_{{i{\text{\_NL}}}} }}$$

(5)

where SR_i is the success rate of a base model in subarea i; n_{i_L} is the number of historical landslides that fall into the high- and very high-susceptibility classes in subarea i; n_{i_NL} is the number of non-landslide sample points that fall into the low- and very low-susceptibility classes in subarea i; and N_{i_L} and N_{i_NL}, respectively, denote the total number of historical landslides and the total number of non-landslide sample points in subarea i.

Table 3 shows the calculated success rates of the four base models in each subarea. The largest success rate is considered to be the best evaluation result for each subarea among the four base models, which is highlighted in bold in each column of Table 3. It is found that the FA model performs best in terms of success rate in subarea I, and the SVM model performs best in the remaining four subareas. Consequently, the best evaluation results of landslide susceptibility in the five subareas are extracted to piece together the final landslide susceptibility zonation map, as shown in Fig. 5f. By selecting and combining the optimal result from the four base models in each subarea, the partition ensemble method yields better results than each base model.

Table 3 Comparison of the success rate between the four base models in each subarea

Full size table

Probability-weighted ensemble method

The probability-weighted ensemble method assigns probability weights to the outputs of the base models (Zhang et al. 2013; Hong et al. 2018a; Martinello et al. 2020). In this study, the weighting factors of the four base models are determined by the overall success rate of historical landslides as well as non-landslide sample points in the whole study area. Then the normalized weights assigned to various base models are calculated as follows (Zhang et al. 2013).

$${\text{P}} \left( {M_{i} |D} \right) = \frac{{{\text{P}} \left( {D|M_{i} } \right){\text{P}} \left( {M_{i} } \right)}}{{\sum\nolimits_{k = 1}^{K} {{\text{P}} \left( {D|M_{k} } \right){\text{P}} \left( {M_{k} } \right)} }}$$

(6)

where P(M_i|D) is the normalized posterior probability weight of the ith base model M_i; P(D|M_i) is the observation probability of M_i, which is equal to the overall success rate of historical landslides and non-landslide sample points in the whole study area obtained from M_i; P(M_i) is the prior probability weight of M_i, which is equally set to 0.25 for the four base models; D is the dataset consisting of the 1550 historical landslides and 1550 non-landslide sample points; and K is the number of base models which equals four.

Table 4 presents the overall success rates and the resulting normalized weights of the four base models. For each pixel in the study area, the landslide susceptibility levels predicted by the four base models are weighted correspondingly by the normalized weights to determine the final landslide susceptibility level of the pixel. Finally, the landslide susceptibility zonation map of the study area is produced using the probability-weighted ensemble method, as shown in Fig. 5g. This susceptibility map illustrates that most historical landslides fall into the high- and very high-susceptibility areas, and most non-landslide sample points fall into the low- and very low-susceptibility areas. A careful visual inspection reveals that the ensembled landslide susceptibility zonation maps obtained from the matrix and probability-weighted ensemble methods are very similar over the entire study area.

Table 4 Overall success rate and normalized weight of the four base models

Full size table

Results and discussion

To demonstrate the effectiveness of the proposed three ensemble methods for landslide susceptibility mapping, this section presents an extensive comparative analysis on the prediction performance of the three ensemble methods and four base models in a quantitative manner. Prediction performance has two connotations: prediction accuracy and reliability. In this study, prediction accuracy is examined by the two basic decision rules (used to define a high-quality landslide susceptibility zonation map) and the overall prediction accuracy (measured with the receiver operating characteristics curve). Reliability is characterized by the magnitude of statistical discrepancy calculated by the information entropy. Additionally, a brief discussion is provided to explicate what makes this study different from previous studies related to landslide susceptibility mapping, reveal the reasons behind the different effects of the proposed three ensemble methods, and recommend the most preferable ensemble method based upon prediction performance as well as practical operability.

Comparative analysis between the ensemble methods and base models

Two basic decision rules of a high-quality landslide susceptibility zonation map

In order to quantitatively evaluate and compare the prediction performance of the three ensemble methods and four base models, the percentage of landslide occurrence and the percentage of area occupation in each landslide susceptibility level are obtained from the produced landslide susceptibility zonation maps. The detailed statistical results are presented in Fig. 8.

A high-quality landslide susceptibility zonation map is supposed to fulfil the two basic decision rules: (1) most of the historical landslides should be located in the predicted high-susceptibility zones; (2) the high-susceptibility classes should cover small areas (Sujatha et al. 2012). In other words, the predicted highly susceptible zones are better to enclose more historical landslides and cover smaller areas. Based on the statistical results in Fig. 8, the percentage of landslide occurrence and the percentage of area occupation are calculated within the high- to very high-susceptibility classes for the three ensemble methods and four base models. The results are shown in Table 5.

Table 5 Statistical results of the four base models and three ensemble methods

Full size table

In the high- and very high-susceptibility classes obtained from the matrix ensemble method, partition ensemble method, and probability-weighted ensemble method, the percentages of landslide occurrence (i.e. success rate of historical landslides) are 96.00%, 90.26%, and 96.39%, respectively, and the percentages of area occupation are 54.71%, 35.85%, and 51.43%, respectively. Based on the two basic decision rules, it manifests that the matrix and probability-weighted ensemble methods produce relatively close results and perform better than the partition ensemble method in terms of success rate of historical landslides. The partition ensemble method yields smaller highly susceptible areas than the other two ensemble methods.

Among the four base models, the FA model has the highest success rate of historical landslides (95.23%) but also covers the largest areas (63.83%), while the SVM model covers the smallest highly susceptible areas (36.76%) but meanwhile encloses the least historical landslides (86.19%). The statistical results of the percentage of landslide occurrence and the percentage of area occupation in the high- to very high-susceptibility classes derived from the FR model and BPNN model are between that of the FA model and SVM model. The results demonstrate that none of the four base models can synchronously satisfy the above-mentioned two basic decision rules very well. Compared with the FA model, the matrix and probability-weighted ensemble methods perform better because they achieve not only higher success rate of historical landslides but also cover smaller highly susceptible areas. Compared with the SVM model, the partition ensemble method performs better because it achieves not only smaller highly susceptible areas but also higher success rate of historical landslides. The results show that the proposed three ensemble methods could better satisfy the two basic decision rules than the four base models, and thus are encouraging in improving the quality of the landslide susceptibility map for the study area.

It should be noted that although the two basic decision rules are widely used for directly assessing landslide susceptibility model performance (Sujatha et al. 2012), they only emphasize the classification accuracy and efficiency of landslide prone areas, and neglect the fitting goodness between the non-landslide sample points and the predicted stable zones, which might lead to limited insight into the full model performance. Moreover, the two basic decision rules could not identify the best ensemble method.

Overall prediction accuracy measured by the receiver operating characteristics curve

The receiver operating characteristics (ROC) curve is a standard tool for statistically measuring the overall prediction accuracy of landslide susceptibility models and the associated zonation maps by considering both landslides and non-landslides (Reichenbach et al. 2018). The ROC curve plots the true positive rate (sensitivity) on Y-axis and the false positive rate (1-specificity) on X-axis. The area under the curve (AUC) measures the probability of correct classification and thus characterizes the global accuracy of a model. An AUC value close to 1 indicates high accuracy. In this study, the ROC curve is acquired to further compare the overall prediction accuracy of the proposed three ensemble methods versus the four base models by comparing the susceptibility zonation maps to the 1550 historical landslides and 1550 non-landslide sample points.

Figure 9a depicts the ROC curves of the four base models. The FR model, FA model, BPNN model, and SVM model achieve AUC of 65.74%, 74.34%, 70.98%, and 78.60%, respectively. The AUC values are less than 80%, indicating the overall prediction accuracy of the four base models is not desirable. Nonetheless, the SVM model performs best among the four base models in terms of overall prediction accuracy. Figure 9b depicts the ROC curves obtained from the three ensemble methods. The matrix ensemble method, partition ensemble method, and probability-weighted ensemble method achieve AUC of 83.78%, 80.09%, and 82.61%, respectively. The ROC plots confirm that all the three ensemble methods have high performance (AUC > 80%) compared with the four base models (AUC < 80%). The matrix ensemble method provides the best improvement of overall prediction accuracy over the base models, followed by the probability-weighted ensemble method and partition ensemble method.

Statistical discrepancy quantification using information entropy

Reliability measures the consistency of results, which can be reasonably identified by the magnitude of statistical discrepancy (Karagrigoriou 2012). The smaller the observed discrepancy, the more consistent the evaluation results and the higher the reliability. There are various approaches to examine statistical discrepancy such as McNemar test (Kavzoglu et al. 2015), Friedman test (Zhou et al. 2018), Wilcoxon test (Sahin 2020), and information entropy theory (Grunwald and Dawid 2004).

In this study, information entropy theory is adopted to measure the discrepancy of landslide susceptibility mapping results. The map of information entropy provides a direct visualization of the discrepancy distribution in each pixel of the study area, and the average information entropy characterizes the overall statistical discrepancy in the whole study area. The information entropy in each pixel and the average information entropy in the whole study area, denoted as E(i) and E_A, respectively, are formulated as follows (Wellmann and Regenauer-Lieb 2012; Zhao et al. 2021).

$$E(i) = - \sum\limits_{l \in L} {{\text{P}}_{l} (i)} \log {\text{P}}_{l} (i)$$

(7)

$$E_{{\text{A}}} = - \frac{1}{{\left| {\mathbf{S}} \right|}}\sum\limits_{i \in S} {E(i)}$$

(8)

where P_l(i) is the probability of each landslide susceptibility level l in the pixel i; L is the set of the total five susceptibility levels; and |S| denotes the cardinality of the set S (herein |S| is equal to the total number of pixels). The information entropy in a pixel is 0 when the three ensemble methods or the four base models produce the exact same result of landslide susceptibility level, while the information entropy is the highest (i.e. equal to 1) when the methods/models result in completely distinct landslide susceptibility levels.

Figure 10 illustrates the information entropy distribution map of the four base models and the three ensemble methods. It is shown that the pixels with high entropy (the areas in red colour) in the information entropy map of the three ensemble methods are remarkably reduced compared to that of the four base models. Besides, the calculated average information entropies of the four base models and the three ensemble methods are 0.4937 and 0.2991, respectively, with a significant decrease of average information entropy after ensembles. The result indicates that the discrepancy of landslide susceptibility mapping results is effectively decreased by applying the proposed ensemble methods, even though their ensemble strategies are quite different. Hence, the uncertainty in landslide susceptibility mapping introduced by model diversity can be reduced through multi-model ensembles. From this point of view, the proposed ensemble methods can enhance the reliability of landslide susceptibility mapping in addition to increasing the prediction accuracy.

Discussion

The vast majority of previous studies related to landslide susceptibility mapping were carried out to develop optimized models for high prediction accuracy. For example, one of the most recent studies conducted by Sun et al. (2020) aimed to develop optimized random forest model involving hyper-parameter optimization using the Bayesian algorithm. Different from previous studies, this study adopts three different ensemble strategies to combine multiple well-developed landslide susceptibility models for an increase in prediction accuracy and higher reliability as well, and therefore is a systematic optimization research for landslide susceptibility mapping. Another merit of this study is that in addition to comparing the overall prediction accuracy, the reliability of the ensemble methods is further verified by measuring and comparing the statistical discrepancies between the methods/models.

As the evaluation results show, the proposed three ensemble methods exhibit different effects in improving the quality of landslide susceptibility mapping. The reason behind that is their different ensemble units and principles. As mentioned previously, the basic ensemble unit of the matrix and probability-weighted ensemble methods is the pixel, much finer than the subarea used in the partition ensemble method. As for the ensemble principles, the matrix and probability-weighted ensemble methods can fully integrate the outcomes of the four base models and produce comprehensive evaluation results. In other words, each base model contributes to the ensemble results. The partition ensemble method directly selects the result of a single base model with local optimum among the four base models. Therefore, the matrix and probability-weighted ensemble methods perform superior to the partition ensemble method.

Overall, among the three ensemble methods proposed, the matrix ensemble method is the most effective to improve the overall prediction accuracy of landslide susceptibility mapping. Besides, as it adopts a rather simple qualitative ensemble rule without complex calculation, this method is well universal and operable. To summarize, the matrix ensemble method has the best prediction performance as well as the strongest practical operability, and thus is the most preferable for regional landslide susceptibility mapping.

Conclusion

Each existing landslide susceptibility model has its specific strengths and limitations, which leads to the fact that there is no universal model that fits all geoenvironmental conditions. To improve the prediction accuracy and reliability of landslide susceptibility mapping, three different ensemble approaches for combining multiple susceptibility models are investigated in this paper, including the matrix ensemble method, partition ensemble method, and probability-weighted ensemble method. Within a GIS environment, a case study of Fengjie County located in the Three Gorges Reservoir Region, China, is conducted to illustrate the effectiveness of the proposed ensemble methods. Four of the most commonly used landslide susceptibility models are selected as base models for ensembles, including frequency ratio, fuzzy assessment, backpropagation neural network, and support vector machine models. The following conclusions can be drawn from this study.

1.
The three ensemble methods proposed can effectively improve the overall prediction accuracy of landslide susceptibility mapping by taking advantage of the base models. Compared to the four base models, the three ensemble methods better satisfy the two basic decision rules used to define a high-quality landslide susceptibility map. Moreover, the ensemble methods achieve higher overall prediction accuracy (AUC > 80%) than the four base models (AUC < 80%). In particular, the matrix ensemble method provides the best improvement in the overall prediction accuracy.
2.
The three ensemble methods proposed can also improve the reliability of landslide susceptibility mapping. The reliability of the ensemble methods is reasonably interpreted by the magnitude of statistical discrepancy of the landslide susceptibility mapping results which is measured by information entropy. The calculated average information entropy of the three ensemble methods is 0.2991, which is significantly smaller than that of the four base models (i.e. 0.4937). The results demonstrate that the proposed three ensemble methods could produce more consistent and reliable results. Therefore, the proposed multi-model ensemble methods are powerful tools for improving not only prediction accuracy but also reliability in landslide susceptibility mapping.
3.
Comparison between the three ensemble methods demonstrates that the matrix ensemble method is most advantageous in both prediction accuracy and practical operability, followed by the probability-weighted ensemble method and partition ensemble method. The final landslide susceptibility zonation map produced by the matrix ensemble method more accurately portrays the spatial likelihood of landslides occurring in Fengjie County. The percentages of area occupation in the very high-, high-, medium-, low-, and very low-landslide susceptibility classes are 34.42%, 20.30%, 10.90%, 8.74%, and 25.64%, respectively. This refined landslide hazard map can better facilitate local land planning, early warning, landslide mitigation, and other practical applications.

References

Achour Y, Boumezbeur A, Hadji R, Chouabbi A, Cavaleiro V, Bendaoud EA (2017) Landslide susceptibility mapping using analytic hierarchy process and information value methods along a highway road section in Constantine. Algeria Arab J Geosci 10(8):194
Article Google Scholar
Aghdam IN, Varzandeh MHM, Pradhan B (2016) Landslide susceptibility mapping using an ensemble statistical index (Wi) and adaptive neuro-fuzzy inference system (ANFIS) model at Alborz Mountains (Iran). Environ Earth Sci 75:553
Article Google Scholar
Ahmed B, Rahman M, Islam R, Sammonds P, Zhou C, Uddin K, Al-Hussaini TM (2018) Developing a dynamic Web-GIS based landslide early warning system for the Chittagong Metropolitan Area, Bangladesh. ISPRS Int J Geo-Inf 7(12):485
Article Google Scholar
Awais M, Li W, Cheema MJM, Hussain S, Ali A (2021a) Assessment of optimal flying height and timing using high-resolution unmanned aerial vehicle images in precision agriculture. Int J Environ Sci Technol 7:1–18
Google Scholar
Awais M, Li W, Cheema MJM, Hussain S, Ali A (2021b) Remotely sensed identification of canopy characteristics using uav-based imagery under unstable environmental conditions. Environ Technol Innov 1:101465
Article Google Scholar
Baldassarre GD, Castellarin A, Montanari A, Brath A (2009) Probability-weighted hazard maps for comparing different flood risk management strategies: a case study. Nat Hazards 50:479–496
Article Google Scholar
Barzegar R, Moghaddam AA, Deo R, Fijani E, Tziritis E (2018) Mapping groundwater contamination risk of multiple aquifers using multi-model ensemble of machine learning algorithms. Sci Total Environ 621:697–712
Article CAS Google Scholar
Budimir MEA, Atkinson PM, Lewis HG (2015) A systematic review of landslide probability mapping using logistic regression. Landslides 12(3):419–436
Article Google Scholar
Bui DT, Lofman O, Revhaug I, Dick O (2011) Landslide susceptibility analysis in the Hoa Binh province of Vietnam using statistical index and logistic regression. Nat Hazards 59(3):1413
Article Google Scholar
Bui DT, Pradhan B, Lofrnan O, Revhaug I, Dick OB (2012) Landslide susceptibility assessment in the Hoa Binh province of Vietnam: a comparison of the Levenberg–Marquardt and Bayesian regularized neural networks. Geomorphology 171–172:12–29
Google Scholar
Cheng Z, Gong WP, Tang HM, Juang CH, Deng Q, Chen J, Ye X (2021) UAV photogrammetry-based s assessment of the behavior of a landslide in Guizhou. China. Eng Geol 289:106172
Article Google Scholar
Christianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other Kernel-based learning methods. Cambridge University Press, London
Book Google Scholar
Dagdelenler G, Nefeslioglu HA, Gokceoglu C (2016) Modification of seed cell sampling strategy for landslide susceptibility mapping: an application from the Eastern part of the Gallipoli Peninsula (Canakkale, Turkey). B Eng Geol Environ 75(2):575–590
Article Google Scholar
Ercanoglu M, Gokceoglu C (2002) Assessment of landslide susceptibility for a landslide-prone area (north of Yenice, NW Turkey) by fuzzy approach. Environ Geol 41(6):720–730
Article Google Scholar
Gemitzi A, Falalakis G, Eskioglou P, Petalas C (2011) Evaluating landslide susceptibility using environmental factors, fuzzy membership functions and GIS. Global NEST J 13(1):28–40
Google Scholar
Goetz JN, Guthrie RH, Brenning A (2011) Integrating physical and empirical landslide susceptibility models using generalized additive models. Geomorphology 129(3–4):376–386
Article Google Scholar
Goetz JN, Brenning A, Petschko H, Leopold P (2015) Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput Geosci-UK 81:1–11
Article Google Scholar
Gong WP, Juang CH, Wasowski J (2021) Geohazards and human settlements: lessons learned from multiple relocation events in Badong, China-engineering geologist’s perspective. Eng Geol 285(7724):106051
Article Google Scholar
Grunwald PD, Dawid AP (2004) Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory. Ann Stat 32(4):1367–1433
Article Google Scholar
Guzzetti F, Reichenbach P, Cardinali M, Galli M, Ardizzone F (2005) Probabilistic landslide hazard assessment at the basin scale. Geomorphology 72:272–299
Article Google Scholar
He QF, Shahabi H, Shirzadi A, Li SJ, Chen W, Wang NQ, Chai HC, Bian HY, Ma JQ, Chen YT, Wang XJ, Chapi K, Ahmad BB (2019) Landslide spatial modelling using novel bivariate statistical based Naïve Bayes, RBF Classifier, and RBF Network machine learning algorithms. Sci Total Environ 663:1–15
Article CAS Google Scholar
Hembram TK, Paul GC, Saha S (2020) Modelling of gully erosion risk using new ensemble of conditional probability and index of entropy in Jainti River basin of Chotanagpur Plateau Fringe Area, India. Appl Geomat 12(11):337–360
Article Google Scholar
Hong H, Liu J, Bui DT, Pradhan B, Acharya TD, Pham BT, Zhu AX, Chen W, Ahmad BB (2018a) Landslide susceptibility mapping using J48 decision tree with AdaBoost, bagging and rotation Forest ensembles in the Guangchang area (China). CATENA 163:399–413
Article Google Scholar
Hong H, Pradhan B, Sameen MI, Kalantar B, Zhu A, Chen W (2018b) Improving the accuracy of landslide susceptibility model using a novel region-partitioning approach. Landslides 15(4):753–772
Article Google Scholar
Huang Y, Zhao L (2018) Review on landslide susceptibility mapping using support vector machines. CATENA 165:520–529
Article Google Scholar
Ilia I, Tsangaratos P (2016) Applying weight of evidence method and sensitivity analysis to produce a landslide susceptibility map. Landslides 13(2):379–397
Article Google Scholar
Jaafari A, Najafi A, Pourghasemi HR, Rezaeian J, Sattarian A (2014) GIS-based frequency ratio and index of entropy models for landslide susceptibility assessment in the Caspian forest, northern Iran. Int J Environ Sci Technol 11(4):909–926
Article Google Scholar
Jenks GF (1967) The data model concept in statistical mapping. Int Yearb Cartogr 7:186–190
Google Scholar
Juliev M, Mergili M, Mondal I, Nurtaev B, Pulatov A, Hübl J (2019) Comparative analysis of statistical methods for landslide susceptibility mapping in the Bostanlik District, Uzbekistan. Sci Total Environ 653:801–814
Article CAS Google Scholar
Kalantar B, Pradhan B, Naghibi SA, Motevalli A, Mansor S (2018) Assessment of the effects of training data selection on the landslide susceptibility mapping: a comparison between support vector machine (SVM), logistic regression (LR) and artificial neural networks (ANN). Geomat Nat Haz Risk 9(1):49–69
Article Google Scholar
Karagrigoriou A (2012) Measures of information and discrepancy in reliability engineering. Math Eng Sci Aerosp (MESA) 3(4):367–379
Google Scholar
Kavzoglu T, Sahin EK, Colkesen I (2015) Selecting optimal conditioning factors in shallow translational landslide susceptibility mapping using genetic algorithm. Eng Geol 192:101–112
Article Google Scholar
Kayastha P, Dhital MR, De Smedt F (2013) Application of the analytical hierarchy process (AHP) for landslide susceptibility mapping: a case study from the Tinau watershed, west Nepal. Comput Geosci-UK 52:398–408
Article Google Scholar
Kuncheva LI, Whitaker CJ (2003) Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Mach Learn 51(2):181–207
Article Google Scholar
Martinello C, Cappadonia C, Conoscenti C, Agnesi V, Rotigliano E (2020) Optimal slope units partitioning in landslide susceptibility mapping. J Maps. https://doi.org/10.1080/17445647.2020.1805807
Article Google Scholar
Melchiorre C, Matteucci M, Azzoni A, Zanchi A (2008) Artificial neural networks and cluster analysis in landslide susceptibility zonation. Geomorphology 94(3–4):379–400
Article Google Scholar
Meneses BM, Pereira S, Reis E (2019) Effects of different land use and land cover data on the landslide susceptibility zonation of road networks. Nat Hazard Earth Syst 19(3):471–487
Article Google Scholar
Micheletti N, Foresti L, Robert S, Leuenberger M, Pedrazzini A, Jaboyedoff M, Kanevski M (2014) Machine learning feature selection methods for landslide susceptibility mapping. Math Geosci 46:33–57
Article Google Scholar
Moayedi H, Mehrabi M, Mosallanezhad M, Rashid ASA, Pradhan B (2019) Modification of landslide susceptibility mapping using optimized PSO-ANN technique. Eng Comput-Germany 35(3):967–984
Article Google Scholar
Mohammady M, Pourghasemi HR, Pradhan B (2012) Landslide susceptibility mapping at Golestan Province, Iran: a comparison between frequency ratio, Dempster-Shafer, and weights-of-evidence models. J Asian Earth Sci 61:221–236
Article Google Scholar
Mojaddadi H, Pradhan B, Nampak H, Ahmad N, Ghazali AHB (2017) Ensemble machine-learning-based geospatial approach for flood risk assessment using multi-sensor remote-sensing data and GIS. Geomat Nat Hazards Risk 8:1080–1102
Article Google Scholar
Ozdemir A, Altural T (2013) A comparative study of frequency ratio, weights of evidence and logistic regression methods for landslide susceptibility mapping: Sultan Mountains, SW Turkey. J Asian Earth Sci 64:180–197
Article Google Scholar
Park SJ, Lee CW, Lee S, Lee MJ (2018) Landslide susceptibility mapping and comparison using decision tree models: a case study of Jumunjin Area. Korea Remote Sens 10(10):1545
Article Google Scholar
Pham BT, Pradhan B, Bui DT, Prakash I, Dholakia MB (2016) A comparative study of different machine learning methods for landslide susceptibility assessment: a case study of Uttarakhand area (India). Environ Model Softw 84:240–250
Article Google Scholar
Pourghasemi HR, Pradhan B, Gokceoglu C (2012) Application of fuzzy logic and analytical hierarchy process (AHP) to landslide susceptibility mapping at Haraz watershed, Iran. Nat Hazards 63(2):965–996
Article Google Scholar
Pourghasemi HR, Jirandeh AG, Pradhan B, Xu C, Gokceoglu C (2013) Landslide susceptibility mapping using support vector machine and GIS at the Golestan Province, Iran. J Earth Syst Sci 122(2):349–369
Article Google Scholar
Pourghasemi HR, Yansari ZT, Panagos P, Pradhan B (2018) Analysis and evaluation of landslide susceptibility: a review on articles published during 2005–2016 (periods of 2005–2012 and 2013–2016). Arab J Geosci 11(9):193
Article Google Scholar
Pradhan B (2013) A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. Comput Geosci-UK 51:350–365
Article Google Scholar
Pradhan B, Lee S (2010) Regional landslide susceptibility analysis using back-propagation neural network model at Cameron Highland, Malaysia. Landslides 7(1):13–30
Article Google Scholar
Reichenbach P, Rossi M, Malamud BD, Mihir M, Guzzetti F (2018) A review of statistically-based landslide susceptibility models. Earth Sci Rev 180:60–91
Article Google Scholar
Roodposhti MS, Rahimi S, Beglou MJ (2014) PROMETHEE II and fuzzy AHP: an enhanced GIS-based landslide susceptibility mapping. Nat Hazards 73(1):77–95
Article Google Scholar
Saha S, Saha A, Hembram TK, Pradhan B, Alamri AM (2020) Evaluating the performance of individual and novel ensemble of machine learning and statistical models for landslide susceptibility assessment at Rudraprayag District of Garhwal Himalaya. Appl Sci 10(11):3772
Article CAS Google Scholar
Sahin EK (2020) Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Appl Sci 2(7):1–17
Article CAS Google Scholar
Samanta S, Pal DK, Palsamanta B (2018) Flood susceptibility analysis through remote sensing, GIS and frequency ratio model. Appl Water Sci 8(2):66
Article Google Scholar
Shahabi H, Hashim M (2015) Landslide susceptibility mapping using GIS-based statistical models and remote sensing data in tropical environment. Sci Rep-UK 5(1):1–15
Google Scholar
Shahabi H, Shirzadi A, Ghaderi K, Omidvar E, Al-Ansari N, Clague JJ, Geertsema M, Khosravi K, Amini A, Bahrami S, Rahmati O, Habibi K, Mohammadi A, Nguyen H, Melesse AM, Ahmad BB, Ahmad A (2020) Flood detection and susceptibility mapping using Sentinel-1 remote sensing data and a machine learning approach: hybrid intelligence of bagging ensemble based on K-nearest neighbor classifier. Remote Sens 12(2):266
Article Google Scholar
Shano L, Raghuvanshi TK, Meten M (2021) Landslide susceptibility mapping using frequency ratio model: the case of Gamo highland, South Ethiopia. Arab J Geosci 14(7):623
Article Google Scholar
Shao H, Sun X, Lin Y, Xian W, Qi J (2021) A method for spatio-temporal process assessment of eco-geological environmental security in mining areas using catastrophe theory and projection pursuit model. Prog Phys Geog 1:030913332098254
Google Scholar
Sujatha ER, Rajamanickam GV, Kumaravel P (2012) Landslide susceptibility analysis using probabilistic certainty factor approach: a case study on Tevankarai stream watershed, India. J Earth Syst Sci 121(5):1337–1350
Article Google Scholar
Sun DL, Wen HJ, Wang DZ, Xu JH (2020) A random forest model of landslide susceptibility mapping based on hyperparameter optimization using Bayes algorithm. Geomorphology 362:107201
Article Google Scholar
Tang HM, Wasowski J, Juang CH (2019) Geohazards in the three Gorges Reservoir Area, China-lessons learned from decades of research. Eng Geol 261:105267
Article Google Scholar
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Book Google Scholar
Vorpahl P, Elsenbeer H, Märker M, Schröder B (2012) How can statistical models help to determine driving factors of landslides? Ecol Model 239:27–39
Article Google Scholar
Wei LW, Huang CM, Chen H, Lee CT, Chi CC, Chiu CL (2018) Adopting the I3–R24 rainfall index and landslide susceptibility for the establishment of an early warning model for rainfall-induced shallow landslides. Nat Hazard Earth Sys 18:1717–1733
Article Google Scholar
Wellmann JF, Regenauer-Lieb K (2012) Uncertainties have a meaning: information entropy as a quality measure for 3-D geological models. Tectonophysics 526–529:207–216
Article Google Scholar
Wen HJ, Zhang YY, Duan GF, Fu HM, Xie P, Zhou P, Yang Y (2017) Quantitative assessment of rainfall-induced landslide susceptibility in new urban area of Fengjie County, Three Gorges area, China. Nat Hazards Earth Syst Sci Discuss 1:21. https://doi.org/10.5194/nhess-2017-99
Article Google Scholar
You Y, Wang YD, Li SY (2021) Effects of eco-policy on Kuwait based upon data envelope analysis. Environ Dev Sustain 1:1–14
Google Scholar
Zhang J, Zhang LM, Huang HW (2013) Evaluation of generalized linear models for soil liquefaction probability prediction. Environ Earth Sci 68(7):1925–1933
Article Google Scholar
Zhao X, Chen W (2020) Optimization of computational intelligence models for landslide susceptibility evaluation. Remote Sens 12(14):2180
Article Google Scholar
Zhao C, Gong WP, Li TZ, Juang CH, Tang HM, Wang H (2021) Probabilistic characterization of subsurface stratigraphic configuration with modified random field approach. Eng Geol 288(2):106138
Article Google Scholar
Zhou C, Yin K, Cao Y, Ahmed B, Li YY, Catani F, Pourghasemi HR (2018) Landslide susceptibility modeling applying machine learning methods: a case study from Longju in the Three Gorges Reservoir area, China. Comput Geosci-UK 112:23–37
Article Google Scholar

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 41977242), the Major Program of National Natural Science Foundation of China (Grant No. 42090055), the Science and Technology Innovation Project of Chongqing Social Work and People’s Livelihood Guarantee (Grant No. cstc2017shmsA00002), and the Special Funding for Chongqing Postdoctoral Research Projects (Grant No. Xm2017006). These financial supports are gratefully acknowledged.

Author information

Authors and Affiliations

Faculty of Engineering, China University of Geosciences, Wuhan, 430074, Hubei, China
W. Gong, M. Hu, Y. Zhang & H. Tang
Chongqing Bureau of Geology and Minerals Exploration, Chongqing, 400039, China
D. Liu
Chongqing Key Laboratory of Geomechanics and Geoenvironment Protection, Army Logistics University, Chongqing, 401311, China
Q. Song
Chongqing Institute of Geology and Mineral Resources, Chongqing, 400042, China
Q. Song

Authors

W. Gong
View author publications
You can also search for this author in PubMed Google Scholar
M. Hu
View author publications
You can also search for this author in PubMed Google Scholar
Y. Zhang
View author publications
You can also search for this author in PubMed Google Scholar
H. Tang
View author publications
You can also search for this author in PubMed Google Scholar
D. Liu
View author publications
You can also search for this author in PubMed Google Scholar
Q. Song
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Y. Zhang.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Editorial responsibility: Shahid Hussain.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Gong, W., Hu, M., Zhang, Y. et al. GIS-based landslide susceptibility mapping using ensemble methods for Fengjie County in the Three Gorges Reservoir Region, China. Int. J. Environ. Sci. Technol. 19, 7803–7820 (2022). https://doi.org/10.1007/s13762-021-03572-z

Download citation

Received: 11 February 2021
Revised: 19 May 2021
Accepted: 19 July 2021
Published: 03 August 2021
Issue Date: August 2022
DOI: https://doi.org/10.1007/s13762-021-03572-z

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

GIS-based landslide susceptibility mapping using ensemble methods for Fengjie County in the Three Gorges Reservoir Region, China

Abstract

Similar content being viewed by others

Landslide susceptibility modeling based on GIS and ensemble techniques

Comparisons of several methods for landslide susceptibility mapping: case of the Benzilan and Waka Towns, Southwest China

GIS-based ensemble modelling of fuzzy system and bivariate statistics as a tool to improve the accuracy of landslide susceptibility mapping

Explore related subjects

Introduction

Materials and methods

Framework for implementation of ensemble methods

Study area and database

Geological and geomorphological settings in Fengjie County

Establishment and preliminary analysis of landslide database

Sensitivity analysis of conditioning factors

Four base models selected for ensembles

Frequency ratio (FR) model

Fuzzy assessment (FA) model

Backpropagation neural network (BPNN) model

Support vector machine (SVM) model

Three types of ensemble methods proposed

Matrix ensemble method

Partition ensemble method

Probability-weighted ensemble method

Results and discussion

Comparative analysis between the ensemble methods and base models

Two basic decision rules of a high-quality landslide susceptibility zonation map

Overall prediction accuracy measured by the receiver operating characteristics curve

Statistical discrepancy quantification using information entropy

Discussion

Conclusion

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation