Novel hybrid artificial intelligence approach of bivariate statistical-methods-based kernel logistic regression classifier for landslide susceptibility modeling

Chen, Wei; Shahabi, Himan; Shirzadi, Ataollah; Hong, Haoyuan; Akgun, Aykut; Tian, Yingying; Liu, Junzhi; Zhu, A-Xing; Li, Shaojun

doi:10.1007/s10064-018-1401-8

Novel hybrid artificial intelligence approach of bivariate statistical-methods-based kernel logistic regression classifier for landslide susceptibility modeling

Original Paper
Published: 09 November 2018

Volume 78, pages 4397–4419, (2019)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Bulletin of Engineering Geology and the Environment Aims and scope Submit manuscript

Novel hybrid artificial intelligence approach of bivariate statistical-methods-based kernel logistic regression classifier for landslide susceptibility modeling

Download PDF

Wei Chen ORCID: orcid.org/0000-0002-5825-1422^1,2,
Himan Shahabi³,
Ataollah Shirzadi⁴,
Haoyuan Hong^5,6,7,
Aykut Akgun⁸,
Yingying Tian⁹,
Junzhi Liu^5,6,7,
A-Xing Zhu^5,6,7 &
…
Shaojun Li¹⁰

2253 Accesses
149 Citations
Explore all metrics

Abstract

Globally, and in China, landslides constitute one of the most important and frequently encountered natural hazard events. In the present study, landslide susceptibility evaluation was undertaken using novel ensembles of bivariate statistical-methods-based (evidential belief function (EBF), statistical index (SI), and weights of evidence (WoE)) kernel logistic regression machine learning classifiers. A landslide inventory comprising 222 landslides and 15 conditioning factors (slope angle, slope aspect, altitude, plan curvature, profile curvature, stream power index, sediment transport index, topographic wetness index, distance to rivers, distance to roads, distance to faults, NDVI, land use, lithology, and rainfall) was prepared as the spatial database. Correlation analysis and selection of conditioning factors were conducted using multicollinearity analysis and classifier attribute evaluation methods, respectively. The receiver operating characteristic curve method was used to validate the models. The areas under the success rate (AUC_T) and prediction rate (AUC_P) curves and landslide density analysis were also used to assess the prediction capability of the landslide susceptibility maps. Results showed that the EBF-KLR hybrid model had the highest predictive capability in landslide susceptibility assessment (AUC values of 0.814 and 0.753 for the training and validation datasets, respectively; AUC_T value of 0.8511 and AUC_P value of 0.7615), followed in descending order by the SI-KLR and WoE-KLR hybrid models. These findings indicate that hybrid models could improve the predictive capability of bivariate models, and that the EBF-KLR is a promising hybrid model for the spatial prediction of landslides in susceptible areas.

Application of Naive Bayes, kernel logistic regression and alternation decision tree for landslide susceptibility mapping in Pengyang County, China

Article 25 May 2024

Spatial prediction of landslide susceptibility using data mining-based kernel logistic regression, naive Bayes and RBFNetwork models for the Long County area (China)

Article 13 March 2018

A comparative study of the bivariate, multivariate and machine-learning-based statistical models for landslide susceptibility mapping in a seismic-prone region in China

Article 06 March 2021

Discover the latest articles, news and stories from top researchers in related subjects.

Geoengineering

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Landslides are important natural hazard events that occur frequently in China and around the world. Steep topography, heavy precipitation, weak lithological units, adverse anthropologic treatments to land, and earthquakes are among the factors primarily responsible for landslide occurrence (Althuwaynee et al. 2015; Hong et al. 2017; Ma et al. 2015; Yuan et al. 2013, 2015, 2016). Because the occurrence location, size, and volume of landslides are reasonably predictable parameters, the potential for mitigation of their adverse effects is much greater compared with earthquakes. Specifically, landslide susceptibility maps that show the spatial occurrence probability of such events have been used for regional land use management by decision makers because of their effectiveness and ease of production. In this context, many studies conducted in the last two decades have focused on landslide susceptibility mapping.

Close inspection of published reports of landslide susceptibility studies reveals that several datasets and assessment methodologies have been developed and discussed (Broeckx et al. 2018; Hong et al. 2018; Pham et al. 2018; Pourghasemi and Rahmati 2018; Reichenbach et al. 2018; Shirzadi et al. 2017). Although there is no consensus regarding the optimal dataset and assessment methodology, some datasets (e.g., slope angle, lithology, and land use/cover) have been accepted widely as fundamental in landslide susceptibility mapping (Youssef et al. 2015). Certain assessment methodologies have also been adopted in many landslide susceptibility studies, e.g., the analytical hierarchy process (Kumar and Anbalagan 2016; Pourghasemi and Rossi 2016), frequency ratio (Regmi et al. 2014; Wang et al. 2016), statistical index (SI) (Nasiri Aghdam et al. 2016; Zhang et al. 2016a), evidential belief function (EBF) (Ding et al. 2017; Pourghasemi and Kerle 2016; Zhang et al. 2016b), logistic regression (LR) (Raja et al. 2017; Tsangaratos et al. 2017), and weights of evidence (WoE) (Ding et al. 2017; Wang et al. 2016). Given this variety in datasets and methodologies, it is important to compare the results obtained by different methods and datasets to determine the optimal combination.

In addition to the above statistical methods, more sophisticated machine learning methods, such as artificial neural networks (Chen et al. 2017b; Tien Bui et al. 2016; Yilmaz 2010), kernel logistic regression (KLR) (Tien Bui et al. 2016), support vector machine (Chen et al. 2017c; Pham et al. 2015; Pradhan 2013; Tien Bui et al. 2016), random forests (Chen et al. 2017g; Hong et al. 2016; Pourghasemi and Kerle 2016), decision trees (Althuwaynee et al. 2014; Hong et al. 2015; Pradhan 2013), multivariate adaptive regression splines (Chen et al. 2017d; Pourghasemi and Rossi 2016), and derivative approaches of artificial neural networks (Chen et al. 2017a; Nasiri Aghdam et al. 2016; Pradhan 2013; Tien Bui et al. 2012) have also become popular assessment methodologies through integration with developing GIS technologies.

Two of the major drawbacks of bivariate statistical approaches, such as EBF, SI, and WOE, are that strict assumptions must be defined prior to conducting any study (Benediktsson et al. 1989) and that the relationships between conditioning factors are largely neglected. Conversely, machine learning methods do not require any statistical assumptions and they are capable of handling data with various measurement scales; however, they cannot be used to evaluate the relationships between individual factor classes and landslides.

Given the above, it may be concluded that complex and nonlinear problems could be handled using ensemble methods (Tehrany et al. 2014). In this context, the main aim of this study was to investigate the effectiveness of the ensemble methodologies of KLR with bivariate EBF, SI, and WoE models based on comparison of the results obtained. The second purpose, of course, was to build a landslide susceptibility map for the study area that could be used by local decision makers for effective land use planning purposes. The investigation of the use of the EBF, SI, WoE, and KLR ensembles constitutes the novelty of this study.

Materials and methods

The methodology design comprised five steps: (1) spatial data preparation including landslide inventory and conditioning factors; (2) estimation of the EBF, SI, and WOE methods; (3) selection of conditioning factors; (4) construction of landslide susceptibility maps using three bivariate models and three ensemble models; and (5) assessment and validation of model performance (Fig. 1).

Study area

The study area (Chongren County), which is located in the region 27°25′N–27°56′N, 115°49′E–116°17′E, covers an area of about 1520 km² in Jiangxi Province (China) (Fig. 2). Chongren County has a subtropical monsoon climate. The average annual temperature is 17.7 °C (Hong et al. 2017). The high frequency of intense rainfall during April–August accounts for 79.5% of the annual total. The average rainfall in May and June is 265 and 305 mm, respectively.

The rivers in Chongren County belong to the Fu River system. The total flow path is up to 910 km, and the drainage density is 0.6 km⁻². The main rivers within the study area are the Chongren and Yihuang rivers. Geologically, the Chongren area is located within the depression belt uplift in central–southern Jiangxi Province, and it is a transition zone between the Yu Mountains and the Gan-Fu Plain. The strata outcropped in the study area are mainly pre-Sinian, Sinian, Cambrian, Carboniferous, Triassic, Jurassic, Cretaceous, and Quaternary. The main lithologies are limestone, shale, sandstone, slate, and igneous rocks (Fig. 3).

Database

Landslide inventory

The compilation of a landslide inventory is the first step in landslide susceptibility modeling, and various methods for this process have been applied in different studies (Harp et al. 2011; Moosavi et al. 2014). Landslide inventory maps are effective and easily comprehensible products for geomorphologists, decision makers, planners, and civil defense managers (Galli et al. 2008). However, the advantages and limitations of applying new remote sensing data and technologies in the production of landslide inventory maps have been discussed in previous work (Guzzetti et al. 2012). Thus, in light of the above analysis, this study adopted field surveys, historical records, and high-resolution satellite images coupled with Google Earth™ technology to produce the landslide inventory map.

In the current study, 222 landslide events were identified and mapped with projected area in the Chongren area. Through investigation of the landslide inventory map, the largest landslide was found to be 15,000 m², the smallest landslide was 2.5 m², and the average was 841.3 m² (Hong et al. 2017). In the Chongren area, local government reports show only 19.1% of the total number of landslides are large-sized landslides (>800 m²) that affect 1365 people (http://www.jxcr.gov.cn/). Medium-sized (200–800 m²) landslides account for 25.4% of the total, and they affect 1019 people. Small-sized landslides (<200 m²) account for 55.5% of the total, and they affect 875 people.

Conditioning factors

The causes of landslide development and occurrence are complex and diverse, and there is no clear agreement with respect to the precise reasons for their manifestation (Domínguez-Cuesta et al. 2007). The complex nature of the development of landslides (Jiménez Sánchez et al. 1999) has caused many researchers to investigate how landslide occurrence might be affected by various conditioning factors, e.g., the topographical, geological, and environmental conditions (Zêzere et al. 1999). Therefore, the selection of appropriate conditioning factors is a challenging task. Some previous studies have assumed that the use of increased numbers of conditioning factors would enhance the precision of a landslide susceptibility map (van Westen et al. 2003). However, other research has indicated that conditioning factors with reasonable quality are necessary for producing accurate landslide susceptibility maps (Jebur et al. 2014). Thus, according to a literature review (Broeckx et al. 2018; Pourghasemi 2014; Reichenbach et al. 2018) and our actual analysis of the geo-environmental characteristics of the study area and data availability, this study considered 15 landslide conditioning factors that were grouped into three categories: topographical, geological, and environmental.

Topographical factors

Topographical factors, such as slope angle, slope aspect, altitude, plan curvature, profile curvature, stream power index (SPI), sediment transport index (STI), and topographic wetness index (TWI), were derived based on 1:50,000 topographic maps (http://www.jxgtt.gov.cn/). Among them, slope angle was used to classify the degree of steepness of hills and mountains (Iwahashi et al. 2003). The initial slope angle is an important factor that affects the peak strength of the slope material, and it controls the source of material available for landslides (Chen et al. 2016). Therefore, in this study, slope was selected as a conditioning factor.

Slope aspect is defined as the direction in which a slope faces and it relates to the degree of solar exposure. Aspect also influences both the vegetation coverage and the daily ranges of temperature and relative humidity of a slope (Jonathan et al. 2006). Many articles have discussed the relationship between slope aspect and landslides; however, there is a lack of consensus regarding its adoption as a conditioning factor. Because slope aspect has been shown to influence landslides triggered by rainfall (Beullens et al. 2014), slope aspect was selected as a conditioning factor in this study.

Altitude is defined as the elevation above a ground reference point, which is commonly the terrain elevation. Altitude is considered an important landslide conditioning factor because of its gravitational potential energy.

Plan curvature influences the convergence and divergence of flow across a surface. Profile curvature affects the acceleration and deceleration of downslope flows, and it influences the processes of erosion and deposition (Kritikos and Davies 2015). These two factors were also accepted as conditioning factors in this study.

SPI is a term that describes the potential flow erosion of the topographic surface at a given point. STI also characterizes the processes of erosion and deposition. TWI can be used to quantify the effects of hydrological processes in relation to topography. Therefore, these three factors were also accepted as conditioning factors.

Geological factors

The lithological data were collected from the China Geology Survey (http://www.cgs.gov.cn/) (1:200,000 scale). The lithology map was reclassified into ten groups according to their geological ages and lithofacies (Hong et al. 2017). The distance to fault map was constructed by generating buffers along the fault lines using ArcGIS software (ESRI 2014).

Environmental factors

The NDVI was derived from Landsat-8 Operational Land Imagery (Path/Row: 121/41; date: November 01, 2017; Product ID: LC81210412017305LGN00; available at http://www.gscloud.cn). The value of the NDVI was estimated using the formula: NDVI = (NIR − R)/(NIR + R), where NIR and R are the near-infrared band and red band, respectively. The land use map was also obtained from the same Landsat 7/ETM+ satellite images. Land use was classified into six categories: residential, bare, water, forest, farmland, and grass. The distance to rivers and the distance to roads maps were also constructed by buffering 1:50,000-scale topographic maps.

The rainfall data were provided by the Jiangxi Province Meteorological Bureau (http://www.weather.org.cn). The mean annual precipitation data for the period of 1960–2012 at 18 rainfall stations were used to construct the rainfall map by application of the inverse distance weighted method (Hong et al. 2017).

Finally, all landslide conditioning factors were converted into raster format with 25-m spatial resolution for application with the models (Fig. 4a–o). The detailed classification of the landslide conditioning factors is shown in Table 1. The area grid comprised 2286 rows by 1782 columns, which corresponded to 2,427,151 cells, 222 of which included landslide occurrences.

Table 1 Classes of landslide conditioning factors

Full size table

Methods

Evidential belief function (EBF)

In 1967, Dempster first proposed the basis of the Dempster–Shafer theory of evidence (Dempster 1967), which was developed further by Shafer in 1976 (Shafer 1976). This method incorporates four basic EBFs: degrees of belief (Bel), disbelief (Dis), uncertainty (Unc), and plausibility (Pls), of which Bel = low probability and Pls = upper probability constitute the main elements of the theory (Dempster 1967). Unc represents the ignorance of one’s belief in a proposition based on given evidence and its value is Pls – Bel. Dis is the belief that a proposition is not true based on given evidence, the value of which is equal to 1 – Pls or 1 – Bel – Unc. The EBF method is popular in many fields of study, such as forest fire susceptibility mapping (Pourghasemi 2016), landslide susceptibility mapping (Ding et al. 2017; Pourghasemi and Kerle 2016; Pradhan et al. 2014; Tien Bui et al. 2015), and groundwater potential mapping (Mogaji et al. 2016; Tahmassebipoor et al. 2016). The estimation of EBFs can be calculated as follows:

$$ Bel(Cij)=\frac{W_{C_{ij(landslide)}}}{\sum_{j=1}^n{W}_{C_{ij(landslide)}}}, $$

(1)

$$ {W}_{C_{ij}(landslide)=}\frac{N\left(T\cap {C}_{ij}\right)/N(T)}{\left[N\left({C}_{ij}\right)-N\left(T\cap {C}_{ij}\right)\right]/\left[N(C)-N(T)\right]}. $$

(2)

The numerator in Eq. (2) is the proportion of landslide pixels that occur in factor class C_ij, and the denominator is the proportion of non-landslide pixels that occur in factor class C_ij. $ {W}_{C_{ij(landslide)}} $ is the weight of C_ij that supports the belief that landslides are present more than absent.

$$ {W}_{C_{ij}\left( non- landslide\right)=}\frac{\left[N\left({C}_{ij}\right)-N\left(T\cap {C}_{ij}\right)\right]/N(T)}{\left[N(C)-N(T)-N\left({C}_{ij}\right)+N\left(T\cap {C}_{ij}\right)\right]/\left[N(C)-N(T)\right]}. $$

(3)

The numerator in Eq. (3) is the proportion of landslide pixels that do not occur in factor class C_ij, and the denominator is the proportion of non-landslide pixels in other attributes outside factor class C_ij. $ {W}_{C_{ij}\left( non- landslide\right)} $ is the weight of C_ij that supports the belief that landslides are absent more than present. Therefore, we have the following equations:

$$ Dis\left({C}_{ij}\right)=\frac{W_{C_{ij}\left( non- landslide\right)}}{\sum_{j=1}^n{W}_{C_{ij\left( non- landslide\right)}}}, $$

(4)

$$ Un{c}_{Cij}=1- Be{l}_{Cij}- Di{s}_{Cij}, $$

(5)

$$ Pl{s}_{Cij}= Be{l}_{Cij}+ Un{c}_{Cij}. $$

(6)

Statistical index (SI)

The SI was proposed by van Westen (1997). In the SI method, a weight value of a parameter class is characterized by the natural logarithm of the landslide density in the class divided by the landslide density in the entire map. The equation to calculate the weights is as follows (van Westen 1997):

$$ {W}_{SI}=\ln \left(\frac{Densclass}{Densmap}\right)=\ln \left[\raisebox{1ex}{$\frac{N(Si)}{N(Ni)}$}\!\left/ \!\raisebox{-1ex}{$\frac{\sum N(Si)}{\sum N(Ni)}$}\right.\right], $$

(7)

where W_SI is the weight for the given parameter class, Densclass is the landslide density within the parameter class, and Densmap is the landslide density within the entire map. N(Si) is the number of landslide pixels in parameter class i, and N(Ni) is the number of pixels in the same parameter class.

Weights of evidence (WoE)

As one of the most popular models, the WOE method adopts the Bayesian theory of conditional probability to quantify spatial associations between evidence layers and known mineral occurrences (Agterberg 1989; Bonham-Carter 1994). In this study, we use the WOE for modeling large-scale landslide susceptibility spatial prediction. Recently, many researchers have applied WoE in various ways, such as mineral prospective mapping (Zeghouane et al. 2016), flood susceptibility (Rahmati et al. 2016), landslide susceptibility mapping (Ding et al. 2017), and groundwater potential (Mogaji et al. 2016; Tahmassebipoor et al. 2016). It is worth noting that conditional independence is the most important issue to be considered in the WOE method (Zhang et al. 2014). Hence, the WOE is determined by the calculation of positive and negative weights W⁺ and W⁻, which can be expressed as follows:

$$ {W}^{+}=\ln \frac{p\left\{B|A\right\}}{p\left\{B|\overline{A}\right\}}, $$

(8)

$$ {W}^{-}=\ln \frac{p\left\{\overline{B}|A\right\}}{p\left\{\overline{B}|\overline{A}\right\}}. $$

(9)

In the above two equations, p represents the probability, ln is the natural log, B is the presence of a potential landslide predictive factor, $ \overline{B} $ is the absence of a potential landslide predictive factor, A is the presence of a landslide, and $ \overline{A} $ is the absence of a landslide. Thus, W⁺ indicates that the predictable variable is present at the landslide locations and W⁻ indicates the absence of the predictable variable. In landslide susceptibility prediction, we use the studentized contrast C/S(C) to measure and reflect the spatial association between the landslide conditioning factors and landslide occurrence, where C is the weight contrast and S(C) is the standard deviation of C. These can be expressed as follows:

$$ C={W}^{+}-{W}^{-}, $$

(10)

$$ S(C)=\sqrt{S^2{W}^{+}+{S}^2{W}^{-}}, $$

(11)

where S²W⁺ is the variance of the positive weights and S²W⁻ is the variance of the negative weights.

Kernel logistic regression (KLR)

KLR is one type of logistic regression that applies kernel theory. The main aim of this approach is to classify a large quantity of data in a high-dimensional space because it might be difficult to distinguish in the current dimensional space using a linear logistic regression model (Cawley and Talbot 2005; Tien Bui et al. 2016). We can express the KLR as follows:

$$ \log it\left\{p\right\}=w\cdot \varphi (u)+c, $$

(12)

where w is the vector of the landslide conditioning factors, φ(u) is a nonlinear transformation to each input variable, and c is a bias term. For convenience, φ(u) can be simply calculated, i.e., the cause φ(u)φ^'(u) is a certain outcome during the calculation procedure, which evaluates the inner product between the image of input vectors in the feature space:

$$ K\left(u,{u}^{\hbox{'}}\right)=\varphi (u)\cdot {\varphi}^{\hbox{'}}(u). $$

(13)

For a kernel to support the interpretation as an inner product in a fixed feature space, the kernel must obey Mercers’ condition (Mercer 1909). Many kernel functions have been suggested, such as the radial basis function (RBF) and the linear kernel (Lin and Lin 2003). In this study, KLR was used to describe the problem:

$$ K\left(u,{u}^{\hbox{'}}\right)={e}^{\left[\frac{-{\left\Vert u-{u}^{\hbox{'}}\right\Vert}^2}{2{\delta}^2}\right]}, $$

(14)

where δ is a turning parameter that controls the sensitivity of the kernel. According to the represented theorem (Kimeldorf and Wahba 1971; Schölkopf et al. 2001), vector w can be determined by minimizing a cost function, which can be expressed as follows:

$$ w={\sum}_{i-1}^n{\alpha}_i\varphi (u) $$

(15)

where α_i,_i=(1,2,…,n) is the vector of the landslide conditioning factors. Thus, we obtain the following formula:

$$ \log it\left\{p\right\}={\sum}_{i-1}^n{\alpha}_iK\left(u,{u}^{\hbox{'}}\right)+c. $$

(16)

Construction of training and validation datasets

The values of 15 conditioning factors for the three bivariate models were extracted to the landslide inventory in this study. Landslide locations (grid pixels) were assigned to 1, whereas the same number of non-landslide locations (grid pixels) outside the landslides were assigned to 0. To evaluate the prediction capability of landslide susceptibility models, the landslide inventory and non-landslide dataset should be divided into two subsets, i.e., the training and validation sets (Chung and Fabbri 2003). Therefore, the landslide inventory and non-landslide dataset were split randomly into two parts with a ratio of 70:30 to construct and validate the models, respectively. There were 155 landslide locations and 155 non-landslide locations in the training dataset, while the validation dataset had 67 landslide locations and 67 non-landslide locations.

Correlation analysis of conditioning factors

As the ensemble models are a combination of KLR developed from logistic regression, the assessment of correlation among the landslide conditioning factors is an important issue. There are two parameters for assessing the multicollinearity analysis: tolerance (TOL) and the variance inflation factor (VIF) (Chen et al. 2017f).

$$ TOL=1-{R}^2, $$

(17)

$$ VIF=1/ TOL. $$

(18)

According to the literature, a TOL of less than 0.20 or 0.10 and/or a VIF of more than 5 or 10 implies a multicollinearity problem (O’Brien 2007).

Model performance and validation of landslide susceptibility maps

The performances of three landslide ensemble models were evaluated using the receiver operating characteristic (ROC) curve. The area under the ROC curve (AUC) is a significant measurement for the assessment of the prediction capability of models in landslide modeling (Tien Bui et al. 2016). When the AUC is equal to 1, an ideal model is acquired (Chen et al. 2017e). The AUC can be computed using the following equation:

$$ AUC=\frac{\sum TP+\sum TN}{P+N} $$

(19)

where TP is the number of landslides classified correctly, TN is the number of landslides classified incorrectly, P is the total number of landslides, and N is the total number of non-landslides.

The success rate and prediction rate curves of the landslide susceptibility maps were also used in this study. The curves were obtained by plotting the cumulative percentage of landslide susceptibility maps on the x-axis and the cumulative percentage of landslide pixels on the y-axis. The areas under the curves of the success rate (AUC_T) and the prediction rate (AUC_P) were used to reflect the prediction capability of the landslide susceptibility maps.

Results

Analyses of landslide conditioning factors

Multicollinearity analysis was calculated with the training dataset using IBM SPSS Statistics software. The results, shown in Tables 2 and 3, indicate that there were no multicollinearities among the 15 landslide conditioning factors.

Table 2 Spatial relationship between each landslide conditioning factor and landslide by EBF, SI, and WoE models

Full size table

Table 3 Multicollinearity analysis

Full size table

In addition to the multicollinearity analysis, the predictive capabilities of the landslide conditioning factors were assessed by applying the KLR model with the RBF kernel function. The results of the most effective conditioning factors of the different ensemble models are shown in Table 4. The results indicate that all factors contributed to the models. Altitude, with the highest average merit (AM) in three ensemble models, was found to be the most important factor, followed in descending order by distance to rivers, distance to roads, STI, TWI, lithology, NDVI, distance to faults, SPI, slope angle, rainfall, aspect, land use, and plan and profile curvatures, respectively. Some factors including rainfall, aspect, land use, and plan and profile curvatures made only small contributions to the landslide modeling. However, as all the AMs had positive values, all 15 conditioning factors were considered in constructing the landslide susceptibility maps.

Table 4 Predictive capabilities of conditioning factors using KLR model

Full size table

Ensemble of EBF and KLR models

In this study, the parameters of the EBF method (i.e., Bel, Dis, Unc, and Pls) were obtained using the equations introduced earlier (section “Evidential belief function (EBF)”) for each class of conditioning factors. These parameters were computed based on the ratio between the number of landslides per class and the area of each class. The results of the EBF method can be seen in Table 2.

All Pls weights of the conditioning factors were used as input datasets for the EBF and EBF-KLR methods. For each pixel of the study area, the probability of landslide occurrence (PLO) using a linear logistic regression function (Eq. 12) was computed. The PLOs were reclassified based on the area percentage method to construct the landslide susceptibility map. The landslide susceptibility maps based on the EBF and the ensemble of EBF and KLR are shown in Fig. 5a and b, respectively.

Ensemble of SI and KLR models

Similar to the process of combination of the EBF and KLR methods, the SI was computed for each class of conditioning factors. Then, the SIs were assigned as weights to each class. Eventually, each conditioning factor was reclassified based on its SI and was determined as an input for overlaying with the landslides to extract the dataset for the KLR algorithm. The results of the SI are displayed in Table 2.

All conditioning factors were reclassified based on their SI and then applied as input datasets for the SI and SI-KLR methods. The PLOs were also reclassified based on the area percentage method to construct the landslide susceptibility map. The landslide susceptibility maps based on the SI and SI-KLR models are shown in Fig. 5c and d, respectively.

Ensemble of WoE and KLR models

In the WoE-KLR ensemble model, the parameters C, S (C), and C/S (C) were calculated first based on their function, as described in section “Weights of evidence (WoE)”. The C/S (C) weights were transferred to each class of conditioning factors. Then, each factor was reclassified and overlaid with the landslide locations to construct a database for computing the PLOs using the KLR algorithm.

Similar to the EBF and SI models, the WoE model was used to establish the spatial relationship between each conditioning factor and the landslide locations. The results of SI are presented in Table 2. Based on the C/S(C) of the WoE model, all conditioning factors were reclassified and applied as input datasets for the WoE and WoE-KLR methods. The PLOs were also reclassified according to the area percentage method to construct the landslide susceptibility map. The landslide susceptibility maps based on the WoE and WoE-KLR models are shown in Fig. 5e and f, respectively.

In order to present a better comparison of landslide susceptibility maps, the five landslide susceptibility classes were determined as very high (5%), high (10%), moderate (15%), low (20%), and very low (50%) for the six landslide susceptibility maps.

Ensemble models’ performance and comparison

The AUC curve with the training dataset was used to assess the performances of the three ensemble models, as shown in Table 5 and Fig. 6a. The results indicate that all ensemble models have high prediction accuracy according to the AUC values. Additionally, the EBF-KLR ensemble model has the highest AUC value (AUC = 0.814) followed by the SI-KLR ensemble model (AUC = 0.811) and the WoE-KLR ensemble model (AUC = 0.806). The performances of the three ensemble models based on the AUROC with the validation dataset are shown in Table 6 and Fig. 6b. The results indicate the EBF-KLR ensemble model (AUC = 0.753) outperformed the SI-KLR ensemble model (AUC = 0.752) and the WoE-KLR ensemble model (AUC = 0.744). These findings suggest that although all landslide susceptibility ensemble models showed high prediction accuracy, the EBF-KLR ensemble model had the highest prediction capability for landslide susceptibility mapping in the study area.

Table 5 Ensemble models’ performance using training dataset

Full size table

Table 6 Ensemble models’ performance using validating dataset

Full size table

Validation of landslide susceptibility maps

The validation of the six landslide susceptibility maps produced by the three bivariate models and the three ensemble models were assessed based on the spatial cross-validation procedure mentioned in section “Model performance and validation of landslide susceptibility maps”. The corresponding AUC_T and AUC_P curves are shown in Figs. 7 and 8 and validation of the landslide susceptibility maps is presented in Table 7. For the training dataset, the EBF-KLR model had the highest prediction capability (0.8511), followed in descending order by the SI-KLR model (0.8505), WoE-KLR model (0.8397), EBF model (0.7978), SI model (0.7951), and WoE model (0.7825). For the validation dataset, the EBF-KLR model had the highest prediction capability (0.7615), followed in descending order by the SI-KLR model (0.7595), SI model (0.7503), EBF model (0.7437), WoE-KLR model (0.7286), and WoE model (0.7198). Thus, the results show the ensemble EBF-KLR was the most capable of mapping landslide susceptibility within the study area.

Table 7 Validation of landslide susceptibility maps

Full size table

Landslide density (LD) was also calculated to validate the landslide susceptibility maps. The LD is defined as the ratio between the percentages of landslides and the percentages of each susceptible class (Pham et al. 2016); higher susceptible classes should have higher LDs for reliable landslide susceptibility maps. As mentioned above, the area percentages of each susceptible class were defined as very high (5%), high (10%), moderate (15%), low (20%), and very low (50%). Therefore, it is only necessary to calculate the percentages of landslide locations for each class. The results of the LD analysis are shown in Table 8. It can be observed that the very high class has the highest LD values, followed in descending order by the high, moderate, low, and very low classes. The results also show that the ensemble models yielded better performance than the individual bivariate models, and that the EBF-KLR model improved the performance of the bivariate EBF model more significantly than the other two ensemble models.

Table 8 Landslide density analysis on landslide susceptibility maps

Full size table

Discussion

Landslide susceptibility describes the probability of landslide occurrence within a particular area, and the correlation between previous landslide locations and possible conditioning factors (Romer and Ferentinou 2016). In previous decades, many methods including traditional statistical models (Ding et al. 2017; Zhang et al. 2016a) and sophisticated machine learning models (Chen et al. 2017g; Pourghasemi and Kerle 2016; Youssef et al. 2016) have been used in conjunction with the development of GIS technology to predict the spatial distributions of landslides. However, both bivariate and machine learning approaches have their limitations, which could potentially be eliminated by the use of ensemble models. Therefore, it is necessary to explore and to compare new ensemble methods and techniques in application to landslide modeling. Recently, some ensemble machine learning methods have been applied in landslide susceptibility, for example, Shirzadi et al. (2017) used a Naive Bayes trees (NBT) and random subspace (RS) ensemble method for landslide susceptibility mapping at the Bijar region, Kurdistan province (Iran), and their result showed that NBT-RS significantly improved the performance of the NBT base classifier. Hong et al. (2018) found that J48 Decision Tree with the Rotation Forest model presents the highest prediction capability (AUC =0.855); it improved the performance of the J48 Decision Tree base classifier significantly. Pham et al. (2018) integrated the MultiBoost (MB) ensemble and support vector machine (SVM) for modeling of susceptibility of landslides in the Uttarakhand State, Northern India, and their result showed that the MBSVM outperforms the LR and single SVM models. Though many ensemble methods have been applied in landslide susceptibility, until now, there is still no agreement on which is the best ensemble method in landslide susceptibility mapping. In addition, more experiments are needed to compare different areas to find the difference among each method.

The most important step was to select the landslide conditioning factors because they affect the quality of landslide susceptibility analysis (Irigaray et al. 2007; Romer and Ferentinou 2016). However, there are no standard guidelines regarding the selection of landslide conditioning factors (Tien Bui et al. 2016). This study built a landslide susceptibility model using 15 landslide conditioning factors that included topographical, geological, and environmental factors. Then, TOL and VIF were used to establish the absence of multicollinearity among the 15 landslide conditioning factors (Table 3). Subsequently, the classifier attribute evaluation method (Witten et al. 2011) using the KLR model with the RBF kernel function was used to assess the importance of the variables. The results showed that all 15 conditioning factors had positive predictive capability in the model (Table 4); therefore, they were all used to build landslide susceptibility models.

The goodness-of-fits of three ensemble models were evaluated using ROC and AUC values. The results indicated that all ensemble models had high prediction accuracy based on the AUC values. The EBF-KLR ensemble model had the highest AUC values for both the training (AUC = 0.814) and the validation (AUC = 0.753) datasets, followed in descending order by the SI-KLR ensemble model and the WoE-KLR ensemble model (Tables 5 and 6). However, the results also showed that different conditioning factors had different contributions to the models (Table 4). In general, altitude, distance to rivers, and distance to roads were found to be the most important factors for the three ensemble models. Conversely, the factors of land use, profile curvature, and plan curvature yielded the lowest predictive capabilities for the three ensemble models. The normalized predictive capabilities of the conditioning factors for the three ensemble models were used to visualize the relative importance of the 15 conditioning factors (Fig. 9). It was observed that altitude contributed the highest percentages of 10.573, 10.397, and 11.111% for the EBF-KLR, SI-KLR, and WoE-KLR models, respectively; distance to rivers yielded the second highest contributions of 9.688, 9.713, and 9.677%, respectively, and distance to roads yielded the third highest contributions of 9.209, 9.234, and 9.319%, respectively. In contrast, profile curvature yielded the lowest contributions of 1.569, 1.710, and 1.362%, respectively. Therefore, because of the types of input variables and the models used, it was concluded that landslide conditioning factors tend to have different contributions (Tien Bui et al. 2016). Further studies should be undertaken to explore the optimum method for selecting the optimal factors for both this and similar study areas.

To evaluate and compare the three ensemble models with the three individual bivariate models, this study adopted the methods of the AUC_T and AUC_P curves, and LD analysis. The results suggested the three ensemble models showed higher prediction capabilities for both the training and the validation datasets than each of the three individual bivariate models. The EBF-KLR ensemble exhibited the optimal performance, which could improve the performance of the EBF model significantly. However, it should be noted that the other two ensembles also yielded reasonable performance.

Conclusions

This study evaluated landslide susceptibility in Chongren County (China) using novel ensembles of bivariate statistical-methods-based (EBF, SI, and WoE) kernel logistic regression machine learning classifiers. A series of conditioning factors (slope angle, slope aspect, altitude, plan curvature, profile curvature, SPI, STI, TWI, distance to rivers, distance to roads, distance to faults, NDVI, land use, lithology, and rainfall) were used as the inputs to the three hybrid models. A landslide inventory comprising 222 landslides was divided randomly into a training set (70%) for evaluation of the landslide susceptibility models and a validation set (30%) for validation of the model procedure.

The results showed that the three hybrid models were successful at identifying landslide-prone areas. The results also showed that the hybrid models could improve the predictive capability of the bivariate models, and that the EBF-KLR hybrid model yielded the highest predictive capability in landslide susceptibility assessment.

In conclusion, the landslide susceptibility maps produced in the present study may be useful for land use planning and decision making in areas prone to landslides. Moreover, this study also demonstrated the superiority of hybrid models in landslide susceptibility modeling.

References

Agterberg FP (1989) Computer programs for mineral exploration. Science 245:76–81
Althuwaynee OF, Pradhan B, Park HJ, Lee JH (2014) A novel ensemble decision tree-based CHi-squared automatic interaction detection (CHAID) and multivariate logistic regression models in landslide susceptibility mapping. Landslides 11:1063–1078
Article Google Scholar
Althuwaynee OF, Pradhan B, Ahmad N (2015) Estimation of rainfall threshold and its use in landslide hazard mapping of Kuala Lumpur metropolitan and surrounding areas. Landslides 12:861–875
Article Google Scholar
Benediktsson JA, Swain PH, Ersoy OK (1989) Neural network approaches versus statistical methods in classification of multisource remote sensing data, geoscience and remote sensing symposium. Igarss'89. Canadian symposium on remote sensing, pp 489–492
Beullens J, Velde DVD, Nyssen J (2014) Impact of slope aspect on hydrological rainfall and on the magnitude of rill erosion in Belgium and northern France. Catena 114:129–139
Article Google Scholar
Bonham-Carter GF (1994) Geographic information systems for geoscientists-modeling with GIS. Computer methods in the geoscientists 13:398
Google Scholar
Broeckx J, Vanmaercke M, Duchateau R, Poesen J (2018) A data-based landslide susceptibility map of Africa. Earth Sci Rev 185:102–121
Article Google Scholar
Cawley GC, Talbot NL (2005) The evidence framework applied to sparse kernel logistic regression. Neurocomputing 64:119–135
Article Google Scholar
Chen X-L, Liu C-G, Chang Z-F, Zhou Q (2016) The relationship between the slope angle and the landslide size derived from limit equilibrium simulations. Geomorphology 253:547–550
Article Google Scholar
Chen W, Panahi M, Pourghasemi HR (2017a) Performance evaluation of GIS-based new ensemble data mining techniques of adaptive neuro-fuzzy inference system (ANFIS) with genetic algorithm (GA), differential evolution (DE), and particle swarm optimization (PSO) for landslide spatial modelling. CATENA 157:310–324
Article Google Scholar
Chen W, Pourghasemi HR, Kornejady A, Zhang N (2017b) Landslide spatial modeling: introducing new ensembles of ANN, MaxEnt, and SVM machine learning techniques. Geoderma 305:314–327
Article Google Scholar
Chen W, Pourghasemi HR, Naghibi SA (2017c) A comparative study of landslide susceptibility maps produced using support vector machine with different kernel functions and entropy data mining models in China. Bull Eng Geol Environ. https://doi.org/10.1007/s10064-017-1010-y
Chen W, Pourghasemi HR, Naghibi SA (2017d) Prioritization of landslide conditioning factors and its spatial modeling in Shangnan County, China using GIS-based data mining algorithms. Bull Eng Geol Environ. https://doi.org/10.1007/s10064-017-1004-9
Chen W et al (2017e) A novel hybrid artificial intelligence approach based on the rotation forest ensemble and naïve Bayes tree classifiers for a landslide susceptibility assessment in Langao County, China. Geomatics, Nat Hazards Risk 8:1955–1977
Chen W et al (2017f) GIS-based landslide susceptibility modelling: a comparative assessment of kernel logistic regression, Naïve-Bayes tree, and alternating decision tree models. Geomat Nat Haz Risk 8:950–973
Chen W et al (2017g) A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. CATENA 151:147–160
Chung C-JF, Fabbri AG (2003) Validation of spatial prediction models for landslide hazard mapping. Nat Hazards 30:451–472
Article Google Scholar
Dempster AP (1967) Upper and lower probabilities induced by a multivalued mapping. Ann Math Stat 38:325–339
Article Google Scholar
Ding Q, Chen W, Hong H (2017) Application of frequency ratio, weights of evidence and evidential belief function models in landslide susceptibility mapping. Geocarto International 32:619–639
Domínguez-Cuesta MJ, Jiménez-Sánchez M, Berrezueta E (2007) Landslides in the central coalfield (Cantabrian Mountains, NW Spain): geomorphological features, conditioning factors and methodological implications in susceptibility assessment. Geomorphology 89:358–369
Article Google Scholar
ESRI (2014) ArcGIS desktop: release 10.2. Environmental Systems Research Institute, Redlands
Google Scholar
Galli M, Ardizzone F, Cardinali M, Guzzetti F, Reichenbach P (2008) Comparing landslide inventory maps. Geomorphology 94:268–289
Article Google Scholar
Guzzetti F et al (2012) Landslide inventory maps: new tools for an old problem. Earth Sci Rev 112:42–66
Article Google Scholar
Harp EL, Keefer DK, Sato HP, Yagi H (2011) Landslide inventories: the essential part of seismic landslide hazard analyses. Eng Geol 122:9–21
Article Google Scholar
Hong H, Pradhan B, Xu C, Tien Bui D (2015) Spatial prediction of landslide hazard at the Yihuang area (China) using two-class kernel logistic regression, alternating decision tree and support vector machines. Catena 133:266–281
Article Google Scholar
Hong H, Pourghasemi HR, Pourtaghi ZS (2016) Landslide susceptibility assessment in Lianhua County (China): a comparison between a random forest data mining technique and bivariate and multivariate statistical models. Geomorphology 259:105–118
Article Google Scholar
Hong H et al (2017) Rainfall-induced landslide susceptibility assessment at the Chongren area (China) using frequency ratio, certainty factor, and index of entropy. Geocarto International 32:139–154
Google Scholar
Hong H et al (2018) Landslide susceptibility mapping using J48 decision tree with AdaBoost, bagging and rotation Forest ensembles in the Guangchang area (China). Catena 163:399–413
Article Google Scholar
Irigaray C, Fernández T, Hamdouni RE, Chacón J (2007) Evaluation and validation of landslide-susceptibility maps obtained by a GIS matrix method: examples from the Betic cordillera (southern Spain). Nat Hazards 41:61–79
Article Google Scholar
Iwahashi J, Watanabe S, Furuya T (2003) Mean slope-angle frequency distribution and size frequency distribution of landslide masses in Higashikubiki area, Japan. Geomorphology 50:349–364
Article Google Scholar
Jebur MN, Pradhan B, Tehrany MS (2014) Optimization of landslide conditioning factors using very high-resolution airborne laser scanning (LiDAR) data at catchment scale. Remote Sens Environ 152:150–165
Article Google Scholar
Jiménez Sánchez M, Farias P, Rodríguez A, Menéndez Duarte RA (1999) Landslide development in a coastal valley in northern Spain: conditioning factors and temporal occurrence. Geomorphology 30:115–123
Article Google Scholar
Jonathan B, Marko H, Robert B, Brian H (2006) Influence of slope and aspect on long-term vegetation change in British chalk grasslands. J Ecol 94:355–368
Article Google Scholar
Kimeldorf G, Wahba G (1971) Some results on Tchebycheffian spline functions. J Math Anal Appl 33:82–95
Article Google Scholar
Kritikos T, Davies T (2015) Assessment of rainfall-generated shallow landslide/debris-flow susceptibility and runout using a GIS-based approach: application to western southern Alps of New Zealand. Landslides 12:1051–1075
Article Google Scholar
Kumar R, Anbalagan R (2016) Landslide susceptibility mapping using analytical hierarchy process (AHP) in Tehri reservoir rim region, Uttarakhand. J Geol Soc India 87:271–286
Article Google Scholar
Lin H-T, Lin C-J (2003) A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods. Neural Comput 3:1–32
Ma T, Li C, Lu Z, Bao Q (2015) Rainfall intensity–duration thresholds for the initiation of landslides in Zhejiang Province, China. Geomorphology 245:193–206
Article Google Scholar
Mercer J (1909) Functions of positive and negative type, and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character 209:415–446
Google Scholar
Mogaji K, Omosuyi G, Adelusi A, Lim H (2016) Application of GIS-based evidential belief function model to regional groundwater recharge potential zones mapping in Hardrock geologic terrain. Environmental Processes 3:93–123
Article Google Scholar
Moosavi V, Talebi A, Shirmohammadi B (2014) Producing a landslide inventory map using pixel-based and object-oriented approaches optimized by Taguchi method. Geomorphology 204:646–656
Article Google Scholar
Nasiri Aghdam I, Varzandeh MHM, Pradhan B (2016) Landslide susceptibility mapping using an ensemble statistical index (Wi) and adaptive neuro-fuzzy inference system (ANFIS) model at Alborz Mountains (Iran). Environ Earth Sci 75:1–20
Article Google Scholar
O’brien RM (2007) A caution regarding rules of thumb for variance inflation factors. Qual Quant 41:673–690
Article Google Scholar
Pham BT, Tien Bui D, Pourghasemi HR, Indra P, Dholakia MB (2015) Landslide susceptibility assessment in the Uttarakhand area (India) using GIS: a comparison study of prediction capability of naïve bayes, multilayer perceptron neural networks, and functional trees methods. Theor Appl Climatol 128:255–273
Pham BT, Tien Bui D, Prakash I, Dholakia MB (2016) Rotation forest fuzzy rule-based classifier ensemble for spatial prediction of landslides using GIS. Nat Hazards 83:97–127
Article Google Scholar
Pham BT, Jaafari A, Prakash I, Bui DT (2018) A novel hybrid intelligent model of support vector machines and the MultiBoost ensemble for landslide susceptibility modeling. Bull Eng Geol Environ. https://doi.org/10.1007/s10064-018-1281-y
Pourghasemi HR (2014) Landslide hazard prediction using data mining methods in the North of Tehran City. Dissertation, Tarbiat Modares University, p 143 (In Persian)
Pourghasemi HR (2016) GIS-based forest fire susceptibility mapping in Iran: a comparison between evidential belief function and binary logistic regression models. Scand J For Res 31:80–98
Article Google Scholar
Pourghasemi HR, Kerle N (2016) Random forests and evidential belief function-based landslide susceptibility assessment in Western Mazandaran Province, Iran. Environ Earth Sci 75:1–17
Article Google Scholar
Pourghasemi HR, Rahmati O (2018) Prediction of the landslide susceptibility: which algorithm, which precision? CATENA 162:177–192
Article Google Scholar
Pourghasemi HR, Rossi M (2016) Landslide susceptibility modeling in a landslide prone area in Mazandaran Province, north of Iran: a comparison between GLM, GAM, MARS, and M-AHP methods. Theor Appl Climatol. https://doi.org/10.1007/s00704-016-1919-2
Pradhan B (2013) A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. Comput Geosci 51:350–365
Article Google Scholar
Pradhan B, Abokharima MH, Jebur MN, Tehrany MS (2014) Land subsidence susceptibility mapping at Kinta Valley (Malaysia) using the evidential belief function model in GIS. Nat Hazards 73:1019–1042
Article Google Scholar
Rahmati O, Pourghasemi HR, Zeinivand H (2016) Flood susceptibility mapping using frequency ratio and weights-of-evidence models in the Golastan Province, Iran. Geocarto International 31:42–70
Article Google Scholar
Raja NB, Çiçek I, Türkoğlu N, Aydin O, Kawasaki A (2017) Landslide susceptibility mapping of the Sera River basin using logistic regression model. Nat Hazards 85:1323–1346
Article Google Scholar
Regmi AD et al (2014) Application of frequency ratio, statistical index, and weights-of-evidence models and their comparison in landslide susceptibility mapping in Central Nepal Himalaya. Arab J Geosci 7:725–742
Article Google Scholar
Reichenbach P, Rossi M, Malamud BD, Mihir M, Guzzetti F (2018) A review of statistically-based landslide susceptibility models. Earth Sci Rev 180:60–91
Article Google Scholar
Romer C, Ferentinou M (2016) Shallow landslide susceptibility assessment in a semiarid environment — a quaternary catchment of KwaZulu-Natal, South Africa. Eng Geol 201:29–44
Article Google Scholar
Schölkopf B, Herbrich R, Smola AJ (2001) A generalized representer theorem, computational learning theory. Springer, Heidelberg, pp 416–426
Book Google Scholar
Shafer G (1976) A mathematical theory of evidence. Technometrics 20:242
Google Scholar
Shirzadi A et al (2017) Shallow landslide susceptibility assessment using a novel hybrid intelligence approach. Environ Earth Sci 76:60
Article Google Scholar
Tahmassebipoor N, Rahmati O, Noormohamadi F, Lee S (2016) Spatial analysis of groundwater potential using weights-of-evidence and evidential belief function models and remote sensing. Arab J Geosci 9:1–18
Article Google Scholar
Tehrany MS, Pradhan B, Jebur MN (2014) Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. J Hydrol 512:332–343
Article Google Scholar
Tien Bui D, Pradhan B, Lofman O, Revhaug I, Dick OB (2012) Landslide susceptibility mapping at Hoa Binh province (Vietnam) using an adaptive neuro-fuzzy inference system and GIS. Comput Geosci 45:199–211
Article Google Scholar
Tien Bui D et al (2015) A novel hybrid evidential belief function-based fuzzy logic model in spatial prediction of rainfall-induced shallow landslides in the Lang Son city area (Vietnam). Geomatics, Natural Hazards and Risk 6:243–271
Article Google Scholar
Tien Bui D, Tuan TA, Klempe H, Pradhan B, Revhaug I (2016) Spatial prediction models for shallow landslide hazards: a comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides 13:361–378
Article Google Scholar
Tsangaratos P, Ilia I, Hong H, Chen W, Xu C (2017) Applying information theory and GIS-based quantitative methods to produce landslide susceptibility maps in Nancheng County, China. Landslides 14:1091–1111
Article Google Scholar
van Westen C (1997) Statistical landslide hazard analysis. ILWIS 2.1 for Windows application guide, pp 73–84
van Westen CJ, Rengers N, Soeters R (2003) Use of geomorphological information in indirect landslide susceptibility assessment. Nat Hazards 30:399–419
Article Google Scholar
Wang L-J, Guo M, Sawada K, Lin J, Zhang J (2016) A comparative study of landslide susceptibility maps using logistic regression, frequency ratio, decision tree, weights of evidence and artificial neural network. Geosci J 20:117–136
Article Google Scholar
Witten IH, Frank E, Mark AH (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, Burlington
Yilmaz I (2010) Comparison of landslide susceptibility mapping methodologies for Koyulhisar, Turkey: conditional probability, logistic regression, artificial neural networks, and support vector machine. Environ Earth Sci 61:821–836
Article Google Scholar
Youssef AM, Al-Kathery M, Pradhan B (2015) Landslide susceptibility mapping at Al-hasher area, Jizan (Saudi Arabia) using GIS-based frequency ratio and index of entropy models. Geosci J 19:113–134
Article Google Scholar
Youssef AM, Pourghasemi HR, Pourtaghi ZS, Al-Katheeri MM (2016) Landslide susceptibility mapping using random forest, boosted regression tree, classification and regression tree, and general linear models and comparison of their performance at Wadi Tayyah Basin, Asir region, Saudi Arabia. Landslides 13:839–856
Article Google Scholar
Yuan RM et al (2013) Density distribution of landslides triggered by the 2008 Wenchuan earthquake and their relationships to peak ground acceleration. Bull Seismol Soc Am 103:2344–2355
Article Google Scholar
Yuan R-m, Tang C-L, Deng Q-h (2015) Effect of the acceleration component normal to the sliding surface on earthquake-induced landslide triggering. Landslides 12:335–344
Article Google Scholar
Yuan R et al (2016) Newmark displacement model for landslides induced by the 2013 Ms 7.0 Lushan earthquake, China. Front Earth Sci 10:740–750
Article Google Scholar
Zeghouane H, Allek K, Kesraoui M (2016) GIS-based weights of evidence modeling applied to mineral prospectivity mapping of Sn-W and rare metals in Laouni area, central Hoggar, Algeria. Arab J Geosci 9:1–13
Article Google Scholar
Zêzere JLs, de Brum Ferreira A, Rodrigues MLs (1999) The role of conditioning and triggering factors in the occurrence of landslides: a case study in the area north of Lisbon (Portugal). Geomorphology 30:133–146
Article Google Scholar
Zhang D, Agterberg F, Cheng Q, Zuo R (2014) A comparison of modified fuzzy weights of evidence, fuzzy weights of evidence, and logistic regression for mapping mineral prospectivity. Math Geosci 46:869–885
Article Google Scholar
Zhang G et al (2016a) Integration of the statistical index method and the analytic hierarchy process technique for the assessment of landslide susceptibility in Huizhou, China. CATENA 142:233–244
Article Google Scholar
Zhang Z et al (2016b) GIS-based landslide susceptibility analysis using frequency ratio and evidential belief function models. Environ Earth Sci 75:1–12
Article Google Scholar

Download references

Acknowledgments

This research was supported by the International Partnership Program of Chinese Academy of Sciences (Grant No. 115242KYSB20170022), the National Natural Science Foundation of China (Grant No. 41807192), China Postdoctoral Science Foundation (Grant No. 2018 T111084, 2017 M613168), Project funded by Shaanxi Province Postdoctoral Science Foundation (Grant No. 2017BSHYDZZ07), and the Open Fund of Shandong Provincial Key Laboratory of Depositional Mineralization & Sedimentary Minerals (Grant No. DMSM2017029). We thank James Buxton MSc from Edanz Group (www.edanzediting.com./ac) for editing a draft of this manuscript.

Author information

Authors and Affiliations

College of Geology & Environment, Xi’an University of Science and Technology, Xi’an, 710054, Shaanxi, China
Wei Chen
Shandong Provincial Key Laboratory of Depositional Mineralization & Sedimentary Minerals, Shandong University of Science and Technology, Qingdao, 266590, China
Wei Chen
Department of Geomorphology, Faculty of Natural Resources, University of Kurdistan, Sanandaj, Iran
Himan Shahabi
Department of Rangeland and Watershed Management, Faculty of Natural Resources, University of Kurdistan, Sanandaj, Iran
Ataollah Shirzadi
Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing, 210023, China
Haoyuan Hong, Junzhi Liu & A-Xing Zhu
State Key Laboratory Cultivation Base of Geographical Environment Evolution (Jiangsu Province), Nanjing, 210023, China
Haoyuan Hong, Junzhi Liu & A-Xing Zhu
Jiangsu Center for Collaborative Innovation in Geographic Information Resource Development and Application, Nanjing, 210023, Jiangsu, China
Haoyuan Hong, Junzhi Liu & A-Xing Zhu
Geological Engineering Department, Karadeniz Technical University, Trabzon, Turkey
Aykut Akgun
Key Laboratory of Active Tectonics and Volcano, Institute of Geology, China Earthquake Administration, #1 Huayanli, Chaoyang District, PO Box 9803, Beijing, 100029, People’s Republic of China
Yingying Tian
State Key Laboratory of Geomechanics and Geotechnical Engineering, Institute of Rock and Soil Mechanics, Chinese Academy of Sciences, Wuhan, 430071, Hubei, China
Shaojun Li

Authors

Wei Chen
View author publications
You can also search for this author in PubMed Google Scholar
Himan Shahabi
View author publications
You can also search for this author in PubMed Google Scholar
Ataollah Shirzadi
View author publications
You can also search for this author in PubMed Google Scholar
Haoyuan Hong
View author publications
You can also search for this author in PubMed Google Scholar
Aykut Akgun
View author publications
You can also search for this author in PubMed Google Scholar
Yingying Tian
View author publications
You can also search for this author in PubMed Google Scholar
Junzhi Liu
View author publications
You can also search for this author in PubMed Google Scholar
A-Xing Zhu
View author publications
You can also search for this author in PubMed Google Scholar
Shaojun Li
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Wei Chen or Haoyuan Hong.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chen, W., Shahabi, H., Shirzadi, A. et al. Novel hybrid artificial intelligence approach of bivariate statistical-methods-based kernel logistic regression classifier for landslide susceptibility modeling. Bull Eng Geol Environ 78, 4397–4419 (2019). https://doi.org/10.1007/s10064-018-1401-8

Download citation

Received: 25 March 2018
Accepted: 04 October 2018
Published: 09 November 2018
Issue Date: 01 September 2019
DOI: https://doi.org/10.1007/s10064-018-1401-8

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Novel hybrid artificial intelligence approach of bivariate statistical-methods-based kernel logistic regression classifier for landslide susceptibility modeling

Abstract

Similar content being viewed by others

Application of Naive Bayes, kernel logistic regression and alternation decision tree for landslide susceptibility mapping in Pengyang County, China

Spatial prediction of landslide susceptibility using data mining-based kernel logistic regression, naive Bayes and RBFNetwork models for the Long County area (China)

A comparative study of the bivariate, multivariate and machine-learning-based statistical models for landslide susceptibility mapping in a seismic-prone region in China

Explore related subjects

Introduction

Materials and methods

Study area

Database

Landslide inventory

Conditioning factors

Topographical factors

Geological factors

Environmental factors

Methods

Evidential belief function (EBF)

Statistical index (SI)

Weights of evidence (WoE)

Kernel logistic regression (KLR)

Construction of training and validation datasets

Correlation analysis of conditioning factors

Model performance and validation of landslide susceptibility maps

Results

Analyses of landslide conditioning factors

Ensemble of EBF and KLR models

Ensemble of SI and KLR models

Ensemble of WoE and KLR models

Ensemble models’ performance and comparison

Validation of landslide susceptibility maps

Discussion

Conclusions

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding authors

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation