Introduction

Landslides, which cause significant damage to people and property, are among the most costly and damaging geological hazards in many areas of the world. The frequency of landslides increases as the human population grows. Globally, landslides cause hundreds of billions of dollars in damage, thousands of casualties and fatalities, and environmental losses each year (Aleotti and Chowdhury 1999). In China, more than 10,000 landslide-related hazards occurred in 2014, leaving 400 people dead or missing and 218 people injured and causing a direct economic loss of 5.41 billion CNY (C.H. of China geological environment information site (CIGEM) 2014). Currently, tens of millions of people still live under a high risk of landslides (Liu et al. 2013).

In general, landslide susceptibility mapping relies on a rather complex knowledge of slope movements and their controlling factors, and it can be carried out with qualitative methods, which are direct hazard mapping techniques, or quantitative methods, which are indirect mapping techniques (Fell et al. 2008; Grozavu et al. 2013; Kayastha et al. 2013; Youssef et al. 2014a, b; Jaupaj et al. 2014). The reliability of landslide susceptibility maps mainly depends on the amount and quality of available data, the working scale, and the selection of an appropriate methodology of analysis and modeling (Baeza and Corominas 2001). Over the last decades, many studies have evaluated landslide susceptibility using GIS, and many of them have applied probabilistic models (Lee and Min 2001; Baeza and Corominas 2001; Dahal et al. 2008; Pradhan et al. 2006, 2011; Youssef et al. 2009, 2012; Cevik and Topal 2003; Pradhan and Youssef 2010; Vijith and Madhu 2008; Clerici et al. 2002, 2006; Donati and Turrini 2002; Luzi et al. 2000; Jibson et al. 2000; Zhou et al. 2002; Parise and Jibson 2000; Lee and Choi 2003; Lee et al. 2004a, b; Akgun et al. 2012a; Pareek et al. 2013; Kayastha 2015; Youssef et al. 2015a, b). Statistical models, such as logistic regression models (Bathrellos et al. 2009; Akgun 2012; Tunusluoglu et al. 2007; Xu et al. 2012b; Devkota et al. 2013; Ozdemir and Altural 2013; Kundu et al. 2013; Park et al. 2013; Grozavu et al. 2013) and bivariate models (Pradhan and Youssef 2010; Pareek et al. 2010; Pradhan and Lee 2010; Pourghasemi et al. 2013a), have also been applied to landslide susceptibility mapping. Other methods, such as certainty factor (CF) (Devkota et al. 2013; Pourghasemi et al. 2013b), analytical hierarchy process (AHP) (Rozos et al. 2011; Bathrellos et al. 2012, 2013; Pourghasemi et al. 2012, 2013a; Park et al. 2013; Youssef et al. 2014a, b), spatial multicriteria decision analysis (MCDA) (Akgun and Turk 2010; Akgun 2012), weights of evidence (WoE) (Ozdemir and Altural 2013; Pourghasemi et al. 2013b, c; Regmi et al. 2014), statistical index (SI) (Bui et al. 2011; Regmi et al. 2014), index of entropy (IoE) model (Mihaela et al. 2011; Devkota et al. 2013), artificial neural network (ANN) (Nefeslioglu et al. 2008; Poudyal et al. 2010; Yilmaz 2009a, b, 2010a, b), fuzzy logic (Akgun et al. 2012b; Pourghasemi et al. 2012; Sharma et al. 2013; Guettouche 2013), support vector machine (SVM) (Yilmaz 2010b; Marjanović et al. 2011; Xu et al. 2012a; Pradhan 2013), and decision tree (Pradhan 2013), have also been applied for landslide susceptibility evaluation. All these models provide solutions for integrating information levels and mapping the outputs.

The aim of the present study was to produce landslide susceptibility maps of Qianyang County in Baoji, China (Fig. 1). For this purpose, landslide-related data were collected and compiled into a spatial database, and landslide-related factors were extracted and overlaid using three statistical models, frequency ratio (FR), statistical index (SI), and weights-of-evidence (WoE), in order to find the most accurate model for landslide susceptibility mapping in the study area. These models use information obtained from an inventory map to indicate landslide-prone areas, so that the hazard can be mitigated efficiently or even avoided in the future. To evaluate the accuracy of the three models, the landslide susceptibility results were validated against existing landslide locations using the area under the curve (AUC), which tested the models' prediction capabilities.

Fig. 1 Location map of the study area

The main difference between the present study and the approaches described in the aforementioned publications is that the frequency ratio (FR), statistical index (SI), and weights-of-evidence (WoE) models were applied to landslide susceptibility in the study area, and their results compared, for the first time.

Study area

The study area is located in Qianyang County of Baoji City, China, between latitudes 34°34′34″ and 34°56′56″N and longitudes 106°56′15″ and 107°22′31″E (Fig. 1). It covers an area of roughly 996.46 km². The altitude of the area ranges from 752 to 1560 m a.s.l. and decreases from north to south. The landform can be classified into mountain, hill, and plain. The slope angles of the area range from 0° to as much as 38°. The rivers of the study area belong to the Wei and Jing river basins. The mean annual rainfall recorded at the local station over a period of 40 years is around 627.4 mm. According to the records of China's meteorological department (C.H. of China Meteorological Administration (CMA) 2014), the minimum and maximum rainfall occur in January and September, respectively. The mean annual temperature is 11.8 °C. The stratigraphic column of the study area is shown in Fig. 2. The study area is mainly covered by loess, and 81 landslides are distributed across it. Figure 3 shows photographs of significant landslides that occurred in the study area.

Fig. 2 The stratigraphic column of the study area

Fig. 3 Field photographs of the study area

Methodology

Frequency ratio model

The frequency ratio (FR) approach, a variant of the probabilistic method, is based on the observed relationships between the distribution of landslides and each landslide conditioning factor (Tay et al. 2014). The frequency ratios for the class or type of each conditioning factor were calculated by dividing the landslide occurrence ratio by the area ratio. The landslide susceptibility index (LSI) was calculated by summation of each factor’s ratio value using Eq. (1) (Lee and Talib 2005):

$$ LSI={\displaystyle \sum FR} $$
(1)

where LSI is the landslide susceptibility index and FR is the frequency ratio of each factor type or class.
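The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow used in the study (which relied on ArcGIS 10.0 and Microsoft Excel): the function names, the class labels, and all counts are hypothetical and are not taken from Table 2.

```python
def frequency_ratios(landslides_per_class, cells_per_class):
    """Return FR per class: (share of landslides in class) / (share of area in class)."""
    total_landslides = sum(landslides_per_class.values())
    total_cells = sum(cells_per_class.values())
    ratios = {}
    for name, cells in cells_per_class.items():
        landslide_ratio = landslides_per_class.get(name, 0) / total_landslides
        area_ratio = cells / total_cells
        ratios[name] = landslide_ratio / area_ratio
    return ratios


def landslide_susceptibility_index(cell_classes, fr_tables):
    """Eq. (1): sum, over all factors, the FR of the class the cell falls into."""
    return sum(fr_tables[factor][cls] for factor, cls in cell_classes.items())


# Hypothetical counts for two factors (illustrative only, not from the study).
slope_fr = frequency_ratios({"gentle": 10, "steep": 30}, {"gentle": 300, "steep": 100})
litho_fr = frequency_ratios({"loess": 36, "other": 4}, {"loess": 200, "other": 200})

# LSI of one cell that is steep and underlain by loess.
lsi = landslide_susceptibility_index(
    {"slope": "steep", "lithology": "loess"},
    {"slope": slope_fr, "lithology": litho_fr},
)
print(slope_fr, litho_fr, lsi)
```

In this toy example, the steep class holds three quarters of the landslides on one quarter of the area, so its FR is well above 1, and the cell's LSI is simply the sum of the two class ratios.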

Statistical index model

The statistical index approach, a bivariate statistical analysis, is considered a simple and quantitatively suitable method for landslide susceptibility mapping. In this method, the weighting value for each categorical unit is defined as the natural logarithm of the landslide density in a class divided by the landslide density in the whole study area (Bourenane et al. 2015; Pourghasemi et al. 2013a). This method is based on the following equation:

$$ {W}_{ij}= \ln \left(\frac{D_{ij}}{D}\right)= \ln \left[\left(\frac{N_{ij}}{S_{ij}}/\frac{N}{S}\right)\right] $$
(2)

where \( W_{ij} \) is the weight given to a certain class i of parameter j, \( D_{ij} \) is the landslide density within class i of parameter j, D is the total landslide density within the entire map, \( N_{ij} \) is the number of landslides in a certain class i of parameter j, \( S_{ij} \) is the number of pixels in a certain class i of parameter j, N is the total number of landslides in the entire map, and S is the total number of pixels of the entire map.
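As a minimal sketch of Eq. (2), assuming hypothetical counts and a hypothetical function name, the weight of a class can be computed as follows. A class without landslides has no finite weight, because the logarithm of zero is undefined; how such classes were handled in the study is not stated, so the sketch simply returns `None` for them.

```python
import math


def statistical_index(n_ij, s_ij, n_total, s_total):
    """Eq. (2): W_ij = ln((N_ij / S_ij) / (N / S)).

    n_ij    -- landslides in class i of parameter j
    s_ij    -- pixels in class i of parameter j
    n_total -- landslides in the entire map
    s_total -- pixels in the entire map
    """
    if n_ij == 0:
        return None  # ln(0) is undefined; handling of empty classes is an assumption
    class_density = n_ij / s_ij
    map_density = n_total / s_total
    return math.log(class_density / map_density)


# Hypothetical example: 30 of 40 landslides fall in a class covering 100 of 400 pixels.
weight = statistical_index(30, 100, 40, 400)
print(weight)
```

A positive weight means the class is denser in landslides than the map as a whole, and a negative weight means it is sparser.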

Weights-of-evidence model

Weights-of-evidence (WoE), based on Bayes' theorem and assessing the relation between the spatial distribution of the areas affected by landslides and the spatial distribution of the conditioning factors causing landslides, is one of the bivariate models (Sujatha et al. 2014; Dahal et al. 2008). The WoE model is fundamentally based on the calculation of positive and negative weights, \( W_i^{+} \) and \( W_i^{-} \), which are assigned to each of the different classes of a causative factor (Van Westen et al. 2003) and are defined as:

$$ {W}_i^{+}= \ln \frac{P\left\{B\Big|D\right\}}{P\left\{B\Big|\overline{D}\right\}} $$
(3)
$$ {W}_i^{-}= \ln \frac{P\left\{\overline{B}\Big|D\right\}}{P\left\{\overline{B}\Big|\overline{D}\right\}} $$
(4)

where P is the probability, ln is the natural logarithm, B is the presence of a potential landslide causative factor, \( \overline{B} \) is the absence of a potential landslide causative factor, D is the presence of a landslide, and \( \overline{D} \) is the absence of a landslide. \( W_i^{+} \) and \( W_i^{-} \) are the weights-of-evidence when the causative variable is present and absent at the landslide locations, respectively (Dahal et al. 2008; Oh and Lee 2011). The standard deviation S(C) of the contrast between the two weights is calculated as:

$$ S(C)=\sqrt{S^2\left({W}^{+}\right)+{S}^2\left({W}^{-}\right)} $$
(5)

where \( S^2(W^{+}) \) and \( S^2(W^{-}) \) are the variances of \( W^{+} \) and \( W^{-} \), respectively. The difference between the two weights is known as the weight contrast, \( C = W_i^{+} - W_i^{-} \). C/S(C) provides a measure of the strength of the correlation between the analyzed variable and landslides (Dahal et al. 2008; Kouli et al. 2014).
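The weights, the contrast, and C/S(C) can be sketched from four pixel counts. The text does not state how the variances were estimated, so the sketch assumes the commonly used approximation in which the variance of each weight is the sum of the reciprocals of the two pixel counts entering it; that formula, the function name, and the counts are assumptions for illustration only.

```python
import math


def weights_of_evidence(n_bd, n_b, n_d, n_total):
    """Compute W+, W-, C, S(C), and C/S(C) for one factor class.

    n_bd    -- pixels in the class that contain a landslide (B and D)
    n_b     -- pixels in the class (B)
    n_d     -- landslide pixels in the whole map (D)
    n_total -- pixels in the whole map
    All four derived counts below must be non-zero.
    """
    n_b_nd = n_b - n_bd                      # class present, no landslide
    n_nb_d = n_d - n_bd                      # class absent, landslide
    n_nb_nd = n_total - n_b - n_d + n_bd     # class absent, no landslide
    n_nd = n_total - n_d                     # non-landslide pixels

    w_plus = math.log((n_bd / n_d) / (n_b_nd / n_nd))      # Eq. (3)
    w_minus = math.log((n_nb_d / n_d) / (n_nb_nd / n_nd))  # Eq. (4)

    # Assumed variance approximation (not given in the text).
    var_plus = 1 / n_bd + 1 / n_b_nd
    var_minus = 1 / n_nb_d + 1 / n_nb_nd

    contrast = w_plus - w_minus
    s_c = math.sqrt(var_plus + var_minus)                  # Eq. (5)
    return {"w_plus": w_plus, "w_minus": w_minus, "contrast": contrast,
            "s_c": s_c, "studentized": contrast / s_c}


# Hypothetical counts: 30 of 40 landslide pixels lie in a class of 100 pixels (map: 400).
result = weights_of_evidence(n_bd=30, n_b=100, n_d=40, n_total=400)
print(result)
```

With these counts the class is positively associated with landslides, so the positive weight is above zero, the negative weight is below zero, and the contrast is positive.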

Conditioning factors database

Landslide inventory

Historic information on landslide occurrences, which gives insight into the frequency, volumes, damages, and types of landslide phenomena, is the backbone of landslide susceptibility studies (Youssef et al. 2015a, b; van Westen et al. 2006). Landslide inventories collect data on regional landslide locations, types, activities, and physical properties, usually as a map with an associated database (Fell et al. 2008; Demir et al. 2015). A landslide inventory map provides the basic information for evaluating landslide hazards or risks, so accurate collection of landslide data is very important for landslide susceptibility analysis. In order to produce a detailed and reliable landslide inventory map, extensive field surveys and observations were performed in the study area. A total of 81 landslides (71 earth slides and 10 earth falls) were identified and mapped by evaluating aerial photos at 1:50,000 scale, supported by field surveys, and subsequently digitized for further analysis. The DEM of the study area was generated from topographic maps at 1:10,000 scale with a contour interval of 10 m. The locations (centroids) of the 81 landslides are mapped in Fig. 1. Of these landslides, 56 (70 %) were randomly selected for building the landslide susceptibility models, and the remaining 25 (30 %) were used for validating the models. The study area was divided into a grid of 50 × 50-m cells with 984 rows and 1146 columns.
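The random 70/30 partition of the inventory can be illustrated with a short sketch. The text does not describe how the random selection was made, so the shuffling procedure, the seed, the function name, and the integer landslide identifiers below are all assumptions.

```python
import random


def split_inventory(landslide_ids, train_fraction=0.7, seed=0):
    """Shuffle the inventory and split it into training and validation subsets."""
    rng = random.Random(seed)        # fixed seed so the split is reproducible
    shuffled = list(landslide_ids)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


# 81 hypothetical landslide identifiers, matching the size of the inventory.
training, validation = split_inventory(range(1, 82))
print(len(training), len(validation))
```

Rounding the training share down gives 56 training and 25 validation landslides for an inventory of 81, which matches the counts reported above.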

Landslide-conditioning factors

In this study, 13 landslide-conditioning factors (slope angle, slope aspect, curvature, plan curvature, profile curvature, altitude, distance to faults, distance to rivers, distance to roads, STI, SPI, TWI, and lithology) were considered for the landslide susceptibility mapping of the study area. All the data used in the current study were georeferenced to the Gauss_Kruger coordinate system, D_Beijing_1954 datum, zone 18 N. These factors are preparatory factors responsible for the occurrence of landslides in the region, for which pertinent data can be collected from available resources as well as from field surveys.

Slope angle

Slope angle is an important factor in the assessment of slope stability, and it is frequently used in preparing landslide susceptibility maps (Lee and Min 2001; Saha et al. 2005). The slope angle map of the study area was prepared from the digital elevation model (DEM) and reclassified into five classes: 0–7, 7–14, 14–21, 21–28, and 28–38° (Fig. 4a).
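The reclassification of a continuous slope angle into these five classes can be sketched as a lookup against the class breaks. The text does not say which class a boundary value such as exactly 7° belongs to, so the sketch assumes that it goes to the upper class; the names are hypothetical.

```python
from bisect import bisect_right

# Upper breaks between the five slope classes, in degrees.
SLOPE_BREAKS = [7, 14, 21, 28]
SLOPE_LABELS = ["0-7", "7-14", "14-21", "21-28", "28-38"]


def slope_class(angle_deg):
    """Return the class label for a slope angle (boundary values go to the upper class)."""
    return SLOPE_LABELS[bisect_right(SLOPE_BREAKS, angle_deg)]


print([slope_class(a) for a in (0, 6.9, 7, 20, 38)])
```

The same pattern applies to the other factors that are classified by fixed breaks, with their own break values.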

Fig. 4 Landslide-conditioning factors of the study area: a slope angle, b slope aspect, c curvature, d plan curvature, e profile curvature, f altitude, g distance to faults, h distance to rivers, i distance to roads, j STI, k SPI, l TWI, m lithology

Slope aspect

Slope aspect, accepted as a landslide-conditioning factor, describes the direction of slope (Ercanoglu et al. 2004; Pourghasemi et al. 2012). The slope aspect of the study area (Fig. 4b) is divided into nine classes, namely flat (−1) and eight directional classes: north (337.5–360°, 0–22.5°), northeast (22.5–67.5°), east (67.5–112.5°), southeast (112.5–157.5°), south (157.5–202.5°), southwest (202.5–247.5°), west (247.5–292.5°), and northwest (292.5–337.5°).
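The aspect classes listed above can be expressed compactly by shifting the angle by half a sector (22.5°) and dividing by the sector width (45°). This is a minimal sketch with a hypothetical function name; it assumes that any negative value marks a flat cell and that a boundary value belongs to the sector that starts there.

```python
ASPECT_NAMES = ["north", "northeast", "east", "southeast",
                "south", "southwest", "west", "northwest"]


def aspect_class(aspect_deg):
    """Map an aspect angle in degrees (or -1 for flat) to its directional class."""
    if aspect_deg < 0:
        return "flat"
    # Shift by half a sector so that north spans 337.5-360 and 0-22.5 degrees.
    sector = int(((aspect_deg + 22.5) % 360) // 45)
    return ASPECT_NAMES[sector]


print([aspect_class(a) for a in (-1, 0, 45, 180, 350)])
```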

Curvature

Generally, curvature is defined as the rate of change of slope angle or aspect, and the characterization of slope morphology and flow can be analyzed with the help of the curvature map (Nefeslioglu et al. 2008; Catani et al. 2013). In this study, the curvature that combines plan and profile curvature is taken into consideration (Fig. 4c). The curvature was derived from the DEM in the GIS software ArcGIS 10.0 and divided into three classes: <−0.05, −0.05 to 0.05, and >0.05.

Plan curvature

Plan curvature is the curvature of a contour line formed by intersecting a horizontal plane with the surface. Plan curvature influences the convergence or divergence of water during downhill flow (Yilmaz et al. 2012). In this study, the plan curvature was derived from the DEM in the GIS software ArcGIS 10.0 and divided into three classes: <−0.05, −0.05 to 0.05, and >0.05 (Fig. 4d).

Profile curvature

Profile curvature is the curvature in the vertical plane parallel to the slope direction. It measures the rate of change of slope and therefore influences the flow velocity of water draining the surface, and thus erosion and the resulting downslope movement of sediment (Yilmaz and Topal 2012). In this study, the profile curvature was also derived from the DEM in the GIS software ArcGIS 10.0 and divided into three classes: <−0.05, −0.05 to 0.05, and >0.05 (Fig. 4e).

Altitude

Altitude, or elevation, is another frequently used conditioning factor for landslide susceptibility analysis. In the present study, the DEM of the study area was generated from topographic maps at 1:10,000 scale with a contour interval of 10 m. The elevation of the study area ranged from 720 to 1560 m. The elevation values were divided into five categories using an interval of 150 m (Fig. 4f).

Distance to faults

Geological faults are responsible for triggering a large number of landslides because tectonic breaks usually decrease the rock strength. The faults of the study area were digitized from the geological map at 1:250,000 scale, and the distance to faults was calculated at 2000-m intervals (Fig. 4g).

Distance to rivers

Runoff plays an important role as a triggering factor for landslides, because rivers are among the main mechanisms that contribute to the occurrence of landslides in mountainous regions (Park et al. 2013). For the current study, six different buffer categories were created within the study area to determine the degree to which the streams affected the slopes (Fig. 4h).

Distance to roads

The distance to roads is considered one of the most important anthropogenic factors influencing landslide occurrence, because road construction creates cut slopes that disturb the natural topography and affect the stability of the slope. The study area was divided into five buffer zones to represent the influence of roads on slope stability (Fig. 4i): 0–1000, 1000–2000, 2000–3000, 3000–4000, and >4000 m.

STI

The sediment transport index (STI) characterizes the processes of erosion and deposition (Devkota et al. 2013). In the present study, STI is divided into four classes: <3, 3–9, 9–15, and >15 (Fig. 4j).

SPI

The stream power index (SPI), a measure of the erosive power of water flow based on the assumption that discharge is proportional to the specific catchment area, is a compound topographic attribute (Conforti et al. 2011). The SPI map of the study area was classified into four classes: <5, 5–10, 10–40, and >40 (Fig. 4k).

TWI

The topographic wetness index (TWI) is another topographic factor within the runoff model (Pourghasemi et al. 2013d). In the present study, the TWI values, derived from the DEM, were arranged in four classes: <7, 7–10, 10–13, and >13 (Fig. 4l).
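The text does not give the formulas used for STI, SPI, and TWI. The sketch below assumes the definitions that are widely used for these compound topographic indices, with the specific catchment area and the local slope angle as inputs; the function and parameter names are hypothetical, and the actual computation in the study may have differed.

```python
import math


def stream_power_index(catchment_area, slope_deg):
    """SPI = A_s * tan(beta), assuming the common definition."""
    return catchment_area * math.tan(math.radians(slope_deg))


def topographic_wetness_index(catchment_area, slope_deg):
    """TWI = ln(A_s / tan(beta)); undefined for a slope of exactly 0 degrees."""
    return math.log(catchment_area / math.tan(math.radians(slope_deg)))


def sediment_transport_index(catchment_area, slope_deg):
    """STI = (A_s / 22.13)^0.6 * (sin(beta) / 0.0896)^1.3, assuming the common definition."""
    beta = math.radians(slope_deg)
    return (catchment_area / 22.13) ** 0.6 * (math.sin(beta) / 0.0896) ** 1.3


# Hypothetical cell: specific catchment area of 100 m and a slope of 45 degrees.
print(stream_power_index(100, 45),
      topographic_wetness_index(100, 45),
      sediment_transport_index(100, 45))
```

Flat cells need special handling in practice, because the tangent of a zero slope makes TWI undefined; a small minimum slope is one common workaround.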

Lithology

Lithology is one of the most common determinant factors in landslide stability studies. Since different lithological units have different landslide susceptibility values, they are very important in providing data for susceptibility studies. The lithology map of the study area was derived from existing geological maps at 1:250,000 scale. The study area is covered by various types of lithological units. The names, lithologic characteristics, and ages of the geological units are provided in Table 1, and their distribution is shown in Fig. 4m.

Table 1 Description of geological units of the study area

Results and discussion

Frequency ratio model

Using the frequency ratio model, frequency ratios for the class or type of each factor were calculated by dividing the landslide occurrence ratio by the area ratio. A frequency ratio of 1 represents the average value for the whole area. A value less than 1 indicates a lower correlation and thus a lower probability of landslide occurrence, whereas a value greater than 1 indicates a higher correlation and a higher probability of landslide occurrence. The FR of all the thematic layers used in the present study was calculated in ArcGIS 10.0 and Microsoft Excel, and the results are given in Table 2.

Table 2 Spatial relationship between each landslide conditioning factor and landslide by FR, SI and WoE models

A landslide susceptibility map (Fig. 5) was constructed from the LSI values in ArcGIS 10.0. The calculated LSI values for the FR model of the study area range from about 6.59 to 25.32. Larger LSI values indicate a higher susceptibility to landsliding. The index values were classified into five zones (very low, low, moderate, high, and very high) using the natural break method. In the resulting map, 7.57 % of the area falls in the very high susceptibility class and 14.28 % in the high class, while 23.99, 30.75, and 23.41 % fall in the moderate, low, and very low classes, respectively.

Fig. 5 Landslide susceptibility map derived from the FR model

Statistical index

To perform the statistical index modeling, the resultant weights for each thematic map for the SI model were calculated in ArcGIS 10.0 and Microsoft Excel, and the results are shown in Table 2. The higher the resultant weight, the higher the possibility that a mass movement occurs within the area covered by the considered class. These weights were combined using the weighted sum option in the spatial analyst tools of ArcGIS 10.0 to obtain the final landslide susceptibility map (Fig. 6). The landslide susceptibility map was classified into five categories using the natural break method of ArcGIS: very low (−9.42 to −5.06), low (−5.06 to −2.69), moderate (−2.69 to −0.38), high (−0.38 to 1.93), and very high (1.93 to 6.93). The area percentages in the very low, low, moderate, high, and very high landslide susceptibility classes are 20.49, 26.80, 21.66, 18.84, and 12.21 %, respectively.

Fig. 6 Landslide susceptibility map derived from the SI model

Weights-of-evidence model

Every parameter map was crossed with the landslide inventory map according to the weights-of-evidence model in ArcGIS 10.0, and the density of landslides in each class was calculated. The resultant weights for each thematic map for the WoE model are given in Table 2. To obtain the final LSI map (Fig. 7), these weights were combined using the weighted sum option in the spatial analyst tools of ArcGIS 10.0. The final calculated LSI values of the study area for the WoE model range from about −25.40 to 34.67. The LSI values on the produced map were grouped into five classes (very low, low, moderate, high, and very high) using the natural break method. According to this model, 8.94 % of the area is exposed to very high susceptibility, and 17.14, 24.98, 25.87, and 23.07 % fall in the high, moderate, low, and very low classes, respectively.

Fig. 7 Landslide susceptibility map derived from the WoE model

Validation of the models used

Validation of landslide susceptibility models is an essential requirement to check the predictive capabilities of the landslide susceptibility maps produced (Chung and Fabbri 2003). The landslide susceptibility maps derived from the three models were tested by comparing existing landslide data with the landslide susceptibility analysis results for the study area. For this, the landslides observed in the study area were split into two groups: 56 (70 %) landslides were randomly selected from the total of 81 as the training data, and the remaining 25 (30 %) landslides were kept for validation purposes. In this study, the prediction capability of each landslide susceptibility model was estimated using the area under the curve (AUC) method. The rate curves were created, and their areas under the curve, used to assess the prediction accuracy, were calculated (Fig. 8). The rate explains how well the model and controlling factors predict the landslides. The model with the highest AUC is considered to be the best.
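One common way to build such a rate curve is to rank all cells from the highest to the lowest LSI, plot the cumulative share of landslides against the cumulative share of area, and integrate the curve. The text does not spell out the exact construction, so the sketch below is an assumption about the procedure rather than a description of it; the function name and the four-cell example are hypothetical, and ties in LSI are not treated specially.

```python
def rate_curve_auc(lsi_values, has_landslide):
    """Area under the cumulative landslide share vs. cumulative area share curve.

    lsi_values    -- susceptibility index of each cell
    has_landslide -- 1 if the cell contains a landslide from the chosen subset, else 0
    """
    n = len(lsi_values)
    total_landslides = sum(has_landslide)
    # Rank cells from most to least susceptible.
    order = sorted(range(n), key=lambda i: lsi_values[i], reverse=True)

    auc = 0.0
    previous_share = 0.0
    found = 0
    for i in order:
        found += has_landslide[i]
        share = found / total_landslides
        # Trapezoid over one cell-wide step (1 / n) of cumulative area.
        auc += (previous_share + share) / 2 * (1 / n)
        previous_share = share
    return auc


# Hypothetical four-cell map in which both landslides lie in the two highest-LSI cells.
good = rate_curve_auc([4.0, 3.0, 2.0, 1.0], [1, 1, 0, 0])
poor = rate_curve_auc([4.0, 3.0, 2.0, 1.0], [0, 0, 1, 1])
print(good, poor)
```

Feeding the training landslides into this calculation gives a success rate, and feeding the validation landslides gives a prediction rate. A map that concentrates landslides in its most susceptible cells scores higher than one that places them in its least susceptible cells.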

Fig. 8 AUC representing quality of the model

The success rate curve was obtained by comparing the landslide training data with the susceptibility maps (Fig. 8a). The AUC values were 0.8362, 0.8345, and 0.8251 for the FR, SI, and WoE models, corresponding to training accuracies of 83.62, 83.45, and 82.51 %, respectively. The prediction rate curve, obtained by comparing the landslide validation data with the susceptibility maps (Fig. 8b), gave AUC values of 0.7940, 0.7935, and 0.7853 for the FR, SI, and WoE models, corresponding to prediction accuracies of 79.40, 79.35, and 78.53 %, respectively. The success rate and prediction rate curves therefore rank the three models in the same order. All the models employed in this study showed reasonably high prediction accuracy and can be used for the spatial prediction of landslide hazard in the study area. Among them, the map produced by the FR model exhibited the best result for landslide susceptibility mapping in the study area.

Conclusions

Generally, landslides are unpredictable; however, the susceptibility to landslide occurrence can be assessed using different GIS-based methods. In this study, we used three statistical models, namely frequency ratio (FR), statistical index (SI), and weights-of-evidence (WoE), to produce landslide susceptibility maps for Qianyang County of Baoji City, China. Their performances were compared using the area under the curve (AUC) method. To generate landslide susceptibility maps in the study region, 13 landslide-conditioning factors were considered: slope angle, slope aspect, curvature, plan curvature, profile curvature, altitude, distance to faults, distance to rivers, distance to roads, STI, SPI, TWI, and lithology. Maps of these factors were derived using various GIS tools. The selection of these factors was based on the relevance, availability, and scale of the data available for the study area. In this process, a total of 81 landslides were identified and mapped, of which 56 (70 %) were randomly selected for generating the models and the remaining 25 (30 %) were used for validation purposes. Five landslide susceptibility classes, i.e., very low, low, moderate, high, and very high susceptibility to landsliding, were derived with the natural break method. The verification results showed that the landslide susceptibility map generated by the FR model has the highest prediction accuracy (79.40 %), followed by the SI model (79.35 %) and the WoE model (78.53 %). The success rate curves give a similar result, with the FR model having the highest AUC value (83.62 %), followed by the SI model (83.45 %) and the WoE model (82.51 %). This shows that the three models have been applied successfully to the production of landslide susceptibility maps.
The landslide susceptibility maps provide valuable information on slope stability in the study area, which may be used for infrastructure planning, land use, engineering, and hazard mitigation design. The same approach can also be applied elsewhere under similar landslide occurrence conditions.