Introduction

Landslides pose a serious threat to human lives and property (Kirschbaum et al. 2015; Hölbling et al. 2015). Landslide susceptibility assessment can identify landslide-prone zones and is therefore vital for landslide prevention (Eeckhaut et al. 2009a).

The accuracy of landslide susceptibility assessment is affected by the mapping unit, the prediction method, the type of landslides, the resolution of the data and other factors (Haacke et al. 2015; Kreuzer et al. 2017). Selecting a suitable mapping unit is vital for the subsequent analyses and modeling (Zhuang et al. 2016). Mapping units serve as the sampling units in landslide susceptibility zonation (Erener and Düzgün 2012; Rotigliano et al. 2012). Once the mapping units are determined, the value of every landslide influence factor can be assigned to each unit.

The popular mapping units in landslide susceptibility assessment include grid cells, unique-condition units and slope units, among others (Meijerink 1988; Chung and Fabbri 1995; Erener and Düzgün 2012). Grid cells are regular square cells of a given size (Cama et al. 2016). They can be stored in matrix form, which is convenient for calculation. However, grid cells are not closely related to geological environments (Guzzetti et al. 1999). Unique-condition units are obtained by overlaying the classified maps of different landslide influence factors, so that every unit is determined by a combination of properties (Chiessi et al. 2016). The size of the units depends on the number of influence factors, while the total number of units depends on the classification criteria of the influence factors. A disadvantage of unique-condition units is that these classification criteria are subjective (Carrara and Guzzetti 1995). Slope units are watershed areas bounded by drainage lines (valley lines) and water divide lines (ridge lines), and they are the basic topographical units of geological hazard occurrence (Wang et al. 2017). Because they are based on drainage and divide lines, slope units are more closely related to the geological environment (Guzzetti et al. 1999). Although identifying sub-basin boundaries is difficult, the hydrological tools of geographical information systems (GIS) can solve this problem (Erener and Düzgün 2012). This paper selects grid cells and slope units as mapping units for landslide susceptibility assessment.

A variety of models have been used for landslide susceptibility zonation (Parise and Jibson 2000; Lee 2004; Yesilnacar and Topal 2005; Fell et al. 2008), for instance the artificial neural network model (Zare et al. 2013; Garcíarodríguez and Malpica 2010), the support vector machine (Pradhan 2013; Tham 2008), the logistic regression model (Raja et al. 2017; Pradhan 2010) and the analytic hierarchy process (Komac 2006; Myronidis et al. 2016). The information value model is also a popular method (Che et al. 2012; Sharma et al. 2015). However, the information value model does not take into account the different importance of the landslide influence factors and simply assigns an equal weight to every factor. In this study, we utilize a modified information value model proposed by Ba et al. (2017) for landslide susceptibility mapping. The model obtains the relative weights of the different classes of every landslide influence factor by calculating information values, and it determines the landslide susceptibility rank of each mapping unit using gray clustering analysis (Ba et al. 2017). Eight landslide influence factors are utilized in the model for evaluating landslide susceptibility.

Finally, the grid cell-based and slope unit-based landslide susceptibility assessment results are compared using the receiver operating characteristics curve.

Methodology

Mapping units

A mapping unit is the smallest meaningful spatial unit, obtained by subdividing the land surface into homogeneous areas. This paper selects grid cells and slope units as mapping units to assess landslide susceptibility.

Grid cells

The grid cell is a popular mapping unit for susceptibility assessment since it is easy to obtain. Grid cells are generated by dividing the area into regular squares of a given size. Their matrix form is convenient for data processing and calculation. However, grid cells are not associated with geological environments, which is an important shortcoming of this mapping unit (Chiessi et al. 2016). Selecting an appropriate grid cell size for susceptibility mapping is vital. Trigila et al. (2015) used the most frequent landslide area to determine the correct size of the grid cell. The most frequent landslide area can be obtained from frequency–area statistics. Moreover, to determine the correct grid cell size, the slope gradients obtained from digital elevation models (DEMs) of different resolutions should also be compared.

Slope units

Slope units are generated according to hydrological theory. A slope unit is regarded as the watershed defined by ridge lines and valley lines (Xie et al. 2004; Jia et al. 2015) (Fig. 1). These lines can be extracted with the ArcGIS hydrology tools to generate slope units. Slope units are closely related to actual geological environments, whereas the lack of such a relation is a main shortcoming of grid cells as mapping units (Chiessi et al. 2016).

Fig. 1 The generation of slope units

The slope units are generated through the following steps (Fig. 1). First, the Reverse DEM is generated by subtracting the elevation value of each cell from the highest elevation value (Xie et al. 2004). Second, the DEM and Reverse DEM are filled to remove small imperfections in the data, using the Fill tool of ArcGIS. Third, the flow direction is calculated by the eight-direction method, using the Flow Direction tool of ArcGIS. Fourth, the flow accumulation is obtained from the flow direction data, using the Flow Accumulation tool of ArcGIS. Fifth, the stream network is obtained by selecting the cells whose flow accumulation exceeds a certain threshold; generally, 1% of the maximum flow accumulation is defined as the threshold (Erener and Düzgün 2012). Sixth, the watersheds of the DEM and the Reverse DEM are obtained using the Watershed tool of ArcGIS (Erener and Düzgün 2012). Finally, the watershed raster maps from the DEM and the Reverse DEM are converted to vector format, and the resulting watershed polygons are dissolved. After amending unreasonable polygons, the slope units are generated.
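Two of these steps are simple raster operations and can be illustrated outside ArcGIS. The following is a minimal sketch, assuming the DEM and the flow accumulation are available as NumPy arrays; the function names `reverse_dem` and `stream_mask` and the toy values are hypothetical, and the flow accumulation array stands in for the output of the Flow Accumulation tool rather than being computed here.

```python
import numpy as np

def reverse_dem(dem):
    """Reverse DEM: highest elevation of the region minus each cell's elevation."""
    dem = np.asarray(dem, dtype=float)
    return np.nanmax(dem) - dem

def stream_mask(flow_acc, fraction=0.01):
    """Mark cells whose flow accumulation exceeds a fraction of the maximum."""
    flow_acc = np.asarray(flow_acc, dtype=float)
    return flow_acc > fraction * np.nanmax(flow_acc)

# Toy 3 x 3 DEM (metres); values are illustrative only.
dem = np.array([[120.0, 110.0, 100.0],
                [115.0, 105.0,  95.0],
                [112.0, 102.0,  90.0]])
rdem = reverse_dem(dem)

# Hypothetical flow accumulation, as a flow accumulation tool would return it.
flow_acc = np.array([[0, 1,   2],
                     [0, 3, 150],
                     [0, 1, 400]])
streams = stream_mask(flow_acc)  # threshold = 1% of 400 = 4 cells
```

In the Reverse DEM, ridges become valleys, so running the same hydrological steps on it extracts the ridge lines that bound the slope units.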

For the continuous landslide influence factors, including slope gradient, aspect, NDVI and mean annual precipitation (MAP), the mean value of the factor within every slope unit was assigned to that unit. For the categorical landslide influence factors, including distance to tectonic features, lithology, distance to stream network and distance to roads, the predominant value of the factor within every slope unit was assigned to the corresponding unit.
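The assignment of factor values to slope units is a zonal statistic. A minimal sketch is given below, assuming that the slope units are rasterized to an integer ID grid aligned with the factor rasters; the function names and the toy arrays are hypothetical and are not taken from the original workflow.

```python
import numpy as np

def zonal_mean(values, zones):
    """Mean of a continuous factor within each slope unit (zone ID)."""
    values = np.asarray(values, dtype=float).ravel()
    zones = np.asarray(zones).ravel()
    ids, inverse = np.unique(zones, return_inverse=True)
    sums = np.bincount(inverse, weights=values)
    counts = np.bincount(inverse)
    return dict(zip(ids.tolist(), (sums / counts).tolist()))

def zonal_majority(classes, zones):
    """Predominant (most frequent) class of a categorical factor per slope unit."""
    classes = np.asarray(classes).ravel()
    zones = np.asarray(zones).ravel()
    result = {}
    for zone_id in np.unique(zones):
        labels, counts = np.unique(classes[zones == zone_id], return_counts=True)
        result[int(zone_id)] = int(labels[np.argmax(counts)])
    return result

# Toy rasters: two slope units (IDs 1 and 2); values are illustrative only.
zones = np.array([[1, 1, 2],
                  [1, 2, 2]])
slope = np.array([[10.0, 20.0, 30.0],
                  [30.0, 40.0, 50.0]])
lithology = np.array([[3, 3, 1],
                      [2, 1, 1]])

mean_slope = zonal_mean(slope, zones)
main_lithology = zonal_majority(lithology, zones)
```

When two classes are equally frequent in a unit, this sketch keeps the smaller class code; a real workflow would need an explicit tie rule.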

Modified information value model

The modified information value model combines the information value model with gray clustering. The method uses information values to calculate the weight of every landslide influence factor and uses gray clustering to determine the landslide susceptibility rank of each mapping unit. The procedure is as follows.

  1. Step 1

    The information values were calculated to represent the likelihood of landslide occurrence. A larger information value indicates a higher likelihood of landsliding. Each landslide influence factor was classified into five classes using Jenks natural breaks optimization, except for aspect, which was classified into nine classes (Chen et al. 2013; Xu et al. 2013). Jenks natural breaks optimization is a commonly used data classification method that minimizes the variance within every class and maximizes the variance between classes. After classifying the landslide influence factors using the ArcGIS Reclassify tools, the information value of each class of each landslide influence factor can be calculated using Eq. (1) (Yan 1988; Yin and Yan 1988; Chen et al. 2016).

$$ I\left(i,j\right)=\ln \frac{A_{ij}/A}{S_{ij}/S} $$
(1)

where I(i, j) represents the information value of the jth class of the ith landslide influence factor; i (i = 1, 2, …, n) indicates the ith landslide influence factor; j (j = 1, 2, …, m) represents the jth class of the landslide influence factor; S indicates the total area of all mapping units; A indicates the total area of all landslide events; Sij indicates the total area of the mapping units that fall in the jth class of the ith landslide influence factor; Aij indicates the total area of the landslide events that fall in the jth class of the ith landslide influence factor.
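As an illustration of Eq. (1), the sketch below computes information values for the classes of a single factor. The class labels and areas are hypothetical and chosen only to make the arithmetic easy to follow; the function returns `None` when no landslide area falls in a class, since the logarithm is undefined there and the handling of that case is left to the analyst.

```python
import math

def information_value(a_ij, a_total, s_ij, s_total):
    """Eq. (1): ln((A_ij / A) / (S_ij / S)); None if the class has no landslide area."""
    if a_ij <= 0:
        return None  # ln(0) is undefined
    return math.log((a_ij / a_total) / (s_ij / s_total))

# Hypothetical slope-gradient classes: total class area S_ij and landslide area A_ij.
class_area = {"<10": 400.0, "10-20": 300.0, "20-35": 200.0, ">35": 100.0}
landslide_area = {"<10": 20.0, "10-20": 50.0, "20-35": 25.0, ">35": 5.0}

S = sum(class_area.values())       # total area of all mapping units
A = sum(landslide_area.values())   # total area of all landslide events

iv = {c: information_value(landslide_area[c], A, class_area[c], S)
      for c in class_area}
```

A class holding 50% of the landslide area but only 30% of the total area obtains a positive value, ln(0.5/0.3), while a class holding 20% of the landslide area in 40% of the total area obtains a negative value, ln(0.5).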

  1. Step 2

    The value of the ith (i = 1, 2, …, n) influence factor in the pth (p = 1, 2, …, t) mapping unit, denoted xpi, should be standardized with the min-max normalization method to remove the effect of dimension (Wei and Feng 2004; Xie et al. 2014). Here xM and xm indicate the maximum and the minimum of xpi, and ypi is the standardized xpi:

$$ {y}_{pi}=\frac{x_{pi}-{x}_m}{x_M-{x}_m} $$
(2)
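Eq. (2) can be sketched as follows, assuming that the factor values are arranged in a table with one row per mapping unit and one column per influence factor, and that the maximum and minimum are taken per factor; this layout and the function name are assumptions made for illustration.

```python
import numpy as np

def min_max_normalize(x):
    """Eq. (2) applied column-wise: rows are mapping units, columns are factors."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    # Guard against constant columns, which would otherwise divide by zero.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (x - x_min) / span

# Three hypothetical mapping units with two factors (e.g. slope in degrees, NDVI).
x = np.array([[10.0, 0.2],
              [20.0, 0.8],
              [30.0, 0.5]])
y = min_max_normalize(x)
```

After normalization every factor lies in [0, 1], so factors measured in different units can be compared within the gray clustering steps.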
  1. Step 3

    According to the information value of the jth(j = 1,2, …, m) class of the ith(i = 1,2, …, n) landslide influence factor, the landslide susceptibility of the ith landslide influence factor was classified into five ranks, including very low, low, medium, high and very high susceptibility. The class of a landslide influence factor with a larger information value should be assigned to a higher rank of landslide susceptibility.

  1. Step 4

    Then the clustering weight ηi (i = 1, 2, …, n), representing the effect of the ith landslide influence factor on landsliding, can be determined as follows:

$$ {\eta}_i=\frac{\lambda_i}{\sum_{i=1}^n{\lambda}_i} $$
(3)

where λi indicates the sum of the positive information values of the ith factor, and \( \sum \limits_{i=1}^n{\lambda}_i \) represents the sum of the positive information values of all factors (Ba et al. 2017).
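A minimal sketch of Eq. (3) is given below. The information values are hypothetical; only the rule that each factor contributes the sum of its positive information values is taken from the text.

```python
def clustering_weights(info_values):
    """Eq. (3): eta_i = lambda_i / sum(lambda), lambda_i = sum of positive values."""
    lam = {factor: sum(v for v in values if v > 0)
           for factor, values in info_values.items()}
    total = sum(lam.values())
    return {factor: value / total for factor, value in lam.items()}

# Hypothetical information values of the classes of three factors.
info_values = {
    "slope": [0.5, 0.1, -0.3],
    "lithology": [0.9, -1.2],
    "NDVI": [0.5, -0.2],
}
eta = clustering_weights(info_values)
```

Here the positive sums are 0.6, 0.9 and 0.5, so the weights are 0.30, 0.45 and 0.25 and add up to one.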

  1. Step 5

    Then the whitenization weight functions of every landslide influence factor can be derived. They describe the degree to which every mapping unit belongs to a landslide susceptibility rank. The gray whitenization weight function of the ith influence factor for the kth rank, expressed as \( {f}_i^k\left(\cdotp \right)\left(i=1,2,\dots, n;k=1,2,\dots, s\right) \), can be determined through the following equations (Li et al. 2014; Xie et al. 2014).

  • ➀ The lower whitenization weight function (Fig. 2a) \( \left[-,-,{y}_i^k(3),{y}_i^k(4)\right] \)

Fig. 2 Whitenization weight functions

$$ {f}_i^k(y)=\left\{\begin{array}{ll}0, & y\notin \left[0,{y}_i^k(4)\right]\\ {}1, & y\in \left[0,{y}_i^k(3)\right]\\ {}\frac{y_i^k(4)-y}{y_i^k(4)-{y}_i^k(3)}, & y\in \left[{y}_i^k(3),{y}_i^k(4)\right]\end{array}\right. $$
(4)
  • ➁ The moderate whitenization weight function (Fig. 2b) \( \left[{y}_i^k(1),{y}_i^k(2),-,{y}_i^k(4)\right] \)

$$ {f}_i^k(y)=\left\{\begin{array}{ll}0, & y\notin \left[{y}_i^k(1),{y}_i^k(4)\right]\\ {}\frac{y-{y}_i^k(1)}{y_i^k(2)-{y}_i^k(1)}, & y\in \left[{y}_i^k(1),{y}_i^k(2)\right]\\ {}\frac{y_i^k(4)-y}{y_i^k(4)-{y}_i^k(2)}, & y\in \left[{y}_i^k(2),{y}_i^k(4)\right]\end{array}\right. $$
(5)
  • ➂ The upper whitenization weight function (Fig. 2c) \( \left[{y}_i^k(1),{y}_i^k(2),-,-\right] \)

$$ {f}_i^k(y)=\left\{\begin{array}{ll}0, & y<{y}_i^k(1)\\ {}\frac{y-{y}_i^k(1)}{y_i^k(2)-{y}_i^k(1)}, & y\in \left[{y}_i^k(1),{y}_i^k(2)\right]\\ {}1, & y\ge {y}_i^k(2)\end{array}\right. $$
(6)
  1. Step 6

    The clustering coefficient of every mapping unit in every susceptibility rank can be determined by Eq. (7):

$$ {\sigma}_p^k=\sum \limits_{i=1}^n{f}_i^k\left({y}_{pi}\right)\cdot {\eta}_i,\left(i=1,2,\dots, n;k=1,2,\dots, s\right) $$
(7)

where \( {\sigma}_p^k \) indicates the clustering coefficient of the pth unit in the kth rank.

  1. Step 7

    Finally, for every mapping unit, the maximum of \( {\sigma}_p^k\left(k=1,2,\dots, s\right) \) is identified, and the value of k at which this maximum occurs is denoted \( {k}^{\ast } \). According to the maximum membership principle, the pth mapping unit belongs to the susceptibility rank \( {k}^{\ast } \).
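Steps 5 to 7 can be sketched for a single mapping unit as follows. This is an illustration under stated assumptions rather than the implementation used in this study: the turning points, the weights and the three-rank setup are hypothetical, the same functions are reused for both factors for brevity, and the upper function is taken to equal 1 beyond its second turning point, as in the standard gray clustering formulation.

```python
import numpy as np

def lower_wf(y, y3, y4):
    """Lower whitenization weight function [-, -, y(3), y(4)]."""
    if y < 0 or y > y4:
        return 0.0
    if y <= y3:
        return 1.0
    return (y4 - y) / (y4 - y3)

def moderate_wf(y, y1, y2, y4):
    """Moderate whitenization weight function [y(1), y(2), -, y(4)]."""
    if y < y1 or y > y4:
        return 0.0
    if y <= y2:
        return (y - y1) / (y2 - y1)
    return (y4 - y) / (y4 - y2)

def upper_wf(y, y1, y2):
    """Upper whitenization weight function [y(1), y(2), -, -]."""
    if y < y1:
        return 0.0
    if y < y2:
        return (y - y1) / (y2 - y1)
    return 1.0

def clustering_coefficients(y_unit, weights, functions):
    """Eq. (7): sigma^k = sum_i f_i^k(y_i) * eta_i, with functions[k][i] = f_i^k."""
    return np.array([
        sum(f(y) * w for f, y, w in zip(rank_functions, y_unit, weights))
        for rank_functions in functions
    ])

# Hypothetical turning points for three ranks (0 = low, 1 = medium, 2 = high).
low = lambda y: lower_wf(y, 0.2, 0.5)
medium = lambda y: moderate_wf(y, 0.2, 0.5, 0.8)
high = lambda y: upper_wf(y, 0.5, 0.8)
functions = [[low, low], [medium, medium], [high, high]]

weights = [0.7, 0.3]   # hypothetical clustering weights of two factors
unit_a = [0.9, 0.6]    # standardized factor values of one mapping unit
unit_b = [0.1, 0.3]

sigma_a = clustering_coefficients(unit_a, weights, functions)
sigma_b = clustering_coefficients(unit_b, weights, functions)
rank_a = int(np.argmax(sigma_a))  # maximum membership principle
rank_b = int(np.argmax(sigma_b))
```

For `unit_a` the coefficients are 0, 0.2 and 0.8, so the unit falls in the highest rank; for `unit_b` they are 0.9, 0.1 and 0, so the unit falls in the lowest rank.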

Receiver operating characteristics curve

The receiver operating characteristics curve (ROC curve) is one of the most commonly used validation methods and can be used to validate landslide susceptibility assessment results (Conforti et al. 2013; Günther et al. 2014). The horizontal axis (1 − specificity) indicates the proportion of the mapping units without landslide occurrence which were incorrectly predicted as landslide units. The vertical axis (sensitivity) indicates the proportion of the mapping units with landslide occurrence which were correctly predicted (Wang et al. 2015). The area under the ROC curve (AUC) quantitatively measures the prediction results (Chalkias et al. 2014). For a model that performs no worse than random guessing, the AUC value ranges from 0.5 to 1. A larger AUC value means higher model accuracy (Lee and Park 2016).
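The construction of the ROC curve and the AUC can be sketched as follows, assuming that every mapping unit has a binary landslide label and an ordinal susceptibility score such as its rank. The labels and scores below are hypothetical, and the trapezoidal rule is one common way to integrate the curve.

```python
import numpy as np

def roc_curve_points(labels, scores):
    """Return (1 - specificity, sensitivity) for every score threshold."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    fpr, tpr = [0.0], [0.0]
    for threshold in np.unique(scores)[::-1]:      # from strictest to loosest
        predicted = scores >= threshold
        tpr.append((predicted & labels).sum() / labels.sum())
        fpr.append((predicted & ~labels).sum() / (~labels).sum())
    return np.array(fpr), np.array(tpr)

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Six hypothetical mapping units: 1 = landslide present, 0 = absent.
labels = [1, 1, 0, 1, 0, 0]
scores = [5, 4, 4, 3, 2, 1]    # e.g. susceptibility rank, higher = more prone

fpr, tpr = roc_curve_points(labels, scores)
area = auc(fpr, tpr)
```

With these values the area equals 5/6, which matches the share of landslide and non-landslide pairs that the scores order correctly, counting ties as one half.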

Study region and dataset

General situations of study region

Chongqing, the study region of this paper, is situated in southwestern China between 105°11′E-110°11′E and 28°10′N-32°13′N. It is located in the transition zone between the Middle-Lower Yangtze plains and the Qinghai-Tibet Plateau and occupies approximately 82,402 km². The region has an east-west width of 470 km and a north-south length of 450 km. The highest altitude is 2797 m. The climate is subtropical monsoonal, with abundant rainfall concentrated between late spring and early fall; the mean annual precipitation reaches 1000–1400 mm. The region has distinct topographical relief and a series of tectonic folds and faults. The terrain tilts from the north and south toward the Yangtze valley. Daba Mountain and Wuling Mountain are situated in the southeast. Chongqing is a mountainous region, with mountains and hills taking up 76% of the total area. The exposed strata in Chongqing form a nearly complete sequence with large thickness and wide distribution; only the upper Tertiary system is absent. The exposed strata are mainly Middle and Lower Jurassic sandstone, mudstone and shale, together with loose Quaternary deposits. The Jurassic strata, dominated by red clastic rocks, are widely distributed. The lower Paleozoic erathem, dominated by carbonate rocks, is mainly distributed in the southeast of Chongqing. Typical karst landforms such as stone forests, karst caves and karst gorges are found in this area. The Yangtze River and the Jialing River flow through the region. Recently, intensified engineering construction, such as the development of the Three Gorges Reservoir, has also contributed to landslide occurrences.

Landslide inventory data

In this paper, landslide events before 2014 were collected from the Chongqing Institute of Geology & Mineral Resources (CIGMR). As shown in Fig. 3, 8435 landslide events were recorded as point features with landslide area as an attribute. Together, the landslide events affected a total area of 194,442,814 m². The largest and the smallest landslide areas were 3,080,000 m² and 3 m², respectively. The most frequent landslide area was 12,000 m², obtained with the Frequency tool of ArcGIS. The main types of landslide events were debris flows, translational slides and rotational slides. A major triggering factor of these landslide events was rainfall. Xiao (1995) pointed out that the precipitation threshold for landslide occurrence was 150 mm/day in Chongqing. Other factors such as earthquakes, human activities and groundwater activity also contributed to landslide occurrence.

Fig. 3 The location of Chongqing and the distribution of landslide events. Points indicate the landslide events used in this paper

Landslide influence factors

A variety of factors contribute to landslide occurrence. This paper selects eight landslide influence factors for landslide susceptibility assessment: slope, aspect, rainfall, the distance to tectonic features, lithology, the distance to roads, the distance to rivers and vegetation.

Slope gradient is closely related to landsliding. In theory, landslides happen easily on steep slopes (Shit et al. 2016). However, Eeckhaut et al. (2009b) indicate that the likelihood of landslide occurrence is larger on moderate slope gradients, since steep slopes lack the material basis for landslide occurrence. Aspect indirectly affects landslide occurrence through its effect on soil, rock, water and vegetation (Pourghasemi et al. 2012). Rainfall is a main contributor to landslides, since it increases the weight of slopes and decreases the shear strength of the sliding layer. The mean annual precipitation is utilized to represent rainfall. Tectonic conditions are also related to landslides. Landslides are apt to happen near tectonic features, because tectonic activity causes fractures to develop and breaks up the rock (Su et al. 2010). Thus the distance to tectonic features is selected as an influence factor. Landslide occurrence also has a strong connection with lithology (Kouli et al. 2009). Different lithologies or rock types have different composition and structure. Compared with weaker rocks, stronger rocks give more resistance to the driving force and hence are less prone to landslides (Kanungo et al. 2006). Road construction easily contributes to slope instability, so landslides tend to happen near the road network. The distance to roads is therefore utilized in the calculations. Streams negatively affect slope stability by eroding the slopes and carrying away the material at their base (Bhatt et al. 2013). Therefore, the distance to the stream network is selected as the corresponding index. Vegetation cover is also related to landslide occurrence, since it can reduce the influence of rainfall by holding the soil (Lundgren 1978). This paper thus utilizes the NDVI (normalized difference vegetation index) in the calculations. Generally speaking, a smaller NDVI value indicates a larger likelihood of landslide occurrence (Wang et al. 2014).

The elevation, slope and aspect were derived from the stereoscopic data collected from ASTER GDEM with a resolution of 30 m (Fig. 4). From the elevation data, the stream network was obtained by the use of ArcGIS hydrology tools. The tectonic features and lithologies were obtained through digitizing the Chongqing geological map with a scale of 1:500,000. The roads in this region were extracted from the national electronic map with a scale of 1:25,000 in vector format. CIGMR also provided daily precipitation from 2005 to 2014 and the geographical coordinates of 1003 rainfall observation stations. The mean annual precipitation of every rainfall observation station was extracted by dividing the sum of the daily precipitations from 2005 to 2014 by the number of years. Then the mean annual precipitation of the whole region can be obtained through IDW (Inverse Distance Weighted) interpolation. The NDVI raster map was obtained from Landsat images with a resolution of 30 m. The derived data are shown in Fig. 5.
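The precipitation processing can be sketched as follows. The station coordinates and values are hypothetical, and the distance power of 2 is an assumed, commonly used setting; the text does not state which IDW parameters were applied.

```python
import numpy as np

def mean_annual_precipitation(daily_mm, n_years):
    """Sum of the daily precipitation divided by the number of years."""
    return float(np.sum(daily_mm)) / n_years

def idw(station_xy, station_values, target_xy, power=2.0):
    """Inverse distance weighted interpolation at the target points."""
    station_xy = np.asarray(station_xy, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    # Distance matrix with shape (number of targets, number of stations).
    diff = target_xy[:, None, :] - station_xy[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    dist = np.maximum(dist, 1e-12)   # avoid division by zero at a station
    weights = 1.0 / dist ** power
    return (weights * station_values).sum(axis=1) / weights.sum(axis=1)

# Two hypothetical stations (coordinates in km) with their MAP in mm/year.
stations = [(0.0, 0.0), (10.0, 0.0)]
station_map = [1000.0, 1200.0]
targets = [(5.0, 0.0), (0.0, 0.0)]

map_at_targets = idw(stations, station_map, targets)
two_year_map = mean_annual_precipitation([3.0] * 730, n_years=2)
```

A target midway between the two stations receives the average of their values, while a target located at a station reproduces that station's value.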

Fig. 4 The digital elevation model (DEM) of Chongqing

Fig. 5 Landslide influence factor dataset

Results and discussion

The landslide events were randomly split into two parts: 70% (5905) of the landslide events were used as model training data and the remaining 30% (2530) as validation data. Using both grid cells and slope units as the mapping units, the modified information value model was constructed to generate landslide susceptibility maps. The assessment results obtained with the two mapping units were then compared.
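A random 70/30 split of this kind can be sketched as follows. The seed and the rounding rule are assumptions; rounding the training count up is chosen here only because it reproduces the reported counts of 5905 and 2530 for 8435 events.

```python
import math
import numpy as np

def split_events(n_events, train_fraction=0.7, seed=0):
    """Randomly split event indices into training and validation subsets."""
    rng = np.random.default_rng(seed)        # fixed seed for reproducibility
    order = rng.permutation(n_events)
    n_train = math.ceil(n_events * train_fraction)
    return order[:n_train], order[n_train:]

train_idx, valid_idx = split_events(8435)
```

The two index arrays are disjoint and together cover every event, so no landslide is used for both training and validation.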

Grid cell-based susceptibility assessment

The most frequent landslide area is 12,000 m², obtained from the frequency-area statistics. The 10 × 10 m DEM maintains these morphological elements. For these two reasons, the 10 × 10 m grid cell is the most appropriate for the study area. There are altogether 825,682,626 grid cells in this region. The value of every landslide influence factor was assigned to every grid cell. Then the information values of the landslide influence factors were obtained by Eq. (1) (Table 1).

Table 1 Information values of landslide influence factors based on grid cells

According to the information values of slope gradient, landslides frequently happened in the range of 10°-35°. Landslides were most likely to occur between 10° and 20°, which had the maximum information value (0.494). With respect to aspect, landslides happened more easily on northwest-facing slopes, as the maximum information value (0.387) was found in this class. Landslides were less likely to occur in flat areas, where the minimum information value (−0.226) was found. As to the distance to stream network, the maximum information value was 0.441 in the range of <1000 m. Thus landslides tend to occur in areas less than 1000 m from the stream network. Information values decreased as the distance to stream network increased. Therefore, the smaller the distance, the higher the likelihood of landsliding. As for the distance to tectonic features, information values were positive in the range of 0–1800 m and negative in the other classes. Hence the possibility of landslide occurrence was higher in the interval of <1800 m than in any other class. On the whole, the likelihood of landsliding increased as the distance to tectonic features decreased. With regard to rainfall, landslides were most likely to occur in the range of 1100–1200 mm/year, as this range had the largest information value (0.247). The information value in the interval of 1200–1250 mm/year was 0.216, second only to the class 1100–1200 mm/year. It is generally thought that landslides should be most likely to occur in the area with the highest precipitation, but the results were inconsistent with this expectation. This may be because sudden rainstorms also contributed to landsliding (Rampone and Valente 2012). With regard to the distance to roads, the maximum information value (0.619) was found in the range of 0–200 m. Thus landslides happened more easily around roads. As to NDVI, the maximum (0.663) and minimum (−0.781) information values were found in the <0.55 class and in the >0.85 class, respectively. The larger the NDVI, the smaller the likelihood of landsliding. As for lithology, the results indicated that the areas with clastic rocks had the highest information value (0.955), and the areas with shales had the smallest (−3.351). Therefore, landslides occurred predominantly in areas of weaker rocks, because these rocks are easily saturated and then soften quickly, resulting in slope failures.

According to Table 1, a smaller information value represents a lower susceptibility rank. The landslide susceptibility ranks for each factor based on grid cells are shown in Table 2. The classes with the largest information values were in the very high rank: slope gradient between 10° and 20°, distance to tectonic features smaller than 600 m, areas with clastic rocks, reservoirs and sandstones, distance to roads smaller than 200 m, MAP between 1100 and 1200 mm, distance to stream network smaller than 1000 m, northwest aspect and NDVI smaller than 0.5. According to the susceptibility ranks, the whitenization weight functions of the influence factors were generated.

Table 2 The determination of susceptibility ranks for influence factors based on grid cells

Clustering weights reflect the effect of the landslide influence factors on landsliding. The clustering weight of lithology was the largest (0.250), while the clustering weight of distance to tectonic features was the smallest (0.046). The clustering weight of NDVI (0.181) was the second largest, while the clustering weight of aspect (0.079) was the second smallest. The weights of slope, MAP, distance to stream network and distance to roads were 0.081, 0.081, 0.113 and 0.170, respectively. The clustering coefficient of every grid cell for every landslide susceptibility rank was then calculated, and the clustering vector of every grid cell was constructed. Subsequently, the susceptibility rank of each grid cell was determined, and the grid cell-based susceptibility map was created (Fig. 6).

Fig. 6 Grid cell-based landslide susceptibility zonation

As shown in Fig. 6, high and very high susceptibility zones were mainly distributed along tectonic features, roads and the stream network. Low susceptibility zones were mainly located in the west and north, far away from tectonic features, roads and the stream network. High and very high susceptibility zones covered 9.28% and 10.33% of the study area, respectively.

Slope unit-based susceptibility assessment

Using the DEM data and the largest elevation value of this region, the Reverse DEM was generated. The DEM and Reverse DEM were then used to generate slope units through the following steps: fill, extraction of flow direction, calculation of flow accumulation, generation of the stream network, generation of watersheds, and combination of the watersheds derived from the DEM and the Reverse DEM. There were altogether 34,453 slope units (Fig. 7).

Fig. 7 The generated slope units

Using the same classifications of the landslide influence factors as in the grid cell-based experiment, the information values of each landslide influence factor based on slope units were calculated (Table 3). In this table, the total area of the landslide events in the NW aspect class was 0, so the logarithm in Eq. (1) is undefined for this class; the class was therefore assigned the smallest information value of aspect. As a result, the information value of the northwest aspect class was equal to the information value of the flat class.

Table 3 Information values of landslide influence factors based on slope units

Table 3 shows that landslides happened easily in the slope gradient range of 10°–20°, which has the maximum information value (0.485). The information value between 10° and 20° was second largest (0.042). Therefore, landslides happened extensively in regions with medium slope gradient. As to aspect, the maximum (0.337) was found in the southwest aspect, and the minimum (−0.381) was found in the flat regions and the northwest aspect. Consequently, landslides were more likely to occur on southwest-facing slopes and less likely to occur in flat areas and on northwest-facing slopes. With regard to the distance to stream network, information values decreased as the distance increased. Thus the smaller the distance to stream network, the higher the likelihood of landsliding. With regard to distance to tectonic features, landslides were prone to occur in the interval of <600 m, which had the largest information value (0.523). The likelihood of landsliding increased as the buffer distance to tectonic features decreased. As for MAP, the maximum information value (0.296) was found in the range of 1100–1200 mm/year, thus landslides were most likely to occur in this range. The likelihood of landslide occurrence in the interval of 1200–1250 mm/year was the second largest. With regard to the distance to roads, the maximum information value (0.360) was found in the areas with a buffer distance to roads of less than 200 m. Information values decreased as the distance to roads increased. Hence landslides happened more easily around roads. As for NDVI, the larger the NDVI value, the smaller the information value. Landslides were most likely to occur when the NDVI value was less than 0.5. With respect to lithology, the information value in the areas with clastic rocks was the largest (1.047), followed by the areas with sandstones (0.131). Therefore, landslides occurred predominantly in areas of weaker rocks, because these rocks are easily saturated and then soften quickly, resulting in slope failures.

Tables 1 and 3 show that the information values based on slope units are similar to the information values based on grid cells, except for aspect. The maximum information value of each influence factor was found in the same class for the grid cell-based and slope unit-based models, including slope gradient 10°–20°, distance to tectonic features <600 m, the areas with sandstones and clastic rocks, distance to roads <200 m, MAP 1100–1200 mm, distance to stream network <1000 m and NDVI <0.55. The information values of aspect based on grid cells differ from the results based on slope units. The maximum information value of aspect is in the northwest aspect for the grid cell-based model, while it is found in the southwest and west aspects for the slope unit-based model. This difference may be principally because there are some grid cells whose aspects differ greatly from the aspects of neighboring cells.

Using the information values in Table 3, the susceptibility rank of every landslide influence factor based on slope units was generated (Table 4). The susceptibility rank increased with the information value. Hence slope 10–20°, the southwest aspect, distance to stream network <1000 m, the areas with sandstones and clastic rocks, distance to tectonic features <600 m, MAP 1100–1200 mm, distance to roads <200 m and NDVI smaller than 0.5 were in the very high susceptibility rank. Then the whitenization weight function of every landslide influence factor was determined.

Table 4 The determination of susceptibility ranks for influence factors based on slope units.

The clustering weights based on slope units, reflecting the effect of the landslide influence factors on landsliding, were subsequently determined. Lithology had the highest clustering weight (0.189), while NDVI had the lowest (0.079). The weight of slope was the second smallest (0.083). The weight of distance to stream network was the second largest (0.174). The weights of slope, aspect, the distance to tectonic features and the distance to roads were 0.083, 0.149, 0.111 and 0.130, respectively. Then the clustering coefficients were obtained according to Eq. (7). According to the maximum membership principle, the maximum clustering coefficient within every clustering vector was identified, and the susceptibility rank of every unit was then determined (Fig. 8).

Fig. 8 Slope unit-based landslide susceptibility zonation

As indicated in Fig. 8, very low and low susceptibility zones were mainly located in the west and north. Very high and high susceptibility zones were situated along tectonic features, rivers and roads. On the whole, the grid cell-based susceptibility zonation was similar to the slope unit-based susceptibility zonation. High and very high susceptibility zones occupied 19.30% and 3.18%, respectively.

Validation

This paper applies the ROC curve to validate the susceptibility assessment results based on grid cells and slope units. The model training accuracy and prediction accuracy were measured by the success rate and the prediction rate, respectively. The success rate (Fig. 9a) is derived by comparing the 70% of landslide events used as training data with the susceptibility zonation results. The AUC values of the grid cell-based and slope unit-based results were 0.809 and 0.832, respectively. Therefore, the model training accuracies of the grid cell-based and slope unit-based results were 80.9% and 83.2%, respectively. The prediction rate (Fig. 9b) is derived by comparing the remaining landslide events with the susceptibility zonation results. The AUC values of the grid cell-based and slope unit-based results were 0.803 and 0.826, respectively. Therefore, the prediction accuracies of the grid cell-based and slope unit-based results were 80.3% and 82.6%, respectively. As a result, the slope unit-based model outperformed the grid cell-based model in landslide susceptibility assessment, with higher training accuracy and prediction accuracy. Grid cells can be easily obtained in GIS but do not have a close relationship with geological environments. In contrast, slope units are the basic units of landslide occurrence (Wang et al. 2017). A slope unit is defined as the watershed delimited by ridge lines and valley lines. Therefore, slope units are more closely related to geological environments, which makes the evaluation results conform better to reality (Wang et al. 2017).

Fig. 9 ROC curves for grid cell-based and slope unit-based susceptibility assessment results

Conclusions

This paper analyzed the influence of using different mapping units in a landslide susceptibility assessment model. The modified information value model was adopted to assess landslide susceptibility, with slope units and grid cells each used as mapping units. Eight landslide influence factors, including slope gradient, aspect, MAP, distance to roads, distance to stream network, distance to tectonic features, lithology and NDVI, were utilized to construct the model.

The landslide susceptibility assessment results indicated that landslide-prone zones were mainly located around tectonic features, rivers and roads. The ROC curve was used to evaluate the accuracy of the two models based on grid cells and slope units. In terms of both training accuracy and prediction accuracy, the slope unit-based model performed better in landslide susceptibility assessment than the grid cell-based model. Although grid cells can be easily obtained in GIS and are convenient for calculation, they are not closely related to the geological environment. The slope unit is the basic unit of landslide occurrence, and it is derived from the DEM data. Therefore, slope units are more closely related to the geological environment, which makes the evaluation results more accurate.

Nevertheless, the classifications of the landslide influence factors were based on previous studies and might not be suitable for our study region. Therefore, further studies should propose an objective classification method for influence factors in landslide susceptibility assessment. Because of the lack of other data, this paper used only eight landslide influence factors. Other factors such as earthquakes and land use change should be considered in future studies. In the slope unit-based model, the same likelihood of landslide occurrence is assigned to a whole unit (Huabin et al. 2005). Thus it is difficult to determine in which part of the slope landslides tend to occur. This problem should also be considered in future studies. Moreover, subsequent studies should consider the seed cells, which reflect the real effect of parameter maps over the distribution of landslides (Suzen and Doyuran 2004a; Suzen and Doyuran 2004b).