Introduction

Landslide susceptibility is a measure of the occurrence probability of landslides under certain geo-environmental conditions. Landslide susceptibility assessment is generally regarded as a prior step towards an assessment of landslide hazard and risk (Corominas et al. 2014). The methods and techniques of landslide susceptibility assessment, as well as comparisons between them, have been widely studied (e.g., Yesilnacar and Topal 2005; Chacón et al. 2006; Lee et al. 2007; Yilmaz 2009; Yalcin et al. 2011; Akgun 2012; Park et al. 2013; Nourani et al. 2014; Youssef 2015; Wang et al. 2016).

The methods of landslide susceptibility assessment can be categorized into three fundamental types, namely the qualitative “knowledge-driven methods,” quantitative “data-driven methods,” and quantitative “physically based methods” (Corominas et al. 2014). Knowledge-driven methods assess landslide susceptibility by ranking and weighting different landslide-related factors based on the knowledge of experts. Data-driven methods evaluate landslide susceptibility by referring to the geo-environmental characteristics of those locations where landslides have occurred. Physically based methods predict landslide susceptibility based on the mechanisms and processes that control the initiation (failure) of landslides. All three types of methods have both advantages and drawbacks. Although subjectivity is inevitably involved in direct knowledge-driven methods, the quality of landslide susceptibility maps produced by indirect approaches can be improved by the introduction of expert knowledge (Thiery et al. 2014).

Generally, more complex methods that require larger amounts of data are applied at larger scales of landslide susceptibility mapping (Thiery et al. 2007), and data-driven methods have become standard in regional scale landslide susceptibility assessment (Corominas et al. 2014). Data-driven methods broadly include bivariate methods and (multivariate) machine learning methods. The main bivariate methods in common use are the frequency ratio method (e.g., Lee and Talib 2005; Kannan et al. 2013; Son et al. 2016), weight of evidence method (e.g., Lee and Choi 2004; Thiery et al. 2007; Kayastha et al. 2012), fuzzy logic method (e.g., Ilanloo 2011; Kayastha et al. 2013), information value method (e.g., Sarkar et al. 2013; Wang et al. 2015a), Dempster-Shafer method (e.g., Park 2011; Mohammady et al. 2012), and geographic information system (GIS) matrix method (Irigaray et al. 2007), among others. The main machine learning methods in common use are the logistic regression method (e.g., Ayalew and Yamagishi 2005; Lee 2005; Bai et al. 2010; Wang et al. 2013), artificial neural network method (e.g., Lee et al. 2003; Ermini et al. 2005; Tsangaratos and Benardos 2014), support vector machine method (e.g., Yao et al. 2008; Marjanović et al. 2011), and random forest method (e.g., Catani et al. 2013; Youssef et al. 2016), among others.

Bivariate methods quantify landslide susceptibility by calculating a weight value for each class of each landslide-related factor. Among bivariate methods, the frequency ratio method is one of the most popular (Korup and Stolle 2014) and can achieve higher accuracies than other methods, as shown by several case studies (e.g., Pradhan 2010; Mohammady et al. 2012; Ozdemir and Altural 2013; Regmi et al. 2014; Guo et al. 2015; Ramesh and Anbazhagan 2015; Chen et al. 2016; Ding et al. 2016; Vakhshoori and Zare 2016). Although many case studies have shown that machine learning methods generally perform better than the frequency ratio method in landslide susceptibility assessments (e.g., Yilmaz 2009; Pradhan and Lee 2010a; Akgun 2012; Park et al. 2013; Youssef 2015), the frequency ratio method is still commonly used by researchers and practitioners. There are two reasons for its ongoing popularity. The first is that the frequency ratio method is friendly to end users because the principles behind it are simple and clear. The understandability of the input, calculation, and output procedures, as well as the ease of implementation in a GIS environment, makes the frequency ratio method an acceptable simple tool for landslide susceptibility assessment when sufficient data are available (e.g., Lee et al. 2007; Yilmaz 2009). The second is that the vulnerability to landslide failure associated with individual landslide-related factors can be investigated through the frequency ratio values calculated for each factor class. Like other bivariate methods, the frequency ratio method not only produces landslide susceptibility maps but also serves to inspect the correlations between landslides and landslide-related factors (e.g., Akgun et al. 2008; Kayastha 2015; Guo et al. 2015).

The classification of landslide-related factors with continuous factor values is usually the first step of the conventional frequency ratio method (see the “Conventional frequency ratio method” section). However, classifying the factors induces a discontinuity problem in the frequency ratio values and a subjectivity problem. The discontinuity problem means that all the factor values in the same factor class have the same frequency ratio value, which eventually results in a discontinuous spatial distribution of landslide susceptibility. The subjectivity problem means that the choices of the number and the bounds of the factor classes are more or less subjective. Another problem faced in practice is that the factor classifications and the calculations of frequency ratios for different factors require much manual labor. The subjectivity and manual labor problems can be moderated by adopting statistical classification of landslide-related factors. However, whether the classes yielded by statistics reflect reality remains an open question.

This paper attempts to modify the conventional frequency ratio method and to implement the modified method in a GIS environment as a handy landslide susceptibility assessment tool. The performance of the modified frequency ratio method is evaluated using two case studies. It is worth noting that the term “landslide” (Cruden and Varnes 1996) can embrace different types of materials (e.g., rock, debris, and earth) and different types of movements (e.g., falls, slides, and flows). The landslide inventories used in the numerical experiments of this paper contain different types of landslides, which means that one single susceptibility map was produced for all types of landslides. This research is expected to benefit landslide susceptibility assessments in practice.

Methods

Conventional frequency ratio method

Let $L$ and $F$ stand for landslides and a certain landslide-related factor, respectively. Given that the factor $F$ is categorized into $n$ types or subdivided into $n$ classes, the frequency ratio (FR) for the $i$th type or the $i$th class of factor $F$ ($F_i$) can be written as:

$$ \begin{aligned} FR_i &= \frac{PL_i}{PF_i} \\ &= \frac{\text{the frequency of landslides in the } F_i \text{ area}}{\text{the frequency of the } F_i \text{ area}} \\ &= \frac{\text{the area of landslides in the } F_i \text{ area} \,/\, \text{the area of landslides in the study area}}{\text{the area of the } F_i \text{ area} \,/\, \text{the area of the study area}} \end{aligned} $$
(1)

A frequency ratio $FR_i$ larger than 1 indicates that “the frequency of landslides in the $F_i$ area” ($PL_i$) is larger than “the frequency of the $F_i$ area” ($PF_i$) and further indicates that the $i$th type or the $i$th class of factor $F$ ($F_i$) favors the occurrence of landslides. Conversely, a frequency ratio $FR_i$ smaller than 1 indicates that $F_i$ does not favor the occurrence of landslides. Eq. (1) can be rearranged as follows:

$$ \begin{aligned} FR_i &= \frac{\text{the area of landslides in the } F_i \text{ area} \,/\, \text{the area of the } F_i \text{ area}}{\text{the area of landslides in the study area} \,/\, \text{the area of the study area}} \\ &= \frac{\text{the probability of landslides in the } F_i \text{ area}}{\text{the probability of landslides in the study area}} \\ &= \frac{p\left(L|F_i\right)}{p(L)} \end{aligned} $$
(2)

Since “the probability of landslides in the study area” $p(L)$ is fixed once the landslide and factor data are provided, the frequency ratio $FR_i$ is entirely determined by “the probability of landslides in the $F_i$ area” $p(L|F_i)$, which is in fact “the conditional probability of $L$ given $F_i$” (e.g., Parise and Jibson 2000). A larger conditional probability $p(L|F_i)$ means that the occurrence probability of landslides is larger in the $i$th type or the $i$th class of factor $F$ ($F_i$).

For an arbitrary landslide-related factor $F^{(j)}$ ($j$ = 1, 2, 3, …, $m$), the frequency ratios of its different types or classes, namely $FR_i^{(j)}$ ($i$ = 1, 2, 3, …, $n$; $j$ = 1, 2, 3, …, $m$), can be calculated according to Eq. (1). If the type or the class of $F^{(j)}$ at a certain location is $F_i^{(j)}$, the frequency ratio of this factor at this location, $FR^{(j)}$, will be $FR_i^{(j)}$. Then, the landslide susceptibility index (LSI) at this location is the sum of the frequency ratios of the different landslide-related factors at this location (e.g., Lee and Pradhan 2007):

$$ LSI={\displaystyle \sum_{j=1}^m{FR}^{(j)}} $$
(3)

It is worth noting that if landslide areas are not available, the counts of landslides can be used instead of the areas of landslides in Eqs. (1) and (2).
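Eqs. (1) and (3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this paper: it assumes every raster cell has the same area (so cell counts stand in for areas), that at least one landslide cell exists, and the function names and toy data are hypothetical.

```python
from collections import Counter

def frequency_ratios(class_of_cell, is_landslide):
    """Return {class: FR_i} following Eq. (1), using cell counts as areas."""
    total_cells = len(class_of_cell)
    total_landslide_cells = sum(is_landslide)  # assumed to be > 0
    cells_per_class = Counter(class_of_cell)
    landslide_cells_per_class = Counter(
        c for c, hit in zip(class_of_cell, is_landslide) if hit
    )
    fr = {}
    for c, n_cells in cells_per_class.items():
        pl = landslide_cells_per_class[c] / total_landslide_cells  # PL_i
        pf = n_cells / total_cells                                  # PF_i
        fr[c] = pl / pf
    return fr

def landslide_susceptibility_index(factor_classes, fr_tables):
    """Eq. (3): at every cell, sum the FR of the class of each factor."""
    n_cells = len(factor_classes[0])
    return [
        sum(fr[classes[k]] for classes, fr in zip(factor_classes, fr_tables))
        for k in range(n_cells)
    ]

# Hypothetical toy study area of 10 cells with two classified factors.
slope_class = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
aspect_class = ["N", "S", "N", "S", "N", "S", "N", "S", "N", "S"]
landslide = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

fr_slope = frequency_ratios(slope_class, landslide)
fr_aspect = frequency_ratios(aspect_class, landslide)
lsi = landslide_susceptibility_index(
    [slope_class, aspect_class], [fr_slope, fr_aspect]
)
```

In this toy example, slope class 3 holds 2 of the 5 landslide cells but only 2 of the 10 cells, so its frequency ratio is (2/5)/(2/10) = 2, which marks it as favoring landslides.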

Modified frequency ratio method

The modified method also calculates a landslide susceptibility index by summing the frequency ratios of different landslide-related factors. The framework of calculating the frequency ratio of a certain landslide-related factor is shown in Fig. 1. There are three essential procedures in this modified method.

  1. Normalization. This procedure normalizes the continuous factor values of each factor to the 0–1 range. The normalization of factor values makes it possible to use the same parameters for different factors in the following precision setting and frequency statistics procedures.

  2. Precision setting. The purpose of precision setting is to reduce the calculation load, and thus accelerate the calculation, while still guaranteeing adequate precision. For example, if the parameter “precision” is set to 3, the normalized factor values will have only 3 digits after the decimal point. This means that there will exist at most 1001 identical normalized factor values (0.000, 0.001, …, 1.000). This further means that the variation of the frequency ratios of each landslide-related factor will be characterized by at most 1001 values, since one frequency ratio is calculated for each identical normalized factor value in the frequency statistics step.

  3. Frequency statistics. For the $k$th identical normalized factor value $I_k$ ($k$ = 1, 2, 3, …, $l$), we mark those regions covered by normalized factor values within a certain neighborhood around $I_k$ as $F_k$. Then, $F_k$ is analogous to the $i$th type or the $i$th class of factor $F$ ($F_i$) in the conventional frequency ratio method. Once the area of $F_k$ ($AF_k$) and the area (or the count) of landslides in $F_k$ ($AL_k$) have been calculated, “the frequency of $F_k$” ($PF_k$) and “the frequency of landslides in $F_k$” ($PL_k$), as well as “the frequency ratio of $I_k$” ($FR_k$), can be calculated according to Eq. (1). The size of the neighborhood is determined by a parameter “bin width,” which is often used in histogram statistics. This bin width is a value between 0 and 1 since the factor values have already been normalized. The bin width is analogous to the widths of the factor classes in the conventional frequency ratio method.

Fig. 1
The procedures of the modified frequency ratio method

The conventional frequency ratio method calculates the frequency ratios for only a few factor classes (or types), while the modified method calculates the frequency ratios for many identical normalized factor values so that the variation of frequency ratio with factor value is resolved in much finer detail. The number of identical normalized factor values is determined by the original factor values and the precision. Yet, it must be emphasized that precision setting is not obligatory. Theoretically, if precision is not considered, the modified method can calculate a frequency ratio for each of the factor values within the study area. This modified frequency ratio method can be regarded as applying “moving frequency statistics” to every identical normalized factor value using a uniform neighborhood window size. The neighborhoods of different identical normalized factor values can overlap, and gaps can also occur between them if a low precision and a small bin width are adopted.
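The three procedures can be sketched as follows. This is a minimal sketch under stated assumptions rather than the ALSA implementation: the neighborhood is assumed to be centered on each value with a half-width of half the bin width (the text above does not specify how the window is placed), all cells are assumed to have equal area, and the function name and toy data are hypothetical.

```python
def modified_frequency_ratios(values, weights, bin_width=0.1, precision=3):
    """Moving frequency statistics for one continuous factor.

    values  : factor value of every cell in the study area
    weights : landslide area (or count) falling in every cell
    Returns {normalized value I_k: FR_k}.
    """
    lo, hi = min(values), max(values)
    # 1. Normalization to the 0-1 range (assumes hi > lo).
    norm = [(v - lo) / (hi - lo) for v in values]
    # 2. Precision setting: at most 10**precision + 1 distinct values remain.
    norm = [round(x, precision) for x in norm]
    total_cells = len(norm)
    total_landslide = sum(weights)  # assumed to be > 0
    half = bin_width / 2.0          # assumption: window centered on I_k
    fr = {}
    # 3. Frequency statistics for every distinct normalized value I_k.
    for i_k in sorted(set(norm)):
        in_window = [abs(x - i_k) <= half for x in norm]
        af_k = sum(in_window)                                    # AF_k
        al_k = sum(w for w, inside in zip(weights, in_window) if inside)  # AL_k
        pf_k = af_k / total_cells                                # PF_k
        pl_k = al_k / total_landslide                            # PL_k
        fr[i_k] = pl_k / pf_k                                    # FR_k, Eq. (1)
    return fr

# Hypothetical toy factor: 11 cells, landslides in the cells valued 40 and 50.
values = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
weights = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
fr = modified_frequency_ratios(values, weights, bin_width=0.3, precision=3)
```

The loop scans all cells for every distinct value, which is simple but quadratic in the worst case; lowering the precision reduces the number of distinct values and therefore the work, which is the stated purpose of the precision setting.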

For landslide-related factors that are already categorized into different types, e.g., geological units and land use types, the frequency ratios are still calculated using the conventional method since continuous factor values are not available. For landslide-related factors with continuous factor values, through normalizing factor values, the modified method makes the calculations of frequency ratios for different factors constrained by only two uniform parameters (precision and bin width). Because precision setting is not obligatory, the bin width becomes the only obligatory parameter that needs to be input by the users. Therefore, the subjectivities associated with the manual classifications of factors in the conventional method can be reduced. Furthermore, the simplicity of the inputs of the modified method significantly reduces manual labor and therefore favors an automatic and quick assessment of landslide susceptibility.

GIS extension

A GIS extension called “Automatic Landslide Susceptibility Assessment (ALSA)” that implements the modified frequency ratio method was developed in ArcGIS using ArcObjects and C#. The interface of ALSA in ArcMap is shown in Fig. 2. The input data needed by ALSA include (1) landslides, (2) landslide-related factors, and (3) processing extent. The landslide data can be points or polygons. A weight can optionally be set for point data so that the magnitudes of landslides can be represented if landslide areas (polygons) are not available. The landslide-related factor data must be in raster format. If a factor is already classified, the checkbox in front of its corresponding data layer must be checked. If all the factors are classified, i.e., all the checkboxes are checked, the input textboxes for precision and bin width are disabled since no factors with continuous factor values remain to be processed using the modified method. The processing extent can be obtained automatically by selecting vector or raster data, in which case the extent of the data (a rectangle) is accepted as the processing extent. The four coordinates of a rectangular processing extent can also be manually input and modified by the users. Using the geometry of a polygon feature as the processing extent is also an option, in which case manual modification of the processing extent is not allowed. The coordinate systems of the landslide data, landslide-related factor data, and the data defining the processing extent must be the same. The input parameters needed by ALSA include (1) cell size of output rasters, (2) precision of identical normalized factor values, and (3) bin width for frequency statistics. The full path of the output LSI raster should be defined by the users. Furthermore, if the checkbox “process individual factors (optional)” is checked, the LSI will not be calculated according to Eq. (3), and only the frequency ratio rasters for individual landslide-related factors will be output.

Fig. 2
The interface of ALSA extension in ArcMap

Case studies

The Anning Basin in Sichuan and the Caiyuan Basin in Fujian have been chosen to test the modified method because they suffer severe geological disasters (i.e., landslides). They are located in the southwest inland mountainous area and the southeast coastal mountainous area of China (Fig. 3), respectively.

Fig. 3
The Anning Basin (a) and the Caiyuan Basin (b) as well as their locations in China (c)

Study areas and data

The Anning Basin

The Anning Basin (Fig. 3a) is the drainage basin of the Anning River. The Anning River is a tributary of the Yalong River, which is a tributary of the Yangtze River. The Anning Basin has an area of about 11,016 km². The vigorous tectonic activities and steep topography make the Anning Basin highly prone to geological disasters. The 102 landslides in the Anning Basin (Fig. 3a) recorded in a geological disaster database of Sichuan were adopted in this case study. Only landslide points (locations) are available. Ten landslide-related factors with continuous factor values were adopted to assess landslide susceptibility, namely height, slope, aspect, curvature, relief, distance to river, distance to road, distance to fault, precipitation, and normalized difference vegetation index (NDVI). The height, slope, aspect, curvature, relief, and river data were derived from the shuttle radar topography mission digital elevation model (SRTM DEM) dataset with a 90 m × 90 m spatial resolution. The relief of a location is the elevation drop within a 1 km × 1 km square neighborhood around that location. The stream network (river) was extracted using hydrology analysis. The road data are composed of the China National Highways, China Provincial Highways, and China County Highways in the Anning Basin. The fault data were extracted from the 1:50,000 geological map of China. The grid precipitation data from 1981 to 2010 with a 1 km × 1 km spatial resolution (Wang et al. 2015b) were adopted. The NDVI data were the MOD13Q1 V006 data for the 353rd day of 2015, provided by the Land Processes Distributed Active Archive Center (LP DAAC). Geological formations are one of the most important parameters for landslide susceptibility assessments (e.g., van Westen et al. 2006; Corominas et al. 2014). However, no geological formation data with adequately large spatial scale are available for the Anning Basin. This is not a major problem for this study since the modified method yields the same results as the conventional method for classified landslide-related factors.

The Caiyuan Basin

The Caiyuan Basin (Fig. 3b) is located in Nanping City, Fujian Province, China. The Caiyuan Basin has an area of about 25.47 km². The frequent storm rainfalls and mountainous topography make the Caiyuan Basin highly prone to geological disasters. Heavy rainfall struck Fujian in mid-to-late June 2010 and induced large numbers of landslides. Landslides in the Caiyuan Basin were manually mapped in a GIS platform on 2.5 m spatial resolution SPOT images taken shortly after this rainfall event. In total, 1028 landslides (polygons) in the Caiyuan Basin (Fig. 3b) were adopted in this case study. Six landslide-related factors with continuous factor values were adopted to assess landslide susceptibility, namely height, slope, aspect, curvature, relief, and distance to river. All these factor data were derived from a 5 m × 5 m spatial resolution DEM. The relief of a location is the elevation drop within a 50 m × 50 m square neighborhood around that location, and the stream network (river) was extracted using hydrology analysis. Geological formation data with adequately large spatial scale are not available for the Caiyuan Basin.

Results

Landslide susceptibilities in the Anning Basin and in the Caiyuan Basin were assessed using both the modified and conventional frequency ratio methods. The cell sizes of output rasters for the Anning Basin and Caiyuan Basin were 100 and 5 m, respectively. The precision of identical normalized factor values and the bin width for frequency statistics in the modified method were 4 and 0.1, respectively. In the conventional method, the landslide-related factors with continuous factor values must first be classified. Without loss of generality, we categorized all the landslide-related factors into five classes using the Jenks’s natural breaks method, which identifies class breaks that best group similar values and maximize the differences between classes.

It must be emphasized that the selection and processing of input data (i.e., landslide inventory and landslide-related factors) can have significant effects on landslide susceptibility mapping results (e.g., Thiery et al. 2007; Costanzo et al. 2012; Fressard et al. 2014; Hussin et al. 2016). This is why expert knowledge is still essential for landslide susceptibility assessment although many sophisticated computer-aided methods of landslide susceptibility assessment have been developed (e.g., Thiery et al. 2014; Tsangaratos and Ilia 2016). In order to inspect the influence of the selection and processing of input data on the assessment of landslide susceptibility, we randomly grouped the landslide inventories of the two study cases into training subsets and test subsets using three different grouping ratios for both the modified and conventional methods. For the conventional method, we also categorized all the landslide-related factors into five classes using quantile breaks. The three grouping ratios of landslide inventory for the Anning Basin are 51:51, 34:68, and 17:85. The three grouping ratios of landslide inventory for the Caiyuan Basin are 514:514, 128:900, and 32:996.
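The random grouping and the quantile classification described above can be illustrated with a short Python sketch. It is not the procedure actually run for this paper: the seed, the function names, and the toy values are hypothetical, and the first number of each grouping ratio is assumed to be the size of the training subset.

```python
import bisect
import random
import statistics
from collections import Counter

def split_inventory(landslide_ids, n_train, seed=0):
    """Randomly split an inventory into training and test subsets."""
    shuffled = list(landslide_ids)
    random.Random(seed).shuffle(shuffled)  # seed fixed only for repeatability
    return shuffled[:n_train], shuffled[n_train:]

def quantile_classes(values, n_classes=5):
    """Assign each value to one of n_classes classes of roughly equal count."""
    cuts = statistics.quantiles(values, n=n_classes)  # n_classes - 1 cut points
    return [bisect.bisect_right(cuts, v) for v in values]

# 102 landslides grouped 51:51, as in the first Anning scenario.
inventory = list(range(102))
train, test = split_inventory(inventory, n_train=51)

# Hypothetical factor values 1..100 classified into five quantile classes.
classes = quantile_classes(list(range(1, 101)))
class_sizes = Counter(classes)
```

Quantile breaks place the same number of cells in every class regardless of how the values are distributed, whereas natural breaks adapt the class bounds to clusters in the data; this difference is what the two conventional-method scenarios compare.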

The success rate curve and prediction rate curve (Chung and Fabbri 2003) were used to quantitatively evaluate the performance of the methods. Larger values of AUC (area under the curve) indicate higher accuracies of assessment (e.g., Pradhan and Lee 2010b; Xu et al. 2012; Regmi and Poudel 2016). It is worth noting that if the landslide area is negligible compared with the whole study area, the AUC values of the success and prediction rate curves will approximate the AUC value of the ROC (receiver operating characteristic) curve, which is also commonly used for evaluating landslide susceptibility maps (e.g., Mathew et al. 2009; Erener and Düzgün 2010; Akgun 2012; Ahmed 2015). The success and prediction rate curves of different scenarios of landslide susceptibility assessment for the Anning Basin and the Caiyuan Basin are shown in Figs. 4 and 5, respectively. The AUC values of different scenarios for the Anning Basin and the Caiyuan Basin are also shown in Tables 1 and 2, respectively.
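As a rough illustration of how such a curve and its AUC can be computed, the sketch below ranks cells from the highest to the lowest LSI and accumulates the fraction of landslides captured against the fraction of the study area covered. It is a simplified sketch, not the evaluation code used here: cells are assumed to have equal area, ties in LSI are broken by cell order, and the function name and data are hypothetical. Passing training landslides gives a success rate curve; passing test landslides gives a prediction rate curve.

```python
def rate_curve_auc(lsi, landslide_weight):
    """Cumulative-landslide curve over area ranked by decreasing LSI.

    lsi              : susceptibility index of every cell
    landslide_weight : landslide area (or count) in every cell
    Returns (area fractions, landslide fractions, AUC).
    """
    order = sorted(range(len(lsi)), key=lambda k: lsi[k], reverse=True)
    total = sum(landslide_weight)  # assumed to be > 0
    n = len(lsi)
    xs, ys = [0.0], [0.0]
    captured = 0.0
    for rank, k in enumerate(order, start=1):
        captured += landslide_weight[k]
        xs.append(rank / n)          # fraction of study area covered so far
        ys.append(captured / total)  # fraction of landslides captured so far
    # Trapezoidal rule for the area under the curve.
    auc = sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2.0
        for i in range(1, len(xs))
    )
    return xs, ys, auc

# Hypothetical four-cell examples: a well-ranked map and a badly ranked one.
_, _, auc_good = rate_curve_auc([4, 3, 2, 1], [1, 1, 0, 0])
_, _, auc_bad = rate_curve_auc([4, 3, 2, 1], [0, 0, 1, 1])
```

A map that places all landslides in the highest-ranked cells yields an AUC close to 1, while a map no better than random ranking yields an AUC near 0.5.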

Fig. 4
The success and prediction rate curves of different scenarios of landslide susceptibility assessment for the Anning Basin. “MFRM” indicates the modified frequency ratio method and “CFRM, natural breaks” and “CFRM, quantile breaks” indicate the conventional frequency ratio method using the Jenks’s natural breaks method and quantile breaks to classify landslide-related factors, respectively. a The success rate curve for 51:51 grouping ratio of landslide inventory, b the prediction rate curve for 51:51 grouping ratio of landslide inventory, c the success rate curve for 34:68 grouping ratio of landslide inventory, d the prediction rate curve for 34:68 grouping ratio of landslide inventory, e the success rate curve for 17:85 grouping ratio of landslide inventory, and f the prediction rate curve for 17:85 grouping ratio of landslide inventory

Fig. 5
The success and prediction rate curves of different scenarios of landslide susceptibility assessment for the Caiyuan Basin. “MFRM” indicates the modified frequency ratio method and “CFRM, natural breaks” and “CFRM, quantile breaks” indicate the conventional frequency ratio method using the Jenks’s natural breaks method and quantile breaks to classify landslide-related factors, respectively. a The success rate curve for 514:514 grouping ratio of landslide inventory, b the prediction rate curve for 514:514 grouping ratio of landslide inventory, c the success rate curve for 128:900 grouping ratio of landslide inventory, d the prediction rate curve for 128:900 grouping ratio of landslide inventory, e the success rate curve for 32:996 grouping ratio of landslide inventory, and f the prediction rate curve for 32:996 grouping ratio of landslide inventory

Table 1 The AUC values of the success and prediction rate curves of different scenarios of landslide susceptibility assessment for the Anning Basin
Table 2 The AUC values of the success and prediction rate curves of different scenarios of landslide susceptibility assessment for the Caiyuan Basin

Comparisons

Quantitative comparison

From the AUC values in Tables 1 and 2 and the curves in Figs. 4 and 5, we can see that there are no significant differences in accuracy between the different scenarios of landslide susceptibility assessment for either the Anning or the Caiyuan Basin. In order to inspect the macro characteristics of the differences between scenarios, the average difference of AUC values calculated by different methods given the same grouping ratios and the average difference of AUC values calculated by the same methods given different grouping ratios were calculated (Tables 3 and 4). Since the AUC values of success rate curves generally increase as the grouping ratio decreases (Tables 1 and 2), the differences of AUC values between grouping ratios were calculated by subtracting the AUC values of larger grouping ratios from those of smaller grouping ratios (Table 4). The results show that a higher AUC value of the success rate curve is commonly accompanied by a lower AUC value of the prediction rate curve, which might be due to over-fitting. Because of over-fitting, training datasets containing fewer landslides commonly yield higher AUC values of the success rate curve. Therefore, we further averaged the results for the success rate curve and prediction rate curve and use this “final average difference of AUC values” for quantitative comparisons.

Table 3 The average difference of AUC values calculated by different methods given the same grouping ratios
Table 4 The average difference of AUC values calculated by the same methods given different grouping ratios

Although both the AUC values of individual landslide susceptibility maps and the differences of AUC values between different maps are relatively small, from Tables 3 and 4, we can still find that (1) the performance of the modified frequency ratio method is slightly better than that of the conventional method, (2) the performances of the Jenks’s natural breaks method and quantile breaks in the conventional method are almost the same, with the Jenks’s natural breaks method slightly better, and (3) the differences produced by different grouping ratios are larger than those produced by different methods. On the one hand, the results indicate that the modified frequency ratio method makes a slight improvement; on the other hand, they support the notion that the selection and processing of input data can in some situations be more influential than the choice of method in landslide susceptibility assessments. Although they are based on minor differences of AUC values, we think these observations are reasonably representative since they were derived from systematic numerical experiments.

Qualitative comparison

The modification of the frequency ratio method also introduced some qualitative changes. The results in which landslide inventories were evenly grouped into training and test subsets were adopted in the qualitative comparisons. For the conventional method, the results obtained with the Jenks’s natural breaks method were chosen since natural breaks performed slightly better than quantile breaks. The frequency ratios of different landslide-related factors calculated using the modified and conventional frequency ratio methods for the Anning and Caiyuan Basins are shown in Figs. 6 and 7, respectively. Landslide susceptibility maps of the Anning and Caiyuan Basins are shown in Figs. 8 and 9, respectively. Landslide susceptibility indices (LSIs) calculated using the modified and conventional frequency ratio methods were classified into four classes (low, moderate, high, and very high) using quantile breaks.

Fig. 6
The frequency ratios of different landslide-related factors in the Anning Basin calculated using the modified and conventional frequency ratio methods. The x coordinates of the data points for the conventional method are the averages of all the factor values of the corresponding factor classes. a Height, b slope, c aspect, d curvature, e relief, f distance to river, g distance to road, h distance to fault, i precipitation, and j NDVI

Fig. 7
The frequency ratios of different landslide-related factors in the Caiyuan Basin calculated using the modified and conventional frequency ratio methods. The x coordinates of the data points for the conventional method are the averages of all the factor values of the corresponding factor classes. a Height, b slope, c aspect, d curvature, e relief, and f distance to river

Fig. 8
Landslide susceptibility indices (LSIs) of the Anning Basin calculated using the modified and conventional frequency ratio methods, classified into four classes using quantile breaks: a the classified LSI calculated using the modified method, b a close-up of the classified LSI calculated using the modified method, c the classified LSI calculated using the conventional method, and d a close-up of the classified LSI calculated using the conventional method

Fig. 9
Landslide susceptibility indices (LSIs) of the Caiyuan Basin calculated using the modified and conventional frequency ratio methods, classified into four classes using quantile breaks: a the classified LSI calculated using the modified method, b a close-up of the classified LSI calculated using the modified method, c the classified LSI calculated using the conventional method, and d a close-up of the classified LSI calculated using the conventional method

The results show that the frequency ratios of different landslide-related factors calculated using the modified method and those calculated using the conventional method have similar fluctuating patterns. However, the results calculated using the modified method resolve the fluctuations of frequency ratio in more detail than the results calculated using the conventional method (Figs. 6 and 7). For example, in the Anning study case, for the height factor, the modified method gives 3715 frequency ratio values (one for each identical normalized factor value), while the conventional method gives only 5 frequency ratio values (one for each factor class). The frequency ratios of the height factor of the Anning study case calculated using the modified method present a peak around 1500 m, but the drop of frequency ratio before this peak is not reflected by the results calculated using the conventional method (Fig. 6a). In the Caiyuan study case, the drop of frequency ratio as the slope exceeds about 58° (Fig. 7b) and the drop of frequency ratio as the relief exceeds about 42 m (Fig. 7e), both observed in the results calculated using the modified method, are not reflected by the results calculated using the conventional method. The absence of these characteristic turning points in the results of the conventional method can mislead the analysis of the vulnerability to landslide failure associated with individual landslide-related factors.

The high resolution of the frequency ratios calculated using the modified method has also resulted in some “unexpected” high values of frequency ratio. For example, in the Anning study case, there are distinct peaks of frequency ratio around high values of distance to river (>10,000 m, Fig. 6f) and high values of distance to fault (>15,000 m, Fig. 6h). In the Caiyuan study case, there is also a peak of frequency ratio around 240 m distance to river (Fig. 7f). These unexpected peaks nevertheless faithfully reflect the nature of the factor data and landslide data. Taking distance to fault as an example, because the frequency of $F_k$ ($PF_k$) is low for high values of distance to fault, even a low frequency of landslides in $F_k$ ($PL_k$) can produce a relatively high frequency ratio of $I_k$ ($FR_k$) (Fig. 6h). These unexpectedly high frequency ratios do not necessarily reflect the constraints of landslide-related factors on landslide initiation; on the contrary, they might merely be statistical outcomes. Nevertheless, they are accompanied by a low frequency of $F_k$ ($PF_k$), i.e., they occupy only small areas. Such small areas will not distort the macro characteristics of the landslide susceptibility map.
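A short worked example shows how such a statistical peak can arise. It assumes, as the explanation above implies, that the frequency ratio is the landslide frequency divided by the factor frequency. The numbers are purely illustrative and are not taken from either study case: a bin that covers 0.05% of the study area but contains 0.2% of the landslides (for instance, 2 of 1000 landslides).

```latex
\[
FR_k = \frac{PL_k}{PF_k} = \frac{0.002}{0.0005} = 4
\]
```

A frequency ratio of 4 would here rest on only two landslides in a very small area, so it says little about the physical control exerted by the factor.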

The differences in the resolution of the frequency ratio curves are ultimately reflected in the spatial characteristics of the landslide susceptibility zoning maps (Figs. 8 and 9). Figure 8 shows that the conventional method failed to place some landslide points in the very high or high susceptibility zones because it did not present high values of frequency ratio for distances to fault larger than 15,000 m. For the Caiyuan study case, which has a larger scale, the “scattering of isolated raster cells” observed in the landslide susceptibility zoning map produced by the conventional method (Fig. 9d) is greatly reduced in the map produced by the modified method (Fig. 9b). The high resolution of the frequency ratios calculated using the modified method substantially reduced the discontinuity of the spatial distribution of landslide susceptibility.
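The zoning maps in Figs. 8 and 9 classify the LSI into four classes using quantile breaks. A minimal sketch of such a classification is given below. It assumes the LSI is available as a flat array of cell values and uses integer class codes from 0 (lowest susceptibility) to 3 (highest); the function name `classify_quantile` and the toy values are hypothetical, and the GIS software used in the study may compute its breaks slightly differently.

```python
import numpy as np

def classify_quantile(lsi, n_classes=4):
    """Assign each cell to one of n_classes groups holding roughly equal cell counts."""
    lsi = np.asarray(lsi, dtype=float)
    # Interior break points: the 25th, 50th and 75th percentiles for four classes
    probs = np.linspace(0, 1, n_classes + 1)[1:-1]
    breaks = np.quantile(lsi, probs)
    # Class 0 = lowest susceptibility, n_classes - 1 = highest
    classes = np.digitize(lsi, breaks, right=True)
    return classes, breaks

# Hypothetical LSI values for eight raster cells
classes, breaks = classify_quantile([1, 2, 3, 4, 5, 6, 7, 8])
```

Because quantile breaks depend only on the rank of each cell, two maps classified this way always have the same share of cells in each class, which makes the spatial patterns of the two methods directly comparable.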

Conclusions

This paper has proposed a modification of the frequency ratio method for assessing landslide susceptibility. The modified method calculates the frequency ratio for every “identical normalized factor value,” whereas the conventional method calculates it only for every “class” of a landslide-related factor with continuous factor values.

The modified method reduces the subjectivity involved in the conventional method because the classification of landslide-related factors with continuous factor values is no longer a prerequisite. The calculations of frequency ratios for different factors are constrained by only two uniform parameters (precision and bin width) in the modified method. The simplicity of the inputs of the modified method makes it possible to develop a handy tool for landslide susceptibility assessment, which in turn can accelerate the assessment, reduce manual labor, and ultimately make the method more convenient for end users. A GIS extension called “Automatic Landslide Susceptibility Assessment (ALSA)” that implements the modified frequency ratio method has also been developed.
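How two parameters can replace a manual classification may be sketched as follows. This is not the ALSA implementation, and the exact roles of precision and bin width are assumptions made for illustration: the factor is min-max normalized to [0, 1], rounded to `precision` decimal places, and for each distinct rounded value the frequency ratio is computed over all cells whose normalized value lies within half a `bin_width` of it. The function name `frequency_ratio_per_value` is hypothetical.

```python
import numpy as np

def frequency_ratio_per_value(factor, landslide, precision=2, bin_width=0.1):
    """One frequency ratio per distinct normalized factor value (illustrative only).
    Assumes the factor is not constant and at least one landslide cell exists."""
    factor = np.asarray(factor, dtype=float)
    landslide = np.asarray(landslide, dtype=bool)
    # Min-max normalize to [0, 1] and round to the chosen precision
    span = factor.max() - factor.min()
    norm = np.round((factor - factor.min()) / span, precision)
    values = np.unique(norm)
    fr = np.empty(len(values))
    for i, v in enumerate(values):
        # Cells whose normalized value lies within half a bin width of v
        in_bin = np.abs(norm - v) <= bin_width / 2
        pf = in_bin.mean()                               # share of all cells in the bin
        pl = landslide[in_bin].sum() / landslide.sum()   # share of landslide cells in the bin
        fr[i] = pl / pf
    return values, fr

# Hypothetical data: eleven cells, landslides in the third and fourth
factor = np.arange(11)
landslide = np.zeros(11, dtype=bool)
landslide[[2, 3]] = True
values, fr = frequency_ratio_per_value(factor, landslide, precision=2, bin_width=0.1)
```

The loop over distinct values is written for readability; on a full raster a vectorized or histogram-based implementation would be preferable.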

The modified method has been evaluated in two case studies based on systematic numerical experiments. The quantitative results show that the modified method yielded slightly larger AUC values than the conventional method. In addition, the modified method presented much more detailed variations of frequency ratio with factor value because the frequency ratios for every “identical normalized factor value” were calculated. The high resolution of the frequency ratios derived by the modified method can reveal characteristic fluctuations of frequency ratio with factor value that might be missed by the conventional method. This high resolution can further smooth the spatial discontinuities of landslide susceptibility indices observed in the susceptibility maps predicted by the conventional method.
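For readers who wish to reproduce this kind of comparison, the sketch below computes an AUC from LSI values and a landslide mask. It assumes that AUC refers to the area under the ROC curve, computed here through its rank interpretation (the probability that a randomly chosen landslide cell has a higher LSI than a randomly chosen non-landslide cell); the study may have used a different curve, such as a success rate curve. The function name `roc_auc` and the toy data are hypothetical.

```python
import numpy as np

def roc_auc(lsi, landslide):
    """Area under the ROC curve via pairwise comparison of LSI values."""
    lsi = np.asarray(lsi, dtype=float)
    landslide = np.asarray(landslide, dtype=bool)
    pos, neg = lsi[landslide], lsi[~landslide]
    # Compare every landslide cell with every non-landslide cell; ties count one half
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical LSI values for four cells, the last two holding landslides
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [False, False, True, True])
```

In this toy case three of the four landslide/non-landslide pairs are ranked correctly, giving an AUC of 0.75. The pairwise comparison needs memory proportional to the product of the two cell counts, so a sorted-rank formulation would be needed for large rasters.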

Although the proposed modified frequency ratio method improves on the conventional method in some respects, some issues must be noted. First, the selection and processing of input data are essential for landslide susceptibility assessment; thus, attention should be paid to incorporating expert knowledge into landslide susceptibility assessment, in addition to developing sophisticated assessment methods. Second, no geology or surficial formation data were included in the landslide susceptibility models of this study.