1 Introduction

In recent years, population growth and the expansion of settlements, infrastructure, and lifelines have greatly increased the impact of natural hazards in both industrialized and developing countries (Guzzetti 2005). Landslides play an important role in the evolution of landforms and represent a serious hazard in many areas of the world (Guzzetti 2005). In many countries, landslides cause larger annual property losses than any other type of natural hazard, including earthquakes, floods, and windstorms (Garcia-Rodriguez et al. 2008). According to the Centre for Research on the Epidemiology of Disasters (CRED 2009), landslides accounted for approximately 4.4 % of natural disasters worldwide from 1990 to 2009, with 2.3 % of reported landslides occurring in Asia. To minimize the loss of human life and economic value, potential landslide-prone areas should be identified (Devkota et al. 2013). For this reason, landslide susceptibility maps can help planners, decision makers, and engineers in slope management and land use planning. A landslide susceptibility map indicates where future landslides are likely to occur, based on the identification of areas of past landslide occurrence and of areas with similar or identical physical characteristics (van Westen et al. 2006). Several different methods and techniques for landslide susceptibility mapping have been proposed and tested. However, no general agreement exists on either the methods or the scope of producing landslide susceptibility maps (Carrara et al. 1995; Soeters and van Westen 1996; van Westen et al. 1997; Aleotti and Chowdhury 1999; Guzzetti et al. 1999).

Many studies have evaluated landslide susceptibility using geographic information system (GIS) technology, and many of these studies have used probabilistic models (Lee and Pradhan 2006; Dahal et al. 2008; Oh et al. 2009; Ozdemir 2009; Yilmaz 2010; Oh and Lee 2011; Demir et al. 2012; Pourghasemi et al. 2012a, b; Mohammady et al. 2012; Xu et al. 2012c). The statistical index model is one of the bivariate models and has been used by several researchers (Van Westen 1997; Rautela and Lakhera 2000; Cevik and Topal 2003; Tien Bui et al. 2011a; Raman and Punia 2012; Regmi et al. 2013). In addition, several studies have assessed landslide susceptibility using logistic regression models in different parts of the world (Ayalew and Yamagishi 2005; Lee and Pradhan 2007; Bai et al. 2010; Nandi and Shakoor 2010; Oh and Lee 2010; Ercanoglu and Temiz 2011; Erner and Duzgun 2012; Devkota et al. 2013).

The analytical hierarchy process and its combinations, such as multi-criteria evaluation (MCE), multi-criteria decision analysis (MCDA), and spatial multi-criteria evaluation (SMCE), have been used by different authors in landslide susceptibility mapping (Barredo et al. 2000; Nie et al. 2001; Ayalew et al. 2005; Komac 2006; Yalcin 2008; Akgun and Turk 2010; Pourghasemi et al. 2012c; Demir et al. 2012; Hasekiogullari and Ercanoglu 2012; Feizizadeh and Blaschke 2012a, b; Pourghasemi et al. 2012e).

In the past decade, some new methods such as artificial neural networks (ANNs) (Lee et al. 2004; Pradhan and Buchroithner 2010; Zare et al. 2012), fuzzy logic (Pradhan 2010a, b; Pradhan 2011a, b; Akgun et al. 2012; Pourghasemi et al. 2012c), and adaptive neuro-fuzzy inference system (ANFIS) (Vahidnia et al. 2010; Oh and Pradhan 2011; Sezer et al. 2011; Tien Bui et al. 2011b; Pradhan 2013) have been proposed.

Recently, new landslide susceptibility assessment methods such as support vector machine (SVM) (Ballabio and Sterlacchini 2012; Marjanović et al. 2011; Yao et al. 2008; Yilmaz 2010; Xu et al. 2012b; Pourghasemi et al. 2013), decision tree methods (Nefeslioglu et al. 2010; Tien Bui et al. 2012), index of entropy (Bednarik et al. 2010; Constantin et al. 2011; Pourghasemi et al. 2012d, f; Devkota et al. 2013; Wan 2012), Bayesian network (Song et al. 2012; Tien Bui et al. 2012), and fractal theory (Majtan et al. 2002; Yang and Lee 2006; Li et al. 2011) were tried, and their performances were assessed.

The main goals of the current research are to present a detailed landslide susceptibility mapping study using binary logistic regression, analytical hierarchy process, and statistical index models in a landslide-prone area (north of Tehran, Iran), and to assess their performances. The main difference between the present study and the aforementioned publications is that it compares the performance of two statistical approaches, one bivariate and one multivariate, with an expert knowledge-based model (AHP) for landslide susceptibility mapping in the north of the Tehran metropolitan area, Iran.

2 Study area

The study area is located in the north of the Tehran metropolitan area, Iran, between longitudes 51°05′26″E and 51°50′30″E, and latitudes 35°45′50″N and 35°59′16″N (Fig. 1). It covers an area of about 900 km². According to the Geological Survey of Iran (GSI 1997), the lithology of the study area is highly varied. Group 5 (Table 1) covers 33.97 % of the area and includes alternation of shale and tuffaceous siltstone (E ss3), green crystal, lithic and ash tuff, tuff breccia, and partly with intercalations of limestone (E t2), alternation of shale and tuffaceous siltstone (E ts2), rhyolitic tuff with some intercalations of shale (E r2), massive green tuff, shale with dacitic and andesitic-basaltic lava flows (E tsv1), dark gray shale with alternation of green tuff, and partly with sandstone, shale, conglomerate and limestone (E sht1), alternation of green tuff and shale (E tsh1), andesitic-basaltic lava breccia and lava flows (E b1), rhyolitic tuff and lava flows (E r1), dacitic to andesitic lava flows and rhyodacitic pyroclastic (E da1), bituminous siltstone and shale, calcareous tuffite (E ss1), tuffaceous sandstone, green tuff (E st1), shales and siltstone (E sl1), and green tuffs and limestone (E tl1). In addition, group 4 (Table 1) covers 27.54 % of the study area (GSI 1997).

Fig. 1 Landslide location map of the study area

Table 1 Lithology of the study area (GSI 1997)

Landslides are a very common phenomenon in the north of Tehran because of its climatic conditions. Most of these landslides occur near rivers and valleys. The Velenjak region, located in the northwest of Tehran, is one of the most sensitive areas. Other landslide-prone regions include Ozgol, Dar Abad, North of Saadat Abad, North of Emam Zadeh Ghasem, the Oushan-Fasham road, Meygoon, North of Lavasan, North of Kan, and Golab Darreh. The high population density and high land prices in these areas are the main reasons for landslide susceptibility mapping, which can support optimum management and the avoidance of susceptible regions.

The most important thrusts and faults of the study area are the Mosha-Fasham, Purkan-Vardij, and North of Tehran thrusts and the Shirpala and Emamzadeh Davud faults (GSI 1997). The altitude of the area ranges from 1,349.5 to 3,952.9 m a.m.s.l. The slope angles of the area range from 0° to as much as 83°. The major land use of the study area is rangeland, which covers almost 90.5 % of the whole area.

3 Conditioning factors database

For any kind of landslide study, an accurate landslide database is a prerequisite (Varnes 1984). In addition, landslide inventory mapping is the most fundamental step in any landslide susceptibility and hazard modeling (Ercanoglu and Gokceoglu 2004). It allows us to develop knowledge about past landslide types and failure mechanisms, and conceptual knowledge about the relations between existing landslides and conditioning and triggering factors (Ghosh 2011). Inventories are prepared using different techniques depending on the scope of the work, the extent of the study area, the scales of base maps, the quality and detail of the accessible information, and the resources available to carry out the work (Guzzetti et al. 2000). In the study area, a total of 528 landslides were mapped at 1:25,000-scale using aerial photographs, satellite images, and field surveys. Some views of the recent landslides identified in the study area are shown in Fig. 2. The smallest landslide that was mapped from these sources and recognized in the field had an extent of 685 m², while the largest was 280,804 m². The modes of failure for the landslides identified in the study area were determined according to the landslide classification system proposed by Varnes (1978). Most of the landslides are shallow rotational slides, with a few translational slides. However, in the analyses performed in the present study, only rotational failures were considered; translational slides were excluded because they are rare. In this research, the landslide inventory was randomly split into a training dataset of 70 % (370 landslide locations), used to train the adopted models, and the remaining 30 % (158 landslide locations), used for validation (Fig. 1). Identification of a suitable set of instability factors bearing a relationship with slope failures requires a priori knowledge of the main causes of landslides (Guzzetti et al. 1999).

Fig. 2 Field photographs of some landslides that occurred in the study area

For the landslide susceptibility zoning of the study area, twelve landslide conditioning factors were considered. These factors are slope degree, slope aspect, altitude, plan curvature, normalized difference vegetation index (NDVI), land use, lithology, distance from rivers, distance from roads, distance from faults, stream power index (SPI), and slope-length (LS) (Table 2).

Table 2 Landslide database of study area

A digital elevation model (DEM) was created from 13 adjacent topographic sheets (digitization of contours at a 10 m interval and points) at 1:25,000-scale. The DEM has a grid size of 10 m with 2,452 rows and 6,768 columns. The DEM was subsequently used to derive the slope degree, slope aspect, altitude, and plan curvature, which are considered important topographic factors for the stability of the terrain. The slope map of the study area was derived from the DEM using the slope function in ILWIS-GIS. The slope values (in degrees) were divided into five classes: (1) flat-gentle slope (<5°), (2) fair slope (5–15°), (3) moderate slope (15–30°), (4) steep slope (30–50°), and (5) very steep slope (>50°) (Fig. 3a). Slope aspect strongly affects hydrologic processes via evapo-transpiration and the direction of frontal precipitation, and thus affects weathering processes and vegetation and root development, especially in drier environments (Sidle and Ochiai 2006). The aspect layer was categorized into nine classes (Fig. 3b): (1) flat, (2) north, (3) northeast, (4) east, (5) southeast, (6) south, (7) southwest, (8) west, and (9) northwest. The altitude does not contribute directly to landslide manifestation, but in relation to other parameters, such as tectonics, erosion–weathering processes, and precipitation, it contributes to landslide manifestation and influences the whole system (Rozos et al. 2008). The altitude map of the study area, with a cell size of 10 × 10 m, was produced from the DEM and classified into six classes: (1) <1,500 m, (2) 1,500–2,000 m, (3) 2,000–2,500 m, (4) 2,500–3,000 m, (5) 3,000–3,500 m, and (6) >3,500 m (Fig. 3c). The curvature represents the morphology of the topography. A positive curvature indicates that the surface is upwardly convex at that cell, and a negative curvature indicates that the surface is upwardly concave at that cell. A value of zero indicates that the surface is flat (Oh and Lee 2010) (Fig. 3d). The normalized difference vegetation index (NDVI) is a measure of surface reflectance and gives a quantitative estimate of vegetation growth and biomass (Hall et al. 1995; Yilmaz 2009). Using Indian Remote Sensing (IRS) satellite images from the LISS-III and panchromatic sensors, the NDVI was taken into consideration as a landslide-related factor (Fig. 3e). The NDVI was calculated from the following equation:

Fig. 3 Landslide conditioning factors of the study area: a slope degree, b slope aspect, c altitude, d plan curvature, e NDVI, f land use, g lithology, h distance from rivers, i distance from roads, j distance from faults, k SPI, l slope-length

$$ {\text{NDVI}} = \left\{ {\left( {{\text{IR}} - {\text{R}}} \right)/\left( {{\text{IR}} + {\text{R}}} \right)} \right\} $$
(1)

where IR is the infrared portion of the electromagnetic spectrum and R is the red portion of the electromagnetic spectrum.
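As a minimal per-pixel sketch of Eq. 1, the following Python function computes the NDVI from two reflectance values. The function name and the handling of pixels with zero total reflectance are assumptions made for illustration; they are not part of the processing chain described in this study.

```python
def ndvi(ir: float, red: float) -> float:
    """Eq. 1 for a single pixel: NDVI = (IR - R) / (IR + R)."""
    total = ir + red
    if total == 0:
        # Assumption: a pixel with no signal in either band is assigned 0.
        return 0.0
    return (ir - red) / total


# Hypothetical reflectance values for a vegetated and a bare pixel.
print(ndvi(0.5, 0.1))
print(ndvi(0.2, 0.2))
```

Values close to +1 correspond to dense vegetation, whereas values near or below zero correspond to bare ground or water.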

The land use layer was prepared using IRS LISS-III and panchromatic remote sensing images. A supervised classification with the maximum likelihood algorithm was applied to create this map. The area is covered by eight land use types: agricultural land, cliff, forest, orchard, range land, settlement area, shrubs, and water body. The land use types are shown in Fig. 3f and summarized in Table 7. The study area is dominantly covered by range land (90.51 %). Lithological features are represented in the geological map of the study area (Fig. 3g), which was derived from the two geological maps of Tehran and east of Tehran at a scale of 1:100,000. This map was prepared by the Geological Survey of Iran (1997), digitized in ILWIS-GIS (integrated land and water information system), and divided into eight groups (Table 1). The drainage system of any area plays an important role in slope stability, particularly with respect to toe cutting and bank erosion (Miller and Sias 1998). The distance from rivers was calculated from the vector river lines using the distance function available in ArcGIS. Six classes of distance from rivers were defined at 100-m intervals (Fig. 3h). In mountainous regions, any disturbance of natural slopes, such as road cutting, may initiate mass movements (Nefeslioglu et al. 2008). Accordingly, in these types of terrain, it can be helpful to consider the proximity of roads as a conditioning parameter in landslide occurrence. The map of distance from roads was also constructed by buffering at intervals of 100 m (Fig. 3i). The distance from faults was extracted from the structural geology map of the study area at 1:100,000-scale. Five buffer classes at 200-m intervals were created around the faults: (1) 0–200 m, (2) 200–400 m, (3) 400–600 m, (4) 600–800 m, and (5) >800 m (Fig. 3j). In this study, two well-known secondary geo-morphometric factors were also evaluated: stream power index and slope-length (Fig. 3k, l). These conditioning factors were derived from the slope map and the specific catchment area (\( A_{\text{S}} \)) (Moore and Burch 1986; Moore et al. 1991).

$$ {\text{SPI}} = \left( {\tan \beta \times A_{\text{S}} } \right) $$
(2)
$$ {\text{LS}} = \left( \frac{A_{\text{S}}}{22.13} \right)^{0.6} \times \left( \frac{\sin \beta}{0.0896} \right)^{1.3} $$
(3)

where β is the slope angle in degrees and \( A_{\text{S}} \) is calculated using the following equation (Hengl et al. 2003):

$$ A_{\text{S}} = \frac{A_{m} \times P^{2}}{\sum L_{i}} $$
(4)

In the above equation, P is the pixel size, \( A_{m} \) is the cumulative drainage fraction from m neighbors, and \( \sum L_{i} \) is the sum of the lengths of the drainage pixels.

Stream power index is a measure of the erosive power of flowing water based on the assumption that discharge is proportional to specific catchment area. Also, the slope-length (LS) factor in the Universal Soil Loss Equation (Eq. 3) is a measure of the sediment transport capacity of overland flow (Moore and Wilson 1992).
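Equations 2 and 3 can be illustrated for a single cell with the short Python sketch below. It assumes that the slope is supplied in degrees and converted to radians before the trigonometric functions are applied, and that the specific catchment area is already available; the function names and input values are hypothetical.

```python
import math


def stream_power_index(slope_deg: float, a_s: float) -> float:
    """Eq. 2: SPI = tan(beta) * A_S for one grid cell."""
    return math.tan(math.radians(slope_deg)) * a_s


def slope_length_factor(slope_deg: float, a_s: float) -> float:
    """Eq. 3: LS = (A_S / 22.13)^0.6 * (sin(beta) / 0.0896)^1.3."""
    sin_beta = math.sin(math.radians(slope_deg))
    return (a_s / 22.13) ** 0.6 * (sin_beta / 0.0896) ** 1.3


# Hypothetical cell: 20 degree slope, specific catchment area of 150.
print(stream_power_index(20.0, 150.0))
print(slope_length_factor(20.0, 150.0))
```

By construction of Eq. 3, a cell with \( A_{\text{S}} = 22.13 \) and \( \sin \beta = 0.0896 \) gives LS = 1, which is a convenient check on any implementation.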

4 Methodology

As mentioned previously, the main purpose of the present study is to investigate and compare landslide susceptibility mapping using three models, namely binary logistic regression, analytical hierarchy process, and statistical index, in the north of the Tehran metropolitan area, Iran. Figure 4 shows the flowchart of the landslide susceptibility analysis and methodology used in this study.

Fig. 4 Flowchart of the methodology

4.1 Binary logistic regression (BLR)

The binary logistic model, as a nonlinear regression model, is a special case of a generalized linear model (Schumacher et al. 1996). The goal of logistic regression is to find the best model to describe the relationship between a dependent variable and multiple independent variables (Ohlmacher and Davis 2003; Lee 2005; Ozdemir 2011). The advantage of logistic regression is that, through the addition of an appropriate link function to the usual linear regression model, the variables may be continuous, discrete, or any combination of both types, and they do not necessarily have normal distributions (Lee and Pradhan 2007). The logistic regression algorithm applies maximum likelihood estimation after transforming the dependent variable into a logit variable representing the natural logarithm of the odds of the dependent event occurring or not (Atkinson and Massari 1998; Bai et al. 2010). The model can be expressed by the following equation (Lee and Pradhan 2007):

$$ P = \left( {\frac{1}{{1 + e^{ - Z} }}} \right) $$
(5)

where P is the estimated probability of landslide occurrence, which varies from 0 to 1 on an S-shaped curve, and Z is the linear combination defined in Eq. 6, whose value varies from −∞ to +∞:

$$ Z = {\text{intercept}} + b_{1} x_{1} + b_{2} x_{2} + b_{3} x_{3} + \cdots + b_{n} x_{n} $$
(6)

where \( b_{1}, b_{2}, b_{3}, \ldots, b_{n} \) are the slope coefficients of the logistic regression model and \( x_{1}, x_{2}, x_{3}, \ldots, x_{n} \) are the independent variables.
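To make Eqs. 5 and 6 concrete, the following Python sketch evaluates the linear combination Z and converts it into a probability for one cell. The intercept, coefficients, and factor values in the usage lines are hypothetical and are not the fitted values of this study; the function names are likewise illustrative.

```python
import math


def linear_combination(intercept: float, coefficients: list, values: list) -> float:
    """Eq. 6: Z = intercept + b1*x1 + b2*x2 + ... + bn*xn."""
    if len(coefficients) != len(values):
        raise ValueError("one value is required per coefficient")
    return intercept + sum(b * x for b, x in zip(coefficients, values))


def landslide_probability(z: float) -> float:
    """Eq. 5: P = 1 / (1 + exp(-Z)), written to avoid overflow for large |Z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


# Hypothetical two-factor model and cell values.
z = linear_combination(1.0, [2.0, -1.0], [0.5, 3.0])
print(z, landslide_probability(z))
```

Because Z can take any real value while P stays between 0 and 1, large positive Z values map to probabilities near 1 and large negative values to probabilities near 0.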

4.2 Analytical hierarchy process (AHP)

The analytical hierarchy process is a theory of measurement for considering tangible and intangible criteria that has been applied to numerous areas, such as decision theory and conflict resolution (Vargas 1990; Yalcin 2008). The AHP applies an eigenvalue technique to pair-wise comparisons. It is based on three principles: decomposition, comparative judgment, and synthesis of priorities (Saaty 1994; Chen et al. 2009). The decomposition principle is applied to structure a complex problem into a hierarchy of clusters, sub-clusters, and so on (Kheirkhah Zarkesh 2005). The comparative judgment principle requires pair-wise comparison of the decomposed elements within a given level of the hierarchical structure with respect to the next higher level. The synthesis principle takes each of the derived ratio-scale local priorities in the various levels of the hierarchy and constructs a composite set of priorities for the elements at the lowest level of the hierarchy (Chen et al. 2009). The AHP provides a fundamental numerical scale, ranging from 1 to 9, to calibrate the quantitative and qualitative performances of priorities (Table 3) (Saaty 2008). The resulting pair-wise comparison matrix is entered into the Expert Choice (EC) software, which calculates the final weight of each conditioning factor together with the consistency ratio (CR). If the CR is less than 10 %, the matrix can be considered to have acceptable consistency (Saaty 1977). Finally, the landslide susceptibility map for the AHP model was constructed using the following equation:

Table 3 The fundamental scale of absolute numbers (Saaty 2008)
$$ \begin{aligned} {\text{LSM}}_{\text{AHP}} & = \, \left( \left( {{\text{slope degree}} \times W_{\text{AHP}}} \right) \, + \, \left( {{\text{slope aspect}} \times W_{\text{AHP}} } \right) \, + \, \left( {{\text{altitude}} \times W_{\text{AHP}} } \right) \right. \\ & \quad + \, \left( {{\text{plan curvature}} \times W_{\text{AHP}} } \right) \, + \, \left( {{\text{NDVI}} \times W_{\text{AHP}} } \right) \, + \, \left( {{\text{land use}} \times W_{\text{AHP}} } \right) \, \\ & \quad + \, \left( {{\text{lithology}} \times W_{\text{AHP}} } \right) \, + \, \left( {{\text{distance from rivers}} \times W_{\text{AHP}} } \right) \, + \, \left( {{\text{distance from roads}} \times W_{\text{AHP}} } \right) \, \\ & \quad \left.+ \, \left( {{\text{distance from faults}} \times W_{\text{AHP}} } \right) \, + \, \left({{\text{SPI}} \times W_{\text{AHP}} } \right) \, + \, \left({{\text{LS}} \times W_{\text{AHP}} } \right) \right)\end{aligned} $$
(7)

where \( W_{\text{AHP}} \) is the weight of each landslide conditioning factor.

4.3 Statistical index (SI)

The statistical index method is a bivariate statistical analysis proposed by van Westen (1997) for landslide susceptibility mapping. A weight value for each categorical unit is defined as the natural logarithm of the landslide density in the categorical unit divided by the landslide density in the entire map (van Westen 1997; Rautela and Lakhera 2000; Cevik and Topal 2003). This method is based on the following equation (van Westen 1997):

$$ W_{\text{SI}} = \ln \left( \frac{E_{ij}}{E} \right) = \ln \left( \frac{L_{ij} / L_{\text{T}}}{P_{ij} / P_{\text{L}}} \right) $$
(8)

where \( W_{\text{SI}} \) is the weight given to a certain class i of parameter j; \( E_{ij} \) is the landslide density within class i of parameter j; E is the total landslide density within the entire map; \( L_{ij} \) is the number of landslides in a certain class i of parameter j; \( P_{ij} \) is the number of pixels in a certain class i of parameter j; \( L_{\text{T}} \) is the total number of landslides in the entire map; and \( P_{\text{L}} \) is the total number of pixels of the entire map.
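The weight in Eq. 8 can be computed per class with a few lines of Python, as sketched below. The counts in the usage lines are hypothetical, and returning `None` for a class without landslides (where the logarithm is undefined) is an assumption made for this illustration, not a rule stated in this study.

```python
import math


def statistical_index(l_ij: int, p_ij: int, l_t: int, p_l: int):
    """Eq. 8: W_SI = ln((L_ij / L_T) / (P_ij / P_L))."""
    if l_ij == 0:
        # Assumption: ln(0) is undefined, so empty classes get no weight.
        return None
    return math.log((l_ij / l_t) / (p_ij / p_l))


# Hypothetical class holding half of the landslides on a tenth of the map.
print(statistical_index(50, 1000, 100, 10000))
```

A class whose landslide density equals the map-wide density receives a weight of zero; denser classes receive positive weights and sparser classes negative weights.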

Yesilnacar (2005) stated that the bivariate statistical method gives a satisfactory combination of (subjective) professional direct mapping and the (objective) data-driven analytical capabilities of a GIS. The main advantage of bivariate statistical procedures is that the professional who executes the analysis determines the factors or combinations of factors used in the assessment.

In the current research, every parameter map was crossed with the landslide inventory map, and the landslide density in each class was calculated. The statistical index map was created by the overlay method in ArcGIS. Positive values of \( W_{\text{SI}} \) indicate a relevant relationship between the presence of the factor class and landslide distribution, and the higher the score, the stronger the relationship. In contrast, negative values of \( W_{\text{SI}} \) mean that the presence of the factor class is not relevant to landslide development.

5 Results

5.1 Binary logistic regression

The binary logistic regression analysis was performed using the Statistical Package for the Social Sciences (SPSS). To process the input data layers, all the conditioning factors and landslides were converted into grid format and then into ASCII data format (Devkota et al. 2013). The ASCII data of each map were exported to SPSS, and the binary logistic regression model was then run to obtain the coefficients of the landslide conditioning factors for numerical and categorical data. The Hosmer and Lemeshow test showed that the goodness of fit of the equation can be accepted, because the significance of the chi-square (1.00) is larger than 0.05. The values of the Cox and Snell \( R^{2} \) (0.009) and Nagelkerke \( R^{2} \) (0.624) showed that the independent variables can explain the dependent variable to some extent.

The β coefficient of each conditioning factor is shown in Table 4. According to Table 4, the normalized difference vegetation index (NDVI), slope-length (LS), distance from rivers, distance from faults, and distance from roads play an important role in the landslide susceptibility mapping of the study area because of their positive β values, which are 1.930, 1.524, 0.042, 0.030, and 0.009, respectively. On the other hand, slope degree, altitude, and stream power index (SPI) have a negative effect on landslide occurrence, with β values of −2.643, −0.023, and −0.050, respectively. For slope aspect, the south (β = 17.566), north (β = 16.336), flat (β = 11.656), southeast (β = 3.448), east (β = 1.547), and northeast (β = 0.470) classes have positive β coefficients. In contrast, the southwest-facing class has a value of −3.174. For the land use factor, the results showed that only the range land type has an effect on landslide susceptibility, with a value of 0.875, while the remaining land use types do not play any role in landslide occurrence in the north of Tehran. For the lithology factor, the logistic regression results show that the lithological formations of groups 3 and 1 (Table 1) have positive β values, whereas groups 2 and 4, with negative values of −8.171 and −4.795, have an inverse effect on landslide susceptibility.

Table 4 Beta coefficients and test statistics of the variables used in the logistic regression equation

5.2 Multi-collinearity in binary logistic regression

An important consideration in regression is the effect of correlation among independent variables. A problem arises when two independent variables are very highly correlated; this problem is called multi-collinearity. Tolerance and the variance inflation factor (VIF) are two important indexes for multi-collinearity diagnosis. Tolerance is \( 1 - R^{2} \) for the regression of a given variable against all the other independent variables, without the dependent variable, and VIF is simply the reciprocal of tolerance. VIF measures the degree to which the interrelatedness of the variable with the other predictor variables inflates the variance of the estimated regression coefficient for that variable. Consequently, the square root of the VIF is the degree to which the collinearity has increased the standard error for that variable. A tolerance of less than 0.20 or 0.10 and/or a VIF of 5 or 10 and above indicates a multi-collinearity problem (O’Brien 2007). According to Table 5, the smallest tolerance and the highest variance inflation factor were 0.496 and 2.018, respectively. Therefore, there is no multi-collinearity among the independent factors in the current research. Finally, the BLR model developed for the study area is given in Eq. 9.

Table 5 The multi-collinearity diagnosis indexes for variables
$$ \begin{aligned} {\text{Z}} & = \left\{ { - 0.016 + \left( {{\text{slope degree}} \times - 2.643} \right) + \left( {\text{slope aspect}} \right) + \left( {{\text{altitude}} \times - 0.023} \right)} \right. \\ & \quad + \left( {{\text{plan curvature}} \times - 11.197} \right) + \left( {{\text{NDVI}} \times 1.930} \right) + \left( {\text{land use}} \right) + \left( {\text{lithology}} \right) \\ & \quad + \left( {{\text{distance from rivers}} \times 0.042} \right) + ({\text{distance from roads}} \times 0.009) \\ & \quad \left. { + ({\text{distance from faults}} \times 0.030) + ({\text{SPI}} \times - 0.050) + ({\text{LS}} \times 1.524)} \right\} \\ \end{aligned} $$
(9)
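The tolerance and VIF screening described in this section can be sketched in Python as follows. The sketch assumes that the \( R^{2} \) of each auxiliary regression is already known, and it uses the stricter pair of thresholds quoted above (tolerance below 0.10 or VIF of 10 and above) as default arguments; the function names are hypothetical.

```python
def tolerance(r_squared: float) -> float:
    """Tolerance = 1 - R^2 of a predictor regressed on the other predictors."""
    return 1.0 - r_squared


def variance_inflation_factor(r_squared: float) -> float:
    """VIF is the reciprocal of tolerance."""
    return 1.0 / tolerance(r_squared)


def has_multicollinearity(tol: float, vif: float,
                          tol_limit: float = 0.10, vif_limit: float = 10.0) -> bool:
    """Flag a predictor if either diagnostic crosses its threshold."""
    return tol < tol_limit or vif >= vif_limit


# Extreme values reported for Table 5: tolerance 0.496 and VIF 2.018.
print(has_multicollinearity(0.496, 2.018))
```

With the more conservative thresholds (tolerance below 0.20 or VIF of 5 and above), the reported extremes are also not flagged.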

5.3 Analytical hierarchy process (AHP)

AHP is a multi-objective, multi-criteria decision-making approach that enables the user to arrive at a scale of preference drawn from a set of alternatives (Saaty 1980). The Expert Choice software package (E.C. Inc. 1995), which is based on the analytic hierarchy process (AHP), was used to estimate the weights of importance of the major objectives (conditioning factors) and their sub-objectives for landslide susceptibility mapping, and to test the consistency ratio (CR) between preferences within individual stakeholder groups. The CR was calculated using the following equation:

$$ {\text{CR}} = \frac{\text{CI}}{\text{RI}} $$
(10)

where RI is the random index, that is, the average consistency index of randomly generated matrices of the same order, as given by Saaty (1980), and CI is the consistency index, which can be expressed as:

$$ {\text{CI}} = \frac{\lambda_{\max} - n}{n - 1} $$
(11)

where \( \lambda_{\max} \) is the largest or principal eigenvalue of the matrix and can be easily calculated from the matrix, and n is the order of the matrix. A CR of 0.1 or less is a reasonable level of consistency (Malczewski 1999). A CR above 0.1 requires revision of the judgments in the matrix because of an inconsistent treatment of particular factor ratings.
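As an illustration of Eqs. 10 and 11, the following Python sketch estimates the principal eigenvalue of a pair-wise comparison matrix by power iteration and derives CI and CR from it. This is not the procedure implemented in Expert Choice; the function names, the example matrix, and the random index values (the figures commonly tabulated for Saaty's scale, given here only for orders 3 to 10) are assumptions made for the example.

```python
# Commonly quoted random index values by matrix order (assumption, not from this study).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def principal_eigen(matrix, iterations=100):
    """Power iteration: returns (lambda_max, priority vector summing to 1)."""
    n = len(matrix)
    weights = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        lam = sum(product)  # weights sum to 1, so this estimates lambda_max
        weights = [value / lam for value in product]
    return lam, weights


def consistency_ratio(matrix):
    """Eq. 11 followed by Eq. 10: CI = (lambda_max - n) / (n - 1), CR = CI / RI."""
    n = len(matrix)
    lam, _ = principal_eigen(matrix)
    ci = (lam - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical 3 x 3 reciprocal comparison matrix on the 1-9 scale.
example = [[1.0, 3.0, 5.0],
           [1.0 / 3.0, 1.0, 3.0],
           [1.0 / 5.0, 1.0 / 3.0, 1.0]]
print(principal_eigen(example)[1], consistency_ratio(example))
```

For a perfectly consistent matrix, \( \lambda_{\max} \) equals n, so CI and CR are both zero; any inconsistency in the judgments pushes \( \lambda_{\max} \) above n.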

Using the AHP method, the levels of influence of the major objectives (conditioning factors) were calculated (Table 6). According to Table 6, lithology and slope-length (LS) have the greatest and the least influence on landslide occurrence, with values of 0.21 and 0.02, respectively. The other factors, namely slope degree, slope aspect, altitude, plan curvature, NDVI, land use, distance from rivers, distance from roads, distance from faults, and SPI, have weight values of 0.15, 0.07, 0.06, 0.13, 0.08, 0.11, 0.05, 0.06, 0.04, and 0.03, respectively. In the current study, the CR is 0.0676, which indicates a reasonable level of consistency in the pair-wise comparisons.

Table 6 Weight of each conditioning factor obtained by the analytical hierarchy process

The correlation between the landslide locations and the sub-objectives of the conditioning factors is presented in Fig. 5. The values given in Fig. 5 show that all CR values are less than 0.1, which indicates that the preferences used to produce the comparison matrices are consistent. The landslide susceptibility map based on the analytical hierarchy process was produced using the following equation:

Fig. 5 Results of AHP for sub-objectives of conditioning factors in expert choice (EC) software

$$ \begin{aligned} {\text{LSM}}_{\text{AHP}} & = \left( {\text{slope degree}} \times 0.15 \right) + \left( {\text{slope aspect}} \times 0.07 \right) + \left( {\text{altitude}} \times 0.06 \right) \\ & \quad + \left( {\text{plan curvature}} \times 0.13 \right) + \left( {\text{NDVI}} \times 0.08 \right) + \left( {\text{land use}} \times 0.11 \right) \\ & \quad + \left( {\text{lithology}} \times 0.21 \right) + \left( {\text{distance from rivers}} \times 0.05 \right) + \left( {\text{distance from roads}} \times 0.06 \right) \\ & \quad + \left( {\text{distance from faults}} \times 0.04 \right) + \left( {\text{SPI}} \times 0.03 \right) + \left( {\text{LS}} \times 0.02 \right) \end{aligned} $$
(12)
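Equation 12 is a weighted linear combination, which can be sketched for a single cell in Python as follows. The factor weights are those of Eq. 12; the dictionary keys, the function name, and the class ratings in the usage lines are hypothetical and stand in for the sub-objective ratings of each factor class.

```python
# Factor weights taken from Eq. 12.
AHP_WEIGHTS = {
    "slope degree": 0.15, "slope aspect": 0.07, "altitude": 0.06,
    "plan curvature": 0.13, "NDVI": 0.08, "land use": 0.11,
    "lithology": 0.21, "distance from rivers": 0.05,
    "distance from roads": 0.06, "distance from faults": 0.04,
    "SPI": 0.03, "LS": 0.02,
}


def lsm_ahp(class_ratings: dict) -> float:
    """Weighted sum of the class ratings of one cell over all twelve factors."""
    missing = set(AHP_WEIGHTS) - set(class_ratings)
    if missing:
        raise KeyError(f"missing ratings for: {sorted(missing)}")
    return sum(weight * class_ratings[name] for name, weight in AHP_WEIGHTS.items())


# Hypothetical cell in which every factor class has a rating of 0.5.
print(lsm_ahp({name: 0.5 for name in AHP_WEIGHTS}))
```

Applying the same function to every cell of the stacked factor grids yields the susceptibility surface that is subsequently classified into susceptibility zones.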

5.4 Statistical index (SI)

The spatial relationship between each landslide conditioning factor and landslide occurrence obtained with the statistical index model is shown in Table 7. For slope degree, the 15°–30° class has the highest SI value and is the only class with a positive value (0.21); the other classes have negative values. In addition, our observations showed that the statistical index decreases as the slope degree increases. For slope aspect, the north-, northeast-, and east-facing classes have positive SI values (0.24, 0.64, and 0.27, respectively), which means that the landslide probability is higher in these classes. The SI values for altitude clearly show that the 2,500–3,000 and 3,000–3,500 m ranges have the greatest effect on landslide occurrence. However, landslide susceptibility increases with altitude only up to a certain range (2,500–3,000 m) and then decreases. For plan curvature, the SI value is positive (0.003 and 0.03) for both concave and convex slopes, whereas the other slope shape (flat) has a negative value; therefore, there is no indication that flat shapes favor instability. The NDVI factor shows that the 0.05–0.1 and <−0.001 ranges are relatively favorable (highly susceptible) for landslide occurrence. It can be said that the presence of vegetation has a varied effect on slope instability. For land use, a positive SI value is seen only for range land, which covers almost 90.5 % of the study area. Regarding the relationship between landslides and lithology, the SI values are positive for groups 3, 5, 7, and 8. Group 7 is very susceptible to landslide occurrence, with a value of 0.86, because its lithological formations consist of basic marl, marly limestone, siltstone, shale, and clay. Regarding distance from rivers, the 100–200 and 200–300 m classes have positive SI values (0.12 and 0.31, respectively), indicating a very high probability of landslide occurrence. Proximity to roads has little impact on landsliding. The <100 m and 100–200 m classes show very low or no susceptibility to landsliding compared with the other classes. On the other hand, three classes of distance from roads favor landsliding: 200–300 m (SI = 0.06), 300–400 m (SI = 0.16), and >400 m (SI = 0.13). Although this appears to go against the visible pattern of more failures close to roads, it is likely due to a few large landslides where no roads are present. As a result, the large slides increase the percentage of landslide pixels occurring far from roads. For distance from faults, the 600–800 and >800 m intervals have SI weights of 0.21 and 0.08, respectively. It can be observed that as the distance from faults increases, the landslide frequency generally decreases. We attribute this to the distribution of hard and very susceptible lithological formations near and far from the faults in the study area. The drainage density class <0.0018 km/km² has an SI value of 0.20, whereas the 0.0027–0.013 class has an SI value of −0.69. It can be observed that as the drainage density increases, the landslide frequency generally decreases. The relation between stream power index and landslide probability showed that the 600–900 class has the highest SI value (0.37), and for the compound topographic index, the 8–10 class shows a high SI value (0.223). Similarly, for slope-length, the highest SI value was obtained for the 60–90 interval. These results for the secondary topographic attributes (SPI, CTI, and LS) show that these classes are very susceptible to landslide occurrence.

Table 7 Spatial relationship between each landslide conditioning factor and landslides by the statistical index model

Finally, the landslide susceptibility map of the statistical index (SI) model was created using the following equation:

$$ \begin{aligned} {\text{LSM}}_{\text{SI}} & = \left( {\left( {W_{\text{SI}} {\text{slope degree}}} \right) + \left( {W_{\text{SI}} {\text{slope aspect}}} \right) + \left( {W_{\text{SI}} {\text{altitude}}} \right)} \right. \\ & \quad + \left( {W_{\text{SI}} {\text{plan curvature}}} \right) + \left( {W_{\text{SI}} {\text{NDVI}}} \right) + \left( {W_{\text{SI}} {\text{land use}}} \right) + \left( {W_{\text{SI}} {\text{lithology}}} \right) \\ & \quad + \left( {W_{\text{SI}} {\text{distance from rivers}}} \right) + \left( {W_{\text{SI}} {\text{distance from faults}}} \right) \\ & \quad \left. { + \left( {W_{\text{SI}} {\text{distance from roads}}} \right) + \left( {W_{\text{SI}} {\text{SPI}}} \right) + \left( {W_{\text{SI}} {\text{LS}}} \right)} \right) \\ \end{aligned} $$
(13)
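
As a minimal sketch of how the class weights summed in Eq. 13 can be obtained, the code below assumes the usual bivariate definition of the statistical index (van Westen 1997): the natural logarithm of the landslide density in a class divided by the landslide density of the whole map. The function names and the pixel counts are hypothetical and are not taken from Table 7.

```python
import math


def statistical_index(class_pixels, landslide_pixels):
    """Return ln(class landslide density / overall landslide density) per class."""
    total_pixels = sum(class_pixels.values())
    total_landslides = sum(landslide_pixels.values())
    overall_density = total_landslides / total_pixels
    weights = {}
    for name, n_pixels in class_pixels.items():
        n_slides = landslide_pixels.get(name, 0)
        if n_slides == 0:
            # ln(0) is undefined; how to treat empty classes is left to the analyst.
            weights[name] = None
        else:
            weights[name] = math.log((n_slides / n_pixels) / overall_density)
    return weights


def lsm_si(cell_classes, si_tables):
    """Sum the SI weight of the class that each factor takes in one grid cell."""
    return sum(si_tables[factor][cls] for factor, cls in cell_classes.items())


# Hypothetical pixel counts for three slope classes (not from Table 7).
slope_pixels = {"0-15": 4000, "15-30": 4000, ">30": 2000}
slope_landslides = {"0-15": 30, "15-30": 60, ">30": 10}
slope_si = statistical_index(slope_pixels, slope_landslides)

# One-factor example of the sum in Eq. 13 for a cell lying in the 15-30 class.
cell_value = lsm_si({"slope": "15-30"}, {"slope": slope_si})
```

In this hypothetical example, only the 15-30 class has a landslide density above the map-wide density, so it is the only class with a positive SI value.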

In this research, the three landslide susceptibility maps, produced by binary logistic regression, analytical hierarchy process, and statistical index (Fig. 6a–c), were prepared in ArcGIS and divided into four classes according to the natural break classification method (Falaschi et al. 2009; Bednarik et al. 2010; Erner et al. 2010; Constantin et al. 2011; Xu et al. 2012a, b; Xu and Xu 2012; Pourghasemi et al. 2012b, c, d).

Fig. 6

a Landslide susceptibility map based on binary logistic regression (BLR). b Landslide susceptibility map based on analytical hierarchy process (AHP). c Landslide susceptibility map based on statistical index (SI)

6 Verification of the landslide susceptibility maps

To determine the accuracy of the three landslide susceptibility models (binary logistic regression, analytical hierarchy process, and statistical index) used in this study, two verification methods were used: the receiver operating characteristic (ROC) curve and the frequency ratio plot. ROC curve analysis is a common method to assess the accuracy of a diagnostic test (Egan 1975). The ROC curve is a graphical representation of the trade-off between the false-negative and false-positive rates for every possible cutoff value. By convention, the plot shows the false-positive rate (FPR) on the X axis (Eq. 14) and the true-positive rate (TPR) on the Y axis (Eq. 15).

$$ X = {\text{FPR}} = 1 - \left[ {\frac{\text{TN}}{{{\text{TN}} + {\text{FP}}}}} \right] $$
(14)
$$ Y = {\text{TPR}} = \left[ {\frac{\text{TP}}{{{\text{TP}} + {\text{FN}}}}} \right] $$
(15)

The area under the ROC curve (AUC) characterizes the quality of a forecast system by describing the system’s ability to anticipate the correct occurrence or non-occurrence of pre-defined “events.” The best method has the curve with the largest AUC; the AUC varies from 0.5 to 1.0. If the model does not predict landslide occurrence any better than chance, the AUC equals 0.5. An AUC of 1 represents perfect prediction. The quantitative–qualitative relationship between AUC and prediction accuracy can be classified as follows: 0.9–1, excellent; 0.8–0.9, very good; 0.7–0.8, good; 0.6–0.7, average; and 0.5–0.6, poor (Yesilnacar 2005). The AUC values of the ROC curves for the BLR, AHP, and SI models were found to be 0.8520, 0.8037, and 0.7570, respectively (Fig. 7). Hence, it is concluded that the binary logistic regression model employed in this study showed very good accuracy in predicting the landslide susceptibility of the study area.
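
The ROC construction of Eqs. 14 and 15 and the AUC rating can be sketched as follows. This is an illustrative implementation, not the procedure used to produce Fig. 7: the function names, the trapezoidal integration, and the six example cells are assumptions, and each rating band is assumed to include its lower bound.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs for every cutoff, following Eqs. 14 and 15."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both landslide and non-landslide cells are required")
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        # A cell is predicted as landslide when its score reaches the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        tn = negatives - fp
        fn = positives - tp
        points.append((1 - tn / (tn + fp), tp / (tp + fn)))
    return points


def auc(points):
    """Trapezoidal area under a ROC curve given as ordered (FPR, TPR) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def accuracy_class(auc_value):
    """Qualitative label for an AUC, using the bands quoted from Yesilnacar (2005)."""
    # Assumption: values under 0.5 are not covered by the quoted scale.
    for lower, label in [(0.9, "excellent"), (0.8, "very good"),
                         (0.7, "good"), (0.6, "average"), (0.5, "poor")]:
        if auc_value >= lower:
            return label
    return "outside quoted scale"


# Hypothetical susceptibility scores and observed labels (1 = landslide).
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
example_labels = [1, 1, 0, 1, 0, 0]
example_auc = auc(roc_points(example_scores, example_labels))
```

Applied to the values reported above, `accuracy_class(0.8520)` and `accuracy_class(0.8037)` both fall in the "very good" band, while `accuracy_class(0.7570)` falls in the "good" band.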

Fig. 7

Comparison of ROC curve (AUC) of landslide susceptibility maps

The landslide susceptibility analyses were also validated using a frequency ratio plot. For this purpose, all of the landslide grid cells were overlaid on the landslide susceptibility zones (low, moderate, high, and very high) in GIS, and the frequency ratio was calculated for each susceptibility zone (Pourghasemi et al. 2012d). In an ideal landslide susceptibility map, the frequency ratio value increases from the low to the very high susceptibility zone (Pradhan and Lee 2010a, b; Pourghasemi et al. 2012c). A plot of the frequency ratio for the four landslide susceptibility classes of the three landslide susceptibility models is shown in Fig. 8. The results showed that the frequency ratio gradually increases from the low to the very high susceptibility zone in the study area.
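
The zone-wise check described above can be sketched as follows, assuming the common definition of the frequency ratio: the share of landslide cells falling in a zone divided by the share of the study area covered by that zone. The function names and the cell counts are hypothetical and are not the values plotted in Fig. 8.

```python
ZONES = ["low", "moderate", "high", "very high"]


def frequency_ratio(zone_cells, zone_landslide_cells):
    """Share of landslide cells in each zone divided by the zone's share of area."""
    total_cells = sum(zone_cells.values())
    total_landslides = sum(zone_landslide_cells.values())
    return {
        zone: (zone_landslide_cells.get(zone, 0) / total_landslides)
        / (zone_cells[zone] / total_cells)
        for zone in zone_cells
    }


def increases_with_susceptibility(ratios, order=ZONES):
    """True when the ratio rises strictly from the low to the very high zone."""
    values = [ratios[zone] for zone in order]
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical counts of grid cells and landslide cells per zone.
cells = {"low": 4000, "moderate": 3000, "high": 2000, "very high": 1000}
landslides = {"low": 10, "moderate": 20, "high": 30, "very high": 40}

fr = frequency_ratio(cells, landslides)
ideal_pattern = increases_with_susceptibility(fr)
```

A ratio above 1 means that a zone contains more landslide cells than its share of the area would suggest; in the hypothetical counts this holds for the high and very high zones only.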

Fig. 8

Frequency ratio plots of four landslide susceptibility zones of landslide susceptibility models

7 Discussion and conclusion

Landslide susceptibility maps provide fundamental knowledge of the causes of, and the factors affecting, landslide occurrence and can be effective in hazard management and mitigation. In the present research, we compared the results of landslide susceptibility mapping using three different models, namely BLR, SI, and AHP, in the north of the Tehran metropolitan area, Iran. Of the 528 identified landslide locations in the study area, 370 (70 %) were used as training data and the remaining 158 (30 %) were used for validation. For landslide susceptibility zonation, twelve conditioning factors were considered: slope degree, slope aspect, altitude, plan curvature, normalized difference vegetation index, land use, lithology, distance from rivers, distance from roads, distance from faults, stream power index, and slope-length. For validation of the generated landslide susceptibility maps in ArcGIS, receiver operating characteristic (ROC) curves and a frequency ratio plot were used.

According to the obtained area under the curve (AUC), the binary logistic regression model has higher prediction performance (85.20 %) than the statistical index (80.37 %) and analytical hierarchy process (75.70 %) models. The frequency ratio plot also showed that the frequency ratio value gradually increases from the low to the very high susceptibility zone in the study area, which validated our results.

Meanwhile, several investigators found relatively similar overall accuracy rates among models such as FR, AHP, LR, and ANN (Jin et al. 2010; Park et al. 2012); conditional probability (CP), LR, ANN, and SVM (Yilmaz 2010); MCDA, SVM, and LR (Kavzoglu et al. 2013); heuristic and bivariate statistical models (Bijukchhen et al. 2012); and probabilistic, bivariate, and multivariate models (Pradhan and Youssef 2010; Tien Bui et al. 2011a; Kevin et al. 2011; Ozdemir and Altural 2012; Shahabi et al. 2012). On the other hand, Ayalew et al. (2005) and Esmali Ouri and Amirian (2009) stated that the AHP model was better than logistic regression in Sado Island, Japan, and in Iran, respectively. Yalcin (2008) reported that the AHP method gave a more realistic landslide susceptibility map than the bivariate statistical models (Wi and Wf). In another study, Yalcin et al. (2011) used frequency ratio, AHP, bivariate statistics, and logistic regression for landslide susceptibility mapping in Trabzon, NE Turkey. They found that the weighting factor (Wf) method predicts better than the frequency ratio model, AHP, the statistical index (Wi), and the logistic regression model.

Vahidnia et al. (2009) used four models, namely weights of evidence (WoE), AHP, ANN, and generalized linear regression (GLM), for landslide hazard calculation in Mazandaran Province, Iran. The estimated accuracy ranged from 80 to 88 %. It was inferred that applying WoE to rate the map categories and ANN to weight the effective factors results in the maximum accuracy.

The main advantage of logistic regression (LR) over simple multiple regression is that LR allows the use of binary dependent variables in landslide susceptibility mapping. Although logistic regression is a commonly applied quantitative susceptibility mapping method, it has the major limitation of yielding average parameters for the whole study area (Fotheringham et al. 2001; Erner et al. 2010), whereas the actual relationships may differ locally in different parts of the study area.

van Westen et al. (2003) stated that the bivariate statistical method gives a satisfactory combination of the (subjective) professional direct mapping and the (objective) data driven analytical capabilities of a GIS. The main advantage of bivariate statistical procedures is that the professional, who executes the analysis, determines the factors or combinations of factors used in the assessment. This enables the introduction of expert opinion into the process. Bivariate statistics are a useful tool in the assessment of landslide susceptibility, but can best be used as a supporting tool to make quantitative estimations of the importance of the various factors involved.

The general purpose of the AHP is to support decision makers in selecting the best alternative from the various possible choices in the presence of multiple priorities (Jankowski 1995). The AHP model is conventionally based on a rating system provided by expert opinion. Expert opinion is very useful in solving complex problems like landslides. However, opinions may differ to some extent from one expert to another and thus may be subject to cognitive limitations, uncertainty, and subjectivity. Data-driven methods are also powerful in landslide susceptibility mapping and contain less subjectivity. Therefore, it is important to analyze the spatial relationship between the landslide conditioning factors and landslide locations. The statistical-based models (bivariate and multivariate) allow users to rank the importance of the parameters before applying the landslide susceptibility analysis.

As a final conclusion, these maps can provide very useful information for planners, decision makers, and engineers in slope management and land use planning in landslide areas, and we believe that the results obtained from our study provide a considerable contribution to the landslide literature.