Introduction

Landslides are natural geologic processes that cause different types of damages and effects on human life and human-made structures (Goetz et al. 2011). Mihir et al. (2014) stated that approximately 375 fatal landslides occur globally each year, with around 4600 people killed annually. Previous studies have highlighted the increasing costs of landslide damages, which can be primarily attributed to increased human occupation and activities in vulnerable environments. Landslide susceptibility maps are a prerequisite for planners, decision makers, and engineers in slope management and land use planning for this problem to be effectively addressed. In recent decades, progress has been made in landslide susceptibility mapping because of the rapid development of remote sensing techniques, modeling tools and Geographical Information Systems (GIS) (Weirich and Blesius 2007). For landslide susceptibility mapping, landslide occurrences depend generally on the complex interactions among a large number of conditioning factors. Therefore, investigations into landslide susceptibility require an assessment of the relationships between various terrain conditions and landslide occurrence. An experienced Earth scientist has the capabilities to assess the overall slope conditions and to extract the critical parameters. However, an objective procedure that can be automated and applied over large regions is often desired to quantitatively support the slope instability assessment. This procedure requires the evaluation of the spatially varying terrain conditions, as well as the spatial representation of landslides.

Different techniques have previously been used for landslide susceptibility mapping. Many studies have evaluated landslide susceptibility using GIS and frequency ratio models (Pourghasemi et al. 2012a; Nourani et al. 2014; Park and Lee 2014), bivariate and multivariate statistical methods (Pradhan 2010; Lee et al. 2012; Poiraud 2014; Umar et al. 2014; Youssef et al. 2014a; Fiorucci et al. 2015; Youssef 2015), index of entropy (Bednarik et al. 2010; Constantin et al. 2011; Jaafari et al. 2014; Park 2014; Youssef et al. 2014b), analytical hierarchy process (Hasekiogullari and Ercanoglu 2012; Park et al. 2013; Pourghasemi et al. 2014a; Youssef 2015), fractal theory (Li et al. 2012; Pourghasemi et al. 2014b), support vector machine (Pradhan 2013; Li and Kong 2014; Ren et al. 2014; Wu et al. 2014; Su et al. 2015), artificial neural networks (Pradhan and Buchroithner 2010; Song et al. 2012; Wu et al. 2013; Nourani et al. 2014; Chen et al. 2014; Moosavi and Niazi 2015; Wang et al. 2015; Dou et al. 2015), fuzzy approaches (Ercanoglu and Gokceoglu 2004; Akgun et al. 2012; Osna et al. 2014; Bortoloti et al. 2015; Meten et al. 2015), object-based image analysis (Martha et al. 2012), and some hybrid methods, including the neuro-fuzzy model (Sezer et al. 2011; Pradhan 2013; Lee et al. 2015a, b). Recently, some new methods and techniques have been used for landslide susceptibility mapping such as random forest (RF) (Miner et al. 2010; Stumpf and Kerle 2011; Vorpahl et al. 2012; Catani et al. 2013; Micheletti et al. 2014; Youssef et al. 2015) and Bayesian logistic regression (Das et al. 2012). Bivariate statistical models such as frequency ratio are easy to run in a GIS (Oh et al. 2011b), but have a disadvantage that equal weights are assumed for different effective factors. In the weights of evidence (WofE) method, the contrast value is the rating of each class of each factor that influences landslide occurrence. This value is positive for a positive spatial association and negative for a negative spatial association (Lee and Choi 2004). Logistic regression determines the type of relation between the landslide occurrence and effective factors, which can be positive or negative. The index of entropy determines the weight of each effective factor on the occurrence of landslides. ANN needs an optimal network architecture design (number of hidden layers and units) and the setting of the optimal cost and weight decay values, which makes it complex to use (Rodriguez-Galiano and Chica-Rivas 2012). The evidential belief function (EBF) model supports a series of mass functions, including belief, disbelief, uncertainty and plausibility. Thus, the results of the model present the quantitative relationships between landslide occurrence and effective factors by modeling the degree of uncertainty (Pourghasemi and Beheshtirad 2014). Saaty’s AHP is a widely used multi-criteria decision making technique in the field of natural resources and environmental management. In general, the RF algorithm has several benefits in comparison with other commonly used multivariate regression or classification methods. First, it does not require prior assumptions about the distribution of the explanatory variables; secondly, it allows for the mixed use of categorical and numerical variables without resorting to the use of indicator (or dummy) variables, and third, it is capable of considering interactions and nonlinear relationships between variables. Although many different models and techniques for landslide susceptibility mapping have been proposed and implemented, no agreement has so far been reached on which model and techniques are the best for landslide susceptibility mapping (Wang and Sassa 2005).

Thus, the main purpose of this research is to determine the relationship between landslide occurrences and conditioning factors, using an EBF model, landslide susceptibility mapping using a GIS-based RF model, and evaluation of the combined method in the Safarood Watershed, Mazandaran Province, northern Iran. The main advantages of RF are that it is one of the most accurate learning algorithms available, and that it runs efficiently on large databases (Lee et al. 2015a, b). The accuracy and variable importance values are generated automatically, an important benefit in landslide susceptibility assessment. Moreover, overfitting is not a problem in the mentioned algorithm, and neither is it very sensitive to outliers in the training dataset (Breiman 2001). Therefore, the RF algorithm, coupled with EBF, can be a good approach for other parts of the country, or even other countries for land use planning and decision making in landslide risk areas.

Study area

The Safarood Watershed is located in the western part of Mazandaran Province, Iran. The considered site lies between the latitudes 36°47′29″N to 36°57′35″N, and the longitudes 50°24′16″E to 50°41′39″E (Fig. 1). It covers an area of approximately 162.6 km2. The topographical elevation of study area varies between 20 and 3540 m a.s.l. The slope angles of the area range from 0° to as much as 70°. A rainfall map was prepared using the data collected at six rainfall stations (Ramsar, Gavermak, Mianlat, Abe Madani-e-Nidshet, Gardeh Poshtehsar, and Zarodal) for the years 1975–2012 (annual mean rainfall). The mean annual rainfall is around 1,220 mm between September and December, according to the Islamic Republic of Iran Meteorological Organization (IRIMO 2012), whose data also show the temperature in the Safarood area ranging between 21 and 30 °C. The study area is covered by various types of lithological formations, including Quaternary, Jurassic, Triassic, and Late Permian types of formations (Table 1). The Quaternary deposit covers 15.56 % of the study area and includes Qal (recent loose alluviums), Q d2 (undivided deltaic alluviums), Q m2 (marine deposits), Q2t1 g (old gravelly terrace), Q s2 (scree and rock falls), and Q tg2 2 (young gravelly terrace). The j1c-2 (dark gray polymictic conglomerate with abundant quartzite pebbles, with inter layers of siltstone, sandstone and coal), TR d1e (light gray-to-cream dolomite) and La (laterite) covers about 80 % of the study area (GSI 1997). Its folded mountain system regionally strikes northwest to the southeast. The Binaksar fault is one of the main faults in the study area. Most of the area (64 %) is covered by forest. Other land use classes are dry-farming and forest, forest, forest and orchard, irrigation agriculture, irrigation agriculture and orchard, orchard, range, tea land, and residential area. Generally, mountainous features, high tectonic activity, and geological and climatologically variety caused to the Iranian plateau being susceptible to various kinds of landslides, especially in the Alborz active mountainous belts. Alborz is a mountain range in northern Iran that stretches from the border of Azerbaijan along the western and entire southern coast of the Caspian Sea, and finally runs northeast and merges into the Aladagh Mountains in the northern parts of Khorasan (https://en.wikipedia.org/wiki/Alborz). According to earlier studies by the Iranian Landslide Working Party (ILWP 2007), the highest frequency of landslide occurrence in Iran is in the Mazandaran Province. The results showed that more than 520 landslides occurred between 2005 and 2013 and caused extensive damages. Evidence for that includes the destruction of some 438 ha forest lands, 1264 ha agriculture lands and gardens, 12 km roads, as well as more than 50 villages (http://irna.ir/fa/NewsPrint.aspx?ID=80365819). The Safarood Watershed is one of the most important ecotourism areas in Iran and the Mazandaran Province, because of its climate conditions, geomorphological attributes, and ecological landscape. Despite the mentioned potential (important ecotourism area), deforestation, road construction, land use changes, and expansion of villas along high slopes have led to increasing landslide occurrence in the study area. The area is highly prone to landslide and this susceptibility needs to be assessed for a safer and more optimized use of the land.

Fig. 1
figure 1

Map of the study area, with indication of past landslides

Table 1 Types of geological formations in the study area

Methodology

The flowchart of the methodology used in this study is shown in Fig. 2, and consists of four phases: (1) data integration and analysis, (2) determination of the relationship between landslide occurrence and conditioning factors using the EBF model, (3) pairing EBF values to each map for running the RF model, (4) landslide susceptibility modeling using RF, and (5) validation of the landslide susceptibility map using the ROC-AUC curve (success rate and prediction rate curves).

Fig. 2
figure 2

Flowchart of the methodology used in this study

Data integration and analysis

Generally, data collection and construction of a database of conditioning factors are the most important parts of the landslide modeling process (Ercanoglu and Gokceoglu 2002). Firstly, the landslides were identified and localized by interpreting aerial photographs, and through extensive field surveys. From the total of 153 landslides that occurred between 2005 and 2013, 70 % (105 cases) were used in the model building, while the remaining 30 % (48 cases) were used for validation according to random partition algorithm (Lee and Pradhan 2007; Bednarik et al. 2010; Ozdemir and Altural 2013; Pradhan 2013; Pourghasemi et al. 2014a; Youssef et al. 2014a, b). Failure modes of the landslides identified in the study area were determined according to the landslide classification system proposed by Varnes (1978). Most of the landslides are shallow rotational, with minimum and maximum dimensions of 350 m2 and 150,420 m2, respectively. Some examples of landslides identified in the Safarood Watershed are shown in Fig. 3. For landslide susceptibility mapping in the study area, eleven conditioning factors were considered: slope angle, slope aspect, altitude, plan curvature, profile curvature, topographic wetness index, lithology, land use, distance from rivers, distance from roads, and distance from faults. One of the most important factors in landslide susceptibility mapping is topography. In this study, a digital elevation model (DEM) was created by digitizing contour lines (20 m interval) and survey base points. We used the ArcGIS TIN (Triangular Irregular Network) module, and then converted the TIN to Raster (pixel size 20 m). The contour lines and points were prepared by Iran’s National Cartographic Center (NCC), and we only created the DEM from these layers. Using the DEM, slope angle, slope aspect, altitude, plan curvature, profile curvature, and topographic wetness index were extracted (Fig. 4a–e). Slope angle is one of the parameters that influence landslide occurrences. The slope map of the study area was derived from the DEM and divided into five classes: <5°, 5° to 15°, >15° to 30°, >30° to 50°, and >50° (Fig. 4a). Slope aspect is another factor that correlated with the amount of solar energy received by the area. Therefore, the slope aspect layer was selected as one of the landslide-related factors, and was categorized into nine classes: (1) flat, (2) north, (3) northeast, (4) east, (5) southeast, (6) south, (7) southwest, (8) west, and (9) northwest (Fig. 4b). The altitude map was extracted from the DEM and classified into eight equal interval classes (Lee and Pradhan 2007; Pourtaghi et al. 2014): (1) <100 m, (2) 100–500 m, (3) >500-1000 m, (4) >1000–1500 m, (5) >1500–2000 m, (6) >2000–2500 m, (7) >2500–3000 m, and (8) >3000 m (Fig. 4c). Plan curvature is described as the curvature of a contour line formed by intersecting a horizontal plane with the surface. The influence of plan curvature on the slope erosion processes is the convergence or divergence of water during downhill flow (Ercanoglu and Gokceoglu 2002). The plan curvature map was produced using ArcGIS 9.3 and was classified into three categories: concave, flat, and convex (Fig. 4d). The profile curvature (also called slope profile curvature) is a primary topographic attribute. It shows the flow acceleration, deposition (positive values), and erosion (negative values) rate (Yesilnacar 2005). In addition, the profile curvature is important because it controls the velocity change of mass flowing down the slope (Talebi et al. 2007). The map was created in ArcGIS 9.3 and classified to three categories (Fig. 4e). The topographic wetness index (TWI) is defined as ln (A/tanβ), where A is upslope contributing area and β is the slope angle (Beven and Kirkby 1979). TWI has been extensively used to describe the effect of topography on the location and size of saturated source areas of runoff generation (Beven 1997; Beven and Freer 2001). This index is commonly used to characterize the spatial distribution of soil moisture; therefore, it is used in landslide susceptibility mapping (Ozdemir 2011a, b; Oh et al. 2011a; Pourtaghi et al. 2014; Naghibi et al. 2015). The TWI map was produced using the System for Automated Geoscientific Analyses (SAGA-GIS) (Fig. 4f). The lithology map was obtained using a 1:100,000-scale geological map (GSI 1997), and the lithological units were classified into twelve groups according to lithology and its susceptibility to landslide occurrence. (Fig. 5; Table 1). The land use/land cover changes play an important role in the study of environmental issues, especially in landslide assessment (Mallick et al. 2014). The land use map was created using Landsat-7 imagery. To create the land use map, a supervised classification using the maximum likelihood algorithm was applied and verified by field survey. Nine land use classes were drawn such as dry-farming and forest (DF), forest (F), forest and orchard (FO), irrigation agriculture (I), irrigation agriculture and orchard (IO), orchard (O), range (R), tea land (T), and residential area (U) (Fig. 6). Using the topographical database, the distance from rivers and distance from roads factors were calculated in ArcGIS 9.3, respectively (Fig. 7a–b). The distance from the faults was calculated at 100 m intervals, using the geological map as well (Fig. 7c).

Fig. 3
figure 3

Field photographs of some identified landslides in the Safarood Watershed

Fig. 4
figure 4

Topographical variables maps of the study area: a slope angle, b slope aspect, c altitude, d plan curvature, e profile curvature, f topographic wetness index (TWI)

Fig. 5
figure 5

The lithology map of the study area and the landslides location used for the model building

Fig. 6
figure 6

The land use map of the study area. DF dry-farming and forest, F forest, FO forest and Orchard, I irrigation agriculture, IO irrigation agriculture and Orchard, O Orchard, R range, T tea land, U urban

Fig. 7
figure 7

Buffer maps: a distance from rivers, b distance from roads, c distance from faults

For the classification of the conditioning factors, different methods were used, such as equal interval, natural break, and common standards such as standard division. Finally, for the application of the RF model, all conditioning factors were converted to a raster grid with 20 m × 20 m pixel size in ArcGIS 9.3 and R geostatistical packages. All the maps are in UTM (Universal Transverse Mercator) coordinate system and use the WGS84 datum (WGS84-UTM-Zone39 N).

Evidential belief function (EBF)

The Dempster–Shafer theory (Dempster 1968; Shafer 1976) is a bivariate statistical method that was applied to detect the spatial integration according to the rule of combination (Carranza 2009; Althuwaynee and Pradhan 2014). The EBF theory is based on the Dempster–Shafer rule and consists of four functions: degrees of belief (Bel), disbelief (Dis), uncertainty (Unc), and plausibility (Pls). Bel and Pls are lower and upper probabilities, respectively (Dempster 1968). Thus, Pls may be greater than or equal to Bel. Conversely, Unc and Pls are equal to Pls–Bel and 1–Unc–Bel, respectively. The degree of uncertainty indicates ignorance (or doubt) of one’s belief in the proposition based on a given evidence, while the degree of disbelief is the belief that the proposition is false according to a specific evidence. If Unc = 0, then Bel = Pls. In general, Bel + Unc + Dis = 1. Dempster (1968) stated that Bel, Unc, and Dis are the functions used to integrate evidences according to combination rules. Hence, the degree of belief \(\left( {{\text{Bel}}_{{C_{ij} }} } \right)\) is shown by Eqs. 1 and 2 (Carranza and Hale 2002):

$${\text{Bel}}_{{C_{ij} }} = \frac{{W_{{C_{ij} D}} }}{{\sum\nolimits_{j = 1}^{m} {W_{{C_{ij} D}} } }}$$
(1)
$$W_{{C_{ij} D}} = \frac{{N\left( {C_{ij} \cap D} \right)/N\left( {C_{ij} } \right)}}{{N\left( D \right) - N\left( {C_{ij} \cap D} \right)/N\left( T \right) - N\left( {C_{ij} } \right)}}$$
(2)

where \(N\left( {C_{ij} \cap D} \right)\) is density of landslide pixels that are given in D, \(N\left( {C_{ij} } \right)\) is the total density of landslides that have occurred in the study area, N(D) is the density of pixels in D, and N(T) is the density of pixels in the whole study area T.

Conversely, estimation of the degree of disbelief (\({\text{Dis}}_{{C_{ij} }}\)) is given by Eqs. 3 and 4 (Carranza et al. 2005; Nampak et al. 2014):

$$Dis_{{C_{ij} }} = \frac{{W_{{C_{ij} \bar{D}}} }}{{\mathop \sum \nolimits_{j = 1}^{m} W_{{C_{ij} \bar{D}}} }}$$
(3)
$$W_{{C_{ij} \bar{D}}} = \frac{{N\left( {C_{ij} \cap D} \right)/N\left( {C_{ij} } \right)}}{{N\left( T \right) - N\left( D \right) - {{\left[ {N\left( {C_{ij} } \right) - N\left( {C_{ij} \cap D} \right)} \right]} \mathord{\left/ {\vphantom {{\left[ {N\left( {C_{ij} } \right) - N\left( {C_{ij} \cap D} \right)} \right]} {N\left( T \right) - N\left( {C_{ij} } \right)}}} \right. \kern-0pt} {N\left( T \right) - N\left( {C_{ij} } \right)}}}}$$
(4)

Accordingly, Eqs. 5 and 6 were used to calculate the uncertainty and plausibility functions.

$${\text{Unc}} = \left[ {1 - \left( {{\text{Bel}}_{{C_{ij} }} } \right) - \left( {{\text{Dis}}_{{C_{ij} }} } \right)} \right]$$
(5)
$${\text{Pls}} = \left[ {1 - \left( {Dis_{{C_{ij} }} } \right)} \right]$$
(6)

More details of the mentioned algorithm can be found in Carranza and Hale (2002), Carranza et al. (2005), and Carranza et al. (2008).

Random forests (RF)

RFs are an extension of classification and regression trees, and were first developed by Breiman (2001). RFs are widely used for data prediction and interpretation purposes. They show many appealing characteristics, such as the ability to deal with high-dimensional data, complex interactions and correlations. The RF algorithm tends to produce quite accurate models, because the ensemble reduces the instability that can be observed when building single decision trees (Williams 2011). The algorithm exploits random binary trees, which use a subset of the observations through bootstrapping techniques. For each tree grown on a bootstrap sample, the error rate for observations left out of the bootstrap sample is monitored. This is called the “out-of-bag” (OOB) error rate (Breiman 2001). Basically, the OOB accuracy indicates the accuracy of the RF predictor. It gives an estimate of test set accuracy (generalization error) (Cutler 2013). An RF tries to improve on bagging by “de-correlating” the trees. Each tree has the same expectation. In other words, the algorithm uses random feature selection at each node for the set of splitting variables (Meyer et al. 2003). This algorithm needs two original parameters to be tuned by the user: the number of trees (T) and the number of variables (m). It has been suggested (Breiman 2001; Micheletti et al. 2014) to pick a large number of trees, and the square root of the dimensionality of the input space for m (Micheletti et al. 2014). Based on two parameters, the number of trees in RF has been fixed at 1000 after an introductory analysis, and the number of variables sampled at each node was selected to be 3 to analyze the conjunct contribution of subsets of features, while maintaining a fast convergence during iterations. No calibration set is needed to tune the parameters (number of tree and number of variable) (Micheletti et al. 2014). The parameters to tune in model are iterations and learning rate. Moreover, two types of error were calculated: mean decrease in accuracy, and mean decrease in node impurity (mean decrease in Gini). The mean decrease in accuracy is determined during the calculation of the OOB error. Conversely, the mean decrease in the Gini coefficient is a measure of how much each variable contributes to the homogeneity of the nodes and leaves in the resulting RF. These importance measures can be used for ranking variables and for variable selection (Calle and Urrea 2010). Generally, mean decrease accuracy and mean decrease Gini errors in RF model have been used widely in many fields and researcher and have shown good performance for variable selection (Lawrence et al. 2006; Cutler et al. 2007; Watts et al. 2009; Stumpf and Kerle 2011; Rahmati et al. 2016), including for the classification of moderate resolution imagery that was trained with high-resolution data (Shruthi et al. 2014).

Results

Spatial relationship between landslide and conditioning factors

Results of the spatial relationship between landslide and conditioning factors using the EBF (belief, disbelief, uncertainty, and plausibility) model are shown in Table 2. For the slope degree of >50°, the belief and disbelief values were 0.341 and 0.196, respectively, which indicates a very high probability of landslide occurrence. In the case of slope aspect, the highest Bel values were related to south, southwest, and east (0.195, 0.175, and 0.140, respectively), and it showed that these categories have a positive spatial association with landslide occurrence. On the other hand, the degree of belief was lowest for other aspect categories. The relationship between landslide occurrence and altitude shows that elevations between 100 and 500 m and >500–1000 m have the highest Bel values (0.630 and 0.150, respectively), indicating that the probability of landslide occurrence in these altitudes is high. In general, the results showed that there is an inverse relationship between altitude and belief values. For plan curvature, there is a high belief and low disbelief value for convex shapes (0.349, 0.320, respectively). In the case of profile curvature, classes of <−0.001 or convex shapes have the highest belief value (0.376), followed by concave shapes (>+0.001), which mirrored findings by Gorum et al. (2011) for landslides in China. The higher (Belief degree) and lower (Disbelief degree) probabilities of landslide occurrence were obtained in areas having a TWI <8, with values of 0.359 and 0.313, respectively. In the case of lithology, there are eleven classes. The degree of belief, with respect to landslide occurrence, was higher for La (Laterite) and Pn (including dark gray, well-bedded, chert-bearing limestone with interlayers of shale and dolomitic limestone) lithological units (0.369 and 0.176, respectively), but lower or zero for other classes or units (Table 1). In the case of land use, the degree of belief was higher for forest (0.327) and irrigation agriculture (0.237) land use types; these classes also showed lower Dis values of 0.031 and 0.117, respectively. Regarding distance from rivers, the highest belief and lowest disbelief values are found for a distance <100 m. The results indicated that landslide occurrence decreases with an increase in distance from rivers. The highest belief and lowest disbelief values in case of distance from roads were related to the class of <100 m as well. The results revealed that this class had the highest probability in landslide occurrence. In the case of distance from faults, the <200 m class had the highest belief (Bel = 0.291) and lowest disbelief (Dis = 0.172) values, respectively. The results show that with growing distance from faults landslide occurrence also increases, thus there is a direct relationship between distance from faults and landslide susceptibility. Gorum et al. (2011) also stated that landslide density decreases gradually with increasing distance from surface ruptures.

Table 2 Spatial relationship between landslide conditioning factors and landslide locations, using the EBF model

Random forests

In the RF model, 1000 trees (ntree = 1000) and just three variables (mtry = 3) were considered for the split point in each node. The out-of-bag error (OOB) was used to evaluate the performance of the model. The obtained OOB error rate was 25 %, thus the model accuracy is 75 %, which is a reasonably good model. Stumpf and Kerle (2011) reported that the model with the lowest OOB is the best. The overall measure of accuracy can be seen in the confusion matrix that records the disagreement between the final model predictions and the actual outcomes of the training observations (Table 3). It showed that the model and training dataset agree that for 73 of the observations landslide absence is correctly predicted (69.5 %), and that of the actual 105 landslide locations 84 (80 %) were correctly predicted. One of the most important problems in RFs, compared with a single decision tree, is that it becomes quite a bit more difficult to readily understand the discovered knowledge of 1000 trees. This was solved by assessing the relative variable importance (Table 4). The higher values in Table 4 indicate that the associated variable is relatively more important. For better understanding, a visual plot of the two measures of variable importance calculated by the RFs technique is shown in Fig. 8. Clearly, lithology is the most important variable, followed by altitude, distance from roads, and land use. The accuracy measure then includes distance from roads and land use factors as the next most important. In contrast, altitude, lithology, distance from roads, rivers, and faults had higher importance based on the Gini measure. Another key output of the RFs model is an error plot. This plot portrays the accuracy of forest of trees (y-axis) against the number of tress that have been included in the forest (x-axis; Fig. 9). It can be seen that going beyond about 50 trees in the forest adds low values (0.26–0.3), when considering the OOB error rate. The two other plots show the changes in error rate associated with the model prediction. Finally, the landslide susceptibility map (LSM) using the RF algorithm was constructed, and classified based on the natural break classification scheme in ArcGIS 9.3 (Ozdemir 2011a; Pourghasemi et al. 2012a, b; Mohammady et al. 2012; Pourghasemi et al. 2013) into low, moderate, high and very high potential classes. The landslide susceptibility map achieved from the RF method showed that about 33 % of the study area falls into the low landslide susceptibility class, while 28, 25, and 13 % were identified as having moderate, high and very high susceptibility, respectively (Fig. 10). These are substantial advantages that confine the generation of outliers, especially when working with terrain variables with a high frequency of missing data, and an intrinsic uncertainty in the assignment to the correct class also in surveyed areas. This is a typical problem for classes such as soil type, for which only a few point locations have been directly surveyed (Catani et al. 2013).

Table 3 Confusion matrix from RF model (0 = non-landslide or negative, 1 = landslide or positive)
Table 4 Conditioning factors’ importance based on the random forest model
Fig. 8
figure 8

Two measures of variable importance calculated by the random forest algorithm. DEM altitude, Road distance from roads, LU land use, River distance from rivers, Fault distance from faults, Slope slope angle, Aspect slope aspect, TWI topographic wetness index, Prof profile curvature, Plan plan curvature

Fig. 9
figure 9

OOB error plot of the random forests algorithm in the study area (0 non-landslide, 1 landslide)

Fig. 10
figure 10

Landslide susceptibility map based on the random forests (RF) algorithm

Validation of landslide susceptibility map

To determine the accuracy of the landslide susceptibility map created using the RF algorithm, the receiver operating characteristic (ROC) curve was used. This is a common method used to assess the diagnostic test accuracy. In the ROC analysis, the area under the curve (AUC) value ranges from 0.5 to 1.0, and is used to evaluate the model accuracy (Nandi and Shakoor 2010). If the model does not predict the landslide occurrences better than chance, the AUC would equal 0.5. A ROC curve of 1.0 shows perfect prediction (Yesilnacar 2005). The quantitative–qualitative relationship between AUC and prediction accuracy can be classified as follows: >0.9–1, excellent; >0.8–0.9, very good; >0.7–0.8, good; >0.6–0.7, average; and 0.5–0.6, poor (Yesilnacar 2005). In this study, the landslide locations which were not used during the model building process were used to verify the landslide susceptibility map. For this purpose, success rate and prediction rate curves were created. The success rate method used the training landslide pixels that were used in establishing the landslide models, thus it is not a suitable method for assessing the prediction capability of the models (Bui et al. 2012). However, the method can help in determining how well the resulting landslide susceptibility maps have classified the areas of existing landslides. Another validation technique is the prediction rate curve, which explains how well the model predicts a landslide (Mohammady et al. 2012; Akgun et al. 2012; Ozdemir and Altural 2013). The AUC values of the ROC curve for RF model and the mentioned curves (success rate and prediction rate curves, respectively) were found to be 0.8520 (85.2 %) and 0.8177 (81.8 %), respectively (Fig. 11a, b). It can be concluded that this model provides reasonable results according to the success rate and prediction rate curves for landslide susceptibility mapping of the study area. Stumpf and Kerle (2011) stated that the RF algorithm provided relatively high accuracies of up to 87 % for sites in Haiti and Wenchuan. Vorpahl et al. (2012) compared eight different techniques for landslide susceptibility mapping in Southern Ecuador. They indicated that RF and boosted regression tree (BRT) showed the best performance in tenfold cross-validation. Meanwhile, they found that a generalized additive model (GAM) and general linear model (GLM) with stepwise backwards variable selection performed equally well. Bui et al. (2013) used an EBF-fuzzy logic hybrid model for landslide susceptibility mapping in the Vietnam. Results indicated that EBF-fuzzy logic model is better than EBF and Fuzzy models when applied separately in the study area. Esposito et al. (2014) for landslide susceptibility mapping in Rio de Janeiro, Brasil used RF and logistic regression (LR) models. Their results showed that AUC for RF and LR models were 0.81 and 0.72, respectively. Therefore, RF model shows better accuracy than LR. Trigila et al. (2015) performed a comparative study on landslide susceptibility assessment using three models, namely logistic regression, RF, and frequency ratio. The validation of results showed that the RF model had the highest AUC with 4*4 m regular grid sampling. In another study, Goetz et al. (2015) applied different modeling techniques, such as GLM, GAM, weight of evidence, SVM, RF, and bootstrap aggregated classification tree with penalized discriminant analysis (BPLDA) for landslide susceptibility mapping in three areas in Austria. Meanwhile, the difference of AUC-ROC values ranged from 2.9 to 8.9 %, but the results showed that the RF model performed slightly better than the other models in all study areas. Youssef et al. (2015), in assessing landslides in the Wadi Tayyeh Basin, Asir Region, in Saudi Arabia, used different data mining techniques, such as RF, BRT, classification and regression tree (CART), and GLM models. The prediction rate curves indicated that RF (AUC = 81.2 %), BRT (AUC = 85.6 %), and CART (AUC = 86.2 %) have the highest AUC in comparison with generalized linear models (AUC = 76.9 %). These results are in agreement with our finding, showing that RF-EBF model has a good performance for landslide susceptibility assessment.

Fig. 11
figure 11

ROC curves for the landslide susceptibility map produced by the random forest algorithm: a success rate curve b prediction rate

Conclusion

Different methods have been proposed for landslide susceptibility mapping in the literature, with the accuracy of different methods still being debated. The reliability of landslide susceptibility maps mainly depends on the amount and the quality of available data, the working scale, and the selection of the best methodology (Baeza and Corominas 2001). The RF technique has interesting features that other models lack. It can effectively handle missing data during both the training and validating steps. Because of its ensemble design, it can apply a prediction even when some of the input values are missing. Moreover, during the process of modeling, a measure of variables importance can be calculated, lending insight to the particular system being modeled (Ball 2009). Therefore, in this study a GIS-based RF model was used for landslide susceptibility mapping in the west of Mazandaran Province, northern Iran. At the same time it served as a feasibility assessment to determine if the approach can be used for other un-/under mapped parts of the country. Of the total of 153 landslide locations in the study area, 105 cases were used as training data, while the remaining 48 cases were used for validation purposes. To perform the landslide susceptibility mapping, eleven conditioning factors—slope angle, slope aspect, altitude, plan curvature, profile curvature, topographic wetness index, lithology, land use, distance from rivers, distance from roads, and distance from faults—were considered. The EBF model was used as an initial bivariate statistical method to evaluate the correlation between the landslides and classes of each conditioning factors. Subsequently RF was applied to produce the landslide susceptibility map for the study area. For validation of the model the AUC was used. The RF model resulted in a success rate of 85.2 %, and a prediction rate of 81.8 % accuracy. In summary, based on the overall assessment, the proposed algorithm is objective and an applicable estimator that improves the predictive accuracy. It is also suitable for landslide studies and land use planning in different parts of the country, for which to date no comprehensive landslide susceptibility assessment has been done. Therefore, we propose a coupled RF and EBF analysis for landslide susceptibility in Iran, as it surpasses the performance of established methods such as bivariate and multivariate statistical models.