1 Introduction

Causing great losses of lives and properties, landslides are dangerous processes that occur repeatedly in mountainous and hilly areas worldwide (Juliev et al. 2019; Gutiérrez et al. 2015). This mass movement occurs whenever the loading of an earth material exceeds its shear strength (Lin et al. 2017 (Ilia and Tsangaratos 2016)). Although this geological phenomenon is often triggered by earthquakes and heavy rainfalls, the expansion of anthropogenic activities in susceptible areas has always played an important factor in its occurrence (Baena et al. 2019). In recent years, the damage caused by landslides will increase due to climate change and urban development (Vakhshoori and Zare 2016). Therefore, it is essential to acquire accurate and realistic information about the spatial distribution and degrees of susceptibility to landslide-prone regions (Colkesen et al. 2016). To achieve this goal and to mitigate the destructive impacts of this phenomenon, landslide susceptibility maps can serve as an appropriate tool for increasing awareness and predicting future hazards (Feizizadeh et al. 2017). Based on previous landslides and identical physical features in similar areas, a landslide susceptibility map provides important signs regarding the locations where future landslides are likely to occur (Pradhan et al. 2017).

The Alborz Mountain has always been subjected to natural disasters such as landslides due to its being on the seismic belt of the Himalayas (Farrokhnia et al. 2011). In a study identifying high-risk regions of the world with respect to landslide hazard, Nadim et al. (2006) reported that the Alborz and Zagros Mountains of Iran were among the areas with moderate to high landslide risks. In addition, according to the National Committee on Natural Disaster Reduction of the Iranian Ministry of Interior, the annual damage caused by landslides in Iran amounts to about 500 billion Rials (Arabameri et al. 2019). Consequently, if the loss of human life is taken into account, it is evident that zoning of the study area is necessary.

In recent years, researchers have used different methods and their combinations to zone the areas susceptible to landslide in different areas worldwide that can generally classify into two quantitative and qualitative groups (Sahin 2020). The qualitative approaches, also known as knowledge-driven approaches, are the techniques of assigning weights to and rank criteria and sub-criteria based on experts’ knowledge (Achour et al. 2017). Some of these methods, which have been used in various studies and have yielded acceptable results, include the analytic hierarchy process (AHP) (Yan et al. 2019; Du et al. 2019; Bahrami et al. 2020) and hybrid methods such as MCDA and MCE (Erener et al. 2016; Kumar et al. 2017; Wang et al. 2019) and the WLC (Ahmed 2015; Gigović et al. 2019). The second group, also known as data-driven approaches, consists of techniques which are not influenced by experts’ opinions in the computational process (Kavzoglu et al. 2015). Instead, the relationship between the landslides and the effective parameters is determined by using numerical data and statistical equations (Yan et al. 2019). These methods, which have been used repeatedly in various studies on landslides, include bivariate and multivariate probability models such as frequency ratio (FR) (Hong et al. 2017; Sharma and Mahajan 2019; Berhane et al. 2020), weight of evidence (WoE) (Ding et al. 2017; Cui et al. 2017; Sifa et al. 2020), and logistic regression (LR) (Oh et al. 2018; Kalantar et al. 2018; Sun et al. 2021) as well as soft computing methods such as artificial neural network (ANN) (Bui et al. 2016; Zhu et al. 2018; Yu and Chen 2020), fuzzy logic (Ramesh and Anbazhagan 2015; Turan et al. 2020), adaptive neuro-fuzzy inference system (ANFIS) (Polykretis et al. 2019; Panahi et al. 2020; Mehrabi et al. 2020), random forest (RF) (Kim et al. 2018; Chen et al. 2020; Sahin et al. 2020), and support vector machine (SVM) (Oh et al. 2018; Al-Najjar and Pradhan 2021; Hu et al. 2020). Although mentioned models have suitable performance as predictive models, there are some drawbacks when applied individually (Youssef et al. 2015). According to the literature review, ensemble models perform more accurate results than a single method (Roy et al. 2019; Costache et al. 2020). For example, Aghdam et al. (2017) combined FR and WoE statistical methods with ANFIS algorithms to produce a landslide susceptibility map of the Zagros Mountains in Iran. Their results indicated that FR-ANFIS and WoE-ANFIS have better performance compared with FR and WofE. In another study, Roy and Saha (2019) combined WoE statistical and SVM machine learning models with different kernel functions to identify landslide hazard zones. They found that WofE and Linear-SVM ensemble model with more than 90% accuracy has an excellent performance in spatial modeling. Althuwaynee et al. (2016) indicated that the combination of CHAID and AHP methods has better results than stand-alone implementations of each model.

In limited studies, the combination of machine learning algorithms with MCDM methods has been used (Dehnavi et al. 2015; Arabameri et al. 2019; Costache et al. 2020). For instance, Arabameri et al. (2019) used the VICOR-RF-FR as an MCDM statistical machine learning ensemble method to evaluate groundwater potential. They showed the strength ensemble model to improve the results of nonlinear problems. Dehnavi et al. (2015) showed that the ensemble ANFIS-SWARA model yielded more realistic results than the SWARA.

The best-worst method is one of the latest MCDM methods introduced by Rezaei in 2015. Although this method has been used in two different landslide studies (Gigović et al. 2019; Moharrami et al. 2020), it has not yet been applied in combination with machine learning methods. Reviewing the previous studies shows that despite very good results, the combination of machine learning algorithms with MCDM methods has received less attention. The aim of the present study is to combine the BWM method with ANFIS to implement a new framework and compare it with the widely used SWARA method in order to fill this gap in the spatial modeling studies.

Two important points should be considered to achieve optimal results in the spatial modeling of landslides: (a) the quality of the input data and (b) the structure of the model used (Adineh et al. 2018). In connection with the first point, in this work, an attempt has been made to be as careful as possible in preparing the data. Regarding the second point, the difference between this study and other studies is the combination of the BWM model with ANFIS machine learning method. Moreover, in this study, the hybrid ANFIS-SWARA model has been used to compare them to determine which of these models of MCDM provides better results in combination with the ANFIS. After preparing landslide susceptibility maps, the performance of each model was estimated using the indices of sensitivity, specificity, accuracy, RMSE, and ROC curve. The results showed that the ensemble ANFIS-BWM model performed better and can be used in future studies.

1.1 Study region

With an area of 8604 km2, the Khalkhal-Tarom Basin is located on the southern slopes of the Alborz mountain range along from 47° 42′ 44″ to 49° 10′ 34″ E and 36° 37′ 22″ to 37° 56′ 35″″ N (Fig. 1). Approximately 92% (7967 km2) of the basin consists of highlands and the remainder of plains. The highest and lowest elevations are 3314 m and 288 m, respectively. The data from the climatological stations of the Iran Meteorological Organization and Ministry of Energy were utilized to estimate temperature and rainfall. The average annual temperature in the region is about 10.5 °C, while the coldest month is February, and the warmest is August. In addition, the average annual rainfall is about 375 mm. The difference in rainfall levels in the highlands on the two sides of the main river (the Ghezel Ozan River) results from the differences in the prevailing climatic conditions in the areas adjacent to the study area. Although the study area has diverse lithology, pyroclastic rocks of Karaj Formation cover most of its surface area. Moreover, based on the unit ages,Eocene has the highest coverage of the study area (Fig.2). Various factors such as weather conditions, topography, and human activities, including land use change, have increased the occurrence of landslides in this area. To confirm this important issue, the findings of this study showed that agricultural lands have the highest risk of landslides due to human activities. Given the existence of economic infrastructures and the growing residential areas on the unstable slopes in the future, zoning of landslide-prone regions seems to be vitally important.

Fig. 1
figure 1

Location of the study area

Fig. 2
figure 2

Lithology map

1.2 Database development and data preparation

It is necessary to create a spatial database in any study using geographical information system. The landslide susceptibility mapping is no exception, and database creation including inventory map and conditioning factors is considered as the first and the most important step in this process. The landslide inventory map shows the locations and spatial distribution of landslides that happened in the past (Ding et al. 2017). Since it is crucial to pinpoint the locations of the past and present landslides in order to predict future high-risk areas, preparation of a landslide inventory map is a requisite to any study on landslides (Regmi et al. 2014). Information on the locations of past landslides and their spatial distributions was obtained from the Forest, Rangeland and Watershed Organization of Iran (Fig. 1). According to Fig. 1, the inventory map was employed to randomly select 172 (or 70%) of the 242 landslides that have occurred in the region for training the data and the 30% for model validation.

Various factors including geology, hydrology, geomorphology, climate, and topography affect slope instability. Determination of these factors is among the basic and initial steps in landslide susceptibility mapping. In this study, thirteen conditioning factors including slope angle, slope aspect, altitude, topographic wetness index (TWI), plan curvature, profile curvature, distance to roads, distance to streams, distance to faults, lithology, land use, rainfall, and normalized difference vegetation index (NDVI) were selected based on the available data and previous studies for the spatial modeling of the landslides (Table 1). According to Table 1, these thirteen factors were determined by using the information obtained from the related organizations and the reference data. Following that, ArcGIS 10.5 was employed to generate and digitalize the maps (30×30 m pixels). Raster data models of the layers were then prepared by using the selected methods.

Table 1 Sources of data for extracting thematic maps

In order to prepare the different information layers, the digital elevation model (DEM) was prepared first using ASTER satellite images. DEM is one of the most important databases in any landslide study because preparation of some important thematic maps depends on it. The slope angle, slope aspect, altitude, TWI, and plan and profile curvature layers were extracted from the DEM (Fig. 3a–f). The other considered factors (distance to roads, distance to streams, distance to faults, lithology, land use, rainfall, and the normalized difference vegetation index) were then determined (Fig. 3g–m). In addition, the conditioning factors were categorized based previous studies and available data.

Fig. 3
figure 3figure 3figure 3

Conditioning factors of the study. a Slope angle. b Slope aspect. c Altitude. d TWI. e Plan curvature. f Profile curvature. g Distance to roads. h Distance to streams. i Distance to faults. j Lithology. k Land use. l Rainfall. m NDVI

The slope degree is always considered as an essential factor in analyzing the areas susceptible to landslide (Umar et al. 2014), because it is the major cause of mass movements. Exposure to sunlight, dry winds, and increased relative humidity due to rainfall are all factors associated with slope aspect that trigger landslides (Kavzoglu et al. 2014). Therefore, slope aspect has always been considered by researchers. This factor is divided into 9 classes. Altitude is not directly involved in the occurrence of landslides; however, other factors related to it such as tectonic activity, weathering, and climate change influence the entire process (Rozos et al. 2008). The topographic wetness index is a useful tool for estimating moisture conditions at a basin scale (Grabs et al. 2009). This factor was used due to the varying humidity conditions in the study area. The values ​obtained from the slope curvature show the morphology of the different elevation points (Erener and Düzgün 2010). In this paper, both the profile curvature curve and the plan curvature were taken into account. The former indicates the velocity and process of sediment transport, and the second, the divergence and convergence of the flow passing through the surface (Dehnavi et al. 2015). Road construction, especially when engineering principles are ignored, reduces slope stability and consequently triggers landslides (Moosavi and Niazi 2016). Therefore, the distance from the road has always attracted the interest of researchers (Xiao et al. 2019; Bui et al. 2012). Streams decrease shear strength by eroding the materials from the toe of the slope. Consequently, the factor of distance from the stream is very important in relation to slope stability (Achour et al. 2017). Faults, especially in seismic zones, play a significant role in triggering mass movements (Shirzadi et al. 2017). They either act directly as a triggering factor for landslides or indirectly by causing fractures in slope layers that lead to the penetration of water into joints and fissures, thereby reducing the shear strength of materials constituting the slope that results in the occurrence of landslides (Dehnavi et al. 2015). Lithology as a geological factor has always played an important role in predicting landslide, because different geological units with varying degrees of permeability influence slope stability (Chalkias et al. 2014). Due to their impacts on slope instability, different types of land use have always attracted many researchers in their research on landslides (Conforti et al. 2014; Dou et al. 2014). The rainfall factor was used in this research because the amount of rainfall varies with changes in elevation and rainfall directly and indirectly influences landslide occurrence. The NDVI index was calculated to analyze the effect of vegetation on slope instability:

$$ NDVI=\frac{NIR-R}{NIR+R} $$
(1)

The NDVI benefits from the ratio of near-infrared (NIR) reflection to red (R) reflection to estimate vegetation density (Polykretis et al. 2019).

2 Methodology

2.1 Adaptive neuro-fuzzy inference system (ANFIS)

Although a fuzzy inference system (FIS) using “if-then” rules can analyze complex processes, it is unable to perform the learning process. The adaptive neuro-fuzzy inference system (ANFIS) (Jang 1993) is one of the most widely used fuzzy systems for modeling nonlinear problems. This approach, developed by combining a FIS and an artificial neural network (ANN), utilizes the advantages of both approaches to solve problems. The ANN model is able to optimize the fuzzy logic solution through the learning process (Oh and Pradhan 2011). The details of the ANFIS model structure are as follows.

The ANFIS structure was developed by using the Takagi-Sugeno fuzzy rule base (the details are presented in Eqs. 10 and 11).

$$ \mathrm{Rule}\ 1:\mathrm{if}\ \mathrm{x}\ \mathrm{is}\ {A}_1\ \mathrm{and}\ \mathrm{y}\ \mathrm{is}\kern0.5em {B}_1\ \mathrm{then}\ {f}_{1=}{p}_1x+{q}_1y+{r}_1 $$
(2)
$$ \mathrm{Rule}\ 2:\mathrm{if}\ \mathrm{x}\ \mathrm{is}\ {A}_2\ \mathrm{and}\ \mathrm{y}\ \mathrm{is}\kern0.5em {B}_2\ \mathrm{then}\ {f}_{2=}{p}_2x+{q}_2y+{r}_2 $$
(3)

Here, x and y are the system inputs, and A1, A2, B1, and B2 are fuzzy membership functions. In addition, pi, qi, and ri (∀ i = 1, 2) are the parameters of the output function (Jang 1993). In general, the ANFIS structure is made of five layers described below (Fig. 4).

  • Layer 1

Fig. 4
figure 4

The structure of the ANFIS model

This layer is responsible for the fuzzification of the variables, and the nodes in this layer are adaptive nodes.

$$ {O}_{A_i}^1={\mu}_{A_i}(x),\kern0.5em \mathrm{i}=1,2\kern0.75em $$
(4)
$$ {O}_{B_i}^1={\mu}_{B_i}(y),\kern0.5em \mathrm{i}=1,2 $$
(5)

Here, i represents the related node, x and y are its input variables, Ai and Bi are linguistic terms, and μAi (x) and μBi (y) are the membership functions of the node i.

  • Layer 2

In this section, every node is a fixed node, and each one is responsible for multiplying signals entering it. The nodes are named by the Π label, and their outputs are as follows (Oh and Pradhan 2011):

$$ {O}_{2,i}={\mu}_{A_i}(x).{\mu}_{B_i}(y)={W}_i,\kern1.25em for\ i=1,2 $$
(6)

Here, Wi (the so-called firing strength of each fuzzy rule) represents each node’s output.

  • Layer 3

This layer has the task of normalizing the output of the second layer. Therefore, the nodes, which are fixed ones and named by the N label, normalize the input values (Eq. 15). The numerator of the fraction includes the firing strength of each fuzzy rule, and the denominator includes the total firing strength of each rule.

$$ {O}_{3,i}=\frac{W_i}{W_1+{W}_2}={\overline{W}}_i\kern0.5em for\ i=1,2 $$
(7)
  • Layer 4

This is considered the second adaptive layer in the ANFIS structure, and each node’s output is obtained from the following equation:

$$ {O}_{4,i}={\overline{W}}_i.{f}_i={W}_i.\left({p}_ix+{q}_iy+{r}_i\right)\kern1em for\ i=1,2 $$
(8)

In this equation, \( {\overline{W}}_i \) is the normalized firing strength of the third layer. pi, qi , and ri are the variable parameters (also referred to as the result parameters) of the node i.

  • Layer 5

The only node existing in this layer is fixed node labeled Σ. This node sums up all the input signals and calculates the resulting output (Eq. 17).

$$ {O}_{5,i}={\sum}_i{\overline{W}}_i{f}_i=\frac{\sum_i{W}_i{f}_i}{\sum_i{W}_{\mathrm{i}}} $$
(9)

For more details on the layers and the algorithms, refer to Jang (1993).

2.2 Best-worst multi-criteria decision making (BWM) model

The best-worst method (BWM) is one of the newest and most efficient multi-criteria decision-making approaches introduced in 2015 by Rezaei to calculate the final weights of criteria in decision-making problems. As in other MCDM methods such as AHP, pair-wise comparisons are used in BWM. One of the advantages of BWM over AHP is that fewer pair-wise comparisons are used (for AHP we need n(n − 1)/2 comparisons, and for the BWM method, we need 2n − 3 comparisons) (Rezaei 2015). However, the differences in the final weight calculation in this method have made the final result much more realistic and consistent than methods such as AHP. The numbers used for pair-wise comparisons are integers ranging between 1 and 9, and there is no need for fractional numbers. It is also possible to integrate the BWM with other MCDM methods (Ahmad et al. 2017). The various steps in this method and its algorithms for problem solving are as follows (Rezaei 2015):

  1. 1.

    Specifying the decision-making criteria for evaluation. The set of criteria is defined as {C1,  C2, …, Cn}.

  2. 2.

    Determining the best (B) and worst (W) criteria by the experts. The best criterion includes most important or the most desirable criterion, whereas the worst ones include those with the least desirability and/or lowest importance.

  3. 3.

    Determining the priority of the best criteria compared to all the others (the numbers 1 to 9 are used for this purpose). This preference is represented in the form of the following vector:

$$ {\mathrm{A}}_{\mathrm{B}}=\left({\upalpha}_{\mathrm{B}1},{\upalpha}_{\mathrm{B}2,}\dots, {\upalpha}_{\mathrm{B}3}\right) $$
(10)

Here, αB1 represents the preference of the best criterion (B) over the criterion j (αBB = 1) (Fig. 5).

  1. 4.

    Determining the priority of all the criteria over the worst one (W). The preference vector for this phase is as follows:

Fig. 5
figure 5

Reference comparison for BWM model (Rezaei 2015)

$$ {\mathrm{A}}_{\mathrm{W}}={\left({\upalpha}_{1\mathrm{W}},{\upalpha}_{2\mathrm{W}},\dots, {\upalpha}_{\mathrm{nW}}\right)}^{\mathrm{T}} $$
(11)

Here, αjW is the preference of the j criterion over the worst one (W) (αWW=1) (Fig. 5).

  1. 5.

    Calculating the final weights of the criteria. The following equations are used for this purpose:

The values of the final optimum weights (\( {\mathrm{W}}_1^{\ast },{\mathrm{W}}_2^{\ast },\dots .{\mathrm{W}}_{\mathrm{n}}^{\ast}\Big) \) and

are obtained by Eq. 8. In addition, the consistency ratio for each criterion can be estimated by using the consistency index table (Table 2) and the

value. The following equation states that:

Table 2 Consistency index for the BWM model

It is evident that the closer the value of the consistency index is to zero, the more realistic the results will be. Refer to Rezaei (2015) for more details of this method.

2.3 Step-wise weight assessment ratio analysis (SWARA) model

This is a multi-criteria decision-making method with an ultimate objective like that of other similar approaches: assigning weights to criteria and sub-criteria. Since its introduction by Keršulien et al. in 2010, researchers have used it to analyze various areas (Mardani et al. 2017). An advantage of this method is its flexibility that allows experts to prioritize the criteria based on the existing conditions. The main feature of this approach is its capability in estimating experts’ opinions in relation to the relative importance of the criteria in order to determine their weights (Keršulien et al. 2010). This procedure consists of the following steps:

  1. 1.

    Selecting the required criteria and ranking them according to their degrees of importance (the most important criteria take the highest position of ranking and the least important ones the lowest)

  2. 2.

    Calculating the coefficient Kj, which is a function of the relative importance of each criterion

  3. 3.

    Determining the initial weight of each criterion

  4. 4.

    Calculating the final normalized weight

The final weight for each criterion is calculated through the following equations (Keršulien et al. 2010):

$$ {\mathrm{S}}_{\mathrm{j}}=\frac{\sum \limits_{\mathrm{i}}^{\mathrm{n}}{\mathrm{A}}_{\mathrm{i}}}{\mathrm{n}} $$
(14)

In this equation, j and n represent the criterion number and the number of experts, respectively. The value of Ai also indicates the suggested rating of each criterion.

$$ {\mathrm{K}}_{\mathrm{j}}={\mathrm{S}}_{\mathrm{j}}+1 $$
(15)
$$ {\mathrm{Q}}_{\mathrm{j}}=\frac{{\mathrm{X}}_{\mathrm{j}}-1}{{\mathrm{K}}_{\mathrm{j}}} $$
(16)

Here, Kj and Qi are functions of the relative importance and initial weight of each criterion, respectively.

$$ {\mathrm{W}}_{\mathrm{j}}=\frac{{\mathrm{Q}}_{\mathrm{j}}}{\sum_{\mathrm{j}=1}^{\mathrm{m}}{\mathrm{Q}}_{\mathrm{j}}} $$
(17)

In this formula, j represents the criterion number, and m shows the number of criteria when Wj indicates the final weight.

The final weight (Wj) obtained for each sub-criteria in this study indicates the relationship between landslides and conditioning factors (Table 3).

Table 3 Correlation between conditioning factors and landslides using SWARA and BWM methods

Figure 6 shows the process of the study, including the methods and type of combination used.

Fig. 6
figure 6

Flowchart of the study area that shows all steps

3 Results and validation

Table 3 shows the weights obtained from the BWM model and SWARA. As shown in Table 3, the values are between 0 and 0.5. The higher are these values, the greater is their impact. The values for the slope factor indicate that most of the landslides that occurred in the study area were of the 5–15° class with weights of 0.409 and 0.405, respectively. Aghdam et al. (2017) also reported that the highest probability of landslide occurrence is related to the slope 5–20°, and this probability decreases with an increase in degree. Among the different slope aspects, the northeast aspect, with the values of 0.249 (BWM) and 0.486 (SWARA), had the highest effect on landslide occurrence, due to increased moisture. According to Fig. 7, the main areas with high and very high degrees of susceptible are in the north to east. In line with the present study, Sahin (2020) also showed that the northeast of the study area has the highest susceptible to landslide. In relation to the altitude factor, the 1500–1700 m class had the highest impact on landslide (with values of 0.212 and 0.434 for BWM and SWARA, respectively). As shown in Table 3, the degree of susceptibility decreases with an increase in altitude. In a study, Ding et al. (2017) concluded that the highest probability of landslide occurrence is up to medium altitude and this probability decreases with increasing this altitude. The results of the BWM model for the TWI showed that the 5.65–7.31 and 7.31–9.87 classes with the weight of 0.371 had the highest impact on landslide occurrence. For the SWARA, the 7.31–9.87 class with values of 0.482 had the highest probabilities. Consistent with the present study, Roy et al. (2019) also found that low and medium TWI values (7.37–9.76) have the highest risk. For the plan curvature factor, according to Table 3, the maximum weights obtained from the BWM and SWARA were for the convex class with weights of 0.769 and 0.410, respectively. This is due to divergence and convergence water flow (Arabameri et al. 2019). The obtained results are in accordance with the findings of Chen et al. (2020). For the profile curvature factor, the highest BWM weight (0.470) was that of the concave and convex classes and for SWARA the highest value (0.489) was that of the concave class. The finding of a study by Dehnavi et al. (2015) also revealed that the class “concave” has the highest impact on the landslide occurrence. The results obtained from BWM indicated that the distance to road, distance to stream, and distance to fault in the 0–100-m, 0-100-m, and 1200–1500-m classes with weights of 0.397, 0.297, and 0.330, respectively, had the highest influence on landslide. As in the BWM method, in the SWARA, also the same classes had the highest weights with the values of 0.311, 0.404, and 0.386, respectively. Consistent with the present study, Aghdam et al. (2017) also concluded that the maximum weight for the factors of distance to road and distance to stream is related to the distance of 0–100 m, and it decreases with an increase in distance. Concerning lithology, the Jl and PlQc classes had the highest values in the BWM method (0.2), and the highest in the SWARA (0.344) was in the Jl class. For the land use factor, the agriculture class in both models had the strongest relationship with landslide occurrence with values of 0.505 and 0.270, respectively. The results of this study showed that land use change disturbs the natural balance of the slopes and increases the risk of landslide occurrence. The findings of Arabameri et al. (2019) also showed that the class “dryfarming-agriculture” has the highest risk. Landslides were more likely to occur with increases in rainfall. For the rainfall factor, 332.9–387.65 mm of rainfall had the highest weights in the BWM model and SWARA (0.352 and 0.647 and 0.352, respectively). In relation to the NDVI factor, the likelihood of landslide occurrence was greatest for the class >0.5 with the weights of 0.574 and 0.356 for the BWM method and SWARA, respectively.

Fig. 7
figure 7

Landslide susceptibility map produced by ANFIS-BWM (a) and ANFIS-SWARA (b) models

3.1 Integration of the ANFIS with SWARA and BWM

In this study, MATLAB was employed to construct the ANFIS model and the SWARA method and BWM to feed it for training the network. For this purpose, all the data were first divided into the training and validation sets. As mentioned earlier, 70% of the data (172 landslide locations) were allocated for training and 30% (70 landslide locations) for validation, and they were assigned the value of 1. Using the training data and the SWARA model and BWM, the weights of the sub-criteria were calculated (Table 3). In the next step, 242 non-landslide points, showing the total number of data, were created in the non-landslide areas. Then, 0 was allocated to each of them. Out of these non-landslide points, 70% (172) points were selected randomly and considered for training the network. Next, 172 landslide and non-landslide points (with values of 1 and 0) were overlaid upon the conditioning factors, and the value of each one was determined. This process was carried out once for the SWARA model and once for the BWM. The values obtained from the overlaying were used as input data for ANFIS training. After ANFIS training using the BWM method and SWARA, all the pixels were entered into MATLAB, and the final value of each pixel was determined using the created network. Finally,landslide susceptibility maps were prepared for the ensemble ANFIS-BWM and ANFIS-SWARA models (Fig.7). The prepared maps were divided into five classes with susceptibility degree of very low, low, moderate, high, and very high by applying natural break method (Ilia and Tsangaratos 2016; Ding et al. 2017; Panahi et al. 2020). Figure 8 shows the percent area for each class in the ANFIS-BWM and ANFIS-SWARA ensemble models. It is quite clear that the class with very high landslide susceptibility had the lowest area in both LSMs with values of 18.65% and 16.21%, respectively (Table 4). In addition, the classes with low and high landslide susceptibility had the largest areas with the values of 20.50% and 23.01% for ANFIS-BWM and ANFIS-SWARA, respectively.

Fig. 8
figure 8

Percentages of landslide susceptibility classes

Table 4 AUC and MSE values of two ensemble models for testing

3.2 Models validation and comparison

Validation is a very important step in estimating the accuracy of a method in producing landslide susceptibility maps. In this study, validation was performed by using 30% of landslide and non-landslide locations (72 points with values of 0 and 1) in three stages. In the first stage, the mean-squared-error (MSE) and root-mean-squared-error (RMSE) were calculated to estimate the accuracy of ANFIS-trained network using SWARA and BWM methods. MSE and RMSE are defined as follows:

$$ MSE=\frac{1}{n}\ {\sum}_{j=1}^n\left(\ {T}_j-{\overline{T}}_j\ \right)2 $$
(18)
$$ RMSE=\sqrt{\frac{1}{n}\ {\sum}_{j=1}^n\left(\ {T}_j-{\overline{T}}_j\ \right)2} $$
(19)

where, Tj is the target values and \( {\overline{T}}_j \) is the output values, and n is the total number of samples. RMSE is the square root of MSE.

The lower the MSE value is (closer to zero), the lower the amount of error in the final prediction, and hence, the more accuracy the modeling will be (Jackson et al. 2019). Figure 9c shows MSE and RMSE for the test dataset. The results showed that the MSE values for the ANFIS-BWM and ANFIS-SWARA models are 0.242 and 0.299, and RMSE values are 0.443 and 0.477, respectively (Table 4). As the results indicate, the new BWM method outperformed the SWARA model in training the ANFIS.

Fig. 9
figure 9

ANFIS-SWARA and ANFIS-BWM training and testing datasets. a MSE and RMSE value in the training phase. b Frequency errors in the training phase. c MSE and RMSE value in the testing phase. d Frequency errors in the testing phase

In the second step, indices such as positive predictive value (PPV), negative predictive value (NPV), sensitivity (SST), specificity (SPE), and accuracy (ACC) were calculated using the error matrix (Bui et al. 2016; Wang et al. 2020). The following equations were used to calculate the indices:

$$ PPV=\frac{TP}{TP+ FP} $$
(20)
$$ NPV=\frac{TN}{TN+ FN} $$
(21)
$$ SST=\frac{TP}{TP+ FN} $$
(22)
$$ SPE=\frac{TN}{TN+ FP} $$
(23)
$$ ACC=\frac{TP+ TN}{TP+ FN+ FP+ TN} $$
(24)

Here, TP indicates pixels which have correctly been classified as the landslide occurrence, TN stands for pixels which have correctly been classified as non-landslide pixels, FP represents pixels that have incorrectly been classified as landslide pixels, and FN also indicates pixels that are incorrectly classified as non-landslide pixels. As shown in Table 5, the PPV values for ANFIS-BWM and ANFIS-SWARA are 65.6% and 63.8%, the NPV values are 80.9% and 78.3%, the SST values are 87.1% and 85.7%, the SPE values are 54.3% and 51.4%, and the ACC values are 70.7% and 68.6%, respectively. The results show that the ANFIS-BWM model has a higher percentage in all indicators.

Table 5 Models’ performances using quantitative indices

In the third stage, the LSMs were evaluated using the ROC curve. The ROC curve is a graphical representation of the balance between negative and positive error values that can quantitatively estimate the model accuracy. The area under the curve (AUC) illustrates the predicted value of the system by describing its ability in correctly estimating the occurrence of the event (landslide) and the non-occurrence of the event (non-landslide) (Yan et al. 2019). Therefore, the larger the area under the curve (AUC), the more accurate the model will be, and the lower AUC show the weak performance of the model. Further details on this curve for validating landslide susceptibility maps are provided in paper by Fan et al. (2017).

In this study, 72 landslide and non-landslide points were overlaid upon the final maps to plot the ROC curves. The values obtained for each point were then used as input data. Figure 10 shows the ROC curves for the methods. Based on the results, the areas under the curves for the ANFIS-BWM and ANFIS-SWARA ensemble models are 75% and 73.6%, respectively. The results obtained from the evaluation of the zoning suggested that both models were able to predict the landslide prone areas well; however, the ANFIS-BWM model was more accurate and, hence, yielded more reliable outputs.

Fig. 10
figure 10

Receiver operating characteristic curve for validation

4 Discussion

Landslide spatial modeling is a nonlinear and complex problem because it is affected by various parameters. Therefore, to achieve the better results, using new methods and their combination is necessary. In spatial modeling of landslide, the combination of machine learning algorithms with MCDM methods has received less attention. In this study, we produced a new ensemble ANFIS-BWM model for landslide susceptibility mapping in the Khalkhal-Tarom, Iran. The performance of this model was then compared with the ensemble ANFIS-SWARA model using confusion matrix and ROC curve. One of the important steps in spatial modeling is to compare the results with other similar studies.

To produce landslide susceptibility map, several studies have been applied MCDM and machine learning methods. For example, Gigović et al. (2019) integrated the BWM with the WLC and OWA methods for zoning regional landslides in western Serbia. They showed that the ensemble MCDM-BWM methods with more than 90% accuracy can be a powerful method for spatial modeling of landslides. In another study, Moharrami et al. (2020) applied the combination of fuzzy with BWM and AHP methods to evaluate areas that are prone to landslides. Their findings showed that FBWM ensemble method has better performance than FAHP. According to Gigović et al. (2019) and Moharrami et al. (2020), the BWM method has advantages such as (1) it requires less pairwise comparisons compared to other widely used MCDM methods like AHP, (2) its results are more reliable because it has a higher consistency ratio compared to AHP, and (3) working with this method is more accurate and easier because it does not use secondary comparisons. They also stated that the combination of BWM method with other models has better performance than stand-alone implementation. Consistent with previous studies, the results of the present study showed that the ensemble ANFIS-BWM method has a good performance and is more accurate in preparing LSM when compared to ANFIS-SWARW.

In spatial modeling, the ANFIS method has been used as a powerful method in combination with other methods (Chen et al. 2021; Costache et al. 2020; Dehnavi et al. 2015). Chen et al. (2021), for example, used the ANFIS model and its combination with two intelligent TLBO and SBO algorithms to generate a landslide susceptibility map. Their results showed that the hybrid ANFIS-SBO model outperformed the ANFIS and ANFIS-TLBO models. In addition, they stated that the advantages of the ANFIS method such as capacity, simplicity, and speed of estimation have made it to have better adaptability to other methods in order to create a hybrid model. Panahi et al. (2020) also stated that the ANFIS model has some benefits, including good learning ability, good integration by its neural network, and more flexibility in nature. In another study, Costache et al. (2020) used a combination of ANFIS with three qualitative and quantitative methods of AHP, CF, and WOE. Their findings showed that all three ensemble models with the accuracy more than 80% have excellent performance in flood susceptibility zoning. They also reported that although ANFIS is a powerful method, the type of model used in the production of input data is important and can affect the accuracy of the results. Consistent with the study, in this study, we showed that although both models used are of MCDM type, the BWM method is better than SWARA in combination with ANFIS. Since bringing the prediction closer to reality is the most important objective in complex environmental issues such as landslides, it is necessary to compare newly introduced ensemble methods with the previous ones in order to achieve more optimal results.To generate landslide susceptibility map, Dehnavi et al. (2015) integrated the SWARA multi-criteria decision-making approach with the ANFIS method. They found that the ANFIS-SWARA model with an area under the curve of 0.8 yielded a more accurate prediction than the SWARA method. In line with the study conducted, we also found that although the ensemble ANFIS-SWARA model has a good performance with more than 70% accuracy, but the ANFIS-BWM exhibited higher accuracy. There are also disadvantages to implementing ANFIS-BWM model. For a limited number of landslide points, the model does not provide a suitable output.

In brief, to improve the performance of ANFIS model, two type of MCDM methods were applied. According to the literature review, the type of model that is used to determine the correlation between conditioning factors and the landslide occurrence is effective in improving the results (Dehnavi et al. 2015; Aghdam et al. 2017; Costache et al. 2020). Based on the results shown in Table 3, although both methods are of the type of MCDM and include values between 0 and 0.5, the BWM model performs better compared to SWARA model. In other word, the results indicated that the BWM produced more realistic results than the SWARA method which trained the ANFIS model well and obtained an acceptable output from it.

As a final conclusion, based on the ROC results with more than 70% accuracy, the ensemble models used in this study have a suitable structure for use in other spatial modeling studies. It is recommended to integrate novel multi-criteria decision-making models with machine learning algorithms such as ANFIS for improve the accuracy.

5 Conclusion

Known as natural destructive ground-deforming phenomena, landslides have occurred in all historical periods. In the current study, for spatial prediction of landslide in the Khalkhal-Tarom, Iran, a new combination of MCDM method and machine learning algorithm was conducted. For this purpose, we integrated BWM method with ANFIS model. Moreover, the ANFIS-SWARA ensemble model was applied to compare with ANFIS-BWM to select more realistic LSM. The results of ROC showed that with more than 70% accuracy, although both ensemble models used in this study have a suitable structure for spatial modeling of landslide, new ensemble ANFIS-BWM model performed more accurately than ANFIS-SWARA. In addition, the results of sensitivity, specificity and accuracy proved the superiority of the ANFIS-BWM model.Final maps also indicated that the largest percentage of high and very high susceptibility zones is from north to east. Since the ANFIS-BWM model yielded better results, it is recommended for use in other similar areas because it can substantially help land use managers and planners in making essential decisions.