1 Introduction

A landslide is the mass movement of soil or rock down a slope (Chen et al. 2018). Landslides are among the most significant and dangerous natural disasters worldwide, causing extensive social and economic losses along with the devastation of water and soil resources (Korup et al. 2012; Alimohammadlou et al. 2013; Raja et al. 2017). The occurrence of landslides is a complex phenomenon related to various parameters, e.g. geology, topography, vegetation, heavy rain, and human activities (Cruden 1991).

Worldwide, from 1998 to 2017, landslides accounted for 5.2% of natural hazards according to the Centre for Research on the Epidemiology of Disasters (CRED 2018). Globally, landslides cause total annual damages of about 18 billion Euros (Zhu et al. 2018). Extensive parts of Iran are mountainous, and landslides are therefore a common natural disaster there, causing considerable economic and social losses (Shirzadi et al. 2017). Landslides in Iran are mainly due to the activity of the Alpine–Himalayan belt (Farrokhnia et al. 2011; Shirzadi et al. 2017); the 4900 landslides recorded up to September 2007 caused an economic loss of 10 billion USD (Shirani et al. 2018). Government agencies throughout the world have developed different approaches to alleviate and prevent landslide damage, including issuing early warnings, planning evacuation routes, and building engineering structures (Choi and Cheung 2013; Luo and Liu 2018). However, all of these measures depend on the accurate spatial prediction of landslides (Bui et al. 2016), that is, on identifying the areas whose properties make them susceptible to landsliding, which is known as landslide susceptibility modelling (LSM) (Luo and Liu 2018).

The landslide susceptibility concept expresses the likelihood of landslides occurring in an area based on the local terrain conditions; it includes neither the return period nor the probability of occurrence of the instability process (Corominas et al. 2014; Zêzere et al. 2017). LSM is a means of understanding and predicting future landslides so as to alleviate their consequences (Feizizadeh and Blaschke 2013). Therefore, the generation of landslide susceptibility maps (LSMs) as the preliminary step of landslide hazard assessment is important for safe economic planning. However, no standard procedure for generating LSMs exists (Samodra et al. 2018).

Geographic information system (GIS) and remote sensing (RS) technologies have proven effective and feasible for landslide evaluation in recent decades (Dou et al. 2019). A broad range of techniques and models has been suggested and employed for LSM. The most common approaches in the literature are frequency ratio (Yilmaz 2009; Yalcin et al. 2011; Aditian et al. 2018), index of entropy (Constantin et al. 2011; Devkota et al. 2013; Youssef 2015), analytical hierarchy process (AHP) (Pourghasemi et al. 2012; Bahrami et al. 2020), analytical network process (ANP) (Melchiorre et al. 2008; Rajabi et al. 2016; Swetha and Gopinath 2020), information value (Du et al. 2017; Sharma and Mahajan 2019), and fuzzy logic (Bui et al. 2012; Sahana and Sajjad 2017; Tsangaratos et al. 2018).

These methods perform comparatively well in LSM, and their performance can be improved further by combining them into hybrid methods (Bui et al. 2014; Truong et al. 2018). In recent years, researchers around the world have proposed various hybrid and integrated approaches to generate LSMs for different areas. These approaches include fuzzy logic and analytic hierarchy process (Gorsevski et al. 2006); fuzzy logic and ANP (Abedi Gheshlaghi and Feizizadeh 2017; Feizizadeh et al. 2021); AHP and statistical index (Arabameri et al. 2020); random forest base classifier and its ensembles (Ebrahimy et al. 2020; Nhu et al. 2020); fuzzy logic and weight of evidence (Hong et al. 2017); support vector machines (SVM) and random subspace (Tien Bui et al. 2019); SVM and differential evolution (Tien Bui et al. 2016); and Bat algorithm optimized SVM (Bui et al. 2019).

The literature reviewed above shows that several techniques have been used individually around the world. Fuzzy membership value (FMV), information value (IV), and frequency ratio (FR) are capable of analysing the effect of factor classes on landslide occurrence; however, in most cases the correlation between the factors is neglected. Conversely, the index of entropy (IOE) is capable of analysing the association among the parameters, but it cannot evaluate the factor classes. Therefore, there is a need to integrate these techniques into hybrid techniques for landslide modelling. The above-cited hybrid models have prompted the idea of combining two distinct techniques in order to minimize sensitivity to noise and isolated samples, which appeals to many scholars (Meng et al. 2016). Combining IOE with the FMV, IV, and FR techniques can overcome the flaws of the four approaches.

In this paper, new ensemble techniques, i.e. \(\mathrm{FMV}\_\mathrm{IOE}\), \(\mathrm{FR}\_\mathrm{IOE}\), and \(\mathrm{IV}\_\mathrm{IOE}\), are proposed and validated for LSM, with the Azarshahr Chay Basin (ACB) as a case study. The main purpose of this paper is to identify landslide-prone areas and to yield better predictions by developing novel hybrid methods for LSMs. The major distinction between the present study and previous studies is that three ensemble techniques are compared on the basis of their performance in landslide susceptibility mapping, and that their performance is analysed here for the first time. Fourteen factors were chosen as landslide conditioning factors: altitude, slope aspect, slope degree, lithology, distance to fault, curvature, land use, distance to river, topographic position index (TPI), topographic wetness index (TWI), stream power index (SPI), normalized difference vegetation index (NDVI), distance to road, and rainfall. They were prepared in the ArcGIS environment for spatial analysis and data manipulation. Finally, the LSMs produced by the three integrated approaches were obtained and compared. These maps provide important information for local landowners and planners preparing emergency plans to minimize the negative effects on human life.

2 The study area and data used

2.1 The study area

The ACB is situated in the west of East Azerbaijan Province, with Urmia Lake lying to its west (Fig. 1). The lowest and highest elevations of the study area are 1239 m a.s.l. and 3300 m a.s.l., respectively (mean elevation 2282 m a.s.l.), and the slope varies from 0 to 75.95° (mean 11.06°).

Fig. 1 Location of ACB

The local climate can be divided into two seasons, rainy and dry. The dry season runs from June to September, while the rainy season runs from October to May. January is the coldest month, with a mean temperature of about −1 °C, and July is the hottest month, with a mean temperature of about 27 °C. The geology of the area is responsible for earthquakes, landslides, and volcanic hazards (Feizizadeh and Blaschke 2013; Rahmati et al. 2019a, b). The lithology of ACB includes various geologic units, such as dacitic andesite (35.14%) and yellow brecciated limestone and light-grey massive limestone (15.74%) (Table 1). The tectonic setting combined with unstable slopes makes this area highly prone to landslide hazards (Feizizadeh and Blaschke 2013). The land use includes agricultural land, orchard land, grassland, barren land, and cultivation and built-up area; grassland covers the largest area (58.54%). In this area, heavy rain and unsuitable land use practices have contributed to natural hazards, i.e. landslides, flooding, and soil erosion, over the past several years.

Table 1 The lithology of ACB

With its steep slopes, incomplete vegetation cover, unconsolidated soil and surface materials, and various processes active throughout the year, this region, one of the watersheds of the Sahand Mountains, has become one of the areas prone to mass movements as a result of indirect human intervention in recent decades (Abedi Gheshlaghi and Feizizadeh 2017). In the study area, most landslides occur during the rainy season. According to previous observations (Feizizadeh and Blaschke 2013) and field observations, most landslide events can be regarded as rotational landslides.

2.2 Landslide inventory

Future risk events at a specific location may be estimated by assessing the records of past events (Devkota et al. 2013; Abedi Gheshlaghi 2019; Costache and Bui 2019; Rahmati et al. 2019a, b). The landslide inventory map is the requisite input for examining the association between the spatial distribution of landslides and the conditioning factors (Chen et al. 2019b). Therefore, the primary step in landslide susceptibility assessment is the assessment of similar past events and their conditioning factors. In this research, a landslide inventory was compiled from Google Earth imagery and GPS-based field surveys. The inventory comprises 75 landslides, which were randomly divided into two sets, for training (≈70%) and validation (≈30%). It consists of translational slides (20 points), rotational slides (43 points), and debris flows (12 points). An example of landslide damage in the study region is shown in Fig. 1.
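The random split described above can be sketched in a few lines of Python. This is only an illustration: the function name, the fixed seed, and the choice to truncate 70% of 75 to a whole number are assumptions, not details reported in the study.

```python
import random

def split_inventory(landslide_ids, train_fraction=0.7, seed=42):
    """Randomly split landslide IDs into training and validation sets."""
    ids = list(landslide_ids)
    random.Random(seed).shuffle(ids)           # reproducible shuffle (seed is arbitrary)
    n_train = int(train_fraction * len(ids))   # ASSUMPTION: truncate to a whole number
    return ids[:n_train], ids[n_train:]

# 75 inventory points, identified here simply by the integers 0..74
train, validation = split_inventory(range(75))
print(len(train), len(validation))  # 52 23
```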

2.3 Landslide conditioning factors

An essential step in LSM is the proper selection of landslide conditioning factors in order to find the spatial association between inventoried landslides and geo-environmental factors. In ACB, the conditioning factors were chosen after taking into account many existing landslide susceptibility studies as well as the field investigation (Gariano and Guzzetti 2016; Alvioli et al. 2018). Accordingly, slope degree (Fig. 2a), slope aspect (Fig. 2b), altitude (Fig. 2c), lithology (Fig. 2d), land use (Fig. 2e), distance to river (Fig. 2f), distance to road (Fig. 2g), distance to fault (Fig. 2h), NDVI (Fig. 2i), curvature (Fig. 2j), SPI (Fig. 2k), TPI (Fig. 2l), TWI (Fig. 2m), and rainfall (Fig. 2n) were used for landslide susceptibility mapping. Table 2 gives detailed information, including the data sources, GIS data types, and related LSM factor classes.

Fig. 2 Thematic maps: a slope degree; b slope aspect; c altitude; d lithology; e land use; f distance to river; g distance to road; h distance to fault; i NDVI; j curvature; k SPI; l TPI; m TWI; n rainfall

Table 2 The source data for conditioning factors and landslide inventory

For the preparation of slope, curvature, altitude, TWI and SPI factors in ArcGIS 10.6 environment, the digital elevation model (DEM) of the county having a spatial resolution of 30 m was acquired from the United States Geological Survey (USGS, http://www.usgs.gov).

In our study, the tolerance (TOL) and variance inflation factor (VIF) were employed to check for multicollinearity between the landslide conditioning factors. These measures are widely used for checking multicollinearity between independent variables (Chen et al. 2018; Arabameri et al. 2019a), and critical multicollinearity between the conditioning factors is indicated by a TOL value < 0.1 or a VIF value > 5 (Chen et al. 2018).
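For reference, the two diagnostics are conventionally defined as follows, where \({R}_{j}^{2}\) (a symbol introduced here, not used in the original text) is the coefficient of determination obtained when factor \(j\) is regressed on all the other conditioning factors:

$$\mathrm{TOL}_{j}=1-{R}_{j}^{2},\qquad \mathrm{VIF}_{j}=\frac{1}{\mathrm{TOL}_{j}}=\frac{1}{1-{R}_{j}^{2}}$$

Under these definitions a TOL of 0.1 corresponds to a VIF of 10, and a VIF of 5 to a TOL of 0.2, so the two thresholds quoted above act as separate rules of thumb rather than as one and the same cutoff.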

2.3.1 Slope degree

Slope degree is widely regarded as the most significant factor in slope stability assessment (Reichenbach et al. 2018) because it directly influences the shear forces (Lee and Min 2001). The slope degree map of the study region was derived from a DEM with 30 m spatial resolution and categorized into five classes: 0–5°, 5–15°, 15–25°, 25–35°, and > 35° (Fig. 2a). The spatial distribution analysis shows that about 42.31% of the landslides in the study region occurred on slopes of 15–25° (Fig. 3a).

Fig. 3 Analysis of landslide conditioning factors: a slope degree; b slope aspect; c altitude; d lithology; e land use; f distance to river; g distance to road; h distance to fault; i NDVI; j curvature; k SPI; l TPI; m TWI; n rainfall

2.3.2 Slope aspect

The slope aspect is the direction of the steepest slope of the terrain surface (Meng et al. 2016). It is a crucial topographic factor that affects slope moisture, because exposure to solar radiation and rainfall depends on the direction the slope faces (McKean and Roering 2004; Pham et al. 2018). The slope aspect map was derived from the DEM and divided into nine classes: flat, north, northeast, east, southeast, south, southwest, west, and northwest (Fig. 2b). The slope aspect frequency assessment (Fig. 3b) shows that most landslides occur on north-facing (34.62%), west-facing (23.08%), and southwest-facing (19.23%) slopes.

2.3.3 Altitude

Altitude, the height above sea level, is known for its effects on biological as well as natural factors (Kavzoglu et al. 2014). This parameter is controlled by various geomorphological and geologic processes (Ayalew and Yamagishi 2005). Altitude was divided into five classes: 1239–1500, 1500–2000, 2000–2500, 2500–3000, and > 3000 m (Fig. 2c). The landslide frequency assessment shows that the largest share of landslides was recorded in the 2000–2500 m class (38.46%) (Fig. 3c).

2.3.4 Lithology

Lithology is one of the principal and basic factors with a direct influence on landslide occurrence (Abedini et al. 2018; Jiménez-Perálvarez 2018), since lithological and structural variations usually lead to variations in the durability and porosity of rocks as well as soils (Kavzoglu et al. 2014). Many studies have considered lithological features as impact factors for landslide susceptibility (Chen et al. 2016; Rosi et al. 2018; Dang et al. 2019). The lithology map (Fig. 2d) was produced by the Geological Survey of Iran at a scale of 1:100,000 (Table 1). The frequency evaluation of landslides by lithology shows that the largest share of landslides (42.31%) is located on dacitic andesite, which occupies about 35.14% of the area (Fig. 3d).

2.3.5 Land use

Changes in the environment and human activities affect land use (Kalantar et al. 2018). According to Pourghasemi et al. (2018), it is the most frequently used predictor layer after lithology, slope, and aspect, and it can be generated using remote sensing techniques (Guan et al. 2017; Pham et al. 2018; Yang et al. 2019). For this research, the land use map was generated from Landsat 8 OLI images in conjunction with field maps. Land use in the study region is divided into five categories: agricultural land, orchard land, grassland, barren land, and cultivation and built-up area (Fig. 2e). The frequency assessment of land use (Fig. 3e) shows that most landslides are observed in grassland (65.38%).

2.3.6 Distance to river

Distance to river plays an effective role in slope stability (Dehnavi et al. 2015; Abedini et al. 2018) and has been used in numerous studies as an impact factor (Nicu and Asăndulesei 2018; Moayedi et al. 2019). The river network was generated from the DEM and grouped into five buffer classes: 0–200, 200–400, 400–600, 600–800, and > 800 m (Fig. 2f). The analysis shows that about 30.77% of the landslides lie 600–800 m from a river (Fig. 3f).

2.3.7 Distance to road

Distance to road has a significant association with landslide occurrence, which can result from the cut slopes formed during road construction; these disturb the natural topography and affect slope stability (Kavzoglu et al. 2014). Distance to road is often used in landslide susceptibility assessment and is known as one of the causal parameters for landslides (Chen et al. 2019a). In the current research, distance to road was grouped into five buffer zones at 200 m intervals (Fig. 2g): 0–200, 200–400, 400–600, 600–800, and > 800 m. The frequency assessment (Fig. 3g) shows that the largest number of landslides is observed in the > 800 m class (62.67%).

2.3.8 Distance to fault

Faults form a zone or line of weakness determined by tectonic structure (Meng et al. 2016). The distance to faults is an important parameter in landslide susceptibility mapping (Abedini et al. 2018), and proximity to these structures increases the chance of landslide occurrence (Bourenane et al. 2016). In this research, the distance to faults was derived from the structural geology map of the study area at a scale of 1:100,000 and was grouped in ArcGIS 10.6 into five buffer classes at 1000 m intervals: 0–1000, 1000–2000, 2000–3000, 3000–4000, and > 4000 m (Fig. 2h). The results indicate that most of the landslides are distributed fairly evenly across three classes: 2000–3000 m (30.77%), 3000–4000 m (23.08%), and > 4000 m (23.08%) (Fig. 3h).

2.3.9 NDVI

Landslide occurrence is closely associated with vegetation density (Meng et al. 2016). The NDVI is a parameter that reflects vegetation growth and vegetation coverage (Hong et al. 2016). The NDVI map for the present research was derived from Landsat 8 OLI images using the equation given below (Hong et al. 2016):

$$\mathrm{NDVI}=(\mathrm{NIR}-\mathrm{RED})/(\mathrm{NIR}+\mathrm{RED})$$
(1)

where NIR is the near-infrared band (0.85–0.88 µm, Band 5) and RED is the red band (0.64–0.67 µm, Band 4). For the current research, the NDVI map was divided into three classes: (−0.09)–0.2, 0.2–0.4, and > 0.4 (Fig. 2i). The landslide frequency assessment shows that most landslides were observed in the (−0.09)–0.2 class (80.77%) (Fig. 3i).
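Equation 1 and the three NDVI classes can be illustrated for a single pixel with the short Python sketch below. The function names, the handling of a zero denominator, and the assignment of boundary values to the lower class are assumptions made for the example; the reflectance values are invented.

```python
def ndvi(nir, red):
    """NDVI of one pixel from near-infrared and red reflectance (Eq. 1)."""
    total = nir + red
    if total == 0:
        return None                    # ASSUMPTION: undefined pixels are returned as None
    return (nir - red) / total

def ndvi_class(value):
    """Assign an NDVI value to one of the three classes used in the text."""
    if value <= 0.2:                   # ASSUMPTION: boundary values go to the lower class
        return "(-0.09)-0.2"
    if value <= 0.4:
        return "0.2-0.4"
    return "> 0.4"

# Hypothetical reflectance pairs (NIR, RED)
for nir, red in [(0.30, 0.25), (0.40, 0.20), (0.50, 0.10)]:
    value = ndvi(nir, red)
    print(round(value, 3), ndvi_class(value))
```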

2.3.10 Curvature

Along with other geo-environmental and topographic factors, curvature influences landslides because the movement of water depends on the curvature of the ground surface (Pham et al. 2018). Positive curvature values indicate convexity, zero indicates flat areas, and negative values indicate concavity (Fig. 2j). Landslides occur mainly in the concave (57.69%) and convex (40.38%) classes (Fig. 3j).

2.3.11 SPI, TWI, and TPI

The SPI, TPI, and TWI are three important hydrologic factors that can describe the spatial variation of landslide-vulnerable areas. They are widely used in landslide susceptibility mapping (Kalantar et al. 2018; Pourghasemi et al. 2018). The SPI represents the erosive power of streams, which might affect landslide occurrence (Raja et al. 2017). The TWI provides a means of quantifying the topographic influence on hydrological processes (Tehrany et al. 2019). High TWI values are associated with wet regions and low values with dry regions (Laamrani et al. 2015). ArcGIS software was used to create the SPI and TWI from the DEM using the equations given below:

$$\mathrm{TWI}=\mathrm{ln}({A}_{s}/\mathrm{tan}\beta )$$
(2)
$$\mathrm{SPI}={A}_{s}\mathrm{tan}\beta$$
(3)

where \({A}_{s}\) is the specific catchment area (m² m⁻¹), and \(\beta\) is the slope gradient (in degrees). The SPI and TWI maps of the watershed were each divided into five classes: SPI: (−4.6)–(−1.5), (−1.5)–0, 0–2, 2–4.5, and > 4.5; TWI: (−0.5)–2, 2–4, 4–6, 6–9, and > 9 (Fig. 2k, m). The SPI class 0–2 and the TWI class 2–4 are the most susceptible to landslide occurrence (48.08%) (Fig. 3k, m).
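As a hedged illustration of Eqs. 2 and 3, the Python sketch below evaluates TWI and SPI for one cell. The function name and the small minimum slope used to avoid division by zero on flat cells are assumptions of this example, not part of the study; the slope is taken in degrees and converted to radians before the tangent is evaluated.

```python
import math

def twi_spi(specific_catchment_area, slope_degrees, min_slope_degrees=0.1):
    """Return (TWI, SPI) for one cell following Eqs. 2 and 3."""
    # ASSUMPTION: clamp very flat cells so that tan(beta) is never zero
    beta = math.radians(max(slope_degrees, min_slope_degrees))
    tan_beta = math.tan(beta)
    twi = math.log(specific_catchment_area / tan_beta)   # Eq. 2
    spi = specific_catchment_area * tan_beta             # Eq. 3
    return twi, spi

# Hypothetical cell: A_s = 10 m2/m on a 45 degree slope, so tan(beta) is about 1
twi, spi = twi_spi(10.0, 45.0)
print(round(twi, 3), round(spi, 3))
```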

TPI was computed in ArcGIS software by employing the equation given below:

$$\mathrm{TPI}={E}_{c}-\left(\frac{1}{{n}_{M}}\sum_{i\in M}{E}_{i}\right)$$
(4)

where \({E}_{c}\) is the elevation at the central point, \({E}_{i}\) is the elevation of the \(i\)th cell within the neighbourhood \(M\) defined by the predetermined radius (predetermined matrix length), and \({n}_{M}\) is the number of cells in that neighbourhood (Kavzoglu et al. 2015). The TPI map of the watershed was divided into five classes: (−106.7)–(−34.5), (−34.5)–(−9.28), (−9.28)–14, 14–47.48, and > 47.48 (Fig. 2l). The largest share of landslides was observed in the TPI class 14–47.48 (30.77%) (Fig. 3l).
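The TPI calculation can be sketched for a small elevation grid as follows. The square window, the exclusion of the central cell from the neighbourhood mean, and the function name are assumptions made for this illustration; the study itself computed TPI in ArcGIS, and the window it used is not stated here.

```python
def tpi(dem, row, col, radius=1):
    """TPI of one cell: its elevation minus the mean elevation of its neighbours."""
    neighbours = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            inside = 0 <= r < len(dem) and 0 <= c < len(dem[0])
            # ASSUMPTION: square window, central cell excluded from the mean
            if inside and (r, c) != (row, col):
                neighbours.append(dem[r][c])
    return dem[row][col] - sum(neighbours) / len(neighbours)

# Hypothetical 3 x 3 DEM (metres) with a small peak in the centre
dem = [
    [100.0, 100.0, 100.0],
    [100.0, 110.0, 100.0],
    [100.0, 100.0, 100.0],
]
print(tpi(dem, 1, 1))  # 10.0: the centre stands 10 m above its neighbours
```

Positive values mark cells higher than their surroundings (ridges), negative values mark cells lower than their surroundings (valleys).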

2.3.12 Rainfall (mean annual)

High-intensity rainfall is the most influential factor for landslide occurrence (Youssef 2015), and rainfall-induced landslides have been widely studied (Yano et al. 2019). The rainfall map was created using the inverse distance weighted (IDW) technique from records for the period 2005–2015 at the Tabriz, Sahand, Ajabshir, Bonab, and Maragheh stations and then grouped into five classes: 221–227, 227–230, 230–234, 234–239, and > 239 mm (Fig. 2n). The frequency assessment (Fig. 3n) shows that the largest share of landslides is observed in the > 239 mm class (46.15%).
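A minimal Python sketch of IDW interpolation is given below to show the principle. The power of 2, the function name, and the station coordinates and rainfall values are all hypothetical; the parameters actually used in the study are not reported in this section.

```python
import math

def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) stations."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value                   # exactly on a station: return its value
        weight = 1.0 / distance ** power   # ASSUMPTION: power = 2 by default
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Two hypothetical stations 10 km apart with mean annual rainfall in mm
stations = [(0.0, 0.0, 220.0), (10.0, 0.0, 240.0)]
print(idw(5.0, 0.0, stations))   # midway, so close to the mean of the two values
```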

3 Methodology

The susceptibility modelling was carried out using the ensemble \(\mathrm{FMV}\_\mathrm{IOE}\), \(\mathrm{FR}\_\mathrm{IOE}\), and \(\mathrm{IV}\_\mathrm{IOE}\) methods. The proposed methodology comprises seven main phases: (a) preparation of the spatial database; (b) selection of the conditioning factors for landslide analysis; (c) preparation of training and validation datasets; (d) development of the hybrid landslide models; (e) generation of the LSMs; (f) validation and comparison of the three models; (g) selection of the best model. The procedures of the selected techniques are shown step by step in Fig. 4.

Fig. 4 Methodology flow chart for LSM

3.1 Frequency ratio (FR)

The FR model is a bivariate statistical method that can be used to quantify the spatial association between dependent and independent variables (Termeh et al. 2018). Here it was used to quantify the spatial association between landslides and conditioning factors, and it was computed with the equation given below:

$$\mathrm{FR}=\frac{{A}_{i}/A}{{B}_{i}/B}$$
(5)

where \({A}_{i}\) is the number of landslide pixels within each class, \(A\) is the total number of landslide pixels in the study region, \({B}_{i}\) is the number of pixels in the conditioning factor class, and \(B\) is the total number of pixels in the study region. An FR value below 1 indicates a weak correlation, whereas a value above 1 indicates a strong correlation (Lee and Min 2001).
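Equation 5 can be illustrated with a small worked example in Python. The function name and the pixel counts are invented for the illustration and do not come from Table 4.

```python
def frequency_ratio(landslide_pixels, class_pixels):
    """FR of every class of one factor: (A_i / A) / (B_i / B), as in Eq. 5."""
    total_landslides = sum(landslide_pixels)   # A
    total_pixels = sum(class_pixels)           # B
    return [
        (a_i / total_landslides) / (b_i / total_pixels)
        for a_i, b_i in zip(landslide_pixels, class_pixels)
    ]

# Hypothetical factor with three classes
landslide_pixels = [2, 6, 2]        # A_i: landslide pixels per class
class_pixels = [500, 300, 200]      # B_i: all pixels per class
fr = frequency_ratio(landslide_pixels, class_pixels)
print([round(value, 2) for value in fr])  # [0.4, 2.0, 1.0]
```

In this toy case the second class holds 60% of the landslides on 30% of the area, so its FR of 2 marks it as the class most strongly associated with landslides.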

3.2 Fuzzy membership value (FMV)

Fuzzy logic takes a "grey" view of the real world rather than a strictly black-and-white one. For example, if white is represented by 1 and black by 0, then grey is a number between 0 and 1 (Abedi Gheshlaghi and Feizizadeh 2017). Various techniques have been suggested to implement fuzzy principles; one of them uses the FR. After the FR was calculated, its values were normalized using the equation given below to obtain the fuzzy membership values (FMVs).

$${\mu }_{ij}=\frac{{\mathrm{FR}}_{ij}}{{\max }_{i}\left({\mathrm{FR}}_{ij}\right)}$$
(6)

where \({\mu }_{ij}\) is the FMV of class \(i\) of parameter \(j\).
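The normalization in Eq. 6 amounts to dividing every class FR of a factor by the largest FR of that factor. A minimal Python sketch, with a hypothetical function name and invented FR values:

```python
def fuzzy_membership(fr_values):
    """FMV of every class of one factor: FR_ij / max_i(FR_ij), as in Eq. 6."""
    fr_max = max(fr_values)
    if fr_max == 0:
        # ASSUMPTION: a factor with no landslides in any class cannot be normalized
        raise ValueError("all FR values are zero")
    return [fr / fr_max for fr in fr_values]

# Hypothetical FR values of a three-class factor
fmv = fuzzy_membership([0.4, 2.0, 1.0])
print(fmv)  # [0.2, 1.0, 0.5]
```

The class with the highest FR always receives a membership of 1, and all other classes fall between 0 and 1.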

3.3 Information value (IV)

The IV is a bivariate statistical method for the spatial forecasting of an event based on the relationship between parameters and occurrences. IV has proven a highly useful model for landslide susceptibility mapping because it determines the impact of the factors governing landslide occurrence in the study region (Achour et al. 2017). A negative value of \({I}_{i}\) shows that the probability of landslide occurrence is lower than average, whereas a positive value of \({I}_{i}\) shows that it is higher than average. The IV \({I}_{i}\) for a parameter class \(i\) can be computed using the equation given below:

$${I}_{i}={\mathrm{log}}_{2}\frac{{A}_{i}/{B}_{i}}{A/B}$$
(7)

where \({I}_{i}\) is the information value, \({A}_{i}\) is the number of landslides in parameter class \(i\), \({B}_{i}\) is the area of parameter class \(i\), \(A\) is the total number of landslides, and \(B\) is the total area of the study region.
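Equation 7 can be sketched in Python as below. The function name and the counts are hypothetical, and returning negative infinity for a class without landslides is an assumption of this sketch; the section does not say how empty classes were treated.

```python
import math

def information_value(landslides_per_class, area_per_class):
    """IV of every class of one factor: log2((A_i / B_i) / (A / B)), as in Eq. 7."""
    total_landslides = sum(landslides_per_class)   # A
    total_area = sum(area_per_class)               # B
    mean_density = total_landslides / total_area
    values = []
    for a_i, b_i in zip(landslides_per_class, area_per_class):
        if a_i == 0:
            values.append(float("-inf"))   # ASSUMPTION: log of zero density left as -inf
        else:
            values.append(math.log2((a_i / b_i) / mean_density))
    return values

# Hypothetical factor with three classes (landslide counts and class areas in pixels)
iv = information_value([2, 6, 2], [500, 300, 200])
print([round(value, 3) for value in iv])
```

A class whose landslide density equals the regional mean gets an IV of 0, denser classes get positive values, and sparser classes get negative values.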

3.4 Index of entropy (IOE)

The index of entropy is a measure of the uncertainty of a system (Al-Abadi et al. 2016). Researchers such as Kornejady and Pourghasemi (2019) and Sharma et al. (2015) employed the IOE model for landslide susceptibility in various parts of the globe. IOE allows the weight of every landslide conditioning factor (\({W}_{j}\)) to be estimated using the equations given below (Bednarik et al. 2010):

$${P}_{ij}=\frac{a}{b}$$
(8)
$$({P}_{ij})=\frac{{P}_{ij}}{\sum_{i=1}^{{S}_{j}}{P}_{ij}}$$
(9)

where \(a\) is the percentage of landslide pixels in a class relative to the total landslide pixels; \(b\) is the percentage of pixels in a class relative to all pixels; and \(({P}_{ij})\) is the probability density.

$${H}_{j}=-\sum_{i=1}^{{S}_{j}}({P}_{ij})\,{\mathrm{log}}_{2}({P}_{ij}),\quad j=1,2,\dots ,n$$
(10)
$${H}_{j max}={\mathrm{log}}_{2}{S}_{j}$$
(11)

where \({H}_{j}\) and \({H}_{j max}\) are the entropy values; \({S}_{j}\) is the number of classes.

$${I}_{j}=\frac{{H}_{j max}-{H}_{j}}{{H}_{j max}},\quad I=\left(0,1\right),\quad j=1,2,\dots ,n$$
(12)
$${W}_{j}={I}_{j}{P}_{ij}$$
(13)

where \({W}_{j}\) is the weight value of the factor as a whole; and \({I}_{j}\) is the value of information coefficient.
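The chain of Eqs. 8 to 13 can be followed in the Python sketch below for a single factor. Two points are assumptions of this example rather than statements of the source: the term 0·log₂0 is taken as 0, and the factor weight is formed with the class-averaged ratio \({P}_{j}=\frac{1}{{S}_{j}}\sum_{i}{P}_{ij}\), a form found elsewhere in the IOE literature, because Eq. 13 as printed carries the class index \(i\) on its right-hand side only. The function name and input values are hypothetical.

```python
import math

def ioe_weight(p_ij):
    """Return (I_j, W_j) for one factor from its class ratios P_ij = a / b (Eq. 8)."""
    s_j = len(p_ij)                                   # number of classes (needs >= 2)
    total = sum(p_ij)
    density = [p / total for p in p_ij]               # Eq. 9: probability density
    # Eq. 10; ASSUMPTION: classes with zero density contribute nothing
    h_j = -sum(p * math.log2(p) for p in density if p > 0)
    h_max = math.log2(s_j)                            # Eq. 11
    i_j = (h_max - h_j) / h_max                       # Eq. 12: information coefficient
    p_j = total / s_j                                 # ASSUMPTION: class-averaged P_ij
    return i_j, i_j * p_j                             # Eq. 13 under that assumption

# Hypothetical class ratios of a three-class factor
i_j, w_j = ioe_weight([0.4, 2.0, 1.0])
print(round(i_j, 4), round(w_j, 4))
```

A factor whose classes all have the same ratio has maximum entropy, so its information coefficient and weight are zero; the more the landslides concentrate in a few classes, the larger the weight.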

3.5 Methods integration

To integrate the techniques, the landslide susceptibility index (LSI) is computed from the rating values of all classes of each conditioning factor, which represent the association between the classes within a parameter and landslides, and from the weight values of every parameter (Table 4). The final LSI maps were therefore created using the following equations:

$${\mathrm{LSI}}_{FMV-IOE}=\sum_{j=1}^{n}{\mathrm{FMV}}_{ij}{W}_{j}$$
(14)
$${\mathrm{LSI}}_{FR-IOE}=\sum_{j=1}^{n}{\mathrm{FR}}_{ij}{W}_{j}$$
(15)
$${\mathrm{LSI}}_{IV-IOE}=\sum_{j=1}^{n}{\mathrm{IV}}_{ij}{W}_{j}$$
(16)

where \({\mathrm{LSI}}_{FMV-IOE}\), \({\mathrm{LSI}}_{FR-IOE}\), and \({\mathrm{LSI}}_{IV-IOE}\) are the landslide susceptibility indexes; \({\mathrm{FMV}}_{ij}\), \({\mathrm{FR}}_{ij}\), and \({\mathrm{IV}}_{ij}\) are the weights of class \(i\) in factor \(j\) obtained with the respective method; \({W}_{j}\) is the weight of factor \(j\); and \(n\) is the number of factors.

LSI shows the landslide susceptibility on the basis of the number of factors (parameters), weight of the classes of every factor, and weight of every factor in the final susceptibility analysis (Fig. 5).
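For a single pixel, Eqs. 14 to 16 reduce to a weighted sum over the factors, as the Python sketch below shows. The factor weights are the IOE values reported for altitude, lithology, and slope degree in Sect. 4.1; the class ratings and the function name are hypothetical, and only three of the fourteen factors are used to keep the example short.

```python
def landslide_susceptibility_index(class_ratings, factor_weights):
    """LSI of one pixel: sum over factors of (class rating x factor weight)."""
    if len(class_ratings) != len(factor_weights):
        raise ValueError("one rating and one weight are needed per factor")
    return sum(rating * weight for rating, weight in zip(class_ratings, factor_weights))

# W_j for altitude, lithology, and slope degree as reported in Sect. 4.1
factor_weights = [1.343, 0.641, 0.321]
# Hypothetical FMV ratings of the classes this pixel falls into
class_ratings = [1.0, 0.5, 0.2]
print(round(landslide_susceptibility_index(class_ratings, factor_weights), 4))  # 1.7277
```

The same function serves all three hybrid models; only the class ratings change (FMV, FR, or IV values).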

Fig. 5 Integration of methods for LSI

3.6 Performance and validation of model

Validation is an essential step in any modelling process for understanding the significance of the model outputs (Balamurugan et al. 2016). In this research, the receiver operating characteristic (ROC) curve was employed to analyse the models' performance. The ROC curve is drawn in a two-dimensional space that relates sensitivity (the proportion of landslide pixels correctly classified as landslide) to specificity (the proportion of non-landslide pixels correctly classified as non-landslide). The area under the ROC curve (AUROC) was employed to assess the landslide models' performance. The graph depicts the false-positive rate (\(1-\mathrm{specificity}\)) on the X-axis (Eq. 17) and the true-positive rate (sensitivity) on the Y-axis (Eq. 18):

$$X=1- \mathrm{specificity}=1-\left[\frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}}\right]$$
(17)
$$Y=\mathrm{sensitivity}=\left[\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}\right]$$
(18)

where \(\mathrm{FP}\) is false-positive, \(\mathrm{TN}\) is true-negative, \(\mathrm{FN}\) is false-negative, and \(\mathrm{TP}\) is true-positive (Arabameri et al. 2019a). The AUROC ranges from 0 to 1, and its values are related qualitatively to prediction accuracy as follows: excellent (0.9–1), very good (0.8–0.9), good (0.7–0.8), moderate (0.6–0.7), and weak (0.5–0.6) (Arabameri et al. 2019b).
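To make the construction of the curve concrete, the Python sketch below builds ROC points by lowering a threshold through the sorted susceptibility scores and integrates them with the trapezoidal rule. The function names and the four sample pixels are hypothetical, and the trapezoidal rule is one common way of obtaining the area, assumed here rather than taken from the study.

```python
def roc_curve(scores, labels):
    """ROC points (false-positive rate, true-positive rate), one per distinct score."""
    positives = sum(labels)                    # landslide pixels (label 1)
    negatives = len(labels) - positives        # non-landslide pixels (label 0)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for index, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # add a point only after the last pixel of a group of tied scores
        last_of_tie = index == len(ranked) - 1 or ranked[index + 1][0] != score
        if last_of_tie:
            points.append((fp / negatives, tp / positives))   # (Eq. 17, Eq. 18)
    return points

def auroc(points):
    """Area under the ROC points by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical LSI scores of four validation pixels (1 = landslide, 0 = non-landslide)
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(auroc(roc_curve(scores, labels)))  # 0.75
```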

In addition, statistical indexes such as specificity, sensitivity, and accuracy were used to assess the performance of the ensemble techniques. Sensitivity is the proportion of landslide pixels correctly classified as landslide. Specificity is the proportion of non-landslide pixels correctly classified as non-landslide. Accuracy is the proportion of landslide and non-landslide pixels that are correctly classified (Chen et al. 2018). These indexes can be computed using the equations given below:

$$\mathrm{Sensitivity}=\frac{TP}{TP+FN}$$
(19)
$$\mathrm{Specificity}=\frac{TN}{TN+FP}$$
(20)
$$\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}$$
(21)

where TN (true-negative) and TP (true-positive) represent the number of pixels which are correctly classified and FN (false-negative) and FP (false-positive) represent the number of pixels which are incorrectly classified.
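Equations 19 to 21 translate directly into code. The Python sketch below uses a hypothetical function name and invented confusion-matrix counts.

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                 # Eq. 19
        "specificity": tn / (tn + fp),                 # Eq. 20
        "accuracy": (tp + tn) / (tp + tn + fp + fn),   # Eq. 21
    }

# Hypothetical counts: 50 landslide and 50 non-landslide pixels
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(metrics)  # {'sensitivity': 0.8, 'specificity': 0.9, 'accuracy': 0.85}
```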

4 Results

LSM is a significant step in the prevention of landslides in landslide-vulnerable areas (Abedi Gheshlaghi and Feizizadeh 2017). After the landslide models were trained, the landslide susceptibility maps were prepared in three major stages: (i) preparation of factors, (ii) generation of landslide susceptibility indexes (LSIs), and (iii) reclassification of LSIs. In the first stage, the FMV, IV, and FR techniques were employed to derive the sub-criteria weights, and the IOE technique was used to derive the criteria weights (Table 4). In the second stage, the full set of pixels was used to generate the LSIs of all pixels in the whole study region. In the third stage, the LSIs were reclassified with the natural breaks technique into five susceptibility levels: very high, high, moderate, low, and very low (Fig. 7).

4.1 Importance analysis of landslide parameters

The predictive capability of the fourteen landslide parameters, obtained with the IOE method, is presented in Fig. 6. Of these, altitude has the highest value (\({W}_{j}\) = 1.343), followed by lithology (\({W}_{j}\) = 0.641), slope degree (\({W}_{j}\) = 0.321), NDVI (\({W}_{j}\) = 0.315), rainfall (\({W}_{j}\) = 0.295), TWI (\({W}_{j}\) = 0.274), SPI (\({W}_{j}\) = 0.259), distance to road (\({W}_{j}\) = 0.210), distance to fault (\({W}_{j}\) = 0.160), slope aspect (\({W}_{j}\) = 0.159), land use (\({W}_{j}\) = 0.143), distance to river (\({W}_{j}\) = 0.081), TPI (\({W}_{j}\) = 0.061), and curvature (\({W}_{j}\) = 0.037) (Table 4).

Fig. 6 Analysis of factor importance using IOE method

The results of the multicollinearity test (Table 3) show no significant multicollinearity among the landslide conditioning factors. The lowest TOL values were computed for lithology (0.242) and for the runoff height (0.397); both are nevertheless greater than the theoretical critical value (0.10) below which collinearity is indicated. Likewise, the VIF values of all parameters are below the theoretical multicollinearity threshold (< 5.00). Therefore, all of these conditioning factors were selected as input layers for the landslide susceptibility maps, because both the IOE analysis and the multicollinearity assessment indicate that they contribute significantly to landslide occurrence in the study region.
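
The two collinearity measures are linked by the standard identity VIF = 1/TOL, so the TOL values quoted above can be cross-checked against the VIF threshold. The sketch below does this for the two TOL values given in the text; the function names are hypothetical, and the thresholds are the critical values stated above.

```python
TOL_CRITICAL = 0.10  # TOL at or below this suggests multicollinearity
VIF_CRITICAL = 5.00  # VIF at or above this suggests multicollinearity


def vif_from_tol(tol):
    """Variance inflation factor from tolerance (VIF = 1 / TOL)."""
    return 1.0 / tol


def is_acceptable(tol):
    """True when both the TOL and the implied VIF pass their thresholds."""
    return tol > TOL_CRITICAL and vif_from_tol(tol) < VIF_CRITICAL


# Lowest TOL values quoted in the text.
for factor, tol in {"lithology": 0.242, "runoff height": 0.397}.items():
    print(f"{factor}: TOL={tol:.3f}, VIF={vif_from_tol(tol):.2f}, "
          f"acceptable={is_acceptable(tol)}")
```

Because the two thresholds are not exact reciprocals (1/0.10 = 10), the VIF criterion is the stricter of the two in this sketch.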

Table 3 Multicollinearity assessment for conditioning factors

4.2 Integration of the FMV and IOE methods

The relationships between landslides and each landslide-related parameter are summarized in Table 4. Higher FMV values indicate a greater likelihood of landslide occurrence.

Table 4 Spatial correlations between each landslide related parameter and landslides by the FR, the FMV, and the IOE methods

The obtained FMV values were used as inputs to the IOE method. The resulting LSI values ranged from 0.217 to 3.961. Finally, the LSM obtained from the \(\mathrm{FMV}\_\mathrm{IOE}\) method was categorized into five levels of landslide susceptibility: very high (2.597–3.961), high (1.648–2.597), moderate (1.272–1.648), low (0.918–1.272), and very low (0.217–0.918) (Fig. 7a).
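
Once the class breaks are known, the reclassification step reduces to a lookup. The sketch below applies the \(\mathrm{FMV}\_\mathrm{IOE}\) breaks quoted above to a few hypothetical LSI values. It takes the breaks as given rather than deriving them with the natural break technique, and the assignment of a value lying exactly on a break to the lower class is an assumption.

```python
from bisect import bisect_left

# Upper bounds of the four lower classes for the FMV_IOE map (from the text).
BREAKS = [0.918, 1.272, 1.648, 2.597]
LEVELS = ["very low", "low", "moderate", "high", "very high"]


def classify_lsi(values):
    """Map LSI values to susceptibility levels using fixed class breaks."""
    # bisect_left puts a value equal to a break into the lower class.
    return [LEVELS[bisect_left(BREAKS, v)] for v in values]


sample = [0.217, 1.0, 1.5, 2.0, 3.961]  # hypothetical pixel values
print(classify_lsi(sample))
```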

Fig. 7 LSMs using: a \(\mathrm{FMV}\_\mathrm{IOE}\); b \(\mathrm{FR}\_\mathrm{IOE}\); c \(\mathrm{IV}\_\mathrm{IOE}\)

4.3 Integration of the FR and IOE methods

The correlations between landslides and each parameter obtained with the FR method are summarized in Table 4. Overall, larger FR values indicate a greater likelihood of landslide occurrence.

The FR values were likewise used as inputs to the IOE method. The resulting LSI values ranged from 0.472 to 33.314. Finally, the LSM obtained from the \(\mathrm{FR}\_\mathrm{IOE}\) method was divided into five levels of landslide susceptibility: very high (12.565–33.314), high (7.791–12.565), moderate (4.413–7.791), low (2.399–4.413), and very low (0.472–2.399) (Fig. 7b).

4.4 Integration of the IV and IOE methods

The relationships between each landslide-related factor and landslides are summarized in Table 4. Larger IV values indicate a greater likelihood of landslide occurrence.

The IV values were also used as inputs to the IOE method. The resulting LSI values ranged from −1.674 to 2.847. Finally, the LSM obtained from the \(\mathrm{IV}\_\mathrm{IOE}\) method divided the study region into five levels of landslide susceptibility: very high (1.543 to 2.847), high (0.441 to 1.543), moderate (−0.132 to 0.441), low (−0.617 to −0.132), and very low (−1.674 to −0.617) (Fig. 7c).

4.5 Percentage and density of susceptibility levels

Figure 8 presents the percentage of the study region assigned to each landslide susceptibility level by every model. According to the \(\mathrm{FMV}\_\mathrm{IOE}\) results, 1.86% of the region fell in the very high susceptibility level, 13.98% in the high level, 28.22% in the moderate level, 29.13% in the low level, and 26.81% in the very low level. For the \(\mathrm{FR}\_\mathrm{IOE}\) ensemble, the low, very low, moderate, high, and very high levels accounted for 31.6, 31.42, 32.86, 2.32, and 1.8% of the region, respectively. For the \(\mathrm{IV}\_\mathrm{IOE}\) ensemble, 41.1, 20.24, 25.98, 10.89, and 1.79% of the region were assigned to the low, very low, moderate, high, and very high susceptibility levels, respectively.

Fig. 8 Distribution of the different susceptibility levels for a \(\mathrm{FMV}\_\mathrm{IOE}\), b \(\mathrm{FR}\_\mathrm{IOE}\), and c \(\mathrm{IV}\_\mathrm{IOE}\) methods

To assess the performance of the three LSMs, the landslide density (LD) was also computed (Table 5). For each susceptibility level, LD is the ratio of the percentage of landslide pixels to the percentage of the pixels belonging to that level (Pham et al. 2016). In a reliable LSM, the levels of highest susceptibility should show the highest LD.
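
The LD ratio can be sketched in a few lines of Python. The area shares below are the \(\mathrm{FMV}\_\mathrm{IOE}\) percentages reported in the text, whereas the landslide shares are hypothetical values introduced only to illustrate the calculation. They are not the entries of Table 5, and the function name is likewise an assumption.

```python
def landslide_density(landslide_pct, area_pct):
    """LD per level = % of landslide pixels / % of level pixels."""
    return {level: landslide_pct[level] / area_pct[level] for level in area_pct}


# Share of the study region in each level (FMV_IOE map, from the text).
area_pct = {"very low": 26.81, "low": 29.13, "moderate": 28.22,
            "high": 13.98, "very high": 1.86}
# Hypothetical share of landslide pixels in each level (illustration only).
landslide_pct = {"very low": 0.0, "low": 5.0, "moderate": 15.0,
                 "high": 50.0, "very high": 30.0}

ld = landslide_density(landslide_pct, area_pct)
for level, value in ld.items():
    print(f"{level}: LD = {value:.2f}")
```

A small class that contains many landslides yields a large LD, which is why a very high level covering under 2% of the region can dominate the table.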

Table 5 Landslide density within the LSMs

The LD values ranged from 0 to 17.88 across the levels (Table 5). In each map, the very low susceptibility level had the lowest LD, and LD increased through the low, moderate, and high levels to the very high level. The \(\mathrm{FMV}\_\mathrm{IOE}\) technique had the highest LD for the very high susceptibility level and thus performed better than the other two techniques.

Figure 8 shows the density of the LSI maps in each landslide susceptibility level. In the \(\mathrm{FMV}\_\mathrm{IOE}\) model, the low, very low, moderate, high, and very high susceptibility levels were densest in the northwest, central, southeast, eastern, and eastern regions, respectively. In the \(\mathrm{FR}\_\mathrm{IOE}\) model, the same levels were densest in the western, central, northeast, southern, and eastern regions, respectively. In the \(\mathrm{IV}\_\mathrm{IOE}\) model, they were densest in the northwest, southwest, southeast, eastern, and eastern regions, respectively (Abedi Gheshlaghi et al. 2020a, b).

4.6 Integrated techniques assessment

Tables 6 and 7 show the overall assessment of the landslide susceptibility methods using ROC curves with the training and validation datasets.

Table 6 AUROC analysis for the ensemble methods with the training dataset
Table 7 AUROC analysis for the ensemble methods with the validation dataset

The AUROC values of the integrated methods were statistically significant, with standard errors of up to 0.018 (less than 0.05). For ease of interpretation, the ROC curves of the methods were overlaid on a single graph (Fig. 9a and b). In the training phase, the AUROC values of \(\mathrm{FMV}\_\mathrm{IOE}\) and \(\mathrm{FR}\_\mathrm{IOE}\) were very similar (0.930 and 0.929, respectively), whereas the AUROC value of \(\mathrm{IV}\_\mathrm{IOE}\) (0.926) was lower than those of the other two models. Validation of the three maps likewise showed that \(\mathrm{FMV}\_\mathrm{IOE}\) and \(\mathrm{FR}\_\mathrm{IOE}\), with AUROC values close to each other (0.927 and 0.925, respectively), had a better degree of fit, while the AUROC value of \(\mathrm{IV}\_\mathrm{IOE}\) (0.918) was again lower than those of the other two models. The \(\mathrm{IV}\_\mathrm{IOE}\) model therefore had the lowest AUROC in both the training and validation phases. The \(\mathrm{FMV}\_\mathrm{IOE}\) and \(\mathrm{FR}\_\mathrm{IOE}\) models were very similar, but \(\mathrm{FMV}\_\mathrm{IOE}\) was slightly more efficient in modelling with the training dataset. Therefore, \(\mathrm{FMV}\_\mathrm{IOE}\) can be used as a promising technique to develop the landslide susceptibility map of the study region.
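
For readers who want to reproduce an AUROC from raw scores, the sketch below uses the standard equivalence between the area under the ROC curve and the probability that a randomly chosen landslide location receives a higher LSI than a randomly chosen non-landslide location. The software used in the study is not specified here, so this is only one possible way to compute the quantity. The function name and the LSI values are hypothetical.

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the share of (landslide, non-landslide) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical LSI values at landslide and non-landslide locations.
landslide_lsi = [3.1, 2.8, 2.2, 1.9]
non_landslide_lsi = [2.0, 1.4, 1.1, 0.6]
print(auroc(landslide_lsi, non_landslide_lsi))  # 15 of 16 pairs ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a few dozen landslide locations. A rank-based formulation would be preferable for pixel-level evaluation.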

Fig. 9 a ROC curve and AUROCs of the training dataset and b ROC curve and AUROCs of the validation dataset

Statistical indexes were also used to further evaluate the three models on the training and validation datasets (Tables 8 and 9).

Table 8 The performance of techniques using the training dataset
Table 9 Methods performance using the validation dataset

The performance of the landslide techniques on the training dataset, measured with the statistical indexes, is shown in Table 8. The \(\mathrm{FMV}\_\mathrm{IOE}\) method performed best in classifying landslide pixels (sensitivity = 96.2%), followed by the \(\mathrm{FR}\_\mathrm{IOE}\) method (sensitivity = 94.2%) and the \(\mathrm{IV}\_\mathrm{IOE}\) method (sensitivity = 90.4%). It also performed best in classifying non-landslide pixels (specificity = 90.4%), followed by the \(\mathrm{FR}\_\mathrm{IOE}\) method (specificity = 88.5%) and the \(\mathrm{IV}\_\mathrm{IOE}\) method (specificity = 86.5%). The highest accuracy was obtained by the \(\mathrm{FMV}\_\mathrm{IOE}\) method (93.3%), followed by the \(\mathrm{FR}\_\mathrm{IOE}\) method (91.3%) and the \(\mathrm{IV}\_\mathrm{IOE}\) method (88.5%).

The corresponding results for the validation dataset are given in Table 9. The \(\mathrm{FMV}\_\mathrm{IOE}\) method again performed best in classifying landslide pixels (sensitivity = 95.7%), followed by the \(\mathrm{FR}\_\mathrm{IOE}\) method (sensitivity = 91.3%) and the \(\mathrm{IV}\_\mathrm{IOE}\) method (sensitivity = 87%). In classifying non-landslide pixels, the \(\mathrm{FMV}\_\mathrm{IOE}\) and \(\mathrm{FR}\_\mathrm{IOE}\) methods reached the same specificity (87%), ahead of the \(\mathrm{IV}\_\mathrm{IOE}\) method (specificity = 82.6%). The \(\mathrm{FMV}\_\mathrm{IOE}\) method had the highest accuracy (91.3%), followed by the \(\mathrm{FR}\_\mathrm{IOE}\) method (89.1%) and the \(\mathrm{IV}\_\mathrm{IOE}\) method (84.8%). Overall, all three landslide ensemble techniques are appropriate for LSM in the ACB, and the \(\mathrm{FMV}\_\mathrm{IOE}\) technique shows the most stable and best performance.

5 Discussion

LSM is generally considered the first stage in landslide hazard mitigation. Hence, the preparation of a high-precision LSM can be useful in hazard management. Several approaches have been developed for LSM with the aim of finding the best method for a given area, so it remains necessary to investigate novel techniques for LSA (Abedi Gheshlaghi and Feizizadeh 2017). In contrast to conventional approaches such as the FMV, FR, IV, and IOE methods, which are well established for solving many real-world problems, researchers around the world have recently designed various models that apply integrated techniques to a range of scientific topics (Abedi Gheshlaghi and Valizadeh Kamran 2018; Ferrari et al. 2018; Hong and Lee 2019; Wagner and Fohrer 2019; Abedi Gheshlaghi et al. 2020a, b; Feizizadeh et al. 2020). Ensemble models are promising techniques for solving complex problems. Our study compares the performance of three ensemble models (FMV_IOE, FR_IOE, and IV_IOE) in identifying landslide‐prone areas. The most important contribution of the present study is the simultaneous use of a fuzzy system and bivariate statistics, and the integration of their results for LSA.

Because each zoning approach has its own limitations, the models were combined and integrated to improve their performance. Among the individual techniques, the FMV method had a higher predictive accuracy than the others, as already pointed out by previous studies (Sahana and Sajjad 2017; Ozdemir 2020). Combining the individual techniques increased their accuracy, which is consistent with previous research (Nguyen et al. 2019; Chen and Li 2020; Pham et al. 2020). Among the ensemble models, the combination of the fuzzy system and bivariate statistics approaches was more efficient than the other combinations, which agrees with previous studies (Hong et al. 2017).

A fundamental step in any LSM process is identifying the most significant factors in landslide assessment (Abedi Gheshlaghi and Feizizadeh 2017). To this end, we selected 14 conditioning factors (altitude, slope aspect, slope degree, lithology, distance to fault, curvature, land use, distance to river, TPI, TWI, SPI, NDVI, distance to road, and rainfall) for modelling. The IOE technique was used to test the contribution of these parameters to the landslide methods, because it is effective at identifying parameters with high predictive capability. Of these 14 factors, altitude and lithology contributed most to the models, whereas curvature and TPI contributed least. These results are in line with previous studies (Ercanoglu and Temiz 2011; Du et al. 2017), especially those conducted in Iranian environments (Devkota et al. 2013; Jaafari et al. 2014).

The predictive power of the integrated methods was assessed graphically using the ROC curve (Fig. 9) and with statistical measures (Tables 8 and 9). Among the integrated methods, the highest AUROC and statistical measure values were obtained by \(\mathrm{FMV}\_\mathrm{IOE}\) (AUROC = 0.927, sensitivity = 95.7%, specificity = 87%, accuracy = 91.3%). Although no landslides are present in the very low susceptibility level in any of the three maps, the LD in the very high and high levels is larger for the \(\mathrm{FMV}\_\mathrm{IOE}\) map (19.78) than for the \(\mathrm{FR}\_\mathrm{IOE}\) (19.35) and \(\mathrm{IV}\_\mathrm{IOE}\) (17.98) maps (Table 5). This indicates that the \(\mathrm{FMV}\_\mathrm{IOE}\) ensemble model performed better than the other two models.

In general, all three landslide ensemble methods performed well for LSM, with the \(\mathrm{FMV}\_\mathrm{IOE}\) model performing best. To emphasize the relevance of the results obtained in the present study, it is important to note that in the literature dealing with LSA, only a small number of analyses report AUROC values higher than 0.9 (e.g. Abedini et al. 2018; Pham et al. 2019).

The ensemble models proposed here improved the accuracy of LSM by 20% compared to previous studies that used individual techniques for the study area (Rajabi et al. 2016). Therefore, this study constitutes a step forward in the accurate prediction of natural hazards by suggesting the use of ensemble models.

6 Conclusions

To alleviate the devastating impacts of landslides, modelling and creating precise LSMs are essential. This research proposed three hybrid intelligent approaches (\(\mathrm{FMV}\_\mathrm{IOE}\), \(\mathrm{FR}\_\mathrm{IOE}\), and \(\mathrm{IV}\_\mathrm{IOE}\)) for mapping landslide susceptibility and applied them to the ACB, East Azerbaijan province of Iran, as a case study. The models were developed using 52 landslide locations (training data), validated with 23 landslide locations (validation data), and built on fourteen parameters: altitude, curvature, slope aspect, slope degree, lithology, NDVI, land use, distance to faults, distance to rivers, distance to roads, SPI, TWI, TPI, and rainfall. Investigation of the spatial impact of each parameter on landslide occurrence showed altitude, lithology, slope degree, and NDVI to be the most effective factors.

The comparison indicated that although all three methods were applicable for landslide modelling, the \(\mathrm{FMV}\_\mathrm{IOE}\) method achieved the highest prediction accuracy. Therefore, \(\mathrm{FMV}\_\mathrm{IOE}\) can be used as a promising method to create the landslide susceptibility map in the ACB. We propose that this evolutionary approach (\(\mathrm{FMV}\_\mathrm{IOE}\)) can be employed for other areas with homogeneous (similar) conditioning factors. The generated susceptibility maps can help governments, planners, and managers improve slope management and land use planning, in order to manage hazard-related risks and mitigate further damage.

In conclusion, integrated models are suggested for landslide mapping because of their more stable results, higher prediction accuracy, and greater generalization ability. Nevertheless, the literature on integrated approaches in landslide mapping is limited, and the development of other integrated frameworks is therefore strongly recommended. The proposed approach is a promising tool that can be applied to other types of natural hazard modelling, such as wildfire, gully erosion, land subsidence, and flood. A more accurate susceptibility map can help decrease the damage and cost from natural hazards.