1 Introduction

Landslides are a widespread and frequently occurring natural hazard that causes serious damage to lives and property, especially in mountainous terrain. Increased awareness of the socioeconomic aftermath of landslides and the need for urbanisation in mountainous terrain have necessitated the incorporation of preliminary slope stability evaluation schemes, such as landslide susceptibility mapping, into development and safeguard practices in landslide-prone terrain. Landslide susceptibility mapping is the process of classifying the land surface into categories of relative stability, based on the estimated significance of landslide causative factors. The relative classification of slopes in terms of stability helps engineers and planners to adopt suitable environmental regeneration measures and to plan sustainable development schemes, such as the construction of roads, buildings and other infrastructure in landslide-prone terrain, more efficiently and economically (Anbalagan 1992; Feizizadeh et al. 2013a, b).

Landslide susceptibility mapping techniques can be broadly classified into two categories: qualitative and quantitative approaches. The qualitative approaches are inventory-based and knowledge-driven methods such as distribution models (Wright and Nilsen 1974), geomorphic mapping (Humbert 1977), and map integration models (Brabb 1991). The quantitative approaches, on the other hand, are data-driven methods and physically based models. They include probabilistic models (Lan et al. 2004; Lee et al. 2002), deterministic models (An et al. 2018; Gokceoglu and Aksoy 1996; Vieira et al. 2010) and statistical models (Clerici et al. 2002; Fabbri et al. 2003; Feizizadeh and Blaschke 2013, 2014). Among the statistical models, bivariate statistical models such as fuzzy logic (Champatiray et al. 2007; Ercanoglu and Gokceoglu 2004; Feizizadeh et al. 2013b; Juang et al. 1992; Kanungo et al. 2009) and multivariate statistical models such as neural networks (Gómez and Kavzoglu 2005; Kanungo et al. 2005; Pradhan et al. 2010; Yilmaz 2009) are the most popular and widely applied approaches. However, landslide susceptibility mapping is a nonlinear, multicriteria modelling problem that is sensitive to the uncertainties or imprecision associated with decision-making and to noise in the input data; thus, it may deliver unreliable results if modelled with linear or highly data-sensitive statistical models (Feizizadeh and Blaschke 2011). Therefore, researchers such as Feizizadeh et al. (2013b, 2014), Hudson (1990) and Zhu et al. (2014) emphasise the need for multicriteria decision analysis (MCDA) approaches, formulated either on expert knowledge or by combining information derived from different sources under fuzzy logic, as the better-suited approach for GIS-based multicriteria modelling such as landslide susceptibility mapping.

Fuzzy logic, or the fuzzy set procedure (FSP), a bivariate, nonlinear statistical learning theory, was introduced by Zadeh (1965) for formal modelling of systems with missing or vague input information. Later, Mamdani and Assilian (1975) and Takagi and Sugeno (1985) advanced the fuzzy expert system (FES), which allows expert knowledge to be incorporated into basic fuzzy logic theory; their formulations are known as Mamdani fuzzy logic and the TSK model, respectively. Mamdani fuzzy logic, popularly known as the Mamdani fuzzy inference system (Mamdani-FIS), is perhaps the most preferred fuzzy method to formulate an FES for solving complex problems in Engineering Geology (Grima 2000). In spite of its huge popularity, the Mamdani-FIS has not been widely applied to landslide susceptibility mapping, and a comparative analysis of FES models and conventional FSP has not been encountered in the landslide literature. A few FES models for landslide susceptibility mapping based on Mamdani-FIS are available in the landslide literature: Akgun et al. (2012), Saboya et al. (2006) and Zhu et al. (2004, 2014).

Zhu et al. (2004, 2014) developed an LSM model based on expert knowledge and Mamdani-FIS. The membership function (MF) structure of the model is constituted by a Bell waveform function for each causative factor; the categories of causative factors and their significance are determined empirically. Saboya et al. (2006) assessed the failure potential of slopes by ranking the causative factors based on a questionnaire answered by experts. The ranks were then fuzzified under Mamdani-FIS using a similarity concept to determine the failure potential of slopes. Akgun et al. (2012) developed a model called MamLand for landslide susceptibility mapping. Unlike earlier models, MamLand introduced overlapping membership sets (MSs) within an MF and fuzzy combination rules to fuzzy logic-based landslide susceptibility mapping. The published FES models face major shortcomings, such as the subjectivity associated with the classification and significance estimation of causative factor–landslide relations, the inability of the FIS structure to represent the physical condition of causative factors, and a low grade of interpretability and portability.

In view of these shortcomings, an effort has been made to put forth an improved FES model that attenuates the deficiencies of earlier FES models and is more effective than conventional FSP for GIS-fuzzy-based landslide susceptibility mapping. The improved FES model presented in this paper is a fusion of Mamdani-FIS and the frequency ratio (FR) method based on a strategy termed ‘mean and neighbour’. The ‘mean and neighbour’ strategy governs the construction of the input (fuzzifier) and output (defuzzifier) MF structure of Mamdani-FIS, whereas the FR method plays a part in the formulation of the fuzzy if–then rules. Under the strategy, two FIS structures, Bell–Gaussian (BG) and Triangular–Trapezoidal (TT), are proposed for the model based on the MFs’ shape miscibility and dynamics. The improved FES (BG and TT separately), along with an existing FES model, MamLand, and conventional FSP, has been applied for GIS-based landslide susceptibility mapping in and around Mussoorie Township, Uttarakhand, India, on the mesoscale (1:15,000). The produced LSMs have been validated and compared in terms of the spatial distribution of susceptibility zones and by statistical analysis using the receiver operating characteristic (ROC) and the FR method, with the help of the landslide inventory layer of the study area.

2 Methodology

In this study, landslide susceptibility analysis has been carried out by using three models: (1) FSP, (2) MamLand and (3) the proposed model—improved FES. In FSP analysis, the cosine amplitude method has been used to determine the membership value of categories of causative factors, whereas the MamLand and improved FES models are based on Mamdani-FIS.

2.1 Fuzzy set procedure

The FSP, or simply fuzzy logic, was introduced by Zadeh (1965) for formal modelling of real-world systems with ambiguous, vague or missing input information to arrive at a definite conclusion. Unlike traditional formal modelling tools such as Boolean algebra or classical set theory, the computing mechanism of fuzzy logic is non-dichotomous in character. Fuzzy logic considers spatial objects on a map as members of a set, and the possibility or belongingness of each member to the set is expressed as a membership value.

If X is the universe of discourse and elements of X are denoted by x, then a fuzzy set A in X can be defined as a set of ordered pairs.

$$ A = \left\{ {\left. {x,\mu_{A} (x)} \right|x \in X} \right\} $$
(1)

where \( \mu_{A} (x) \) is the membership value of x in A, in the range [0, 1], with 0 representing non-membership and 1 representing full membership. The membership value can be determined by using the analytic hierarchy process (AHP) or the cosine amplitude method, or can be user-defined (Vakhshoori and Zare 2016). Given two or more maps with fuzzy membership functions for the same set, various fuzzy operators can be used to combine the membership values (Zimmermann 1996). The fuzzy operators are Fuzzy AND, Fuzzy OR, Fuzzy Product, Fuzzy Sum and Fuzzy Gamma (γ).

Let \( \mu_{i} (x),\,i = 1,2, \ldots ,n \) be the membership values of a fuzzy system having n variables; the combination of maps for a fuzzy set using the different fuzzy operators can then be written as:

Fuzzy AND

$$ \mu_{\text{AND}} (x) = {\text{MIN}}\left[ {\mu_{1} (x),\mu_{2} (x), \ldots ,\mu_{n} (x)} \right] $$
(2)

Fuzzy OR

$$ \mu_{\text{OR}} (x) = {\text{MAX}}\left[ {\mu_{1} (x),\mu_{2} (x), \ldots ,\mu_{n} (x)} \right] $$
(3)

Fuzzy Sum

$$ \mu_{{\text{Sum}}} (x) = 1 - \prod\limits_{i = 1}^{n} {(1 - \mu_{i} (x))} $$
(4)

Fuzzy Product

$$ \mu_{{\text{Product}}} (x) = \prod\limits_{i = 1}^{n} {\mu_{i} (x)} $$
(5)

Fuzzy Gamma (γ)

$$ \mu_{\gamma } (x) = [\mu_{{\text{Sum}}} (x)]^{\gamma } \times [\mu_{{\text{Product}}} (x)]^{1 - \gamma } $$
(6)

When the AND or OR operator is used, only one membership value contributes to the result. The Sum operator makes the resultant fuzzy set larger than or equal to the maximum value among all fuzzy sets, whereas the Product operator makes it smaller than or equal to the minimum value. The resultant set integrated with the Gamma operator takes a value between those of the Sum and Product operators. The value of γ is closely associated with the degree of compensation between the extreme confidence levels.
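The behaviour of the five operators can be checked on a single pixel. The following is a minimal sketch in plain Python, not the ArcGIS fuzzy overlay implementation used in the study; the function names and the three membership values are hypothetical and chosen only for illustration.

```python
from math import prod

def fuzzy_and(mu):
    """Fuzzy AND (Eq. 2): the minimum membership value."""
    return min(mu)

def fuzzy_or(mu):
    """Fuzzy OR (Eq. 3): the maximum membership value."""
    return max(mu)

def fuzzy_sum(mu):
    """Fuzzy algebraic Sum (Eq. 4): 1 minus the product of complements."""
    return 1.0 - prod(1.0 - m for m in mu)

def fuzzy_product(mu):
    """Fuzzy algebraic Product (Eq. 5): the product of all memberships."""
    return prod(mu)

def fuzzy_gamma(mu, gamma):
    """Fuzzy Gamma (Eq. 6): compromise between Sum and Product."""
    return fuzzy_sum(mu) ** gamma * fuzzy_product(mu) ** (1.0 - gamma)

# Hypothetical membership values of one pixel in three causative-factor maps
mu = [0.2, 0.5, 0.8]

for g in (0.1, 0.5, 0.9):
    print(g, round(fuzzy_gamma(mu, g), 4))
```

For these values the Product (0.08) lies below the minimum membership and the Sum (0.92) above the maximum, and every Gamma result falls between the two, moving towards the Sum as γ increases.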

2.2 Cosine amplitude method

Cosine amplitude method is one of the most commonly preferred similarity concept methods to determine the pairwise relationship. In the context of landslide susceptibility mapping, cosine amplitude method can be used to determine the correspondence between categories of a causative factor and landslides (Kanungo et al. 2006). The correspondence between categories and landslides is expressed as the strength of relationship (rij).

Suppose n is the number of categories of a causative factor, represented as an array \( X = \{ x_{1} ,x_{2} , \ldots ,x_{n} \} \), where each element \( x_{i} \) is a vector of p pixel values, \( x_{i} = \{ x_{i1} ,x_{i2} , \ldots ,x_{ip} \} \).

The strength of relation, \( r_{ij} \), results from the direct comparison of a causative factor category i with the occurred landslides j, represented by \( x_{i} \) and \( x_{j} \) containing elements \( x_{ik} \) and \( x_{jk} \), respectively. The value of \( r_{ij} \) (the membership value) ranges from 0 to 1, where a value close to 0 indicates weak correspondence and a value close to 1 indicates strong correspondence.

$$ r_{ij} = \frac{{\sum\nolimits_{k = 1}^{p} {x_{ik} x_{jk} } }}{{\sqrt {\left( {\sum\nolimits_{k = 1}^{p} {x_{ik}^{2} } } \right)\left( {\sum\nolimits_{k = 1}^{p} {x_{jk}^{2} } } \right)} }} $$
(7)

The strength of relation (rij) can be defined as the ratio of total number of landslide pixels in the category to the square root of the multiplication of total number of pixels in that category and the total number of landslide pixels in the area.
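A small numerical check may help relate Eq. 7 to the pixel-count interpretation above. The sketch below assumes that both the category layer and the landslide inventory are binary (1 = pixel belongs to the category or to a landslide, 0 = otherwise); the eight-pixel rasters are hypothetical.

```python
from math import sqrt

def cosine_amplitude(x_i, x_j):
    """Strength of relation r_ij (Eq. 7) between two equally long pixel vectors."""
    numerator = sum(a * b for a, b in zip(x_i, x_j))
    denominator = sqrt(sum(a * a for a in x_i) * sum(b * b for b in x_j))
    # An empty category or an empty inventory carries no correspondence.
    return numerator / denominator if denominator else 0.0

# Hypothetical binary rasters flattened to vectors of p = 8 pixels
category = [1, 1, 1, 1, 0, 0, 0, 0]    # 4 pixels belong to the category
landslide = [0, 1, 1, 0, 1, 0, 0, 0]   # 3 landslide pixels, 2 inside the category

r_ij = cosine_amplitude(category, landslide)

# Pixel-count form: landslide pixels in the category divided by
# sqrt(pixels in the category * total landslide pixels)
r_counts = 2 / sqrt(4 * 3)
print(round(r_ij, 4), round(r_counts, 4))
```

With binary inputs the sums of squares reduce to pixel counts, which is why the two forms agree.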

2.3 Mamdani-FIS

Mamdani-FIS is an FES that combines expert knowledge and fuzzy set theory for efficient formal modelling. The basic idea of Mamdani-FIS is to use rules in linguistic form instead of explicitly defined algorithm to control the fuzzy inference process. The Mamdani-FIS has four units, namely fuzzifier, rule base, fuzzy inference mechanism, and defuzzifier.

Fuzzifier converts the crisp input to fuzzy input with membership value to a set by using MF of a particular shape. The most commonly used MFs are Triangular, Trapezoidal, Gaussian and Bell (Russell and Campbell 1996). The rule base is the fuzzy logic way to incorporate knowledge in natural language form to link input variables (antecedents) to output variable (consequent).

A continuous fuzzy system with non-interaction inputs x1 and x2 (antecedents) and a single output y (consequent) is described by the collection of r linguistic ‘if–then fuzzy rules’.

$$ {\text{If}}\,x_{1} \,{\text{is}}\,A_{1}^{k} \,{\text{and}}\,x_{2} \,{\text{is}}\,A_{2}^{k} ,\;{\text{THEN}}\;y\,{\text{is}}\,B^{k} \;\;{\text{for}}\;\;k = 1,2, \ldots ,r $$

where \( A_{1}^{k} \) and \( A_{2}^{k} \) are the fuzzy sets representing the kth antecedent pair and \( B^{k} \) is the fuzzy set representing the kth consequent.

The inference mechanism establishes an optimum logical connection between the input and output by using the fuzzy operators. Defuzzifier converts the resultant MF to a crisp value. There are various defuzzification methods available in the literature, such as fuzzy mean, adaptive integration, centre of area (COA), but the centre of gravity (COG) is the most commonly used defuzzification method.

COG can be defined as:

$$ y^{*} = \frac{{\int {y\mu (y)dy} }}{{\int {\mu (y)dy} }} $$
(8)

where μ(y) indicates output membership value after fuzzy implication.
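In practice the integrals of Eq. 8 are evaluated numerically. The following is a minimal sketch that approximates COG on a uniformly sampled output axis; the sampling step, the triangular output set and the clipping level of 0.4 (mimicking a rule fired at strength 0.4 under min implication) are all assumptions made for illustration.

```python
def centre_of_gravity(ys, mus):
    """Discrete approximation of Eq. 8 on uniformly spaced samples ys."""
    total = sum(mus)
    if total == 0:
        raise ValueError("output membership is zero everywhere")
    return sum(y * m for y, m in zip(ys, mus)) / total

# Output axis sampled on [0, 1] in steps of 0.01 (hypothetical susceptibility scale)
ys = [i / 100 for i in range(101)]

# Symmetric triangular output set peaked at 0.5, clipped at a firing strength of 0.4
mus = [min(0.4, max(0.0, 1.0 - abs(2.0 * y - 1.0))) for y in ys]

y_star = centre_of_gravity(ys, mus)
print(round(y_star, 3))
```

Because the clipped set is symmetric about 0.5, the crisp output stays at the centre of the set regardless of the clipping level; an asymmetric aggregate of several fired rules would shift it.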

2.4 MamLand

MamLand was introduced as an FES model for landslide susceptibility prediction based on Mamdani-FIS. The FES has three units: input MFs (fuzzifier), output MF (defuzzifier), and fuzzy combination rules (if–then rules). The structure of the MFs (both input and output) and the fuzzy combination rules are formulated based on expert opinion. The input variables are of two types: categorical (e.g. lithology) and numerical (e.g. slope gradient). The MF of a categorical input variable consists of non-overlapping triangular MSs, while the MF of a numerical variable consists of two triangular MSs overlapping each other. The allocation of fuzzy combination rules also depends on the type of input variable. The criteria for establishing fuzzy combination rules according to Akgun et al. (2012) are given below:

  a. The categorical input variable can be classified into three classes (high, moderate and low) based on landslide density, whereas the numerical input variables can be classified into two classes, positive and negative, with respect to landslide occurrence.

  b. If a fuzzy combination rule includes the high class of the categorical variable in its input, the output is accepted as high or very high. Likewise, if the categorical input class is low, the output is accepted as very low.

  c. If the input class of the categorical variable is moderate, the dominating class of the numerical variables can be considered for determining the output.

2.5 Proposed model-improved FES

An FES model should represent the physical condition of the causative factors, be interpretable, portable and objective and, most importantly, be capable of accommodating the real-world fuzziness that exists between categories of causative factors. For example, suppose the slope gradient has been classified into five classes, 0°–15°, 16°–25°, 26°–35°, 36°–45° and > 45°, and the corresponding landslide susceptibility is assumed to be very low, low, moderate, high and very high, respectively. The moderately susceptible class then spans 26°–35°, but in real-world conditions slope gradients of 23°, 24°, 25°, 36° or 37° may also be moderately susceptible to landslides.

Owing to these requirements, a fusion of Mamdani-FIS and the FR method, rooted in a strategy termed ‘mean and neighbour’, has been developed to put forth an improved FES model for GIS-based landslide susceptibility mapping. The improved FES has two main segments: first, fabrication of the FIS structure based on the ‘mean and neighbour’ strategy according to the classification of causative factors; second, significance estimation of the categories of causative factors by the FR method to furnish the fuzzy if–then rules required for the inference. The ‘mean and neighbour’ strategy for construction of the input structure (fuzzifier) of the FIS is explained below:

  a. In the case of numerical causative factors, if a particular category has only one neighbouring category, the maximum membership value (i.e. 1) is reserved for the end or starting point, and the membership value gradually reduces to the minimum value (i.e. 0) at the mean point of the neighbouring category. Such end categories can be represented with a Trapezoidal or Bell waveform function.

  b. If a particular category has neighbouring categories on both sides, the maximum membership value (i.e. 1) is reserved for the mean value of that category, and the membership value gradually reduces to the minimum value (i.e. 0) at the mean points of the neighbouring categories on both sides. Such categories can be represented by a Gaussian or Triangular waveform function.

  c. For categorical causative factors such as lithology, it is non-essential and impractical to consider fuzziness between categories; thus, a non-overlapping MF of Gaussian or Triangular waveform function may be preferred to represent each category.
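The construction for a numerical causative factor can be sketched for the TT (Trapezoidal–Triangular) case as follows. This is an illustrative reading of the strategy, not the authors' Matlab implementation: the class boundaries are hypothetical slope-gradient breaks, and the plateau of each end category is assumed to run from the outer boundary to that category's own mean.

```python
def triangular(x, a, b, c):
    """Triangular MS with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal MS with feet at a, d and plateau on [b, c]; a == b or c == d gives a shoulder."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def mean_and_neighbour_tt(bounds):
    """Build one MS per class from the sorted class boundaries."""
    means = [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]
    last = len(means) - 1
    sets = []
    for i, m in enumerate(means):
        if i == 0:        # end category: 1 from the start, 0 at the neighbour's mean
            sets.append(lambda x, m=m, nxt=means[1]:
                        trapezoidal(x, bounds[0], bounds[0], m, nxt))
        elif i == last:   # end category: 0 at the neighbour's mean, 1 up to the end
            sets.append(lambda x, m=m, prv=means[last - 1]:
                        trapezoidal(x, prv, m, bounds[-1], bounds[-1]))
        else:             # interior category: 1 at its mean, 0 at both neighbours' means
            sets.append(lambda x, m=m, prv=means[i - 1], nxt=means[i + 1]:
                        triangular(x, prv, m, nxt))
    return sets

# Hypothetical slope-gradient class boundaries in degrees (five classes)
bounds = [0.0, 15.0, 25.0, 35.0, 45.0, 72.5]
slope_sets = mean_and_neighbour_tt(bounds)

memberships_24 = [s(24.0) for s in slope_sets]
print(memberships_24)
```

Under these assumptions a slope of 24°, which a crisp classification would place entirely in the second class, receives a membership of 0.6 in the second class and 0.4 in the third, reflecting the fuzziness between neighbouring categories that the strategy is meant to capture.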

Under the strategy, a combination of two types of MFs is required to constitute the MF structure of the fuzzifier. Therefore, based on the MFs’ dynamics and shape miscibility, two combinations, Bell–Gaussian (BG) and Trapezoidal–Triangular (TT) waveform functions, can be clubbed together to constitute the structure. The output (defuzzifier) MF of both structures (BG and TT) can also be constituted by the same choice of MFs. As discussed earlier, the fuzzy if–then rules link the antecedent part to the consequent part of the FIS. The number of fuzzy rules is directly proportional to the number of categories considered for the causative factors. In this model, a novel procedure for rule fabrication has been developed based on the FR method. The FR method establishes the relative significance of the categories of a causative factor by direct comparison with the landslide inventory. The hierarchy of FR values can be followed to formulate the required fuzzy if–then rules objectively.

The FR method can be defined as:

$$ {\text{FR}} = \frac{{L_{i} /L_{t} }}{{C_{i} /C_{t} }} $$
(9)

where \( L_{i} \) is the number of landslide pixels in category i, \( L_{t} \) is the total number of landslide pixels, \( C_{i} \) is the number of pixels in category i, and \( C_{t} \) is the total number of pixels in the study area.
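As a worked illustration, the sketch below assumes the frequency ratio takes its usual form, namely the share of landslide pixels falling in a category divided by that category's share of the total area. The pixel counts are hypothetical and are not taken from Table 3.

```python
def frequency_ratio(l_i, l_t, c_i, c_t):
    """FR of a category: landslide share (l_i / l_t) over area share (c_i / c_t)."""
    if l_t <= 0 or c_i <= 0 or c_t <= 0:
        raise ValueError("pixel totals and category size must be positive")
    return (l_i / l_t) / (c_i / c_t)

# Hypothetical counts: the category covers 10% of the area
# but contains 20% of the landslide pixels.
fr = frequency_ratio(l_i=20, l_t=100, c_i=100, c_t=1000)
print(fr)
```

An FR above 1 indicates that a category hosts more landslides than its areal share would suggest, and an FR below 1 the opposite; ranking the categories by this value gives the hierarchy used for rule formulation.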

3 Study area: Mussoorie Township

Mussoorie Township, famously known as the queen of hills, is one of the most popular tourist destinations in the State of Uttarakhand, India. The city, established primarily as a summer hub for British officers during the colonial period, has kept drawing tourists from all over the world. With tourism being the wellspring of income for the native people and the Municipality of Mussoorie, development in the city has been taking place at a rapid pace. With a prolonged winter with heavy snowfall and the monsoon as its prominent seasons, Mussoorie receives an average precipitation of 150 mm/year. The city falls in Zone IV of the Seismic Zonation Map of India (IS 1893, 2002). Considering the importance and rate of expansion of the township, a 43 km² area in and around Mussoorie Township has been chosen as the study area for mesoscale landslide susceptibility mapping. The location map of the study area is shown in Fig. 1.

Fig. 1 Location of study area

3.1 Slope instability problems in and around Mussoorie Township

The township of Mussoorie is located in the Lesser Himalayan hills. The topography of the Lesser Himalayan hills is highly rugged and characterised by the Proterozoic–Cambrian rocks of the Krol Belt, which have been thrust over the sedimentary rocks of the Siwalik Group along the main boundary thrust (MBT). The proximity to the MBT and the Sairku fault makes this seismically active terrain inherently vulnerable to landslides. The slope instability problems in and around Mussoorie have been identified through fieldwork and remote sensing to prepare the landslide inventory map, an integral element in validating the outcome of any landslide susceptibility analysis (Soeters and Van Westen 1996). Altogether, 49 landslides were located (indicated as points in Fig. 2). The active landslides observed on overburden slopes fall in the categories of talus, circular and creep modes of failure, whereas on rock slopes the failure modes are mainly planar and wedge.

Fig. 2 a Thematic map of lithology, b thematic map of LULC, c thematic map of slope gradient, d thematic map of slope aspect, e thematic map of altitude, f thematic map of curvature, g thematic map of RR, h thematic map of TWI

3.2 Characterisation of causative factors for the study area

The degree of slope instability is controlled by the net effect of inherent causative factors. A total of eight geo-environmental factors have been considered as input causative factors in this study. The causative factors are lithology, land use & land cover (LULC), slope gradient, slope aspect, altitude, curvature, lineament and topographic wetness index (TWI). The causative factors were characterised for the study area through fieldwork and by processing remote sensing data on GIS platform. A brief discussion of causative factors is given below.

The process of weathering and erosion by physical agents depends on the type of lithology or slope forming materials (SFM). Stratigraphically, the study area comprises two Formations of the Mussoorie Group: the Krol and Tal Formations (Valdiya 1980). In the present study, the SFM has been considered broadly as two categories, rock exposure and soil or overburden materials. Further, the overburden has been divided into two categories based on thickness: shallow overburden (1–5 m), which is prone to translational or talus modes of failure, and overburden (> 5 m), where the mode of failure will be rotational or circular. The lithology map was prepared by carrying out geological fieldwork (Fig. 2a). The types of SFM are listed in Table 1, and it can be inferred from Table 3 that shallow overburden is particularly vulnerable to slope failure incidents in the area.

Table 1 Categories of lithology and description

LULC condition is one of the major parameters that govern slope stability in mountainous terrain. The degree of vegetation plays an important role in resisting slope movements, particularly shallow failures. A well-spread root network provides shear resistance to SFM on soil slopes by acting as a natural anchoring system. Moreover, a thick blanket of vegetation suppresses the action of weathering and erosion by physical agents and hence adds to the stability of the slope. On the other hand, barren or sparsely vegetated slopes are easily exposed to physical degradation processes, which renders them vulnerable to failure. Agricultural practices and dwelling constructions are the major land uses in mountainous terrain. Under normal circumstances, agricultural practices by local terracing and urbanisation on gentle slopes are safe. On steeper slopes, the quality of the drainage system and the load of civil structures immensely influence the stability condition. In the study area, land cover varies from barren land to thickly vegetated forest, and the major land uses are agricultural practices and urbanisation. The LULC map was prepared by fieldwork and by referring to Google Earth and Bing Map images (Fig. 2b). It can be inferred from Table 3 that barren and sparsely vegetated areas are more prone to landslides, whereas the urbanised area is least affected by slope instability problems.

Slope gradient is a reflection of a series of localised processes and controls that have been imposed on the slope. As landslides are shear and gravitational failures of slope-forming materials, the slope gradient plays an important role in slope stability. The slope gradient map of the study area was prepared from the digital elevation model (DEM) of 15 m resolution. In the study area, slope gradient ranges from 0° to 72.5° (Fig. 2c), and the mean slope gradient value of landslide pixels was found to be 35.37° (Table 2).

Table 2 General statistics of numerical causative factors with respect to landslides

Slope aspect is the orientation of the terrain with respect to the true geographic north along maximum slope direction. Slope aspect has an indirect influence on slope instability in terms of sunlight exposure, drying winds and rainfall. Slope aspect map was prepared from the DEM (Fig. 2d), and it can be inferred from Table 2 that landslides typically occur on southeast slopes.

Altitude indicates the concentration of local landslide triggering parameters like rainfall and surface runoff in an area with respect to change in elevation. The DEM is used as altitude map in this study, and it ranges from 732 m to 2253 m in the study area (Fig. 2e), and the mean altitude value of landslide pixels was found to be 1315.77 m (Table 2).

Curvature indicates the morphometry of the slope; two sorts of curvature can be derived from a DEM: plan and profile curvature. Plan curvature is the horizontal curvature that affects the convergence and divergence of down-flowing fluid, whereas profile curvature is the vertical curvature of the slope that controls the acceleration and deceleration of down-flowing fluid. The standard curvature, which combines both plan and profile curvature, has been used in this study. Negative values of curvature indicate that the slope is of concave morphometry, whereas positive values indicate convex morphometry. The curvature map of the study area was prepared from the DEM (Fig. 2f). The curvature value ranges from −26.6 to 16.4 in the study area, and the mean value of landslide pixels was found to be −0.25 (Table 2).

Lineaments are tectonic discontinuities such as faults, joints and fractures that not only disturb the structure of the SFM but also reduce its shear strength. A pan-sharpened LISS-III image fused with a Radarsat image was used as the source layer to prepare the lineament layer of the study area (Fig. 2g). For the analysis, five buffer zones (BZs) at 120 m intervals were considered. It can be inferred from Table 3 that most slope instabilities occur close to lineaments and that their influence reduces as distance increases.

Topographic wetness index (TWI) indicates the spatial variation of surface moisture content. Thus, TWI can be used to interpret the role of water in inducing instability in a given area of interest. In this study, TWI was calculated by using the formula proposed by Moore et al. (1991).

$$ {\text{TWI}} = \ln \left( {\frac{{A_{\text{s}} }}{{{ \tan }\beta }}} \right) $$
(10)

where \( A_{\text{s}} \) is the specific catchment area (m²/m) and β is the slope gradient.

TWI ranges from 4.44 to 19.15 in the study area (Fig. 2h), and the mean TWI value of landslide pixels was found to be 8.07 (Table 2).
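Eq. 10 can be evaluated cell by cell once the specific catchment area and slope are available. The sketch below is a plain-Python illustration with hypothetical inputs; the clamping of very flat cells to a minimum slope (to avoid division by zero) is an assumption and is not described in the study.

```python
from math import log, radians, tan

def topographic_wetness_index(a_s, slope_deg, min_slope_deg=0.1):
    """TWI = ln(A_s / tan(beta)) for one cell (Eq. 10).

    a_s           specific catchment area in m^2/m
    slope_deg     slope gradient beta in degrees
    min_slope_deg assumed lower bound on beta so that flat cells stay finite
    """
    beta = radians(max(slope_deg, min_slope_deg))
    return log(a_s / tan(beta))

# Hypothetical cells sharing the same specific catchment area of 100 m^2/m
gentle = topographic_wetness_index(100.0, 10.0)
steep = topographic_wetness_index(100.0, 45.0)
print(round(gentle, 2), round(steep, 2))
```

For a fixed catchment area the index falls as the slope steepens, so high TWI values mark gentle, water-accumulating cells.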

4 Landslide susceptibility mapping of Mussoorie Township

As stated earlier, three soft computing models FSP, MamLand and improved FES have been used to carry out GIS-based landslide susceptibility mapping of the study area. Execution of the models is explained in the following sections.

4.1 Execution of FSP

Eight causative factors have been used as input variables in the FSP-based assessment. For lithology, LULC and slope aspect, the inherent classification has been followed to carry out the analysis (Table 3). For lineament, five BZs at 120 m intervals were preferred, whereas the rest of the numerical causative factors have been classified into five classes using the natural break classifier in ArcGIS 10.2.1 (Table 3). In order to determine the membership value of each category used in this study, the strength of relationship (rij) has been calculated using Eq. 7 by taking each causative factor layer (total number of pixels: 190594) and the landslide inventory layer (1267 landslide pixels) one at a time. The rij values thus obtained are given in Table 3.

Table 3 Strength of relationship and FR values of causative factor categories

For example, the constituted fuzzy set of slope gradient via cosine amplitude method can be written as:

$$ {\text{Slope}}\,{\text{gradient}} - \left[ { 0.0059 / {\text{Very}}\,{\text{gentle,}}\, 0.0139 / {\text{Gentle,}}\, 0.0378 / {\text{Moderate,}}\, 0.0673 / {\text{Steep,}}\, 0.0658 / {\text{Very}}\,{\text{steep}}} \right] $$

The FSP was executed in ArcGIS 10.2.1 using the inbuilt fuzzy overlay tool. Prior to that, eight fuzzy rasters, one for each causative factor, were created according to the classification and membership values (Table 3). The fuzzy rasters with membership values were then combined using the standard fuzzy operators AND, OR, Sum, Product and Gamma, the last with five different γ values (0.1, 0.3, 0.5, 0.7 and 0.9). The resultant output rasters of the different fuzzy operators were classified into five susceptibility classes using the natural break classifier to generate LSMs. The FSP-produced LSMs are shown in Fig. 5a–i (LSM-I to LSM-IX).

4.2 Execution of MamLand

The MamLand assessment for the study area has been carried out using six causative factors, as the model has limitations in incorporating slope aspect and more than one categorical causative factor as input. The categories of lithology have been grouped into high (LDspqc-K, oSl-K and sSPsl-K), moderate (oLDss-K, Ls-K and sLDss-K) and low (Q-T, sQSl-T and oSSq-T) classes based on FR value (Table 3), and a two-class classification has been adopted for the rest of the causative factors, as recommended. After classifying the causative factors, the representative MFs and 96 fuzzy if–then combination rules of MamLand were created in Matlab 2016b. The thematic layers of causative factors were converted to an Excel file of 6 × 190594 dimension and supplied as input to the MamLand model for inferring landslide susceptibility pixel-wise. The inferred susceptibility values were then exported to ArcGIS 10.2.1 for the production of the LSM in raster format. The natural break classifier has been chosen to classify the raster into five susceptibility classes. The MamLand-produced LSM is designated as LSM-X (Fig. 5j).

4.3 Execution of improved FES

An introduction to the improved FES model is given in Sect. 2.5. Except for lineament, the classification of causative factors given in Table 3 has been adopted here to execute the improved FES model. For lineament, the BZ interval was set to 15 m (i.e. the dimension of the raster) for construction of the MF, and a collection of eight BZs (i.e. a set of eight rasters) is considered for the rule fabrication procedure. The total number of categories of causative factors is 49; thus, the structure of the FIS comprises that many MSs for the eight input MFs and five MSs for the single output MF. As discussed earlier, two FIS structures, BG (Fig. 3) and TT (Fig. 4), have been constructed for the study area under the ‘mean and neighbour’ strategy according to the distribution of causative factors.

Fig. 3 BG membership function structure of FIS

Fig. 4 TT membership function structure of FIS

The FR assessment has been carried out by direct comparison of the landslide inventory with one causative factor at a time using Eq. 9 (Table 3). Subsequently, the fuzzy if–then rules required for the inference process were formulated based on the hierarchy of FR values. The rule base for the study area consists of 49 non-combination fuzzy rules in the conventional antecedent–consequent format (Table 4). The improved FES model (BG and TT structures separately, with the 49 fuzzy rules in common) was constructed in Matlab 2016b with COG as the defuzzification method under Mamdani-FIS. The causative factors were converted to an Excel file of 8 × 190594 dimension and supplied as input to the model. The susceptibility modelling was accomplished with both structures (BG and TT) using the same rule base. The inferred susceptibility values of both structures were then imported into ArcGIS 10.2.1 for production of LSMs in raster format. The BG and TT structure-produced maps are designated as LSM-XI (Fig. 5k) and LSM-XII (Fig. 5l), respectively.

Table 4 Fuzzy if–then rules generated according to FR value

Fig. 5 a LSM-I (AND), b LSM-II (OR), c LSM-III (Sum), d LSM-IV (Product), e LSM-V (Gamma 0.1), f LSM-VI (Gamma 0.3), g LSM-VII (Gamma 0.5), h LSM-VIII (Gamma 0.7), i LSM-IX (Gamma 0.9), j LSM-X (MamLand), k LSM-XI (BG), l LSM-XII (TT)

5 Results and validation

A total of twelve LSMs have been prepared for the study area through three different models, FSP, MamLand and improved FES, with each map displaying five susceptibility zones: very low susceptible (VLS), low susceptible (LS), moderately susceptible (MS), highly susceptible (HS) and very highly susceptible (VHS). The distribution of susceptibility zones in the different LSMs is given in Table 5. A spatial and statistical analysis is mandatory for any probability-based landslide susceptibility mapping to validate the reliability of the outcome. The spatial analysis aims at close inspection of the distribution pattern of susceptibility zones in correlation with the input thematic layers, whereas the statistical assessment validates accuracy with regard to an event (i.e. landslides). In this study, a statistical model validation technique (pre-production of LSM) and a map validation technique (post-production of LSM) have been chosen: the receiver operating characteristic (ROC) plot (Zweig and Campbell 1993) and FR analysis, respectively. The ROC plot is constructed by plotting the true-positive rate (sensitivity) against the false-positive rate (1 − specificity) at various cutoff thresholds. The area under the curve (AUC) obtained from the ROC plot indicates the accuracy of the model. The AUC value ranges from 0.5 to 1.0; an accurate model earns a value close to 1.0, whereas an inaccurate model earns a value close to 0.5. The FR analysis compares the similarity between susceptibility zones and the landslide inventory; theoretically, the FR value must increase from the VLS to the VHS zone.
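The AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen landslide pixel receives a higher susceptibility score than a randomly chosen non-landslide pixel (ties counted as one half). The sketch below uses this pairwise formulation in plain Python; it is not the tool used in the study, and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for landslide pixels, 0 for non-landslide pixels.
    The double loop is quadratic, which is fine for a small illustration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both landslide and non-landslide pixels are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical susceptibility scores for six pixels and their inventory labels
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(round(auc, 3))

# A map that assigns the same score to every pixel cannot discriminate at all
flat_auc = roc_auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
print(flat_auc)
```

In the hypothetical example, eight of the nine landslide and non-landslide pairs are ranked correctly, while the constant-score map sits at the 0.5 floor that marks a model with no discriminating power.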

Table 5 Distribution of susceptibility zones against landslide distribution, and FR analysis of the different LSMs

5.1 Validation of LSM-I to LSM-IX

LSM-I and II display an unacceptable distribution of susceptibility zones, as a definite pattern of the categorical causative factors is visible on the maps. Along with LSM-I and II, LSM-IV, V, VI and VII also display an erroneous distribution of zones caused by an overabundance of the VLS zone (91.26%, 88.41%, 85.01% and 77.65%, respectively; Table 5). Consequently, the HS and VHS zones account for a smaller percentage of landslide pixels than the VLS and LS zones. LSM-III, VIII and IX show a better distribution of susceptibility zones, in terms of pattern and area, than the other FSP-based LSMs. On close inspection, however, the categorical causative factors have left minor traces on these maps. Table 5 shows that the MS zone occupies the largest area in LSM-III (28.64%), whereas the VLS zone (41.69%) and the LS zone (32.20%) occupy the largest area in LSM-VIII and IX, respectively. Furthermore, the VHS zone in these three maps covers 8.74%, 3.17% and 6.92% of the area and accommodates 47.12%, 19.65% and 40.65% of landslide pixels, respectively. The lowest AUC value of 0.500 (Fig. 6) was recorded for the Product (LSM-IV) and Gamma 0.1 (LSM-V) models, indicating the inaccuracy of these models. The AND (LSM-I), OR (LSM-II) and Sum (LSM-III) models earned better AUC scores of 0.847, 0.708 and 0.839, respectively. For the Gamma operator, the AUC value increases from 0.500 for Gamma 0.1 (LSM-V) through 0.824 for Gamma 0.3 (LSM-VI) to the highest value of 0.855 for Gamma 0.5 (LSM-VII); thereafter, the AUC decreases slightly with increasing Gamma, as Gamma 0.7 (LSM-VIII) and Gamma 0.9 (LSM-IX) obtain AUC values of 0.853 and 0.851, respectively (Fig. 6). In the FR analysis, apart from LSM-I, II, III and IX, none of the maps shows the ideal increase of FR value from the VLS to the VHS zone (Table 5, Fig. 7a, b).

Fig. 6

ROC assessment of different models

Fig. 7

a Frequency ratio assessment of LSM-I to IV, b frequency ratio assessment of LSM-V to IX, c frequency ratio assessment of LSM-X to XII

5.2 Validation of LSM-X

LSM-X shows a strong influence of the categorical causative factor lithology, which results in a poor and restricted distribution of susceptibility zones. The HS (5.83%) and VHS (24.39%) zones are restricted to the Blaini Formation, the MS zone (25.74%) to the moderate category and the VLS zone (11.10%) to the low category, while the LS zone (32.95%) occurs in both the moderate and low categories of lithology (Table 5). The VHS and HS zones together predict only 67.09% of landslide pixels (Table 5). In the statistical validation, the MamLand model scores low on the ROC assessment (AUC: 0.725) (Fig. 6); however, the FR value increases from the VLS zone to the VHS zone (Table 5 and Fig. 7c).

5.3 Validation of LSM-XI and LSM-XII

LSM-XI (BG) and LSM-XII (TT) display a significantly better and more realistic distribution of susceptibility zones, with no definite trace of the input causative factors. Table 5 shows that LSM-XI and XII have an acceptable and similar percentage distribution of susceptibility zones, with the LS zone occupying the largest area (29.20% and 29.62%, respectively). The HS and VHS zones together accommodate an impressive 83.11% (LSM-XI) and 79.95% (LSM-XII) of landslide pixels. Similarly, the ROC and FR analyses delivered comparable results for both models and maps. The AUC values obtained, 0.844 (BG) and 0.841 (TT), indicate the satisfactory performance of both models in predicting an event (Fig. 6). The FR value for the VLS zone is 0.07 for LSM-XI and 0.05 for LSM-XII, whereas for the VHS zone the values are 5.34 and 5.41, respectively (Table 5). More importantly, the FR value increases from VLS to VHS for both maps, as shown in Fig. 7c.

6 Discussion

The AND, OR and Product operators and the Gamma operator with values of 0.1, 0.3 and 0.5 showed the poorest accuracy in FSP, whereas the Sum operator and the Gamma operator with values of 0.7 and 0.9 showed comparatively satisfactory performance. The AND and OR operators take the minimum (least) and maximum (highest) membership value from a fuzzy set, respectively; the output therefore reflects only a single causative factor, and these operators appear to be the most unsuitable operators of FSP for landslide susceptibility mapping. Furthermore, the Product operator has a decreasing tendency (towards 0, i.e. non-susceptible), contrary to the Sum operator (increasing tendency towards 1, i.e. susceptible), and is responsible for the overabundance of the VLS zone in LSM-IV. Like LSM-IV, the maps produced with different Gamma values suffer from the same issue, even though the statistical validation measures indicate accuracy. Thus, LSM-III, produced with the Sum operator, with satisfactory statistical validation results and negligible error in the spatial distribution of susceptibility zones, can be considered the best map among the FSP-produced LSMs. However, based on this research, FSP can be described as a discrete fuzzy approach in which the fuzziness between categories of a causative factor cannot be incorporated, owing to the dearth of systematic interaction between categories.
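The contrasting tendencies of the operators can be made concrete with a short Python sketch. The operator equations used in the study are given outside this section, so the sketch assumes the standard fuzzy overlay definitions: minimum for AND, maximum for OR, the product of memberships, the algebraic sum 1 − Π(1 − μ), and Gamma as Sum^γ × Product^(1 − γ). The function names and the eight membership values are hypothetical.

```python
import math


def fuzzy_and(memberships):
    """AND: the smallest membership value decides the output."""
    return min(memberships)


def fuzzy_or(memberships):
    """OR: the largest membership value decides the output."""
    return max(memberships)


def fuzzy_product(memberships):
    """Product: never larger than the smallest input (decreasing tendency)."""
    return math.prod(memberships)


def fuzzy_sum(memberships):
    """Algebraic sum: never smaller than the largest input (increasing tendency)."""
    return 1.0 - math.prod(1.0 - m for m in memberships)


def fuzzy_gamma(memberships, gamma):
    """Gamma: compromise between product (gamma = 0) and sum (gamma = 1)."""
    return fuzzy_sum(memberships) ** gamma * fuzzy_product(memberships) ** (1.0 - gamma)


# Hypothetical membership values of one pixel in eight causative factors.
pixel = [0.2, 0.4, 0.5, 0.6, 0.7, 0.3, 0.8, 0.5]

and_value = fuzzy_and(pixel)          # decided by one factor only
or_value = fuzzy_or(pixel)            # decided by one factor only
product_value = fuzzy_product(pixel)  # pulled towards 0
sum_value = fuzzy_sum(pixel)          # pulled towards 1
gamma_values = [fuzzy_gamma(pixel, g) for g in (0.1, 0.3, 0.5, 0.7, 0.9)]
```

For these hypothetical memberships the product falls to about 0.002 and the sum rises to about 0.998, while AND and OR simply return 0.2 and 0.8. Low Gamma values stay close to the product, which is consistent with the overabundance of the VLS zone reported for the Product and low-Gamma maps.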

The MamLand model has limitations in accommodating more than one categorical causative factor, as well as numerical causative factors with nonlinear behaviour. Therefore, unlike FSP and improved FES, MamLand was executed for the study area using six causative factors. The validation of LSM-X makes it evident that the FIS structure of MamLand is inadequate to make the necessary classification of causative factors. Further, the subjective formulation of fuzzy if–then rules based on predefined criteria may also have contributed to the below-par performance of MamLand. Compared with FSP and MamLand, the outcomes of the improved FES model, LSM-XI and XII, showed reasonable accuracy in the statistical validation, with a significantly better spatial distribution of susceptibility zones. Although both maps give more or less similar results on validation, LSM-XI of the BG structure was found to be marginally better than LSM-XII of the TT structure. This result indicates that the choice of structure is vital in the analysis; in this case, the smooth nature of the BG structure was better suited. The superiority of LSM-XI and XII indicates successful susceptibility prediction by the improved FES model for the study area; moreover, this MCDA model, which combines expert knowledge and the FR method under Mamdani-FIS, is more efficacious than existing FES models and conventional FSP. Furthermore, the improved FES model can overcome the discrete nature of conventional FSP through the systematic overlapping of MFs under the ‘mean and neighbour’ strategy. In addition, the construction of the FIS structure under this strategy and the objective formulation of fuzzy if–then rules via the FR method offer a high degree of interpretability as well as portability to the model.

7 Conclusion

FES, or expert knowledge formulated under fuzzy logic, has been regarded as an ideal approach that can circumvent the deficiencies inherent to qualitative and quantitative landslide susceptibility mapping techniques. However, the existing FES models in the landslide literature suffer from subjectivity and a low grade of interpretability and portability. Owing to these shortcomings, the research presented in this paper primarily discusses the development of an improved FES model and presents a comparative analysis against an existing FES model, MamLand, and conventional FSP. The improved FES model, a fusion of Mamdani-FIS and the FR method, is based on the ‘mean and neighbour’ strategy. This strategy explains the construction of the FIS structure in three aspects: choice of MFs and physical representation of causative factors; overlap between MFs for optimum fuzziness between categories; and interpretability. In addition, the improved FES model uses the FR method to estimate the significance of categories of causative factors in order to formulate the required fuzzy if–then rules objectively. The improved FES was executed along with FSP and MamLand for landslide susceptibility mapping of Mussoorie Township, Uttarakhand, India, on the mesoscale. A total of 12 LSMs were prepared for the study area with the three models. LSM-XI (BG), an outcome of the improved FES model, was found to be in better agreement with all validation exercises than the other LSMs. Based on the validation and comparative analysis, it can be concluded that the improved FES introduced in this paper is capable of diminishing the shortcomings associated with existing FES models and is more effective than conventional FSP.
LSM-XI will be of great help to the Municipality Authority of Mussoorie and town planners in choosing future developmental practices on safer slopes, as well as in adopting, in time, suitable control measures and environmental regeneration schemes such as afforestation and the banning of quarries on sensitive slopes.