1 Introduction

Landslides are common in hilly areas, especially in the Himalayan region. They occur as natural as well as man-made disasters. They are economically highly destructive and result in ~375 casualties per annum (Guri et al. 2015). They occur mainly due to the construction of roads, buildings and other infrastructure, or due to erosion by rivers. Townships in hilly regions are highly prone to landslides, which cause loss of wildlife habitat, removal of soil, and interruption of the road network as well as the drainage system (Fayez et al. 2018). The study of landslides has drawn worldwide attention, mainly due to increasing awareness of their socio-economic impact and the increasing pressure of urbanization on the mountains (Verma et al. 2016; Solanki et al. 2019).

Mussoorie township, located in the Uttarakhand Himalaya, has a history of hazardous landslides such as the August 1998 Surabhi Resort landslide (Gupta and Ahmed 2007) and the Kempty fall landslide. These landslides have caused enormous loss and hardship to the people living in the area (Madan and Rawat 2000; Gupta and Ahmed 2007). Since the township has seen rapid urbanisation in recent years and is frequently visited by a large number of tourists, there is a need to demarcate the landslide potential zones in the area.

Landslide susceptibility is defined as the probability of occurrence of landslides under different geo-environmental conditions. Landslide susceptibility mapping is used to predict the location of future landslides, assuming that landslides will occur in future under conditions similar to those that produced them in the past. It is carried out by either qualitative or quantitative methods. Qualitative methods are mainly heuristic, boolean, fuzzy logic, multiclass overlay, and spatial multi-criteria evaluation, whereas quantitative methods are either statistical or probabilistic. The probabilistic methods are mainly based on past landslide events, whereas statistical methods are mainly bivariate, multivariate or artificial neural network based. Bivariate statistical methods are mainly weight of evidence, frequency ratio, information value, Yule coefficient, and distance distribution analysis (Cárdenas and Mera 2016). All these methods have their own advantages and disadvantages and are widely used in different settings. Among them, the Yule coefficient method has been found to be easy to use, consistent over time and free from user bias.

In the present study, landslide susceptibility mapping (LSM) has been carried out for the hilly township of Mussoorie and its surroundings using the bivariate statistical Yule coefficient (YC) method with GIS and remote sensing techniques. The resulting landslide susceptibility map would be useful for vulnerability and risk assessment in the area, as well as for further planning and safe construction in the township (Guzzetti 2000; Guzzetti et al. 2005; Martha et al. 2013).

2 Study area

The study area, covering the township of Mussoorie and its surroundings, is located between longitude 77°59′59″–78°07′46″E and latitude 30°25′58″–30°29′08″N in the Dehradun and Tehri districts of Uttarakhand (figure 1). It covers an area of ~85 km2 and falls in the Survey of India 1:50,000-scale toposheet no. 53J/3. The study area has several ridges with elevations varying between 900 and 2290 m above mean sea level (msl). The Mussoorie hill trends east–west and forms the water divide between the Yamuna basin and the Ganga basin. The maximum elevation in the area, 2290 m above msl, is at Lal Tibba point. The Aglar River, a tributary of the Yamuna River, flows to the west at an elevation of ~800 m.

Figure 1

Location map of the study area showing the settlements; the bold red lines represent the road network.

Mussoorie is a hilly township with several tourist attractions such as Camel back road, Mall road, Library road, Gun hill point, Vincent’s hills, Dhanaulti, George Everest, Cloud end point, Mussoorie lake, Company Garden and Kempty fall (figure 1). The town has a highly variable floating population, with the peak tourist season during summer. The area has warm and dry summers and cool and wet winters. The average rainfall in the area varies between 2000 and 3000 mm (Gupta and Ahmed 2007), with maximum rainfall between July and September. The study area falls in zone IV of the seismic zonation map of India.

3 Physiographic setting

Geologically, the study area is located to the north of the Main Boundary Thrust (MBT) and lies in the Lesser Himalaya. It comprises rocks of the Chandpur Formation, Nagthat Formation, Blaini Formation, Krol Formation, Infra-Krol Formation and the Tal Formation (figure 2). The Chandpur Formation mainly comprises phyllite, slate, siltstone and greywacke, which are highly sheared, and is overlain by the quartzite and slate of the Nagthat Formation, which in turn is overlain by the Blaini Formation comprising conglomerate, siltstone, greywacke, slate and sandstone. However, the greater part of the study area comprises rocks belonging to the Krol and Tal formations. The dominant rock types in the Krol Formation are limestone, dolomitic limestone and dolomite (Tewari 1984; Tewari and Qureshy 1985). The Krol limestone is foliated in nature. The Tal Formation is further divisible into Lower Tal and Upper Tal. The Lower Tal comprises four distinct members, namely, Chert Member, Argillaceous Member, Arenaceous Member and Calcareous Member, whereas the Upper Tal is represented by the Quartzite Member. In the greater part of the study area, the hill slopes are covered with thick Quaternary deposits representing old landslides. The thickness of these deposits, as deciphered in the channel cuts, is highly variable. The geological setup of the area has been studied in detail by Auden (1934), Ravi Shanker (1971), Panikkar and Subramanyan (1996, 1997), Banerjee et al. (1997), Singh et al. (1980), Jayangondaperumal and Dubey (2001), Gupta et al. (2016b) and Mahato et al. (2019).

Figure 2

Geological map of the study area indicating various litho units.

Structurally, the rocks in the study area are exposed as a doubly plunging antiform syncline trending NW–SE (Auden 1934). In general, the rocks are highly folded, faulted, jointed and fractured. Four joint sets trending NE–SW, NNE–SSW, NNW–SSE and ESE–WNW are prominent. The geomorphic setup of the area is defined by highly dissected hills with rugged topography and moderate to steep slopes ranging from 40° to >70° (Panikkar and Subramanyan 1996). The area shows high relative relief in the upper and lower elevations and moderate relief in the middle region.

4 Data used and methodology

Landslide susceptibility mapping of an area by any method involves preparing an inventory of landslides along with maps of the causative factors of landslides, generally referred to as thematic maps. In the present study, an inventory of landslides has been prepared using high resolution satellite images (IRS-P5, Cartosat-1, Resourcesat-1 multispectral, and LISS IV) along with high resolution satellite images on the Google Earth platform. It has been updated through extensive field work in the area. The characteristic features of the satellite images used in the present study are presented in table 1.

Table 1 Satellite data used for the preparation of landslide inventory and causative factor/thematic layers.

Cartosat-1 stereo pair data with 2.5 m spatial resolution were used to generate the digital elevation model (DEM) of the area using the Leica Photogrammetry Suite (LPS) tools of ERDAS Imagine v.14. This DEM was used for extracting various thematic layers such as slope, aspect, elevation, profile curvature, plan curvature and drainage. LISS IV images with 5.8 m spatial resolution were used for the preparation of the landslide inventory, the landuse/landcover (LULC) map and the lineament map of the area. The lithological map was prepared from extensive field work and secondary data, and the road map was digitized from Google Earth Pro.

The datum for each layer was set as D_WGS_1984 and the spatial reference as WGS_1984_UTM_Zone_44N. ERDAS v.14 was used for processing the satellite images, and ArcGIS v.10.5 for GIS database generation, analysis and presentation of the final output maps. The success rate curve was prepared using ILWIS 3.2 software. The detailed methodology used for the present study is presented in figure 3.

Figure 3

Flow chart of the methodology used for the preparation of landslide susceptibility map using bivariate statistical Yule coefficient.

5 Preparation of landslide inventory map and various thematic maps

A landslide inventory map depicts the spatial distribution of landslides in an area. In the study area, a total of 56 landslides have been delineated (figure 4). Of these, 54 have been classified as planar debris slides and two as rock-cum-debris slides (Cruden 1991; Cruden and Varnes 1996). These landslides are active mostly during the rainy season and their dimensions increase every year. Further, about 80% of the landslides have an area >100 m2 and are shallow in nature.

Figure 4

An inventory of 56 landslides used for the preparation and the validation of the landslide susceptibility map of the study area.

Of these 56 mapped landslides, 40 landslides were randomly selected for the preparation of the landslide susceptibility map of the area and the remaining 16 were used for the validation of the prepared landslide susceptibility map.
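
The random split described above can be sketched as follows. This is only an illustration of a 40/16 random partition, not the procedure actually used in the study; the landslide identifiers 1–56, the function name `split_inventory` and the seed are hypothetical placeholders.

```python
import random


def split_inventory(landslide_ids, n_train, seed=0):
    """Randomly split landslide IDs into training and validation subsets."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is repeatable
    shuffled = list(landslide_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


# Hypothetical IDs 1..56 standing in for the mapped landslides.
training, validation = split_inventory(range(1, 57), n_train=40)
print(len(training), len(validation))
```

Keeping the two subsets disjoint is what allows the validation landslides to serve as an independent check on the map.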

5.1 Lithology

Slope instability in an area is greatly influenced by the spatial disposition of the various litho units and the overlying Quaternary deposits, as well as by the presence of structural features, e.g., folds, faults, joints and fractures. The lithology map of the area has been prepared using primary as well as secondary data sources; it is presented in figure 2 and has already been discussed in section 3.

5.2 Landuse and landcover map

Landuse/landcover (LULC) is an important parameter influencing landslides, as vegetation-covered areas have been reported to be less susceptible to landslides than barren land (Greenway 1987). Six LULC classes, viz., (i) scrub land (10.55 km2), (ii) barren land (9.07 km2), (iii) degraded land (33.26 km2), (iv) crop land (4.54 km2), (v) built-up land (1.31 km2), and (vi) dense forest (21.41 km2), have been mapped from satellite images (figure 5a) and have been updated and validated during field investigation. It is observed that scrub land has the highest percentage of landslides (32.10%), whereas dense forest has the lowest (5.11%).

Figure 5

Various thematic layers, along with the spatial distribution of landslides, used for the preparation of the landslide susceptibility map: (a) landuse–landcover, (b) slope, (c) slope aspect, (d) elevation, (e) profile curvature, (f) plan curvature, (g) lineament buffer, (h) road buffer, and (i) drainage buffer.

5.3 Digital elevation model and its derivatives

Terrain conditions such as slope inclination, aspect, elevation and curvature define the stability of a slope. They have been derived from the digital elevation model (DEM) of the area, which was prepared from high resolution Cartosat-1 stereo pair images (Singh et al. 2010). The general slope in the area varies from 0° to >70° and has been classified into seven classes, each with an interval of 12° (figure 5b). The greater part of the study area (~34 km2) is occupied by the slope interval of 39°–51°.

Aspect of slope is another important terrain parameter that affects slope stability. Different slope aspects receive different solar irradiance and orographic precipitation, which leads to differential weathering, and hence a varying distribution of landslides across slope aspects is expected. Therefore, for the present study, the study area has been classified into 10 aspect classes, viz., flat (0.00 km2), north (5.82 km2), northeast (11.54 km2), east (9.69 km2), southeast (8.00 km2), south (11.04 km2), southwest (11.69 km2), west (9.99 km2), northwest (7.75 km2) and north (4.63 km2) (figure 5c).
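
The aspect classification can be illustrated with a short sketch. It assumes the convention common in GIS software that flat cells are coded with a negative azimuth and that each compass direction occupies a 45° sector centred on it, so that north spans both 0°–22.5° and 337.5°–360° (consistent with north appearing twice in the list above). The two north bins are merged into a single label here, and the function name `aspect_class` is hypothetical.

```python
COMPASS = ["north", "northeast", "east", "southeast",
           "south", "southwest", "west", "northwest"]


def aspect_class(azimuth_deg):
    """Map an aspect azimuth in degrees (clockwise from north) to a class label.

    Assumption: negative values denote flat cells with no downslope direction.
    """
    if azimuth_deg < 0:
        return "flat"
    # Shift by half a sector so each 45-degree sector is centred on its direction.
    index = int(((azimuth_deg % 360.0) + 22.5) // 45.0) % 8
    return COMPASS[index]


print(aspect_class(135.0))
print(aspect_class(350.0))
```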

The elevation layer was categorized into eight classes with an interval of 200 m. About 25 km2 of the study area falls in the 1700–1900 m elevation range (figure 5d).

Slope curvature is another important terrain parameter that greatly affects slope stability. It can be either profile curvature or plan curvature. Profile curvature is the curvature measured in the vertical plane parallel to the maximum slope direction, whereas plan curvature is the curvature of the hillside in a horizontal plane perpendicular to the direction of the maximum slope (Guri et al. 2015). Profile curvature affects the acceleration or deceleration of flow across the surface, whereas plan curvature relates to the convergence and divergence of flow across the surface. Therefore, both curvatures affect slope stability. Both were classified into nine classes, varying between −354.59 and 446.64 for profile curvature, and between −584.68 and 246.64 for plan curvature (figure 5e–f). The curvature is upwardly convex for a positive value, flat for a zero value and upwardly concave for a negative value (Pradhan et al. 2010).

5.4 Lineament, road network and drainage network maps

Lineaments are mappable linear geological features such as joints, shear zones, faults, fold axes and sharp lithological contacts. They are known to influence landslides and related mass movement activities, as the areas in their vicinity are generally considered weak (O’Leary et al. 1976; Verma et al. 2016). Therefore, in the present study, a lineament map has been prepared using the LISS IV image. Thirty-six lineaments have been mapped in the study area and these were observed to be oriented in all directions (figure 5g).

Human interference on slopes, such as cutting of slopes for the construction of roads and buildings, is one of the major anthropogenic factors in the destabilisation of slopes. In the present study area, the construction of new buildings and mining activities have been banned since 1996, as these activities were noted in the past to adversely affect the slope stability and geo-environment of the area. In order to understand the effect of road cutting on the distribution of landslides, a road network map was prepared (figure 5h).

Since drainage is considered to be one of the most important causative factors for the occurrence of landslides in an area, the drainage network has been extracted from the DEM (figure 5i) and buffered into six zones: 0–50, 50–100, 100–150, 150–200, 200–300, and 300–500 m.
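
Assigning a location to one of these buffer zones amounts to binning its distance from the nearest drainage line. A minimal sketch follows; the zone edges are taken from the text, whereas the treatment of the boundaries (lower edge inclusive) and of distances of 500 m or more (left unclassified) are assumptions, as is the function name `buffer_zone`.

```python
from bisect import bisect_right

# Zone edges in metres, as listed in the text.
ZONE_EDGES_M = [0, 50, 100, 150, 200, 300, 500]


def buffer_zone(distance_m):
    """Return the buffer-zone label for a distance, or None outside 0-500 m."""
    if distance_m < 0 or distance_m >= ZONE_EDGES_M[-1]:
        return None
    # Index of the last edge that is <= distance_m.
    i = bisect_right(ZONE_EDGES_M, distance_m) - 1
    return f"{ZONE_EDGES_M[i]}-{ZONE_EDGES_M[i + 1]} m"


print(buffer_zone(120.0))
```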

6 Landslide susceptibility model: Yule coefficient (YC)

There are many methods for carrying out landslide susceptibility mapping in an area. Each method has its own advantages and limitations. In the present study, the bivariate statistical method referred to as the ‘Yule coefficient’ (YC), implemented in GIS, was used to prepare the landslide susceptibility map of the Mussoorie township. The coefficient is also called the Phi coefficient (Yule 1912). It measures the association between two variables, each expressed as a dichotomy, i.e., presence or absence, true or false, yes or no (figure 6). As a bivariate analysis, it quantifies the strength of association between landslides and each class of their spatial causative factors such as slope, aspect, curvature, elevation, and lithology. The advantage of this method is that it is quick and easy to use, and no corrections are required after the analysis.

Figure 6

An example, for the slope class 0°–12°, of the intersection of landslide pixels (O) and pixels of a thematic layer class (I) in the study area (T).

It is calculated using the following equation:

$$ {{YC }} = \frac{{\sqrt {T_{11} /T_{21} } - \sqrt {T_{12} /T_{22} } }}{{\sqrt {T_{11} /T_{21} } + \sqrt {T_{12} /T_{22} } }}, $$
(1)

where T11 = area of ‘positive match’, where a class of any factor and landslides are both present; T12 = area of ‘mismatch’, where a class of any factor is present but landslides are absent; T21 = area of ‘mismatch’, where a class of any factor is absent but landslides are present; and T22 = area of ‘negative match’, where a class of any factor and landslides are both absent.

The value of YC ranges between −1 and +1. A negative YC value implies negative spatial association, a positive value implies positive spatial association, and zero implies no spatial association (table 2).
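
The behaviour of the coefficient at its limits can be checked with a small sketch. It is written in the standard cross-product form of Yule's coefficient, (√(ad) − √(bc))/(√(ad) + √(bc)), where a and d are the two matched areas (both present, both absent) and b and c are the two mismatched areas; this form avoids dividing by an empty cell. The function name, argument names and pixel counts are illustrative assumptions, not values from the study.

```python
import math


def yule_coefficient(both_present, class_only, landslide_only, both_absent):
    """Yule's coefficient from the four cells of a 2x2 table (areas or pixel counts)."""
    matches = math.sqrt(both_present * both_absent)
    mismatches = math.sqrt(class_only * landslide_only)
    if matches + mismatches == 0:
        raise ValueError("coefficient is undefined for this table")
    return (matches - mismatches) / (matches + mismatches)


# Hypothetical pixel counts for one class of one thematic layer.
print(yule_coefficient(30, 970, 70, 8930))   # positive association
print(yule_coefficient(0, 100, 10, 890))     # no landslide in the class: -1.0
print(yule_coefficient(10, 90, 10, 90))      # landslides independent of class: 0.0
```

A class containing no landslide pixels always yields −1, which corresponds to complete dissociation.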

Table 2 Calculated Yule coefficient for the different classes of causative factors.

The landslide occurrence favourability score (LOFS) of each class of a thematic layer was obtained by dividing the YC value of that class by the maximum value of YC in that thematic layer, using the following equation:

$$ {\text{LOFS}} = \frac{{\text{if}}\,(YC > 0, \, = 0, < 0)}{\hbox{max} \,YC }. $$
(2)

The relative weight of a thematic layer is the ratio of the absolute difference between the maximum and minimum YC in that thematic layer to the minimum value of this absolute difference among all thematic layers, as given in the following formula:

$$ \text{Weight } (W) = \frac{\text{Abs. difference } (\text{Max. } YC - \text{Min. } YC)}{\text{Min. value of Abs. difference column}} $$
(3)
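
Equations (2) and (3) can be sketched as follows. The treatment of classes with YC ≤ 0 is an assumption made here (they are given a score of 0), since the conditional in equation (2) does not spell out each case; the layer names, class names and YC values are hypothetical.

```python
def favourability_scores(yc_by_class):
    """LOFS per class: YC divided by the layer's maximum YC.

    Assumption: classes with YC <= 0 receive a score of 0.
    """
    max_yc = max(yc_by_class.values())
    if max_yc <= 0:
        return {name: 0.0 for name in yc_by_class}
    return {name: (yc / max_yc if yc > 0 else 0.0)
            for name, yc in yc_by_class.items()}


def layer_weights(yc_by_layer):
    """Weight per layer: its YC range divided by the smallest YC range of any layer."""
    spans = {layer: abs(max(yc.values()) - min(yc.values()))
             for layer, yc in yc_by_layer.items()}
    smallest = min(spans.values())
    if smallest == 0:
        raise ValueError("a layer with identical YC values gives a zero divisor")
    return {layer: span / smallest for layer, span in spans.items()}


# Hypothetical YC values for two thematic layers.
yc_by_layer = {
    "layer_a": {"class_1": 0.4, "class_2": -0.2, "class_3": 0.1},
    "layer_b": {"class_1": 0.2, "class_2": -0.1},
}
lofs = {layer: favourability_scores(yc) for layer, yc in yc_by_layer.items()}
weights = layer_weights(yc_by_layer)
print(lofs)
print(weights)
```

With this normalisation the layer with the narrowest YC range receives a weight of 1, and layers that discriminate more strongly between their classes receive proportionally larger weights.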

LOFS and W values from equations (2) and (3) for each thematic layer have been integrated on the ArcGIS platform, and a weighted multiclass index overlay (S) has been created using the following algebraic equation:

$$ S = \frac{\sum\nolimits_{i=1}^{n} \left( LOFS_i \times W_i \right)}{\sum\nolimits_{i=1}^{n} W_i}. $$
(4)
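
For a single pixel, equation (4) is a weighted mean of the LOFS values of the classes that the pixel falls into, one per thematic layer. A minimal sketch with hypothetical layer names, scores and weights:

```python
def susceptibility_score(lofs_by_layer, weight_by_layer):
    """Weighted mean of one pixel's LOFS values across all thematic layers."""
    total_weight = sum(weight_by_layer.values())
    weighted_sum = sum(lofs_by_layer[layer] * weight_by_layer[layer]
                       for layer in weight_by_layer)
    return weighted_sum / total_weight


# Hypothetical LOFS values of one pixel and hypothetical layer weights.
pixel_lofs = {"slope": 1.0, "lithology": 0.5, "aspect": 0.0}
layer_w = {"slope": 2.0, "lithology": 1.0, "aspect": 1.0}
score = susceptibility_score(pixel_lofs, layer_w)
print(score)  # 0.625
```

In a GIS the same arithmetic is applied cell by cell to the reclassified rasters of all thematic layers.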

Finally, the landslide susceptibility score map has been created and classified into five classes using the natural breaks approach.

7 Results and discussion

There are many methods for assessing landslide susceptibility in an area. These methods have been applied worldwide, including in different parts of the Himalayan terrain (Ayenew and Barbieri 2005; Lee et al. 2007; Yalcin 2008; Jaiswal et al. 2010; Pradhan and Lee 2010; Ghosh et al. 2011; Intarawichian and Dasananda 2011; Ramakrishnan et al. 2013; Guri et al. 2015; Pham et al. 2015; Cárdenas and Mera 2016; Demir 2018; Kundu and Patel 2019; Sharma and Mahajan 2019). Ghosh et al. (2011), Cárdenas and Mera (2016), Ilia and Tsangaratos (2016) and Kundu and Patel (2019) have used the bivariate statistical Yule coefficient (YC) in the Indian Himalayan terrain and in the Andes Mountains in central Ecuador. In order to assess landslide susceptibility, they used different landslide causative factors, divided the area into high, moderate and low landslide susceptible zones, and concluded that landslides are not randomly distributed but are associated, either positively or negatively, with different conditions.

In the study area, of all the lithological units present, only the Chandpur and Nagthat formations and the Quaternary regolith show positive association with landslides, with YC values of 0.584, 0.130 and 0.223, respectively (table 2). All other lithological units and formations show negative association with landslides. This may possibly be due to the fissile nature of the phyllite in the Chandpur Formation and the highly jointed and weathered nature of the Nagthat quartzite, which facilitate the occurrence of landslides.

Among the landuse–landcover classes, approximately 32% of the total landslides fall in scrub land, ~27% in degraded land, ~17% in the settlement class, ~14% in barren land, ~5% in dense forest and the remaining 5% in crop land. Of all the LULC classes, scrub land (~10.55 km2), barren land (~9.07 km2), settlement land (~4.54 km2) and crop land (~1.31 km2) show positive association with landslides. Among these, barren land has the lowest YC value (0.063) and the settlement areas the highest (0.297), followed by crop land (0.296) (table 2), indicating that landslides in the area are greatly influenced by anthropogenic activities. However, degraded land and dense forest show negative association, indicating that dense forest areas are the most stable in terms of landslides.

It has been noted that gentle slopes between 0° and 38° have negative association with landslides, whereas slopes inclined >38° have positive association, with the maximum YC value in the slope range of 65°–77°. For slopes inclined >77°, the YC value decreases, indicating a relatively weaker association with landslides (table 2). It has also been observed in other parts of the Himalaya that vertical and sub-vertical slopes have comparatively fewer landslides, owing to the favourable geotechnical characteristics of the rocks constituting such slopes (Gupta et al. 2016a; Kumar et al. 2019). Further, slopes at elevations <1100 m above msl exhibit a YC value of −1, indicating complete dissociation from the occurrence of landslides. Only slopes in the elevation ranges of 1300–1500 m and 1900–2100 m show a feeble positive association with landslides (table 2). Slopes at all other elevation ranges show negative association with landslides. In general, elevation controls mechanical as well as chemical weathering, which in turn affects slope stability and hence landslides. However, in Mussoorie and its surroundings, the spatial distribution of landslides does not seem to be controlled by elevation.

Also as expected, flat areas show complete dissociation from landslides. Only the north, south and southeast facing slopes show positive association with landslides, whereas slopes facing other directions show dissociation. Further, southeast and south facing slopes show relatively higher positive association with landslides, with YC values of 0.266 and 0.211, respectively (table 2), than the north facing slopes. This may possibly be due to higher solar insolation on the south and southeast facing slopes, which leads to more physical weathering and hence destabilisation of slopes. Comparatively higher incidences of landslides on south facing slopes have also been observed in other areas of the Himalayan region (Mathew et al. 2009; Sharma and Mahajan 2019).

In mountainous regions, the curvature of a slope plays an important role in slope stability (Pham et al. 2015; Polykretis and Chalkias 2018), and therefore YC has been calculated for the various classes of profile curvature and plan curvature. The YC values for plan curvature indicate a higher association with landslides than those for profile curvature. Further, concave slopes are observed to be more prone to landslides. This may possibly be due to accumulation of water in the concave part of the slope, leading to destabilisation (Dai and Lee 2002; Pradhan et al. 2010; Guri et al. 2015; Pham et al. 2015; Ding et al. 2017).

Negative YC values in each 50 m zone up to a distance of 200 m from the lineaments indicate that lineaments play no role in the distribution of landslides in the study area (table 2). Further, the 50 m buffers along the road cuts also show negative or very low YC values, indicating that road cutting is also not a major cause of landslides in the study area. Most of the drainages in the area are first or second order and non-perennial, and are not the main controlling factor for the occurrence of landslides in the area (Pachauri and Pant 1992). This is also evidenced by the negative YC values in the vicinity of the drainages.

8 Validation

In order to assess the accuracy of the prepared landslide susceptibility map (LSM), validation was carried out using the success rate curve (SRC) and the prediction rate curve (PRC). Both curves plot the cumulative percentage of the study area against the cumulative percentage of landslides, and indicate the accuracy and the predictive value of the LSM, respectively. The success rate curve was drawn using the landslide training dataset used for the preparation of the landslide susceptibility map, and gives an area under curve (AUC) value of 0.75 (figure 7a). In general, the AUC value ranges between 0.5 and 1, with values close to 1 indicating higher accuracy of the model and values close to 0.5 indicating an inaccurate model (Pham et al. 2015). In the present case, the AUC value of 0.75 shows that the model is acceptable and of good quality (Beguería 2006).
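
The construction of such a curve and its AUC can be sketched as follows. This is a generic trapezoidal calculation under stated assumptions, not the ILWIS procedure used in the study: map units are ranked from the highest to the lowest susceptibility score, and the zone scores, areas and landslide areas are hypothetical.

```python
def rate_curve_auc(zones):
    """Area under the cumulative-landslide versus cumulative-area curve.

    zones: iterable of (susceptibility_score, area, landslide_area) tuples.
    """
    ordered = sorted(zones, key=lambda z: z[0], reverse=True)  # most susceptible first
    total_area = sum(z[1] for z in ordered)
    total_slides = sum(z[2] for z in ordered)
    auc = 0.0
    x_prev = y_prev = 0.0
    cum_area = cum_slides = 0.0
    for _, area, slides in ordered:
        cum_area += area
        cum_slides += slides
        x, y = cum_area / total_area, cum_slides / total_slides
        auc += (x - x_prev) * (y + y_prev) / 2.0  # trapezoid between successive points
        x_prev, y_prev = x, y
    return auc


# Hypothetical zones: (score, area, landslide area).
example_auc = rate_curve_auc([(0.9, 10, 50), (0.5, 30, 30), (0.1, 60, 20)])
print(round(example_auc, 2))  # 0.76
```

If landslides were spread in proportion to area, the curve would follow the diagonal and the AUC would be 0.5, which is why values near 0.5 indicate a model with no discriminating power.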

Figure 7

Validation curves: (a) success rate curve and (b) prediction rate curve.

The prediction rate curve was generated by the same method, but using the 16 landslides that were not part of the training dataset. The AUC value of the prediction rate curve was found to be 0.70 (figure 7b), indicating good agreement between the LSM and the occurrences of landslides. This is evidenced by the observation that 10 of the 16 landslides used for the prediction rate curve lie in the very high landslide susceptibility zone and the remaining six in the high landslide susceptibility zone. It is also notable that the success and prediction rate curves are steeper in their first part, indicating greater predictive capability.

9 Conclusions

Mussoorie, a hilly township in the Garhwal Himalaya, attracts thousands of tourists, particularly during the summer. The bivariate statistical analysis carried out for landslide susceptibility mapping indicates that landsliding in the area is controlled mainly by lithology, curvature, slope, slope aspect, and landuse–landcover. Five landslide susceptibility classes, namely very high, high, moderate, low and very low, were demarcated. It has been noted that ~2.31% of the area falls in the very high susceptibility zone, ~12.94% in the high, ~28.65% in the moderate, ~24.01% in the low and ~32.39% in the very low susceptibility zone (figure 8).

Figure 8

Landslide susceptibility map of the study area.

It has been observed that the high and very high landslide susceptibility zones are mainly concentrated in the E–W trending central part, and also in the southern and western parts of the study area, whereas the northern and eastern parts fall in the low hazard zone. Further, settlements such as Bhattafall, George Everest, Kempty fall, and Barrlowganj lie in the high hazard zones. This is consistent with the weak, fractured and weathered rock strata in the area, which may fail following a change in any geo-environmental factor, such as a change in slope geometry due to construction activity, or excessive rainfall. In the western part of the study area, a lot of construction activity in the form of new buildings has been noted. This requires cutting of slopes, which changes the slope geometry (Verma et al. 2016). Such activities, in turn, destabilise the slope. An example is the Luxmanpuri landslide, depicted in figure 9, which developed due to construction activity in the area. It is important to mention that such human interference in the form of slope cutting is prevalent in the western part only, as there is a complete ban on any interference with slopes in other areas, and this may be one of the reasons that the greater part of the high and very high hazard zones is concentrated in the western part of the area.

Figure 9

A view of the Luxmanpuri landslide.

Since there is greater pressure on the finite land resources in hilly townships, there is an urgent need to carry out landslide susceptibility mapping on a larger scale. Such maps would be of use to planners and developers for further planning and development of the area.