Abstract
Multivariate binary logistic regression (LR) model was used for the assessment of landslide susceptibility in the Rorachu river basin of eastern Sikkim Himalaya. For this purpose, a spatial database of 13 factors such as rainfall, slope, aspect, curvature, relief, drainage density, distance from drainage, distance from lineament, distance from road, geology, soil, Normalized Difference Vegetation Index (NDVI), and land use/land cover was constructed under Geographical Information System (GIS) environment. A landslide inventory map was prepared and converted into binary raster coded by 0 for absence and 1 for the presence of landslide. Total 946 landslide pixels were found out of which 725 landslide pixels (76.63%) were used as training dataset for the model and the model was validated using all landslide pixels. The coefficient value of geology was maximum followed by NDVI, soil and land use/land. The calculated probability value was used as Landslide Susceptibility Index (LSI). On the basis of the LSI value, the landslide susceptibility map was divided into five distinct categories of very low, low, moderate, high and very high susceptibility. The susceptibility classes depict that, 9.18% of total area was under very high and high susceptibility contains 75.48% area of total landslide. Finally the accuracy of the model was assessed by area under curve (AUC) of receiver operating characteristics (ROC) curve and landslide density method. The AUC value of 0.947 indicates a very good quality of the model and landslide density shows that the density is increasing with LSI classes.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
The impact of landslide on human life and economic activities is very intense in the different countries of the world (Shahabi and Hashim 2015). India has its 25 percent of the total territory under hilly mountainous topography among which tectonically active great Himalayan range is the main mountainous region (Dubey et al. 2005). Due to immature and fragile geological structure, landslide is a very common phenomenon in the Himalayan region. Only for the landslide India faces a loss of US $500 million per year (Dubey et al. 2005). So proper planning is necessary for the reduction of intensity and prevention of the hazard. For this purpose landslide susceptibility map is a very significant tool. Nowadays due to the availability of highly accurate remote sensing data landslide susceptibility mapping becomes easier (Shahabi and Hashim 2015). Landslide is an outcome of combined action of various physical and anthropogenic factors which affect the event directly or indirectly. Hence all methods of landslide susceptibility analysis associated with more or less same conditions: (1) identification of causative factors which are responsible for slope instability, (2) calculation of rating or weight by selecting a suitable method or technique, (3) assessment of role of instability factors for the occurrence of landslide on the basis of their primary ratings or weights, (4) identification of landslide susceptibility zones on the basis of the classification of landslide susceptibility index (LSI) (Anbalagan 1992; Guzzetti et al. 1999, and Dai et al. 2002). The landslide susceptibility mapping deals with the identification and division of land surface on the basis of the degree of actual and potential landslide. For this reason it can be a key tool to the planners for suitable site selection for human settlement and infrastructural development (Parise 2002). In recent times, Geographical Information System (GIS) based statistical modelling become popular for landslide study. Throughout the world plenty of work has been done under GIS environment and many researchers have applied probabilistic models (Lee and Choi 2003; Pradhan et al. 2006, 2011; Kayastha 2015). Statistical models such as logistic regression (Bathrellos et al. 2009; Akgun 2012; Mondal and Mandal 2017) and statistical index model (SI) (Bui et al. 2011; Regmi et al. 2014; Mandal and Mandal 2017) have also employed for landslide susceptibility analysis. Apart from this expert knowledge based statistical models: analytical hierarchy process (Tazik et al. 2014; Kumar and Anbalagan 2016; Rahaman and Aruchamy 2017), weighted overlay model (Shit et al. 2016) were used for landslide susceptibility assessment.
Landslide was observed and studied from late 1800s (Endlich 1876; Atwood and Mather 1932; Crandell and Varnes 1961; Varnes and Savage 1996). Later landslide studied with both qualitative (Ko Ko et al. 2004) and quantitative approaches (Dhakal et al. 2000; Mandal and Maiti 2015). Qualitative approaches are more effective in case of small scale landslide hazard mapping (Yoshimatsu and Abe 2006; Castellanos et al. 2008; Jian and Xiang-guo 2009). On the other hand, for large scale mapping quantitative methods are more suitable (Vakhshoori and Zare 2015). In last two decades remote sensing and GIS based techniques were widely used for the assessment of landslide hazard phenomena (Nagarajan et al. 1998). These techniques depend on scale, practical knowledge of the study area, scientific knowledge, expenditure of the study and time (Vakhshoori and Zare 2015). GIS based techniques and methods were successfully distinguished by several researchers into direct geomorphological mapping (Cardinali et al. 2002), landslide inventory mapping (Guzzetti et al. 2012), heuristic method (Leoni et al. 2015), statistical method (Tornyai and Lúchava 2015), and deterministic approaches (Van Westen and Terlien 1996). Throughout the world plenty of work has been carried out under GIS environment and many researchers have successfully applied probabilistic models (Zhou et al. 2002; Lee and Choi 2003; Youssef et al. 2009, 2012; Kayastha 2015). Statistical models such as logistic regression (Tunusluoglu et al. 2007; Mondal and Mandal 2017) and statistical index model (SI) (Bui et al. 2011; Regmi et al. 2014; Mandal and Mandal 2017) have also employed for landslide susceptibility analysis. Apart from this expert knowledge based statistical models: analytical hierarchy process (Tazik et al. 2014; Kumar and Anbalagan 2016), weighted overlay model (Shit et al. 2016; Basharat et al. 2016; Kanwal et al. 2016), multi-criteria elaluation techniques (Ahmed 2015) were used for landslide susceptibility assessment.
Slope instability as well as landslide is quite common problem in entire Sikkim. The occurrence of landslide is increasing day by day in the Rorachu river basin due to increase of anthropogenic activities like, rapid urban expansion, illicit hill cutting, incessant deforestation etc. In Sikkim, several researchers have successfully carried out some studies on use of 3-D digital elevation model (DEM) for landslide assessment (Dubey et al. 2005), mechanism of landslide initiation (Anbarasu et al. 2010), movement monitoring of landslide by GPS (Rawat et al. 2011), use of high resolution satellite data for damage and geological assessment (Martha et al. 2015), rule-based semi-automated method for landslide detection (Siyahghalati et al. 2014), remote sensing and GIS based statistical approach for landslide susceptibility analysis (Sarkar et al. 2008; Rawat et al. 2016; Vishwakarma et al. 2017). These models have upgraded the information, database and mapping techniques for the analysis of landslide (Chen et al. 2016). The present study deals with identification of landslide susceptibility areas in the Rorachu river basin of eastern Sikkim Himalaya using GIS based logistic regression model.
Study area
The study was completed in the Rorachu river basin of eastern Sikkim Himalaya, located to the northern extent of east Sikkim district. The study area extends between 27°17′19″ to 27°23′52″N and 88°35′37″ to 88°43′17″E with an area of 71.73 sq. km. The maximum and minimum altitudes of the Rorachu river basin are 4213 and 834 m respectively (Fig. 1). The altitude of the area is rapidly increasing from southwestern extent (Ranipool, 27°17′29.04″N, 88°35′29.76″E) towards its northeastern extent (Pandramaile, 27°21′41″N, 88°42′45″E). According to Koppen’s climatic classification, the study area is dominated by subtropical highland climate (Cwb). Because of its high relief and sheltered environment the Rorachu basin enjoys a mild temperature throughout the year with an average maximum temperature of 22 °C during summer and 4 °C during winter. Dissected hilly terrain is the main geomorphologic unit of the study area. Northern and northeastern parts of the basin where maximum altitude was found are under highly dissected hilly terrain where relatively low lying areas like Gangtok, Ranipool were under moderately dissected hilly terrain. Steep slope along with the presence of numerous number of first and second order streams may be the prime causes of high topographic dissection of the basin, as well as slope instability in the Rorachu river basin of East Sikkim.
Database of the study
Landslide is a complex geo-environmental process which occurrence depends on tectonic, climatic, geological, topographical and anthropogenic factors. So, the selection of proper factors is considered as a prime task in landslide related studies. The selection and mapping of appropriate set of factors responsible for the occurrence of landslide depends on the detail knowledge about the main causes of landslide initiation (Guzzetti et al. 1999). For the preparation of land susceptibility map, 13 landslide causative factors including rainfall, slope, aspect, curvature, relief, drainage density, distance from drainage, geology, soil, distance from lineament, distance from road, NDVI and land use/land cover were taken into consideration. Previous literatures (Mondal and Mandal 2017) showed that these factors were significantly used by the researchers in landslide susceptibility analysis. For logistic regression analysis two separate spatial database of continuous data (rainfall, slope, aspect, curvature, relief, drainage density, distance from drainage, distance from lineament, distance from road and NDVI) and categorical or discrete data (geology, soil and land use/land cover) were constructed. In the present study, data and maps from various sources like Survey of India, Geological Survey of India, National Atlas and Thematic Mapping Organization (NATMO), United States Geological Survey (USGS), and world climate website (http://www.worldclim.org) were collected to prepare thematic data layers of different causative factors (Table 1).
Methodology of the study
Methodological framework of the study (Fig. 2) is divided into several distinct segments like, spatial database construction for landslide conditioning factor and landslide locations, formulation of the model, preparation of landslide susceptibility map and validation of the model. The methodological framework of the study was taken on the basis of a very familiar principle of “present and past are keys to the future”. The basic law of this principle is related to the use of subsisted landslide for the evaluation of future landslide areas (Bai et al. 2010), due to which the spatial database was constructed. For the construction of spatial database, all of the landslide causative factors were converted into raster data of 30m × 30 m cell size.
Preparation of landslide inventory:
In GIS based statistical analysis, landslide distribution map or landslide inventory is very much significant, because a landslide inventory provides not only the spatial information of landslide, but also helps to extract parametric information of landslide affected and adjacent areas. Due to these reasons, landslide inventory map is widely used for GIS based landslide susceptibility analysis (Bui et al. 2011; Shit et al. 2016; Mondal and Mandal 2017), seed cell sampling based landslide susceptibility analysis (Bai et al. 2010; Wang et al. 2013; Dagdelenler et al. 2015). For the identification of shape, size, location of landslide and their types of materials and movements, comprehensive field studies were conducted using GPS (Bai et al. 2010). In the Rorachu river basin, total numbers of 80 landslides were recognized with a total areal coverage of 0.85 sq. km (Fig. 3). A detailed landslide database was prepared during field study on the basis of landslide classification provided by Varnes (1978) and Cruden and Varnes (1996). The landslide database comprises shallow landslides, deep seated landslide, rock slide and earth slides. To prepare landslide inventory map, all landslide locations were vectorized from LANDSAT 8 OLI, sentinel-2 image and Google earth images using Arc Map 10.3 software. For the preparation raster layer of landslides, a vector to raster transformation was accomplished.
Application of logistic regression model:
Logistic regression is a statistical model which permits a multivariable regression analysis between a dependent and a group of independent variables (Bai et al. 2010). On the basis of a group of predictor variables, logistic regression is an important multivariate analyzer for the prophecy of presence or absence of an outcome (Lee 2007). The main advantage of logistic regression is that, with the help of a proper link function to the ordinary linear regression, logistic regression can be performed. In this case, there is no problem if the data is either continuous or discrete or both and the regression does not require the normal distribution of the data. In the present study, the dependent variable landslide is presented by binary digits, 0 for absence and 1 for the presence of the event. This condition where dependent variable is binary, the logistic regression link is suitable (Atkinson and Massari 1998). After the conversion of dependent variable into logit variable, the algorithm of logistic regression implicates maximum likelihood estimation. Thus, logistic regression measures the probability of an event (Atkinson and Massari 1998; Dai and Lee 2002). In the present condition when the dependent variable is binary, logistic regression model can be written as equation-1.
where P is the probability of the event, which ranges between 0 to 1on a sigmoid pattern, z is the linear combination of the model (linear logistic model) which ranges between − ∞ to + ∞ (Eq. 2).
where the b0 represents the intercept of the model, b i (i = 1, 2, 3….n) represents the slope co-efficient and xi (i = 1, 2, 3…n) is the number of independent variable in the equation. The linear logistic regression represents the existing conditions (presence or absence of landslide), on the basis of pre-failure conditions (independent variables).
In the present study the spatial relationship between landslide and its causative factors was assessed using logistic regression model in the Rorachu river basin of eastern Sikkim. For the construction of the model and calculation of probability, all the data layers including landslide inventory were converted into point format and the value of each point was used in the IBM Statistical Package for the Social Sciences (SPSS) statistical software. All factors were used to calculate coefficient values of the model. Landslide is a very complex process controlled by several geo-environmental factors. On the basis of the coefficient values, the model analyses the role of causative factors or which factors are prime for the occurrence of landslide in the Rorachu river basin.
Conversion of categorical variables
Generally in logistic regression analysis due to the nominal nature, conversion of categorical variables into numeric variables using binary dummy variable is necessary. But the excess use of dummy variables may increase the length of the equation and influence the accuracy of the model. For this reason, category wise frequency ratio (Eq. 3) was calculated (Table 2) to change categorical variable into numeric variable (Bai et al. 2010; Lee and Pradhan 2007; Wang et al. 2013). A conceptual framework of frequency ratio is presented in the equation (Eq. 3).
where LAi is the number of landslide pixel in ith class of a parameter, Ai is the total number of pixel of ith class in a parameter and N is the total no of class in a parameter. The categories of geology, soil and land use/land cover were transformed into numerical variables.
Preparation landslide susceptibility map:
The calculated probability of the model was used as landslide susceptibility index (LSI). The probability values were assigned to the point features using Arc Map 10.3 software data joining tool. With the help of inverse distance weighting (IDW) tool in Arc Map 10.3 software package, the points of different probability values were interpolated and landslide susceptibility map was prepared. On the basis of LSI values, five distinctive zones of landslide susceptibility were identified using natural break reclassify scheme in Arc Map 10.3.
Model validation
GIS based statistical modeling is a very useful technique for landslide susceptibility analysis. But the acceptance, accuracy and predictive capacity of the model depend on the proper validation. Without validation there is no scientific significance of such models. In the present study, receiver operating characteristics (ROC) curve and decision rule based landslide density method were used to validate the model.
Model validation by ROC curve:
ROC curve is a commonly used method to visualize the performance of the binary classifier, meaning a classifier with two possible output processes, i.e. presence or absence of an event where presence is considered as positive and absence as negative classification. A cut-off point or threshold value is used to discriminate two outcomes. On the basis of the classification, the result is divided into four types, i.e. true positive or TP (presence of event is correctly classified as positive), false negative or FN (presence of event is classified as negative), true negative or TN (absence of event is correctly classified as negative) and false positive or FP (absence is classified as positive). All the results are very much significant for the calculation of specificity and sensitivity. ROC curve is a two dimensional diagram in which specificity lies on the X axis and sensitivity on the Y axis. A conceptual framework may help to understand the basic structure of ROC curve (Eqs. 4, 5). The precision of the test depends how well the test divides the area of an event from non-event areas. Accuracy of a test is measured on the basis of the area under ROC curve (AUC). The value of AUC ranges between 0.5 and 1.0 where 1 indicates perfect test and on the other hand 0.5 indicates useless test.
Model validation by landslide density method:
Landslide density is the ratio of actual landslide area to the landslide susceptibility classes (Sarkar et al. 2008). The class wise landslide density and landslide susceptibility areas were calculated with the help of area with and without landslides for each susceptibility class. The basic rule of this method is that, in case of highly accurate map landslide density will increase with increasing LSI values and highest landslide density will be found in very high landslide susceptibility class.
Landslide conditioning factors
The present study deals with the assessment of landslide susceptibility using logistic regression model. For the fulfillment of the purpose, rainfall, slope, aspect, curvature, relief, drainage density, distance from drainage, distance from lineament, distance from road, geology, soil NDVI and land use/land cover were selected and used as significant factor for landslide occurrence. By nature the data layers of landslide conditioning factors were mainly of two types: continuous data (rainfall, slope, aspect, curvature, relief, drainage density, distance from drainage, distance from lineament, distance from road and NDVI) and categorical or discrete data (geology, soil and land use/land cover). In case of categorical data conversion of the data is necessary for logistic regression analysis.
Rainfall
Rainfall is considered as triggering factor for landslide occurrence, as it increases the soil saturation due to which the slope materials become unstable. On the other hand heavy rainfall encourages the surface run-off as well as the discharge and erosive capacity of small stream segments. The rapid erosion caused by tiny rills, gullies and lower order streams directly affect the slope stability by reducing the cohesiveness of soil. Rainfall data of the study area was collected from the world climate website (http://www.worldclim.org) and a rainfall distribution map was prepared using Arc Map 10.3 software. From the prepared rainfall distribution map (Fig. 4), the maximum 224 cm average annual rainfall was noticed in the Rorachu river basin.
Morphometric factor
Among all morphometic factors, slope, aspect, curvature, and relief were extracted from Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) of 30 m spatial resolution with the help of Arc Map 10.3 software package. Previous literature (Shit et al. 2016; Mondal and Mandal 2017) showed that these factors were widely used by researchers in landslide susceptibility analysis. For the calculation of drainage density river and streams were vectorized from topographical map no. 78A/11 and compared with digital elevation model for necessary corrections. Drainage density was calculated after Horton’s (1945) method (Eq. 6).
where Dd denotes drainage density, Lk represents the length of the streams of a basin and Ak is the total area of the basin. The basin was divided into 1 km × 1 km grids and length of all stream segments per grid was measured to analyze drainage density. On the basis of the obtained value a drainage density map was prepared with the help of inverse distance weighting (IDW) tool in ArcMap 10.3. The slope map of the Rorachu river basin (Fig. 5a) depicts that, the slope ranges between 0° and 68.9495°. The aspect of slope is sounded clockwise in degrees from 0 to 360. On the basis of degree values the aspect map (Fig. 5b) of the study area was prepared. Slope aspect influences temperature and precipitation which affects soil moisture, thickness of soil and vegetation cover of the slope. Normally south oriented slopes in the northern hemisphere receive more precipitation and become unstable due to direct impact of saturation. The curvature of the Rorachu river basin ranges from − 28.09 to 6.79 (Fig. 5c), where positive value indicates convexity and negative value denotes concavity of the slope. The relief of the Rorachu river basin is gradually increasing northward and maximum relief of 4114 m (Fig. 5d) was recorded in the northern part of the basin. In the Rorachu river basin, drainage density ranges from 0.81 to 7.46 (Fig. 6a).
Distance from drainage, lineament and road
Streams are important agent for saturation of slope as well as slope instability. It was widely used by many researchers (Bui et al. 2011; Aghdam et al. 2016, and Wu et al. 2017). For the assessment of influence of streams on landslide occurrence, drainage network of Rorachu river basin was digitized from the topographical map no 78A/11 and a distance from drainage map was prepared on the basis of 100, 200, 300, 400, 500, 600 and 700 m distance (Fig. 6b). Lineaments are considered as the linear expression of underlying geological structure like, fault. Hence, the slopes with lineaments have greater tendency to become unstable. Using line tool of Geomatica 10.2 software, lineaments of the study area were extracted from the panchromatic band of LANDAT 8 OLI image with 15 m spatial resolution and compared with 1:50,000 scaled thematic map of bhuvan (http://www.bhuvan.nrsc.gov.in). For the better understanding of influence of lineaments on slope instability, a distance from lineament map (Fig. 6c) was prepared using 100, 200, 400, 800, 1200, 1600 and 2400 m buffer distance. Modification of slope angle due to construction of road and load of heavy vehicles may affect the stability of slope by increasing stress and reducing cohesiveness of soil. As a result landslide occurred near the roads. In Rorachu river basin 56 landslide locations were identified along 31A national highway. Road network of the study area was also digitized from the same topographical map and updated with the help of Google earth image of 2016 and 10 m resolution sentinel-2 satellite image of 2017. To assess the impact of road on landslide occurrence, a distance from road map was prepared with 100, 200, 400, 800, 1600, 2400 and 3600 m buffer distance (Fig. 6d). All buffer maps were prepared using Arc Map 10.3 software.
Geology and soil
Geology is considered as one of the most important causative factor for slope instability as well as landslide. Fragile and weak geological structures are more prone to landslides. The geological map of the study area was prepared from district resource map of east Sikkim collected from geological survey of India, Kolkata. Different geological groups were vectorized by Arc Map 10.3 software. From the geological map, five lithological groups were identified in the Rorachu river basin such as quartzite, sillimanite bearing granite gneiss, schist, amphibolite and granite gneiss (lingtse gneiss) (Fig. 7a). The soil map of Rorachu river basin was collected from Natural resource atlas of Sikkim guided by National Atlas and Thematic Mapping Organization (NATMO), Kolkata. Soil of the study area was divided into seven different categories on the basin of material present in the soil (Fig. 7b) such as fine loamy fluventic eutrudepts (S001), coarse loamy humic pachic dystrudepts (S002), coarse loamy humic dystrudepts (S003), fine loamy typic paleudolls (S004), fine skeletal cumulic hapludolls (S005), loamy skeletal entic hapludolls (S006) and coarse loamy typic hapludolls (S007). The characteristics of different soil are mentioned in the Table 3.
Ndvi and land use/land cover
The normalized difference vegetation index (NDVI) map (Fig. 7c) was prepared from sentinel-2 image of 10 m spatial resolution using NDVI index in erdas imagine 9.2 software with the help of the given equation:
where NDVI indicates the normalized difference vegetation index, NIR is the near infrared band (band no. 8) and R (band no. 4) is the red band of sentinel-2 image. The value of NDVI ranges between − 1 and + 1, where the values closer to 0 denotes less vegetation and closer to + 1 indicates good concentration of green leaves. Land use/land cover map was also prepared from sentinel-2 image using maximum likelihood algorithm in Erdas Imagine 9.2 image processing software under supervised image classification scheme and later image classification accuracy was assessed by Cohen’s Kappa coefficient method. Land use/land cover map (Fig. 7d) reflects the condition of physical environment and anthropogenic activities. Bare ground, settlement, road, river, terrace farming, sparse vegetation and dense vegetation were identified as significant land use/ land cover in the present study area where dense forest occupied 58.03% area, was considered as the dominant land cover followed by sparse vegetation, bare ground, settlement, river, terrace farming and road.
Result and discussion
Analysis of logistic regression model
On the basis of training data (725 landslide pixel and 78,975 non landslide pixel) obtained from thirteen thematic data layers and landslide inventory map, the model was constructed. After calculation, the result of the model was further verified to assess whether the dataset used in this model was suitable or not. There are various processes for the verification of the model result or to assess the goodness of fit of the model. In the present study, Hosmer–Lemeshow test, Cox and Snell R2 and Nagelkerke R2 and − 2log likelihood method was used to check the suitability of the model. In case of Hosmer–Lemeshow goodness of fit, the model is considered as suitably fitted if the significance of chi square value of the test is more than 0.05 (Bai et al. 2010). In the study when the test was performed, the significance of chi square value of 0.259 was obtained for the model (Table 4) which indicates the suitability of the model. In other words, the dataset used in the model was suitable for logistic regression analysis. Pseudo R2 is another method to justify model fitting. The model is regarded as perfect if the pseudo R2 value is 1 and the value above 0.2 indicates overall good fit (Clark and Hosking 1986). In the present study Cox and Snell R2 and Nagelkerke R2 were used to assess how logistic regression model fits the data. The calculated Nagelkerke R2 value was 0.209 which indicates a relatively good fit of the model (Table 5).
− 2log likelihood (− 2LL) is another method used to evaluate the goodness of fit of the model. The method is a key concept to understand the test in multiple regressions (Garcia-Rodriguez et al. 2007). Generally the smaller value of − 2LL implies better result of the test. In this model − 2LL were calculated using all dependent variables and excluding one by one. The result showed that, the model yields lowest value when all the dependent variables were used (Table 5). That means the causative factors selected for the model construction, were prime factors for the landslide occurrence in the Rorachu river basin and the factors were perfectly suited in the model.
On the basis of the independent and dependent variable, coefficient values of the logistic regression model were calculated (Table 6) and on the basis of the coefficients the following equation or linear combination was constructed for the present study:
where RAIN indicates rainfall, SLOPE indicates slope, ASPECT indicates aspect, CURV indicates curvature, REL indicates relief, DD indicates drainage density, GEOL indicates geology, SOIL indicates soil, DRAI indicates distance from drainage, LIN indicates distance from lineament, ROAD indicates distance from road, NDVI indicates normalized difference vegetation index (NDVI), and LULC indicates land use/land cover of the study area.
The calculated coefficient values reveal that, all landslide causative factors were not equally important or significant for the occurrence of landslide in the Rorachu river basin of eastern Sikkim Himalaya. Generally the significance value of all coefficients was less than 0.05 which depicts that all the independent variables used in the model were significant for the occurrence of the event. Coefficient value also explains the role or contribution of landslide causative factors in the landslide occurrence. The factors have higher coefficient values are considered as more influential than the others. Beside this, coefficient value analyses the change of dependent variable on the basis of independent variables. From the coefficient values, it was noticed that all the landslide causative factors are not positively related with landslide occurrence in the basin and in some cases zero relationship was also found (Table 6). Geology, soil, slope, NDVI and land use/land cover have positive relationship with landslide as the coefficient value of these factors were 2.952, 0.122, 0.038, 0.340 and 0.103 respectively. On the other hand rainfall, relief, aspect, curvature, drainage density, distance from drainage, and distance from road have negative relationship. From the coefficient values it is evident that, lineament has zero relationship with landslide occurrence in the Rorachu river basin. Geology, soil and land use/land cover were converted into continuous data using frequency ratio. So, their relationship with landslide was built on the basis of frequency ratio values. The coefficient value of geology shows that, it is the prime causative factor for the landslide occurrence. Geology yields maximum coefficient value (Table 6). In geological sub-divisions, maximum frequency ratio value was noticed in case of quartzite and Sillimanite bearing granite gneiss (Table 6). The high positive coefficient value reflects the same condition that, the categories having more frequency ratio value were more prone to landslides. Soils with coarser materials are more prone to landslide due to their excessive drained nature. Cohesion is minimum in this type of soil and as a result soil mass flows down to the slope very easily in moist condition. The same fact was noticed in the Rorachu river basin. Maximum frequency ratio was found in coarse loamy typic hapludolls soil followed by loamy skeletal entic hapludolls, coarse loamy humic dystrudepts and coarse loamy humic pachic dystrudepts. Positive coefficient value of soil indicates the same condition as geology. In case of land use/land cover, road, river and bare ground were the main features which influence the occurrence of landslide in the study area. Normalized Difference Vegetation Index (NDVI) represents the presence or absence of the vegetation. Positive coefficient of NDVI indicates that, landslide was occurred not only in bare surfaces, but also in the well vegetated areas in Rorachu river basin. Few landslide and landslide scar was found in the Ranipool forest block, Gangtok forest block and Bhusuk. Generally rainfall, relief and drainage density have positive relation with landslide. But in the present study, they have negative relationship with landslide in the basin. It could be noted that, this is reverse of the normal condition. Maximum rainfall of the basin was found in the south-western part where most of the landslides were occurred in the northern and north western part. So from this scenario it is clear that though rainfall is a major triggering factor for the occurrence of landslide, in Rorachu river basin landslides are not controlled by rainfall only. In the logistic regression model of the present study the coefficient value of relief was found − 0.002 which indicates a very poor negative relationship between relief and landslide occurrence. But the actual picture is quite different from this. In the Rorachu river basin, the relief ranges between 834 m and 4114 m but maximum landslide was found between 2500 m and 3500 m elevation and 56% of landslide pixels were found within 3500 m altitude. From the coefficient value it seems that landslide occurrence is decreasing with increasing relief. But actually a sigmoid pattern of landslide occurrence was noticed in case relief. In the hilly mountainous areas slope stability depends on the distance between streams and drainage density, or it can be said that the distribution of drainage and drainage density influences slope stability in the mountainous areas. Stability of the slope decreases due to dissection of slope by active down cutting of lower order streams. This dissection not only reduces the strength of soil but the amount or degree of slope also changes in a considerable rate. Hence the slopes closer to the drainage are less stable. Drainage density is high when length of the stream per unit area is more. This depends on the shape and alignment of the stream segments. The areas having straight channels contain lesser drainage density than the areas where channel is more or less sinuous. In case of Rorachu river basin, the slope of entire northern and north-western part of the basin was highly dissected by small segments of lower order streams. As a result a huge number of debris flows were found in the area. Some of the debris flows directly intersect the 31A National Highway. As a result, the coefficient value of distance from drainage data layer was found negative. This negative relationship between landslide and distance from drainage clearly depicts that, more landslides were found closer to the river in the basin. Out of 785 landslide training points 726 points (92.48%) were found within 200 m distance from rivers. This fact clearly explains the role of fluvial erosion in slope stability in the Rorachu river basin of eastern Sikkim Himalaya. But in case of drainage density, the relationship between landslide and drainage density was found negative in the logistic regression model. This is due to the fact that in the moderate to low relief areas of Rorachu river basin length of the streams were more than the high relief areas. Slope played an important role for this type of variation in drainage density. Generally in high relief areas slope is more 50°, as result the lower order stream segments were very straight in character. On the other hand in moderate to low relief areas higher order streams are more sinuous or irregular. Hence in the present study maximum drainage density was found in the south-eastern part of the basin where relief is moderate to low. In case of curvature positive and negative value indicates convex and concave slope respectively and zero value indicates flat areas. Generally concave slopes are more prone to landslide than the convex. The coefficient of curvature in the model was − 0.058 which indicates that in the Rorachu river basin concave slopes were more prone to landslide than the concave slopes and landslide was absent in the flat areas. Roads are one of the major causative factors in the present study area. 56 landslides out of 80 were found along 31A national highway in the basin which clearly depicts the influence of slope cutting for the road construction and impact of vibration due to vehicle transportation. The study enlightens the fact that, 594 training landslide points were found between 200 m buffer distances from road. So it can be said that, 75.66% of total landslides were occurred very close to the roads in the Rorachu river basin.
Wald statistics is another method was used in the study to identify the causative factor for landslide occurrence. Wald statistics indicates the significance of the variables responsible for the occurrence of event (Wang et al. 2013). The variables with greater Wald value are considered as the most significant factor for the event. Table 6 revealed that, all factors used in the model were significant at 0.05 significance level and geology produced the highest Wald value followed by distance from road, rainfall and slope. Relief, drainage density and land use/land cover have moderate Wald value and other factors have comparatively lower Wald value. So, it is clear that Geology is the dominant factor for the occurrence of landslide in the Rorachu river basin. Both the coefficient and Wald value of geology is much higher than the other landslide conditioning factors in the basin.
Analysis of landslide susceptibility map
Calculated probability values of logistic regression model were used as landslide susceptibility index (LSI). After the preparation of, the landslide susceptibility map was divided into five suitable classes (very low, low, moderate, high and very high) using manual classification method in Arc Map software package to demarcate different landslide susceptibility zones (LSZ).
The final landslide susceptibility map (Fig. 8) revealed that, the LSI value ranges from 0.000023 to 0.349046. From the map it is very evident that, the northern stretch of the basin is more susceptible to landslide than the southern part. After the demarcation of landslide zones, class wise pixel count as well as area was calculated. The data showed that, out of 71.73 sq. km area of the basin, 36.70 sq. km (51.17%) area was occupied by very low susceptibility class and lowest area of 1.15 sq. km (1.61%) was found in case of very high susceptibility (Table 7). The southern, south-eastern and some portions of central parts of the basin are under stable condition. But if existed landslide location data was observed, it can be seen that most of the landslides were located in the northern, north-eastern and north-western part of the basin belong to high and very high susceptibility classes. Total 0.65 square kilometer (75.48%) slide area was occupied these two classes (Table 7). This condition indicates that, the areas having LSI value more than 0.034218, slope instability as well as landslide can be a serious threat to the life and property and natural resources in the Rorachu river basin. On the other hand landslide was rarely seen in the very low and low susceptibility classes. These two classes contain 0.09 square kilometer (9.72%) slide area (Table 7).
Validation of the model
Model validation by ROC curve
On the basis of 946 landslide pixels (including training and validation data) and 78,754 non-landslide pixels, a receiver operating characteristics (ROC) curve was prepared to validate the accuracy of logistic regression model. The graphical representation of ROC curve and AUC indicates a very good accuracy of the model (Fig. 9). The value of AUC was found 0.947 or 94.7% which reveals that the result of logistic regression model is very close to the perfect analysis of the data (Table 8). The asymptotic significance of the ROC curve apprises that the curve is also statistically significant.
Model validation by landslide density method:
The calculated landslide density shows that the susceptibility class wise landslide was increased towards the higher value classes and maximum landslide density of 0.2695 was found in very high landslide susceptibility class (Table 9).
On the basis of calculated landslide density, a landslide density curve was prepared to visualize the trend of landslide density. As the result of landslide density method was perfectly matched with the rules, so it can be stated that the overall accuracy and predictive capacity of the logistic regression model was very satisfactory.
Conclusion
GIS based statistical modeling is widely used in landslide susceptibility analysis. These methods are very much suitable and it has higher acceptability for the areas where data availability is minimum. Generally landslides are not easy to predict and forecast. In this regard, a landslide susceptibility map may provide brief and necessary information about slope instability of a particular region. Landslide susceptibility map can be a useful tool for the decision makers and planners as it demarcates and delineates the potential areas of future landslides. As complex geo-environmental phenomena, landslide is driven by a group of factors and all factors are not equally responsible for the event. In this condition logistic regression model is very much suitable for the analysis of relationship between landslide and its related factors as it estimates the correlation in a non-linear structure. In the present study, non-morphometric factors (geology, soil, NDVI, land use/land cover and distance from road) were found more responsible than the morphometric factors. Finally it may be concluded that, the landslide susceptibility map of the Rorachu river basin can be treated as an important source of information for future urban and land use planning of the region.
References
Aghdam IN, Varzandeh MHM, Pradhan B (2016) Landslide susceptibility mapping using an ensemble statistical index (Wi) and adaptive neuro-fuzzy inference system (ANFIS) model at Alborz mountains (Iran). Environ Earth Sci 75(7):553
Ahmed B (2015) Landslide susceptibility mapping using multi-criteria evaluation techniques in Chittagong Metropolitan area, Bangladesh. Landslides 12:1077–1095
Akgun A (2012) A comparison of landslide susceptibility maps produced by logistic regression, multi-criteria decision, and likelihood ratio methods: a case study at Izmir, Turkey. Landslides 9:93–106
Anbalagan R (1992) Landslide hazard evaluation and zonation mapping in mountainous terrain. Eng Geol 32:269–277
Anbarasu K, Sengupta A, Gupta S, Sharma SP (2010) Mechanism of activation of the Lanta Khola landslide in Sikkim Himalayas. Landslides 7:135–147
Atkinson PM, Massari R (1998) Generalized linear modeling of susceptibility to landsliding in central Apennines, Italy. Comput Geosci 24:373–385
Atwood WW, Mather KF (1932) Physiography and Quaternary geology of the San Juan mountains, Colorado. United States Geological Survey Professional Paper 166:163–164
Bai S, Wang J, Lü G, Zhou P, Hou S, Xu S (2010) GIS-based logistic regression for landslide susceptibility mapping of the Zhongxian segment in the Three Gorges area, China. Geomorphology 115:23–31
Balteanu D, Chendes V, Sima M, Enciu P (2010) A country wide spatial assessment of landslide susceptibility in Romania. Geomorphology 124:102–112
Basharat M, Shah HR, Hameed N (2016) Landslide susceptibility mapping using GIS and weighted overlay method: a case study from NW Himalayas, Pakistan. Arab J Geosci 9:292
Bathrellos GD, Kalivas DP, Skilodimou HD (2009) GIS-based landslide susceptibility mapping models applied to natural and urban planning in Trikala, Central Greece. Estud Geol 65(1):49–65
Bui DT, Lofman O, Revhaug I, Dick O (2011) Landslide susceptibility analysis in the Hoa Binh province of Vietnam using statistical index and logistic regression. Nat Hazards 59:1413–1444
Cardinali M, Reichenbach P, Guzzetti F, Ardizzone F, Antonini G, Galli G, Cacciano M, Castellani M, Salvati P (2002) A geomorphological approach to the estimation of landslide hazards and risks in Umbria, Central Italy. Nat Hazards Earth Syst Sci 2:57–72
Castellanos A, Enrique A, Van Western CJ (2008) Qualitative landslide susceptibility assessment by multicriteria analysis: a case study from San Antonio del Sur, Guantanamo Cuba. Geomorphology 94:453–466
Chen W, Chai H, Sun X, Wang Q, Ding X, Hong H (2016) A GIS-based comparative study of frequency ratio, statistical index and weights-of-evidence models in landslide susceptibility mapping. Arab J Geosci 9:204
Clark WA, Hosking PL (1986) Statistical methods for geographers. Wiley, New York, p 518
Crandell DR, Varnes DJ (1961) Movement of the Slumgullion earth flow near Lake city, Colorado. Short papers in the geologic and hydrologic sciences. United States Geological Survey Professional Paper 424-B:136–139
Cruden DM, Varnes DJ (1996) Landslide types and processes. Special Report, Transportation Research Board, vol 247. National Academy of Sciences, pp 36–75
Dagdelenler G, Nefeslioglu HA, Gokceoglu C (2015) Modification of seed cell sampling strategy for landslide susceptibility mapping: an application from the Eastern part of Gallipoli Peninsula (Canakkale, Turkey). Bull Eng Geol Environ 75:575–590
Dai FC, Lee CF (2002) Landslide characteristics and slope instability modeling using GIS, Lantau Island, Hong Kong. Geomorphology 42(3–4):213–228
Dai FC, Lee CF, Ngai YY (2002) Landslide risk assessment and management: an overview. Eng Geol 64:65–87
Dhakal AS, Amada T, Aniya M (2000) Landslide hazard mapping and its evaluation using GIS: an investigation of sampling schemes for a grid-cell based quantitative method. Photogramm Eng Remote Sens 66(8):981–989
Dubey CS, Chaudhry M, Sharma BK, Pandey AC, Singh B (2005) Visualization of 3-D digital elevation model for landslide assessment and prediction in mountainous terrain: a case study of Chandmari landslide, Sikkim, eastern Himalayas. Geosci J 9(4):363–373
Endlich FM (1876) Report of F.M. Endlich. In: US geological and geographical survey (Hayden) of the territories annual report 1874, 203
Garcia-Rodriguez MJ, Malpica JA, Benito B, Diaz M (2007) Susceptibility assessment of earthquake-triggered landslides in El Salvador using logistic regression. Geomorphology 95:172–191
Guzzetti F, Carrara A, Cardinali M, Reichenbach P (1999) Landslide hazard evaluation: a review of current techniques and their application in a multi-scale study, central Italy. Geomorphology 31:181–216
Guzzetti F, Mondini AC, Cardinali M, Fiorucci F, Santangelo M, Chang K (2012) Landslide inventory maps: new tools for an old problem. Earth Sci Rev 112:42–66
Horton RE (1945) Erosional development of streams and their drainage basins: hydrological approach to quantitative geomorphology. Bull Geol Soc Am 56:275–370
Jian W, Xiang-guo P (2009) GIS-based landslide hazard zonation model and its application. Proced Earth Planet Sci 1:1198–1204
Kanwal S, Atif S, Shafiq M (2016) GIS based landslide susceptibility mapping of northern areas of Pakistan, a case study of Shigar and Shyok Basins. Geomat Nat Hazards Risk 8(2):348–366
Kayastha P (2015) Landslide susceptibility mapping and factor effect analysis using frequency ratio in a catchment scale: a case study from Garuwa sub-basin, East Nepal. Arab J Geosci 8(10):8601–8613
Khan YA, Lateh H, Baten MA, Kamil AA (2012) Critical antecedent rainfall conditions for shallow landslides in Chittagong city of Bangladesh. Environ Earth Sci 67(1):97–106
Ko Ko C, Flentje P, Chowdhury R (2004) Landslides qualitative hazard and risk assessment method and its reliability. Bull Eng Geol Env 63(2):149–165
Kumar R, Anbalagan R (2016) Landslide susceptibility mapping using analytical hierarchy process (AHP) in Tehri reservoir rim region, Uttarakhand. J Geol Soc India 87(3):271–286
Lee S (2007) Application of logistic regression model and its validation for landslide susceptibility mapping using GIS and remote sensing data. Int J Remote Sens 26(7):1477–1491
Lee S, Choi U (2003) Development of GIS-based geological hazard information system and its application for landslide analysis in Korea. Geosci J 7:243–252
Lee S, Pradhan B (2007) Landslide hazard mapping at Selangor, Malaysia using frequency ratio and logistic regression models. Landslides 4:33–41
Leoni G, Campolo D, Falconi L, Gioè C, Lumaca S, Puglisi C, Torre A (2015) Heuristic method for landslide susceptibility assessment in the Messina municipality. Eng Geol Soc Territ 2:501–504
Mandal S (2015) Upslope contributing area, topographic wetness and landsliding: a case study of the Shivkhola watershed, Darjeeling Himalaya. Int Res J Earth Sci 3(7):23–29
Mandal S, Maiti R (2012) Application of 1-d slope stability model in landslide susceptibility mapping of Shivkhola watershed Darjiling Himalaya. Int J Geol Earth Environ Sci 2(2):34–50
Mandal S, Maiti R (2015) Semi-quantitative approaches for landslide assessment and prediction. In: Springer natural hazards. Springer, Singapore (ISBN 978-981-287-145-9)
Mandal S, Mandal B (2015) Assessment and prediction of slope instability in the Lish river basin of Eastern Darjiling Himalaya using RS and GIS. Int Res J Earth Sci 3(12):9–20
Mandal S, Mandal K (2017a) Bivariate statistical index for landslide susceptibility mapping in the Rorachu river basin of eastern Sikkim Himalaya, India. Spat Inf Res 26(1):59–75
Mandal B, Mandal S (2017b) Landslide susceptibility mapping using modified information value model in the Lish river basin of Darjeeling Himalaya. Spat Inf Res 25:205–218
Martha TR, Govindharaj KB, Kumar KV (2015) Damage and geological assessment of the 18 September 2011 Mw 6.9 earthquake in Sikkim, India using very high resolution satellite data. Geosci Front 30:1–14
Mondal S, Mandal S (2017) RS & GIS-based landslide susceptibility mapping of the Balason River basin, Darjeeling Himalaya, using logistic regression (LR) model. Georisk 12(1):29–44
Nagarajan R, Mukherjee A, Roy A, Khire MV (1998) Temporal remote sensing data and GIS application in landslide hazard zonation of part of Western Ghat, India. Int J Remote Sens 19:573–585
Parise M (2002) Landslide hazard zonation of slopes susceptible to rock falls and topples. Nat Hazards Earth Syst Sci 2:37–49
Pradhan B, Singh RP, Buchroithner MF (2006) Estimation of stress and its use in evaluation of landslide prone regions using remote sensing data. Adv Space Res 37:698–709
Pradhan B, Mansor S, Pirasteh S, Buchroithner M (2011) Landslide hazard and risk analyses at a landslide prone catchment area using statistical based geospatial model. Int J Remote Sens 32(14):4075–4087
Rahaman SA, Aruchamy S (2017) Geoinformatics based landslide vulnerable zonation mapping using analytical hierarchy process (AHP), a study of Kallar river sub watershed, Kallar watershed, Bhavani basin, Tamil Nadu. Model Earth Syst Environ 3:41
Rawat MS, Joshi V, Rawat BS, Kumar K (2011) Landslide movement monitoring using GPS technology: a case study of Bakthang landslide, Gangtok, East Sikkim, India. J Dev Agric Econ 3(5):194–200
Rawat MS, Sundriyal YP, Joshi V (2016) Slope stability analysis in a part of East Sikkim, using remote sensing & GIS. In: 2016 2nd international conference on next generation computing technologies (NGCT-2016), Dehradun, 14–16 Oct 2016
Regmi AD, Devkota KC, Yoshida K, Pradhan B, Pourghasemi HR, Kumamoto T, Akgun A (2014) Application of frequency ratio, statistical index, and weights-of-evidence models and their comparison in landslide susceptibility mapping in Central Nepal Himalaya. Arab J Geosci 7(2):725–742
Rozos D, Bathrellos G, Skillodimou H (2011) Comparison of the implementation of rock engineering system and analytic hierarchy process methods, upon landside susceptibility mapping using GIS: a case study from the Eastern Achaia Country of Peloponnesus, Greece. Environ Earth Sci 63:49–63
Sarkar S, Kanungo DP, Patra AK, Kumar P (2008) GIS based spatial data analysis for landslide susceptibility mapping. J Mt Sci 5:52–62
Shahabi H, Hashim M (2015) Landslide susceptibility mapping using GIS-based statistical models and remote sensing data in tropical environment. Sci Rep 5:9899
Shit PK, Bhunia GS, Maiti R (2016) Potential landslide susceptibility mapping using weighted overlay model (WOM). Model Earth Syst Environ 2:21
Siyahghalati S, Saraf AK, Pradhan B, Jebur MN, Tehrany MS (2014) Rule-based semi-automated approach for the detection of landslides induced by 18 September 2011 Sikkim, Himalaya, earthquake using IRS LISS3 satellite images. Geomat Nat Hazards Risk 7(1):326–344
Tazik E, Jahantab Z, Bakhtiari M, Rezaei A, Alavipanah SK (2014) Landslide susceptibility mapping by combining the three methods fuzzy logic, frequency ratio and analytical hierarchy process in Dozain basin. Int Arch Photogramm Remote Sens Spat Inf Sci XL-2(W3):267–272
Tornyai R, Lúchava M (2015) Using statistical analyses to assess landslide hazard. Geol Geophys Environ 41(1):144–145
Tunusluoglu MC, Gokceoglu C, Nefeslioglu HA, Sonmez H (2007) Extraction of potential debris source areas by logistic regression technique: a case study from Barla, Besparmak and Kapi mountains (NW Taurids, Turkey). Environ Geol 54:9–22
Vakhshoori V, Zare M (2015) Landslide susceptibility mapping by comparing weight of evidence, fuzzy logic and frequency ratio methods. Geomat Nat Hazards Risk 7(5):1731–1752
Van Westen CJ, Terlien MTJ (1996) An approach towards deterministic landslide hazard analysis in GIS. A case study from manizales (colombia). Earth Surf Process Landf 21:853–868
Varnes DJ (1978) Slope movement types and processes. In: Schuster RL, Krizek RJ (ed) Landslides, analysis and control. Transportation Research Board Sp. Rep. No. 176. National Academy of Sciences of Ukraine, pp 11–33
Varnes DJ, Savage WZ (1996) The Slumgullion earth flow: a large-scale natural laboratory. US Geol Surv Bull 2130:57–60
Vishwakarma CA, Asthana H, Singh D, Pant M, Sen R, Mukherjee S (2017) GIS based bi-variate statistical approach for landslide susceptibility mapping of South District, Sikkim. Int J Innov Res Sci Eng Technol 6(7):13661–13674
Wang L, Sawada K, Moriguchi S (2013) Landslide susceptibility analysis with logistic regression model based on FCM sampling strategy. Comput Geosci 57:81–92
Wu Z, Wu Y, Yang Y, Chen F, Zhang N, Ke Y, Li W (2017) A comparative study on the landslide susceptibility mapping using logistic regression and statistical index models. Arab J Geosci 10:187
Yoshimatsu H, Abe S (2006) A review of landslide hazards in Japan and assessment of their susceptibility using an analytical hierarchy process (AHP) method. Landslides 3:149–158
Youssef AM, Pradhan B, Gaber AFD, Buchroithner MF (2009) Geomorphological hazard analysis along the Egyptian Red Sea coast between Safaga and Quseir. Nat Hazards Earth Syst Sci 9:751–766
Youssef AM, Pradhan B, Sabtan AA, El-Harbi HM (2012) Coupling of remote sensing data aided with field investigations for geological hazards assessment in Jazan area, Kingdom of Saudi Arabia. Environ Earth Sci 65(1):119–130
Zhou CH, Lee CF, Li J, Xu ZW (2002) On the spatial relationship between landslides and causative factors on Lantau Island, Hong Kong. Geomorphology 43:197–207
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Mandal, S., Mandal, K. Modeling and mapping landslide susceptibility zones using GIS based multivariate binary logistic regression (LR) model in the Rorachu river basin of eastern Sikkim Himalaya, India. Model. Earth Syst. Environ. 4, 69–88 (2018). https://doi.org/10.1007/s40808-018-0426-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s40808-018-0426-0