1 Introduction

Landslides pose a serious threat to life and property, especially in mountainous regions. Remote sensing data are used in three main phases of a landslide-related study: landslide detection and identification, monitoring of landslides, and spatial analysis and hazard prediction. A wide range of methods for landslide detection using satellite images is reported in the literature. Most early studies made use of aerial photographs of varying scales (1:50,000–1:10,000) and satellite images from Landsat TM and SPOT (Cheng et al. 2013; Ren and Lin 2010; Zhou et al. 2002). Many experts have used image interpretation as a tool for landslide identification; they analyzed the satellite images using different interpretation keys to identify landslides. These methods give a quick and timely response that allows rescue teams to carry out relief operations. Some examples of landslide detection using image interpretation techniques are discussed in Kääb (2002), Casson et al. (2003) and Van Westen and Lulie Getahun (2003). With the advancement of remote sensing technology and the easy availability of remote sensing data, many automatic approaches for landslide mapping and identification have been developed. There are two approaches for landslide characterization. The first involves determination of qualitative characteristics such as number, distribution, type and character of debris flow using airborne or satellite images. The second involves computation of dimensions such as length, width, thickness and slope using stereo SAR, interferometric SAR (InSAR) and topographic profiles (e.g., laser altimeter) (Singhroy and Molch 2004; Singhroy 2002).

Monitoring of landslides involves the comparison of landslide conditions such as speed of movement, surface topography and soil humidity to assess landslide activity (Mantovani et al. 1996). Cheng et al. (2004) proposed an automated landslide detection method using multi-temporal satellite images and DTM data. The method involves differencing the band ratios of two co-registered images to identify changed areas representing landslides. The detected landslides were further refined using terrain information. Nagarajan et al. (1998) presented a similar approach for landslide identification using IRS images. Another semiautomatic method for monitoring of landslides made use of multi-temporal VHR images. The method involved image orthorectification, relative radiometric normalization, change detection using image differencing, thresholding and spatial filtering to eliminate pixel clusters that could correspond to man-made land use changes (Hervás et al. 2003).

In mountainous areas, major earthquakes trigger numerous large landslides over broad areas, causing enormous economic losses and loss of life. The normalized difference vegetation index (NDVI) and terrain slope information derived from 8-day moderate resolution imaging spectroradiometer (MODIS) images were used to detect landslides induced by the Wenchuan earthquake (Zhang et al. 2010). NDVI filtering and change detection analysis were applied to remote sensing images to identify landslides in southern Taiwan (Tsai et al. 2010). An automated method for landslide detection classified remote sensing images into landslide and non-landslide areas using a scene classification method based on BoVW and pLSA (Cheng et al. 2013). Another method for landslide detection uses high-resolution panchromatic images from Cartosat-1 and IRS along with 10-m gridded DTM data. The method is based on change detection techniques and global contextual criteria in an object-based environment (Martha et al. 2012). Landslides caused by the 6.9-magnitude earthquake in Sikkim Himalaya, India, on September 18, 2011, were detected using the decision tree method applied to two Indian Remote Sensing satellite linear imaging self-scanning sensor (LISS III) images acquired in 2007 and 2011, i.e., before and after the earthquake (Siyahghalati et al. 2014).

The Himalayas, an active fold-thrust belt, are frequently hit by earthquakes. On September 18, 2011, at 06:10:48 PM (Indian Standard Time), an earthquake of magnitude \(M_{w} = 6.9\) hit the Nepal border, with its epicenter located at 27.723°N, 88.064°E and a focal depth of 19.7 km (USGS). The earthquake induced a large number of landslides in the region. In this paper, a semiautomatic approach for landslide detection from remote sensing images and digital terrain information is presented. The method classifies the pre- and post-landslide images using the generalized improved fuzzy Kohonen clustering network (GIFKCN) classifier. Because landslides result in loss of vegetation, areas that leave the vegetation class between the pre- and post-classified images are identified as landslide candidates. The candidate landslides are then validated using a rule set based on slope and aspect derived from DEM data. The proposed method is applied to detect the September 18, 2011, earthquake-induced landslides that occurred in Sikkim state. A pre-earthquake Landsat 5 image and a post-earthquake Advanced Land Imager (ALI) EO-1 image are used in this study. The terrain information is obtained from the ASTER Global Digital Elevation Model (GDEM) of the area. The results show that 94.8 % of the reference landslides are correctly detected.

2 Data sources

2.1 Satellite data

Aerial photographs provide detailed information about landslides, but they are rarely available because obtaining both pre- and post-earthquake photographs is difficult in most cases. Steerable sensors and an increasing number of operational satellites have led to satellite data increasingly replacing aerial photographs for landslide studies. Satellite images also cover a larger area and are cheaper than aerial photographs. Thus, a novel method for landslide detection using satellite images and DEM data is proposed here. A pre-earthquake Landsat 5 image acquired on August 27, 2011, and a post-earthquake EO-1 ALI image acquired on October 19, 2011, are used for the study (USGS), together with topographic data. The EO-1 ALI image is a Level 1GST product, which is terrain corrected, and the Landsat 5 image is a Level 1T product, which is precision and terrain corrected by incorporating ground control points while employing a DEM for topographic accuracy. A small subset of these images covering areas of Sikkim state such as Lachung, Lachen, Ligtham, Chungthang and Mangan, as shown in Fig. 1, is selected to demonstrate the proposed method.

Fig. 1 Geographical location of the study area. a False natural color image (red band 5, green band 4, blue band 3) acquired on August 27, 2011, and b false natural color image (red band 8, green band 6, blue band 4) acquired on October 19, 2011

2.2 DEM

The method also requires elevation information for the validation of landslide candidates. ASTER GDEM data provide this topographic information. The ASTER GDEM data cover land surfaces between 83°N and 83°S and comprise 22,702 tiles of 1° × 1°. The data are distributed as GeoTIFF files in geographic latitude and longitude coordinates. They are posted on a 1 arc-second (approximately 30 m at the equator) grid and referenced to the 1984 World Geodetic System (WGS84)/1996 Earth Gravitational Model (EGM96) geoid.

The ASTER GDEM is resampled to about 30-m resolution using the ERDAS Reproject module in order to facilitate the subsequent landslide spatial statistical analysis. Because this is close to the native grid spacing, the resampling is not expected to alter the topographic information of the original ASTER DEM data. In this study, ASTER GDEM data acquired on October 18, 2011, are used. The DEM data used are shown in Fig. 2.

Fig. 2 ASTER GDEM data of the study area acquired on October 18, 2011

3 Proposed methodology

The flowchart of the proposed method is shown in Fig. 3. It consists of the following steps:

Fig. 3 Flowchart of the proposed method

  1. Image preprocessing
  2. Computation of spectral indices
  3. Image classification using GIFKCN
  4. Landslide candidate detection
  5. DEM and its derivatives
  6. Creation of rule set for validation of landslides

These steps are discussed in detail in the following sections:

3.1 Image preprocessing

The input images are preprocessed to remove distortions and obtain accurate results. Two preprocessing steps are carried out on each input image: geometric correction and calculation of top-of-atmosphere (ToA) reflectance. The study area covers a mountainous region, and the two images were acquired under different sun illumination conditions, which produces a brightness difference between them. To compensate for this difference, the ToA reflectance of the images is computed.
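The text does not give the conversion it uses, so the following is only a minimal sketch of the commonly used ToA reflectance formula, \(\rho = \pi L d^{2} / (ESUN \cos \theta_{s})\), applied to at-sensor radiance. The function name, argument names and the sample numbers are illustrative assumptions; the band-specific solar irradiance, sun elevation and Earth-Sun distance would come from the image metadata.

```python
import math

import numpy as np


def toa_reflectance(radiance, esun, sun_elevation_deg, earth_sun_dist_au):
    """Convert at-sensor spectral radiance to ToA reflectance.

    radiance          : array of at-sensor radiance for one band
    esun              : mean exoatmospheric solar irradiance for that band
    sun_elevation_deg : sun elevation angle from the scene metadata
    earth_sun_dist_au : Earth-Sun distance in astronomical units
    (all names are hypothetical; they are not taken from the paper)
    """
    radiance = np.asarray(radiance, dtype=float)
    solar_zenith = math.radians(90.0 - sun_elevation_deg)
    return (math.pi * radiance * earth_sun_dist_au ** 2) / (
        esun * math.cos(solar_zenith)
    )


# Illustrative values only, not taken from the scenes used in the study.
band = np.array([[50.0, 80.0], [120.0, 20.0]])
rho = toa_reflectance(band, esun=1500.0, sun_elevation_deg=55.0,
                      earth_sun_dist_au=1.0)
```

Dividing by the cosine of the solar zenith angle is what compensates for the different illumination of the two acquisition dates.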

3.2 Computation of spectral indices

Spectral indices are used to highlight a particular type of land cover, such as NDVI for vegetation, the normalized difference built-up index (NDBI) for built-up areas and the normalized difference water index (NDWI) for water. In the proposed method, spectral indices are used for training the GIFKCN classifier. The study area is classified into four classes: vegetation, water, clouds and bare land. Thus, the spectral indices that highlight these land cover types are used for training. NDVI, NDWI and NDBI are derived from different wavelength bands. In NDVI, vegetated pixels have higher values than other features. Similarly, NDWI and NDBI highlight water and bare land areas, respectively. These index images are normalized to the range [0, 255] using Eq. (1) before determining the optimal threshold values,

$$I = \frac{SI - SI_{\min}}{SI_{\max} - SI_{\min}} \times 255$$
(1)

where I is the normalized image, SI is the input image, and \(SI_{\max}\) and \(SI_{\min}\) represent the maximum and minimum pixel values of the input image, respectively. In the pre-landslide image, vegetated areas have NDVI values >190, water has NDWI values >135, urban areas have NDBI values >192 and clouds have NDBI values in the range 158–192. In the post-landslide scene, vegetated areas have NDVI values >182, clouds have NDBI values in the range 144–185, water pixels have NDWI values >160, and urban areas and bare land have NDBI values >185.
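The index computation and the scaling of Eq. (1) can be sketched as follows. The paper does not state which band combinations it uses, so the usual definitions are assumed here: NDVI from near-infrared and red, NDWI from green and near-infrared, and NDBI from shortwave-infrared and near-infrared. The function names and the tiny sample arrays are hypothetical.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b), with 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def scale_to_8bit(si):
    """Eq. (1): stretch an index image linearly to the range [0, 255]."""
    si = np.asarray(si, dtype=float)
    si_min, si_max = si.min(), si.max()
    if si_max == si_min:          # constant image: nothing to stretch
        return np.zeros_like(si)
    return (si - si_min) / (si_max - si_min) * 255.0


# Illustrative reflectance values for a 1 x 3 image.
green = np.array([[0.10, 0.08, 0.12]])
red = np.array([[0.05, 0.20, 0.10]])
nir = np.array([[0.45, 0.25, 0.04]])
swir = np.array([[0.20, 0.35, 0.02]])

ndvi = scale_to_8bit(normalized_difference(nir, red))    # vegetation
ndwi = scale_to_8bit(normalized_difference(green, nir))  # water
ndbi = scale_to_8bit(normalized_difference(swir, nir))   # built-up / bare land
```

After scaling, the thresholds quoted in the text (for example NDVI >190 for pre-landslide vegetation) can be applied directly to the 8-bit images.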

3.3 Image classification using GIFKCN

The GIFKCN classifier is a neuro-fuzzy classifier that hybridizes the Kohonen clustering network (KCN) (Kohonen 1990) and the generalized improved fuzzy partition FCM (Zhu et al. 2009). Since post-classification comparison requires that each individual classification be highly accurate, the GIFKCN classifier (Singh et al. 2014) is used to classify both the pre- and post-landslide images into four classes: vegetation, cloud, water and bare land. The classified images are then used to identify candidate landslides. Because the images are classified into four classes, four cluster centers are computed. First, initialize the cluster centers \(z_{i} \left( {1 \le i \le c} \right)\), the threshold \(\varepsilon (\varepsilon > 0)\) and the topological neighborhood parameters. Set t = 1, the maximum iteration limit \(t_{\max}\) and m > 1. The fuzziness index \(m_{t}\) is updated by

$$m_{t} = m + \frac{t\left( m - 1 \right)}{t_{\max}}\quad \text{for}\quad 1 < t \le t_{\max}\quad \text{and}\quad m > 1$$
(2)

Calculate the fuzzy membership matrix \(u_{ik}\) and the learning rate \(\gamma_{ik,t}\) using Eqs. (3) and (5).

$$u_{ik} = \left( {\mathop \sum \limits_{l = 1}^{c} \left( {\frac{{\parallel z_{i} - I_{k}\parallel^{2} - \beta_{k} }}{{\parallel z_{l} - I_{k}\parallel^{2} - \beta_{k} }}} \right)^{{1/\left( {m - 1} \right)}} } \right)^{ - 1} \quad {\text{for}}\quad 1 \le i \le c \quad {\text{and}}\quad 1 \le k \le n$$
(3)

where

$$\beta_{k} = \alpha \cdot \min \left\{ \| z_{\tau} - I_{k} \|^{2} \;\middle|\; \tau \in \{1, \ldots, c\} \right\},\quad 0 \le \alpha < 1$$
(4)

and the membership matrix \(U = [u_{ik} ]\) represents a fuzzy c-partition matrix constrained by the probabilistic conditions \(0 \le u_{ik} \le 1\) and \(\mathop \sum\nolimits_{i = 1}^{c} u_{ik} = 1\ \forall k = 1, \ldots ,n.\)

The learning rate \(\gamma_{ik,t}\) of the ik-th neuron for the t-th iteration is given by Eq. (5).

$$\gamma_{ik,t } = \left( {u_{ik} } \right)^{{m_{t} }}$$
(5)

The weight of the output neuron is updated using Eq. (6).

$$z_{i,t} = z_{i,t - 1} + \mathop \sum \limits_{k = 1}^{n} \gamma _{ik,t} \left( {I_{k} - z_{i,t - 1} } \right)/\mathop \sum \limits_{s = 1}^{n} \gamma _{is,t}$$
(6)

The learning rate \(\gamma_{ik,t}\) is updated and t is incremented. The termination condition \(\| z_{i,t} - z_{i,t - 1} \| \le \varepsilon\) is then checked for the cluster centers. If the termination condition is not met, the algorithm continues iteratively. Otherwise, the final clustered image is formed by assigning each pixel \(I_{k}\) to the class i with the highest membership value \(u_{ik}\). GIFKCN is trained using the mean values of the spectral indices. The mean values of the spectral indices for the different classes are summarized in Table 1.
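The iteration described by Eqs. (2)–(6) can be sketched in a few lines of NumPy. This is an illustrative reading of the equations, not the authors' MATLAB implementation: the function and argument names are hypothetical, the topological neighborhood is not modeled, the loop is assumed to stop once the largest change in any center coordinate is at most ε, and a small floor is added to avoid division by zero when a pixel coincides with a center. The `fixed_m` switch mirrors the fixed \(m_{t}\) used in the experiments.

```python
import numpy as np


def memberships(X, z, m, alpha):
    """Eqs. (3) and (4): fuzzy memberships u[i, k] of sample k in cluster i."""
    d2 = ((z[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # (c, n)
    beta = alpha * d2.min(axis=0)                            # Eq. (4)
    g = np.maximum(d2 - beta, 1e-12)   # floor added for numerical safety
    # ratio[i, l, k] = g[i, k] / g[l, k]; fine for a sketch, memory-hungry
    # for full images.
    ratio = (g[:, None, :] / g[None, :, :]) ** (1.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=1)


def gifkcn(X, centers, m=2.0, alpha=0.0, eps=1e-4, t_max=100, fixed_m=True):
    """Cluster the rows of X (n samples, d features) around c centers."""
    X = np.asarray(X, dtype=float)
    z = np.array(centers, dtype=float)
    for t in range(1, t_max + 1):
        # Eq. (2) as printed, or a constant fuzziness index if fixed_m.
        m_t = m if fixed_m else m + t * (m - 1.0) / t_max
        u = memberships(X, z, m, alpha)
        gamma = u ** m_t                                     # Eq. (5)
        # Eq. (6): move each center by the gamma-weighted mean offset.
        step = (gamma[:, :, None] * (X[None, :, :] - z[:, None, :])).sum(axis=1)
        z_new = z + step / gamma.sum(axis=1, keepdims=True)
        shift = np.abs(z_new - z).max()
        z = z_new
        if shift <= eps:               # assumed stopping test
            break
    u = memberships(X, z, m, alpha)
    labels = u.argmax(axis=0)          # class with the highest membership
    return z, u, labels


# Toy 1-D example with two obvious groups.
X = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
z, u, labels = gifkcn(X, centers=[[1.0], [9.0]])
```

For the setting reported in the experiments, the call would use four initial centers (the class means of the spectral indices), `m=2.0`, `alpha=0.0` and `fixed_m=True`.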

Table 1 Mean values of the various spectral indices scaled in 8 bits

3.4 Landslide candidate detection

Bare rock or debris is exposed after a landslide event, giving a bright appearance to landslide-affected areas in an image. A commonly observed property of landslides is that they result in loss of vegetation. This property can be utilized as the first step in the identification of landslides. Thus, the pre- and post-classified images are compared to obtain the change information. All possible changes out of the vegetation class are identified. The vegetation-to-bare-land change represents loss of vegetation, and these areas are therefore identified as candidate landslides.
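A minimal sketch of this post-classification comparison is given below. The integer class codes and the function name are assumptions made for illustration; any coding works as long as both classified images share it.

```python
import numpy as np

# Hypothetical class codes shared by both classified images.
VEGETATION, WATER, CLOUD, BARE_LAND = 0, 1, 2, 3


def landslide_candidates(pre_classified, post_classified):
    """Return a boolean mask of pixels that changed from vegetation
    (pre-event) to bare land (post-event)."""
    pre = np.asarray(pre_classified)
    post = np.asarray(post_classified)
    return (pre == VEGETATION) & (post == BARE_LAND)


pre = np.array([[VEGETATION, VEGETATION],
                [WATER, BARE_LAND]])
post = np.array([[BARE_LAND, VEGETATION],
                 [WATER, BARE_LAND]])
mask = landslide_candidates(pre, post)
```

Only the top-left pixel is flagged here: the bare-land pixel that was already bare before the event is not a candidate, which is why the comparison needs both dates.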

3.5 Digital elevation model and its derivatives

Since landsliding is a geomorphic process, using DEMs as additional data during image analysis will yield better classification results in comparison with spectral data alone. The DEM data can be utilized to extract important information such as slope and aspect of an area.

3.5.1 Slope

The slope is expressed as the change in elevation over a certain distance; in this case, the distance is the size of the pixel. The resulting grayscale image shows flat areas as dark pixels, and pixel brightness increases as the terrain becomes steeper. The slope of the study area is derived from the DEM data using ERDAS™ IMAGINE software. The slope image is classified into four classes with slope values in the ranges 0°–15°, 15°–30°, 30°–45° and above 45°.
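The study derives slope in ERDAS IMAGINE; as a rough stand-in, the sketch below computes slope from finite differences with NumPy and bins it into the four classes named above. The function names are hypothetical, and the DEM is assumed to be in a projected grid whose cell size is in the same unit as the elevations (for example metres).

```python
import numpy as np


def slope_degrees(dem, cell_size):
    """Slope angle in degrees from elevation differences between cells."""
    dz_drow, dz_dcol = np.gradient(np.asarray(dem, dtype=float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_drow, dz_dcol)))


def slope_classes(slope_deg):
    """0: below 15 deg, 1: 15-30 deg, 2: 30-45 deg, 3: 45 deg and above."""
    return np.digitize(slope_deg, [15.0, 30.0, 45.0])


# Synthetic 5 x 5 DEM: a plane rising 20 degrees toward the east, 30-m cells.
cell = 30.0
dem = np.tile(np.arange(5) * cell * np.tan(np.radians(20.0)), (5, 1))
slope = slope_degrees(dem, cell)
classes = slope_classes(slope)
```

On this synthetic plane every cell has a slope of 20° and therefore falls in class 1 (15°–30°).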

3.5.2 Aspect

Another important derivative is surface aspect, which gives the direction of slope for a DEM file. Aspect uses a 3 × 3 pixel moving window centered on each pixel to calculate the prevailing direction of its neighbors. For a pixel (x, y), the average changes in elevation in both the x and y directions are calculated first. Then, the average slope is the average change in elevation in the y direction divided by the average change in elevation in the x direction. The aspect is the arc tangent of the average slope. The values represent a direction in degrees measured clockwise from north, ranging from 0 to 360, with the value 361 reserved for flat areas: 0–22.5 and 337.5–360 indicate a north-facing slope, 22.5–67.5 a northeast-facing slope, 67.5–112.5 an east-facing slope, 112.5–157.5 a southeast-facing slope, 157.5–202.5 a south-facing slope, 202.5–247.5 a southwest-facing slope, 247.5–292.5 a west-facing slope and 292.5–337.5 a northwest-facing slope, while 361 indicates areas that are perfectly flat (e.g., water bodies) with no aspect. The aspect image is classified into ten classes showing the different slope directions.
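A comparable aspect computation can be sketched with NumPy. It uses simple finite differences and a two-argument arc tangent rather than the exact 3 × 3 window of the software used in the study, so values may differ slightly. The function names are hypothetical, and the raster is assumed to be north-up (row 0 is the northern edge, columns increase eastward).

```python
import numpy as np


def aspect_degrees(dem, cell_size, flat_value=361.0):
    """Downslope direction in degrees clockwise from north; flat cells
    receive flat_value."""
    dz_drow, dz_dcol = np.gradient(np.asarray(dem, dtype=float), cell_size)
    # Rows increase southward, so the northward component of the downslope
    # vector is +dz_drow and the eastward component is -dz_dcol.
    aspect = np.degrees(np.arctan2(-dz_dcol, dz_drow)) % 360.0
    aspect[(dz_drow == 0) & (dz_dcol == 0)] = flat_value
    return aspect


def aspect_name(aspect_deg):
    """Map an aspect value to one of eight compass sectors or 'flat'."""
    if aspect_deg > 360.0:
        return "flat"
    names = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return names[int(((aspect_deg + 22.5) % 360.0) // 45.0)]


rows, cols = np.mgrid[0:4, 0:4].astype(float)
east_facing = aspect_degrees(-cols, 1.0)    # elevation drops toward the east
south_facing = aspect_degrees(-rows, 1.0)   # elevation drops toward the south
flat = aspect_degrees(np.zeros((4, 4)), 1.0)
```

The sector function merges the two north intervals (0–22.5 and 337.5–360) into a single label, whereas the classified aspect image described above keeps them as separate classes.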

3.6 Creation of rule set for validation of landslide

The candidate landslides contain a large number of false positives, because features such as roads, water bodies, barren rocky land, agricultural terraces, built-up areas and river beds are also identified as landslides. To remove these false positives, the elevation data and their derivatives, slope and aspect, are used. Slope angle is one of the key factors in inducing slope instability. Landslides generally occur on steep slopes, and based upon the landslide distribution, south- and east-facing slopes (i.e., slopes with aspect values in the range 67.5–202.5) were considered to have more potential for landslides (Li et al. 2013). Slope values <15° correspond to built-up areas and water bodies. Therefore, all candidate landslides with mean slope values >15° and aspect values in the range 67.5–202.5 are validated as true landslides, while the others are removed. So, the following rule set is created.

figure a
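The rule stated in the preceding paragraph can be sketched as a per-object filter. This is an illustration under assumptions, not the original rule set: the candidates are assumed to be supplied as a label image (0 for background, a positive integer per candidate object), the aspect of an object is summarized by its circular mean, and flat pixels coded 361 are assumed to have been masked out beforehand. All names are hypothetical.

```python
import numpy as np


def validate_candidates(labels, slope_deg, aspect_deg,
                        min_slope=15.0, aspect_range=(67.5, 202.5)):
    """Keep candidate objects whose mean slope exceeds min_slope and whose
    mean aspect lies inside aspect_range; return a boolean pixel mask."""
    ids = np.asarray(labels).ravel()
    n = int(ids.max()) + 1
    counts = np.maximum(np.bincount(ids, minlength=n), 1)

    mean_slope = np.bincount(ids, weights=np.ravel(slope_deg), minlength=n) / counts

    # Circular mean of aspect, so that e.g. 350 and 10 average to 0, not 180.
    rad = np.radians(np.ravel(aspect_deg))
    s = np.bincount(ids, weights=np.sin(rad), minlength=n)
    c = np.bincount(ids, weights=np.cos(rad), minlength=n)
    mean_aspect = np.degrees(np.arctan2(s, c)) % 360.0

    lo, hi = aspect_range
    keep = (mean_slope > min_slope) & (mean_aspect >= lo) & (mean_aspect <= hi)
    keep[0] = False                     # label 0 is background
    return np.isin(labels, np.flatnonzero(keep))


labels = np.array([[1, 1, 2, 2],
                   [0, 0, 3, 3]])
slope = np.array([[30.0, 30.0, 10.0, 10.0],
                  [0.0, 0.0, 30.0, 30.0]])
aspect = np.array([[90.0, 90.0, 90.0, 90.0],
                   [90.0, 90.0, 300.0, 300.0]])
validated = validate_candidates(labels, slope, aspect)
```

In this toy input only object 1 survives: object 2 is too gentle (mean slope 10°) and object 3 faces northwest (aspect 300°).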

4 Experimental results

The method was implemented in MATLAB R2013a and ERDAS software. The spectral indices were derived using ERDAS software, and the results of applying the various spectral indices are shown in Fig. 4. The spectral indices were used as training input for the GIFKCN classifier. The parameters used in the experiment are \(c = 4\), fixed \(m_{t} = 2\) and \(\alpha = 0\). The pre- and post-classified images are shown in Fig. 5a, b. The change map showing the change in the vegetation class is shown in Fig. 5c. The slope and aspect were derived from the DEM data, and the classified slope and aspect images are shown in Fig. 6. The final landslides are detected by applying the rule set to the candidate landslides. The final landslide detection results are shown in Fig. 8.

Fig. 4 Response of various spectral indices on a pre-landslide image and b post-landslide image

Fig. 5 a Classified pre-earthquake image, b classified post-earthquake image, c classified change map

Fig. 6 a Classified aspect image, b classified slope image

5 Accuracy assessment

The sampling strategy for collecting reference data is an important step in the accuracy assessment of a classification. Some analysts continue to perform error evaluation based only on the training pixels used to train or seed the classification algorithm. However, the locations of the training sites are not random and are biased by the analyst’s a priori knowledge of where certain land use/land cover types exist in the scene. Purely random sampling is also impractical because it tends to under-sample the smaller categories. For these reasons, stratified random sampling is usually used so that the sampling points are fairly spread across the classes (Congalton and Green 2008). In this paper, a total of 256 reference points were chosen using stratified random sampling. The error matrices of the pre- and post-landslide classified images are given in Tables 2 and 3, respectively, and accuracy measures such as overall accuracy, producer’s accuracy, user’s accuracy and kappa coefficient are computed from them. The qualitative analysis of the result is done by overlaying the identified landslides on the original image. Figure 7 shows the landslide inventory map prepared by the National Remote Sensing Centre. The detected landslides (Fig. 8) are validated by comparing them with this inventory map (Fig. 7).
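The accuracy measures named above follow from an error matrix in a standard way, sketched below. The matrix in the example is made up for illustration and is not taken from Tables 2 or 3; rows are assumed to hold the classified labels and columns the reference labels, and the function name is hypothetical.

```python
import numpy as np


def accuracy_measures(error_matrix):
    """Overall, user's and producer's accuracy plus Cohen's kappa.

    Rows: classified (map) classes; columns: reference classes.
    """
    cm = np.asarray(error_matrix, dtype=float)
    n = cm.sum()
    diag = np.diag(cm)
    row_totals = cm.sum(axis=1)
    col_totals = cm.sum(axis=0)

    overall = diag.sum() / n
    users = diag / row_totals          # 1 - commission error per class
    producers = diag / col_totals      # 1 - omission error per class
    chance = (row_totals * col_totals).sum() / n ** 2
    kappa = (overall - chance) / (1.0 - chance)
    return overall, users, producers, kappa


# Hypothetical two-class error matrix with 100 reference points.
overall, users, producers, kappa = accuracy_measures([[45, 5],
                                                      [5, 45]])
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for the same matrix.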

Table 2 Error matrix for pre-landslide image
Table 3 Error matrix for post-landslide image
Fig. 7 Distribution of co-seismically generated landslides within an area of 2000 sq. km in Sikkim from satellite data. Source: NRSC, http://bhuvan.nrsc.gov.in/bhuvan/PDF/sikkim_earthquake.pdf

Fig. 8 Landslide identification result

The overall accuracy is 96.10 and 96.48 % for the pre- and post-landslide images, respectively. The value of the kappa coefficient is 0.9254 for the pre-landslide image and 0.9363 for the post-landslide image. The high overall accuracy and kappa coefficient values show that the classification results are satisfactory. The accuracy in terms of the number of landslides detected is also computed. A total of 274 landslides were manually detected by visual interpretation of high-resolution imagery. The proposed method correctly identified 260 of these landslides, while 14 landslides remained undetected and 8 landslides were wrongly identified. Based on these data, the accuracy of the method is 94.8 %, the omission error is 5.10 % and the commission error is 2.91 %. The largest landslide identified is 0.69 km², and the smallest landslide identified is 0.04 km².
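The detection percentages can be retraced from the counts above. The sketch below assumes that all three rates are taken relative to the 274 reference landslides, since that choice appears to reproduce the reported figures; the function name is hypothetical.

```python
def detection_rates(n_reference, n_correct, n_missed, n_false):
    """Percentages relative to the number of reference landslides."""
    return {
        "accuracy": 100.0 * n_correct / n_reference,
        "omission": 100.0 * n_missed / n_reference,
        "commission": 100.0 * n_false / n_reference,
    }


rates = detection_rates(n_reference=274, n_correct=260, n_missed=14, n_false=8)
for name, value in rates.items():
    print(f"{name}: {value:.2f} %")
```

The rates come out at about 94.89, 5.11 and 2.92 %, which agree with the reported 94.8, 5.10 and 2.91 % if the reported values were truncated rather than rounded.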

6 Conclusion and discussion

In this paper, a semiautomatic method for landslide detection using satellite images and terrain data is presented. The pre- and post-landslide images are classified into four land cover classes using the GIFKCN classifier. The classifier is trained on the spectral indices NDVI, NDBI and NDWI. The change in the vegetation class is used to identify the candidate landslides. The candidate landslides are validated using a rule set based on slope and aspect values. The following advantages of GIFKCN have been observed.

  (a) Sequential data feeds: GIFKCN updates the centers after each training epoch. Thus, GIFKCN works in parallel and is independent of the feeding sequence.

  (b) Complexity: GIFKCN is less complex than KCN, and due to its parallel nature, it has a complexity of O(t*).

  (c) Termination: KCN always iterates to its maximum iteration number, whereas GIFKCN, owing to its stopping criterion, stops once the optimal result is obtained, which makes it faster.

The method is applied to bitemporal images of Sikkim, India, to identify the landslides induced by the September 18, 2011, earthquake. The accuracy assessment in terms of the number of landslides identified shows that 94.8 % of the landslides are correctly identified, with omission and commission errors of 5.10 and 2.91 %, respectively. The largest landslide identified is 0.69 km² in size, and the smallest is 0.04 km².