1 Introduction

Very severe cyclonic storm Bilis struck the southeast coast of mainland China on July 14, 2006. The torrential rainfall accompanying the typhoon triggered widespread floods, landslides, and debris flows, which severely damaged the village of Zixing in Hunan Province, as shown in Fig. 1 (Xu et al. 2011). More than two thousand debris flows and shallow landslides were induced by this heavy rainfall event. This catastrophic event damaged over 31,000 houses and led to over 345 fatalities, and about 89 people were reported missing in Hunan Province. The unprecedented flood was estimated to have a 100-year return period. In all, this typhoon was responsible for 654 deaths, 208 missing persons, and over USD 2.5 billion in damage in southeastern China (Xinhua, July 17, 2006).

Fig. 1

Trajectory of the Typhoon Bilis in 2006: a red rectangle represents a severely damaged location in the study area (created using ArcGIS 10.4 software—data source from Japan National Institute of Informatics)

In the wake of increasing landslide activity and associated hazards under a changing global climate, it is necessary to investigate landslide characteristics and assess landslide-prone areas to mitigate the associated damage.

Landslides are typical of mountainous terrain and endanger human life and habitat. Over the last few decades, the frequency of landslide occurrence has been observed to increase worldwide (Petley 2012; Dou et al. 2015a, c; Zhu et al. 2017). Many mass movements have been induced by typhoon rainfall, causing substantial loss of life and damage around the world (Jebur et al. 2014; Wang et al. 2015; Dou et al. 2017). For instance, Super Typhoon Haiyan devastated the Leyte region of the Philippines in 2013, with damage amounting to more than USD 2 billion (Rabonza et al. 2016). Heavy rainfall during Typhoon Wipha on October 16, 2013, on Izu Oshima Island, Japan, about 100 km south of Tokyo, triggered many landslides, caused at least 35 deaths, and left nearly 50 people missing (Ministry of Land Infrastructure and Transport and Japan-MLIT, 2013). According to MLIT, the torrential rainfall that hit Hiroshima city in August 2014 triggered 166 slope failures, caused 74 deaths, and damaged 429 houses (MLIT 2014; Wang et al. 2015). During June 15–17, 2013, a cloudburst in the state of Uttarakhand, India, triggered numerous landslides, killed 6074 people, and caused widespread damage to cultural properties (Martha et al. 2014). The United States Geological Survey reports that an average of 25–50 people are killed by landslides each year in the USA (USGS 2019). In Italy, more than 7500 square miles of land are identified as high-risk zones for landslides (Parsons and Lister 2019). China, one of the largest countries with diverse topography, is no exception. In fact, China has suffered from the most serious landslides in the past century, which cost many human lives and caused economic destruction (Petley 2012). Historical records show that more than ninety thousand landslide-related hazards have been recorded in several regions of China (Huang 2007).
The southwestern part of the country, which is close to the China Sea, is at high risk of landslides owing to increased typhoon activity. Li et al. (2017) pointed out that rainfall-induced landslides in China have risen to over 90% of the total number of landslide events compared with previous decades. Understanding these hazardous landslides and debris flows induced by heavy rainfall events has become an important and urgent issue from the viewpoint of emergency activities (Dou et al. 2015c; Wang et al. 2015).

Landslide spatial distribution in any region is influenced by physical rules that can be analyzed with empirical, statistical, or deterministic approaches (Reichenbach et al. 2018). Numerous models have been successfully applied to landslide susceptibility mapping worldwide (Youssef and Pradhan 2014; Chen et al. 2016; Camilo et al. 2017; Chen et al. 2018; Dou et al. 2018; Pham et al. 2018). In the early days, landslide susceptibility mapping was carried out using qualitative approaches (knowledge-driven methods). The pioneering works on data-driven methods and physically based models date to the late 1970s and early 1980s (Neuland 1976; Carrara 1983). Compared with knowledge-driven methods, the latter minimize subjectivity and attain reproducibility (Bui et al. 2011; Zêzere et al. 2017). The data-driven methods most widely used in susceptibility mapping are bivariate statistical analysis (BSA), binary logistic regression (BLR), artificial neural networks (ANN), and support vector machines (Bui et al. 2012; Arnone et al. 2014; Dou et al. 2015b; Arnone et al. 2016; Pham et al. 2019). All of these techniques rely on a few assumptions (Rabonza et al. 2016; Dou et al. 2019a, b). One of the basic assumptions is that “past is the key to future.” Therefore, bivariate statistical methods estimate landslide probabilities from the relationship between historical landslide events and geo-environmental conditions inferred from heuristic investigations. The accuracy of such statistical techniques depends on the completeness of the landslide inventory used to prepare the model. Landslide inventories can consist of either points (centroids of the landslide area or rupture zone) or polygons (Dou et al. 2014; Pham et al. 2018). Nowadays, with high-resolution imagery available, the polygon type is preferred. To obtain the likelihood ratio, the landslide density over the study area has to be established.

According to the literature, BSA and BLR are the most frequently used methods for assessing the likelihood of landslide occurrence at regional scales (Shahabi et al. 2014). Reichenbach et al. (2018) reviewed eighteen different landslide susceptibility models published over the last three decades and reported that logistic regression topped the chart, accounting for 18.5% of all occurrences. The merit of BLR over other multivariate analysis methods is that it is independent of the data distribution and can handle a variety of data types, such as continuous, categorical, and binary data (Bui et al. 2011). However, the BLR model has little to no predictive value if irrelevant independent parameters are involved. Because of such constraints, predicting landslide susceptibility needs a distributed model that ascertains all the relevant independent aspects of the method used. Effective landslide susceptibility mapping (LSM), therefore, requires optimal predisposing factors as input to the models. In LSM studies, selecting the landslide-predisposing factors and their classes is a key point. However, most scholars have selected the predisposing factors, including geological, anthropogenic, geomorphological, and hydrological factors, arbitrarily and subjectively. There is no standard rule for selecting predisposing factors. Hence, we address this issue by applying the certainty factor (CF) model to the landslide factors. CF is a rule-based expert system method for handling certain classes of problems.

Understanding the mechanisms of rainfall-triggered landslides over a reservoir watershed is useful for geological disaster and warning systems. Several researchers have studied the impacts of tropical cyclones on hydrological processes in reservoir watersheds (Xu et al. 2011; Zou et al. 2013); however, to our knowledge, few studies have paid attention to the characteristics of landslides triggered by typhoon rainfall and to the assessment of landslide susceptibility in this study region. This study, therefore, focuses on: (1) characterizing the landslides triggered by the extremely heavy rainfall event in the Dongjiang Reservoir Watershed, Hunan Province, China; (2) constructing the event-based landslide inventory map using multiple high-resolution satellite images; (3) optimizing the landslide-predisposing factors using the CF model; and (4) comparing the LSM maps produced by ensemble models and validating the models.

2 Study area

The study area, the Dongjiang Reservoir, is situated in the southeast of Hunan Province, China, and is vulnerable to heavy rainfall during the tropical cyclone season (Fig. 2). The elevation of the study area varies between 78 m a.s.l. and 1868 m a.s.l., with an average of 540 m. Three distinct geomorphological units make up the study area: hills and valleys, hilly plains, and the Luoxiao Mountains near the eastern and southern borders. Geologically, the area is composed mostly of Paleozoic sedimentary and metamorphic rocks (sandstone, limestone, and slate), which were intruded by granitic rocks in places. The granitic rocks are severely weathered and thus susceptible to failure. The weathered soils are mostly composed of highly oxidized laterite, which is prone to erosion. Land use/cover in the study area is characterized by small-scale agro-industrial activities such as plantations and paddy farming, as well as settlements. The study area falls within the humid subtropical monsoon climate region. The mean annual precipitation is about 1932 mm (1953–2004), 80% of which occurs during the rainy months of March to August, typically influenced by cyclones. Each year, numerous cyclones hit the province and cause severe damage to life and property in the region. The most recent is tropical cyclone Mangkhut, which killed 2 people on September 16, 2018. Months before Mangkhut made landfall, Typhoon Ewiniar brought torrential downpours, with over 250 mm of rain recorded in 24 h on June 8–9, 2018. Wang et al. (2008) studied the extreme precipitation patterns in the Dongjiang River Basin using statistical parameters and noted significant changes in several annual extreme flood flow and monthly precipitation processes in the region.

Fig. 2

Dongjiang Reservoir Watershed study area. a Location map of China. b Map of the study area with rain gauge distribution. c Distribution of shallow landslides on the elevation map derived from a 30-m DEM. d Enlarged area (lower map) showing the landslide boundaries

The Dongjiang Reservoir is the largest reservoir in the south of Hunan Province; it covers a water area of 160 km² and has a capacity of 8.12 × 10⁹ m³. Owing to the intense rainfall brought by Typhoon Bilis in 2006, thousands of sediment-related disasters, including numerous slope failures (shallow landslides) and debris flows, occurred and were identified from high-resolution 0.6-m QuickBird images, China–Brazil Earth Resources Satellite (CBERS) images (20 m), and field surveys (Fig. 3). The torrential rainfall event associated with Typhoon Bilis caused 246 deaths, left 95 people missing, and resulted in more than 300 million US dollars of economic loss in and around Zixing City alone. Damage to buildings destroyed or buried by debris flows was serious. Flash floods also inundated the short and steep rivers in the hilly areas.

Fig. 3

Landslides induced by the rainfall of Typhoon Bilis. a Examples of shallow landslides (white arrows) and debris flow associated with boulder deposition; b seriously damaged houses; c substantial material deposited by rainfall-triggered debris flows, including boulders of various sizes; d destroyed crops

3 Data source

Rainfall data from the local records of 21 rain gauges in and around the Dongjiang Reservoir area were used to analyze the rainfall characteristics of the major rainstorm. Typhoon Bilis was a strong tropical storm with severe precipitation over a short duration; its track is shown in Fig. 1. It made landfall on the coast of Fujian Province, China, on July 14, 2006, with a maximum wind speed of 108 km/h. It then weakened into a tropical storm and moved westward and north-westward at a speed of 10–15 km/h until July 16, 2006, when it dissipated over Hunan Province.

The rainfall observations from the rain gauge network around the reservoir on July 14–15 are displayed in Fig. 4. The Longxi rain gauge recorded the maximum rainfall, with a total 36-h rainfall of 507 mm and a total monthly rainfall of 826 mm. Data from one rain gauge, Xingning, are plotted in Fig. 5. Within 48 h, the cumulative rainfall at Xingning exceeded 400 mm. The incremental rainfall at Xingning at 15–18 UTC was approximately 180 mm. More than 1600 landslides occurred when the accumulative rainfall reached 340 mm. Figure 6 shows the rainfall contour diagram of the Dongjiang Reservoir area over 36 h on July 14–16. The reservoir watershed received a total rainfall volume of around 6.6 × 10⁸ m³, leading to a reservoir depth increase of 4.66 m. The reservoir was severely affected by the heavy rainfall within a short time.

Fig. 4

Graphs showing rainfall around Dongjiang Reservoir in the month of July 2006 (top) and 36 h of rainfall between July 14 and 16, 2006 (bottom). The water level of the Dongjiang Reservoir increased to about 7.73 m during July 14–19, 2006

Fig. 5

Hourly and cumulative rainfall recorded by rain gauge in Xingning around the Dongjiang Reservoir

Fig. 6

Rainfall contour diagram of the Dongjiang Reservoir area over 36 h on July 14–16, 2006. The reservoir area received a total rainfall volume of around 6.6 × 10⁸ m³, leading to a reservoir depth increase of 4.66 m

The landslide inventory map was constructed by combining the interpretation of satellite images taken before and after the event (0.6-m QuickBird and 20-m CBERS), as listed in Table 1, with fieldwork. To identify the landslides triggered during the Bilis event, we first interpreted and mapped the landslides visible in the pre-event CBERS imagery in a GIS environment. Next, post-event CBERS imagery was interpreted to map all the landslides in the study area that were triggered before and after the event. Finally, high-resolution QuickBird images of December 2007 were interpreted and mapped to delineate the boundaries of the landslide polygons accurately. Then, using the erase function of the analysis toolbox in ArcGIS, the landslide polygons of the Bilis event were extracted from the entire database, assuming that no further landslides occurred after the Bilis event until October 2006. This assumption is based on the fact that no major typhoons were reported in the study area during this period. In this way, we built the entire landslide inventory database from 2000 to 2009 as well as an event-based landslide atlas. A total of 2207 landslide polygons were mapped for the Bilis event from the interpretation of satellite imagery, as shown in Fig. 7. The polygon data were then converted to landslide points. As more than 50% of the landslides in the study area are smaller than 10,000 m², the centroid technique was applied to transform landslide polygons into points. Although many studies have pointed out the lower accuracy of LSM when using points rather than landslide polygons (Simon et al. 2013), several other studies favor centroid points for fast, easy-to-use, and automated LSM (Bui et al. 2012; Chen et al. 2016; Pham et al. 2018). The landslides were mostly located around the upper catchment of the Dongjiang Reservoir, corresponding to a zonal distribution.
Field observations reveal that the landslides are shallow landslides. The landslide density is approximately 8.2/km². Topographic data for analyses such as slope, aspect, and curvature were derived from the 30-m ASTER GDEM (version 2). In this case study, based on the analysis of the landslide inventory map and the availability of data, a total of 12 landslide-predisposing factors were prepared, namely elevation, slope angle, slope aspect, curvature, plan curvature, profile curvature, drainage density, distance to drainage network, stream power index (SPI), compound topographic index (CTI), 36-h cumulative rainfall, and lithology.
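The polygon-to-point conversion described above can be sketched in a few lines. This is a minimal illustration only, assuming the geometric (area-weighted) centroid of a simple, non-self-intersecting polygon in projected coordinates; the function names `polygon_area` and `polygon_centroid` are hypothetical, and GIS software may offer other conversion options, such as forcing the point to lie inside the polygon.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Unsigned area of a simple polygon (shoelace formula)."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def polygon_centroid(vertices: List[Point]) -> Point:
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0.0:
        raise ValueError("degenerate polygon with zero area")
    # 6 * signed area == 3 * twice_area
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical landslide outline in metres (not taken from the inventory).
outline = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
print(polygon_area(outline), polygon_centroid(outline))
```

For strongly concave or elongated landslide outlines the geometric centroid can fall outside the polygon, which is one reason point-based inventories may be less accurate than polygon-based ones.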

Table 1 Collected pre- and post-images in the study area
Fig. 7

Examples of the construction of the landslide inventory based on satellite images and field survey. Landslides mainly occurred in four types of land use: a near roads, b in plantation areas, c near the reservoir, and d on slope surfaces in the hilly area

4 Methodology

4.1 CF model for selecting predisposing factors

The certainty factor (CF) model is a rule-based expert system developed by Shortliffe and Buchanan (1975) for managing uncertainty in computational fields. Compared with other models, CF can provide probable favorability functions for incorporating heterogeneous data (Chung and Fabbri 1993). The CF weight can be computed as follows:

$${\text{CF}} = \left\{ {\begin{array}{*{20}l} {\frac{{P_{a} - P_{s} }}{{P_{a} \left( {1 - P_{s} } \right)}}} \hfill & {{\text{if}}\, P_{a} \ge P_{s} } \hfill \\ {\frac{{P_{a} - P_{s} }}{{P_{s} \left( {1 - P_{a} } \right)}}} \hfill & {{\text{if}}\, P_{a} < P_{s} } \hfill \\ \end{array} } \right.$$
(1)

Here \(P_{a}\) is the conditional likelihood of landslides in class \(a\) and \(P_{s}\) is the prior likelihood of the total number of landslides in the study area. The CF values vary between −1 and 1 and indicate a measure of belief in the outcome (Lucas 2001). A positive CF value indicates decreasing uncertainty, whereas a negative value indicates increasing uncertainty of landslide occurrence. If the CF value is close to 0, no information on the certainty is indicated. Once the CF values for the classes of the predisposing factors are obtained, these factors are integrated pairwise using the combination rule (Binaghi et al. 1998) as follows:

$$Z = \left\{ {\begin{array}{*{20}l} {{\text{CF}}1 + {\text{CF}}2 - {\text{CF}}1{\text{CF}}2} \hfill & {{\text{CF}}1, {\text{CF}}2 \ge 0 } \hfill \\ {{\text{CF}}1 + {\text{CF}}2 + {\text{CF}}1{\text{CF}}2} \hfill & {{\text{CF}}1,{\text{CF}}2 < 0 } \hfill \\ {\frac{{{\text{CF}}1 + {\text{CF}}2}}{{1 - \hbox{min} \left( {\left| {{\text{CF}}1} \right|, \left| {{\text{CF}}2} \right|} \right)}}} \hfill & {{\text{CF}}1, {\text{CF}}2, {\text{opposite signs}}} \hfill \\ \end{array} } \right.$$
(2)

where CF1 is a value in class 1, and CF2 is a value in class 2.

The pairwise combination is performed until all the CF layers are brought together, and the predisposing factors are optimized by computing the Z values. Factors with positive Z values are favored because they have high correlations with landslide occurrence. Based on the range of CF values, predisposing factor weights were acquired. The weights are assessed as the sum of the ratio relative to those predisposing factors that provide a measurement of certainty in predicting landslides (Binaghi et al. 1998). According to the computed results, CF weights are then classified into six classes as shown in Table 2 (Binaghi et al. 1998).
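Equations 1 and 2 and the repeated pairwise combination can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the input likelihoods are made-up numbers rather than values from the study, and the rejection of the degenerate pair (+1, −1), for which the third case of Eq. 2 divides by zero, is a choice made here.

```python
from functools import reduce
from typing import Iterable


def certainty_factor(p_a: float, p_s: float) -> float:
    """Eq. 1: CF of a class from its conditional (p_a) and prior (p_s) likelihood."""
    if not (0.0 < p_s < 1.0 and 0.0 <= p_a <= 1.0):
        raise ValueError("expected 0 < p_s < 1 and 0 <= p_a <= 1")
    if p_a >= p_s:
        return (p_a - p_s) / (p_a * (1.0 - p_s))
    return (p_a - p_s) / (p_s * (1.0 - p_a))


def combine(cf1: float, cf2: float) -> float:
    """Eq. 2: pairwise combination rule for two CF values."""
    if cf1 >= 0.0 and cf2 >= 0.0:
        return cf1 + cf2 - cf1 * cf2
    if cf1 < 0.0 and cf2 < 0.0:
        return cf1 + cf2 + cf1 * cf2
    denominator = 1.0 - min(abs(cf1), abs(cf2))
    if denominator == 0.0:
        # Assumption of this sketch: treat the (+1, -1) pair as an input error.
        raise ValueError("combination undefined for CF values of +1 and -1")
    return (cf1 + cf2) / denominator


def combine_all(cf_values: Iterable[float]) -> float:
    """Fold the pairwise rule over several CF values to obtain Z."""
    return reduce(combine, cf_values)


# Made-up likelihoods for illustration only.
cf_class = certainty_factor(p_a=0.2, p_s=0.1)   # class denser than the prior -> CF > 0
z = combine_all([0.5, 0.5, -0.25])
print(round(cf_class, 4), round(z, 4))
```

A factor would then be retained when its combined Z value is positive, which is the selection rule stated above.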

Table 2 Weight classification based on the range of CF values

4.2 Bivariate statistical analysis

Van Westen et al. (1997) proposed the bivariate statistical analysis (BSA) method, which is based on assessing the relationship between a landslide inventory map and the predisposing factors. In the BSA method, the weight for each class of the landslide-predisposing factors is determined first. Landslide susceptibility indexes are then computed by summing up the weights. The weight (Wi) of each class i is defined as the natural logarithm of the landslide density in the class over the landslide density in the predisposing factor map, as follows (van Westen et al. 1997):

$$W_{i} = \ln \left( {\frac{{{\text{Density\_landslide}}}}{{{\text{Density\_area}}}}} \right) = \ln \left( {\frac{{N_{i,j} / A_{i,j} }}{{N_{l} / A_{T} }}} \right)$$
(3)

where Wi is the weight given to a class of a certain thematic layer (e.g., limestone in the thematic layer lithology); \({\text{Density\_landslide}}\) is the landslide density within that class; \({\text{Density\_area}}\) is the landslide density over the entire factor map, i.e., over all classes; \(N_{i,j}\) is the number of landslide pixels in class j of predisposing factor i; \(A_{i,j}\) is the total area (in pixels) of class j of predisposing factor i; \(N_{l}\) is the total number of landslide pixels; and \(A_{T}\) is the total number of pixels in the entire study area.

Finally, the LSM by BSA model was generated by the following equation:

$${\text{LSM}}_{{W_{i} }} = W_{1} + W_{2} + W_{3} + \cdots + W_{i}$$
(4)
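Equations 3 and 4 can be sketched as below. This is a minimal illustration, not the processing chain used in the study: the function names, the pixel counts, and the lookup-table layout are hypothetical, and raising an error for a class without landslides (where the logarithm is undefined) is an assumption of this sketch.

```python
import math
from typing import Dict


def bsa_weight(n_class: int, a_class: int, n_total: int, a_total: int) -> float:
    """Eq. 3: ln of the class landslide density over the map-wide landslide density."""
    if min(n_class, a_class, n_total, a_total) <= 0:
        # Assumption: classes without landslide pixels need separate handling.
        raise ValueError("all counts must be positive for the logarithm to exist")
    class_density = n_class / a_class
    map_density = n_total / a_total
    return math.log(class_density / map_density)


def susceptibility_index(
    weights: Dict[str, Dict[str, float]], pixel_classes: Dict[str, str]
) -> float:
    """Eq. 4: sum the class weights of every factor for one pixel."""
    return sum(weights[factor][cls] for factor, cls in pixel_classes.items())


# Hypothetical pixel counts: 50 landslide pixels in a class of 1000 pixels,
# 100 landslide pixels in a map of 10000 pixels.
w = bsa_weight(n_class=50, a_class=1000, n_total=100, a_total=10000)
print(round(w, 4))  # ln(5)

# Hypothetical weight tables for two factors and one pixel.
weights = {"slope": {"10-15": 0.4, "0-5": -1.2}, "lithology": {"granite": 0.7}}
print(susceptibility_index(weights, {"slope": "10-15", "lithology": "granite"}))
```

A positive weight means the class has a higher landslide density than the map as a whole, and a negative weight means a lower one.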

4.3 Binary logistic regression

Binary logistic regression (BLR) has been one of the best-known multivariate analytical methods in LSM assessment during the last decade (Chauhan et al. 2010; Dou et al. 2018). The BLR method is suitable for forecasting the presence or absence of a characteristic outcome from a set of parameters (Devkota et al. 2013). Here, we do not use ordinary least squares (OLS) regression because of three problems: (1) the error terms are heteroskedastic; (2) the error terms are not normally distributed; (3) the predicted probabilities can be larger than 1 or less than 0. In this study, the purpose of BLR is thus to model the relationship between a dependent variable and multiple independent parameters (Bui et al. 2011). The advantage of BLR is that it does not require normally distributed data. In addition, both continuous and discrete data can be used as input for the BLR model.

The dependent parameter (Y) in the BLR method is expressed as a probability and can be calculated as follows (Lee and Pradhan 2006):

$$Y = \frac{1}{{1 + e^{ - z} }}$$
(5)

where \(Y\) is the estimated likelihood of landslide occurrence, which ranges over [0, 1], and \(z\) is the weighted linear combination of the independent parameters.

To linearize the stated model as well as eliminate the 0/1 boundaries for the dependent parameter, the estimated \(Y\) is transformed by the following equation:

$$Y^{\prime} = { \ln }\left( {\frac{Y}{1 - Y}} \right)$$
(6)

This modification is referred to as the logit transformation. Theoretically, the logit transformation of binary data ensures that the dependent parameter is continuous and unbounded. Additionally, it ensures that the likelihood surface is continuous within [0, 1]. By means of the logit transformation, the standard linear regression model can be written as follows:

$$Y^{\prime} = \ln \left( {\frac{Y}{1 - Y}} \right) = \beta_{1} x_{1} + \beta_{2} x_{2} + \cdots + \beta_{n} x_{n} + \alpha$$
(7)

where \(\alpha\) is the intercept of the equation and \(\beta_{1}, \beta_{2}, \ldots, \beta_{n}\) denote the slope coefficients of the independent parameters \(x_{1}, x_{2}, \ldots, x_{n}\). With landslide or non-landslide as the dependent variable, the fitted equation is meaningful at the 0.01% error level.
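The relationship between Eqs. 5, 6, and 7 can be checked numerically with a short sketch. The function names and the coefficient values are hypothetical and serve only to show that the logit transformation recovers the linear predictor; no model fitting is performed here.

```python
import math
from typing import Sequence


def linear_predictor(alpha: float, betas: Sequence[float], xs: Sequence[float]) -> float:
    """Right-hand side of Eq. 7: alpha + beta_1 * x_1 + ... + beta_n * x_n."""
    if len(betas) != len(xs):
        raise ValueError("betas and xs must have the same length")
    return alpha + sum(b * x for b, x in zip(betas, xs))


def logistic(z: float) -> float:
    """Eq. 5: Y = 1 / (1 + e^(-z)), evaluated in a numerically stable way."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow of exp(-z) for very negative z
    return ez / (1.0 + ez)


def logit(y: float) -> float:
    """Eq. 6: Y' = ln(Y / (1 - Y)), defined for 0 < Y < 1."""
    if not 0.0 < y < 1.0:
        raise ValueError("logit is only defined for 0 < y < 1")
    return math.log(y / (1.0 - y))


# Hypothetical intercept, coefficients, and factor values for one pixel.
z = linear_predictor(alpha=-1.0, betas=[0.5, 2.0], xs=[2.0, 0.25])
y = logistic(z)
print(z, y, logit(y))  # logit(y) returns z up to rounding error
```

Because the logit is the inverse of the logistic function, fitting a linear model on the transformed scale keeps every predicted probability inside the interval (0, 1), which addresses the third OLS problem listed above.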

5 Results

5.1 Characteristics of landslides triggered by the Typhoon Bilis

To investigate the contribution of the landslide-predisposing factors to landslide initiation, the landslides that occurred in the study area were related to those factors. These predisposing factors include elevation, slope angle, slope aspect, curvature, plan curvature, profile curvature, drainage density, distance to drainage network, SPI, CTI, cumulative rainfall, and lithology. Figure 8 shows the results of the landslide frequency analysis, which examines the relationships between landslide occurrence and the predisposing factors. The relationship of landslide frequency with elevation is shown in Fig. 8a. Landslides mostly occurred (43.15%) at intermediate elevations (320–400 m), a class that accounts for 29.11% of the total area. In the following elevation class (400–500 m), landslide frequency is around 21%. The results suggest that landslides are frequent at middle elevations; this is because the area ratios at middle elevations are greater than those at higher elevations.
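As a rough illustration of the area-ratio argument above, the landslide share of a class can be compared with its area share. This is a sketch only: the function name is hypothetical, and this normalization is shown for intuition rather than as a step reported in the analysis. The two inputs are the figures quoted above for the 320–400 m elevation class.

```python
def share_ratio(landslide_share_pct: float, area_share_pct: float) -> float:
    """Ratio of a class's share of landslides to its share of the study area."""
    if area_share_pct <= 0.0:
        raise ValueError("area share must be positive")
    return landslide_share_pct / area_share_pct


# 43.15% of landslides fall in a class covering 29.11% of the area.
ratio = share_ratio(43.15, 29.11)
print(round(ratio, 2))  # a value above 1 means more landslides than area alone suggests
```

For this class the ratio is about 1.48, so the class holds more landslides than its area share alone would imply, in addition to being a large class.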

Fig. 8

Relationships between landslide frequency and the predisposing factors: a elevation; b slope angle; c slope aspect; d curvature; e plan curvature; f profile curvature; g drainage density; h distance to drainage network; i SPI; j CTI; k accumulative rainfall; l lithology

Slope angle plays an important role in the occurrence of landslides. On a relatively flat slope (0°–5°), the force of gravity acts directly downward; thus, the material remains on the slope and does not move under gravity. On a steeper slope, the shear stress (the tangential component of gravity) increases and the perpendicular component of gravity decreases (Dou et al. 2014). In this study, the landslide frequency in the slope classes 10°–15°, 15°–20°, and 20°–25° is 22.14%, 20.44%, and 16.82%, respectively, as shown in Fig. 8b. Gentle slopes (0°–5°) have a relatively low frequency of landslide occurrence owing to the lower shear stress (Fig. 8b). The decrease in landslide frequency in the steeper slope classes is attributed to the smaller share of the study area that falls in those classes.

Aspect, which describes the orientation of a slope, is an important factor affecting a region's insolation, vegetative growth, soil moisture conditions, and wind velocity (Aksoy and Ercanoglu 2012) and is hence regarded as a highly important predisposing factor in LSM (Carrara 1983; Camilo et al. 2017). Also, when hillsides become saturated under intense precipitation, aspect influences the infiltration capacity of the slope, which is controlled by parameters including soil constitution, permeability, and pore water pressure. With regard to slope aspect, landslides mostly occurred on east-, southeast-, south-, southwest-, and west-facing slopes, as shown in Fig. 8c. The results indicate that slopes facing from east to west are highly prone to landslides. The largest landslide frequency (22.59%) occurred on southeast-facing slopes, followed by south-facing slopes (20.84%). On north-facing slopes, the landslide frequency is comparatively low. This is in agreement with many previous studies, which state that north-facing slopes favor enhanced vegetation growth (Olivero and Hix 1998; Ghimire et al. 2011; Måren et al. 2015). The higher solar radiation received on south-facing slopes may dry out the vegetation cover faster and hence induce more landslides.

Figure 8d shows that landslides (37.37%) are mostly concentrated in the 0–2 class for curvature, followed by the −1–0 class with a landslide frequency of 27.97%, while for profile curvature, landslides mostly occurred in the −2–0 and −4–2 classes (Fig. 8f). The plan curvature of a surface is the curvature of the hillside in the horizontal plane. Plan curvature is subdivided into concave (hollows), convex (noses), and flat (planar) regions. Landslides generally occur on concave slopes because concavity increases soil moisture and promotes sliding. However, in this study, the flat and convex slopes show higher landslide frequency than the concave slopes (Fig. 8e). One probable reason is that hilly ridges in the Dongjiang Watershed are likely to collapse because of the impact of human activities (building the reservoir) causing higher ground acceleration. Another reason may be that the intense rainfall ran off the hillslope surface quickly; thus, the water could not accumulate much within a short time.

Drainages undercut the hillslopes as the intensity of flow increases, resulting in increased landslide frequency at higher drainage density. For example, drainage density and erosion rates in steep Japanese mountains are negatively correlated due to active landslides (Oguchi 1997). Several scholars have therefore studied the interrelationship between landslides and the geomorphological characteristics of drainage networks (Hovius et al. 1998; Dou et al. 2015c). In this location, the Dongjiang River flows into the reservoir. In our study area, landslides mostly occurred at drainage densities of 1–1.4 m⁻¹ and decreased further in proportion to the area ratio (Fig. 8g). For the distance to drainage network factor, landslides occurred most frequently at 130–280 m, followed by less than 130 m (Fig. 8h). With increasing distance to the drainage network, landslide frequency usually decreases because the topographic change induced by erosion might influence landslide initiation.

For the hydrological predisposing factors, SPI (a measure of the erosive power of overland flow) and CTI (soil wetness; topographic control on hydrological processes), landslides occurred most frequently in the < −6 and < −2 categories, respectively, as shown in Fig. 8i, j. Rainfall increases the weight of the slope by seeping into the bedrock beneath and filling pore spaces or fractures. This added weight increases stress and induces slope instability. Rainfall also induces a change in the angle of repose. In landslide studies, accumulated rainfall is considered a more important factor than simple rainfall statistics (Li et al. 2017). For the accumulative rainfall factor, landslides mostly occurred in the 320–345 mm class, followed by the 345–360 mm class. Landslides also occurred readily above 375 mm, a class that covers a relatively small percentage of the total study area.

When lithology is considered, landslides mostly occurred (around 50%) in biotite adamellite (a type of granite), followed by sandstone and slate (about 20%) and then limestone (about 16%). As mentioned in Sect. 2, the granitic rocks are highly weathered and susceptible to failure. Sandstone contains enough pore space to accumulate rainfall, which can saturate the rock and increase its weight. Water also enters the bedrock below through bedding planes and ultimately reduces cohesion. Similarly, slate, which contains clay minerals, generally tends to have low shear strength and is the most likely place for failure to occur, especially if the layer dips in a down-slope direction. Limestone units may contain caverns and be leached by chemical weathering from groundwater.

5.2 Predisposing factor selection for LSM maps

The results of the correlation analysis between landslide occurrence and the predisposing factors for the Dongjiang Reservoir area are shown in Table 3. The CF analysis shows that the Z value is positive for slope angle (0.25), curvature (0.82), plan curvature (0.21), drainage density (0.96), distance to drainage network (0.11), accumulative rainfall (0.97), and lithology (0.47), as shown in Fig. 9. The Z values are negative for the other factors. Hence, these seven factors were selected for producing the LSM maps. This result also shows that the occurrence of landslides in the study area is mainly affected by a subset of the predisposing factors. Even though the Z values differ among these factors, they all contribute to a certain extent to landslide occurrence. We used the objective CF analysis to avoid the “ghost effect” and to obtain appropriate factors for modeling the LSM maps.

Table 3 Spatial relationship between the predisposing factors and landslide occurrence based on the CF and BSA methods
Fig. 9

Calculation of CF values in the Dongjiang Reservoir Watershed

5.3 Mapping landslide susceptibility using BSA

The correlations between landslide occurrence and the predisposing factors obtained using BSA are presented in Table 3. Two landslide susceptibility maps were generated: (1) using the seven selected factors (CF > 0) and (2) using all the original 12 factors (Fig. 10). Based on natural breaks, the susceptibility level was divided into six classes, i.e., extremely low, low, moderate, high, very high, and extremely high. Visual interpretation reveals that there are many more red areas (very high susceptibility class) in Fig. 10b, whereas there are more dark blue areas (very low susceptibility class) in Fig. 10a. The quantification in Fig. 11 and Table 4 shows that, when the original factors were used, 90.84% of the total landslides occurred in the 52.56% of the area classified as high, very high, or extremely high susceptibility, while, when the optimized seven factors were used, 51.73% of the total landslides occurred in the 92.03% of the area classified as high, very high, or extremely high susceptibility (Fig. 12 and Table 5).
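The quantification described above amounts to summing, over the three highest classes, the share of landslides and the share of area. A minimal Python sketch follows; the function name and the per-class cell and landslide counts are hypothetical and are not values from Tables 4 or 5.

```python
TOP_CLASSES = ("high", "very high", "extremely high")


def share_in_top_classes(area_cells, landslide_counts, top=TOP_CLASSES):
    """Return (% of landslides, % of area) falling in the `top` classes.

    area_cells:       class name -> number of raster cells in that class
    landslide_counts: class name -> number of landslides in that class
    """
    total_area = sum(area_cells.values())
    total_landslides = sum(landslide_counts.values())
    area_pct = 100.0 * sum(area_cells[c] for c in top) / total_area
    landslide_pct = 100.0 * sum(landslide_counts[c] for c in top) / total_landslides
    return landslide_pct, area_pct


# Hypothetical counts for the six susceptibility classes.
area_cells = {
    "extremely low": 100, "low": 150, "moderate": 250,
    "high": 200, "very high": 200, "extremely high": 100,
}
landslide_counts = {
    "extremely low": 2, "low": 3, "moderate": 5,
    "high": 20, "very high": 30, "extremely high": 40,
}

landslide_pct, area_pct = share_in_top_classes(area_cells, landslide_counts)
print(f"{landslide_pct:.2f}% of landslides in {area_pct:.2f}% of the area")
```

A map that places a large share of landslides in a small share of the area is the more discriminating one, so the two percentages are best read together.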

Fig. 10 LSM maps produced by the BSA method: a selected seven factors and b original 12 factors. Maps show the spatial probability of landslide occurrence in six classes. The upstream of the reservoir is located at the lower left corner of the map

Fig. 11 Susceptibility class distribution within the study area and the occurrence of landslides according to the classification scheme for LSM using the BSA method with the original 12 factors

Table 4 Result of statistical analysis concerning landslide susceptibility from the BSA method with the original 12 factors

Fig. 12 Susceptibility class distribution within the study area and the occurrence of landslides according to the classification scheme for LSM using the BSA method with the selected seven factors

Table 5 Result of statistical analysis concerning landslide susceptibility from the BSA method with the selected seven factors

5.4 Mapping landslide susceptibility using BLR

The forward stepwise BLR approach was used to incorporate the predictor variables using the SPSS 20 software. The training dataset (1545 of the total landslides), represented by points, was assigned the value of 1. The same number of non-landslide points was randomly sampled from the landslide-free area and assigned the value of 0. The result based on all original factors is shown in Table 6. According to this table, all the predisposing factors have a P value less than 0.05, indicating a statistically significant correlation between the factors and landslide susceptibility at the 95% confidence level (Bui et al. 2011). Based on the regression equation, the probability of landslide occurrence (P) can be computed as described earlier.
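The two steps described above, balanced sampling of presence and absence points and evaluation of the logistic probability, can be sketched in Python as follows. This is not the SPSS workflow used in the study. It assumes the standard logistic form P = 1 / (1 + exp(-z)), where z is the linear combination of the factors, and the function names, coefficients, and factor values are hypothetical.

```python
import math
import random


def balanced_sample(landslide_points, landslide_free_points, seed=0):
    """Label landslide points 1 and an equally sized random draw of
    landslide-free points 0, as done for the training dataset."""
    rng = random.Random(seed)
    absences = rng.sample(landslide_free_points, len(landslide_points))
    return [(p, 1) for p in landslide_points] + [(p, 0) for p in absences]


def landslide_probability(intercept, coefficients, values):
    """Standard logistic model: P = 1 / (1 + exp(-z)), with
    z = intercept + sum(coefficient_i * value_i)."""
    z = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical point identifiers: 5 landslide points and 50 landslide-free points.
training = balanced_sample(list(range(5)), list(range(100, 150)))

# Hypothetical coefficients and factor values for a single terrain unit.
coefficients = {"slope angle": 0.8, "drainage density": 1.2}
values = {"slope angle": 0.5, "drainage density": 0.25}
p = landslide_probability(-0.7, coefficients, values)
print(len(training), round(p, 3))
```

Sampling as many absence points as presence points keeps the two classes balanced, so the fitted intercept is not dominated by the much larger landslide-free area.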

Table 6 Coefficients and statistics of all the factors used in the BLR equation

Lastly, the regression coefficients of the predictors, GIS, and the natural break criterion were used to generate the landslide susceptibility maps (Fig. 13). In some places the two maps differ only subtly, but in others the dissimilarities are obvious. The map produced using all factors contains more red areas, which concentrate at the very high and extremely high ends of the color ramp, than its seven-factor counterpart. The seven-factor map is less heterogeneous. Figure 14 and Table 7 show that, when all the original factors were used, 95.51% of the total landslides occurred in the 66.73% of the area classified as high, very high, or extremely high susceptibility, while, when the optimal seven factors were used, 96.1% of the total landslides occurred in the 64.09% of the area in the same classes (Fig. 15 and Table 8).

Fig. 13 LSM maps produced by the BLR method: a selected seven factors and b original 12 factors. Maps show the spatial probability of landslide occurrence in six classes. The upstream of the reservoir is located at the lower left corner of the map

Fig. 14 Susceptibility class distribution within the study area and the occurrence of landslides according to the classification scheme for LSM using the BLR method with the original 12 factors

Table 7 Result of statistical analysis concerning landslide susceptibility from the BLR method with the original 12 factors

Fig. 15 Susceptibility class distribution within the study area and the occurrence of landslides according to the classification scheme for LSM using the BLR method with the selected seven factors

Table 8 Result of statistical analysis concerning landslide susceptibility from the BLR method with the selected seven factors

5.5 Accuracy estimation

For verification, the total landslides were randomly divided into two groups, training data and validation data. The prediction skill of the susceptibility models was evaluated using receiver operating characteristic (ROC) curves, which plot sensitivity (% of terrain units containing landslides that are correctly classified) against 1-specificity (% of landslide-free terrain units that are incorrectly classified as landslide-prone). The area under the ROC curve (AUC) evaluates the overall performance of the landslide susceptibility models (Bui et al. 2011). As a rule, the closer the AUC value is to 1, the better the performance of the landslide model (Shahabi et al. 2014). For the BSA method, the AUC value obtained with the optimal seven factors (0.837) is higher than that obtained with all the original factors (0.794) (Fig. 16a). For the BLR model, the AUC value of the prediction rate curve from the seven factors (84.8%) is higher than that from all factors (80.8%), as shown in Fig. 16b. Consequently, using the seven factors gives a higher accuracy than using all the original factors. In addition, BLR has a slightly higher accuracy than BSA.
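To make the validation measure concrete, the Python sketch below computes one ROC point and the AUC from susceptibility scores and landslide labels. It uses the rank-based (Mann-Whitney) form of the AUC and the standard definition of 1-specificity as the false positive rate. The function names and the four example terrain units are hypothetical and unrelated to the study data.

```python
def roc_point(scores, labels, threshold):
    """Return (1 - specificity, sensitivity) when units scoring at or above
    `threshold` are classified as landslide-prone."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return fp / (fp + tn), tp / (tp + fn)


def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen landslide unit scores
    higher than a randomly chosen landslide-free unit (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both landslide and landslide-free units are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation set: two landslide units, two landslide-free units.
scores = [0.9, 0.4, 0.8, 0.3]
labels = [1, 1, 0, 0]
print(roc_point(scores, labels, 0.5), roc_auc(scores, labels))
```

The pairwise loop is quadratic in the number of units, which is acceptable for a sketch; for large rasters a sort-based rank computation gives the same value more efficiently.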

Fig. 16 a ROC curves for landslide susceptibility maps produced using BSA with the selected seven and original 12 factors; b ROC curves for landslide susceptibility maps produced using BLR with the selected seven and original 12 factors

6 Discussions

Devastating landslides triggered by intense rainfall occur in many places around the world every year. Predicting the exact locations of instabilities, and therefore assessing landslide susceptibility, is rather difficult because of the uncertainty in the spatial and temporal distribution of rainfall. We investigated the characteristics of the landslides triggered by the torrential rainfall of Typhoon Bilis in the Dongjiang Reservoir Watershed region. In the study area, intense rainfall caused slope failures associated with severely weathered granite, resulting in numerous shallow landslides. Among the many factors that lead to landslides, rainfall, slope, aspect, curvature, bedrock, drainage density, elevation, SPI, and CPI are the important ones. Although the selection of factors is a fundamental step in landslide susceptibility evaluation, there is no universal standard or rule for selecting the predisposing factors (Dou et al. 2019a, b). These issues are commonly addressed by GIS-based landslide susceptibility studies.

To address this problem, we proposed the CF method to select the principal factors. Different scholars use different landslide-predisposing factors for LSM. Using the CF method, we selected the predisposing factors most strongly related to landslide occurrence. Our study of rainfall-induced landslides in the Dongjiang Reservoir Watershed is applicable to many similar cases. The resulting improvement in the AUC values validates our approach: the optimized factors led to a higher accuracy than the simultaneous use of all the original factors. Spatial autocorrelation and data redundancy among the predisposing factors before optimization are possible causes of this observation.

Analysis of CF suggests that drainage density and total curvature are important in the case study area besides other common factors such as lithology and rainfall. Total curvature represents the morphological measurement of the topography (Lee and Pradhan 2006). A more upwardly concave or convex slope holds more water and keeps it longer, and these hydrological controls of topography are more pronounced in mountainous areas and weaker in flat areas. Furthermore, another important factor that is not represented in this study is the location of the reservoir and its implications. During the heavy rains that drenched the area, the fluctuation of groundwater might have played a very important role in triggering landslides around the reservoir. Under such circumstances, slopes tend to lose their stability because of the loss of suction. Previous studies have indicated the role of precipitation, subsequent infiltration, groundwater circulation patterns, and the resultant increase in hydrostatic pressures accumulated over long periods in triggering landslides (de Montety et al. 2007; Ronchetti et al. 2009). Debieche et al. (2012) pointed out the influence of flow path and aquifer complexity on the hydrogeology of a landslide. Susceptibility assessments may also be influenced by other important factors such as lithology, as noticed in the CF analysis. The weathering of the granite bedrock provided a source of residual soil. Under unsaturated conditions, residual soil deposits are probably the most prone to landslides associated with long-duration rainfall (Regmi et al. 2013; Yamagishi et al. 2004). The permeability and drainage characteristics of the area also affected the large-scale movement of boulders and sediments. Depending on its degree of fracturing and weathering, the underlying rock could have acted as a sink or a source of groundwater for the overlying landslide mass, which is crucial for slope stability analyses. Studies in granitic terrains of central Japan found that groundwater flowing in permeable, weak, and fractured rocks seeps into the overlying unconsolidated sediments (Asano et al. 2003; Katsura et al. 2008), resulting in landslides.

Additionally, previous research by the authors on Sado Island, Japan (Dou et al. 2015c), found that drainage density, lithology, and slope angle are the typical factors. These findings also agree with other studies around the world (Jebur et al. 2014; Dou et al. 2015c). For instance, drainage density can provide an indirect measure of the groundwater conditions that play an important role in landslide activity (Dou et al. 2015c). Thus, these landslide factors may be common to various areas in the world. We believe that our research differs from the others in that we provide a method to select and qualify the landslide-predisposing factors. The comparison of BSA and BLR, supported by their respective AUC values, suggests that logistic regression performs better than BSA. This conclusion is in good agreement with other researchers around the world (Chen and Wang 2007; Devkota et al. 2013). However, both the BLR- and BSA-derived LSM maps tend to show some striping, referred to here as the ghost effect (Fig. 16). These ghost effects can be largely attributed to the reproduction of the buffer zones of drainage density and distance to drainage networks. Saha et al. (2005) also reported ghost effects in their LSM caused by the buffering of structural discontinuities while producing the landslide nominal susceptibility factor for the Himalayas. Nevertheless, the resultant prediction maps from data-driven models are very helpful for emergency response and management in the Dongjiang region.

7 Conclusions

This study explores the characteristics of the landslides induced by Typhoon Bilis. Owing to orographic effects, the areas around the reservoir are likely to have received extremely high rainfall totals. Two main reasons are responsible for the landslide event: (1) torrential rainfall of high intensity and long duration and (2) severely weathered rock that supplied considerable sediment, which combined with water to form mudslides downstream. Additionally, this research demonstrates the usefulness of the CF model in identifying suitable predisposing factors for LSM mapping. Based on the CF model, seven influencing factors with high correlations to landslide occurrence were selected from the set of original factors. The LSM maps were then produced by the BSA and BLR methods for both the CF-identified predisposing factors and the original set of factors. For both the BSA and BLR methods, the success rate and the prediction rate indicate that the seven factors yield better results than all the factors. In addition, we noticed that the maps prepared using the seven predisposing factors have much more homogeneous classes than those prepared using the original factors. The proposed certainty factor method provides a useful way to select the predisposing factors of landslides, particularly where data redundancy or scarcity is critical. The findings indicate that, in mountainous regions suffering from data scarcity, it is possible to select the key factors related to landslide occurrence based on the CF model in a GIS platform. Moreover, in this research, BLR slightly outperformed the other methods such as the frequency ratio and BSA, which agrees with the results of some other researchers around the world.

We believe that the results of our study provide helpful information for disaster management, urban planning, risk mitigation, and related decision making in landslide-prone areas. For example, in the study area, the resultant landslide susceptibility maps can help in selecting appropriate locations for urban development, increasing economic benefits and decreasing future damage and loss of life.