1 Introduction

Landslides are among the most serious natural hazards and cause many fatalities each year (Kirschbaum et al. 2015; Petley 2011). To assess the landslide risk in a region accurately and quickly, regional landslide risk assessment methods are often applied (Wu and Sidle 1995). Risk assessment of a single landslide is usually based on physical models and requires specific physical parameters, such as geometry, shear strength (c, ϕ) and moisture content, depending on the model or software used (e.g., Duc 2013; Montgomery and Dietrich 1994; Thanh and De Smedt 2014; Van Westen and Terlien 1996; Gokceoglu and Aksoy 1996). In contrast, regional landslide risk assessment is usually based on statistical models (including machine learning). Because many landslides may occur in a region, it is almost impossible to obtain detailed physical parameters for each of them (Van Westen and Terlien 1996), which is why physical models are rarely used in regional landslide risk assessment.

A statistical model analyzes existing landslides statistically and then predicts future landslide risk (Bui et al. 2016). Its basic assumption is that future landslides are more likely to occur under the geographical and geological conditions in which past landslides occurred. Statistical methods therefore usually require a large number of historical landslide records (i.e., landslide samples) to identify the predisposing or triggering conditions for landslides (Bui et al. 2012c). The quality of the landslide samples, the scale of the maps and the features of the statistical model together determine the accuracy of the outcomes. The statistical and machine learning models commonly adopted in landslide risk analysis include support vector machines (SVM) (Kavzoglu et al. 2014b), logistic regression (Atkinson and Massari 1998; Costanzo et al. 2014; Felicisimo et al. 2013; Kavzoglu et al. 2014a; Lee 2005; Pradhan and Lee 2010; Bui et al. 2011; Tunusluoglu et al. 2008), fuzzy logic analysis (Akgun et al. 2012; Ercanoglu and Gokceoglu 2002; Lee 2007; Pourghasemi et al. 2012; Pradhan 2011; Bui et al. 2012b), decision trees (Nefeslioglu et al. 2010; Pradhan 2012; Bui et al. 2012a, 2013a) and BP neural networks (Lee et al. 2003a, b, 2004; Lu and Rosenbaum 2003; Ermini et al. 2005; Gomez and Kavzoglu 2005). In the cited studies, these methods achieve good accuracy in regions with areas of hundreds or thousands of km², but how they perform (in terms of calculation speed and accuracy) in regions larger than tens of thousands of km² remains to be clarified. To meet the needs of large regional landslide risk analysis, it is essential to establish and test landslide risk assessment models that are suitable and effective for large regions.

Here, a large regional landslide risk analysis model means a model that is suitable for assessing an area of more than tens of thousands of km² while maintaining accuracy and keeping the time cost low. However, most models lose accuracy and slow down as the assessment region gets larger (Cascini 2008). To address these problems, we compared the models discussed above and found that the backpropagation (BP) neural network shows better applicability to landslide risk assessment in different regions and relies less on the scale of maps. However, a BP neural network used for landslide risk assessment needs more nodes in the input and hidden layers. As the number of nodes increases, the main advantage is that the BP neural network achieves better accuracy and can handle more complex problems. The main disadvantage is that the initial weights and thresholds, which are generated randomly between nodes in the neural network, may reduce its accuracy or make the assessment results unreliable.

The objectives of this work are: (1) to use existing algorithms to improve the BP neural network for landslide risk assessment and clarify their applicability to large regions (more than tens of thousands of km²) and (2) to carry out a comprehensive landslide risk assessment of Sichuan Province, China, and draw landslide risk zoning maps. To achieve these objectives, this paper adopts the genetic algorithm (GA) (Belew et al. 1992) and particle swarm optimization (PSO) (Changuhan et al. 2015; Aydln et al. 2013) to optimize the initial weights and thresholds of the BP neural network; the resulting models for landslide risk analysis in large regions are called the GA-BP model and the PSO-BP model. Then, based on 100 typical historical landslides in Sichuan Province, China, this paper compares the BP, GA-BP and PSO-BP neural network models in assessing the landslide risk in Sichuan Province by using the root mean square error (RMSE), the coefficient of determination (R2), the Kappa coefficient, the receiver operating characteristic (ROC), the training time and the weights of the condition factors. According to the risk values from the three models, risk maps of Sichuan Province are then produced in a geographic information system (GIS); these can serve as fundamental landslide risk maps for engineering planning and construction in the mountainous regions of Sichuan Province. Finally, this study recommends models for different engineering needs, which increases the efficiency of landslide risk assessment in large regions.

2 Methodology

2.1 Landslide risk assessment model

2.1.1 BP artificial neural network model

The backpropagation (BP) neural network is a multilayer feedforward network trained by the error backpropagation algorithm (i.e., the BP algorithm), and it is one of the most widely adopted neural networks. The basic idea of the BP algorithm is that the learning process consists of the forward propagation of signals and the backward propagation of errors. The topology of the BP neural network has three parts: input layer, hidden layer and output layer (see Fig. 1a). All landslide risk analysis models in this paper are based on the BP neural network. The condition factors in landslide risk assessment determine the number of nodes in the input layer, and the risk assessment values determine the number of nodes in the output layer (see Fig. 1b). The weights of the BP neural network can be divided into two parts: the weights from the input layer to the hidden layer (wik) and the weights from the hidden layer to the output layer (wkj). The thresholds are divided in the same way and are called threshold1 and threshold2, respectively. The weights and thresholds are updated repeatedly during training to fit the complex nonlinear relationships between the condition factors and the risk, so their initial values are important. However, the initial weights and thresholds are usually generated randomly, which can make the assessment results unreliable. To overcome this drawback, in this paper the initial weights and thresholds are determined by the optimization algorithms introduced below.

Fig. 1 a The topology of the BP neural network, b the topology of the BP neural network in this paper, c the calculation process of the BP neural network

The calculation process of the BP neural network is drawn in Fig. 1c, and the other necessary parameters used in the BP neural network are shown in Table 1.
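The forward propagation of signals and the backward propagation of errors described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB implementation used in this paper: the hidden layer size (8), the sigmoid hidden activation, the linear output, the learning rate, the sign convention for the thresholds and the scaled sample values are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, n_out, rng):
    """Random initial weights and thresholds (the values GA or PSO would replace)."""
    return {
        "w_ik": [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)],
        "threshold1": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "w_kj": [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_hidden)],
        "threshold2": [rng.uniform(-1, 1) for _ in range(n_out)],
    }


def forward(net, x):
    """Forward propagation of signals: input -> hidden (sigmoid) -> output (linear)."""
    hidden = [
        sigmoid(sum(x[i] * net["w_ik"][i][k] for i in range(len(x))) - net["threshold1"][k])
        for k in range(len(net["threshold1"]))
    ]
    output = [
        sum(hidden[k] * net["w_kj"][k][j] for k in range(len(hidden))) - net["threshold2"][j]
        for j in range(len(net["threshold2"]))
    ]
    return hidden, output


def train_step(net, x, y, lr=0.05):
    """Backward propagation of errors: one gradient step on 0.5 * sum((o - y)^2).

    Returns the loss measured before the update.
    """
    hidden, output = forward(net, x)
    err = [o - t for o, t in zip(output, y)]
    # Hidden-layer deltas are computed with w_kj before it is updated.
    delta = [
        sum(err[j] * net["w_kj"][k][j] for j in range(len(err))) * hidden[k] * (1 - hidden[k])
        for k in range(len(hidden))
    ]
    for k in range(len(hidden)):
        for j in range(len(err)):
            net["w_kj"][k][j] -= lr * err[j] * hidden[k]
    for j in range(len(err)):
        net["threshold2"][j] += lr * err[j]      # threshold enters with a minus sign
    for i in range(len(x)):
        for k in range(len(hidden)):
            net["w_ik"][i][k] -= lr * delta[k] * x[i]
    for k in range(len(hidden)):
        net["threshold1"][k] += lr * delta[k]
    return 0.5 * sum(e * e for e in err)


rng = random.Random(0)
net = init_network(n_in=6, n_hidden=8, n_out=1, rng=rng)   # six condition factors, one risk value
x, y = [0.2, 0.5, 0.7, 0.4, 0.1, 0.9], [0.6]               # hypothetical scaled sample
losses = [train_step(net, x, y) for _ in range(50)]
```

Because `init_network` draws the starting values at random, two runs with different seeds can converge to different solutions, which is the sensitivity to initial weights and thresholds that the optimization algorithms are meant to reduce.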

Table 1 The main parameters of the BP neural network

2.1.2 GA-BP neural network model

The genetic algorithm (GA) searches for optimal solutions by simulating natural evolutionary processes. The weights and thresholds optimized by the GA replace the randomly selected initial weights and thresholds of the BP neural network; that is, the GA optimizes the initial values in advance, and BP training then makes minor adjustments. The GA repeatedly optimizes the weights and thresholds until the error falls within the allowable error; the parameters obtained are then close to the optimal parameters, and the BP neural network is used for the final adjustment. The GA-BP model comprises the following steps:

  (1) Initialize the parameters of the BP neural network.

  (2) Extract the initial weights and thresholds from the BP neural network, and encode them to form chromosomes with four genes in the GA, expressed as \((w_{ik} ,{\text{threshold}}_{1} ,w_{kj} ,{\text{threshold}}_{2} )\).

  (3) Calculate the fitness of each chromosome in the GA on the training data according to the fitness function F (Eq. 1):

    $$F = k\left( {\sum\limits_{i = 1}^{n} {\left| {y_{i} - o_{i} } \right|} } \right)$$
    (1)

    where n is the number of network output nodes, yi is the expected output of the ith node, oi is the actual output of the ith node, and k is a coefficient.

  (4) Calculate the selection probability of each chromosome according to its fitness value (Eqs. 2 and 3). The higher the fitness value, the greater the probability of being selected.

    $$f_{i} = kF_{i}$$
    (2)
    $$p_{i} = \frac{{f_{i} }}{{\sum\nolimits_{j = 1}^{N} {f_{j} } }}$$
    (3)

    where Fi is the fitness value of the ith chromosome, pi is its selection probability, and N is the number of chromosomes.

  (5) Retain the chromosomes with better fitness values according to pi, and then apply crossover and mutation to the retained chromosomes. The crossover between the kth chromosome (Akj) and the lth chromosome (Alj) at gene j is described by Eq. 4.

    $$\begin{aligned} A_{kj} & = A_{kj} (1 - b) + A_{lj} b \\ A_{lj} & = A_{lj} (1 - b) + A_{kj} b \\ \end{aligned}$$
    (4)

    where b is a random number from 0 to 1. The mutation of the jth gene (aij) of the ith chromosome is described by Eqs. 5 and 6.

    $$a_{ij} = \left\{ {\begin{array}{*{20}l} {a_{ij} + (a_{ij} - a_{\hbox{max} } )*f(g)} \hfill & {r > 0.5} \hfill \\ {a_{ij} + (a_{\hbox{min} } - a_{ij} )*f(g)} \hfill & {r \le 0.5} \hfill \\ \end{array} } \right.$$
    (5)
    $$f(g) = r_{2} \left( {1 - \frac{g}{{G_{\hbox{max} } }}} \right)^{2}$$
    (6)

    where amax is the maximum of gene aij, amin is the minimum of gene aij, r2 is a random number, g is the current iteration number, Gmax is the maximum number of iterations, and r is a random number from 0 to 1.

  (6) One generation in the GA consists of one round of selection, crossover and mutation. Repeat for several generations, and stop the computation when the error reaches the allowable error.

  (7) Replace the initial weights and thresholds of the BP neural network with the optimized weights and thresholds computed by the GA.

The whole process is shown in Fig. 2a, and the other necessary parameters used in the GA are shown in Table 2.
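The GA operators in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this paper: each gene is a single number (in the paper each of the four genes holds a set of weights or thresholds), the population size and the f values are hypothetical, and roulette-wheel sampling is assumed for the selection step. The mutation function transcribes Eqs. 5 and 6 as printed.

```python
import random


def fitness(expected, actual, k=1.0):
    """Eq. 1: F = k * sum(|y_i - o_i|) over the n output nodes."""
    return k * sum(abs(y - o) for y, o in zip(expected, actual))


def selection_probabilities(f):
    """Eq. 3: p_i = f_i / sum_j f_j."""
    total = sum(f)
    return [fi / total for fi in f]


def select(population, probabilities, rng):
    """Roulette-wheel selection (assumed): draw N chromosomes with probability p_i each."""
    picked = rng.choices(population, weights=probabilities, k=len(population))
    return [chrom[:] for chrom in picked]   # copy so later edits do not alias


def crossover(a_k, a_l, j, b):
    """Eq. 4: blend gene j of chromosomes k and l, with b in [0, 1]."""
    new_k = a_k[j] * (1 - b) + a_l[j] * b
    new_l = a_l[j] * (1 - b) + a_k[j] * b
    a_k[j], a_l[j] = new_k, new_l


def mutate(a, j, a_min, a_max, g, g_max, r, r2):
    """Eqs. 5 and 6 as printed: the step factor f(g) shrinks to 0 as g approaches g_max."""
    f_g = r2 * (1 - g / g_max) ** 2
    if r > 0.5:
        a[j] = a[j] + (a[j] - a_max) * f_g
    else:
        a[j] = a[j] + (a_min - a[j]) * f_g


rng = random.Random(0)
population = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]  # toy chromosomes
f = [1.0, 2.0, 3.0, 4.0, 5.0, 5.0]          # hypothetical f_i values from Eq. 2
p = selection_probabilities(f)
population = select(population, p, rng)
crossover(population[0], population[1], j=2, b=rng.random())
mutate(population[0], j=1, a_min=-1.0, a_max=1.0, g=10, g_max=100,
       r=rng.random(), r2=rng.random())
```

One generation is one pass through `select`, `crossover` and `mutate`; in a full GA-BP run the loop would repeat until the error of the best chromosome meets the allowable error.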

Fig. 2 a Flowchart of GA-BP model; b flowchart of PSO-BP model

Table 2 The main parameters of the GA

2.1.3 PSO-BP neural network model

Particle swarm optimization (PSO) is also a method for searching for optimal solutions, and it is simpler than the GA (fewer calculation steps and parameters). The initial weights and thresholds from the BP neural network are defined as particles in the PSO, and all particles form a group. The basic idea of the PSO is to find the optimal solution through cooperation and information sharing among the individuals in the group. The PSO is combined with the BP neural network in the same way as in the GA-BP model. The PSO-BP model consists of the following steps, and the flowchart is shown in Fig. 2b:

  (1) Initialize the parameters of the BP neural network.

  (2) Extract the initial weights and thresholds from the BP neural network, encode them into a group of particles in the PSO (group size N), and assign each particle a position (Eq. 7) and a random velocity (Eq. 8). Then, calculate the fitness F of each particle based on Eq. 1.

    $$x_{i} = (x_{i1} ,x_{i2} , \ldots ,x_{iD} )$$
    (7)
    $$v_{i} = (v_{i1} ,v_{i2} , \ldots ,v_{iD} )$$
    (8)

    where D is the dimension of the space and is equal to 4 in this paper.

  (3) For each particle, compare its current fitness value with the best value it has obtained so far (i.e., pbest); if the current value is better, replace pbest with it.

  (4) For each particle, compare its current fitness value with the best value obtained in the group so far (i.e., gbest); if the current value is better, replace gbest with it.

  (5) Update the velocity and position of each particle by means of Eqs. 9 and 10.

    $$v_{id}^{k + 1} = \omega v_{id}^{k} + c_{1} r_{1} \left( {pbest - x_{id}^{k} } \right) + c_{2} r_{2} \left( {gbest - x_{id}^{k} } \right)$$
    (9)
    $$x_{id}^{k + 1} = x_{id}^{k} + v_{id}^{k + 1}$$
    (10)

    where k is the iteration number, d is the dimension index, c1 and c2 are learning rates, r1 and r2 are random numbers from 0 to 1, ω is the inertia weight, pbest is the position with the best fitness value that the particle has obtained, and gbest is the position with the best fitness value obtained in the group.

  (6) Stop the computation if the error meets the allowable error.

  (7) Replace the initial weights and thresholds in the BP neural network with the optimized weights and thresholds computed by the PSO.

The necessary parameters used in the PSO are shown in Table 3. Figure 3 shows the process of determining the weights by the GA and the PSO. The main functions of these two algorithms are to prevent an incorrect ranking of the weights and to improve the fitness of the weights. All computations are implemented in MATLAB R2017a.
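The PSO update of Eqs. 9 and 10, together with the pbest and gbest bookkeeping of steps (3) and (4), can be sketched as follows. This is a hedged illustration only: `toy_error` is a hypothetical stand-in for the network error of Eq. 1, and the values of ω, c1, c2, the group size and the iteration count are assumptions for the example, not the settings of Table 3.

```python
import random


def pso_step(x, v, pbest, gbest, omega, c1, c2, rng):
    """Eqs. 9 and 10 for one particle: update the velocity, then the position, per dimension d."""
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        vd = omega * v[d] + c1 * r1 * (pbest[d] - x[d]) + c2 * r2 * (gbest[d] - x[d])
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v


def toy_error(x):
    """Hypothetical stand-in for the network error of Eq. 1 (smaller is better)."""
    return sum(abs(xi - 0.5) for xi in x)


def run_pso(n_particles=10, dim=4, iters=50, omega=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    xs = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_err = [toy_error(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_err[i])
    gbest, gbest_err = pbest[g][:], pbest_err[g]
    initial_err = gbest_err
    for _ in range(iters):
        for i in range(n_particles):
            xs[i], vs[i] = pso_step(xs[i], vs[i], pbest[i], gbest, omega, c1, c2, rng)
            err = toy_error(xs[i])
            if err < pbest_err[i]:          # step (3): update the particle's own best
                pbest[i], pbest_err[i] = xs[i][:], err
            if err < gbest_err:             # step (4): update the group's best
                gbest, gbest_err = xs[i][:], err
    return gbest, gbest_err, initial_err


gbest, gbest_err, initial_err = run_pso()
```

In a PSO-BP run, the returned `gbest` would be decoded back into the initial weights and thresholds of the BP neural network before ordinary BP training starts.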

Table 3 The main parameters of the PSO
Fig. 3 Schematic diagram of initial weight improvement

2.2 Indices of evaluating the proposed models

2.2.1 Indices of evaluating the overall risk accuracy: root mean square error (RMSE) and coefficient of determination (R2)

The root mean square error (RMSE) describes the deviation between the estimated value and the true value and serves as an index of fitting accuracy (Eq. 11). The lower the RMSE, the higher the accuracy of the model.

$${\text{RMSE}} = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {(Y_{{{\text{estimated}},i}} - Y_{{{\text{true}},i}} )^{2} } }}{n}}$$
(11)

where n is the number of validating samples.

The coefficient of determination (R2) describes how well the estimated value agrees with the true value. Equations 12–14 show the computation of R2; the higher the coefficient of determination, the higher the accuracy of the model.

$${\text{SSR}} = \sum\limits_{i = 1}^{n} {\left( {\widehat{{y_{i} }} - \bar{y}} \right)}^{2}$$
(12)
$${\text{SST}} = \mathop \sum \limits_{i = 1}^{n} \left( {y_{i} - \bar{y}} \right)^{2}$$
(13)
$$R^{2} = \frac{{{\text{SSR}}}}{{{\text{SST}}}}$$
(14)

where \(y_{i}\) is the true value, \(\widehat{{y_{i} }}\) is the estimated value, and \(\bar{y}\) is the average value of true values.
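As a worked illustration of the two indices, the following sketch computes the RMSE in its usual squared-error form and R2 as the ratio SSR/SST of Eqs. 12–14. The four risk values are hypothetical and serve only to make the arithmetic easy to check by hand.

```python
import math


def rmse(estimated, true):
    """Root mean square error over the n validating samples."""
    n = len(true)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimated, true)) / n)


def r_squared(estimated, true):
    """Eqs. 12-14: R2 = SSR / SST, both measured around the mean of the true values."""
    mean_true = sum(true) / len(true)
    ssr = sum((e - mean_true) ** 2 for e in estimated)
    sst = sum((t - mean_true) ** 2 for t in true)
    return ssr / sst


true_risk = [20, 40, 60, 80]        # hypothetical true risk values
estimated_risk = [25, 35, 65, 75]   # hypothetical model outputs

print(rmse(estimated_risk, true_risk))        # every error is 5, so RMSE = 5.0
print(r_squared(estimated_risk, true_risk))   # SSR = 1700, SST = 2000, so R2 = 0.85
```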

2.2.2 Indices of evaluating accuracy of risk-level classification: Kappa coefficient

The Kappa coefficient indicates whether the classification of the estimated risk level is in good agreement with the true classification. For a model that performs at least as well as chance, the Kappa coefficient ranges from 0 to 1. This range is usually divided into intervals such as 0–0.4, 0.4–0.6, 0.6–0.75 and 0.75–1, representing poor, medium, good and excellent accuracy, respectively. The Kappa coefficient is computed by Eqs. 15–17.

$$k = \frac{{p_{0} - p_{e} }}{{1 - p_{e} }}$$
(15)
$$p_{0} = \frac{T}{n}$$
(16)
$$p_{e} = \frac{{a_{1} \times b_{1} + a_{2} \times b_{2} + \cdots + a_{c} \times b_{c} }}{n \times n}$$
(17)

where k is the Kappa coefficient, \(p_{0}\) is the observed classification accuracy, T is the number of correctly classified samples, \(p_{e}\) is the agreement expected by chance, \(a_{c}\) is the number of samples whose true class is category c, \(b_{c}\) is the number of samples assigned to category c by the model, and n is the number of validating samples.
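Equations 15–17 can be illustrated with a short sketch. It assumes the standard reading of the chance-agreement term, in which the two counts for each category are the number of samples truly in that category and the number of samples the model assigns to it; the ten risk levels below (1 for 'very low' to 5 for 'very high') are hypothetical.

```python
def kappa(true_levels, estimated_levels):
    """Eqs. 15-17: k = (p0 - pe) / (1 - pe)."""
    n = len(true_levels)
    categories = sorted(set(true_levels) | set(estimated_levels))
    correct = sum(1 for t, e in zip(true_levels, estimated_levels) if t == e)
    p0 = correct / n                                   # Eq. 16
    pe = sum(true_levels.count(c) * estimated_levels.count(c)
             for c in categories) / (n * n)            # Eq. 17
    return (p0 - pe) / (1 - pe)                        # Eq. 15


true_levels = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]        # hypothetical true risk levels
estimated_levels = [1, 1, 2, 3, 3, 3, 4, 4, 5, 4]   # hypothetical estimated risk levels

# 8 of 10 samples agree, so p0 = 0.8; pe = (4 + 2 + 6 + 6 + 2) / 100 = 0.2
print(kappa(true_levels, estimated_levels))          # (0.8 - 0.2) / (1 - 0.2) = 0.75
```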

2.2.3 Receiver operating characteristic curve

The receiver operating characteristic (ROC) curve is widely used for assessing the performance of classification models, and the area under the curve (AUC) expresses the accuracy quantitatively. However, the ROC curve applies to two-class classification problems, so this paper uses three simple rules to turn the outputs into two classes (‘0’ or ‘1’): (1) the ‘high’ and ‘very high’ risk levels are classified as the positive value ‘1,’ and the ‘very low,’ ‘low’ and ‘medium’ risk levels are classified as the negative value ‘0’; (2) for a landslide sample, if its estimated class and true class are both ‘high’ or ‘very high,’ it is defined as a true positive (TP), and if they are both ‘very low,’ ‘low’ or ‘medium,’ it is defined as a true negative (TN); and (3) for a landslide sample, if its estimated class is ‘0’ but its true class is ‘1,’ it is defined as a false negative (FN), and in the opposite case it is defined as a false positive (FP). Based on these rules, the contingency table of validating samples is listed in Table 4. The ROC curve is drawn from the true positive rate (TPR) and the false positive rate (FPR).
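The three rules can be sketched as a small function that counts TP, TN, FP and FN and derives one (FPR, TPR) point. The six samples are hypothetical; drawing a full ROC curve would additionally require sweeping a decision threshold over the estimated risk values, which this sketch does not attempt.

```python
HIGH = {"high", "very high"}   # rule (1): these two levels form the positive class '1'


def confusion_counts(true_levels, estimated_levels):
    """Apply rules (2) and (3) to every sample and count TP, TN, FP and FN."""
    tp = tn = fp = fn = 0
    for t, e in zip(true_levels, estimated_levels):
        t_pos, e_pos = t in HIGH, e in HIGH
        if t_pos and e_pos:
            tp += 1
        elif not t_pos and not e_pos:
            tn += 1
        elif t_pos:
            fn += 1    # truly high risk, but estimated below 'high'
        else:
            fp += 1    # truly below 'high', but estimated as high risk
    return tp, tn, fp, fn


def rates(tp, tn, fp, fn):
    """True positive rate and false positive rate for one operating point."""
    return tp / (tp + fn), fp / (fp + tn)


true_levels = ["high", "very high", "low", "medium", "very low", "high"]
estimated_levels = ["high", "medium", "low", "high", "very low", "very high"]

tp, tn, fp, fn = confusion_counts(true_levels, estimated_levels)
tpr, fpr = rates(tp, tn, fp, fn)
```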

Table 4 The contingency table of validating samples

2.2.4 Weights computation of condition factors

After the BP neural network is trained, the weights of the six condition factors can be computed from the weights \(w_{ik}\) from the input layer to the hidden layer and the weights \(w_{kj}\) from the hidden layer to the output layer. The relative weights of the condition factors reflect the geographical characteristics of the landslides collected in the study region. In addition, comparing the computed weights with empirical weights helps verify the validity of the models.
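The exact formula that combines \(w_{ik}\) and \(w_{kj}\) is not stated in this section. One commonly used choice is Garson's algorithm, sketched below purely as an assumption; the function name, the single output node and the small weight matrices are hypothetical and are not taken from this paper.

```python
def garson_importance(w_ik, w_kj):
    """Relative importance of each input under Garson's algorithm (an assumed choice).

    w_ik[i][k] is the weight from input i to hidden node k;
    w_kj[k] is the weight from hidden node k to the single output node.
    """
    n_in, n_hidden = len(w_ik), len(w_kj)
    contribution = [0.0] * n_in
    for k in range(n_hidden):
        column_sum = sum(abs(w_ik[i][k]) for i in range(n_in))
        for i in range(n_in):
            # Share of hidden node k attributed to input i, scaled by |w_kj|
            contribution[i] += abs(w_ik[i][k]) / column_sum * abs(w_kj[k])
    total = sum(contribution)
    return [c / total for c in contribution]


# Two inputs and two hidden nodes, small enough to verify by hand
w_ik = [[1.0, 1.0],
        [3.0, 1.0]]
w_kj = [2.0, 2.0]
weights = garson_importance(w_ik, w_kj)   # contributions 1.5 and 2.5 -> [0.375, 0.625]
```

The returned values sum to 1, so they can be compared directly with empirical factor weights expressed as fractions.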

2.2.5 Training time

The training time in this paper refers to the time taken for a model to complete 10,000 training iterations or to reach the goal error, whichever comes first. This time is reported by MATLAB R2017a at the end of the program.

3 Study region and spatial database

3.1 Topography and climate conditions in the study region

To verify the proposed models, Sichuan Province, China, is selected as the study region. Sichuan Province (see Fig. 4) is located in southwestern China, between longitudes 97°21′–108°33′ E and latitudes 26°03′–34°19′ N, with a total area of approximately 486,000 km². The terrain of Sichuan Province is high in the west and low in the east, and the transition zone from the basin to the plateau rises quickly over a very short distance (100 m/km on average); it is called the elevation rapid changing zone (ERCZ, the area circled by the red dotted line in Fig. 5). The ERCZ was formed by the action of the Longmen Mountain thrust fault, and it is the most distinctive terrain feature in Sichuan Province. According to the multidisciplinary investigation of geological hazards carried out by the Sichuan Provincial Communications Department Highway Survey and Design Institute, the number of hazards such as landslides, collapses and rockfalls in the ERCZ is significantly larger than that in other regions (Fig. 5).

Fig. 4 Landslide inventory map and location in the study region. (The black points represent landslide samples obtained from the literature, and the red points represent landslide samples obtained from our investigations.)

Fig. 5 Photographs of landslide positions in Sichuan. (The red dashed lines represent the elevation rapid changing zone.)

Sichuan Province mainly has a subtropical humid and semi-humid climate, with annual rainfall over 1000 mm, and is warm throughout the year. Rainfall-induced landslides have become the main risk that engineers face in engineering design, construction and maintenance in Sichuan Province (Fig. 5).

3.2 Spatial database used in the study

3.2.1 Collection of landslide samples

Collecting landslide samples is a key step in landslide risk assessment, because the samples are used to train the BP neural network (i.e., to determine its weights and thresholds) and to validate the accuracy of the models. Therefore, 100 landslides that occurred in Sichuan Province were carefully investigated and collected in this work. The distribution of the landslide samples adopted in this paper is shown as black and red points in Fig. 4. The samples come from two sources: (1) 40 landslide samples that occurred along the G213 road, the G317 road and the Yaxi Highway in Sichuan Province (red points in Fig. 4) and (2) 60 landslide samples collected from the literature, which occurred in Sichuan Province from 2000 to 2017 (black points in Fig. 4).

These landslide samples share the following features: (1) they have an accurate location, a time record and detailed information on the sliding volume and damage, (2) they are rainfall-induced shallow landslides, and (3) no strong protective measures (such as a retaining wall or a frame beam) were in place before the landslide occurred.

To show the features of the landslide samples used in this study, two typical landslide samples (see Fig. 6) are described in detail here.

Fig. 6 Investigation of landslide samples: a landslide that occurred at Miyaro Tunnel exit; b landslide that occurred at G213 line

The landslide at the Miyaro Tunnel exit (Fig. 6a) occurred on June 20, 2017, after nearly a week of rain. The main overburden is clay, on which lush vegetation grows. The landslide volume is about 2000 m³, and the hazard blocked the exit of the tunnel, which was under construction, for 2 days.

The landslide at the G213 road (Fig. 6b) is close to a bridge under construction and occurred on July 23, 2013, after heavy rain. The main overburden is a mixture of sandy soil and gravel, with almost no vegetation on it. The landslide volume is about 5000 m³, and the landslide damaged some power facilities. More seriously, it caused some casualties, and the rescue work after the disaster was costly.

3.2.2 Condition factor selection and fundamental maps preparation

Many condition factors can affect the risk of landslides, and the influence of each condition factor differs from region to region. To select suitable condition factors for Sichuan Province, its specific natural geography and climate are considered. In this paper, the selected condition factors are elevation (X1), vegetation index (X2), slope (X3), average annual rainfall (X4), surface cutting density (X5) and overburden soil type (X6), according to expert opinions and the results of previous studies (Bui et al. 2016; Fuheng 2004; Baocheng 2011; Zhou et al. 2018).

To do the risk assessment, condition factors for each landslide sample need to be obtained, and some fundamental maps are necessary. Usually, the fundamental maps can be stored on the GIS platform, and the fundamental maps adopted in this paper consist of: (1) Digital Elevation Map (DEM, pixel size of 30 m) of Sichuan Province, (2) Vegetation Index Distribution Map of Sichuan Province, (3) Annual Average Rainfall Map of Sichuan Province (statistics from 2015), (4) Soil Distribution Map of Sichuan Province, (5) Slope Angle Map of Sichuan Province, and (6) Surface Cutting Density Map of Sichuan Province. In the above maps, (1)–(4) can be obtained from the Resource and Environmental Science center of the Chinese Academy of Sciences, and (5) and (6) can be obtained by processing the DEM map on GIS platform.

3.2.3 Data pre-processing

The magnitude scales of the above six condition factors described in the previous section are different from each other, which can cause unreliability in the risk assessment. For example, the value of rainfall is about 1000 mm, but the value of vegetation index is less than 1 (dimensionless quantity), so the effect of vegetation index would be ignored if these values are adopted directly in the models. Therefore, the process of normalization for the six condition factors has to be done before use.

Normalizing the six condition factors is complex, because determining the scale for the normalization should take into account the geological environment and the interaction between any two condition factors. To simplify the procedure, the magnitudes of the condition factors in this paper are normalized to a value from 0 to 100. For elevation, slope, average annual rainfall and surface cutting density, a statistical method is used to determine the scale, which comprises the following successive steps: (1) sort the landslide samples from small to large for each condition factor, (2) draw the statistical curve, of which the x-axis is the sample number and the y-axis is the value, (3) divide the area below the curve into five equal parts, (4) take the values at the dividing boundaries as the scale values that form the five intervals (i.e., [0,20], [20,40], [40,60], [60,80], [80,100]) and (5) normalize the magnitudes to a value from 0 to 100. The whole process is shown in Fig. 7. The scales of vegetation index and overburden soil type, in contrast, are determined according to previous studies (Chen et al. 2018; Fuheng 2004; Baocheng 2011). The normalization ranges of the six condition factors are shown in Table 5.
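The five-step scale determination can be sketched as follows. This is one possible reading of the procedure, given as an assumption: the area under the sorted curve is approximated by the running sum of the sorted values, the scale values are the sample values at which that sum crosses each fifth of the total, and raw values are mapped linearly within each interval. The function names, the lower and upper bounds and the ten sample values are hypothetical; the actual ranges are those of Table 5.

```python
def equal_area_boundaries(values, parts=5):
    """Steps (1)-(4): sort the values, accumulate the area under the sorted curve,
    and return the values at which the area crosses 1/parts, 2/parts, ... of the total."""
    ordered = sorted(values)
    total = sum(ordered)
    boundaries, running, target = [], 0.0, 1
    for v in ordered:
        running += v
        while target < parts and running >= total * target / parts:
            boundaries.append(v)
            target += 1
    return boundaries


def normalize(value, lo, boundaries, hi):
    """Step (5), assumed piecewise linear: each interval between scale values spans 20 points."""
    edges = [lo] + list(boundaries) + [hi]
    width = 100 / (len(edges) - 1)
    value = min(max(value, lo), hi)          # clamp to the covered range
    for n in range(len(edges) - 1):
        left, right = edges[n], edges[n + 1]
        if value <= right:
            frac = 0.0 if right == left else (value - left) / (right - left)
            return n * width + frac * width
    return 100.0


values = list(range(1, 11))                  # hypothetical raw factor values
boundaries = equal_area_boundaries(values)   # total area 55 -> boundaries [5, 7, 8, 9]
score = normalize(6, lo=1, boundaries=boundaries, hi=10)   # halfway through [5, 7] -> 30.0
```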

Fig. 7 Scale determining: a elevation, b slope, c average annual rainfall, d surface cutting density

Table 5 Normalization range

After normalization, the magnitudes of all condition factors are located in the range from 0 to 100. Finally, the maps for six condition factors are reclassified based on the GIS platform. Figure 8a–f shows the maps of elevation, vegetation index, slope, average annual rainfall, surface cutting density and overburden soil type, respectively. The black points in the maps show the relationship between landslide locations and the values of condition factors.

Fig. 8 Landslide condition factors: a elevation, b vegetation index, c slope, d rainfall, e cutting density, f soil types. (The black points represent the location of landslide samples in this study.)

The risk values of the 100 landslide samples need to be evaluated in advance against evaluation criteria. In some studies (Fuheng 2004; Baocheng 2011; Li et al. 2010), evaluation criteria for risk levels were proposed based on the volume, casualties, economic losses and repair time caused by landslides; these criteria, shown in Table 6, are adopted in this paper. We check the descriptions and field records of the 100 landslide samples, compare them with the evaluation criteria, and then assign risk values to the samples. Finally, according to the geographic coordinates of each sample, the six condition factors are extracted from the reclassified maps on the GIS platform. The condition factors and risk values together form the database of the risk assessment model. As an illustration, the detailed records of the two landslide samples described in Sect. 3.2.1 are shown in Table 7.

Table 6 Risk evaluation
Table 7 Detailed evaluation of landslide samples (N for landslide sample number, L for location, NV for normalized value, R for risk value, number 1 for the landslide occurred at Miyaro Tunnel exit, number 2 for the landslide that occurred at G213 line)

3.2.4 Correlation of six condition factors

When building a landslide risk assessment model, a strong correlation between the condition factors can considerably reduce the accuracy of the model. Therefore, a correlation test must be carried out before model training so that strongly correlated condition factors can be removed. For landslide condition factors, the Pearson correlation index is usually adopted for the correlation test (Booth et al. 1994; Bui et al. 2011).

The Pearson correlation index is a measure of the linear correlation between two continuous variables (X, Y). It is computed by Eq. 18, and its value lies in the range [−1, 1]. In general, when the absolute value of the Pearson index between two variables is greater than 0.7, the two variables are considered strongly correlated.

$$r = \frac{{N\sum {XY} - \sum X \sum Y }}{{\sqrt {N\sum {X^{2} } - \left( {\sum X } \right)^{2} } \sqrt {N\sum {Y^{2} } - \left( {\sum Y } \right)^{2} } }}$$
(18)

where N is the number of samples.
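The Pearson correlation index can be computed directly from the five sums that appear in its standard computational form. The sketch below uses hypothetical values chosen so that the results are exactly +1 and −1, which makes the formula easy to check.

```python
import math


def pearson(x, y):
    """Pearson correlation index between two equally long lists of values."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / (
        math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    )


r_up = pearson([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly increasing relation
r_down = pearson([1, 2, 3], [3, 2, 1])       # perfectly decreasing relation
```

Applied to every pair of condition factors, the function yields the kind of matrix against which the 0.7 threshold is checked.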

However, pairwise Pearson correlation analysis may miss correlations that involve more than two variables. To overcome this disadvantage, the variance inflation factor (VIF) is introduced to perform the multicollinearity analysis for k variables: each variable is regressed on the others with a linear model of the form of Eq. 19, and the VIF value is then computed by Eq. 20. Usually, the higher the VIF value, the more likely multicollinearity is present. When the VIF value is greater than 10, strong multicollinearity between the variables can be diagnosed.

$$Y = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \cdots + \beta_{k} X_{k} + \varepsilon$$
(19)
$${\text{VIF}}_{i} = \frac{1}{{1 - R_{i}^{2} }}$$
(20)

where \(R_{i}^{2}\) is the coefficient of determination (Eq. 14) obtained when the ith variable is regressed on the remaining variables.
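The VIF of Eq. 20 can be illustrated with a small pure-Python sketch that regresses one variable on the others by ordinary least squares and then applies 1 / (1 − R2). The helper `solve`, the use of the normal equations and the two short columns are assumptions made to keep the example self-contained; R2 is computed as 1 − SSres/SStot, which equals SSR/SST for a least-squares fit with an intercept.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def vif(columns, i):
    """VIF of column i: regress it on the other columns (with intercept), then 1 / (1 - R2)."""
    target = columns[i]
    others = [col for j, col in enumerate(columns) if j != i]
    n = len(target)
    design = [[1.0] + [col[r] for col in others] for r in range(n)]
    p = len(design[0])
    xtx = [[sum(design[r][a] * design[r][b] for r in range(n)) for b in range(p)]
           for a in range(p)]
    xty = [sum(design[r][a] * target[r] for r in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bc * xv for bc, xv in zip(beta, row)) for row in design]
    mean = sum(target) / n
    ss_res = sum((t - f) ** 2 for t, f in zip(target, fitted))
    ss_tot = sum((t - mean) ** 2 for t in target)
    r2 = 1 - ss_res / ss_tot
    return 1 / (1 - r2)


x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 3.0, 2.0, 4.0]       # Pearson index with x1 is 0.8, so R2 = 0.64
vif_x1 = vif([x1, x2], 0)       # 1 / (1 - 0.64), about 2.78
```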

In this paper, the six condition factors (Xi) and the risk values (Y) of the 100 landslide samples are checked for strong correlation by means of the Pearson correlation test and the VIF value. The Pearson correlation test results (Table 8) indicate that the absolute value of the Pearson index between any two condition factors is less than 0.7, so the six condition factors can be considered to have no strong correlation. The VIF results are shown in Table 9. The maximum VIF value is 1.829 and the minimum is 1.164, which shows that the multicollinearity between the variables is low.

Table 8 The Pearson correlation indexes of six condition factors (x1 for elevation, x2 for vegetation index, x3 for slope, x4 for rainfall, x5 for cutting density, x6 for soil types and Y for risk value)
Table 9 The VIF value of six condition factors

Based on the Pearson correlation index and the VIF value, it can be concluded that there is no strong correlation between the condition factors, so they are suitable for the models.

3.2.5 Preparation of training samples and validating samples

In the process of the landslides risk assessment model building, landslide samples are generally divided into training samples and validating samples (Aditian et al. 2018; Zhu et al. 2017). Training samples are adopted to train the neural networks and update the weights and thresholds, and validating samples are adopted to validate whether the performance of the model meets the expected goals. Usually, the ratio of training samples and validating samples is 3:1, and so 75 samples are selected as the training samples and 25 samples as the validating samples. The distribution of training samples (the red circles) and validating samples (the blue triangles) is shown in Fig. 9a, and the risk levels of these samples are shown in Fig. 9b.
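The 3:1 division can be sketched as a seeded random split. How the samples were actually assigned to the two groups is not stated here, so the random shuffle, the seed and the function name are assumptions for illustration only.

```python
import random


def split_samples(sample_ids, train_fraction=0.75, seed=0):
    """Shuffle the sample identifiers and split them into training and validating sets."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


train, validate = split_samples(range(1, 101))   # 100 landslide samples numbered 1..100
```

Fixing the seed makes the split reproducible, so that the three models are trained and validated on exactly the same samples.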

Fig. 9 Description of landslide samples: a the distribution of landslide samples; b number of the landslide samples in five risk levels

4 Results and analysis

4.1 Risk assessments of BP, GA-BP and PSO-BP models

4.1.1 The evaluation of the overall risk values from three models

As mentioned before, there are 75 training samples and 25 validating samples, and the risk assessment results are evaluated on the validating samples. Figure 10a–c compares the curves of the estimated risk value (the red curves) with the curves of the true risk value (the black curves) for the three models; the errors (the blue triangles) are also drawn. Overall, the estimated and true risk curves have similar amplitudes and tendencies in all three models, which supports the applicability and reliability of the three models for landslide risk assessment. The error scatters are distributed uniformly around the zero line, and the linear fitting lines of the errors (the blue straight lines) show a positive systematic error in each of the three models. To highlight the differences between the errors of the three models, the error frequency histograms are drawn in Fig. 11. The absolute errors of the GA-BP model and the PSO-BP model are more concentrated near zero than those of the BP model, which indicates that the errors of the two optimized models are smaller and more controllable.

Fig. 10 The test results of 25 validating samples: a BP model, b GA-BP model, c PSO-BP model

Fig. 11 The error frequency histograms of 25 validating samples: a BP model, b GA-BP model, c PSO-BP model

The linear fit curves from the three models are drawn in Fig. 12. The 1:1 line (the 45° black line) serves as the reference for 100% accuracy. The linear fit curves of the GA-BP model and the PSO-BP model match the 1:1 line well, whereas the linear fit curve of the BP model shows an obvious gap from it. In this respect, the GA-BP model and the PSO-BP model are better than the BP model. Furthermore, the linear fit curves of the GA-BP model and the PSO-BP model are very close to the 1:1 line when the true risk values are larger than 40, which indicates that the two optimized models are suitable for the assessment of high-risk landslides. In contrast, the BP model underestimates the risk as the true risk values get larger, which is unfavorable for engineering applications.

Fig. 12 Linear regression statistics

To evaluate the three models quantitatively, the indices mentioned in Sect. 2.2.1 are computed and listed in Table 10. The results show that the GA-BP model is the best of the three, with the lowest RMSE (11.500) and the highest coefficient of determination (R² = 0.771), followed by the PSO-BP model (RMSE = 14.111, R² = 0.655). In terms of training time, the BP model and the PSO-BP model take less time to finish training than the GA-BP model, but the gap is small. On the whole, the GA-BP model is characterized by good accuracy and an acceptable training time, whereas the PSO-BP model is characterized by a shorter training time.

Table 10 The statistics of indices for the evaluation to the three models
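For reference, the two accuracy indices reported in Table 10 can be computed as in the following sketch. It assumes the standard definitions of RMSE and of the coefficient of determination (one minus the ratio of residual to total sum of squares); the definitions in Sect. 2.2.1 may differ in detail, and the sample values are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between true and estimated risk values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical risk values, not taken from the paper
true_risk = [20.0, 40.0, 60.0, 80.0]
estimated_risk = [25.0, 35.0, 65.0, 75.0]
print(rmse(true_risk, estimated_risk), r_squared(true_risk, estimated_risk))
```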

4.1.2 The evaluation of risk levels from three models

Compared with the risk value, the risk level is more likely to attract people’s attention. Therefore, it is important to assess the risk level accurately. In this section, the Kappa coefficients mentioned in Sect. 2.2.2 are computed to quantify the accuracy of the risk level. Computing the Kappa coefficients requires contingency tables, which count how many risk levels are estimated correctly or incorrectly. The rows correspond to the true risk level, and the columns correspond to the assessed risk level. The numbers on the diagonal are the numbers of correctly assessed risk levels.

Tables 11, 12 and 13 are the contingency tables from the BP, GA-BP and PSO-BP models, respectively. They show that the GA-BP model correctly assesses the risk level of 18 of the 25 validating samples, followed by the PSO-BP model (15 samples) and the BP model (13 samples). The Kappa coefficients are then computed from the data in the contingency tables and listed in Table 14. The Kappa coefficient of the GA-BP model is the highest (0.665), followed by the PSO-BP model (0.532) and the BP model (0.461), indicating that the GA-BP model and the PSO-BP model give better assessment accuracy than the BP model. According to the statistical convention adopted here, a classification result is considered reliable if the Kappa coefficient is greater than 0.5, so the assessment results from the GA-BP and PSO-BP models are reliable.

Table 11 The contingency table from the BP model. (Row head is for true value, and column head is for estimated value.)
Table 12 The contingency table from the GA-BP model. (Row head is for true value, and column head is for estimated value.)
Table 13 The contingency table from the PSO-BP model. (Row head is for true value, and column head is for estimated value.)
Table 14 Kappa coefficient
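The step from a contingency table to a Kappa coefficient can be illustrated with a short sketch. It assumes the usual Cohen's Kappa definition (observed agreement corrected for chance agreement), which may differ in notation from Sect. 2.2.2. The small table below is hypothetical and is not one of Tables 11 to 13; the same function applies unchanged to a 5 × 5 table of risk levels.

```python
import numpy as np

def kappa_from_table(table):
    """Cohen's Kappa from a square contingency table.

    Rows are true levels and columns are estimated levels.
    """
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p_observed = np.trace(table) / n                  # share on the diagonal
    p_chance = np.sum(table.sum(axis=1) * table.sum(axis=0)) / n ** 2
    return float((p_observed - p_chance) / (1.0 - p_chance))

# Hypothetical two-level table, used only to show the arithmetic
example_table = [[20, 5],
                 [10, 15]]
kappa = kappa_from_table(example_table)
print(kappa)
```

In this example the observed agreement is 35/50 = 0.7 and the chance agreement is 0.5, so the Kappa coefficient is (0.7 − 0.5)/(1 − 0.5) = 0.4.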

The receiver operating characteristic (ROC) curve is an established method for evaluating the accuracy of classification models, and the area under the curve (AUC) expresses that accuracy quantitatively. To cross-check the results of the Kappa coefficients, this paper also draws the ROC curves of the three models, as shown in Fig. 13.

Fig. 13 The ROCs from three models

The results show that the GA-BP model gets the highest AUC value, followed by the PSO-BP model and the BP model, which matches well with the results of Kappa coefficients.
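As an illustration of how an AUC value can be obtained, the sketch below uses the rank-based form of the AUC: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. How the five risk levels were reduced to a binary problem for Fig. 13 is not stated in the text, so the binarization used here (true risk value of 60 or more, that is, 'High' or 'Very high') and all sample values are assumptions.

```python
import numpy as np

def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present")
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

# Hypothetical validating samples, not taken from the paper
true_risk = np.array([25.0, 45.0, 65.0, 85.0, 70.0, 35.0])
estimated_risk = np.array([30.0, 62.0, 58.0, 90.0, 75.0, 40.0])
is_high = true_risk >= 60.0  # assumed binarization: 'High' or 'Very high'
auc = auc_from_scores(is_high, estimated_risk)
print(auc)
```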

4.2 Landslide risk zoning maps

A geographic information system (GIS) is an effective tool for visualizing geographic information; in particular, it allows the landslide risk zoning maps (LRZMs) to be combined with the neural network models. Using GIS, the LRZMs of Sichuan Province from the three models are drawn in Fig. 14.

Fig. 14 Landslide risk zoning mapping: a BP model, b GA-BP model, c PSO-BP model

The results show that the estimated landslide risk values range from 28.8 to 98.8 for the BP model, from 32.9 to 99.9 for the GA-BP model and from 32.8 to 99.9 for the PSO-BP model. The estimated risk values are divided into five risk levels as mentioned before (Very low 0–30, Low 30–40, Medium 40–60, High 60–80, Very high 80–100). The medium-risk areas (about 23.9% from the GA-BP model) and the high-risk areas (about 58.0% from the GA-BP model) dominate Sichuan Province, indicating that the landslide hazard faced by the province is serious.

To validate the reliability of the LRZMs, they are compared with existing maps. Firstly, the LRZMs are compared with the global susceptibility maps (GSMs) produced by Stanley and Kirschbaum (2017) using a fuzzy overlay model. In the GSMs, most of Sichuan Province is rated as ‘High’ or ‘Very High’. In this paper, the ‘High’ and ‘Very High’ levels together cover 55.4% of the province in the BP model, 62.8% in the GA-BP model and 63.1% in the PSO-BP model. Furthermore, the LRZMs from the three models identify the high-risk regions in the elevation rapid changing zone (ERCZ) and the low-risk regions in the Chengdu Plain, which suggests that the LRZMs in this paper are more accurate. Secondly, the LRZMs are compared with the Chinese landslide susceptibility classification map (CLSCM, shown in Fig. 15), which was produced from 60 years of historical landslide data (Liu et al. 2013). In the CLSCM, the high and very-high-risk regions are mainly distributed along the ERCZ. The LRZMs from the GA-BP and PSO-BP models likewise place the high and very-high-risk regions almost entirely along the ERCZ.

Fig. 15 Results from the Chinese landslide susceptibility classification map in Chun Liu’s work (Liu et al. 2013)
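The division of estimated risk values into the five risk levels, and the area share of each level on a zoning map, can be sketched as follows. The thresholds are those given in the text; treating each lower bound as inclusive is an assumption, and the pixel values are hypothetical.

```python
import numpy as np

LEVEL_NAMES = ["Very low", "Low", "Medium", "High", "Very high"]
LEVEL_EDGES = [30.0, 40.0, 60.0, 80.0]  # interior thresholds from the text

def risk_levels(risk_values):
    """Map risk values (0-100) to level indices 0..4 (lower bound inclusive)."""
    return np.digitize(np.asarray(risk_values, dtype=float), LEVEL_EDGES)

def level_shares(risk_values):
    """Fraction of pixels (equal-area cells assumed) falling in each level."""
    counts = np.bincount(risk_levels(risk_values), minlength=len(LEVEL_NAMES))
    return counts / counts.sum()

# Hypothetical pixel values, not taken from the zoning maps
pixels = [28.8, 32.9, 55.0, 79.9, 80.0, 99.9]
levels = risk_levels(pixels)
shares = level_shares(pixels)
for name, share in zip(LEVEL_NAMES, shares):
    print(f"{name}: {share:.1%}")
```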

The above results for the overall risk value accuracy, the risk-level accuracy and the risk zoning maps show that the GA-BP model and the PSO-BP model are both applicable and reliable for landslide risk assessment in a large region. For the landslide samples from Sichuan Province, however, the GA-BP model obtains the highest accuracy in the evaluation of both the risk values and the risk levels. In this paper, the GA-BP model is therefore considered the best of the three models for landslide risk assessment in Sichuan Province.

4.3 Validation of the proposed models in another region

The two proposed models in this paper are mainly analyzed for a region with a large area. To complement this study, a risk assessment of a region with a common area (about 400 km²) is carried out to demonstrate the applicability of the models at that scale. The S301 road is a provincial road (about 120 km long) from Jiuzhaigou County to Songpan County in Sichuan Province (see Fig. 16). During the rainy season in 2017, about 30 landslide events occurred along the first 70 km of the S301 road. A quick risk assessment using the GA-BP model is performed; for lack of space, only the risk mapping results are shown here (Fig. 16). The high-risk pixels largely coincide with the positions of the landslides. This suggests that the proposed models have good accuracy for risk assessment in regions of both large and common area.

Fig. 16 Landslide risk zoning mapping results of the S301 road from the PSO-BP model

4.4 Applicability of three models with more hidden layers

It is known that, as the number of hidden layers increases, the BP neural network becomes more expressive and can fit nonlinear functions more closely. It is therefore necessary to discuss the applicability of the GA-BP and PSO-BP models with two or more hidden layers. In this paper, the number of hidden layers is increased to two, and the new models are tested using the 100 landslide samples. The topology of the new BP neural network is shown in Fig. 17. The test time and accuracy results are listed in Table 15.

Fig. 17 The topology of the BP neural network with two hidden layers

Table 15 The statistics of indices for the evaluation of the three models with two hidden layers

The results show that: (1) when the number of hidden layers increases to 2, the accuracy improves and the training time increases; (2) when the number of hidden layers increases to 2, the accuracy of the PSO-BP model is close to that of the GA-BP model, and its training time is much shorter. The comparison indicates that the GA-BP model with two hidden layers gives the most accurate estimation of landslide risk for the study region, but it should be noted that the difference between the GA-BP model and the PSO-BP model has become smaller.

However, as the number of hidden layers increases, attention must be paid to overfitting. In this work, only 100 landslide samples are available, so the number of hidden layers cannot be large: with too few samples for training, a model with more hidden layers overfits easily. In the models with two hidden layers, the R² of the training samples is above 0.950, whereas the R² of the validating samples is below 0.800, which may be a sign of overfitting. For landslide risk assessment in Sichuan Province, there may be an optimal number of hidden layers, or of nodes per hidden layer, for the three models. This will be investigated in future work.
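The overfitting concern can be made concrete by counting trainable parameters. The sketch below builds a network with six inputs (the six condition factors), two hidden layers and one output, and compares its number of weights and thresholds with that of a single-hidden-layer network. The hidden layer size of 8, the sigmoid activation and the linear output are assumptions chosen for illustration, not the configuration used in this paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_network(layer_sizes, seed=0):
    """Create (weights, thresholds) pairs for consecutive layers."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.5, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(params, x):
    """Forward pass: sigmoid hidden layers, linear output (assumed)."""
    a = np.asarray(x, dtype=float)
    for w, b in params[:-1]:
        a = sigmoid(a @ w + b)
    w, b = params[-1]
    return a @ w + b

def count_parameters(layer_sizes):
    """Number of weights plus thresholds in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

two_hidden = [6, 8, 8, 1]   # hypothetical layer sizes
one_hidden = [6, 8, 1]
params = init_network(two_hidden)
y = forward(params, np.full((3, 6), 0.5))  # three dummy samples
print(y.shape, count_parameters(one_hidden), count_parameters(two_hidden))
```

With these assumed sizes, the two-hidden-layer network has 137 parameters against 65 for the single-hidden-layer network, while only 75 training samples are available, which is consistent with the gap between training and validating R² noted above.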

5 Discussion

5.1 Reliability about weights of condition factors

In landslide risk assessment, the weights of the condition factors are the main influence on the estimated risk of landslides (Pavel et al. 2008). Therefore, the weights of the condition factors deserve some discussion. In this paper, when the model training stops, the weights of the six condition factors can be obtained from \(w_{ik}\) and \(w_{kj}\) (Table 16). Ranked from largest to smallest, the weights are slope, elevation, surface cutting density, overburden soil type, average annual rainfall and vegetation index in the BP model and the GA-BP model, and elevation, slope, surface cutting density, overburden soil type, average annual rainfall and vegetation index in the PSO-BP model. In general, the slope is considered the most influential condition factor for landslide risk (Costanzo et al. 2012). This is supported by the BP and GA-BP models, in which the slope ranks first, and is broadly consistent with the PSO-BP model, in which it ranks second. However, the weights of the vegetation index and the rainfall are much smaller in the BP model, and this is the main reason for the accuracy gap between the BP model and the optimized models.

Table 16 Condition factor weights computed by the three models
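One common way to derive factor weights from the connection weights of a trained single-hidden-layer network is a Garson-style partition of the absolute weights. The sketch below is given only as an illustration of that idea; it is an assumption that the weights in Table 16 follow a comparable scheme, since the text only states that they are obtained from \(w_{ik}\) and \(w_{kj}\). The weight values in the example are hypothetical.

```python
import numpy as np

def garson_importance(w_ik, w_kj):
    """Garson-style relative importance of each input factor.

    w_ik: input-to-hidden weights, shape (n_inputs, n_hidden)
    w_kj: hidden-to-output weights, shape (n_hidden,) or (n_hidden, 1)
    """
    w_ik = np.abs(np.asarray(w_ik, dtype=float))
    w_kj = np.abs(np.asarray(w_kj, dtype=float)).reshape(-1)
    # Share of each input in every hidden node, scaled by that node's
    # connection to the output, then summed over hidden nodes.
    share = w_ik / w_ik.sum(axis=0, keepdims=True)
    contribution = (share * w_kj).sum(axis=1)
    return contribution / contribution.sum()

# Hypothetical weights for two inputs and two hidden nodes
w_ik_example = [[1.0, 1.0],
                [1.0, 3.0]]
w_kj_example = [1.0, 1.0]
importance = garson_importance(w_ik_example, w_kj_example)
print(importance)
```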

5.2 Errors

In the error curves shown earlier (Fig. 10), the errors of the 5th, 10th, 19th, 24th and 25th samples of the BP model are too large (error > 15). After optimization by the GA and the PSO, the errors of these samples are reduced, but not to an acceptable level. These five samples are defined as ‘outliers’ in this paper. The existence of these outliers is the main reason for the reduction in model accuracy. In general, the larger the region is, the more outliers there will be.

Two types of outliers from the original data are analyzed here, as shown in Fig. 18. From the results above, the risk value is positively correlated with the elevation, slope and surface cutting density, but the outliers show the opposite pattern. The first type of outlier (see sample point 1 in Fig. 18) has lower values of elevation, slope and surface cutting density but a higher risk value, whereas the second type (see sample point 2 in Fig. 18) has higher values of elevation, slope and surface cutting density but a lower risk value. The first type corresponds to landslides in the Chengdu Plain, which have a low probability of occurrence but a high risk because they are very close to cities and main roads. The second type corresponds to landslides in inaccessible mountains, which have a high probability of occurrence but a low risk because they are far from people, buildings and main roads. An effective way to improve the accuracy of the models is to remove the outliers from the training samples and the validating samples.

Fig. 18 Details of the outlier
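The outlier criterion used above can be written as a simple screening step. The sketch assumes that the error is the estimated risk value minus the true risk value and that the threshold of 15 applies to its absolute value; both are assumptions, and the sample values are hypothetical.

```python
import numpy as np

def flag_outliers(true_risk, estimated_risk, threshold=15.0):
    """Return 1-based sample numbers whose absolute error exceeds threshold."""
    errors = (np.asarray(estimated_risk, dtype=float)
              - np.asarray(true_risk, dtype=float))
    numbers = np.flatnonzero(np.abs(errors) > threshold) + 1
    return numbers.tolist(), errors

# Hypothetical validating samples, not taken from Fig. 10
true_risk = [50.0, 70.0, 35.0, 90.0, 45.0]
estimated_risk = [52.0, 50.0, 58.0, 88.0, 61.0]
outlier_numbers, errors = flag_outliers(true_risk, estimated_risk)
print(outlier_numbers)
```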

5.3 The performance of the GA-BP and PSO-BP models

Previous studies report that the PSO often outperforms the GA on some complex problems. In this work, however, the GA-BP model achieved the higher accuracy for landslide risk assessment in Sichuan Province. This may be due to the following reasons:

(1) The PSO in this work may have been trapped in a local optimum. This is more likely to happen when the number of training samples is small.

(2) The parameters set for the PSO in this work may limit its capability. For a specific problem, the optimal parameters need to be studied comprehensively. However, to the authors’ knowledge, no optimal PSO parameters for landslide risk assessment have been reported in the literature, so this paper uses general parameter values. This may influence the result.

(3) The number of landslide samples may not be enough for the PSO. As the number of landslide samples increases, the speed advantage of the PSO would become more evident.
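To make the role of the PSO parameters in reasons (1) and (2) more tangible, a minimal global-best PSO is sketched below. The inertia weight and acceleration coefficients are generic values often quoted for PSO, not the settings used in this paper, and the quadratic test objective is a stand-in: in a PSO-BP setting the objective would instead be the training error of the network as a function of its weights and thresholds. All names are illustrative.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=30, n_iter=100,
                 inertia=0.729, c1=1.494, c2=1.494,
                 bounds=(-1.0, 1.0), seed=0):
    """Minimal global-best PSO; parameter values are generic assumptions."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest_pos = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + pull to personal best + pull to global best
        vel = (inertia * vel
               + c1 * r1 * (pbest_pos - pos)
               + c2 * r2 * (gbest_pos - pos))
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
    return gbest_pos, gbest_val

# Stand-in objective with a single optimum at the origin
best_pos, best_val = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=4)
print(best_val)
```

Because every particle is attracted to the same global best, the swarm can settle early on an objective with many local optima, which is the behaviour referred to in reason (1); the inertia and acceleration coefficients control how strongly this happens.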

5.4 Limitations of the research

The number of landslide samples adopted in this article is only 100, which is clearly not enough for an accurate risk analysis of an area of 486,000 km². However, it is sufficient for comparing the three models. These 100 samples are widely distributed across Sichuan Province and therefore reflect the ‘large’ in ‘large region’. As the number of samples increases, the speed advantage of the PSO-BP model and the accuracy advantage of the GA-BP model are expected to become more important.

The risk analysis in this paper does not consider triggering factors such as rainfall intensity or duration; most factors in the analysis are related to the stability of the landslide itself and to the geological and geographical environment. A higher risk value therefore means a greater possibility of a disastrous landslide being triggered under the same conditions (such as the same rainfall intensity or duration). The advantage is that these models can evaluate the risk quickly and extensively over a large region, which is suitable for engineering applications. If real-time rainfall forecast data are included, real-time risk zoning maps could also be obtained.

6 Conclusions

This paper proposes two new models for landslide risk assessment in a large region, based on the genetic algorithm (GA) and the particle swarm optimization (PSO), and compares the proposed models to validate their accuracy and reliability. The main conclusions are as follows:

(1) For landslide risk assessment in a large region, the optimization of the BP neural network by the GA and the PSO mainly improves the accuracy. The GA-BP model has the highest accuracy, but its calculation speed is slow, and the PSO-BP model has the fastest calculation speed. Therefore, the GA-BP model is suitable for risk assessment when accuracy is the priority, and the PSO-BP model is suitable when speed is the priority.

(2) For the condition factors that mainly affect the risk of landslides in Sichuan Province, the reasonable rank of condition factor weights from high to low is slope, elevation, surface cutting density, overburden soil type, average annual rainfall and vegetation index.

(3) The landslide risk zoning maps of Sichuan Province show that the ‘Medium’ regions account for 24% (about 97,200 km²), and the ‘High’ and ‘Very high’ regions together account for 60% (about 291,600 km²). They are mainly distributed in the elevation rapid change zone. The results of this paper can contribute to the planning, design and construction of engineering projects in the mountainous regions of Sichuan Province.

(4) The applicability of the three models is influenced by the number of hidden layers, and future work is needed to determine the optimal number of hidden layers for the GA-BP model and the PSO-BP model.

(5) Conventional studies show that the PSO usually outperforms the GA, but this paper reaches the opposite conclusion. This may be caused by the limitation of the parameters set for the PSO and the lack of training samples. Therefore, more landslide samples in Sichuan Province will be collected, and this issue will be studied further in future work.