Introduction

Landslides are among the most prevalent and serious geological hazards in natural terrain, and landslide disasters can severely damage the environment and ecology (Görüm and Fidan 2021). The main triggering factors of landslides include earthquakes (Ji et al. 2021, 2020; Shinoda et al. 2019; Song et al. 2017; Tsai et al. 2019), continuous rainfall (Lee et al. 2020; Park et al. 2019), snowmelt (Naudet et al. 2008), and underground activities such as mining (Chen et al. 2021). For landslide risk control and disaster mitigation, it is vital to identify the possible distributions of triggering factors via a susceptibility analysis (Park et al. 2019). Approaches to landslide susceptibility analysis can be classified into five categories (Qin et al. 2019): (1) geological hazard mapping methods, (2) landslide survey methods, (3) machine learning-based methods, (4) statistical models, and (5) physically based modelling methods incorporating shallow slope stability analysis and the relevant material parameters.

Among these methods, those based on machine learning and statistical models rely on historical event data and ignore the intrinsic failure mechanisms and relevant physical parameters. Meanwhile, the physically based modelling method has been extensively applied lately owing to its greater predictive ability and its suitability for quantitatively evaluating the impact of individual parameters related to landslide occurrence (Fell et al. 2008). Recently, the physically based modelling method has been embedded into geographic information systems (GIS) to implement grid-based landslide susceptibility analysis over broad areas (Sorbino et al. 2010). For example, the method is commonly used to investigate rainfall-triggered shallow landslides, considering the hydrostatic rise and pore water pressure increase caused by rainfall infiltration. Notably, the physically based modelling approach frequently couples a slope stability model (e.g., the infinite slope) with a hydrological model (Lee and Park 2015).

Acquiring detailed geotechnical parameters is necessary for accurate landslide susceptibility analysis with physically based models. Nevertheless, for regional-scale landslide prediction, it is generally arduous to identify the actual slope angle, the thickness of the soil layer, and the groundwater level (Juang et al. 2019). Thus, physically based models are inevitably limited by simplifying hypotheses, restrictions, and various sources of uncertainty, which hinder accurate geological hazard assessment. Recently, some physically based landslide susceptibility models, such as SHALSTAB (Dietrich et al. 2001; König et al. 2019), TRIGRS (Baum et al. 2002; Weidner et al. 2018), and tRIBS (Arnone et al. 2011), and methods incorporating probabilistic analysis (Lee and Park 2015; Park et al. 2019) have been proposed. Further, the probabilistic physical modelling approach can handle the internal uncertainties associated with the underlying slope stability models and material properties (Escobar-Wolf et al. 2021; Park et al. 2019). For example, the infinite slope model with the Monte Carlo simulation (MCS) approach was adopted in the computer programme developed by Hammond (1992). Considering the computational burden of MCS, fast reliability methods, such as the first-order second-moment (FOSM) method, were introduced and even implemented as a software package in GIS. The principle is to propagate the uncertainties of the input parameters (variance or standard deviation of the variables) to predict the probability of shallow landslides (Haneberg 2007, 2004). Escobar-Wolf et al. (2021) noted that, with Haneberg's approach, users must pre-process the probabilistic analysis files in ASCII format and post-process the outputs to analyse them in the GIS environment. Hence, a GIS-based toolbox implementing the PISA-m algorithm was developed in the ArcPy environment.
Notably, in the engineering reliability community, FOSM is known to be limited as a tool for probabilistic analysis because it ignores the statistical distributions of, and correlations between, random variables, which can result in significant estimation errors (Ditlevsen 1973). Accordingly, FOSM has gradually been replaced by the first-order reliability method (FORM) (Low and Tang 2007), which overcomes these limitations. So far, however, there has been little discussion of applying the FORM reliability algorithm to the hazard mapping assessment of regional slopes. This is perhaps because FORM usually involves more computational effort than FOSM. Therefore, implementing FORM in GIS is regarded as a challenging task that, to our knowledge, has not been reported in the literature.

It is worth pointing out that the recursive FORM algorithm called the HLRF_x approach, with its fast convergence (Ji and Kodikara 2015; Ji et al. 2018), makes it possible to implement the probabilistic physical modelling of landslides in GIS. Accordingly, this study focuses on developing an effective tool called the ‘GIS-FORM landslide prediction’ toolbox, which is written in Python and runs as an extension of the ArcGIS 10.6 software to automatically perform landslide susceptibility analysis in seismic areas. The GIS toolbox can account for complete uncertainty information associated with the physically based model and can carry out regional seismic landslide assessment based on the calculated slope displacement.

Methodology

Infinite slope stability model for shallow landslide prediction

The physically based landslide analysis is frequently related to the earth slope stability model, which evaluates the forces applied to the slope. The failure surfaces of rainfall-induced or seismically induced natural landslides are generally shallow (upper few metres) (Jibson et al. 2000; Khazai and Sitar 2004; Okada and Konishi 2019). The infinite slope model (Fig. 1) is regarded as an extremely helpful model for shallow landslide prediction on the regional scale because it accounts for the terrain inclination, soil characteristics (strength, weight, and depth), water table level, and even vegetation coverage (Escobar-Wolf et al. 2021). The infinite slope model characterises the factor of safety (FoS) of the slope analytically.

Fig. 1 The infinite slope model: (a) Hammond model, (b) simplified model

As shown in Fig. 1a, the FoS calculated by the infinite slope model is the ratio of the average shear strength of the soil to the average shear stress developed along the potential failure surface based on the framework of the limit equilibrium analysis. The model can be easily extended to consider the influence of the water table (above the sliding plane) and the impacts of the vegetation, including the added weight of trees and increased strength from root cohesion (Escobar-Wolf et al. 2021; Hammond 1992). The corresponding FoS can be expressed as follows:

$${\text{FoS}}=\frac{{C}_{r}+{C}_{s}+\left[{q}_{t}+{\gamma }_{m}\left(D-{H}_{w}D\right)+\left({\gamma }_{s}-{\gamma }_{w}\right){H}_{w}D\right]{\mathrm{cos}}^{2}\beta \cdot \mathrm{tan}\phi }{\left[{q}_{t}+{\gamma }_{m}\left(D-{H}_{w}D\right)+{\gamma }_{s}{H}_{w}D\right]\mathrm{sin}\beta \cdot \mathrm{cos}\beta }$$
(1)

where Cr denotes the contribution of roots to the cohesive soil strength, Cs denotes the cohesive soil strength, ϕ denotes the soil internal friction angle, qt denotes the vegetation weight added to the slope, γm denotes the unsaturated (above the phreatic surface) soil unit weight, γs denotes the saturated soil unit weight, γw denotes the water unit weight (9.81 kN/m³), D denotes the slip depth, Hw denotes the pore water pressure ratio, and β denotes the terrain inclination (slope).
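As a minimal sketch of how Eq. 1 can be evaluated for a single cell, the following Python function transcribes the equation term by term. The function name, argument names, and the units noted in the docstring are assumptions made for illustration and are not taken from the toolbox.

```python
import math


def fos_hammond(c_r, c_s, phi_deg, q_t, gamma_m, gamma_s, depth, h_w,
                beta_deg, gamma_w=9.81):
    """Factor of safety of Eq. 1.

    Assumed units: cohesion and surcharge in kPa, unit weights in kN/m^3,
    depth in m, angles in degrees.
    """
    beta = math.radians(beta_deg)
    phi = math.radians(phi_deg)
    # Weight of the unsaturated part of the soil column above the water table
    moist = gamma_m * (depth - h_w * depth)
    # Numerator: cohesion plus frictional resistance from the effective normal stress
    resisting = (c_r + c_s
                 + (q_t + moist + (gamma_s - gamma_w) * h_w * depth)
                 * math.cos(beta) ** 2 * math.tan(phi))
    # Denominator: shear stress driven by the total weight of the column
    driving = ((q_t + moist + gamma_s * h_w * depth)
               * math.sin(beta) * math.cos(beta))
    return resisting / driving
```

For a dry, cohesionless slope without vegetation (Cr = Cs = qt = 0, Hw = 0), the expression reduces to tan ϕ / tan β, which gives a quick sanity check of the implementation.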

Seismic displacement prediction model

For the seismic condition, regional landslide susceptibility analysis usually requires further simplifying the infinite slope model to a few key geotechnical parameters (Jibson et al. 2000; Shinoda et al. 2019). As shown in Fig. 1b, assuming that the unit weight of a shallow landslide soil layer does not change with the saturation (i.e., \(\gamma_{m} \approx \gamma_{s}\)) and that the vegetation contribution is excluded from the stability analysis, Eq. 1 can be simplified as:

$${\text{FoS}}=\frac{{C}_{s}}{{\gamma }_{s}D\mathrm{sin}\beta \cdot \mathrm{cos}\beta }+\frac{\mathrm{tan}\phi }{\mathrm{tan}\beta }-\frac{{\gamma }_{w}{H}_{w}\cdot \mathrm{tan}\phi }{{\gamma }_{s}\mathrm{tan}\beta }$$
(2)

Furthermore, seismic stability analysis can be carried out using the Newmark displacement method (Newmark 1965). The critical acceleration (ac) defining the initial movement of landslide is simply given by the following:

$${a}_{c}=\left({\text{FoS}}-1\right)g\cdot \mathrm{sin}\alpha$$
(3)

where g is the gravitational acceleration, FoS is the static factor of safety given by Eq. 2, and α denotes the slope inclination (the angle written as β in Eqs. 1 and 2).
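Equations 2 and 3 can be sketched in Python as follows. The function names and units are hypothetical; the critical acceleration is returned in units of g, and clamping it to zero for cells with FoS ≤ 1 is an assumption of this sketch rather than a documented behaviour of the toolbox.

```python
import math


def fos_simplified(c_s, phi_deg, gamma_s, depth, h_w, beta_deg, gamma_w=9.81):
    """Static factor of safety of Eq. 2 (assumed units: kPa, kN/m^3, m, degrees)."""
    beta = math.radians(beta_deg)
    phi = math.radians(phi_deg)
    cohesion_term = c_s / (gamma_s * depth * math.sin(beta) * math.cos(beta))
    friction_term = math.tan(phi) / math.tan(beta)
    pore_term = gamma_w * h_w * math.tan(phi) / (gamma_s * math.tan(beta))
    return cohesion_term + friction_term - pore_term


def critical_acceleration(fos, slope_deg):
    """Critical acceleration of Eq. 3, expressed in units of g.

    Assumption: statically unstable cells (FoS <= 1) are assigned a_c = 0.
    """
    return max(fos - 1.0, 0.0) * math.sin(math.radians(slope_deg))
```

For instance, a cell with FoS = 1.5 on a 30° slope has a critical acceleration of about 0.25 g under this formula.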

After obtaining the ac, the cumulative displacement of the landslide during the seismic shaking period can be computed by Newmark’s integration of the ground acceleration. However, in regional-scale seismic hazard assessment, applying the Newmark displacement method to every slope unit is still time-consuming and labour-intensive. Thus, empirical formulas/models are employed for regional-scale seismic displacement prediction (Du and Wang 2016; Jibson et al. 2000; Song et al. 2017, 2021). We considered both Newmark’s two-stage integration method and Jibson’s (2007) logarithmic formula model (Eq. 4) to facilitate the coseismic landslide prediction in our toolbox.

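$$\mathrm{log}{D}_{\text{Jibson}}=0.215+\mathrm{log}\left[{\left(1-\frac{{a}_{c}}{{a}_{\mathrm{max}}}\right)}^{2.341}{\left(\frac{{a}_{c}}{{a}_{\mathrm{max}}}\right)}^{-1.438}\right]\pm 0.510$$
(4)

where \(D_{\text{Jibson}}\) denotes the predicted Newmark displacement (in centimetres), \(a_{\mathrm{max}}\) denotes the peak ground acceleration, and the last term is the standard deviation of the regression.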
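As an illustration of the empirical displacement model, the sketch below evaluates the median estimate of Jibson’s (2007) critical acceleration ratio regression in its published form, log D = 0.215 + log[(1 − ac/amax)^2.341 (ac/amax)^−1.438], with D in centimetres. The function name and the handling of the two limiting cases are assumptions of this sketch.

```python
import math


def jibson_displacement_cm(a_c, a_max):
    """Median Newmark displacement (cm) from Jibson's (2007) regression.

    a_c and a_max must share the same unit (e.g. both in g). The +/- 0.510
    scatter term of the regression is not applied here.
    """
    if a_c <= 0.0 or a_max <= 0.0:
        # Assumption: the regression is treated as undefined for these inputs
        raise ValueError("a_c and a_max must be positive")
    ratio = a_c / a_max
    if ratio >= 1.0:
        # Shaking never exceeds the critical acceleration: no sliding
        return 0.0
    log_d = 0.215 + math.log10((1.0 - ratio) ** 2.341 * ratio ** -1.438)
    return 10.0 ** log_d
```

Because the regression depends only on the ratio ac/amax, the displacement grows quickly as the critical acceleration becomes small relative to the peak ground acceleration.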

Probabilistic stability prediction model using FORM

FORM is a semi-probabilistic calculation method that is well known in geotechnical failure analysis. Given the physical model describing slope stability performance, it can quickly calculate the probability of failure (i.e., the landslide susceptibility) using the marginal distribution statistics of the uncertain input parameters (Hasofer and Lind 1974). The fundamental concept of the FORM calculation is to find the reliability index (RI) (Low and Tang 2007), which represents the minimum distance from the vector of mean values (MV) to the most probable point of failure (MPP). In this work, we adopt the recursive algorithm proposed by Ji and Kodikara (2015) and Ji et al. (2019b) to implement the FORM calculation procedure in GIS. In brief, the recursive algorithm for locating the MPP in the space of random variables defined by vector x (x-space) is written as follows:

$${x}_{k+1}={\mu }_{k}^{N}+\frac{1}{\nabla g{({x}_{k})}^{T}{T}_{k}\nabla g({x}_{k})}\left[\nabla g{({x}_{k})}^{T}\left({x}_{k}-{\mu }_{k}^{N}\right)-g({x}_{k})\right]{T}_{k}\nabla g({x}_{k})$$
(5)

where \({T}_{k}={\left[{\sigma }_{k}^{N}\right]}^{T}R\left[{\sigma }_{k}^{N}\right]\) is the transformation matrix, \({x}_{k}\) is the vector of random variables in x-space at the kth iteration, and \({\mu }_{k}^{N}\) is the vector of equivalent normal MV used for converting the random variables into u-space. Further, \(\left[{\sigma }_{k}^{N}\right]=\left[\begin{array}{ccc}{\sigma }_{k,1}^{N}& \cdots & 0\\ \vdots & \ddots & \vdots \\ 0& \cdots & {\sigma }_{k,m}^{N}\end{array}\right]\) is a diagonal matrix in which \({\sigma }_{k,i}^{N}\) is the equivalent normal standard deviation of the ith random variable (i = 1, …, m) evaluated at \({x}_{k}\), and R is the correlation matrix of all random variables.

At the converged MPP, the RI = βf following Low and Tang (2007) can be calculated as follows:

$${\beta }_{f}=\sqrt{{n}^{*T}{R}^{-1}{n}^{*}}=\sqrt{{\left[\frac{{x}_{i}^{*}-{\mu }_{i}^{N}}{{\sigma }_{i}^{N}}\right]}^{T}{R}^{-1}\left[\frac{{x}_{i}^{*}-{\mu }_{i}^{N}}{{\sigma }_{i}^{N}}\right]}$$
(6)

where \(n^{*}\) is the vector with components \((x_{i}^{*}-\mu_{i}^{N})/\sigma_{i}^{N}\), \(x_{i}^{*}\) denotes the MPP component value of the ith variable evaluated in x-space, and \(\mu_{i}^{N}\) and \(\sigma_{i}^{N}\) denote the equivalent normal mean and standard deviation of the ith variable, respectively. Further, R is the correlation matrix, and \(\mu_{i}^{N}\) and \(\sigma_{i}^{N}\) can be obtained by the Rackwitz-Fiessler transformation (Rackwitz and Fiessler 1978).

The corresponding failure probability Pf is calculated as follows:

$${P}_{f}=\Phi \left(-{\beta }_{f}\right)$$
(7)

where Φ(⋅) denotes the standard normal cumulative distribution function.
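The iteration of Eq. 5 followed by Eqs. 6 and 7 can be sketched in plain Python as below. This is an illustrative sketch only, not the toolbox code: it assumes that all random variables are normal, so the equivalent normal means and standard deviations stay equal to the actual ones, whereas non-normal variables would require a Rackwitz-Fiessler update at every iteration. The gradient is approximated by central differences, and all function names are hypothetical.

```python
import math
from statistics import NormalDist


def solve(a, b):
    """Solve a @ y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (m[i][n] - sum(m[i][j] * y[j] for j in range(i + 1, n))) / m[i][i]
    return y


def gradient(g, x, h=1e-6):
    """Central-difference gradient of the limit state function g at x."""
    grad = []
    for i in range(len(x)):
        step = h * max(abs(x[i]), 1.0)
        up, dn = x[:], x[:]
        up[i] += step
        dn[i] -= step
        grad.append((g(up) - g(dn)) / (2.0 * step))
    return grad


def form_hlrf_x(g, mu, sigma, corr, tol=1e-8, max_iter=100):
    """Return (RI, Pf, MPP) for correlated normal variables; failure is g(x) < 0."""
    n = len(mu)
    # T = [sigma]^T R [sigma]; constant here because the variables are normal
    t = [[sigma[i] * corr[i][j] * sigma[j] for j in range(n)] for i in range(n)]
    x = list(mu)
    for _ in range(max_iter):
        grad = gradient(g, x)
        t_grad = [sum(t[i][j] * grad[j] for j in range(n)) for i in range(n)]
        denom = sum(grad[i] * t_grad[i] for i in range(n))
        scale = (sum(grad[i] * (x[i] - mu[i]) for i in range(n)) - g(x)) / denom
        x_new = [mu[i] + scale * t_grad[i] for i in range(n)]  # Eq. 5
        converged = max(abs(p - q) for p, q in zip(x_new, x)) < tol
        x = x_new
        if converged:
            break
    reduced = [(x[i] - mu[i]) / sigma[i] for i in range(n)]
    beta = math.sqrt(sum(r * y for r, y in zip(reduced, solve(corr, reduced))))  # Eq. 6
    if g(list(mu)) < 0.0:
        beta = -beta  # mean point already lies in the failure domain
    return beta, NormalDist().cdf(-beta), x  # Eq. 7


# Example: linear limit state g = x0 - x1 with uncorrelated normal variables
beta, pf, mpp = form_hlrf_x(lambda x: x[0] - x[1],
                            [10.0, 6.0], [1.5, 2.0],
                            [[1.0, 0.0], [0.0, 1.0]])
```

For a linear limit state with normal variables, FORM is exact, so the example should reproduce the closed-form value (10 − 6)/√(1.5² + 2.0²) = 1.6. In a slope application, g would be FoS(x) − 1 evaluated with Eq. 1 or Eq. 2.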

GIS-FORM landslide prediction toolbox: conceptual architecture

Unlike FOSM, which propagates uncertainty directly in the GIS platform (Escobar-Wolf et al. 2021; Park et al. 2019), FORM with the HLRF_x algorithm needs several iterations to compute the RI. In GIS software for geospatial analysis, such as ArcGIS, implementing an iterative algorithm through ordinary grid calculation is extremely complicated without the help of external tools. Additionally, the dataset analysis of topographic and geomorphological parameters is time-consuming and labour-intensive, since multiple datasets must be exchanged between different platforms using geospatial technology (Rahmati et al. 2019). In this paper, we propose an effective computation framework built in the Python environment to eliminate the aforementioned limitations. In short, our Python-based GIS-FORM landslide prediction toolbox was developed in ArcGIS and runs as an extension of the ArcGIS 10.6 (ESRI) software. The design, development, and conceptual architecture of the toolbox are introduced next.

The design of the GIS-FORM landslide prediction toolbox is mainly divided into four parts:

  1. Generating the required files to create a geospatial dataset (i.e., the input);

  2. Selecting the calculation method and model;

  3. Automatically implementing the calculation based on ArcPy;

  4. Generating the hazard assessment maps in the form of raster layers (i.e., the output).

Additionally, the graphical user interface is an important part of the design, as it permits users to modify key parameters and the computational model based on accessible maps (Rahmati et al. 2018). The entire methodology is illustrated in two flowcharts: the workflow of the toolbox (Fig. 2a) and the iteration process of the HLRF_x reliability algorithm used in the toolbox (Fig. 2b). The user interface is displayed in Fig. 3, and the use of the toolbox is described in detail in the following subsections.

Fig. 2 Schematics of the toolbox and methodology: (a) workflow of the toolbox; (b) the HLRF_x algorithm

Fig. 3 User interface (GUI) of the GIS-FORM landslide prediction toolbox

GIS-FORM landslide prediction toolbox: input options

The input options of the GIS-FORM landslide prediction toolbox accept prescribed file formats, as shown in Fig. 3, to facilitate a user-friendly application. The inputs to the principal calculation elements are a digital elevation model (DEM) in raster format, an ESRI shapefile with the spatial distribution of soil types, an ESRI shapefile with the spatial distribution of vegetation types, and data files (e.g., .csv) with the statistical information of the soil and vegetation parameters, respectively. The vegetation-relevant input files are optional in our Python-based toolbox. Additionally, we provide the option of importing the correlation coefficient matrix of the statistical parameters, R, as a .csv file. Other inputs include the minimum slope value worthy of stability analysis and the DEM standard deviation value (following Haneberg (2007) and Escobar-Wolf et al. (2021)).

Similarly, the options to input the GIS mapping distribution parameters through the .csv files (i.e., soils.csv and vegetation.csv) are analogous to the operational approach implemented in GIS-TISSA (Escobar-Wolf et al. 2021). When the soil and vegetation (if needed) distribution areas are input as .shp files, these shapefiles must contain polygons with an attribute field named ‘Unit’ whose values precisely match the unit names in the input .csv file. Both .csv files should be prepared in the required format of seven columns, with one row for each parameter of each soil or vegetation (if needed) unit. The first column is headed ‘Unit’ and contains the names of the soil or vegetation units. The second column is headed ‘param’ and contains the parameter names of the soil, such as Cs, γm, γsat, γw, D, and Hw, or of the vegetation, for instance, Cr and qt. The third column, headed ‘dist’, gives the name of the probability distribution of each variable. The fourth to seventh columns, headed ‘stat1’, ‘stat2’, ‘stat3’, and ‘stat4’, hold the parameter values of the corresponding probability distribution. When a variable follows a two-parameter statistical distribution, such as the normal or lognormal distribution, its statistics are assigned to ‘stat1’ and ‘stat2’, and ‘stat3’ and ‘stat4’ are filled with zeros. Parameters for 10 common types of probability distributions are listed in Table 1 (Ji and Kodikara 2015).
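To make the seven-column layout concrete, the following sketch parses a small soils.csv-style table with Python’s standard csv module. Only the column headers come from the description above; the unit names, the distribution labels (‘normal’, ‘lognormal’), the parameter label ‘phi’, the numerical values, and the reading of ‘stat1’/‘stat2’ as mean and standard deviation are hypothetical placeholders.

```python
import csv
import io

# Hypothetical file content; only the header row follows the described format
SOILS_CSV = """Unit,param,dist,stat1,stat2,stat3,stat4
SoilA,Cs,normal,10.0,2.0,0,0
SoilA,phi,lognormal,30.0,3.0,0,0
SoilB,Cs,normal,5.0,1.0,0,0
"""


def load_units(text):
    """Group rows as {unit: {param: (dist, (stat1, stat2, stat3, stat4))}}."""
    units = {}
    for row in csv.DictReader(io.StringIO(text)):
        stats = tuple(float(row[key]) for key in ("stat1", "stat2", "stat3", "stat4"))
        units.setdefault(row["Unit"], {})[row["param"]] = (row["dist"], stats)
    return units


units = load_units(SOILS_CSV)
print(units["SoilA"]["Cs"])  # ('normal', (10.0, 2.0, 0.0, 0.0))
```

Keying the table by unit name mirrors the requirement that the ‘Unit’ attribute of each shapefile polygon matches a unit name in the .csv file.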

Table 1 Definition of probability distribution and associated parameters

GIS-FORM landslide prediction toolbox: calculation methods and models

Terrain inclination is one of the most basic indexes in geospatial analysis. Two algorithms are available for determining the terrain inclination of the region of interest, following Haneberg (2007) and Escobar-Wolf et al. (2021): the PISA-m algorithm and the ArcMap algorithm. If the PISA-m slope algorithm is employed, the slope and its corresponding standard deviation are obtained using a four-neighbour-pixel method. In contrast, if the standard ArcMap algorithm is chosen, the eight-neighbour-pixel method based on ArcMap Spatial Analyst is applied to calculate the terrain inclination.
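The difference between the two neighbourhood schemes can be illustrated on a single 3 × 3 elevation window. In the sketch below, the eight-neighbour function uses the weighted (Horn-type) differences on which the ArcGIS Slope tool is based, while the four-neighbour function uses plain central differences on the cardinal neighbours; the latter is an assumption about the form of the PISA-m scheme and omits its slope standard deviation. Function names are hypothetical.

```python
import math


def slope_four_neighbour(win, cell):
    """Slope (degrees) from the N, S, E, W neighbours of a 3x3 window.

    win[0] is the northern row; cell is the pixel size in elevation units.
    """
    dz_dx = (win[1][2] - win[1][0]) / (2.0 * cell)
    dz_dy = (win[2][1] - win[0][1]) / (2.0 * cell)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


def slope_eight_neighbour(win, cell):
    """Slope (degrees) from all eight neighbours with Horn-type weights."""
    (a, b, c), (d, _, f), (g, h, i) = win
    dz_dx = ((c + 2.0 * f + i) - (a + 2.0 * d + g)) / (8.0 * cell)
    dz_dy = ((g + 2.0 * h + i) - (a + 2.0 * b + c)) / (8.0 * cell)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# A plane rising one elevation unit per cell towards the east
plane = [[0.0, 1.0, 2.0],
         [0.0, 1.0, 2.0],
         [0.0, 1.0, 2.0]]
print(slope_four_neighbour(plane, 1.0), slope_eight_neighbour(plane, 1.0))
```

On a perfectly planar surface both schemes return the same slope; they differ on rough terrain, where the eight-neighbour weights smooth local noise more strongly.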

Furthermore, as described in the physical model, the toolbox developed in this paper considers Hammond’s infinite slope model (Eq. 1) as well as the universal (simplified) infinite slope model (Eq. 2), which does not consider the influence of vegetation. This is why importing a vegetation file is optional in the GIS-FORM toolbox. Users can choose freely between the two calculation models through the drop-down option provided.

The remaining option is whether to implement the seismic slope hazard assessment. Two distinct input modes are provided so that ground acceleration information can be used flexibly: the ground acceleration can be input as a .csv file, or the peak ground acceleration (PGA) distribution can be input as a .shp file. The seismic displacement is then calculated by combining the imported acceleration and the FoS at each pixel. Finally, we provide an option to compare the results calculated by FORM with the HLRF_x algorithm against those of FOSM.

GIS-FORM landslide prediction toolbox: automatic calculation process and output results

After the geospatial dataset has been formed and the calculation method and model selected, the Python-based toolbox starts the grid sampling and calculates the quantities required by the geological hazard assessment. These comprise the FoS, the ac and seismic displacement (Dn) (if the seismic stability analysis is chosen), and the GIS-based probabilistic analysis using the HLRF_x algorithm. Specifically, the DEM raster file is used to determine the pixel size and geographic extent for calculation. A series of zero-valued raster files is produced and assigned the parameter values. Meanwhile, the shapefiles are converted to raster format and combined with the populated rasters. This first step, called the ‘grid sampling and dataset process’, is completed automatically by the developed toolbox, as shown in Fig. 2a. In the GIS-TISSA toolbox, although the procedure is implemented in Python-based GIS (i.e., ArcPy), it still relies on conventional operations between raster layers, and its FOSM algorithm does not involve any iterative computation. In contrast, we fully employ Python to process the raster dataset in this study, mainly including soil cohesion, soil friction, slip depth, slope angle, unit weight, groundwater, the ratio of saturation, root cohesion, and vegetation weight (if needed). These raster layers are each converted to large arrays. The Python-based calculations then extract the values at corresponding positions in the different arrays. Therefore, the FoS at each location can be estimated according to Eq. 1 or Eq. 2. If the seismic analysis is selected, the ac and Dn of every pixel are obtained according to Eqs. 3 and 4. Additionally, we develop an effective RI calculation framework using an iterative loop. Figure 2b demonstrates the automatic procedure for finding the final RI with the HLRF_x algorithm.
Subsequently, the Pf of each array element is estimated. Notably, these calculation results are stored in separate arrays and are written automatically to raster files by the GIS-FORM landslide prediction toolbox.

Finally, the main results in the form of raster layers are displayed, and a rapid geological hazard assessment is performed. In other words, the FoS layer, RI layer, Pf layer, ac layer (in the seismic module), and the Dn layer can all be produced automatically.

Comparison between GIS-FORM landslide prediction and GIS-TISSA

We first compare the output raster files of the proposed GIS-FORM landslide prediction tool with those of the GIS-TISSA toolbox reported by Escobar-Wolf et al. (2021). For a consistent comparison, the example dataset provided by Escobar-Wolf et al. (2021) is input into our GIS-FORM tool to verify that the computation codes in our toolbox are accurate and that the calculations are carried out correctly. The Hammond model is used to evaluate the stability of regional-scale slopes according to the physical parameters of the example dataset, as shown in Tables 2 and 3. Further, the eight-neighbour-pixel approach (ArcMap algorithm) used in GIS-TISSA to estimate the slope was selected as the slope calculation option in the GIS-FORM landslide prediction toolbox to guarantee that both toolboxes implement the same type of computation. After the dataset has been fully input, the GIS-FORM landslide prediction toolbox automatically conducts a regional hazard assessment, which takes about 15 min 53 s on our desktop computer.

Table 2 Mean and standard deviation of the soil property (adapted from Escobar-Wolf et al. 2021)
Table 3 Mean and standard deviation of the tree property (adapted from Escobar-Wolf et al. 2021)

The relevant raster files in the geodatabase produced by the GIS-FORM landslide prediction toolbox were compared with the calculation results of GIS-TISSA. The FoS raster files output by the two algorithms, for example, are called ‘FOS_GIS-TISSA’ and ‘FOS_GIS-FORM’. Note that FoS = 1 is generally set as the threshold for landslide prediction. Following previous studies (Escobar-Wolf et al. 2021), the safe areas are divided into three intervals, namely [1.00, 1.25], [1.25, 1.50], and larger than 1.50. Figure 4a–b presents the FoS maps obtained using the GIS-TISSA and GIS-FORM landslide prediction toolboxes. The extremely good agreement (relative error in FoS within \(10^{-5}\)) validates the fundamental calculation pattern of the GIS-FORM landslide prediction toolbox. Moreover, users can consider seismic hazard predictions using Newmark’s concept of critical acceleration, as discussed before (Jibson 2007). The mean critical acceleration of each pixel is calculated from the FoS based on Eq. 3 and written automatically to output rasters (called ‘ac_GIS-TISSA’ and ‘ac_GIS-FORM’). The distributions of ac estimated by the two algorithms are also in good agreement with each other in the study area, as shown in Fig. 4c–d.

Fig. 4 Maps of the calculation results: (a) FoS of GIS-TISSA and (b) FoS of GIS-FORM, and (c) ac of GIS-TISSA and (d) ac of GIS-FORM

Although the two toolboxes obtain almost the same FoS distributions, there are obvious differences in the probabilistic hazard predictions that reflect the parameter uncertainties, as shown in Fig. 5. This is mainly due to the distinct computing strategies of the algorithms embedded in the two toolboxes. As mentioned earlier, the final RI of the GIS-FORM toolbox is obtained using the iterative HLRF_x algorithm according to Eqs. 5 and 6. However, the RI in GIS-TISSA is simply estimated by \(\text{RI}=(\mu_{\text{FoS}}-1)/\sigma_{\text{FoS}}\), which cannot consider the complete probability distribution information of the random variables. Figure 5e–f shows two maps of the Pf distribution given by GIS-TISSA and GIS-FORM, respectively, when uncorrelated normal random variables were employed. The differences between the GIS-TISSA and the GIS-FORM landslide predictions are relatively small, with the maximum Pf differing by 5.1%. This implies that the GIS-FORM landslide prediction results would bring about somewhat conservative assessments of landslide susceptibility. Note that GIS-TISSA cannot comprehensively consider the statistical information of multiple random variables, especially their cross-correlation, when performing the landslide susceptibility analysis. However, this important statistical information can be completely modelled in the GIS-FORM toolbox. For instance, Fig. 5c, d, g, h shows the RI and corresponding Pf maps calculated by GIS-TISSA and GIS-FORM, respectively, when the parameters of soil and vegetation follow correlated non-normal distributions.
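For comparison with the iterative FORM procedure, the FOSM-type estimate RI = (μFoS − 1)/σFoS can be sketched with a first-order variance propagation around the mean values. This is a generic illustration that assumes uncorrelated inputs and central-difference derivatives; it is not the GIS-TISSA source code, and the function name is hypothetical.

```python
import math


def fosm_reliability_index(fos, mu, sigma, h=1e-6):
    """First-order second-moment RI for a FoS function of uncorrelated inputs."""
    mean_fos = fos(list(mu))  # FoS evaluated at the mean values
    variance = 0.0
    for i in range(len(mu)):
        step = h * max(abs(mu[i]), 1.0)
        up, dn = list(mu), list(mu)
        up[i] += step
        dn[i] -= step
        dfdx = (fos(up) - fos(dn)) / (2.0 * step)
        variance += (dfdx * sigma[i]) ** 2  # each input contributes independently
    return (mean_fos - 1.0) / math.sqrt(variance)


# Example with a linear FoS-like function of two inputs
ri = fosm_reliability_index(lambda x: 1.0 + x[0] - x[1], [10.0, 6.0], [1.5, 2.0])
```

Only the means and standard deviations enter this estimate, which is why distribution shapes and correlations cannot influence it; for a linear function of normal variables it coincides with the FORM result, and the two diverge as the limit state becomes nonlinear or the variables become non-normal or correlated.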

Fig. 5 Maps of calculation results: RI (a) GIS-TISSA (uncorrelated normals), (b) GIS-FORM (uncorrelated normals), (c) GIS-TISSA (correlated non-normals), (d) GIS-FORM (correlated non-normals), and Pf (e) GIS-TISSA (uncorrelated normals), (f) GIS-FORM (uncorrelated normals), (g) GIS-TISSA (correlated non-normals), and (h) GIS-FORM (correlated non-normals)

Alternatively, we use the failure probability raster to display the risk level under different threshold ranges, as shown in Fig. 6. The maps distinguish the stable areas for failure probability thresholds of 5%, 10%, 25%, 50%, and 90%, respectively. Correspondingly, the failure area (as a percentage) decreases dramatically and nonlinearly as the failure probability threshold increases, as shown in Fig. 6f. Notably, the failure region given by GIS-TISSA is generally larger than that obtained by the GIS-FORM landslide prediction, and the absolute difference decreases from 2.6% to 0.30%. This is reasonable because different reliability algorithms were implemented, and it shows again that the results obtained by GIS-FORM are more conservative.
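The thresholding step can be sketched as a simple count over the Pf values of all valid pixels. The rule that a pixel counts as failed when its Pf is at or above the threshold, the use of None for no-data pixels, and the function name are assumptions of this sketch.

```python
def failure_area_percent(pf_values, thresholds=(0.05, 0.10, 0.25, 0.50, 0.90)):
    """Percentage of valid pixels with Pf at or above each threshold."""
    valid = [p for p in pf_values if p is not None]  # None marks no-data pixels
    return {t: 100.0 * sum(p >= t for p in valid) / len(valid) for t in thresholds}


# Hypothetical Pf values of six pixels, one of them without data
shares = failure_area_percent([0.01, 0.07, 0.30, 0.60, 0.95, None])
```

Because every pixel that exceeds a higher threshold also exceeds all lower ones, the percentage can only decrease or stay constant as the threshold rises.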

Fig. 6 Failure probability maps with the threshold increasing

Rapid landslide susceptibility assessment: a real case study

Basic geospatial dataset

Herein, a representative real case study is carried out to validate the performance of the GIS-FORM landslide prediction toolbox. The study region is located in Jiuzhaigou County, northern Sichuan Province. On 8 August 2017, an Ms 7.0 earthquake struck the region and triggered numerous coseismic landslides. According to the field survey by Fan et al. (2018), these landslides were primarily shallow rock slides and rockfalls. Therefore, it is necessary to rapidly assess the seismic landslide susceptibility in this area for managing the seismic disaster.

First, the geospatial dataset is established. A 12.5-m DEM (https://search.asf.alaska.edu/) of the study area, together with the 681 landslide polygons reported by Yi et al. (2020), was employed, as shown in Fig. 7a.

Fig. 7 Case study area (landslides adapted from Yi et al. (2020)): (a) elevation, (b) lithology map, and (c) PGA map

Furthermore, the PGA is recognised as a necessary dynamic factor when evaluating coseismic landslides. The shakemap (PGA distribution map) was downloaded from the USGS (https://www.usgs.gov/natural-hazards/earthquake-hazards/earthquakes), as shown in Fig. 7c. Additionally, the primary geological information is used to evaluate the seismic slope stability and thus the occurrence potential of landslides. Lithology portrays the engineering geological properties associated with landslide occurrence and is applied for estimating the surficial soil properties. The study area was mainly split into four lithologic units, as shown in Fig. 7b: (1) Permian (P), limestone, dolomite, and dolomitic limestone; (2) Carboniferous (C), limestone and limestone intercalated with dolomite; (3) Triassic (T), green-grey metamorphic tuffaceous sandstone and siltstone; and (4) Devonian (D), organic limestone and layered dolomite (Chen et al. 2020; Fan et al. 2018; Yi et al. 2020). Accurate surficial soil strength parameters are crucial for determining slope stability. However, obtaining the actual parameters across such a vast region is impractical (Chen et al. 2020) and outside the scope of this work. Generally, several sources are employed to establish the typical strength parameters of the rocks. They include (1) geological reconnaissance reports of the research areas, (2) values recommended by experienced engineers, and (3) parametric back analyses based on installed sensors and ground monitoring stations (Qin et al. 2019). As a result, a set of typical shear strength parameters is utilised for each geologic unit. The bedrock in the study area was divided into three categories, and the corresponding physical properties are presented in Table 4 (Chen et al. 2019).

Note that the major variables influencing slope stability are the surficial soil strength properties, slope inclination, and soil saturation. The variation of soil saturation is attributed to rainfall or seasonal change and is disregarded in this study (i.e., Hw = 0). Moreover, field investigation reveals that coseismic landslides are generally shallow slope failures. Hence, the slope-normal failure slab thickness is taken as 2.0 m, following Chen et al. (2020).

Table 4 Physical properties of surficial soils assigned to areas of different bedrock types in the study region

Note that the landslide factors can be summarised into three types: seismic, terrain, and geologic conditions (Fan et al. 2018; Xu et al. 2013). Nevertheless, the focus of this study is on the feasibility of using the GIS-FORM landslide prediction toolbox to verify the predicted landslides rather than on considering more factors to reveal the mechanism of coseismic landslides. Hence, the PGA was chosen to represent the seismic influence. Three terrain factors were selected: the terrain inclination, the elevation, and the slope aspect. Finally, the lithology was chosen to describe the geological condition.

Calculation results and analysis

Deterministic analysis

After the required files are entered into the GIS-FORM landslide prediction toolbox, a series of visualised geohazard maps is obtained. Generally, the FoS directly reflects the distribution of landslides. Figure 8a shows the overall FoS map calculated by Eq. 2. A statistical analysis was implemented to compare the landslides predicted by the GIS-FORM toolbox with the actual landslides identified by field observation. First, the study region was deterministically divided into safe (FoS ≥ 1.0) and unsafe (FoS < 1.0) regions. Then, the observed landslides were used as a reference for comparison. The distributions of landslides with respect to the seismic, terrain, and geologic factors are shown in Fig. 9.
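The comparison between predicted and observed landslides by factor class can be sketched as a binning of flagged cells. In the sketch below, cells with FoS < 1 are flagged as predicted landslides and their share per slope class is computed; the same function could be applied to observed landslide cells. The bin edges, sample values, and function name are hypothetical.

```python
def percent_by_bin(values, flags, edges):
    """Share (%) of flagged cells falling in each [edges[k], edges[k + 1]) bin."""
    counts = [0] * (len(edges) - 1)
    for value, flagged in zip(values, flags):
        if not flagged:
            continue
        for k in range(len(counts)):
            if edges[k] <= value < edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]


# Hypothetical cells: slope angle (degrees) and static FoS
slopes = [10.0, 35.0, 40.0, 65.0, 50.0]
fos = [1.5, 0.9, 0.8, 0.7, 1.2]
unsafe = [f < 1.0 for f in fos]  # deterministic criterion FoS < 1.0
shares_by_slope = percent_by_bin(slopes, unsafe, [0.0, 30.0, 60.0, 90.0])
```

Plotting such shares for the predicted and the observed cells side by side gives the kind of bar comparison used for each controlling factor.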

Fig. 8

Calculated results of GIS-FORM landslide prediction toolbox: (a) FoS, (b) PGA-ac, and (c) Dn

Fig. 9

Comparison of predicted and actual landslide occurrence under different controlling factors based on the criterion of static FoS: (a) PGA, (b) slope, (c) elevation, (d) lithology, and (e) aspect

The mainshock PGA in the research area was between 0.16 g and 0.26 g, as shown in Fig. 9a. Note that the percentage of landslides predicted by the FoS method increases with the PGA value, which is consistent with the observed coseismic landslide percentage. Figure 9b shows that most landslide rasters predicted by the FoS lie in areas with slope inclinations between 30° and 60° (unsafe region, red bars), which is similar to the observed coseismic landslides, whose slopes range from 30° to 55° (Fan et al. 2018). Figure 9c, on the other hand, shows that, without considering the seismic action, the potential landslide raster area keeps increasing with elevation and peaks at 3,400 m, following approximately the same trend as the observed landslides. Figure 9e shows that the dominant slope aspects affecting slope stability are mainly the east (E, 67.5° ~ 112.5°) and the southwest (W-S, 202.5° ~ 247.5°). Additionally, Fig. 9d shows the influence of geological conditions (lithology) in assessing regional landslide potential. In reality, most landslides occurred in the Carboniferous, accounting for 79.7% of the total landslide area. However, most predicted landslides appeared in the Carboniferous and Devonian, especially the latter, which occupies 77.2% of the total area.

It is also worth pointing out that the unstable region predicted by the deterministic FoS method generally compares well with the observed coseismic landslide percentage area (dotted line) when the slope and/or elevation is small. In contrast, the predictions gradually deviate and significantly overestimate the landslide area as the slope and/or elevation increases, as shown by the trends in Figs. 9b, c. The main reason is that the deterministic FoS results adopt only the mean values of the model parameters at many different locations, which are unlikely to be true in nature. As a consequence, terrains with high values of slope or elevation (which contribute proportionally to the FoS) are predicted to be prone to landslides, while the many uncertainty factors that could reduce these contributions are completely neglected in the FoS analysis. To overcome these limitations, predictions with improved consistency, obtained within the framework of probabilistic analysis, will be given in the following section.

The seismic displacement, Dn, is regarded as a significant index feasible for determining the coseismic landslide initiation. The study area is divided into four susceptibility levels based on the threshold of Dn, according to the California Geological Survey (CGS) (Shinoda et al. 2019): (1) susceptibility level I (very low): Dn < 5 cm; (2) susceptibility level II (low): 5 cm ≤ Dn < 15 cm; (3) susceptibility level III (moderate): 15 cm ≤ Dn < 30 cm; and (4) susceptibility level IV (high): 30 cm ≤ Dn.
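The four displacement classes above map directly onto a threshold lookup. The following minimal sketch applies the listed thresholds to an array of Dn values in centimetres; the function and constant names are illustrative and are not part of the toolbox.

```python
import numpy as np

# Lower bounds (cm) of levels II, III, and IV as listed in the text.
DN_THRESHOLDS_CM = np.array([5.0, 15.0, 30.0])
LEVEL_NAMES = ["I (very low)", "II (low)", "III (moderate)", "IV (high)"]

def classify_dn(dn_cm):
    """Return the susceptibility level index (0..3) for each displacement."""
    dn = np.asarray(dn_cm, dtype=float)
    # right=False means thresholds[i-1] <= Dn < thresholds[i],
    # so Dn = 5 cm falls in level II and Dn = 15 cm in level III.
    return np.digitize(dn, DN_THRESHOLDS_CM, right=False)

# Hypothetical displacements for a handful of cells.
sample_dn = [0.0, 4.9, 5.0, 14.9, 15.0, 29.9, 30.0, 100.0]
for dn, idx in zip(sample_dn, classify_dn(sample_dn)):
    print(f"Dn = {dn:6.1f} cm -> level {LEVEL_NAMES[idx]}")
```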

The seismic displacement of the regional slopes is calculated using Eq. 4, as displayed in Fig. 8c. Most of the landslides predicted under seismic action occur in the moderate- and high-susceptibility levels and agree well with the landslides identified in the field survey.

The statistical analysis is used again to investigate the rationality of the landslide prediction results based on the seismic, terrain, and geologic factors. For comparison, the absolute error (the dotted line) was employed to characterise the difference between the predicted and observed landslides and was counted according to the different susceptibility levels. As shown in Fig. 10, the errors for almost all areas show rapidly decreasing fluctuation as the susceptibility level of Dn increases, regardless of the landslide influence factor. This also indicates that landslides occurred in the moderate- and high-susceptibility level areas (e.g., when Dn ≥ 15 cm). Note that the error between the predicted and observed landslides is smallest at the moderate susceptibility level (green dotted line), regardless of the landslide influence factor in the statistical chart. This indicates that landslides occurred when the seismic displacement of the slope within the earthquake zone reached 15 cm. The feasibility of using the seismic module of the GIS-FORM landslide prediction toolbox to predict coseismic landslides has thus been further demonstrated. Note that the above conclusions result directly from the deterministic FoS analysis. However, Shinoda et al. (2019) proposed that the criteria can be affected by many sources of uncertainty in geomaterial properties, landslide failure mode and scale, and seismic waves. The work is therefore extended to probabilistic modelling analysis below.

Fig. 10

Comparison of predicted and actual landslide occurrence under different controlling factors based on the risk level of seismic displacement (Dn): (a) PGA, (b) slope, (c) elevation, (d) lithology, and (e) aspect

Probabilistic analysis

Different from the deterministic analysis demonstrated by Chen et al. (2020; 2019), this work investigates the effect of the parameter uncertainties involved in the physical modelling of landslides. The relevant model input parameters are prescribed with different coefficients of variation (COV, the ratio of the standard deviation to the mean value of a specific random variable): COV = 0.05, 0.10, 0.20, and 0.30, as listed in Table 4. Note that a high COV means that the parameters change significantly within the study area. Conversely, a low COV means that the random variables have less uncertainty, so the probability of failure is small and the corresponding RI is high. Figure 11 shows the toolbox outputs for the RI and the corresponding Pf distribution maps considering the uncertainties of the parameters. It can be seen from Fig. 11a–d that the distribution of the RI is closely related to the COV of the corresponding parameters. For example, with very low variability (COV = 0.05), only 4.3% of the landslides have RI values less than zero. Nearly 83.3% of the actual landslide area is situated in zones with a very high RI (e.g., RI = 5). The Pf distribution maps with COV ranging from 0.05 to 0.30 are displayed in Fig. 11e–h. The Pf can be classified into five levels following Lacasse and Nadim (2011) to further analyse the variation characteristics of regional slopes under different failure probability conditions: I (very low) Pf ≤ 1%; II (low) 1% < Pf ≤ 10%; III (moderate) 10% < Pf ≤ 50%; IV (high) 50% < Pf ≤ 90%; and V (very high) Pf ≥ 90%. The Pf distribution maps clearly demonstrate that the area of high-risk landslides increases remarkably with the COV of the parameters. This phenomenon reflects the nature of uncertainty and probability analysis: when an area is subjected to large variability, i.e., high COV values, the expected probability of failure increases. In general, the more unknowns involved in the physical modelling, the larger the probability of failure.
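The link between the RI and Pf and the five-level classification can be illustrated with a short sketch. It assumes the standard first-order relation Pf = Φ(−RI), where Φ is the standard normal cumulative distribution function; the helper names are hypothetical and the toolbox may evaluate these quantities differently.

```python
import math
from bisect import bisect_left

# Upper bounds of levels I to IV; anything above the last bound is level V.
PF_UPPER_BOUNDS = [0.01, 0.10, 0.50, 0.90]
PF_LEVELS = ["I (very low)", "II (low)", "III (moderate)",
             "IV (high)", "V (very high)"]

def std_from_cov(mean, cov):
    """Standard deviation implied by a mean value and a COV."""
    return cov * abs(mean)

def pf_from_ri(ri):
    """First-order failure probability, Pf = Phi(-RI) (assumed relation)."""
    return 0.5 * math.erfc(ri / math.sqrt(2.0))

def pf_level(pf):
    """Risk level for a failure probability given as a fraction (0..1)."""
    # A value exactly on a bound is assigned to the lower level,
    # e.g. Pf = 90% is treated as level IV in this sketch.
    return PF_LEVELS[bisect_left(PF_UPPER_BOUNDS, pf)]

for ri in (-2.0, 0.0, 1.0, 3.0, 5.0):
    pf = pf_from_ri(ri)
    print(f"RI = {ri:4.1f}  Pf = {pf:.3e}  level {pf_level(pf)}")
```

Under this relation, RI = 0 corresponds to Pf = 50%, so cells with a negative RI fall into levels IV or V.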

Fig. 11

RI and Pf based on static FoS under different COVs

To further explain the influence of different levels of parameter COV on the accuracy of landslide susceptibility prediction using the GIS-FORM toolbox, the graphical results of Fig. 11 can be summarised using plots of the probability of detection (PoD), also called the true positive ratio (TPR), the true negative ratio (TNR), and the balance accuracy. The mathematical definitions of PoD (TPR), TNR, and balance accuracy are given below (Chuang et al. 2021; Mathew et al. 2008):

$$PoD\left(TPR\right)=\frac{TP}{TP+FN}$$
(8)
$$TNR=\frac{TN}{TN+FP}$$
(9)
$$Balance\ accuracy=\frac{TPR+TNR}{2}$$
(10)

where TP denotes the true positives, i.e., landslides occurred and were predicted based on a certain index; FN denotes the false negatives, i.e., landslides occurred but were not predicted; TN denotes the true negatives, i.e., no landslides occurred and none were predicted; and FP denotes the false positives, i.e., no landslides occurred but landslides were predicted. In this study, we evaluated how the PoD and TNR change with the landslide threshold of Pf used as the index. As shown in Fig. 12a, the PoD value increases as the prescribed threshold value for Pf decreases. In contrast, the TNR value shows the opposite trend and increases with the threshold value of Pf, as shown in Fig. 12b. For regional landslide susceptibility management, the prescribed threshold can to some extent reflect the acceptable risk level. For example, if a landslide with a probability of failure of more than 5% (a prescribed threshold) is defined as positive, then the acceptable risk level is Pf = 5%. Comparing the PoD and TNR curves under different COVs of the model parameters, the GIS-FORM toolbox appears to work better (e.g., both PoD and TNR are above 0.65) when the COV lies in the range of 0.2 to 0.3 and the prescribed threshold lies in the range of 0.2 to 2.8%, as shown in Fig. 12c. Thus, the two indexes PoD and TNR can be tentatively used to tune the model parameters. In the literature, most soil properties are reported with COV ranging from 0.1 to 0.5. Therefore, the proposed GIS-FORM toolbox may have great potential for landslide susceptibility prediction in such soil regions.
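Equations 8 to 10 can be evaluated cell by cell once a Pf threshold is chosen. The sketch below is a minimal illustration on toy data; the function name and the sample values are assumptions, and both observed classes (landslide and no landslide) must be present for the ratios to be defined.

```python
import numpy as np

def detection_scores(pf, observed, threshold):
    """Return (PoD, TNR, balance accuracy) for a given Pf threshold.

    pf        : failure probability per cell (fraction, 0..1)
    observed  : True where a landslide was actually mapped
    threshold : a cell is predicted positive when pf > threshold
    """
    pf = np.asarray(pf, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    predicted = pf > threshold
    tp = np.sum(predicted & observed)
    fn = np.sum(~predicted & observed)
    tn = np.sum(~predicted & ~observed)
    fp = np.sum(predicted & ~observed)
    pod = tp / (tp + fn)            # Eq. 8
    tnr = tn / (tn + fp)            # Eq. 9
    return pod, tnr, 0.5 * (pod + tnr)   # Eq. 10

# Toy data (hypothetical): six cells, three of them observed landslides.
pf = [0.001, 0.03, 0.20, 0.60, 0.95, 0.004]
observed = [False, True, False, True, True, False]
for threshold in (0.01, 0.05, 0.50):
    pod, tnr, bal = detection_scores(pf, observed, threshold)
    print(f"threshold = {threshold:.2f}  PoD = {pod:.2f}  "
          f"TNR = {tnr:.2f}  balance = {bal:.2f}")
```

Sweeping the threshold in this way reproduces the qualitative behaviour described in the text: lowering the threshold raises the PoD, while raising it improves the TNR.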

Fig. 12

(a) Probability of detection (PoD), (b) true negative rate (TNR), and (c) balance accuracy w.r.t. probabilistic threshold of landslide (Pf) under different COVs

We further investigated the performance of the toolbox for landslide prediction, considering the uncertainty of the parameters. The slope is taken here as a representative factor. The results calculated by the GIS-FORM landslide prediction toolbox for different COVs are shown in Fig. 13. The absolute error described above is used to identify the trend between the predicted and observed landslides (dotted lines). The error fluctuation of the most likely landslide case (Pf ≥ 90%) changes from the most significant to the least significant when the COV increases from 0.05 to 0.30. The lowest-sensitivity area shows the diametrically opposite trend. This reflects to some extent that the predicted landslide areas become increasingly accurate as the parameter variability increases. In other words, the uncertainty of the parameters is clearly of great significance for regional hazard assessment.

Fig. 13

Absolute error of slope-based comparison between predicted and actual landslides under different risk levels of Pf: COV = (a) 0.05, (b) 0.10, (c) 0.20, and (d) 0.30

The distribution of the RI and the corresponding variation of Pf were further calculated for the region-wide critical accelerations to study the influence of parameter uncertainty under the dynamic (i.e., seismic) factor, as shown in Fig. 14a. Compared with Fig. 11, most of the actual landslides are located in regions where the RI corresponds to the low critical acceleration range, even at the lowest COV (i.e., 0.05). This is significantly different from the RI distribution of the FoS under static conditions. As the COV increases, the area of the highly stable region (RI ≥ 5) decreases rapidly from 45.2 to 0.04% of the whole study area. Further, more landslide location points fall in the high-sensitivity area (red area, low RI). Additionally, the distribution of the yield acceleration at different COVs can be obtained according to the Pf classification of the regional risk level. The probability of the actual landslides falling in the high-risk area (IV-ac, V-ac) according to the ac varies with the increase of the COV, as shown in Fig. 14b. For comparison, the corresponding high-risk classification results based on the FoS (IV-FoS, V-FoS) were also counted. Overall, the landslides in the level IV risk area gradually prevail as the COV increases, regardless of whether the FoS or the ac is used. However, about 82.6% of the coseismic landslides were located in the high-susceptibility area when the probabilistic analysis was based on the yield acceleration. This is considerably better than the 5.5% obtained using the FoS. Accordingly, the GIS-FORM landslide prediction toolbox shows strong potential for estimating earthquake-induced landslides.

Fig. 14

Probabilistic analysis results: (a) RI and Pf based on critical acceleration (ac) under different COVs, (b) landslide proportion

Further discussion

Although FORM utilising the HLRF_x algorithm has been successfully proposed and applied in the probabilistic analysis of complex geotechnical problems via standalone numerical packages (Ji and Kodikara 2015), it has not been utilised in landslide prediction at a regional scale. In the GIS environment, the dominant barrier is the huge amount of data exchange. Since the probabilistic FORM algorithm involves an iterative calculation process, the conventional mode of operation, namely calculation in layers, makes it extremely difficult to complete this process in ArcGIS. In this work, although the HLRF_x-based approach was successfully implemented in the GIS environment via Python, some key problems and limitations remain for in-depth investigation.
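To indicate why FORM requires a per-cell iteration, the sketch below runs the classic HL-RF recursion in standard normal space for a single cell with independent normal variables. It is not the HLRF_x algorithm of Ji and Kodikara (2015), which works in the original variable space; the limit-state form, the parameter values, and all names are illustrative assumptions.

```python
import numpy as np

def hlrf_beta(g, mean, std, tol=1e-6, max_iter=100):
    """Classic HL-RF iteration for independent normal variables.

    Returns the reliability index and the design point in x-space.
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    u = np.zeros_like(mean)               # start at the mean point

    def g_u(u_vec):
        return g(mean + std * u_vec)      # map u back to physical values

    def grad_u(u_vec, h=1e-5):
        grad = np.empty_like(u_vec)
        for i in range(u_vec.size):       # central finite differences
            step = np.zeros_like(u_vec)
            step[i] = h
            grad[i] = (g_u(u_vec + step) - g_u(u_vec - step)) / (2.0 * h)
        return grad

    for _ in range(max_iter):
        grad = grad_u(u)
        u_new = (grad @ u - g_u(u)) / (grad @ grad) * grad
        converged = np.linalg.norm(u_new - u) < tol
        u = u_new
        if converged:
            break
    grad = grad_u(u)
    beta = -(grad @ u) / np.linalg.norm(grad)   # signed distance to origin
    return beta, mean + std * u

def limit_state(x, slope_deg=35.0, gamma=20.0, t=2.0):
    """g = FoS - 1 for an assumed dry infinite-slope form; x = [c, phi]."""
    c, phi_deg = x
    alpha = np.radians(slope_deg)
    fos = (c / (gamma * t * np.sin(alpha))
           + np.tan(np.radians(phi_deg)) / np.tan(alpha))
    return fos - 1.0

mean = np.array([10.0, 30.0])   # hypothetical c (kPa) and phi (deg)
std = 0.20 * mean               # COV = 0.20 for both variables
beta, x_star = hlrf_beta(limit_state, mean, std)
print(f"RI = {beta:.3f}, design point c = {x_star[0]:.2f}, phi = {x_star[1]:.2f}")
```

Every grid cell needs its own loop of this kind, which is why a layer-by-layer raster workflow is a poor fit and a scripted, cell-wise implementation is a more natural one.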

Our comparison of the GIS-FORM landslide prediction outputs with the GIS-TISSA outputs (Escobar-Wolf et al. 2021) reveals that the GIS-FORM can more precisely capture the probability of landslides under complete statistical information in the infinite slope model. Hence, the GIS-FORM offers a superior way of producing regional seismic-triggered landslide susceptibility maps in the GIS platform, one that is user-friendly and less labour-intensive. The developed toolbox is certainly very convenient for engineers unfamiliar with the details of the calculation procedures.

The practicality and generalisability of the model and approaches (i.e., infinite slope model, HLRF_x, Newmark analysis) were verified based on a typical case of the coseismic landslides triggered by the 2017 Jiuzhaigou Ms 7.0 earthquake. Note that it takes approximately 90 min to complete the iterative calculation of 3,450,001 grid points based on the 12.5-m resolution DEM, compared with the 30-min calculation of 598,829 grid points. Three main landslide factors (i.e., the seismic, the terrain, and the geologic information) were used for further exploration. The absolute error analysis was employed in this work to verify the prediction accuracy of the toolbox under different landslide factors. The distribution trends of the error for the different landslide factors were compared for different risk levels based on the permanent displacement classification. As demonstrated in Fig. 10, the error of the medium-risk area (i.e., susceptibility level III) was found to be smaller than the errors of the other risk areas. This implies that the area with seismic displacement greater than 15 cm is the high-frequency area of landslides. Additionally, the gap between the landslides in the high-risk area (Pf ≥ 50%) and those in the low-risk area is more obvious when the COV is 0.30. In other words, the extent of possible landslides is directly affected by the variability of the geotechnical parameters, indicating the necessity of uncertainty analysis in landslide susceptibility mapping. In summary, compared with other qualitative methods, the GIS-FORM landslide prediction is feasible and more effective.

One of the main limitations of this work is that the uncertainty of the pore water pressure (namely, Hw), which varies with rainfall infiltration and is a major landslide-triggering factor where the risk of earthquake activity is low, has not been considered. In the literature, there have been a great number of studies on rainfall-induced regional landslide analysis. Examples include the Shallow Landslide Stability Model (SHALSTAB) (Montgomery and Dietrich 1994), Stability Index Mapping (SINMAP) (Pack et al. 1998), Transient Rainfall Infiltration and Grid-based Regional Slope-stability analysis (TRIGRS) (Baum et al. 2002, 2008), and Shallow Landslides Instability Prediction (SLIP) (Montrasio and Valentino 2008). In future, slope stability modelling that accounts for rainfall intensity, infiltration, and pore water change mechanisms should be carried out to extend the proposed GIS-FORM toolbox to rainfall-induced landslide predictions.

Conclusions

In this study, a GIS toolbox embedding the FORM-based probability algorithm was developed and applied successfully to complete probabilistic predictions of regional-scale landslide susceptibility in seismic areas. By probabilistic physical modelling of the infinite slope stability, the toolbox can fully consider the statistical information of the uncertain parameters contributing to the probability of landslide. Two major features of the GIS-FORM toolbox are as follows: (1) for static load conditions, the factor of safety model was probabilistically evaluated using a rapid iteration algorithm called HLRF_x and (2) for earthquake-triggered landslide susceptibility analysis, the Newmark displacement models were adopted.

In terms of the deterministic FoS calculation, the GIS-FORM results were compared with those from GIS-TISSA in this study, and the difference is negligible, within an acceptable range (10e-5). When the statistical uncertainties of some basic inputs are taken into account, the landslide susceptibility predictions obtained by the GIS-FORM are distinctly different from those of GIS-TISSA, which is integrated with a much weaker reliability analysis tool, the FOSM. The versatility of the proposed GIS-FORM for landslide susceptibility prediction lies in the fact that all the fundamental statistical information of multiple random variables can be modelled, including the statistical distribution type, the mean value, the standard deviation, and the cross-correlation between random variables.

Furthermore, the coseismic landslide records of the Ms 7.0 Jiuzhaigou earthquake were adopted to verify the landsliding areas calculated using the proposed GIS-FORM tool with proper assumptions of parameter uncertainties, and to assess its practicality for rapid landslide hazard assessment. The results indicated that the landslide susceptibility based on probabilistic modelling analysis has high accuracy compared with the recorded failures, which demonstrated the applicability of our proposed method.

Overall, the developed GIS-FORM tool, as an extension of the ArcGIS 10.6 software, is feasible and effective in performing earthquake-induced landslide susceptibility analysis. Landslide susceptibility analysis within the probabilistic framework can thus generate more precise and physical model-dependent landslide susceptibility maps for disaster reduction. We expect that the GIS-FORM landslide prediction toolbox based on the FORM probability algorithm will enhance our ability to evaluate seismic-triggered landslide susceptibility on a regional scale and increase awareness of landslide hazards (Ji et al. 2019a, b). In the future, further research should be carried out to extend the physically based slope stability model and the engineering reliability algorithm to rainfall-induced regional landslide susceptibility predictions.