Introduction

Rainfall-induced landslides are widespread and destructive natural phenomena that occur worldwide and often cause severe human and economic losses (Froude and Petley 2018). Landslide early warning systems (LEWS) are increasingly applied as non-structural risk mitigation measures. LEWS can be designed and employed at two reference scales (Calvello 2017; Pecoraro et al. 2019): local systems (Lo-LEWS) address single landslides at slope scale, while territorial systems (Te-LEWS) deal with multiple landslides over wide areas at regional scale.

Te-LEWS are used to provide generalized warnings over appropriately defined homogeneous warning zones of relevant extension. Typically, these systems address weather-induced landslides through the monitoring and prediction of meteorological parameters. However, defining a regional warning model may be challenging for several reasons: the reconstruction of rainfall events, the absence of a direct relationship between rainfall and landslide initiation, and the uncertainty of available landslide catalogues (e.g., Piciullo et al. 2018; Segoni et al. 2018a).

In this study, a conceptual framework for the definition of probabilistic rainfall thresholds for landslides at regional scale is developed. The main steps of the proposed approach are: (i) objective reconstruction of triggering and non-triggering rainfall conditions, taking into account their frequency; (ii) probabilistic analysis; (iii) definition and performance evaluation of a two-level probabilistic warning model. The proposed procedure has been tested by analyzing the landslides reported in the period 2010–2018 within a study area in northern Italy.

Materials and Methods

Study Area and Database

The study area includes 6 of the 158 weather warning zones (WZ) defined for hydrogeological risk management in Italy: Emil-E, Emil-G, Ligu-B, Ligu-C, Tosc-L, and Tosc-S1 (Fig. 1). Although the selected WZ fall into three different Italian regions, all of them are characterized by high susceptibility to the occurrence of rainfall-induced landslides.

Fig. 1 Shaded relief map of the study area showing the 540 rainfall-induced “FraneItalia” landslide records in the period 2010–2018, differentiated into single landslide events (red circles) and areal landslide events (blue squares). The inset shows the location of the six warning zones in Italy

Indeed, the study area is one of the rainiest in Italy; moreover, climate change is producing an extraordinary increase in rainfall intensity there (Libertino et al. 2018). As a consequence, this area has been one of the most severely affected by landslides in Italy in the last few years (Battistini et al. 2013).

In particular, thunderstorms characterized by intense and very intense rainfall cause widespread and damaging ground effects, both on the slopes and along the drainage pattern, in Ligu-B, Ligu-C, Tosc-L, and Tosc-S1 (Roccati et al. 2018). In addition, the frequency of rapid shallow landslides has markedly increased in the last few years in Emil-E and Emil-G, as shorter and more intense rainfalls, typically the main triggering factor of shallow landslides and debris flows in the Emilia-Romagna region, have become more frequent in the Mediterranean area due to climate change (Segoni et al. 2018b).

The FraneItalia database (Calvello and Pecoraro 2018) reports 540 landslide events that occurred in the study area in the period 2010–2018. Of these, 27 records have been excluded from the analysis because they are reported as human-induced or earthquake-induced landslides, or as landslides whose trigger is not known. Among the 513 landslide events included in the dataset, 353 are classified as single landslide events (SLE, red circles in Fig. 1) and the remaining 160 as areal landslide events (ALE, blue squares in Fig. 1).

The rainfall measurements were derived from the database of the satellite-based Tropical Rainfall Measuring Mission (TRMM), a joint mission between NASA and the Japan Aerospace Exploration Agency (JAXA) launched in late November 1997 to study rainfall for weather and climate research purposes (Huffmann et al. 2007). The precipitation data used in this research have been derived from the TRMM 3B42 product, which includes gridded precipitation data collected every 3 h at a 0.25° × 0.25° (∼25 km × 25 km) spatial resolution, extending from latitude 50° S to latitude 50° N. It is worth mentioning that the spatial resolution is finer than that of the local rain gauge networks usually employed for early warning purposes.

Satellite rainfall data retrieved from the TRMM database have been analyzed using Google Earth Engine (https://earthengine.google.com), a cloud-based platform for planetary-scale environmental data analysis. For the purposes of this study, precipitation measurements have been aggregated at 3-hourly temporal resolution and the mean rainfall values over each territorial unit have been calculated.

Methodology

The methodology developed for the definition of the probabilistic thresholds of landslide occurrence can be schematized into four main phases: reconstruction of the rainfall events, probabilistic analysis, definition of the probabilistic warning model, and evaluation of its performance.

In the first phase, landslides are correlated with rainfall in the study area by reconstructing the rainfall events, in order to convert a series of hyetographs into a point cloud in a graph reporting triggering and non-triggering combinations of rainfall parameters. Duration (D) and cumulated rainfall (E) are identified as the most appropriate rainfall parameters to use. To this aim, a modified version of the “algorithmic” approach developed by Melillo et al. (2016) is applied.

A reduced set of parameters has been considered to account for different physical settings and operational conditions. In particular, all the parameters are differentiated between the “warm” spring–summer period, CW, and the “cold” autumn–winter period, CC (Table 1).

Table 1 Parameters used for the application of the algorithm developed by Melillo et al. (2016)

The automated procedure is based on several steps. In the pre-processing step (S0), the rainfall records lower than a predefined threshold GS are considered noise and are set to EH = 0.0 mm. The remaining steps are grouped into two main logical blocks. The first block performs the automatic reconstruction of the rainfall events and can be schematized in the following four steps: (S1) detection of the isolated rainfall events considering a dry interval, R1, and exclusion of irrelevant events that do not exceed a predefined threshold ER; (S2) identification of rainfall sub-events preceded and followed by dry periods with no rain, R2; (S3) exclusion of irrelevant sub-events, whose cumulated (total) rainfall, ES, is lower than a given threshold, R3; (S4) identification of rainfall events, constituted either by a period of continuous rainfall or by an ensemble of periods, considering a minimum dry period, R4. Subsequently, in the second block the algorithm combines information on the temporal occurrence of rainfall events and landslide events, performing three additional steps: (S5) selection of triggering and non-triggering rainfall events; (S6) reconstruction of multiple aggregations of rainfall sub-events that are likely to have triggered landslides; (S7) reconstruction of multiple aggregations of rainfall sub-events that did not trigger landslides. All the triggering and non-triggering aggregations of sub-events identified by the algorithm for a given rainfall event are considered equally probable.
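As an illustration of the pre-processing step (S0) and of the event detection in step (S1), the following minimal Python sketch splits a regularly sampled rainfall series into isolated events. It is not the algorithm of Melillo et al. (2016): the function name, the argument names, and all numerical defaults are hypothetical, and sub-event handling (S2 to S7) is omitted.

```python
from typing import List, Tuple


def split_rainfall_events(
    rain_mm: List[float],
    step_h: int = 3,            # sampling interval of the series (3 h for the data used here)
    noise_mm: float = 0.2,      # hypothetical value for the noise threshold GS
    dry_gap_h: int = 48,        # hypothetical value for the dry interval separating events
    min_total_mm: float = 1.0,  # hypothetical value for the relevance threshold ER
) -> List[Tuple[int, int, float]]:
    """Return (start_index, duration_h, cumulated_mm) for each retained event."""
    # S0: records below the noise threshold are treated as zero rainfall.
    cleaned = [r if r >= noise_mm else 0.0 for r in rain_mm]

    gap_steps = dry_gap_h // step_h
    spans = []        # (first wet index, last wet index) of each event
    start = None      # first wet record of the event being built
    last_wet = None   # most recent wet record seen so far
    for i, r in enumerate(cleaned):
        if r <= 0.0:
            continue
        if start is None:
            start = i
        elif i - last_wet - 1 >= gap_steps:
            # S1: the dry interval is long enough, so the previous event ends.
            spans.append((start, last_wet))
            start = i
        last_wet = i
    if start is not None:
        spans.append((start, last_wet))

    # S1 (continued): drop events whose cumulated rainfall does not exceed the threshold.
    events = []
    for first, last in spans:
        total = sum(cleaned[first:last + 1])
        if total > min_total_mm:
            events.append((first, (last - first + 1) * step_h, total))
    return events
```

Each returned tuple gives one (D, E) pair, with the duration measured from the first to the last wet record of the event.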

In the second phase, a probabilistic approach based on a two-dimensional Bayesian analysis, similar to that used by Berti et al. (2012), is developed to calculate the landslide probability associated with the different rainfall combinations. To this aim, the posterior landslide probability is evaluated considering the joint probability of the duration (D) and cumulated rainfall (E), as follows:

$$P(L|D,E)=\frac{P(L)\times P(D,E|L)}{P(D,E)} \tag{1}$$

where: P(L|D, E) is the posterior landslide probability; P(L) is the prior probability; P(D, E|L) is the likelihood; P(D, E) is the marginal probability. The needed probabilities have been determined considering that the triggering and non-triggering rainfall conditions are expressed in terms of multiple combinations, as follows:

$$P(L)=\frac{{N}_{L}}{{N}_{R}} \tag{2}$$

$$P(D,E)=\frac{\sum_{i}{n}_{i,(D,E)}\times {f}_{i}}{{N}_{R}} \tag{3}$$

$$P(D,E|L)=\frac{\sum_{i}{n}_{i,(D,E|L)}\times {f}_{i}}{{N}_{L}} \tag{4}$$

where: $N_L$ is the total number of landslide events that occurred in the period of analysis; $N_R$ is the total number of rainfall events recorded in the period of analysis; $n_{i,(D,E)}$ is the number of possible rainfall conditions characterized by specific values of D and E; $n_{i,(D,E|L)}$ is the number of rainfall conditions characterized by specific values of D and E that resulted in landslides; $f_i$ is the relative frequency, defined as the inverse of the total number of possible aggregations of sub-events for a given rainfall event.
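Equations (1) to (4) can be sketched in a few lines of Python. The snippet below is only an illustrative sketch: the function name, the input layout (one tuple per rainfall condition, carrying the cell of the D, E grid it falls in, its relative frequency, and a triggering flag), and the way the cells are indexed are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]  # (duration bin index, cumulated-rainfall bin index); hypothetical layout


def posterior_by_cell(
    conditions: Iterable[Tuple[Cell, float, bool]],
    n_rain_events: int,
    n_landslide_events: int,
) -> Dict[Cell, float]:
    """Posterior P(L|D,E) for every cell that contains at least one rainfall condition.

    Each condition is (cell, f_i, triggered), where f_i is the inverse of the
    number of possible aggregations of sub-events of its rainfall event.
    """
    weight_all = defaultdict(float)   # numerator of Eq. (3), per cell
    weight_trig = defaultdict(float)  # numerator of Eq. (4), per cell
    for cell, f_i, triggered in conditions:
        weight_all[cell] += f_i
        if triggered:
            weight_trig[cell] += f_i

    prior = n_landslide_events / n_rain_events                   # Eq. (2)
    posterior = {}
    for cell, weight in weight_all.items():
        marginal = weight / n_rain_events                        # Eq. (3)
        likelihood = weight_trig[cell] / n_landslide_events      # Eq. (4)
        posterior[cell] = prior * likelihood / marginal          # Eq. (1)
    return posterior
```

With these definitions $N_R$ and $N_L$ cancel out, so the posterior of a cell reduces to the frequency-weighted share of triggering conditions among all the conditions falling in that cell.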

In the third phase, a warning model is defined employing two warning levels (WL1 and WL2) associated with the exceedance of two thresholds (P1 and P2) based on the probabilities of occurrence of SLE and ALE (Table 2).
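The mapping from a posterior probability to a warning level can be sketched as follows. This is a simplified illustration under a stated assumption: a single probability value is compared with P1 and P2, whereas the actual rules combining the SLE and ALE probabilities are those of Table 2, which are not reproduced in this sketch. The function name and the use of 0 for "no warning" are hypothetical.

```python
def warning_level(p: float, p1: float, p2: float) -> int:
    """Return 0 (no warning), 1 (WL1) or 2 (WL2) for a posterior probability p.

    Assumes WL1 is issued when p reaches P1 and WL2 when p reaches P2.
    Probabilities are expressed as fractions in [0, 1].
    """
    if not 0.0 <= p1 < p2 <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= P1 < P2 <= 1")
    if p >= p2:
        return 2
    if p >= p1:
        return 1
    return 0
```

For example, with P1 = 0.20 and P2 = 0.40, a cell with a posterior probability of 0.25 would be assigned WL1 under this assumption.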

Table 2 Warning levels defined considering the probabilities of SLE and ALE

In the fourth phase, the performance of the warning model is analyzed using statistical indicators, following a procedure similar to that proposed by Calvello and Piciullo (2016). In particular, the performance analysis of a 3 by 3 contingency matrix is based on a set of two performance criteria, both of which assign a meaning to all the elements of the matrix (Fig. 2). The “alert classification” criterion employs an alert classification scheme derived from a standard 2 by 2 contingency table, and identifies correct predictions (CP), false alerts (FA), missed alerts (MA), and true negatives (TN). The “grade of accuracy” criterion assigns a colour code to the components of the matrix in relation to the agreement between a given warning event and a given landslide event. Using this criterion, the elements are classified in four colour-coded classes, as follows: green (Gre) for the elements assumed to be representative of the best model response, yellow (Yel) for elements representative of minor model errors, orange (Ora) for elements representative of a significant model error, and red (Red) for elements representative of a severe model error.

Fig. 2 Contingency matrix used for the performance analysis of the probabilistic rainfall thresholds

Considering the two performance criteria, several performance indicators can be derived. Table 3 lists the indicators used in this study.

Table 3 Performance indicators used for the performance analysis
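Some of the indicators derived from the “alert classification” criterion can be computed as in the sketch below. The formulas are assumptions, taken to be the forms commonly used with this type of contingency analysis (efficiency index as the share of correct outcomes, odds ratio as correct over wrong predictions, MFB as the share of missed alerts among all errors); the definitions actually adopted in this study are those listed in Table 3. The function name and the input counts in the usage note are purely illustrative.

```python
from typing import Dict


def performance_indicators(cp: int, fa: int, ma: int, tn: int) -> Dict[str, float]:
    """Indicators from the alert classification counts (assumed definitions).

    cp: correct predictions, fa: false alerts, ma: missed alerts, tn: true negatives.
    """
    total = cp + fa + ma + tn
    if total == 0:
        raise ValueError("the contingency matrix is empty")
    errors = ma + fa
    return {
        # Efficiency index: share of correct outcomes over all matrix elements.
        "Ieff": (cp + tn) / total,
        # Odds ratio: correct predictions relative to wrong predictions.
        "OR": (cp + tn) / errors if errors else float("inf"),
        # Missed and false alerts balance: share of missed alerts among the errors.
        "MFB": ma / errors if errors else 0.0,
    }
```

Calling `performance_indicators(cp=30, fa=10, ma=15, tn=45)` with these made-up counts gives an efficiency index of 0.75, an odds ratio of 3.0, and an MFB of 0.6; an MFB above 0.5 indicates that missed alerts outnumber false alerts.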

Results

Rainfall Events Reconstruction

A total of 1903 rainfall conditions (D, E) have been identified and plotted in log–log coordinates (Fig. 3). The 207 rainfall conditions responsible for triggering 353 SLE (red circles in Fig. 3) and the 129 rainfall conditions responsible for 160 ALE (blue squares in Fig. 3) fall in the duration range 3 ≤ D ≤ 915 h and in the cumulated rainfall range 1.02 ≤ E ≤ 243.54 mm. The non-triggering rainfall conditions reconstructed in the same period number 1567 (green circles in Fig. 3). They fall in the ranges 3 ≤ D ≤ 495 h and 1.01 ≤ E ≤ 311.87 mm.

Fig. 3 Rainfall duration (D) versus cumulated rainfall (E) in the study area from 2010 to 2018. Graph plotted in log–log coordinates

It is worth mentioning that rainfall combinations characterized by E < 2 mm (grey circles in Fig. 3) constitute a negligible amount of rain; these combinations have therefore been excluded from the analysis because they are considered irrelevant for the purpose of early warning.

Probabilistic Analysis

The definition of the probabilistic thresholds is based on a two-dimensional Bayesian analysis evaluating the conditional probability of landslide occurrence given the joint probability of D and E. According to the available data, the prior landslide probability, P(L), has been calculated using Eq. (2) and is equal to 18.79% for SLE and 7.44% for ALE.

Subsequently, the D, E space reported in Fig. 3 has been divided into 6 × 6 cells, both for SLE and ALE, and the posterior landslide probabilities, P(L|D, E), have been calculated by applying Eq. (1). Looking at SLE, Fig. 4a shows that long-duration (12 ≤ D ≤ 915 h), high-accumulation (50 ≤ E ≤ 243.54 mm) rainfall events have the highest landslide probabilities (P(L|D, E) > 40%). The only exception is the combination 5 ≤ D ≤ 10 h, 50 ≤ E ≤ 243.54 mm, for which P(L|D, E) = 63.64%. However, this can be considered a singularity, as it represents only 0.02% of the rainfall combinations that occurred from 2010 to 2018. The results are substantially confirmed for ALE (Fig. 4b). Indeed, apart from the singularity already highlighted for SLE, the highest values of the posterior probability (P(L|D, E) > 20%) are again reached for 12 ≤ D ≤ 915 h and 50 ≤ E ≤ 243.54 mm.

Fig. 4 Posterior landslide probabilities obtained considering SLE (a) and ALE (b)

Probabilistic Warning Model

A performance evaluation has been conducted in order to identify the optimal pair of thresholds to be employed in the warning model. Several combinations have been compared by varying the lower threshold, P1, and the upper threshold, P2. As significant differences in the performance evaluation depend only on the variations of P1, the results are reported by grouping the thresholds on the basis of P1 (Table 4).

Table 4 Combinations considered to identify the optimal values of P1 and P2

Table 5 shows the results obtained for the five combinations considering the elements of the contingency matrix reported in Fig. 2. Higher values of CP and Yel are obtained when the lower probability values are considered to define WL1 (P1 from 10 to 20%). In particular, passing from P20,40–50 to P30,50 results in a reduction of CP of about 37%. However, an increase of the P1 threshold results in a significant reduction of the FA and Red errors and in increasing values of TN.

Table 5 Number of contingency matrix elements considering the “alert classification” (CP, TN, MA, FA) and “grade of accuracy” (Gre, Yel, Ora, Red) criteria

Table 6 shows the results in terms of success (Ieff and OR) and error (PSM, PSM-MA, PSM-FA, and MFB) indicators for the five threshold combinations reported in Table 4. Concerning the success indicators, the efficiency index (Ieff) generally increases as P1 is raised, as is evident when comparing P10,20–50 and P12.5,25–50 with P20,40–50 and P30,50. The odds ratio (OR), which can be considered a ratio between correct and wrong predictions, increases with the reduction of FA and the increase of TN. However, it should be noted that, passing from P20,40–50 to P30,50, the probability of serious missed alerts (PSM-MA) increases by about 25%. Moreover, the majority of the errors are missed alerts, as demonstrated by the high value of MFB (60.72%). For these reasons, P20,40–50 can be considered the best-performing threshold combination of the five considered herein.

Table 6 Performance indicators computed for the five threshold combinations considered

Conclusions

In this study, a Bayesian approach has been developed for the definition of a probabilistic warning model for rainfall-induced landslides. It has been defined using a landslide inventory retrieved from online news and satellite-based rainfall measurements. Both landslide records and satellite rainfall monitoring used in this study come from open-access datasets available online.

First, the triggering and non-triggering rainfall conditions have been objectively reconstructed. Then, a Bayesian approach has been applied to calculate the posterior probabilities of occurrence of single landslide events (SLE) and areal landslide events (ALE). Finally, a probabilistic warning model employing two thresholds has been defined and its performance evaluated using indicators derived from a 3 by 3 contingency table.

The analyses showed that P20,40–50 is the best-performing threshold combination, as it represents the best compromise between minimizing incorrect landslide predictions and maximizing correct predictions. Generally, the probabilistic warning model showed an overall good performance in predicting landslide events triggered by significantly different rainfall conditions. Although the performance of the model can be further refined by considering wider and longer datasets, the preliminary results achieved herein highlight its potential for landslide early warning purposes.