Introduction

Rainfall-induced landslides are a frequent, worldwide phenomenon that claims many lives and causes extensive damage (Iverson 2000; Nadim et al. 2006; Froude and Petley 2018); they are being intensified by a changing environment (Gariano and Guzzetti 2016; Guha-Sapir et al. 2016; Lewkowicz and Way 2019). To prepare for and mitigate these disasters, landslide early warning systems (LEWSs) are needed to determine the time and location of landslide occurrence and to enable an immediate response, especially in landslide-prone and vulnerable areas. Regional LEWSs have been developed around the world, such as in Hong Kong (Premchitt et al. 1994), Piedmont, Italy (Aleotti 2004), Seattle, USA (Chleborad et al. 2006), and Norway (Devoli et al. 2015); a detailed review of these systems was prepared by Piciullo et al. (2018). Nevertheless, some areas still suffer from untimely and imprecise monitoring and warnings (Harp et al. 2009; Broeckx et al. 2018). Existing LEWSs rely on costly gauge observations with limited spatial coverage (Martelloni et al. 2012; Kirschbaum and Stanley 2018). With quasi-global coverage and (near) real-time monitoring, satellite-based precipitation estimates (SPEs) could be a better alternative to rain-gauge input in operational LEWSs, potentially improving the timeliness of global landslide warnings and substantially reducing landslide disasters (Gebremichael and Hossain 2010; Sun et al. 2018).

Several studies have focused on the application of SPEs in LEWSs over global to regional scales. Hong et al. (2006, 2007) proposed a preliminary framework for global landslide early warning with the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) rainfall estimates, based on rainfall thresholds for landslide occurrence. This landslide hazard algorithm was updated by Kirschbaum et al. (2009, 2012). Farahmand and AghaKouchak (2013) created another satellite-based global landslide model using the Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (PERSIANN) rainfall data. Apip et al. (2010) integrated the Climate Prediction Center (CPC) Morphing Technique (CMORPH) precipitation estimates into landslide modeling in Indonesia. Consequently, LEWSs forced by SPEs have been implemented, such as the LHASA global model (https://pmm.nasa.gov/precip-apps; Kirschbaum and Stanley 2018). However, there have been few attempts to evaluate the performance of landslide modeling, largely because data availability is limited, as noted by Kirschbaum et al. (2009) and Farahmand and AghaKouchak (2013). Kirschbaum and Stanley (2018) pointed out that the absence of a global database of “nonlandslide” points is a main obstacle to a robust evaluation. Additionally, previous studies were mostly conducted with the goal of applying a particular SPE product, whereas a comparison of various rainfall products is required to comprehensively evaluate the performance of SPEs for landslide modeling.

To date, efforts to evaluate and compare the performances of various SPEs for landslide modeling have been limited to the regional scale. Rossi et al. (2017) compared the landslide rainfall thresholds defined by gauged and TMPA rainfall data in the Umbria region, Italy. Brunetti et al. (2018) evaluated the performance of four SPE products in forecasting the possible occurrence of rainfall-triggered landslides in Italy by employing a uniform evaluation procedure. The performances of various SPE products applied in global LEWSs remain poorly assessed and compared. The reasons may lie in the questioned accuracy or bias of SPEs as compared to gauged rainfall data (Hossain et al. 2014; Camici et al. 2018), the lack of a global rainfall-induced landslide database with reliable time and location information (Kirschbaum et al. 2009), and the need for a solid evaluation procedure (Gariano et al. 2015).

TMPA (Huffman et al. 2007), PERSIANN (Hsu et al. 1997), CMORPH (Joyce et al. 2004), and Multi-Source Weighted-Ensemble Precipitation (MSWEP; Beck et al. 2017) rainfall products are state-of-the-art SPEs with increasing spatial-temporal resolution and accuracy. These four rainfall datasets are evaluated and compared for landslide modeling in this study. We employ the Global Landslide Catalog (GLC; Kirschbaum et al. 2010) for integration of landslides and their triggering rainfall conditions. This global rainfall-induced landslide dataset provides reliable occurrence information of landslide events and is adequate for a solid evaluation procedure. This paper addresses three specific scientific questions: (1) do SPE products distinguish rainfall events responsible for landslides from those not related to landslides? (2) how do SPE products perform in providing global landslide warnings and which product is best? (3) how do SPE products perform in landslide warnings based on landslide type and regional effects?

To address the questions above, we first reconstruct rainfall events that are responsible for landslide events and those that are not, and conduct preliminary comparisons between these two rainfall event data sets for each SPE product. Then, we establish cumulated event rainfall (E)–rainfall duration (D) thresholds (denoted as ED rainfall thresholds) based on rainfall events that are responsible for landslides. Threshold uncertainties are considered by using frequentist thresholds corresponding to different exceedance probabilities. For quantitative evaluations, skill scores of ED rainfall thresholds are estimated at the global scale. Although a number of studies have used available datasets to generate unique global thresholds for landslide initiation (reviewed in Guzzetti et al. 2008), climatic and geological environments contribute to regional differences in rainfall thresholds by influencing rainfall and landslide patterns (Guzzetti et al. 2007; Zhang et al. 2011; Peruccacci et al. 2017). Landslide category also plays an important role in rainfall threshold establishment, since failure mechanisms vary with landslide type (Cruden and Varnes 1996; Hungr et al. 2014). Thus, we also establish rainfall thresholds and evaluate their performances for particular landslide types and regions. Specifically, we calculate the skill scores at the scale of 5° × 5° of latitude and longitude, i.e., each 5° × 5° grid is an individual region.

Data and methods

Data

Global landslide catalog

The GLC has been compiled by the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center since 2007 to promote landslide disaster evaluation and reduction (Kirschbaum et al. 2010). The catalog contains 10,988 landslide events worldwide, with records up to September 2017 (last updated on April 3, 2018; https://data.nasa.gov/Earth-Science/Global-Landslide-Catalog-Export/dd9e-wu2v). The GLC details occurrence time, occurrence location, and trigger factor, which are vital for extracting rainfall information corresponding to landslide events. However, the accuracy of each record varies and unquantifiable uncertainties exist in the GLC due to heterogeneous and limited data availability (Kirschbaum et al. 2015).

The following criteria were used to select valid records: (1) records with location accuracy worse than 25 km were excluded; (2) landslide events initiated by triggers other than rainfall were removed; (3) occurrence time listed by local time was converted to Coordinated Universal Time (UTC) and ambiguous descriptions were replaced with specific times (e.g., 6:00 as a substitute for “early morning”); and (4) landslide events occurring in the period from 2007 to 2014 were selected, with an ending time corresponding to TRMM instrument termination in early 2015. The selected landslide dataset consists of 5164 records (Fig. 1). According to Kirschbaum et al. (2015), landslide classifications in the GLC were modified from the Cruden and Varnes (1996) and USGS (2004) classifications. The type with the most landslide records is still the generic term “landslide” (68.8%), which is not specifically defined in the original sources. Thus, only the mudslide, complex, rock fall, and debris flow types are considered in the type effect analysis (section “Rainfall threshold establishment and evaluation for different landslide types”). Additionally, 82.7% of the reported records are distributed in Asia and North America.

Fig. 1 Spatial distribution of the 5164 selected landslide records during the period of 2007–2014 from the Global Landslide Catalog (GLC)

Satellite-based precipitation estimates

SPEs play a vital role in global hydrological modeling and rainfall-related hazard monitoring. We used four notable SPEs—TMPA, CMORPH, PERSIANN, and MSWEP products with the same temporal resolution and similar spatial resolution (Table 1)—to explore the association between rainfall data and rainfall-induced landslides in this study.

Table 1 Information about the four satellite-based rainfall products analyzed in this study

TMPA rainfall data merge precipitation estimates from multiple satellite sensors. The 3B42RT product, version 7, is provided by the NASA Precipitation Measurement Missions (PMM) Science Team (https://pmm.nasa.gov/TRMM). The original CMORPH rainfall data cover the period 2002–present. Reprocessed and bias-corrected CMORPH rainfall estimates from 1998 onward have recently become available (http://ftp.cpc.ncep.noaa.gov/precip/CMORPH_V1.0/CRT/), provided by the National Oceanic and Atmospheric Administration (NOAA). PERSIANN hourly rainfall data are generated by the Center for Hydrometeorology and Remote Sensing (CHRS) at the University of California, Irvine. The suite of PERSIANN rainfall data is accessible on the official website (http://chrsdata.eng.uci.edu/). MSWEP is a newly developed SPE dataset that merges a wide range of satellite, rain gauge, and reanalysis estimates. The latest version is described online (http://www.gloh2o.org/). All the datasets integrate high-quality combined microwave precipitation estimates derived from low Earth orbit satellites and variable rain rate infrared precipitation estimates collected from geostationary satellites.

Methods

Rainfall event reconstruction

To acquire the rainfall conditions triggering landslide events, we employed an automatic algorithm proposed by Melillo et al. (2015, 2016). The algorithm defines a dry period with no rain to distinguish individual rainfall events from each other. Given a dry period, we can either extract a single rainfall event with its rainfall duration (D) and cumulated event rainfall (E) based on its specific time, or acquire all the rainfall events during a period. A continuous rainfall series is required as input. We used an updated version of the algorithm in which the dry periods vary with the climatic and geographical environment (Melillo et al. 2018). The distributed dry periods were generated at a grid scale of 0.25° based on available global datasets; details are provided in the appendix.

Two types of rainfall events were classified—those responsible for landslide occurrence (landslide-initiating rainfall events, LREs) and those triggering no landslides (non-landslide-initiating rainfall events, NLREs). Rainfall series of the SPE grids in which landslides occurred were used as input to the rainfall event reconstruction algorithm. LREs were extracted according to the landslide occurrence time. We assumed that all rainfall events that precede LREs are NLREs for every rainfall series. The rainfall event reconstruction was conducted for each landslide record. For SPE grids with more than one landslide event, only rainfall events before the first LRE were considered as NLREs.
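The event separation and labeling described above can be illustrated with a minimal Python sketch. It is not the algorithm of Melillo et al. (2015, 2016, 2018): it assumes a regularly sampled series, treats any step with zero rain as dry, uses a single fixed dry period instead of spatially distributed ones, and matches a landslide to an event only if that event is still "open" at the landslide time. The function names, the default `dry_period_h`, and the sample series are hypothetical, and the rule for grids with more than one landslide is not handled.

```python
import numpy as np

def split_rainfall_events(rain_mm, step_h=3, dry_period_h=48):
    """Split a regular rainfall series into events separated by dry periods.

    Returns a list of (start_idx, end_idx, D_hours, E_mm); end_idx is inclusive.
    dry_period_h is a hypothetical fixed value, not the paper's gridded one.
    """
    rain = np.asarray(rain_mm, dtype=float)
    dry_steps = int(np.ceil(dry_period_h / step_h))
    events, start, last_wet = [], None, None

    def close(s, e):
        # Duration counts every step from first to last wet step of the event.
        events.append((s, e, (e - s + 1) * step_h, float(rain[s:e + 1].sum())))

    for i, r in enumerate(rain):
        if r > 0:
            if start is None:
                start = i
            elif i - last_wet - 1 >= dry_steps:
                # The dry gap is long enough: close the previous event.
                close(start, last_wet)
                start = i
            last_wet = i
    if start is not None:
        close(start, last_wet)
    return events

def label_events(rain_mm, landslide_idx, step_h=3, dry_period_h=48):
    """Return (LRE or None, list of NLREs) for one landslide record.

    Assumption: the series is cut at the landslide time, and the last event
    is the LRE only if fewer than `dry_steps` dry steps separate its end
    from the landslide; all earlier events are NLREs.
    """
    dry_steps = int(np.ceil(dry_period_h / step_h))
    events = split_rainfall_events(rain_mm[: landslide_idx + 1], step_h, dry_period_h)
    if events and landslide_idx - events[-1][1] < dry_steps:
        return events[-1], events[:-1]
    return None, events
```

A record for which `label_events` returns `None` as the first element corresponds to a landslide that could not be matched with rainfall conditions.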

Because of uncertainties in the landslide and SPE datasets, some landslide events may not be matched with their rainfall conditions, i.e., the cumulated event rainfall is zero. To assess the capability of the reconstruction algorithm, we defined the reconstruction percentage as the ratio of landslide records matched with their rainfall conditions to the total number of landslide records.

Rainfall threshold establishment

Rainfall thresholds have been a powerful tool for LEWSs, representing a minimum rainfall condition above which landslides are likely to occur, thus providing a simple and effective method to model the possible occurrence of landslides (Guzzetti et al. 2007; Segoni et al. 2018). ED rainfall thresholds are a common type of rainfall threshold, reflecting a linear increase of cumulated event rainfall with rainfall duration in logarithmic coordinates for rainfall conditions that are responsible for landslides (Innes 1983; Peruccacci et al. 2012), which can be written as

$$ E=\alpha {D}^{\beta } $$
(1)

where E is the cumulated event rainfall or total rainfall of a single rainfall event in millimeters and D is the rainfall duration of the single rainfall event in hours, β and α are the fitting parameters, and β ranges from 0 to 1.

In practice, various ED rainfall thresholds can be fitted based on different methods and stochastic concepts. High thresholds tend to miss more real disasters, whereas low thresholds are more sensitive to light rain, leading to false positives. The frequentist method was proposed to define multiple thresholds corresponding to different exceedance probabilities (frequentist thresholds, FTs; Brunetti et al. 2010; Piciullo et al. 2017), making it possible to search for better thresholds for an operational LEWS. This method yields thresholds for any given exceedance probability, e.g., a 5% threshold (T5) indicates that rainfall conditions below the threshold may initiate landslide events with a probability of less than 5%, given a sufficiently complete and representative catalog of rainfall events responsible for landslide occurrences. We employed (E, D) data points of LREs during the period of 2007–2014 to fit FTs from 1 to 50% for each rainfall product. The definition of rainfall thresholds using the frequentist method is also implemented in the code of Melillo et al. (2018; http://geomorphology.irpi.cnr.it/tools/rainfall-events-and-landslides-thresholds/ctrl-algorithm/ctrl-code/CTRL_code.R/). We modified the algorithm for application to gridded rainfall data.
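As a rough illustration of how a threshold at a given exceedance probability could be obtained from (D, E) points, the sketch below fits Eq. (1) by least squares in log-log space and shifts the intercept according to the spread of the residuals. It assumes the log-residuals are Gaussian with zero mean and uses their sample standard deviation; the cited frequentist method treats the residual distribution more carefully, and the CTRL code is not reproduced here. The function name and any input values are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def frequentist_threshold(D_h, E_mm, prob):
    """Return (alpha, beta) of E = alpha * D**beta for exceedance probability `prob`.

    Simplifying assumption: residuals of log10(E) around the least-squares
    line are Gaussian, so the threshold intercept is the 50% intercept
    shifted by z(prob) * sigma, with the slope beta kept unchanged.
    """
    log_d = np.log10(np.asarray(D_h, dtype=float))
    log_e = np.log10(np.asarray(E_mm, dtype=float))
    # Least-squares line in log-log space: log E = log(alpha50) + beta * log D
    beta, log_alpha50 = np.polyfit(log_d, log_e, 1)
    residuals = log_e - (log_alpha50 + beta * log_d)
    sigma = residuals.std(ddof=1)
    # z is negative for prob < 0.5, which lowers the threshold.
    z = NormalDist().inv_cdf(prob)
    alpha = 10 ** (log_alpha50 + z * sigma)
    return float(alpha), float(beta)
```

Under these assumptions, `frequentist_threshold(D, E, 0.05)` would play the role of T5 and `frequentist_threshold(D, E, 0.50)` the role of T50; all thresholds are parallel lines in log-log coordinates.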

Skill scores for evaluation

Using the FT at a given exceedance probability, both LREs and NLREs are divided into two subsets. ED data points of LREs above the FT are true positives (TPs), and those below the FT are false negatives (FNs). Similarly, rainfall events of NLREs above the FT are false positives (FPs) and those below the FT are true negatives (TNs). Skill scores, such as the Hanssen-Kuiper (HK) skill score (Hanssen and Kuipers 1965), the δ skill score, and the comprehensive skill score index (Gariano et al. 2015), are defined as functions of the TPs, FNs, FPs, and TNs. The HK skill score has been widely used in previous LEWS evaluations (Peres and Cancelliere 2014; Gariano et al. 2015; Brunetti et al. 2018), as it has a simple formula, robust performance, and general applicability for identifying the usability and accuracy of scientific data, methods, or models. HK is defined as follows:

$$ HK = POD - POFD = \frac{TP}{TP + FN} - \frac{FP}{FP + TN} $$
(2)

where the probability of detection (POD) is the hit rate, representing the ratio of correctly forecasted landslides to recorded landslides; the probability of false detection (POFD) is the false alarm rate, which is the ratio of NLREs falsely forecasted as initiating landslides to the total number of NLREs. Both indexes range from 0 to 1. Thus, HK ranges from −1 to 1 and its optimal value is 1, representing the ideal case in which all recorded landslides are correctly forecasted (POD = 1) without incorrect predictions (POFD = 0).
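Eq. (2) can be computed directly from the (D, E) points of LREs and NLREs once a threshold is fixed. The following sketch is only illustrative: the function names are hypothetical, and points lying exactly on the threshold are counted as exceeding it, which is an assumption not stated in the text.

```python
import numpy as np

def exceeds(D_h, E_mm, alpha, beta):
    """True where the event rainfall is on or above E = alpha * D**beta."""
    D = np.asarray(D_h, dtype=float)
    E = np.asarray(E_mm, dtype=float)
    return E >= alpha * D ** beta

def hk_skill(lre_D, lre_E, nlre_D, nlre_E, alpha, beta):
    """Return (HK, POD, POFD) for one ED threshold."""
    tp = int(np.sum(exceeds(lre_D, lre_E, alpha, beta)))    # LREs above the threshold
    fn = len(lre_E) - tp                                    # LREs below the threshold
    fp = int(np.sum(exceeds(nlre_D, nlre_E, alpha, beta)))  # NLREs above the threshold
    tn = len(nlre_E) - fp                                   # NLREs below the threshold
    pod = tp / (tp + fn)
    pofd = fp / (fp + tn)
    return pod - pofd, pod, pofd
```

For example, with three of four LREs above the threshold (POD = 0.75) and one of five NLREs above it (POFD = 0.2), HK is 0.55.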

We calculated the POD, POFD, and HK for each FT and SPE product. The performance of each SPE is shown by its HK curve (HK values plotted against the exceedance probability of each FT). The evaluation procedure was conducted using rainfall information from LREs and NLREs during the period of 2007–2014. Based on the HK curves, we determined the best-performing FTs as those with the highest HK values. The same evaluation procedure was conducted using landslide events and their rainfall conditions for different types of landslides and for 5° × 5° latitude and longitude grids across global land areas. To ensure the reliability of the evaluation, we removed grids with fewer than ten LREs.
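The grid-level filtering mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: cells are indexed from the south-west corner (−90°, −180°), points on the northern or eastern edge are clamped into the last cell, and the function names are hypothetical.

```python
import math
from collections import Counter

def grid_cell(lat, lon, size_deg=5.0):
    """Map a latitude/longitude pair to a (row, col) index of a regular grid."""
    n_rows = int(180 / size_deg)
    n_cols = int(360 / size_deg)
    # Clamp so that lat = 90 or lon = 180 falls into the last row/column.
    row = min(math.floor((lat + 90.0) / size_deg), n_rows - 1)
    col = min(math.floor((lon + 180.0) / size_deg), n_cols - 1)
    return row, col

def cells_with_enough_events(lats, lons, min_events=10, size_deg=5.0):
    """Count LREs per grid cell and keep cells with at least `min_events`."""
    counts = Counter(grid_cell(la, lo, size_deg) for la, lo in zip(lats, lons))
    return {cell: n for cell, n in counts.items() if n >= min_events}
```

Thresholds and HK values would then be computed only for the cells returned by `cells_with_enough_events`.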

Results

Rainfall event reconstruction and rainfall characteristics

We reconstructed LREs and NLREs and summarized the number of each and the reconstruction percentage (Table 2). Rainfall conditions for most landslide events were captured by the SPE products, with an average reconstruction percentage of 92.9%; i.e., about 7% of landslide events could not be associated with a valid rainfall amount. The reconstruction results vary with rainfall product and landslide type. There are significant differences in the number of NLREs, suggesting that rainfall characteristics are distinct for different rainfall products. To determine regional differences, we counted the number of LREs in each 5° × 5° latitude and longitude grid (Fig. 2).

Table 2 Summary of rainfall event reconstruction for landslide-initiating rainfall events (LREs) and non-landslide-initiating rainfall events (NLREs). Landslide types 0–6 represent all landslide types, landslide, mudslide, debris flow, rock fall, complex, and other types, respectively

Fig. 2 Landslide-initiating rainfall events (LREs) in 5° × 5° latitude and longitude grids worldwide for a TMPA-3B42RT V7, b CMORPH-3h V1, c PERSIANN-3h, and d MSWEP-3h V2 rainfall products. TG represents the total number of 5° × 5° grids

The statistics of the rainfall variables for LREs and NLREs are shown in Figure 3. Empirical cumulative distribution function (ECDF) curves show that LREs tend to have shorter rainfall duration, larger cumulated rainfall, and higher rainfall intensity compared with NLREs, except for PERSIANN-3h data. ECDF curves of rainfall intensity for LREs and NLREs are almost the same in PERSIANN-3h data, whereas shorter rainfall duration is accompanied by less cumulated rainfall for LREs. PERSIANN’s poor reconstruction percentage may be one reason for this anomaly (Table 2). Boxplots of rainfall variables reveal similar results and show that the rainfall intensity of LREs is slightly higher than that of NLREs for PERSIANN-3h data. Overall, satellite-based rainfall data can distinguish LREs from NLREs and are of potential use in global landslide modeling.

Fig. 3 Empirical cumulative distribution function (ECDF) curves and boxplots of rainfall duration, cumulated event rainfall, and rainfall intensity of the LREs and NLREs during the period of 2007–2014 for TMPA-3B42RT V7 (a–c), CMORPH-3h V1 (d–f), PERSIANN-3h (g–i), and MSWEP-3h V2 (j–l) rainfall products. The mean values of rainfall variables are labeled

Rainfall threshold establishment and evaluation at the global scale

Figure 4 displays the scattered (E, D) points of LREs and the FTs of 1%, 5%, 20%, and 50% (denoted by T1, T5, T20, and T50) for the four SPE datasets, with rainfall duration ranging from 3 to 1000 h. The slopes of the thresholds capture the increasing linear relationship between cumulated event rainfall and rainfall duration, indicating that ED rainfall thresholds are appropriate for linking rainfall-induced landslides and satellite-based rainfall data. As the exceedance probability increases, the threshold moves upward and tends to miss more actual landslide events. Rainfall thresholds defined by TMPA-3B42RT V7 are higher than those defined by CMORPH-3h V1, MSWEP-3h V2, and PERSIANN-3h. Specifically, more cumulated rainfall is required for TMPA-3B42RT V7 and CMORPH-3h V1 than for MSWEP-3h V2 and PERSIANN-3h when the rainfall duration is shorter than a few days. When the rainfall duration reaches 1 month or longer, rainfall conditions for landslide occurrence are almost the same for the CMORPH-3h V1, MSWEP-3h V2, and PERSIANN-3h datasets.

Fig. 4 Rainfall events and thresholds during the period of 2007–2014 for a TMPA-3B42RT V7, b CMORPH-3h V1, c PERSIANN-3h, and d MSWEP-3h V2 rainfall products. The dark gray points represent ED data points responsible for landslide events as defined in Melillo et al. (2018). T1, T5, T20, and T50 are the ED rainfall thresholds with the exceedance probability of 1%, 5%, 20%, and 50%, respectively

HK curves showing how the performance of rainfall thresholds varies with exceedance probability are presented in Figure 5. The results reveal that HK values first rise and then decline as the exceedance probability increases. CMORPH-3h V1 performs the best with the highest HK curve, followed by TMPA-3B42RT V7, whose HK values are about 0.03 lower than those of CMORPH-3h V1 for exceedance probabilities lower than 27% and almost the same for greater exceedance probabilities. The HK curve of MSWEP-3h V2 is significantly lower than that of TMPA-3B42RT V7, and PERSIANN-3h performs the worst. The average differences in HK values between TMPA-3B42RT V7 and MSWEP-3h V2, and between MSWEP-3h V2 and PERSIANN-3h, are 0.15 and 0.14, respectively. The best-performing FTs with the highest HK values are T22 for CMORPH-3h V1, T27 for TMPA-3B42RT V7, T31 for MSWEP-3h V2, and T9 for PERSIANN-3h. Detailed ED rainfall threshold information and evaluation results for the best-performing FTs are shown in Table 3.

Fig. 5 Hanssen-Kuiper (HK) skill score curves (HK values against the exceedance probability of each frequentist threshold) for each rainfall product based on ED rainfall thresholds. The individual points highlight HK values of the best-performing frequentist thresholds for each rainfall product

Table 3 Parameters and skill scores of the global ED thresholds for rainfall-induced landslide occurrence based on the best-performing frequentist thresholds for each satellite-based rainfall dataset

Rainfall threshold establishment and evaluation for different landslide types

We established frequentist rainfall thresholds based on landslide types in the GLC following the same procedure as in section “Rainfall threshold establishment and evaluation at the global scale”. Figure 6 shows the performances of rainfall thresholds for different landslide types as HK variations with exceedance probability. HK values for different landslide types follow the same rising-and-falling trend and are at least as good as HK values for all landslide types. Comparatively, the rainfall thresholds for mudslide and debris flow perform as well as that for all landslide types; thresholds for rock fall and complex perform better, with HK values 0.02 higher on average than those of all landslide types. Detailed evaluation results for the best-performing FTs are displayed in Table 4.

Fig. 6 HK curves of different types of landslides for a TMPA-3B42RT V7, b CMORPH-3h V1, c PERSIANN-3h, and d MSWEP-3h V2 rainfall products

Table 4 Threshold parameters and skill scores for different types of landslides based on the best-performing frequentist thresholds. The landslide types are the same as those in Table 2

These results show significant differences among rainfall thresholds for various landslide types (Fig. 7). All rainfall thresholds for specific landslide types are significantly different from the rainfall thresholds of all landslide types, except rock fall and complex for TMPA-3B42RT V7 data. As the rainfall duration increases, so do differences in the cumulated rainfall thresholds of the four landslide types. Specifically, the rainfall threshold of debris flow is higher than that of all landslide types, except for PERSIANN-3h data, i.e., on average, the occurrence of debris flow requires higher rainfall intensity than other types of landslides for a given rainfall duration. In contrast, rainfall thresholds of mudslide and complex are lower than that of all types excluding CMORPH-3h V1. Admittedly, systematic uncertainties exist considering the completeness and representativeness of landslide events for different landslide types. This work serves as a preliminary and qualitative analysis.

Fig. 7 Rainfall thresholds of different landslide types for a TMPA-3B42RT V7, b CMORPH-3h V1, c PERSIANN-3h, and d MSWEP-3h V2 rainfall products based on the best-performing frequentist thresholds

Rainfall threshold establishment and evaluation at a grid scale

Rainfall thresholds were established and evaluated for each 5° × 5° latitude and longitude grid defined in section “Rainfall event reconstruction and rainfall characteristics” (Fig. 2), i.e., we calculated rainfall threshold parameters α, β, and HK values for each grid. The total number of grids with at least ten LREs ranges from 79 for PERSIANN-3h to 102 for CMORPH-3h V1, and the HK values of the best-performing FTs range from 0.1 (the worst for PERSIANN-3h) to 0.9 (the best for TMPA-3B42RT V7; Fig. 8).

Fig. 8 HK values based on the best-performing frequentist thresholds in 5° × 5° grids of latitude and longitude for a TMPA-3B42RT V7, b CMORPH-3h V1, c PERSIANN-3h, and d MSWEP-3h V2 rainfall products

Inspection of Figure 8 shows that TMPA-3B42RT V7 and CMORPH-3h V1 are superior to the other two rainfall products, with more grids of high HK, a finding that coincides with the results in section “Rainfall threshold establishment and evaluation at the global scale”. At the grid scale, however, TMPA-3B42RT V7 performs better than CMORPH-3h V1, with more grids that have HK values greater than 0.5 (64.6% vs. 61.8%). When the percentage of grids with HK values greater than 0.3 is examined, CMORPH-3h V1 outperforms TMPA-3B42RT V7. For PERSIANN-3h, 35.4% (31.6%) of all grids have HK values between 0.3 (0.2) and 0.5 (0.3), whereas a majority (53.2%) of MSWEP-3h V2 grids have HK values between 0.3 and 0.5.

We then tested the regional effect of rainfall on rainfall threshold establishment by assessing the relationship between the rainfall threshold parameters α, β and the mean annual precipitation (MAP) in grids. The MAP over the period of 2007–2014 of each 5° × 5° grid was calculated based on the original SPEs with 0.25°/0.1° resolution, and we analyzed the linear relationships between log α, β, and MAP (Fig. 9). Variations in α can be well fitted by MAP variations, and α increases with MAP. The relationship between β and MAP is not significant, with a p value greater than 0.1, and β decreases with MAP for TMPA-3B42RT V7 and CMORPH-3h V1 but increases for PERSIANN-3h and MSWEP-3h V2, i.e., variations in β cannot be explained by variations in MAP alone. Further analysis should include more geological and climatic factors. This analysis suggests that the regional differentiation of rainfall thresholds can be partly represented by variations in precipitation.
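The linear relationship between log α and MAP can be examined with an ordinary least-squares fit, sketched below with NumPy only. The function name and any input values are hypothetical and do not reproduce the values in Fig. 9. The sketch returns the slope, intercept, and Pearson correlation coefficient; a p value for the slope would additionally require a t-distribution (for instance via `scipy.stats.linregress`), which is omitted here.

```python
import numpy as np

def fit_log_alpha_vs_map(map_mm, alpha):
    """Fit log10(alpha) = slope * MAP + intercept over grid cells.

    Returns (slope, intercept, r), where r is the Pearson correlation
    between MAP and log10(alpha).
    """
    x = np.asarray(map_mm, dtype=float)
    y = np.log10(np.asarray(alpha, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return float(slope), float(intercept), float(r)
```

The same function could be applied to β by passing `10 ** beta` in place of `alpha`, or more simply by fitting β against MAP directly with `np.polyfit`.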

Fig. 9 Linear relationships between the parameters of the best-performing frequentist thresholds and mean annual precipitation (MAP) based on values in the 5° × 5° grids of latitude and longitude (in Figs. 2 and 8) for TMPA-3B42RT V7 (a, b), CMORPH-3h V1 (c, d), PERSIANN-3h (e, f), and MSWEP-3h V2 (g, h) rainfall products. Here, TG represents the number of scattered points

Discussions

Rainfall thresholds

Caine (1980) proposed the first global rainfall threshold for shallow landslides and debris flows. Following this pioneering work, more global thresholds for shallow landslides and debris flows were published (Guzzetti et al. 2008) and a global rainfall threshold based on TMPA was prepared by Hong et al. (2006). We compared the rainfall thresholds proposed in our study with existing global rainfall thresholds (Fig. 10). Our rainfall thresholds are lower than existing rainfall thresholds except for the rainfall threshold proposed by Guzzetti et al. (2008). The higher existing rainfall thresholds were established with rain-gauge data, whereas the rainfall thresholds in this study and in Hong et al. (2006) were derived from satellite-based rainfall data. Brunetti et al. (2018) concluded that satellite data underestimate rainfall with respect to ground observations, which can explain the lower rainfall thresholds. Our results corroborate this finding. Taking the TMPA-based thresholds as a reference, our thresholds are lower than that proposed by Hong et al. (2006), perhaps because the latter was established with a small landslide dataset, similar to the differing results of Caine (1980) and Guzzetti et al. (2008). In summary, our thresholds are comparable to the published global rainfall thresholds.

Fig. 10 Rainfall thresholds proposed in this study for landslide types: a all types, b mudslide, c debris flow, d rock fall, and e complex compared to the existing global rainfall threshold

Regional effect

Landslide occurrence is a combined effect of preparatory variables and triggering factors, both of which exhibit obvious variations (Dai et al. 2002; Corominas et al. 2014). From one area to another, the factors that initiate landslides differ and therefore the pattern of landslide occurrence varies. The spatial occurrence of landslides is evidently heterogeneous: the USA, the Caribbean islands, the Andes, the UK, the European Alps, East Africa, Iran, and most of South Asia, East Asia, and Southeast Asia are areas of high landslide incidence (Fig. 1; Kirschbaum et al. 2015). Landslide density also varies from 10 to 297 per 5° × 5° latitude and longitude grid (Fig. 2). High mountain areas such as the Alps, Andes, and Himalayas are prone to landslides, partly owing to fundamentally unstable geological and geomorphological conditions (e.g., Dahal and Hasegawa 2008). Slope stability in North America, South America, East Asia, South Asia, and Southeast Asia is strongly influenced by climatic conditions (Froude and Petley 2018). The spatial distribution of HK values in our analysis of each 5° × 5° grid is a consequence of the regional effect on landslide occurrence (Fig. 8). We found that variations in rainfall threshold parameters can be fitted by variations in MAP to some extent (Fig. 9). We also compared our evaluation results with those obtained by Brunetti et al. (2018) in Italy, in which the optimal HK values are 0.41, 0.43, and 0.31 for TMPA, CMORPH, and PERSIANN data. HK values of the two 5° × 5° grids corresponding to Italy in this study are shown in Figure 11. The optimal HK values differ due to the differences in landslide inventory data and methods. However, the conclusions are the same: SPEs are able to detect landslide occurrences, and CMORPH performs the best whereas PERSIANN performs the worst among the three common rainfall datasets investigated.

Fig. 11 HK values in Italian regions in 5° × 5° grids of latitude and longitude for a TMPA-3B42RT V7, b CMORPH-3h V1, c PERSIANN-3h, and d MSWEP-3h V2 rainfall products

Limitations and potentials

Our evaluation shows that the performances of SPEs vary by product, landslide type, and region. The strong performance of TMPA-3B42RT V7 and CMORPH-3h V1 data can be attributed to their detailed and precise rainfall estimates; however, the reasons for the poor performances of the PERSIANN-3h and MSWEP-3h V2 products are diverse. There are intrinsic limitations in linking landslide events with their rainfall conditions. Uncertainties in landslide and rainfall information may lead to some landslide events failing to link with their responsible rainfall events (Table 2). Biases and errors in landslide occurrence time and location cannot be excluded completely (Chleborad et al. 2006; Froude and Petley 2018). Data availability varies among regions owing to different monitoring capabilities, thus leading to underestimation of landslide occurrences in some areas. There are also large systematic and random errors for all SPE data over large regions (Sun et al. 2018).

Another limitation of this study is its reliance on empirical rainfall thresholds to determine the occurrence of landslides. ED thresholds captured the linear relationship between cumulated event rainfall and rainfall duration (Fig. 4). Although rainfall is the main trigger of the landslides examined here, interaction between rainfall and terrestrial systems, including factors such as slope, elevation, lithology, and land cover, is also responsible for the initiation of landslides (e.g., Arnone et al. 2011; Pisano et al. 2017; Reichenbach et al. 2018). HK values are very low in western North America, western Asia, and parts of Oceania, suggesting the contribution of factors other than precipitation in these areas. The changing world facilitates environmental and geographic forces that weaken slope stability and expose more people and buildings to threat (Gariano and Guzzetti 2016; UNISDR 2017), thus greatly raising the risk of landslide disasters in the future. Links between SPEs, other environmental factors, and rainfall-induced landslides should be examined in further evaluations, especially under changing scenarios.

This work will promote further evaluation of existing global landslide hazard algorithms based on SPEs and their application in LEWSs. Kirschbaum and Stanley (2018) developed a global landslide model whose validation was limited by the lack of a global database of non-landslide points. We used NLREs as a substitute for non-landslide points, yielding POD and POFD for evaluation. The reconstruction of NLREs also has the potential to advance other statistical methods, such as machine-learning algorithms, in the evaluation procedure.

Conclusions

Satellite-based rainfall data hold promise in advancing real-time global landslide warnings. However, few previous studies have compared the performances of various satellite-based rainfall datasets or addressed worldwide evaluations. In this study, four popular and widely used satellite-based rainfall products—TMPA-3B42RT V7, CMORPH-3h V1, PERSIANN-3h, and MSWEP-3h V2—were assessed and compared based on rainfall thresholds of landslide occurrence.

We found significant differences between LREs and NLREs, suggesting that satellite-based rainfall data are suitable for extracting rainfall conditions corresponding to landslides and useful for global landslide modeling. The HK skill score was employed for comparative evaluation of different satellite-based rainfall data. CMORPH-3h V1 performs the best with the highest HK values, followed by TMPA-3B42RT V7, MSWEP-3h V2, and PERSIANN-3h. Further investigation indicated that thresholds for specific landslide types perform as well as or better than those considering all landslide types; moreover, large variations exist across the 5° × 5° latitude and longitude grid regions examined.

This study shows that satellite-based rainfall data are helpful for global landslide detection, though the performances of these data vary by product, landslide type, and region. Further efforts are required to link satellite-based rainfall data with landslide observations and other environmental factors to detect trends and variations in rainfall-triggered landslides in a changing world.