Abstract
Statistical landslide susceptibility mapping is a topic in complete and constant evolution, especially since the introduction of machine learning (ML) methods. A new methodological approach is here presented, based on the ensemble of artificial neural network, generalized boosting model and maximum entropy ML algorithms. Such approach has been tested in the Monterosso al Mare area, Cinque Terre National Park (Northern Italy), severely hit by landslides in October 2011, following an extraordinary precipitation event, which caused extensive damage at this World Heritage site. Thirteen predisposing factors were selected and assessed according to the main characteristics of the territory and through variance inflation factor, whilst a database made of 260 landslides was adopted. Four different Ensemble techniques were applied, after the averaging of 300 stand-alone methods, each one providing validation scores such as ROC (receiver operating characteristics)/AUC (area under curve) and true skill statistics (TSS). A further model performance evaluation was achieved by assessing the uncertainty through the computation of the coefficient of variation (CV). Ensemble modelling thus showed improved reliability, testified by the higher scores, by the low values of CV and finally by a general consistency between the four Ensemble models adopted. Therefore, the improved reliability of Ensemble modelling confirms the efficacy and suitability of the proposed approach for decision-makers in land management at local and regional scales.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
Landslides are one of the major natural hazards, having large and spread impact all over the world and are responsible for human and socio-economic losses (Froude and Petley 2018). Identification and mitigation of landslide risk is still a challenging task for local administrations and policymakers, being the spatial-temporal prediction of such phenomena still affected by large uncertainties. To this aim, landslide susceptibility assessment is fundamental to improve the understanding of the occurrence of such phenomena (Brabb 1984; Gigli et al. 2014; Lagomarsino et al. 2017). Many methodologies aim at assessing Landslide Susceptibility Mapping (LSM) and in recent years, a significant boost has been gained thanks to robust scientific advances, both among numerical deterministic modelling (Jelínek and Wagner 2007; Godt et al. 2008; Cervi et al. 2010; Formetta et al. 2016; Schilirò et al. 2016; Ciurleo et al. 2017; Park et al. 2019) and statistical approaches (Van Westen et al. 2003; Yalcin et al. 2011; Goetz et al. 2015; Reichenbach et al. 2018; Sepe et al. 2019; Huang et al. 2019; Kavoura and Sabatakakis 2019; Xiao et al. 2019). Deterministic models need information about the physical processes leading to the trigger and hence involve different approaches according to the different types of mass movements. On the other hand, the statistical modelling assumes that the factors that led to slope failure in the past are to be expected to recurrently trigger landslides. Thus, inventories of past landslides coupled with environmental factors can be used to train statistical models (Ermini et al. 2005; Pradhan and Lee 2010; Vorpahl et al. 2012; Corominas et al. 2014; Segoni et al. 2015; Carotenuto et al. 2017). No standard susceptibility procedure has ever been established, hence making possible the exploitation of numerous different algorithms and methodologies. Statistical LSM methods have been extensively applied in different settings and spatial scales by performing different models: bivariate statistical analysis (Constantin et al. 2011; Chen et al. 2018), logistic regression (Chauhan et al. 2010; Mousavi et al. 2011; Wang et al. 2015; Trigila et al. 2015; Tsangaratos and Ilia 2016a, b), multivariate regression (Guzzetti et al. 2006; Akgün and Türk 2011; Felicísimo et al. 2013; Althuwaynee et al. 2014), weight of evidence (Neuhäuser et al. 2012; Ilia and Tsangaratos 2016; Teerarungsigul et al. 2016; Tsangaratos et al. 2017), etc. Spatial distribution of landslides has been also investigated by mathematical models such as fractals (Liu et al. 2019). Among the new statistical approaches, machine learning algorithms (MLAs) are more and more recurrent, gaining new consideration. Many algorithms have been applied to LSM, such as artificial neural networks (ANN—Lee et al. 2004; Ermini et al. 2005; Choi et al. 2012; Gorsevski et al. 2016; Meng et al. 2016; Chen et al. 2018), decision trees (DT—Yeon et al. 2010; Tsangaratos and Ilia 2016b), support vector machine (SVM—Yao et al. 2008; Kavzoglu et al. 2014; Pham et al. 2018, Dou et al. 2019), random forest (RF—Catani et al. 2013; Chen et al. 2017a, b, Krkač et al. 2017) or treebagger RF (Segoni et al. 2020). Many authors have proposed methods to minimize uncertainty, which relies on combining the predictions yielded by multiple algorithms, as testified by the works of Umar et al. (2014), Youssef et al. (2015), Chen et al. (2017a, b), Pham et al. (2017), Kim et al. (2018) and Bueechi et al. (2019). The investigation of new ensemble methods (EMs) for landslide susceptibility mapping is therefore highly necessary, as also highlighted by Chen et al. (2018). Anyhow, EM provides a solid contribution to minimize uncertainty and to refine and improve the prediction accuracy, which is always the key parameter to take into account when working with LSM. In this work, three different MLAs were ensembled to assess landslide susceptibility within some small coastal basins located in the Monterosso al Mare area (Cinque Terre National Park, La Spezia Province, Italy). Specifically, ANN, generalized boosting (GBM) and maximum entropy (MaxEnt) were employed. The study area is highly vulnerable to rapid shallow landslides, as testified by the rainfall-induced ground effects produced by the severe rainfalls occurred on October 25, 2011, which affected a large area across Northern Tuscany and Eastern Liguria reaching intensities of up to 350 mm in 6 h at Monterosso al Mare Rain Gauge (A.R.P.A.L.-C.F.M.I.-P.C. 2012; Cevasco et al. 2015). Such rainfall, which caused extensive damage across the Tyrrhenian basins between Bonassola and Manarola and the Magra river basin (Cevasco et al. 2012; D’Amato Avanzi et al. 2013) triggered more than 500 shallow landslides in the Vernazza catchment (Cevasco et al. 2013, 2014) and about 260 shallow landslides in the area of Monterosso al Mare (Schilirò et al. 2018). Thanks to the availability of high-resolution aerial photographs and field surveys, these landslides were inventoried at great detail. Thus, the database of Monterosso al Mare landslides, represented as points, has been used for susceptibility modelling. Thirteen predisposing factors (PFs) were also considered, based on the main characteristics of the territory and of shallow rapid landslide mechanisms: slope angle, aspect angle, geo-lithology, planform curvature, profile curvature, distance to roads, distance to streams, land use, agricultural terraces state of activity, soil thickness, Topographic Wetness Index (TWI), Topographic Position Index (TPI) and Stream Power Index (SPI). Specifically, each single susceptibility model (GBM, ANN and MaxEnt) was implemented with 100 different landslide data combinations, every time splitting the landslide database in 80% as data training and 20% as data testing. The dataset of selected landslides was performed randomly and, in the end, about 300 different models have been obtained starting from slightly different datasets. The final result is therefore represented by a susceptibility map computed by an average of the values of probability of occurrence previously obtained. With this case study, we aim to promote the use of MLA to assess landslide susceptibility and to provide an example showing how Ensemble modelling can support land use planning activities and landslide hazard management.
Study area
Geological and geomorphological settings
The study area (Fig. 1), approximately 5.5 km2 wide, is located in the Cinque Terre National Park, within the territory of Monterosso al Mare municipality, along the easternmost Ligurian coast (Northern Italy, La Spezia Province). It includes six small catchments: Rio delle Rocche, Rio Pastanelli, Fosso Serra, Rio Gaggia, Corone and Fosso Molinaro. The extent of these water basins ranges from 3.2 km2 (Rio Pastanelli) to 0.1 km2 (Fosso Serra). The Cinque Terre area is a UNESCO World Heritage Site, since 1997 (https://whc.unesco.org/en/list/826/) and a National Park since 1999. This area is worldwide known as a typical example of man-made landscape, characterized by widespread century-old agricultural terraces retained by dry-stone walls. The Cinque Terre National Park, and in particular its western sector, is geologically belonging to an NW–SE oriented segment of the Northern Apennine, an orogenic chain formed during the Tertiary period (Abbate et al. 1969; Terranova et al. 2006). This sector of the belt is made up of a nappe sequence that includes five overlapping tectonic units, from the top to the bottom: Gottero Unit, Ottone Unit, Marra Unit, Canetolo Unit and Tuscan Nappe (Raso et al. 2019a) (Fig. 2). The Gottero Unit crops out in the westernmost sector of Cinque Terre National Park, along the Punta Mesco promontory, and mainly consists of ophiolite rocks (Jurassic), followed by a turbidite sequence (Late Cretaceous). The Ottone, Canetolo and Marra units are localized in the western side of the Monterosso al Mare village and in the central sector of the national park. The first one is prevalently composed of pelitic rocks (Monte Veri Monte Veri Complex Fm.); Canetolo and Marra occupy a narrow NW–SE-oriented stretch of land encompassing the coastal villages of Manarola and Corniglia and they include claystone with limestones and silty arenaceous turbiditic rocks. The Tuscan Nappe occupies most of the Cinque Terre National Park as it crops out on the eastern, central and western sectors; in the Monterosso al Mare area, this unit is represented by thick sandstone-claystone turbidites (Macigno Fm.), largely cropping out along the coast and along the Rio Pastanelli catchment. The geomorphological setting of the area is strictly linked to the tectonic history of the Apennine orogeny, which experienced an earlier compressive phase first, and a ductile extensive phase then (Carmignani et al. 1994). The Cinque Terre National Park territory is bounded by the sea at SW and S, with a 15-km-long coastal configuration and highest elevations reaching 800 m a.s.l. along the NW–SE-orientated main mountain ridge, located at a very short distance from the coastline. The geo-structural and lithological setting strongly influences the local morphology, including slope orientation: most of the slopes are SE to SW-oriented, showing high slope gradient values (over 60°) for more than 74% of the total surface. The Monterosso al Mare territory is nevertheless characterized by slightly different slope orientation and steepness, being N–NE oriented and showing more gentle slopes, mainly due to the weathering affecting gabbros and serpentinites belonging to the Liguridi tectonics units. The tectonic activity influenced also the stream network configuration: the main streams, Fegina (which gives the name to the homonymous hamlet) and Pastanelli, have a very steep profile and are short, with anti-Apennine direction, imposed on approximately N–S faulting systems. These creeks mostly show a torrential behaviour, characterized by high rates of solid transport mainly due to the erosional processes affecting the surrounding slopes: especially in the occurrence of extraordinary rainfall events, very frequent in the area, these high amounts of material strongly contribute to feed the beach deposits that interrupt the continuity of the rocky sea cliffs. Among the natural factors, slope steepness, tectonic setting and the role of exogenous agents (i.e. rainfall and anthropic activity) made the area of Monterosso al Mare prone to landslides over time. Many landslides are reported, besides those of the 2011 event, such as the ones affecting the Mesco promontory along with the western sector of the coastline. A fundamental role to slope stability is also played by agricultural terraces, which strongly reshape the surrounding landscape and therefore contribute to providing a worldwide knowledge of the Cinque Terre National Park (D’Onofrio and Trusiani 2018). As noted before, Cinque Terre is characterized by an average annual precipitation rate of 900 mm; however, extraordinary rainfall event is very frequent due to its geomorphological configuration. The ridge, running parallel to the coast, represents, indeed, a barrier for cumulonimbus coming from S to SW. Furthermore, in the ongoing climate change climatic context, a general increase of precipitation extremes is recognizable, leading thus to an intensification of the frequency of flash floods and landslides.
Land use setting
The landscape of the Cinque Terre area has been almost totally modified by human activity during the last centuries through agricultural terraces building (Terranova 1984; Brandolini 2017). The agricultural terraced slopes at Cinque Terre area (Fig. 3) extend from the shoreline, just above the cliff edge or from the toes of coastal landslides to an average altitude of 400–450 a.s.l., and in some cases up to 500 m (Terranova 1984; Terranova et al. 2006). The necessity of claiming new spaces to allocate agricultural activities, mainly for vineyards and olive groves, led the local farmers to rework and retain the shallow eluvial–colluvial soil covers by constructing dry-stone walls, over the most suitable bedrock (Brandolini 2017). Terraced areas are almost entirely distributed over some specific geological formations: most of the cultivations can be found on terraces built on the Macigno sandstones belonging to the Tuscan Unit or over the limestones and silty sandstones belonging to the Canetolo Unit or to the Ottone Unit, while few terraced areas can be counted on the sandstones, gabbros and serpentinites that crop out in the Monterosso al Mare area and are part of the Gottero Unit. According to Terranova (1984), about 60% of the territory of the Cinque Terre area is covered by terraced areas, and the sum of the linear extent of dry-stone walls can be estimated to approximately 6000 km. Considering an average dry-stone wall height of 2 m (however varying, according to slope steepness and wall width), the total volume of reworked stones is calculated to be approximately 8 × 106 m3 (Terranova 1984).
The lack of maintenance of the dry-stone walls and the clogging of drainage channels due to farmland abandonment are considered some of the main causes of the increase in landslide susceptibility of terraced slopes within the Cinque Terre National Park (Canuti et al. 2004; Tarolli et al. 2014; Cevasco et al. 2014; Brandolini et al. 2018). In this area, agricultural terrace abandonment started between the two World Wars and accelerated after the 1950s, leading to widespread and intense erosion and land degradation issues. According to a recent inventory, more than 80% of the whole terraced areas are nowadays abandoned (Brandolini 2017) and have been replaced by pine trees and by Mediterranean scrub, peculiar land-use features in the Liguria region. Although the slopes were almost entirely terraced during the past centuries for vineyards and olive yards, only 19% of the total terraced areas are still cultivated. This issue denotes how the social changes that took place in the second half of the 1950s affected the land use at Cinque Terre area (Schilirò et al. 2018). On abandoned terraced areas, vegetation spreads quite rapidly, as well as scrubs grow up sparsely. After a time span of about 25–30 years, dense vegetation constituted by forest tree species (mainly pine, mixed woods, chestnut woods, holm oak woods) or by Mediterranean scrub covers abandoned terraces (Brandolini et al. 2018). Such land-use modifications concurred to increase the propension of the Cinque Terre territory to be affected by landslides, as confirmed by previous studies which investigated the relationship between agricultural terrace abandonment and the magnitude of rainfall-induced shallow landslides (Cevasco et al. 2013, 2014; Galve et al. 2015; Pepe et al. 2019).
Data and methods
Landslide inventory
The Cinque Terre area has been historically affected by recurring landslide events, mainly in response to extreme rainfall events related to the peculiar rainfall regime (Cevasco et al. 2015). A complete landslide inventory map of the Cinque Terre National Park was redacted by Raso et al. (2019a, b), in which a total of 459 landslides characterized by an extension higher than 100 m2 were identified and classified according to Cruden and Varnes (1996) and Hungr et al. (2014). A large number of debris slides (22.9%) can be related to the vulnerability of dry-stone terraces, while rockfalls (17.6%) are concentrated along the coast. Debris flows (11.1%) are also very spread, especially in the western sector of the Cinque Terre area (Monterosso al Mare and Vernazza municipalities); most of them were triggered by the October 25, 2011 rainfall event, between 9 and 15 UTC. The Monterosso al Mare rain gauge recorded very high hourly rainfall peaks (i.e. 92 mm/h), for a total of 382 mm of rain falling during the entire event (A.R.P.A.L.-C.F.M.-P.C. 2012). As a consequence of such an extreme event, numerous shallow landslides and dry stone wall collapses were triggered, more than 250 in the Monterosso al Mare catchment alone (Fig. 4), where the mobilization of enormous amounts of material caused severe damage to Monterosso al Mare village (Cevasco et al. 2015).
The landslide inventory used in this study was prepared in the aftermath of the October 25, 2011 event (Fig. 4) through aerial ortophotogrammetry and field survey (Schilirò et al. 2018). Two hundred and sixty shallow landslides and dry-stone wall collapses were triggered following the abundant precipitations and therefore mapped. Most of these landslides were classified as debris avalanches, as defined by Hungr et al. (2001), namely shallow and very fast flow-like movements (velocity between 3 m/min to 5 m/s, Cruden and Varnes 1996) of partially or completely saturated debris on very steep slopes. Furthermore, having available the surveyed landslides as polygons, only one the highest point, characterized by the highest altitude in the detachment area, has been selected. This operation has been conducted in order to make possible the run of the different models requiring points as input data.
Predisposing factors
According to the most representative local morphological and spatial features, 13 PFs were selected (Fig. 5). PFs include slope angle; aspect angle; planform curvature; profile curvature; distance to roads; distance to streams; Topographic Wetness Index, Topographic Position Index and Stream Power Index; agricultural terraces state of activity; land use; geo-lithology and soil thickness. The analysis is focused on environmental variables since it does not take into consideration the triggering factors (e.g. intense rainfall or earthquakes). Since most of the variables were derived from the digital terrain model (DTM) of 5 × 5 m cells, all the elements were set to equal DTM resolution.
A brief description of the PFs is available as Supplementary Information (Supplementary 1).
Modelling methods
Three different MLAs, included in the package “biomod2” (Thuiller et al. 2016), developed in R environment (R Development Core Team 2019) were used. Specifically, artificial neural network (ANN), generalized boosting model (GBM) and maximum entropy (MaxEnt) algorithms, well-known for their good performance in species distribution modelling (Elith et al. 2006), were adopted. A brief description of the algorithms is provided as follows, whereas extended descriptions are available as Supplementary Information (Supplementary 2).
To implement and evaluate models obtained with the different MLAs, the k-fold cross-validation approach was used. Cross-validation is one of the most widely accepted approaches for testing the predictive accuracy of species distribution models (SDMs). A random part of the data is kept for calibration (i.e., training data) while the remainder is used to test (i.e., testing data) the prediction of the model; the whole approach is then repeated several times for a single model and the average predictive accuracy is finally reported (Araujo et al. 2005; Thuiller et al. 2009). Therefore, each SDM was carried out 100 times splitting the population in 80% for training and the remaining part for testing.
Artificial neural network
ANN simulates the structure and/or functional aspects of biological neural networks of the brain, in order to process information (Zurada 1992). Similarly, ANN models are composed by a large number of nodes and connections, arranged in various layers, with the input layer containing the data being processed, many hidden layers and the output layer that represents the final result of the model. Input data, in our case, are the environmental variables, each represented by a node. Connections between nodes and hidden layers may be characterized by specific weights that are randomly assigned at the beginning of the process and later updated for algorithm optimization by means of back-propagation processes (Pijanowski et al. 2002).
Generalized boosting model
The boosting principle was originally developed, among the MLAs, to improve classification procedures (Schapire 1990) and consists of an ensemble of several logistic regression or decision trees. In such an algorithm, a starting model is built by relying on some randomly chosen trees and its performance is then improved by iteratively adding new randomly chosen decision trees that improve the accuracy of the previous iteration’s model.
Maximum entropy
Conceptually, MaxEnt relies on the distribution of an event in the space, related to the distribution on the landscape of the factors characterizing the events (Philips and Dudik 2008). MaxEnt method compares the probability density functions (PDFs) of environmental variables at event’s occurrence places with the PDFs of the same variables on the remaining landscape; the difference is regarded as the “relative entropy” (Elith et al. 2011, Convertino et al. 2013). MaxEnt algorithm finds the PDFs of environmental variables characterizing the events that approximate the PDFs on the landscape, i.e. by maximizing the entropy of information. Hence, the aim of MaxEnt is to recognize the PDFs of variables that may induce patterns to happen (Kornejady et al. 2017). The maximized entropy provides information to forecast those patters with high precision. Since MaxEnt does not specifically model the presence data but rather the density of the environmental conditions used, raw outputs are then transformed into logistic values to be interpreted directly as the probability of an event to occur.
Ensemble modelling
The approach of an Ensemble modelling (EM) is relatively recent and was first introduced by Burnham and Anderson (2002), who made the average of different regression models. The basic principle of multi-model inference is to avoid selecting the best model and to rely on more models indicated; in such a way, uncertainty and bias of both variables and models are minimized. The procedure of modelling is visible in Fig. 6 and can be subdivided into different steps. The first one allows recognizing landslide occurrence data as the response variable and the predisposing factors as predictors. In the second step, to reduce the collinearity among predictors, the variance inflation factor (VIF) was measured and a value of 0.7 was set as threshold. VIF is based on the square of the multiple correlations coefficients resulting from regressing a predictor variable against all other predictor variables. It, therefore, detects multicollinearities that cannot always be easily detected through a simple correlation. In detail, two different strategies to exclude highly collinear variables using a gradual procedure were chosen. Vifcor (cor stands for correlation) was used; first, it finds a pair of variables that has the maximum linear correlation and excludes the one having the largest VIF. The procedure is repeated until there is no variable with a higher correlation coefficient (greater than the threshold) between another variable. Vifstep calculates the VIF for all variables, excludes the one with the highest VIF (greater than the threshold) and repeats the procedure until there are variables with VIF greater than the remainder (Guisan et al. 2017). In the next step, the third, the above mentioned SDMs were performed to model susceptibility over the study area. Furthermore, during this phase, the importance of each variable related to the probability of event occurrence was acquired. Once the models are trained, the variables’ importance is obtained, by making a correlation between the individual variables (Pearson’s correlation, by default). A good correlation score between two predictors means that one of the two variables has low influence, and it is considered not important for the model. On the contrary, a low correlation means that both variables have a high influence on the model. In the ‘VarImportance’ output, the values given correspond to 1 minus the correlation score. The higher the value, the more important the prediction variable is in the model. To assess the reliability of the results of each SDMs, ROC (receiver operating characteristic) and TSS (true skill statistic) methods were chosen. TSS takes into account both omission and commission errors and ranges from − 1 to 1, not being affected by prevalence differently from Kappa (Allouche et al. 2006). Values ranging from 0.2 to 0.5 were considered poor, from 0.5 to 0.8 useful and values larger than 0.8 were considered good to excellent (as in Coetzee et al. 2009). Moreover, ROC/AUC was used as an alternative measure of accuracy. ROC use is very spread in literature: Cantarino et al. (2019) even applied a classification scheme of susceptibility levels based on ROC analysis. Prediction accuracy is considered to be similar to random for ROC/AUC values lower than 0.5; poor, for values in the range 0.5–0.7; fair in the range 0.7–0.9 and excellent for values greater than 0.9 (Swets 1988; Fressard et al. 2014). Since ROC/AUC curves show higher performance values and it is more consolidated in literature, the ensemble modelling has been used. The outputs of the models were combined by implementing four ensemble techniques and the variability of the modelling outputs was calculated by executing the coefficient of variation.
In detail, four kinds of representative ensemble models were chosen, namely (Thuiller et al. 2016):
-
Mean of probabilities (PM)—this ensemble method corresponds to the mean probabilities over each selected model
-
Median of probabilities (PME)—this ensemble model corresponds to the median probability over the selected models. The median is less sensitive to outliers than the mean
-
Committee averaging (CA)—to do this model, the probabilities from the selected models are first transformed into binary data. It is built on the analogy of a simple vote. Each model votes for the landslides being either present or absent. The interesting feature of this measure is that it gives both a prediction and a measure of uncertainty. When the prediction is close to 0 or 1, it means that all the models agree to predict 0 and 1 respectively; when the prediction is around 0.5, it means that half the models predict 1 and the other half 0
-
Weighted mean of probabilities (PMW)—this algorithm returns the mean weighted (or, more precisely, this is the weighted sum) by the selected evaluation method scores
To measure the role of the chosen EM on the prediction output, the spatial variability between mean, weighted mean and median-based final predictions by means of ANOVA test (Analysis of Variance test, Fisher 2006) was tested. ANOVA provides a statistical test of whether two or more population means are equal, and it is based on the law of total variance, where the observed variance in a particular variable is partitioned into components attributable to different sources of variation. Furthermore, pairwise differences between methods were verified by means of the Tukey honestly significant difference test (Tukey 1953), one of the most recommended and used procedures for controlling Type I error rate when making multiple pairwise comparisons.
Results
Ensemble forecasting
Each single ML model (ANN, GBM and MaxEnt) was applied with 100 different test and training combinations (Fig. 7). Every susceptibility map generated in this phase is characterized by small variations and with different values of errors and evaluation scores, subsequently reduced in the EM phase. The average AUC score of the 100 iterations for each model is resumed in Table 1, where it is evident that the best result is obtained by GBM and MaxEnt (0.84 and 0.83, respectively), while ANN result is slightly lower (0.74). Therefore, the three models showed fair results, according to Swets (1988) charts.
Thus, EM was performed by using four different Ensemble techniques: mean of probabilities (PM), median of probabilities (PME), committee averaging (CA) and weighted mean of probabilities (PMW).
The first landslide susceptibility map, shown in Fig. 8a, has been generated through the mean of probabilities (PM). Such map shows a probability of occurrence with a minimum and maximum of 5 and 93%, respectively. It is worth to underline that in all maps, susceptibility was showed using continuous data, without any division in classes, as often adopted in Italy in the framework of landslide risk management. The landslide distribution within the range of susceptibility value is reported in Table 2: the highest concentration of landslide can be found between the values 0.8 and 0.9. Moreover, the landslide distribution is characterized by an increasing trend when going from the lowest to the highest value of susceptibility. Conversely, the areal distribution of the susceptibility values shows a decreasing trend, with large parts of the territory characterized by low values of susceptibility and a limited extent for those characterized by higher values. As for the validation metrics, ROC/AUC test gave a value of 0.910, while the TSS test value was 0.658.
The second susceptibility map has been produced with the median of probability (PME) EM (Fig. 8b). In this case, the values are ranging between a minimum and maximum value of 0.02 and 0.97, respectively. As previously observed, the main concentration of landslides is located within the areas with a higher level of susceptibility, with an increasing trend (from the lowest to the highest value of susceptibility) (Table 2). The areal distribution is also characterized by a decreasing trend, with a larger extent of lower-susceptibility areas. ROC/AUC and TSS for the selected zone are 0.901 and 0.637, respectively.
By means of the committee averaging (CA) method, values ranging between 0 and 1 were obtained (Fig. 8c). Moreover, the average landslide distribution trend is slightly different, with a quasi-exponential trend, being more than 60% of landslides located in areas with susceptibility values between 0.9 and 1 (Table 2). The areal extent of susceptibility pixels is different as well: larger areas are both distributed in the lower and higher levels of susceptibility. Both aforementioned aspects are nevertheless conditioned by the CA method, which is based on the calculation of binary average values, identified by a cut-off value, to establish the susceptibility. It is indeed a method which tends to overestimate areas with higher susceptibility values, but which can provide an excellent instrument for local administrators and decision-makers since it could represent an application of a “precautionary approach”. ROC/AUC and TSS scores with the CA method are 0.89 and 0.66, respectively.
Finally, the weighted mean of probabilities method (PMW) has been implemented (Fig. 8d). In this case, minimum and maximum values are 0.05 and 0.93, respectively. As in the previous cases, most of the landslides are distributed over the more susceptible areas, as well as the areal distribution presents a decreasing trend, with lower susceptibility values distributed over larger areas (Table 2). ROC/AUC and TSS values are similar to the previous scores, in this case, 0.91 and 0.658, respectively.
To summarize, the use of EM has improved the evaluation scores and therefore the model reliability. The obtained results are also comparable to each other, being all characterized by excellent values of AUC and a good score of TSS (Table 1).
Another significant outcome was obtained, by applying biomod2 library, which allows the assessment of the variable importance for each model. Through this peculiar function, landslide dynamics in the study area and the role of representative PFs can be assessed. Tables 3 and 4 show every variable’s importance for each applied SDM. The first table represents the average importance value of all the environmental variables for each stand-alone model. The sum of all predictors, to any model, is not equal to 1; this is because, as explained in the “Modelling methods” section, every variable must be considered individually. Aspect and slope variables exhibited a higher score respect to the others in all models. The other PFs showed moderate levels of importance, such as land use and terraces state of activity. Table 4 exhibits the important value for each variable in all ensemble models. In this case, five PFs (i.e. “slope angle”, “slope aspect”, “planar curvature”, “terraces” and “land use”) show significant values in all the ensemble models.
Uncertainties and similarities of the models
To compare the results obtained from all the performed EMs and to assess their uncertainty, the coefficient of variation (CV) has been computed. CV of probabilities (i.e. standard deviation/mean of each pixel) is a measure of uncertainty of the ensemble models, and it assumes relevant importance when there is a great availability of data. If CV exhibits a high evaluation score, the uncertainty of a given data is high, while the lower the score, the better are the models’ outputs (Fig. 9). As can be seen from Fig. 9, the central and NE areas of the Monterosso al Mare catchment showed lower CV values (light areas), while the other portions of the basin showed high CV values (dark areas). This is an expected pattern, taking into account that the ensemble models highlighted analogous values of susceptibility in the central and NE study areas. However, areas with high CV values can be identified where mainly low susceptibility values are located. In general, the CV map gives a more powerful basis to estimate landslide-susceptible areas of Monterosso al Mare basin along with ensemble maps. However, the CV map could not take into account uncertainty coming from predisposing factors; thus, uncertainties from variables were reduced by using the same input variables for each SDM.
In this study, a matrix-based approach was used to analyse the relationship between the ensemble models and the uncertainty map (i.e. CV), to generate a “confidence” map (Kim et al. 2018); in this way, the reliability of ensemble models has been assessed. The susceptibility values derived from EM were subdivided into five groups (1–5), while the CV map was categorized into six classes (1–6); in both maps, the Natural Breaks method was applied (Jenks 1967). The matrix was generated across the table to recognize the efficacy of the ensemble map in terms of reducing the uncertainty from the various SDMs. The value of 51 in the matrix table highlights pixels characterized by high landslide likelihood with low variability. Conversely, a value of 16 means that a pixel has a high uncertainty with a low proneness to collapse. The maps in Fig. 10 presents areas with two different typologies, namely, zones with a high probability of landslide occurrence and low uncertainty (i.e. 41, 51) and areas with pixels with low susceptibility and low uncertainty (i.e. 12, 22). Areas with high landslide susceptibility and high uncertainty (56, 55) were hard to recognize.
However, small areas with low probabilities and high uncertainty (i.e. 15, 16) can also be identified. The graphs represented in Fig. 11 confirmed the statistical analyses performed on these maps.
As can be observed from the graph in Fig. 11a, most of the landslides (about 75%), in all ensemble models, fall into classes 51 and 41 (i.e. high susceptibility and low uncertainty), confirming how EM correctly identified potential triggering areas, with a low uncertainty value. In addition, landslides falling into low-susceptibility classes are about 3%. The second graph (Fig. 11b) confirms that the largest areas are those with low susceptibility, although the degree of uncertainty varies from low to moderate (classes 13, 14, 15). Nevertheless, pixels with higher susceptibility values are mainly distributed (about 90%) in correspondence of the classes 41 and 51. Therefore, highly susceptible areas are not particularly extended, contrary to those with low susceptibility, and assume low values of uncertainty.
A further comparison between the ensemble susceptibility maps was produced in terms of the spatial distribution of the susceptibility values. Indeed, the spatial similarity between the different models was analysed (Fig. 12). As previously reported, the ensemble maps have been divided into five classes (1–5). Subsequently, an intersection operation between the different maps has been performed. It should be stated that the CA map was not considered in the study since it could have a strong influence on the result, because it tends to overestimate the highest and the lowest susceptibility values. It should be noted that the CA first transforms the probabilities of all ensemble maps into binary values and then it calculates their average. Each pixel of the intersection map represents the different combinations that the individual models can assume. Thus, values labelled as 111 represented pixels in which all models have low susceptibility. Similarly, the pixels labelled as 555 represented areas where all models have high susceptibility. During this operation, intermediate combinations have been also obtained as output (e.g. 121), in which the first and third models have a low susceptibility while the second one has a higher susceptibility.
As in the previous analyses, an intersection operation between the landslides and the output map was performed, to acquire further information. In Table 5, the number of landslides in each specific class and the relative percentage are represented. About 84% of the total landslides fall within “coherent” susceptibility values (e.g. 555, 444, etc.), while the remaining 16% in “non-coherent” classes (Fig. 13a; Table 5). Furthermore, 55% of the total landslides fall into the 555 class, in which all the models are characterized by very high susceptibility. Lastly, about 80% of the landslides fall into the classes with moderate to very high susceptibility. Figure 13b represents the obtained areal extension for each classes. Coherent classes represent 83% of the total study area, where the main part (42%) is characterized by the lowest susceptibility level (class 111), while about 15% is represented by the non-coherent classes (Table 5).
Discussion and conclusions
Landslide susceptibility mapping represents an essential tool in terms of landslide risk mitigation, especially if carried out in a rigorous way and supported by an accurate dataset. To this, statistic-probabilistic methods represent an ideal synthesis, especially if applied to large areas and when the interaction between geological, geomorphological, hydrological and anthropic PFs is extremely high. However, one of the constraints of statistical methods is represented by the uncertainty associated with every process. Even if many methodologies with the aim of assessing the reliability and evaluating the goodness of statistical methods (such as ROC/AUC, etc.) could be implemented, a percentage of unpredictability is always strictly linked to statistical modelling. One of the main objectives of this work is, therefore, the minimization of such effects, by using different ML methods.
The rationale behind using and merging several models is that two or more models may have very similar predictive performance even when they contain different environmental predictors and/or they yield vastly different spatial predictions, making difficult to know which of the equivalent candidate models to use (Guisan et al. 2017). Furthermore, the “best” model may not be automatically the best one for predictions in a different area or under new conditions (Randin et al. 2006). Combination of several models demonstrates that they produce robust and more stable outputs than single models, which is called “ensemble forecasting” (Araújo and New 2007). The ensemble modelling draws both the main trend (i.e. mean, median, weighted mean) and the general variation (and thus uncertainty) through all models.
In this work, the EM multiple modelling has been carried out 100 times, every time with a different combination of training and test data, for each of the three stand-alone selected methods; later on, a first data calibration has been achieved. The different EMs, obtained by averaging the result of every model, was useful to obtain a summary of all data and therefore to get available information with a strong uncertainty minimization. Ensemble results seem therefore to be encouraging and improved with respect to stand-alone methods, showing ROC/AUC values higher than 0.9, hence excellent according to the various classifications available in the literature (Fressard et al. 2014). Additionally, also TSS values are good (higher than 0.6). As to confirm, landslide distribution in the Monterosso al Mare catchments is mainly concentrated in the areas characterized by high susceptibility. A further refinement of EM reliability has been achieved comparing the ensemble susceptibility maps with the uncertainty value, computed through the CV. Susceptible areas with low uncertainty value provide high-quality outputs, which may be especially useful for decision-makers, willing to obtain trustworthy information. Likewise, non-susceptible areas with low uncertainty value may represent areas where new social and economic activities may potentially be addressed. Moreover, the spatial comparison done through the intersection of the different outcomes of each EM shows a general consistency, with a very high coincidence (about 84%) of susceptible areas computed in different ways, and a general presence of inventoried landslides almost exclusively in high susceptible coherent areas. By using the ANOVA test, no significant differences between the three different ensembling methods have been found (F = 0.244, p = 0.783) and confirmed by the general similarity of AUC of each different ensemble model applied (ranging between 0.9 and 0.91).
The environmental variables chosen as PFs have been selected according to the main characteristics of the territory. Such an assumption is fundamental because many aspects presented here are strictly related to peculiar slope processes occurring within the Cinque Terre area. Indeed, the selection of PFs influencing the landslide triggering is not based on universally recognized guidelines, but it was determined according to the type of landslides and the slope dynamics of Cinque Terre area. By analysing the score derived from the stand-alone and ensemble susceptibility computation, it is possible to observe that geomorphological variables, such as slope and aspect, are very significant and influences the debris cover deposition, the hydrological circulation, evapotranspiration processes and thus erosion, land use and slope dynamics. Among the other variables, an important role is played by those strongly influenced by human activities, such as the presence of terraces and their state of activity and land use. The influence of these factors has been already highlighted in previous works (Cevasco et al. 2013, 2014; Brandolini et al. 2018; Pepe et al. 2019), revealing that terraces abandoned for a short time showed the highest landslide susceptibility. Moreover, slope failures affecting cultivated zones were characterized by a lower magnitude than those occurring on abandoned terraced slopes. Furthermore, the role of vegetation may also be positive, both increasing the evapotranspiration potential and controlling the erosional processes. Besides, the constant maintenance of dry-stone walls, and thus terraces conservation, is a key aspect to preserve the function of dissipating the pore water pressure excess generated behind the wall and subsequently protect and conserve a unique man-made landscape (Camera et al. 2012, 2014). However, a further refinement of input data, such as constant update and new-data implementation, is advisable to obtain susceptibility models which may consider all potential factors leading to instability.
As a final consideration on the methodology applied in this work, further improvements and latest knowledge acquisition about ML algorithms for landslide susceptibility mapping are necessary, to obtain more accurate and less uncertain estimations. A further improvement toward a more refined and precise forecasting of landslide occurrence, both in time and space, can be achieved with the integration of landslide triggering factors such as rainfalls (Segoni et al. 2018), which can support the national civil protection agency in setting-up early warning systems. Moreover, a field cross-validation of LSM is always recommendable to have as most reliable as possible maps, especially when such products are dedicated to public administrations and stakeholders.
Indeed, such tools may be applied in a more suitable way once further and more complete predisposing factors will be identified on the whole Cinque Terre area. These tools will represent potentially a fundamental tool for the safety of people living in the area and for many tourists that every year visit the Cinque Terre National Park. It is worth to underline that in an extremely dynamic and variable environment, such an advanced instrument will be suitable for the management of agricultural terraces, since many cooperatives and wine-makers have started a land recovery during these last years. For all these reasons, an update of environmental variables and therefore of the susceptibility assessment is necessary to guarantee informed land management policies.
References
A.R.P.A.L.-C.F.M.I.-P.C. (Agenzia Regionale per la Protezione dell'Ambiente Ligure – Centro Funzionale Meteoidrologico di Protezione Civile della Regione Liguria) (2012) Uno tsunami venuto daimonti - Provincia della Spezia 25 ottobre 2011 rapporto di evento meteo-idrologico (In Italian). Quaderni ARPAL 1. Redazione Ed Genova
Abbate E (1969) Geologia delle Cinque Terre e dell’entroterra di Levanto (Liguria Orientale). Memorie della società geologica Italiana, vol 8. Arti Grafiche Pacini Mariotti, Pisa
Abedini M, Ghasemian B, Shirzadi A, Shahabi H, Chapi K, Pham BT, Bin Ahmad B, Tien Bui D (2018) A novel hybrid approach of Bayesian logistic regression and its ensembles for landslide susceptibility assessment. Geocarto Int 34:1427–1457. https://doi.org/10.1080/10106049.2018.1499820
Agnoletti M, Errico A, Santoro A, Dani A, Preti F (2019) Terraced landscapes and hydrogeological risk. Effects of land abandonment in Cinque Terre (Italy) during severe rainfall events. Sustainability 11:235
Akgün A, Türk N (2011) Mapping erosion susceptibility by a multivariate statistical method: a case study from the Ayvalık region, NW Turkey. Comput Geosci 37:1515–1524. https://doi.org/10.1016/j.cageo.2010.09.006
Allouche O, Tsoar A, Kadmon R (2006) Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). J Appl Ecol 43:1223–1232. https://doi.org/10.1111/j.1365-2664.2006.01214.x
Althuwaynee OF, Pradhan B, Park H-J, Lee JH (2014) A novel ensemble bivariate statistical evidential belief function with knowledge-based analytical hierarchy process and multivariate statistical logistic regression for landslide susceptibility mapping. CATENA 114:21–36. https://doi.org/10.1016/j.catena.2013.10.011
Araujo MB, Pearson RG, Thuiller W, Erhard M (2005) Validation of species-climate impact models under climate change. Global Change Biology 11(9):1504–1513
Araújo MB, New M (2007) Ensemble forecasting of species distributions. Trends Ecol Evol 22:42–47. https://doi.org/10.1016/j.tree.2006.09.010
Barbet-Massin M, Jiguet F, Albert CH, Thuiller W (2012) Selecting pseudo-absences for species distribution models: how, where and how many? Methods Ecol Evol 3:327–338
Beven KJ, Kirkby MJ (1979) A physically based, variable contributing area model of basin hydrology / un modèle à base physique de zone d’appel variable de l’hydrologie du bassin versant. Hydrol Sci Bull 24:43–69. https://doi.org/10.1080/02626667909491834
Brabb EE (1984) Innovative approaches to landslide hazard and risk mapping. In International Landslide Symposium Proceedings Toronto Canada. vol 1, pp 17-22
Brandolini P (2017) The outstanding terraced landscape of the Cinque Terre coastal slopes (eastern Liguria). In: Landscapes and landforms of Italy. Springer, Cham, pp 235–244
Brandolini P, Cevasco A, Capolongo D, Pepe G, Lovergine F, Del Monte M (2018) Response of terraced slopes to a very intense rainfall event and relationships with land abandonment: a case study from Cinque Terre (Italy). Land Degrad Dev 29:630–642. https://doi.org/10.1002/ldr.2672
Bueechi E, Klimeš J, Frey H, Huggel C, Strozzi T, Cochachin A (2019) Regional-scale landslide susceptibility modelling in the cordillera Blanca, Peru—a comparison of different approaches. Landslides 16:395–407. https://doi.org/10.1007/s10346-018-1090-1
Burnham KP, Anderson DR (2002) Model selection and inference: a practical information-theoretic approach, 2nd edn. Springer-Verlag, New York. https://doi.org/10.1007/b97636
Camera C, Masetti M, Apuani T (2012) Rainfall, infiltration, and groundwater flow in a terraced slope of Valtellina (northern Italy): field data and modelling. Environ Earth Sci 65:1191–1202. https://doi.org/10.1007/s12665-011-1367-3
Camera CAS, Apuani T, Masetti M (2014) Mechanisms of failure on terraced slopes: the Valtellina case (northern Italy). Landslides 11:43–54. https://doi.org/10.1007/s10346-012-0371-3
Canuti P, Casagli N, Ermini L, Fanti R, Farina P (2004) Landslide activity as a geoindicator in Italy: significance and new perspectives from remote sensing. Environ Geol 45:907–919. https://doi.org/10.1007/s00254-003-0952-5
Carmignani L, Decandia FA, Fantozzi PL, Lazzarotto A, Liotta D, Meccheri M (1994) Tertiary extensional tectonics in Tuscany (northern Apennines, Italy). Tectonophysics 238:295–315. https://doi.org/10.1016/0040-1951(94)90061-2
Carotenuto F, Angrisani AC, Bakthiari A, Carratù MT, Di Martire D, Finicelli GF, Raia P, Calcaterra D (2017) A new statistical approach for landslide susceptibility assessment in the urban area of Napoli (Italy). In: Workshop on World Landslide Forum. Springer, Cham, pp 881–889
Catani F, Lagomarsino D, Segoni S, Tofani V (2013) Landslide susceptibility estimation by random forests technique: sensitivity and scaling issues. Nat Hazards Earth Syst Sci 13:2815–2831. https://doi.org/10.5194/nhess-13-2815-2013
Cantarino I, Carrion MA, Goerlich F, Ibañez VM (2019) A ROC analysis-based classification method for landslide susceptibility maps. Landslides 16(2):265–282
Cervi F, Berti M, Borgatti L, Ronchetti F, Manenti F, Corsini A (2010) Comparing predictive capability of statistical and deterministic methods for landslide susceptibility mapping: a case study in the northern Apennines (Reggio Emilia Province, Italy). Landslides 7:433–444. https://doi.org/10.1007/s10346-010-0207-y
Cevasco A, Pepe G, Brandolini P (2012) Shallow landslides induced by heavy rainfall on terraced slopes: the case study of the October, 25, 2011 event in the Vernazza catchment (Cinque Terre, NW Italy). Rend Online Soc Geol Ital 21:384–386
Cevasco A, Brandolini P, Scopesi C, Rellini I (2013) Relationships between geo-hydrological processes induced by heavy rainfall and land-use: the case of 25 October 2011 in the Vernazza catchment (Cinque Terre, NW Italy). J Maps 9(2):289–298
Cevasco A, Pepe G, Brandolini P (2014) The influences of geological and land use settings on shallow landslides triggered by an intense rainfall event in a coastal terraced environment. Bull Eng Geol Environ 73:859–875. https://doi.org/10.1007/s10064-013-0544-x
Cevasco A, Diodato N, Revellino P, Fiorillo F, Grelle G, Guadagno FM (2015) Storminess and geo-hydrological events affecting small coastal basins in a terraced Mediterranean environment. Sci Total Environ 532:208–219
Chauhan S, Sharma M, Arora MK (2010) Landslide susceptibility zonation of the Chamoli region, Garhwal Himalayas, using logistic regression model. Landslides 7:411–423. https://doi.org/10.1007/s10346-010-0202-3
Chen W, Xie X, Wang J, Pradhan B, Hong H, Bui DT, Duan Z, Ma J (2017a) A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. CATENA 151:147–160. https://doi.org/10.1016/j.catena.2016.11.032
Chen W, Shirzadi A, Shahabi H, Ahmad BB, Zhang S, Hong H, Zhang N (2017b) A novel hybrid artificial intelligence approach based on the rotation forest ensemble and naïve Bayes tree classifiers for a landslide susceptibility assessment in Langao County, China. Geomat Nat Haz Risk 8:1955–1977. https://doi.org/10.1080/19475705.2017.1401560
Chen W, Peng J, Hong H, Shahabi H, Pradhan B, Liu J, Zhu A-X, Pei X, Duan Z (2018) Landslide susceptibility modelling using GIS-based machine learning techniques for Chongren County, Jiangxi Province, China. Sci Total Environ 626:1121–1135. https://doi.org/10.1016/j.scitotenv.2018.01.124
Choi J, Oh H-J, Lee H-J, Lee C, Lee S (2012) Combining landslide susceptibility maps obtained from frequency ratio, logistic regression, and artificial neural network models using ASTER images and GIS. Eng Geol 124:12–23. https://doi.org/10.1016/j.enggeo.2011.09.011
Ciurleo M, Cascini L, Calvello M (2017) A comparison of statistical and deterministic methods for shallow landslide susceptibility zoning in clayey soils. Eng Geol 223:71–81. https://doi.org/10.1016/j.enggeo.2017.04.023
Coetzee BWT, Robertson MP, Erasmus BFN, van Rensburg BJ, Thuiller W (2009) Ensemble models predict important bird areas in southern Africa will become less effective for conserving endemic birds under climate change. Glob Ecol Biogeogr 18:701–710. https://doi.org/10.1111/j.1466-8238.2009.00485.x
Constantin M, Bednarik M, Jurchescu MC, Vlaicu M (2011) Landslide susceptibility assessment using the bivariate statistical analysis and the index of entropy in the Sibiciu Basin (Romania). Environ Earth Sci 63:397–406. https://doi.org/10.1007/s12665-010-0724-y
Convertino M, Troccoli A, Catani F (2013) Detecting fingerprints of landslide drivers: a MaxEnt model. J Geophys Res Earth Surf 118:1367–1386. https://doi.org/10.1002/jgrf.20099
Corominas J, Van Westen C, Frattini P, Cascini L, Malet JP, Fotopolou S, Catani F, Van Den Eeckhaut M, Mavrouli O, Agliardi F (2014) Recommendations for the quantitative analysis of landslide risk. Bull Eng Geol Environ 73:209–263. https://doi.org/10.1007/s10064-013-0538-8
Cruden D M, Varnes D J (1996) Landslides: investigation and mitigation. Chapter 3-Landslide types and processes. Transportation research board special report (247)
D’Amato Avanzi G, Galanti Y, Giannecchini R, Mazzali A, Saulle G (2013) Remarks on the 25 October 2011 rainstorm in eastern Liguria and northwestern Tuscany (Italy) and the related landslides. Rend Online Soc Geol Ital 24:76–78
D’Onofrio R, Trusiani E (2018) Strategies and actions to recover the landscape after flooding: the case of Vernazza in the Cinque Terre National Park (Italy). Sustainability 10(3):742
Dou J, Yunus AP, Bui DT, Merghadi A, Sahana M, Zhu Z, Chen C-W, Han Z, Pham BT (2019) Improved landslide assessment using support vector machine with bagging, boosting, and stacking ensemble machine learning framework in a mountainous watershed, Japan. Landslides 12:T05005:641–658. https://doi.org/10.1007/s10346-019-01286-5
Elith J, Graham CH, Anderson RP, Dudík M, Ferrier S, Guisan A, Hijmans RJ, Huettmann F, Leathwick JR, Lehmann A, Li J, Lohmann LG, Loiselle BA, Manion G, Moritz C, Nakamura M, Nakazawa Y, Overton J, McC M, Townsend Peterson A, Phillips SJ, Richardson K, Scachetti-Pereira R, Schapire RE, Soberón J, Williams S, Wisz MS, Zimmermann NE (2006) Novel methods improve prediction of species’ distributions from occurrence data. Ecography 29(2):129–151
Elith J, Phillips SJ, Hastie T, Dudík M, Chee YE, Yates CJ (2011) A statistical explanation of MaxEnt for ecologists. Diversity and distributions 17(1):43–57
Ermini L, Catani F, Casagli N (2005) Artificial neural networks applied to landslide susceptibility assessment. Geomorphology 66:327–343. https://doi.org/10.1016/j.geomorph.2004.09.025
Felicísimo ÁM, Cuartero A, Remondo J, Quirós E (2013) Mapping landslide susceptibility with logistic regression, multiple adaptive regression splines, classification and regression trees, and maximum entropy methods: a comparative study. Landslides 10:175–189. https://doi.org/10.1007/s10346-012-0320-1
Fisher RA (2006) Statistical methods for research workers. Genesis Publishing Pvt Ltd.
Formetta G, Capparelli G, Versace P (2016) Evaluating performance of simplified physically based models for shallow landslide susceptibility. Hydrol Earth Syst Sci 20:4585–4603. https://doi.org/10.5194/hess-20-4585-2016
Fressard M, Thiery Y, Maquaire O (2014) Which data for quantitative landslide susceptibility mapping at operational scale? Case study of the Pays d’Auge plateau hillslopes (Normandy, France). Nat Hazards Earth Syst Sci 14:569–588. https://doi.org/10.5194/nhess-14-569-2014
Froude MJ, Petley D (2018) Global fatal landslide occurrence from 2004 to 2016. Nat Hazards Earth Syst Sci 18:2161–2181
Galve JP, Cevasco A, Brandolini P, Soldati M (2015) Assessment of shallow landslide risk mitigation measures based on land use planning through probabilistic modelling. Landslides 12:101–114. https://doi.org/10.1007/s10346-014-0478-9
Gigli G, Morelli S, Fornera S, Casagli N (2014) Terrestrial laser scanner and geomechanical surveys for the rapid evaluation of rock fall susceptibility scenarios. Landslides 11(1):1–14
Godt JW, Schulz WH, Baum RL, Savage WZ (2008) Modeling rainfall conditions for shallow landsliding in Seattle, Washington. Rev Eng Geol 20:137–152
Goetz JN, Brenning A, Petschko H, Leopold P (2015) Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput Geosci 81:1–11. https://doi.org/10.1016/j.cageo.2015.04.007
Gorsevski PV, Brown MK, Panter K, Onasch CM, Simic A, Snyder J (2016) Landslide detection and susceptibility mapping using LiDAR and an artificial neural network approach: a case study in the Cuyahoga Valley National Park, Ohio. Landslides 13:467–484. https://doi.org/10.1007/s10346-015-0587-0
Guisan A, Thuiller W, Zimmermann NE (2017) Habitat suitability and distribution models: with applications in R. Ecology, biodiversity and conservation. Cambridge University Press, Cambridge
Guzzetti F, Reichenbach P, Ardizzone F, Cardinali M, Galli M (2006) Estimating the quality of landslide susceptibility models. Geomorphology 81:166–184. https://doi.org/10.1016/j.geomorph.2006.04.007
Hengl T, Heuvelink GBM, Rossiter DG (2007) About regression-kriging: from equations to case studies. Comput Geosci 33:1301–1315. https://doi.org/10.1016/j.cageo.2007.05.001
Hengl T, Heuvelink GBM, Kempen B, Leenaars JGB, Walsh MG, Shepherd KD, Sila A, MacMillan RA, Mendes de Jesus J, Tamene L, Tondoh JE (2015) Mapping soil properties of Africa at 250 m resolution: random forests significantly improve current predictions. PLoS One 10:e0125814. https://doi.org/10.1371/journal.pone.0125814
Hengl T, Mendes de Jesus J, Heuvelink GBM, Ruiperez Gonzalez M, Kilibarda M, Blagotić A, Shangguan W, Wright MN, Geng X, Bauer-Marschallinger B, Guevara MA, Vargas R, MacMillan RA, Batjes NH, Leenaars JGB, Ribeiro E, Wheeler I, Mantel S, Kempen B (2017) SoilGrids250m: global gridded soil information based on machine learning. PLoS One 12:e0169748. https://doi.org/10.1371/journal.pone.0169748
Huang F, Zhang J, Zhou C, Wang Y, Huang J, Zhu L (2019) A deep learning algorithm using a fully connected sparse autoencoder neural network for landslide susceptibility prediction. Landslides 12:1077–1229. https://doi.org/10.1007/s10346-019-01274-9
Hungr O, Evans SG, Hutchinson IN (2001) A review of the classification of landslides of the flow type. Environ Eng Geosci 7(3):221–238
Hungr O, Leroueil S, Picarelli L (2014) The Varnes classification of landslide types, an update. Landslides 11(2):167–194
Ilia I, Tsangaratos P (2016) Applying weight of evidence method and sensitivity analysis to produce a landslide susceptibility map. Landslides 13:379–397. https://doi.org/10.1007/s10346-015-0576-3
Jelínek R, Wagner P (2007) Landslide hazard zonation by deterministic analysis (Veľká Čausa landslide area, Slovakia). Landslides 4:339–350. https://doi.org/10.1007/s10346-007-0089-9
Jenks GF (1967) The data model concept in statistical mapping. Int. Yearb. Cartography 7:186–190
Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349:255–260. https://doi.org/10.1126/science.aaa8415
Kavoura K, Sabatakakis N (2019) Investigating landslide susceptibility procedures in Greece. Landslides 76:237–145. https://doi.org/10.1007/s10346-019-01271-y
Kavzoglu T, Sahin EK, Colkesen I (2014) Landslide susceptibility mapping using GIS-based multi-criteria decision analysis, support vector machines, and logistic regression. Landslides 11(3):425–439
Kim HG, Lee DK, Park C, Ahn Y, Kil S-H, Sung S, Biging GS (2018) Estimating landslide susceptibility areas considering the uncertainty inherent in modeling methods. Stoch Env Res Risk A 32:2987–3019. https://doi.org/10.1007/s00477-018-1609-y
Kornejady A, Ownegh M, Bahremand A (2017) Landslide susceptibility assessment using maximum entropy model with two different data sampling methods. CATENA 152:144–162. https://doi.org/10.1016/j.catena.2017.01.010
Krkač M, Špoljarić D, Bernat S, Arbanas SM (2017) Method for prediction of landslide movements based on random forests. Landslides 14:947–960. https://doi.org/10.1007/s10346-016-0761-z
Lagomarsino D, Tofani V, Segoni S, Catani F, Casagli N (2017) A tool for classification and regression using random forests methodology: applications to landslide susceptibility mapping and soil thickness modelling. Environ Model Assess
Lasanta-Martínez T, Vicente-Serrano SM, Cuadrat-Prats JM (2005) Mountain Mediterranean landscape evolution caused by the abandonment of traditional primary activities: a study of the Spanish Central Pyrenees. Appl Geogr 25:47–65. https://doi.org/10.1016/j.apgeog.2004.11.001
Lee S, Ryu J-H, Won J-S, Park H-J (2004) Determination and application of the weights for landslide susceptibility mapping using an artificial neural network. Eng Geol 71:289–302. https://doi.org/10.1016/S0013-7952(03)00142-X
Lesschen JP, Cammeraat LH, Nieman T (2008) Erosion and terrace failure due to agricultural land abandonment in a semi-arid environment. Earth Surf Process Landf 33:1574–1584. https://doi.org/10.1002/esp.1676
Liu L, Li S, Li X, Jiang Y, Wei W, Wang Z, Bai Y (2019) An integrated approach for landslide susceptibility mapping by considering spatial correlation and fractal distribution of clustered landslide data. Landslides 16(4):715–728
Meng Q, Miao F, Zhen J, Wang X, Wang A, Peng Y, Fan Q (2016) GIS-based landslide susceptibility mapping with logistic regression, analytical hierarchy process, and combined fuzzy and support vector machine methods: a case study from Wolong Giant Panda Natural Reserve, China. Bull Eng Geol Environ 75(3):923–944
Moore ID, Wilson JP (1992) Length-slope factors for the revised universal soil loss equation: simplified method of estimation. J Soil Water Conserv 47(5):423–428
Moreno-de-las-Heras M, Lindenberger F, Latron J, Lana-Renault N, Llorens P, Arnáez J, Romero-Díaz A, Gallart F (2019) Hydro-geomorphological consequences of the abandonment of agricultural terraces in the Mediterranean region: key controlling factors and landscape stability patterns. Geomorphology 333:73–91. https://doi.org/10.1016/j.geomorph.2019.02.014
Mousavi SZ, Kavian A, Soleimani K, Mousavi SR, Shirzadi A (2011) GIS-based spatial prediction of landslide susceptibility using logistic regression model. Geomatics, Natural Hazards Risk 2(1):33–50
Natekin A, Knoll A (2013) Gradient boosting machines, a tutorial. Front Neurorobot 7:21. https://doi.org/10.3389/fnbot.2013.00021
Neuhäuser B, Damm B, Terhorst B (2012) GIS-based assessment of landslide susceptibility on the base of the weights-of-evidence model. Landslides 9:511–528. https://doi.org/10.1007/s10346-011-0305-5
Park HJ, Jang JY, Lee JH (2019) Assessment of rainfall-induced landslide susceptibility at the regional scale using a physically based model and fuzzy-based Monte Carlo simulation. Landslides 16:695–713. https://doi.org/10.1007/s10346-018-01125-z
Peh KK, Lim CP, Quek SS, Khoh KH (2000) Use of artificial neural networks to predict drug dissolution profiles and evaluation of network performance using similarity factor. Pharm Res 17:1384–1388. https://doi.org/10.1023/a:1007578321803
Pepe G, Mandarino A, Raso E, Scarpellini P, Brandolini P, Cevasco A (2019) Investigation on farmland abandonment of terraced slopes using multitemporal data sources comparison and its implication on hydro-geomorphological processes. Water 11(8):1552. https://doi.org/10.3390/w11081552
Pham BT, Tien Bui D, Prakash I (2017) Landslide susceptibility assessment using bagging ensemble based alternating decision trees, logistic regression and J48 decision trees methods: a comparative study. Geotech Geol Eng 35:2597–2611. https://doi.org/10.1007/s10706-017-0264-2
Pham BT, Tien Bui D, Prakash I (2018) Bagging based support vector machines for spatial prediction of landslides. Environ Earth Sci 77:47. https://doi.org/10.1007/s12665-018-7268-y
Phillips SJ, Dudík M (2008) Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation. Ecography
Pijanowski BC, Brown DG, Shellito BA, Manik GA (2002) Using neural networks and GIS to forecast land use changes: a land transformation model. Comput Environ Urban Syst 26:553–575. https://doi.org/10.1016/S0198-9715(01)00015-1
Pradhan B, Lee S (2010) Regional landslide susceptibility analysis using back-propagation neural network model at Cameron Highland, Malaysia. Landslides 7:13–30. https://doi.org/10.1007/s10346-009-0183-2
R Core (2018) Team. R: a language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing. Available from: https://www.R-project.org/
Randin CF, Dirnböck T, Dullinger S, Zimmermann NE, Zappa M, Guisan A (2006) Are niche-based species distribution models transferable in space? J Biogeogr 33:1689–1703. https://doi.org/10.1111/j.1365-2699.2006.01466.x
Raso E, Mandarino A, Pepe G, Di Martire D, Cevasco A, Calcaterra D, Firpo M (2019a) Landslide inventory of the Cinque Terre National Park, Italy. In IAEG/AEG Annual Meeting Proceedings, San Francisco California 2018-Volume 1 (pp. 201-205). Springer, Cham
Raso E, Cevasco A, Di Martire D, Pepe G, Scarpellini P, Calcaterra D, Firpo M (2019b) Landslide-inventory of the Cinque Terre National Park (Italy) and quantitative interaction with the trail network. J Maps 15:818–830. https://doi.org/10.1080/17445647.2019.1657511
Raso E, Di Martire D, Cevasco A, Calcaterra D, Scarpellini P, Firpo M (2019c) Evaluation of prediction capability of the MaxEnt and Frequency Ratio methods for landslide susceptibility in the Vernazza catchment (Cinque Terre, Italy). Springer book “Applied Geology: Approaches to Future Resource Management” (in press)
Reichenbach P, Rossi M, Malamud BD, Mihir M, Guzzetti F (2018) A review of statistically-based landslide susceptibility models. Earth Sci Rev 180:60–91. https://doi.org/10.1016/j.earscirev.2018.03.001
Sadr MP, Maghsoudi A, Saljoughi BS (2014) Landslide susceptibility mapping of Komroud sub-basin using fuzzy logic approach. Geodynamics Res Int Bull 2(2):XVI–XXVIII
Schapire RE (1990) The strength of weak learnability. Mach Learn 5(2):197–227
Schilirò L, Montrasio L, Scarascia Mugnozza G (2016) Prediction of shallow landslide occurrence: validation of a physically-based approach through a real case study. Sci Total Environ 569-570:134–144. https://doi.org/10.1016/j.scitotenv.2016.06.124
Schilirò L, Cevasco A, Esposito C, Mugnozza GS (2018) Shallow landslide initiation on terraced slopes: inferences from a physically based approach. Geomatics, Natural Hazards Risk 9:295–324. https://doi.org/10.1080/19475705.2018.1430066
Segoni S, Lagomarsino D, Fanti R, Moretti S, Casagli N (2015) Integration of rainfall thresholds and susceptibility maps in the Emilia Romagna (Italy) regional-scale landslide warning system. Landslides 12:773–785. https://doi.org/10.1007/s10346-014-0502-0
Segoni S, Tofani V, Rosi A, Catani F, Casagli N (2018) Combination of rainfall thresholds and susceptibility maps for dynamic landslide hazard assessment at regional scale. Front Earth Sci 6:85
Segoni S, Pappafico G, Luti T, Catani, F (2020) Landslide susceptibility assessment in complex geological settings: sensitivity to geological information and insights on its parameterization. Landslides 1–11. https://doi.org/10.1007/s10346-019-01340-2
Sepe C, Confuorto P, Angrisani AC, Di Martire D, Di Napoli M, Calcaterra D (2019) Application of statistical approach to landslide susceptibility map generation in urban settings. In Proc IAEG/AEG Annu Meeting Proc. Springer, Cham, San Francisco, vol. 1, pp 155–162
Sharma A, Tiwari KN, Bhadoria PBS (2011) Effect of land use land cover change on soil erosion potential in an agricultural watershed. Environ Monit Assess 173:789–801. https://doi.org/10.1007/s10661-010-1423-6
Swets JA (1988) Measuring the accuracy of diagnostic systems. Science 240:1285–1293. https://doi.org/10.1126/science.3287615
Tarolli P, Preti F, Romano N (2014) Terraced landscapes: from an old best practice to a potential hazard for soil degradation due to land abandonment. Anthropocene 6:10–25. https://doi.org/10.1016/j.ancene.2014.03.002
Teerarungsigul S, Torizin J, Fuchs M, Kühn F, Chonglakmani C (2016) An integrative approach for regional landslide susceptibility assessment using weight of evidence method: a case study of Yom River basin, Phrae Province, northern Thailand. Landslides 13:1151–1165. https://doi.org/10.1007/s10346-015-0659-1
Terranova R (1984) Aspetti geomorfologici e geologico-ambientali delle Cinque Terre: rapporti con le opere umane. Studi e ricerche di Geografia 7:38–90 (In Italian)
Terranova R, Zanzucchi G, Bernini M, Brandolini P, Campobasso S, Faccini F, Zanzucchi F (2006) Geologia, geomorfologia e vini del parco Nazionale delle Cinque Terre (Liguria, Italia). Boll Soc Geol Ital 6:115–128 (In Italian)
Thuiller W, Lafourcade B, Engler R, Araújo MB (2009) BIOMOD - a platform for ensemble forecasting of species distributions. Ecography 32:369–373. https://doi.org/10.1111/j.1600-0587.2008.05742.x
Thuiller W, Georges D, Engler R, Breiner F (2016) biomod2: ensemble platform for species distribution modeling. R Package Version 3:3–7. https://CRAN.R-project.org/package¼biomod2
Trigila A, Iadanza C, Esposito C, Scarascia-Mugnozza G (2015) Comparison of logistic regression and random forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy). Geomorphology 249:119–136. https://doi.org/10.1016/j.geomorph.2015.06.001
Tsangaratos P, Ilia I (2016a) Comparison of a logistic regression and Naïve Bayes classifier in landslide susceptibility assessments: the influence of models complexity and training dataset size. CATENA 145:164–179. https://doi.org/10.1016/j.catena.2016.06.004
Tsangaratos P, Ilia I (2016b) Landslide susceptibility mapping using a modified decision tree classifier in the Xanthi perfection, Greece. Landslides 13:305–320. https://doi.org/10.1007/s10346-015-0565-6
Tsangaratos P, Ilia I, Hong H, Chen W, Xu C (2017) Applying information theory and GIS-based quantitative methods to produce landslide susceptibility maps in Nancheng County, China. Landslides 14:1091–1111. https://doi.org/10.1007/s10346-016-0769-4
Tukey JW (1953) Some selected quick and easy methods of statistical analysis. Trans N Y Acad Sci 16:88–97. https://doi.org/10.1111/j.2164-0947.1953.tb01326.x
Umar Z, Pradhan B, Ahmad A, Jebur MN, Tehrany MS (2014) Earthquake induced landslide susceptibility mapping using an integrated ensemble frequency ratio and logistic regression models in west Sumatera Province, Indonesia. CATENA 118:124–135. https://doi.org/10.1016/j.catena.2014.02.005
Van Westen CJ, Rengers N, Soeters R (2003) Use of geomorphological information in indirect landslide susceptibility assessment. Nat Hazards 30:399–419. https://doi.org/10.1023/B:NHAZ.0000007097.42735.9e
Vorpahl P, Elsenbeer H, Märker M, Schröder B (2012) How can statistical models help to determine driving factors of landslides? Ecol Model 239:27–39. https://doi.org/10.1016/j.ecolmodel.2011.12.007
Wang LJ, Guo M, Sawada K, Lin J, Zhang J (2015) Landslide susceptibility mapping in Mizunami City, Japan: a comparison between logistic regression, bivariate statistical analysis and multivariate adaptive regression spline models. Catena 135:271–282
Wilson JP, Gallant JC (2000) Terrain analysis: principles and applications. Wiley, Hoboken
Xiao T, Segoni S, Chen L, Yin K, Casagli N (2019) A step beyond landslide susceptibility maps: a simple method to investigate and explain the different outcomes obtained by different approaches. Landslides 5:853–640. https://doi.org/10.1007/s10346-019-01299-0
Yalcin A, Reis S, Aydinoglu AC, Yomralioglu T (2011) A GIS-based comparative study of frequency ratio, analytical hierarchy process, bivariate statistics and logistics regression methods for landslide susceptibility mapping in Trabzon, NE Turkey. CATENA 85:274–287. https://doi.org/10.1016/j.catena.2011.01.014
Yao X, Tham LG, Dai FC (2008) Landslide susceptibility mapping based on Support Vector Machine: A case study on natural slopes of Hong Kong, China. Geomorphology 101 (4):572–582. https://doi.org/10.1016/j.geomorph.2008.02.011
Yeon Y-K, Han J-G, Ryu KH (2010) Landslide susceptibility mapping in Injae, Korea, using a decision tree. Eng Geol 116:274–283. https://doi.org/10.1016/j.enggeo.2010.09.009
Youssef AM, Al-Kathery M, Pradhan B (2015) Landslide susceptibility mapping at Al-Hasher area, Jizan (Saudi Arabia) using GIS-based frequency ratio and index of entropy models. Geosci J 19(1):113–134
Zurada JM (1992) Introduction to artificial neural systems, vol 8. West publishing company, St. Paul
Acknowledgements
The authors would like to acknowledge the Centro Studi Rischi Geologici of the Cinque Terre National Park. The authors also thank Dr. Alessandro Novellino (British Geological Survey) for revising the text and for providing valuable advice. Finally, the authors thank Consorzio interUniversitario per la prevenzione dei Grandi Rischi (CUGRI) for providing technological support. In addition, the authors would like to thank the two anonymous reviewers for their valuable and insightful comments to improve the paper.
Funding
The Cinque Terre National Park and its Director, Eng. Patrizio Scarpellini, provided funding for this project and for logistic support during field operations.
Author information
Authors and Affiliations
Corresponding author
Electronic supplementary material
ESM 1
(DOCX 29 kb).
Rights and permissions
About this article
Cite this article
Di Napoli, M., Carotenuto, F., Cevasco, A. et al. Machine learning ensemble modelling as a tool to improve landslide susceptibility mapping reliability. Landslides 17, 1897–1914 (2020). https://doi.org/10.1007/s10346-020-01392-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10346-020-01392-9