Abstract
The main purpose of the present study is to evaluate the model validation stage of routine landslide susceptibility mapping. To this end, model validation is assessed in three stages: (1) during model data production, (2) during model construction, and (3) during the production of the model consequences, i.e. the landslide susceptibility maps. The evaluations show that, once the inventory studies are complete, the training and testing data sets should be separated using an appropriate ratio of about 80 % and 20 % of the presence (1) data, respectively. For the training data sets, correct classification percentages, error matrices and the Kappa index are suggested for consideration during model construction. For the testing data sets, the correct classification percentage and the Root Mean Square Error (RMSE) should also be considered during this stage. To evaluate the spatial performance of the produced landslide susceptibility maps, the use of Receiver Operating Characteristic (ROC) curves and the Area Under Curve (AUC) statistic is recommended. In the present study, the maximum Kappa index (k) value at the model construction stage was 0.459, obtained both for random sampling 1 (Rnd1) in Model 1 and for random sampling 2 (Rnd2) in Model 2. The AUC values calculated for these random samplings at the production of model consequences stage were 0.781 and 0.790, respectively.
Introduction
The model validation stage constitutes one of the most important issues in natural hazard modelling studies (Begueria 2006). Accordingly, the validation of landslide susceptibility models, and of the maps derived from them, should be considered one of the most critical topics in landslide susceptibility research. Although validation plays a very important role in modelling, many studies in the literature do not give it the necessary importance and report only some basic validation statistics to assess model performance. In landslide susceptibility mapping studies, different researchers check model performance using different validation statistics, which leads to inconsistencies in performance evaluation and in comparisons of models based on these performances. The main purpose of this study is to present a procedure for assessing the performance of landslide susceptibility mapping. The study proposes a flow chart that evaluates the current validation indices of the models in three stages: (i) the model data production stage, (ii) the model construction stage and (iii) the production of model consequences stage. For this purpose, the landslide susceptibility analyses performed by Dagdelenler (2013) in the eastern part of the Gallipoli Peninsula (Canakkale, Turkey) were evaluated.
Model Validation
As mentioned, the performance evaluations of the models were carried out in the three stages shown in the flow chart in Fig. 1. These stages are described in detail below.
During Model Data Production
In the landslide susceptibility analyses performed by Dagdelenler (2013), a total of 10 variables (7 continuous and 3 categorical) were used as independent variables, and the presence (1) and non-presence (0) data of the mapped landslides were used as the dependent variable. For the models, 20 % of the presence (1) data was set aside as the testing data set and 80 % as the training data set (Nefeslioglu et al. 2011, 2012; Oh and Pradhan 2011; San 2014). This process was carried out three times, yielding the randomly selected model data sets Rnd1, Rnd2 and Rnd3. For each random set, the training set was generated from 80 % of the presence (1) data together with non-presence (0) data equal in number to that 80 % of the presence (1) data. This stage prepares the data for the subsequent performance evaluations.
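The separation described above can be sketched as follows. This is a minimal illustration only, not the procedure used by Dagdelenler (2013): the function name `split_inventory`, the cell identifiers and the seeds are hypothetical, and the way non-presence cells enter the testing set is left open because it is not specified here.

```python
import random


def split_inventory(presence_ids, absence_ids, train_ratio=0.8, seed=0):
    """Return one random training/testing split of the inventory cells."""
    rng = random.Random(seed)

    # Shuffle the presence (1) cells and cut them at the 80 % mark.
    presence = list(presence_ids)
    rng.shuffle(presence)
    n_train = round(train_ratio * len(presence))
    train_presence = presence[:n_train]
    test_presence = presence[n_train:]

    # Draw as many non-presence (0) cells as there are training presence
    # cells, so that the training set is balanced.
    train_absence = rng.sample(list(absence_ids), n_train)

    return {
        "train_presence": train_presence,
        "train_absence": train_absence,
        "test_presence": test_presence,
    }


# Hypothetical cell identifiers: 500 presence cells, 4500 non-presence cells.
presence_cells = range(0, 500)
absence_cells = range(500, 5000)

# Three independent random splits, named after the sets in the text.
random_sets = {
    name: split_inventory(presence_cells, absence_cells, seed=seed)
    for seed, name in enumerate(["Rnd1", "Rnd2", "Rnd3"], start=1)
}
```

Fixing the seed of each split keeps Rnd1, Rnd2 and Rnd3 reproducible, which matters when the validation indices of the three sets are later compared.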
During Model Construction
Dagdelenler (2013) performed the landslide susceptibility analyses by applying the logistic regression technique. From the logistic regression analyses on the training data sets, the correct classification percentages (%), the error matrices, and the validation statistics and Kappa index (k) values derived from the error matrices were determined. The correct classification percentages obtained for the models were found to be acceptable and quite similar, varying between 78 and 79 %.
An error matrix shows the number of correctly and incorrectly estimated observations for the positive and negative cases. In the error matrix in Table 1, each combination of observed and predicted presence (1) and non-presence (0) data is represented by a letter (a, b, c, and d). The validation statistics (Table 2) and the Kappa index (k) values of the models are calculated from the error matrices according to the formulations given by Begueria (2006). The formula of the Kappa index (k) derived from the error matrices is given in Eqs. (1), (2) and (3), where P is the proportion of observations in agreement, Pe is the proportion in agreement due to chance, and k is the Kappa index. According to the Kappa index classification chart proposed by Landis and Koch (1977), the Kappa index value of each model indicates that the model compatibility is moderately good.
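For reference, the Kappa index is commonly written in terms of the error matrix cells as below. This is the standard textbook form of Cohen's Kappa and not necessarily the exact notation of Eqs. (1), (2) and (3); it assumes that a and d are the diagonal (correctly classified) cells of Table 1 and that N = a + b + c + d is the total number of observations.

```latex
\begin{align}
  k   &= \frac{P - P_e}{1 - P_e}, \\
  P   &= \frac{a + d}{N}, \\
  P_e &= \frac{(a + b)(a + c) + (c + d)(b + d)}{N^{2}},
  \qquad N = a + b + c + d .
\end{align}
```

As a quick check under these assumptions, a = d = 40 and b = c = 10 give P = 0.8, Pe = 0.5 and k = 0.6.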
The RMSE performance index and the correct classification percentage (%) values were calculated using the testing data sets, whereas the validation indices derived from the error matrix and the (threshold-dependent) Kappa index values were determined using the training data sets. The correct classification percentages for the first, second and third random sets of the landslide body sampling model were 79.7 %, 80.5 % and 69.8 %, respectively. The correct classification percentages for the landslide susceptibility models that considered different buffer distances (d = 25 m, d = 50 m, d = 75 m and d = 100 m) in the seed cell samplings (Dagdelenler 2013) vary between 76.6 and 88.5 %. The seed cells obtained by the seed cell sampling strategy (Suzen and Doyuran 2004) are assumed to represent the pre-failure conditions of the landslides, for the topographical parameters in particular. The calculated RMSE values for Model 1 (landslide body samplings) are 0.398, 0.395 and 0.453, respectively, and the RMSE values for Model 2 (seed cell samplings) vary between 0.334 and 0.422 across the different buffer distances (d = 25 m, d = 50 m, d = 75 m and d = 100 m).
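The indices named in this stage can be sketched in a few lines of Python. Several choices here are assumptions rather than details taken from the study: the cell labels (a = observed 1 and predicted 1, b = observed 0 and predicted 1, c = observed 1 and predicted 0, d = observed 0 and predicted 0), the 0.5 probability threshold, and the definition of RMSE as the root mean square difference between the observed 0/1 values and the predicted probabilities.

```python
import math


def error_matrix(observed, predicted_prob, threshold=0.5):
    """Count the four cells (a, b, c, d) of the error matrix."""
    a = b = c = d = 0
    for y, p in zip(observed, predicted_prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            a += 1  # presence correctly predicted
        elif y == 0 and predicted == 1:
            b += 1  # non-presence predicted as presence
        elif y == 1 and predicted == 0:
            c += 1  # presence predicted as non-presence
        else:
            d += 1  # non-presence correctly predicted
    return a, b, c, d


def correct_classification_pct(a, b, c, d):
    """Share of correctly classified observations, in percent."""
    return 100.0 * (a + d) / (a + b + c + d)


def kappa_index(a, b, c, d):
    """Cohen's Kappa from the error matrix cells."""
    n = a + b + c + d
    p = (a + d) / n                                        # observed agreement
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2  # chance agreement
    return (p - pe) / (1.0 - pe)


def rmse(observed, predicted_prob):
    """Root mean square error between 0/1 observations and probabilities."""
    squared = [(y - p) ** 2 for y, p in zip(observed, predicted_prob)]
    return math.sqrt(sum(squared) / len(squared))
```

In the routine described here, `error_matrix`, `correct_classification_pct` and `kappa_index` would be applied to the training data sets, and `correct_classification_pct` and `rmse` to the testing data sets.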
During the Production of Model Consequences
In the third stage of the performance evaluation procedure, the resultant landslide susceptibility maps were analysed using ROC curves and the area under the ROC curve (AUC). The ROC curve and the AUC are threshold-independent indices that are determined during the production of model consequences. The AUC is used as a single threshold-independent validation statistic (Begueria 2006). An AUC value close to 1 means that the performance of the model is good (Fawcett 2006). The ROC curves of the models were drawn and the AUC values were determined (Table 3). According to the results, the calculated AUC values for the models were found to be close to 1 and very close to each other (Table 4). These results show that the performances of the models are quite acceptable.
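To make the threshold-independent character of these indices concrete, the sketch below builds the ROC curve by sweeping over every distinct susceptibility value and integrates it with the trapezoidal rule. It is a generic illustration, not the software used in the study; the function names and the sample values are hypothetical, and both classes are assumed to be present in the input.

```python
def roc_curve_points(observed, scores):
    """Return ROC points as (false positive rate, true positive rate)."""
    n_pos = sum(1 for y in observed if y == 1)
    n_neg = len(observed) - n_pos

    # Visit the cells from the highest to the lowest susceptibility value.
    pairs = sorted(zip(scores, observed), key=lambda t: -t[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        current = pairs[i][0]
        # Cells sharing the same value cross the threshold together.
        while i < len(pairs) and pairs[i][0] == current:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical observations and susceptibility values for four cells.
observed = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = auc_trapezoid(roc_curve_points(observed, scores))
```

No classification threshold is fixed anywhere in this calculation, which is why the AUC complements the threshold-dependent Kappa index of the model construction stage.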
Results and Conclusions
The validation indices are evaluated in three stages: during model data production, during model construction, and during the production of model consequences. In addition, a generalized flow chart for the performance evaluation of landslide susceptibility models is proposed. The flow chart makes clear which validation indices are calculated from which data set and at which stage of the modelling. In the recent and extensive landslide susceptibility literature, there is vagueness about the validation of the constructed models. This uncertainty starts at the model data production stage and extends to the production of the model consequences. The common approach to validation is the evaluation of ROC curves for the whole study area (Ayalew and Yamagishi 2005; Mathew et al. 2007; Pradhan 2010). However, the performance evaluation of the model construction stage is commonly ignored, particularly in studies in which bivariate statistics, artificial intelligence and data mining techniques are applied (Saito et al. 2009; Yilmaz 2009; Oh and Pradhan 2011; Akgun et al. 2012; Bui and Pradhan 2012; Conforti et al. 2014). To apply this stage, a pre-processing stage that includes data production for the further evaluations is necessary. In the proposed flow chart, the performance evaluation of the model construction stage is suggested as a separate routine step in model validation for landslide susceptibility analyses. A landslide susceptibility model is commonly expected to provide both a high prediction capacity for the constructed model and a high generalization capacity for the application results over the whole study area (Can et al. 2005). If all probabilities are calculated to be 1 for the whole study area, the spatial performance of the model is found to be 100 %. However, the resultant landslide susceptibility texture is irrational, and it could be assumed that the model has no generalization capacity in this situation. The probabilities will also be calculated to be 1 for all areas without landslides, so that the calculated AUC values will be low. In other words, the model prediction capacity is at its maximum while the generalization capacity is at its minimum. Hence, the validation indices for the model construction stage and for the production of model consequences stage should be evaluated separately. It could therefore be concluded that the methodology proposed in this study enables both the prediction and the generalization capacities to be controlled appropriately in any landslide susceptibility evaluation.
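The degenerate case discussed above can be reproduced with a small numerical sketch. The data are hypothetical (20 landslide cells and 80 landslide-free cells), and the AUC is computed here in its pairwise-ranking form, in which ties count as one half; this is a standard equivalent of the area under the ROC curve, chosen for brevity rather than taken from the study.

```python
def auc_rank(observed, scores):
    """AUC as the share of (presence, non-presence) pairs ranked correctly."""
    pos = [s for y, s in zip(observed, scores) if y == 1]
    neg = [s for y, s in zip(observed, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0  # landslide cell ranked above landslide-free cell
            elif sp == sn:
                wins += 0.5  # tie: the model cannot separate the two cells
    return wins / (len(pos) * len(neg))


def presence_hit_pct(observed, scores, threshold=0.5):
    """Percentage of landslide cells predicted as landslide."""
    pos = [s for y, s in zip(observed, scores) if y == 1]
    return 100.0 * sum(1 for s in pos if s >= threshold) / len(pos)


# Hypothetical study area: 20 landslide cells, 80 landslide-free cells,
# and a model that assigns probability 1 everywhere.
observed = [1] * 20 + [0] * 80
constant_scores = [1.0] * 100

hit_pct = presence_hit_pct(observed, constant_scores)
auc_constant = auc_rank(observed, constant_scores)
```

Under these assumptions every landslide cell is captured (`hit_pct` is 100), yet `auc_constant` is 0.5, the value expected from a map with no discriminating power, which illustrates why the indices of the two stages need to be read separately.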
According to the results obtained from the sample analyses of the models presented in this study, the validation indices are quite close to each other, which indicates that the predictive and generalization capacities of the models can be considered acceptable.
References
Akgun A, Sezer EA, Nefeslioglu HA, Gokceoglu C, Pradhan B (2012) An easy to use MATLAB program (MamLand) for the assessment of landslide susceptibility using a Mamdani fuzzy algorithm. Comput Geosci 38:23–34
Ayalew L, Yamagishi H (2005) The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology 65:15–31
Begueria S (2006) Validation and evaluation of predictive models in hazard assessment and risk management. Nat Hazards 37:315–329
Bui D, Pradhan B (2012) Landslide susceptibility mapping at Hoa Binh province (Vietnam) using an adaptive neuro-fuzzy inference system and GIS. Comput Geosci 45:199–211
Can T, Nefeslioglu HA, Gokceoglu C, Sonmez H, Duman TY (2005) Susceptibility assessment of shallow earthflows triggered by heavy rainfall at three subcatchments by logistic regression analyses. Geomorphology 72:250–271
Conforti M, Pascale S, Robustelli G, Sdao F (2014) Evaluation of prediction capability of the artificial neural networks for mapping landslide susceptibility in the Turbolo River catchment (northern Calabria, Italy). Catena 113:236–250
Dagdelenler G (2013) Assessment of sampling and validation strategies in production of landslide susceptibility Maps (Eastern Part of Gallipoli Peninsula). Hacettepe University, Institute of Science, Ankara, Turkey, 186 p. (unpublished)
Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27(8):861–874
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174
Mathew J, Jha VK, Rawat GS (2007) Application of binary logistic regression analysis and its validation for landslide susceptibility mapping in part of Garhwal Himalaya, India. Int J Remote Sens 28(10):2257–2275
Nefeslioglu HA, Gokceoglu C, Sonmez H, Gorum T (2011) Medium-scale hazard mapping for shallow landslide initiation: the Buyukkoy catchment area (Cayeli, Rize, Turkey). Landslides 8:459–483
Nefeslioglu HA, San BT, Gokceoglu C, Duman TY (2012) An assessment on the use of Terra ASTER L3A data in landslide susceptibility mapping. Int J Appl Earth Observ Geoinform 14:40–60
Oh HJ, Pradhan B (2011) Application of a neuro-fuzzy model to landslide-susceptibility mapping for shallow landslides in a tropical hilly area. Comput Geosci 37:1264–1276
Pradhan B (2010) Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia. Adv Space Res 45:1244–1256
Saito H, Nakayama D, Matsuyama H (2009) Comparison of landslide susceptibility based on a decision-tree model and actual landslide occurrence: The Akaishi Mountains, Japan. Geomorphology 109:108–121
San BT (2014) An evaluation of SVM using polygon-based random sampling in landslide susceptibility mapping: The Candir catchment area (western Antalya, Turkey). Int J Appl Earth Observ Geoinform 26:399–412
Suzen ML, Doyuran V (2004) Data driven bivariate landslide susceptibility assessment using geographical information systems: a method and application to Asarsuyu catchment, Turkey. Eng Geol 71:303–321
Yilmaz I (2009) Landslide susceptibility mapping using frequency ratio, logistic regression, artificial neural networks and their comparison: A case study from Kat landslides (Tokat-Turkey). Comput Geosci 35:1125–1138
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Dagdelenler, G., Nefeslioglu, H.A., Gokceoglu, C. (2014). Landslide Susceptibility Model Validation: A Routine Starting from Landslide Inventory to Susceptibility. In: Sassa, K., Canuti, P., Yin, Y. (eds) Landslide Science for a Safer Geoenvironment. Springer, Cham. https://doi.org/10.1007/978-3-319-05050-8_64
DOI: https://doi.org/10.1007/978-3-319-05050-8_64
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05049-2
Online ISBN: 978-3-319-05050-8
eBook Packages: Earth and Environmental Science; Earth and Environmental Science (R0)