Introduction

In northern Algeria, landslides are an alarming geological hazard because of the economic damage and loss of human life they cause, principally in urban areas. They pose a serious threat to both people and property and constitute a major constraint on the economic development and urban planning of many cities. During the last decades, several cases of damaging landslides have been reported throughout the country (Djerbal and Melbouci 2012; Hadji et al. 2013; Bougdal et al. 2013; Djerbal et al. 2014; Bourenane et al. 2014; Laribi et al. 2014; Guirous et al. 2014; Bourenane et al. 2016; Bourenane 2017; Djerbal et al. 2017; Hallal et al. 2017; Hallal et al. 2019; Bourenane et al. 2021a; Bourenane and Bouhadad 2021b).

The impact of these instabilities is significantly worsened by rapid demographic growth, rapid and uncontrolled urbanization in landslide-prone areas, the trend toward heavy and prolonged rainfall, and the lack or insufficiency of initiatives aimed at understanding landslide hazards and risks (Bourenane et al. 2014; Bourenane et al. 2016; Hadji et al. 2017; Bourenane et al. 2021; Bourenane and Bouhadad 2021). Landslides often occur in the rainy season, particularly during torrential rainstorms recorded between October and April. This phenomenon is known throughout the Tellian Atlas domain of northern Algeria, where the rainfall regime is concentrated in a short period with heavy and prolonged rainfall (Bourenane 2017; Bourenane et al. 2019; Bourenane et al. 2021b).

The city of Azazga, located in the mountainous province of Kabylia in northern Algeria, is one of the urban areas most heavily affected by frequent and progressive landslides, which caused serious damage to dwellings, infrastructure, forests, and agricultural fields during the last century (Djerbal et al. 2014; Bourenane et al. 2021). The best-known recorded mass movements occurred in 1952, 1955, 1973, 1974, 1984, 1985, 2003, 2004, 2012, 2014, and 2018 during the winter season, which is characterized by high-intensity rainstorms and long rainfall periods (Bourenane et al. 2021a). The most recent damaging landslides, which occurred in 2012, 2014, and 2018 and affected the urban center of Azazga, caused severe damage to buildings, cultivated land, roadways, and public infrastructure (Fig. 1). In recent years, landslide vulnerability has increased substantially because of the rapid development of urbanization and infrastructure in landslide-prone areas as well as the ongoing change in precipitation trends (Bourenane 2017; Bourenane et al. 2019; Bourenane et al. 2021a). In fact, since independence in 1962, the extension of the city has been continually confronted with severe construction and urban planning problems.

Fig. 1

Examples of the most recent landslides, which occurred in 2012, 2014, and 2018 and caused ground deformation and severe damage to buildings, roadways, and public infrastructure in the urban area of Azazga city

Nowadays, these landslides constitute not only a serious threat to the local population and the environment but also a constant constraint on urban planning and development. Rapid unplanned urbanization, poor land-use planning, environmental mismanagement, and the lack of a risk mitigation strategy exacerbate the impact of these hazards and increase the risk. Unfortunately, landslide phenomena have been insufficiently considered in the local development and land-use planning strategy of the city. Moreover, very little research has aimed to predict and prevent these events, despite the continuous progression of landslides and their related damage (Djerbal et al. 2014; Bourenane et al. 2021a).

To reduce property damage and loss of human life, and to contribute to risk reduction for the sustainable urban planning and development of Azazga, it is necessary to generate comprehensive landslide susceptibility maps (LSMs). Landslide susceptibility mapping is an essential task that can help authorities reduce losses from landslide disasters by serving as a guideline for sustainable land-use planning, for example by restricting urban extension in hazardous zones.

According to Varnes (1984), landslide susceptibility is defined as the probability of the spatial occurrence of landslides in a given area for a given set of predisposing terrain factors. Landslide susceptibility mapping subdivides a given area into homogeneous zones and ranks them according to their degree of landslide susceptibility. Thus, an LSM indicates areas likely to be affected by landslides in the future, based on the correlation between the historical distribution of landslides and their associated factors.

Various modeling methods and techniques, often based on geographic information systems (GIS), have been successfully developed and applied for landslide susceptibility mapping at different scales (regional, medium, large, and local). These methods can be categorized into three groups: (i) heuristic, (ii) deterministic, and (iii) statistical techniques (Guzzetti et al. 1999; Aleotti and Chowdhury 1999; Van Westen et al. 2003; Ayalew and Yamagishi 2005).

The heuristic (or qualitative) method is a direct approach based on the knowledge and experience of the expert. Landslide susceptibility is determined directly using the expert's subjective decision rules to categorize landslide-prone areas and produce a qualitative LSM (Van Westen et al. 2003; Thiery et al. 2007; Bourenane et al. 2014, 2016).

The deterministic (or geotechnical) methods focus on the analysis of the mechanical equilibrium of a potential slide block and calculate the slope safety factor based on mathematical modeling of the physical laws controlling slope failure (Zhou et al. 2003; Jelínek and Wagner 2007). The deterministic models require a large amount of data related to the material properties, such as mechanical characteristics and the degree of saturation to produce reliable results. Obtaining such data for large areas is not practical, and this method is therefore not applicable for the medium and large scale (Terlien et al. 1995).

The statistical (or quantitative) method is an indirect approach, employed to reduce subjectivity in qualitative analysis. It is based on mathematical correlations between the landslide-controlling factors and the distribution of landslides. The main concept of the indirect quantitative statistical approaches is that the controlling factors of future landslides are the same as those observed in the past (Guzzetti et al. 1999).

During the last decade, statistical models such as bivariate statistical analysis (frequency ratio, statistical index, weights of evidence) and multivariate statistical analysis (multiple linear regression, logistic regression, and discriminant analysis) have been widely applied throughout the world and provide reliable results (Pradhan and Lee 2010; Pradhan and Youssef 2010; Tien Bui et al. 2011; Kevin et al. 2011; Yalcin et al. 2011; Ozdemir and Altural 2012; Shahabi et al. 2012; Demir et al. 2015; Bourenane et al. 2016; Thiery et al. 2007; Hadji et al. 2017; Demir 2018). In bivariate statistical analysis, each factor map is combined with the landslide inventory map in GIS to evaluate a weighting value for each factor class. The weights are then summed and classified to obtain an LSM. Multivariate statistical analysis is based on stepwise variable selection to predict the probability of occurrence of a dichotomous event from a set of variables that may be continuous, discrete, or a combination of both. The main difference between logistic regression and the other multivariate statistical analyses is that the independent variables do not have to be normally distributed and the predicted values are converted into probabilities between 0 and 1.
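The conversion of logistic regression outputs into probabilities between 0 and 1 can be illustrated with a minimal Python sketch. It assumes the usual logistic form, in which a linear combination z of the factor values is mapped through 1 / (1 + exp(-z)); the function name `logistic_probability` and the coefficient values are hypothetical and are not the fitted values of this study.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Map a linear combination of factor values to a probability in (0, 1).

    Assumes the standard logistic model: z = b0 + sum(b_i * x_i).
    """
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two factors (illustrative only).
p = logistic_probability(-1.0, [0.8, 0.5], [1.0, 0.4])
print(round(p, 3))
```

Whatever the sign or magnitude of z, the output stays strictly between 0 and 1, which is why the predicted values can be read directly as relative susceptibility.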

Considerable progress has been made, showing that statistical methods provide satisfactory results; they are now considered rigorous, more objective, and more suitable for landslide susceptibility assessment at medium and large scales (1:50,000, 1:25,000, and 1:10,000) because of their potential to minimize errors related to expert subjectivity (Van Westen 1997; Lee and Min 2001; Thiery et al. 2007; Pradhan and Lee 2010; Bourenane et al. 2014; Bourenane et al. 2016). The preparation of LSMs has also improved rapidly thanks to the continuous progress of computer science and geomatics technology. In particular, advances in GIS and remote sensing techniques have made LSM preparation more accurate by supporting the management, manipulation, storage, and processing of spatial data and the spatial analysis of large amounts of data, which has facilitated the updating of susceptibility assessment procedures.

In Algeria, despite the increasing and widespread occurrence of landslides, initiatives in landslide susceptibility mapping remain limited and insufficient. Nevertheless, there are examples of successfully performed studies (Hadji et al. 2013; Bourenane et al. 2014; Djerbal et al. 2014; Bourenane et al. 2016; Hadji et al. 2017; Djerbal et al. 2017; Bourenane et al. 2019; Karim et al. 2019; Merghadi et al. 2020).

In this article, we attempt to address the challenges of landslide hazards in land-use planning in order to initiate sustainable policies and legislation for mitigation and prevention. The LSM may contribute to risk prevention and mitigation and to the establishment of a management policy for sustainable urban planning and development in landslide-prone areas. Nevertheless, producing a reliable LSM remains a challenging task because of both the complex characteristics of landslides and the modeling approach used. The most appropriate model for a given area depends greatly on the modeling methods applied and on the quality of the data used (Yilmaz 2009). To overcome this problem, a variety of approaches have been developed to understand the mechanisms and conditioning factors that control landslides and to predict their spatial occurrence. Statistical methods supported by GIS have gained popularity in the field of landslide susceptibility mapping. They seem more accurate than physically based models, which need multiple iterations and simulations to find detailed geotechnical parameters when preparing the susceptibility outputs. However, statistical models have certain limits: the outputs of black-box models are difficult to explain, and the models may overfit when training data are limited.

The main purpose of this work is to prepare, validate, and compare the LSMs by applying four statistical methods including frequency ratio (FR), statistical index (SI), logistic regression (LR), and weights of evidence (WoE) with the help of GIS techniques for the city of Azazga in northern Algeria. These methods are tested and validated, and the results are compared and discussed.

This research is part of a thematic approach focused on understanding landslide hazard and on a methodology for evaluating and mapping landslide hazard at a large scale. This investigation thus complements the different works undertaken on the prediction of landslides in urban areas, in order to improve the scientific understanding of landslide hazard and its spatial variation in the city of Azazga. The final results provide valuable guidance for landslide hazard reduction and may serve as guidelines for land-use development planning in the city of Azazga.

Description of the study area

The city of Azazga is situated in the Tizi Ouzou province in northern Algeria (Fig. 2a), about 135 km east of the capital city of Algiers and about 35 km east of Tizi-Ouzou Wilaya (prefecture) (Fig. 2b). The study area is the urban zone delimited by the perimeter of the city master plan (PDAU), located at latitude 36.74° N and longitude 4.37° E (Fig. 2c). The Azazga region is highly susceptible to gully erosion and landslides because of its geomorphological, geological, and climatic characteristics as well as anthropogenic activities.

Fig. 2

Geographical location of the study area: (a) in the northeast of the Tizi-Ouzou prefecture, shown against the administrative division of northern Algeria; (b) in north-central Algeria, east of the capital city Algiers; (c) limit of the urban perimeter on the digital elevation model (DEM)

Geomorphologically, the Azazga region belongs to the massif of Grande Kabylia, which is part of the northern mountainous area of the Tellian Atlas and is characterized by three major landform types: (i) the mountains of El Abed in the east, whose altitude can exceed 700 m; (ii) hills with altitudes varying between 100 and 400 m; and (iii) the alluvial plain of Sébaou, located west of Azazga city, with altitudes ranging between 50 and 150 m. The city of Azazga has a hilly morphology; it is located on the foothills of El Abed Mountain and backs onto a plateau with a maximum slope of 10°, delimited to the north and south by two slope breaks exceeding 15°. The altitude of the study area ranges between 500 and 800 m above sea level and decreases from the northeast to the southwest (Fig. 2c). This geomorphology results from geological, tectonic, hydrographical, and erosion processes.

From a geological point of view, the Azazga region belongs to the North-Kabyle Flysch domain located in the internal zones of the Maghrebin chain, where four lithological formations crop out (Fig. 3): (i) the marly clays of the Cretaceous flysch; (ii) the clays and sandstones of the Oligocene Numidian flysch; (iii) the Mio-Pliocene clays and marls; and (iv) the clayey and sandy sandstones of Quaternary scree. These geological formations, which are affected by cracks and faults, cover a large part of the urban area of Azazga. In addition, they are very sensitive to the presence of water because of the high plasticity of the marls and clays and are consequently predisposed to erosion, landslides, and flows. Landslides are located in the flysch area, an allochthonous domain characterized by thrust slices that were displaced during or after their sedimentation in the Alpine cycle and deposited in a tectonized orogenic zone. The main tectonic features include (Gelard 1979; ORGM 1996) (Fig. 3): (i) a tectonic contact oriented N–S separating the “flysch of Azazga” unit from the Numidian unit; (ii) a fault system oriented SW–NE and NW–SE; and (iii) the overthrust of the Numidian sandstones on the Cretaceous clay.

Fig. 3

Geological map of the studied area

The hydrographic network is represented by the Aboud, Iazoughen, and Boulina rivers, which have a semi-permanent flow and are fed by temporary tributaries (Fig. 1). The hydrogeology of the area is mainly controlled by the distribution of the impermeable flysch substratum and the permeable Quaternary scree. The upper layer is relatively permeable and allows water to pass through, whereas the underlying shale layer could be impermeable. The groundwater in the slope is recharged by rainfall through surface infiltration and fluctuates seasonally. Consequently, the groundwater resource is significant, and the groundwater level rises, during the winter period (December to February), when rainfall is highest.

The climate of the study area is of Mediterranean type, characterized by a dry season from June to September and rainy, sometimes snowy, winters from October to April. According to the precipitation database of the National Agency of Meteorology and Hydrology, which covers 64 years (1950 to 2014), precipitation is concentrated over a short period of the rainy season, from December to March, which accounts for 50 to 60% of the yearly precipitation (ANRH 2014). Highly variable rainfall amounts (700 to 1200 mm year−1) occur with intense storms during the autumn and winter seasons and represent a major landslide hazard factor (ANRH 2014).

The development of human activities that the city has experienced since the historical occupation of the Azazga slopes, from 1974 until today, with rapid urbanization that did not consider the real land constraints, has led to significant morphological changes and modified the stability conditions (deforestation, excavation, extensive clear-cut logging, and vegetation removal). The gradual extension of the urban area, together with extensive and inappropriate land use in the northern and southern parts of the city, is the main factor in the increased frequency of landslides.

Methodology

As stated earlier, the main objective of the present research is to apply and compare the FR, SI, LR, and WoE models for landslide susceptibility mapping in the city of Azazga. For this purpose, the adopted methodology comprises the following five steps (Fig. 4): (i) data gathering (data types and data sources) and construction of a related spatial database; (ii) landslide inventory mapping based on the interpretation of aerial photographs, satellite images, historical records, and geological field surveys; (iii) mapping of landslide predisposing factors from aerial photographs, the geological map, the DEM, the historical seismicity dataset, and the precipitation database; (iv) model building, assessment by statistical modeling, and mapping of landslide susceptibility using the FR, SI, LR, and WoE models in GIS; and (v) verification of the quality and performance of the different models and validation of the obtained LSMs using receiver operating characteristic (ROC) curves and the statistical rules for spatially effective LSMs.
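For the validation step, the area under the ROC curve (AUC) is a common single-number summary of model performance. The following Python sketch is one way to compute it, under the assumption that each validation cell has a susceptibility score and a binary landslide label; the function name `roc_auc` and the toy values are hypothetical. It uses the pairwise interpretation of the AUC: the probability that a randomly chosen landslide cell scores higher than a randomly chosen non-landslide cell, with ties counted as one half.

```python
import numpy as np


def roc_auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    scores: susceptibility index per cell; labels: 1 for landslide cells, 0 otherwise.
    Memory use grows with n_landslide * n_non_landslide, so this suits small samples.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one landslide and one non-landslide cell")
    # Difference between every landslide score and every non-landslide score.
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


# Toy validation sample (illustrative values only).
auc = roc_auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0])
print(round(auc, 3))
```

An AUC of 0.5 corresponds to a model that ranks cells no better than chance, and 1.0 to a model that ranks every landslide cell above every non-landslide cell.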

Fig. 4

Methodological flowchart for landslide susceptibility mapping

Spatial database construction and landslide density analysis

The assessment and mapping of landslide susceptibility depend largely on the input data for both landslide events and event-controlling factors. The reliability of an LSM depends on the amount and quality of the input data. Thus, data gathering and geodatabase building constitute the first and most fundamental step in producing an LSM. In the present research, data were collected from different sources and used to generate the thematic layers (Table 1). The landslide inventory map of the study area was prepared using aerial photograph analysis, high-resolution satellite images (Alsat 2A), Google Earth image analysis, and extensive field observations. Based on the landslide inventory data, ten important causative factors, namely slope, aspect, altitude, lithology, precipitation, land use, distance to drainage, distance to faults, distance to lineaments, and distance to roads, were identified and considered for landslide susceptibility mapping. These factors were extracted from the high-resolution Alsat 2A satellite images, the aerial photographs, the geological maps, the DEM, and the rainfall database (Table 1). All thematic layers were processed in GIS software (ArcGIS v10.2), including georeferencing of maps, assignment of coordinate systems, visualization, extraction, and geoprocessing of the raster datasets. All the data were georeferenced in the same projection system used for Algeria (WGS 1984, UTM Zone 31 North).

Table 1 Spatial database used in the study area

Landslide inventory map

The landslide inventory is a key element and basic data in landslide susceptibility assessment, particularly, when a statistical probability approach is adopted. A landslide inventory map identifies the spatial locations of the existing landslides along with their types and their time of occurrence. The first and the fundamental step in landslide susceptibility assessments correspond to the acquisition of information about historical landslides that constitute a key element for future landslide prediction (Guzzetti et al. 1999).

The detailed landslide inventory map of the Azazga area was produced at a scale of 1:10,000 through a combination of the following steps (Table 1): (i) analysis and interpretation of aerial photographs taken in 1973, 1984, 2000, and 2008 at scales of 1:20,000, 1:20,000, 1:10,000, and 1:4000, respectively, high-resolution Alsat 2A Geo-Eye panchromatic satellite images (2.5 m) taken in 2011 at a scale of 1:10,000, and time series of Google Earth satellite imagery taken between 2003 and 2018; (ii) historical landslide records (landslide reports, newspaper records, and data collected from field interviews with local people), verified and completed by (iii) extensive and detailed geological fieldwork performed between 2014 and 2019 (CGS 2010). The landslide inventory map shows the spatial distribution of landslides, their geomorphological features (landslide type, topography, deformation characteristics, lithology, area, and activity), and their geometrical parameters (hillslope gradient, perimeter, and maximum length). The landslides cover a surface area of about 2.816 km2 (281.6 ha), which represents 31% of the total area of the Azazga urban perimeter. The mapped landslides are classified according to Varnes (1978) into rotational and translational slides (27.95%), flows (2.05%), and falls (1%). The depth of the landslides varies from shallow (< 2 m) to very deep (> 20 m). The size of the largest landslide is about 6841,00 m2, whereas the smallest one covers an area of 805 m2.

For landslide susceptibility assessment, the landslide inventory was randomly divided into two separate subsets: (i) the training dataset (70%), used for building the landslide models, and (ii) the testing dataset (30%), used for performance evaluation and model validation (Yesilnacar and Topal 2005; Chung and Fabbri 2003). Figure 5 shows pictures of newly field-mapped landslides induced by the 2012, 2014, and 2018 rainfall events in the study area. Figure 6 shows the spatial distribution of the landslides used in this analysis.
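A random 70/30 partition of this kind can be sketched in a few lines of Python. The sketch assumes that each mapped landslide has an identifier and that the split is made on whole landslides rather than on individual pixels; the function name `split_inventory`, the fixed seed, and the count of 100 landslides are hypothetical choices made only so that the example is reproducible.

```python
import numpy as np


def split_inventory(landslide_ids, train_fraction=0.7, seed=0):
    """Randomly split landslide identifiers into training and testing subsets."""
    rng = np.random.default_rng(seed)  # fixed seed so the split can be reproduced
    ids = np.asarray(landslide_ids)
    shuffled = rng.permutation(ids)
    n_train = int(round(train_fraction * ids.size))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical inventory of 100 landslides labelled 0..99.
train_ids, test_ids = split_inventory(range(100))
print(len(train_ids), len(test_ids))
```

Keeping the testing subset out of model building is what allows the later validation to measure predictive rather than merely descriptive performance.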

Fig. 5

Examples of different types of recent landslides observed in the study area. a Rotational landslides in the district of Ighil Bouzel. b Rotational landslides in front of the cultural center in Ighil Bouzel. c Translational slides in Aghni Guizem in the southeast of the city of Azazga. d Rotational landslides along the national road RN12 in the north of the city. e, f Flows in the sector of Tadart in the north of Azazga

Fig. 6

Landslide inventory map of the study area

Landslide predisposing factors

Landslide susceptibility assessment and mapping are based on the selection of relevant controlling factors, on the assumption that future landslides will occur under the same conditions as those observed in the past. The selected statistical method compares the parametric maps with the landslide inventory map, and the results are then extrapolated to the entire evaluated area to produce the LSM. The landslide inventory analysis of the study area identified the main predisposing factors, or predictive variables, to introduce into the statistical model in order to assess and map landslide susceptibility.

Ten causative factors (Fig. 7) related to landslide occurrence, namely slope, aspect, altitude, land use, lithology, precipitation, distance to drainage, distance to faults, distance to lineaments, and distance to roads, were identified and analyzed. The thematic layers were derived mainly from the following sources (Table 1): (i) the available national databases (geological maps, topographic map, DEM, precipitation map, and boreholes from geotechnical studies); (ii) interpretation of aerial photographs and satellite images and analysis of Google Earth imagery; and (iii) field geological/geotechnical investigation. The thematic layers generated in GIS software were resampled to a 10 m × 10 m grid to facilitate raster-based computation. To analyze the influence of each predisposing factor on landslide occurrence, the landslide density of each class of each factor was calculated by dividing the landslide area within the class by the area of the class. For this purpose, all landslide causative factor maps were converted to raster format and classified with the same pixel size (10 m × 10 m) and the same projection using the ArcToolbox spatial analysis tools. The landslide inventory raster was then overlaid on each classified factor raster using the combine spatial analysis tool in the toolbox to extract the landslide pixels in each class of each factor. Next, the influence of each factor class was determined using the equations of the frequency ratio (Eq. 1), statistical index (Eq. 3), weights of evidence (Eq. 7), and logistic regression (Eq. 9) methods. Finally, the results are summarized in Tables 2, 3, 4, and 5.
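Because all layers share the same 10 m × 10 m grid, the overlay amounts to counting cells, and cell counts are proportional to areas. The Python sketch below illustrates the per-class landslide density together with two of the bivariate weights. It assumes the commonly cited definitions, in which the frequency ratio is the class landslide density divided by the overall landslide density and the statistical index is the natural logarithm of that ratio; the notation of Eqs. 1 and 3 may differ. The function name `class_statistics` and the toy rasters are hypothetical.

```python
import numpy as np


def class_statistics(factor_classes, landslide_mask):
    """Per-class landslide density, frequency ratio (FR), and statistical index (SI).

    factor_classes: integer raster of class codes 0..k-1 for one factor.
    landslide_mask: Boolean raster of the same shape (True = landslide cell).
    """
    classes = np.asarray(factor_classes).ravel()
    slides = np.asarray(landslide_mask, dtype=bool).ravel()
    n_classes = int(classes.max()) + 1
    class_cells = np.bincount(classes, minlength=n_classes)
    slide_cells = np.bincount(classes[slides], minlength=n_classes)
    with np.errstate(divide="ignore", invalid="ignore"):
        density = slide_cells / class_cells  # landslide area / class area
        overall = slides.sum() / slides.size  # landslide density of the whole map
        fr = density / overall
        si = np.log(fr)  # -inf for classes that contain no landslide cell
    return density, fr, si


# Toy 4 x 4 raster with two classes (illustrative values only).
factor = np.array([[0, 0, 1, 1]] * 4)
mask = np.zeros((4, 4), dtype=bool)
mask[0, 0] = True   # one landslide cell in class 0
mask[:3, 2] = True  # three landslide cells in class 1
density, fr, si = class_statistics(factor, mask)
print(density, fr, si)
```

In this toy case, class 1 holds three of the four landslide cells in half of the map, so its frequency ratio exceeds 1 and its statistical index is positive, while class 0 has a ratio below 1 and a negative index.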

Fig. 7

Landslide-conditioning factors of the study area. a Precipitation map. b Lithological map. c Slope angle map. d Aspect map. e Altitude map. f Land use type. g Distance to rivers map. h Distance to faults. i Distance to lineaments. j Distance to roads map

Table 2 Frequency ratio values of the selected landslide conditioning factors
Table 3 Statistical index (SI) calculated, for each class of the selected landslide causal factors
Table 4 Weighting values (WC) calculated, for each class of the selected landslide causal factors
Table 5 Logistic regression (LR) coefficients for each class of the selected landslide causal factors

The statistical analysis of density, based on the observed relationship between each factor and the spatial distribution of landslides, is very useful for revealing the correlation between landslide locations and factors. In this study, landslide density analysis was combined with GIS to evaluate the relationship between the susceptibility and triggering factors and landslide occurrence. Figure 8 shows the density of each landslide type, in percent, for each class of each landslide-conditioning factor. The results indicate that rainfall strongly affects the occurrence of landslides and constitutes the principal triggering factor in the Azazga area. The average seasonal precipitation during the period from 1950 to 2014 ranges between 700 and 1400 mm (Fig. 7a). Landslide frequency increases with the amount of rainfall. The highest concentrations of landslides occur in the two highest rainfall classes (Fig. 8a), ranging from 1000 to 1050 mm (26%) and 900 to 950 mm (24%). This indicates a strong correlation between landslide events and high rainfall amounts.

Fig. 8

Area and landslide density for each class of the ten factors used in this analysis. Values are reported for each landslide type and for all types together. a Precipitation map. b Lithology. c Slope angle. d Aspect. e Altitude. f Land use. g Distance to rivers. h Distance to faults. i Distance to lineaments. j Distance to roads map

Regarding the geology, the occurrence of landslides in the Azazga region is closely related to lithology and variations in material properties. In this study, the lithology is classified into seven classes, as shown in Fig. 7b. Landslide density is highest in two lithological classes, namely the sandy clayey Quaternary scree and the marly clays of the Cretaceous flysch (Fig. 8b). The Quaternary scree deposit consists of sandstone blocks embedded in clayey to sandy-clay deposits of low shear strength and contains shallow aquifers (CGS 2018). These deposits are highly weathered at the surface and crossed by a network of widely open fissures and are therefore sensitive to concentrated runoff and erosion. The presence of water generates pore water pressure that reduces the shear strength of the slope materials. In addition, the hydrographic network crosses the slope toe of this loose deposit and removes material by stream undercutting, which reduces the resisting forces of the slope material. The Cretaceous flysch consists mostly of gray to greenish clays that are finely bedded, folded, and friable and can be split into thin platelets. These strongly folded sediments have a schistose texture because of tectonic action, which in many places has led to the formation of a thick weathered shallow layer. The internal structure of the flysch is characterized by lithological layers, schistosity planes, and other tectonic fractures that dip in the slope direction.

In terms of geomorphology, the slope gradient (Fig. 7c), the aspect (Fig. 7d), and the elevation (Fig. 7e) are considered important controlling factors in slope stability. Landslide density is highest in the 10°–20° category (29%), followed by the 20°–30° (24%), 30°–40° (23%), and 0°–10° (23%) categories (Fig. 8f). The landslide density is relatively low and increases with the orientation angle (aspect), reaching its maximum on north-facing slopes (Fig. 8d), which are more strongly affected by rainfall. Elevation is associated with landslides through its relationship with other factors. The density is highest in the 200–300 m class and then decreases (Fig. 8e).

Land use is ranked among the causal factors of landslides, especially in relation to the presence or absence of vegetation. In the area under investigation (Fig. 7f), landslide density is highest in vegetated and agricultural land, followed by forest and the urban area (Fig. 8c). Variations in vegetation are an important parameter affecting slope failure, as slope stability is very sensitive to changes in the state of the vegetation. The high density of landslides in these areas can be explained by intense human activity in inappropriate new highland settlements driven by rapid population growth. Soil cohesion is modified by the extension of urbanization and depends on the type of vegetation; thus, cultivated or sparsely vegetated areas are more prone to landslide processes. Barren land and mountainous cultivated areas also have a significant number of landslide events. Additionally, crops can increase soil moisture, and groundwater conditions are altered mainly by earthworks and urbanization. In the study area, cover crops represent 50% of the total area. Extensive cultivated land combined with altered groundwater conditions can cause landslide problems.

Fluvial erosion of the slope toe is one of the most common causal factors of landslides; it may induce bank failure through slope undercutting and basal erosion, especially in areas with a dense drainage network, hilly landscape, and deep valleys. Such cases are mainly observed along the Iazoughen and Boulina rivers in the city of Azazga. The distance from these main rivers is therefore considered an important factor in characterizing landslide-susceptible areas (Fig. 7g). Landslide occurrence increases close to rivers (Fig. 8j). The frequency values show that buffer zones within 50 m of rivers are strongly associated with landslides, and the frequency of landslides decreases with increasing distance from the drainage network. This is related to river dynamics: concentrated runoff saturates the soil and causes basal erosion of the banks, which activates slope movement (slope undercutting and removal of abutment).

Tectonic structures, such as fractures or faults, are also considered favorable conditions for landslide occurrence. Thus, major structural discontinuities produced by fractures and faults were included as a parameter in this study (Fig. 7h). In our study, no trend shows a decrease in landslide density close to faults (Fig. 8g). Tectonic lineaments are zones of weakness, characterized by heavily fractured rocks that are prone to instability. The proximity to these structures increases the probability of landslide occurrence, because selective erosion and water drainage along fault planes can cause landslides. The predominant tectonic lineaments in the study area trend NE–SW and N–S (Fig. 7i). There is a trend showing that locations close to lineaments have a higher landslide density (Fig. 8h).

The distance to roads is also considered among the anthropogenic factors influencing the occurrence of landslides, because the opening of roads frequently modifies slope stability, most commonly through large excavations, the removal of vegetation, and external loading by earth fill. In this study, a distance-to-roads map (Fig. 7j) was constructed using the five defined buffer categories. The landslide density computed for each distance class showed no clear influence of roads, although 80% of the landslides occurred within 100 m of a road (Fig. 8i).

Modeling approaches

The landslide susceptibility assessment in the study area was implemented using four simple statistical models (FR, SI, WoE, and LR) with the help of GIS techniques to generate LSMs. The assessment is based on a cross-analysis of the determining factors and the spatial frequency of landslides through statistical models using the GIS matrix method. Using the equations of the four models, the weighting values of each class of each landslide factor were evaluated on the training set in order to generate the final LSMs. The resulting maps were classified by dividing the weight values into five distinct classes using the standard deviation method: very low, low, moderate, high, and very high susceptibility.

FR method

The FR model (Lee and Min 2001) allows one to derive the spatial relationship between the distribution of landslides and their conditioning factors. The main advantage of the FR method is that it is easy to apply and provides easily comprehensible results. The FR is defined as the ratio of the area where landslides occurred to the total area. It expresses the relationship between the landslides in a class of a landslide factor and the area of that class. A ratio less than 1 indicates a low correlation between landslide occurrence and the factor class, and a ratio greater than 1 indicates a high correlation. It can be calculated using Eq. 1:

$$FR=\frac{\text{Ls pix}}{\text{A pix}}=\frac{\text{Landslide pixel class}}{\text{Area pixel class}}=\frac{\%\ \text{Landslide pixel class}}{\%\ \text{Area pixel class}}$$
(1)

where FR is the frequency ratio, Ls pix is a landslide pixel in a factor class, and A pix is the total pixel area of the class in the study area. In the study area, the frequency ratio for each causative factor class was calculated using Eq. 1 and the results are indicated in Table 1.
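As an illustration of Eq. 1, the following minimal sketch computes the frequency ratio of every class of one factor from pixel counts. The function name `frequency_ratio` and the pixel counts are hypothetical and chosen only for the example; they are not taken from the tables of this study.

```python
def frequency_ratio(landslide_pixels, class_pixels):
    """Return the frequency ratio (Eq. 1) for every class of one factor.

    landslide_pixels[c] -- number of landslide pixels falling in class c
    class_pixels[c]     -- total number of pixels of class c in the study area
    """
    total_landslide = sum(landslide_pixels.values())
    total_area = sum(class_pixels.values())
    ratios = {}
    for name, area in class_pixels.items():
        pct_landslide = landslide_pixels.get(name, 0) / total_landslide
        pct_area = area / total_area
        # FR = share of landslide pixels in the class / share of area in the class
        ratios[name] = pct_landslide / pct_area
    return ratios


# Hypothetical pixel counts for three slope classes (illustrative only).
slope_landslides = {"0-10": 100, "10-20": 300, "20-30": 100}
slope_area = {"0-10": 4000, "10-20": 4000, "20-30": 2000}
fr_slope = frequency_ratio(slope_landslides, slope_area)
for name, value in fr_slope.items():
    print(name, round(value, 3))
```

With these counts, the 10°–20° class receives a ratio above 1 and the 0°–10° class a ratio below 1, which is how the values of Eq. 1 are read in the text.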

After calculating the FR for each landslide-conditioning factor using Microsoft Excel under GIS, the FR value of each factor class was assigned through a join in the ArcGIS tool. Afterward, the weighted landslide factors were rasterized using the spatial analysis search tool. The landslide susceptibility index (LSI) was then estimated by summing the frequency ratios of each factor class with the Map Algebra raster calculator of the spatial analysis tool, using Eq. 2:

$$\begin{aligned}LSI_{FR}=&\ FR_{\text{slope}}+FR_{\text{slope aspect}}+FR_{\text{precipitation}}\\&+FR_{\text{lithology}}+FR_{\text{land use}}+FR_{\text{distance to streams}}\\&+FR_{\text{distance to roads}}+FR_{\text{distance to faults}}\\&+FR_{\text{distance to lineaments}}+FR_{\text{altitude}}\end{aligned}$$
(2)

where LSI is the landslide susceptibility index and FR is the frequency ratio of each landslide factor class.
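The summation in Eq. 2 can be sketched as a pixel-by-pixel overlay. The example below is only an assumption-laden stand-in for the Map Algebra step: the rasters are tiny nested lists, only two of the ten factors are shown, and the function name, class labels, and weights are hypothetical.

```python
def landslide_susceptibility_index(class_rasters, weights):
    """Sum, pixel by pixel, the weight of the class that each factor takes there."""
    factors = list(class_rasters)
    rows = len(class_rasters[factors[0]])
    cols = len(class_rasters[factors[0]][0])
    lsi = [[0.0] * cols for _ in range(rows)]
    for factor in factors:
        for r in range(rows):
            for c in range(cols):
                class_label = class_rasters[factor][r][c]
                lsi[r][c] += weights[factor][class_label]
    return lsi


# Hypothetical 2 x 2 rasters of class labels for two factors.
class_rasters = {
    "slope": [["0-10", "10-20"], ["10-20", "20-30"]],
    "land use": [["urban", "crops"], ["forest", "crops"]],
}
# Hypothetical FR weights per class (illustrative only).
weights = {
    "slope": {"0-10": 0.5, "10-20": 1.5, "20-30": 1.0},
    "land use": {"urban": 0.5, "crops": 1.5, "forest": 1.0},
}
lsi = landslide_susceptibility_index(class_rasters, weights)
print(lsi)
```

The same overlay applies unchanged to the other bivariate models once the per-class weights are replaced by their SI or WC values.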

After the calculation of the LSI, the index values were ranked and classified into different landslide susceptibility levels in order to establish the final LSM using the standard deviation method in the ArcGIS tool.

SI method

The SI method, also known as the information value (IV) method, is one of the bivariate statistical methods (Van Westen 1997) based on a statistical correlation between the predisposing factors and the distribution of landslide areas. The SI value for each class of each factor is given as the natural logarithm of the landslide density in the class divided by the landslide density in the entire map of the factor:

$$\begin{aligned}SI&=\ln\left(\frac{\text{Conditional probability } CP}{\text{Prior probability } P}\right)=\ln\left(\frac{\text{Densclass}}{\text{Densmap}}\right)\\&=\ln\left(\frac{Npix(S)/Npix(N)}{SNpix(S)/SNpix(N)}\right)\end{aligned}$$
(3)

where SI is the weight given to the class of a given factor. The conditional probability is the ratio of the number of landslide pixels in the class to the number of pixels in the class, and the prior probability is the ratio of the total number of landslide pixels to the total number of pixels of the study area. Densclass represents the landslide density within the class of the landslide factor, and Densmap is the landslide density in the total area of the landslide factor. Npix (S) is the number of landslide pixels in a landslide factor class. Npix (N) is the total number of pixels in the same class. SNpix (S) is the number of pixels of all landslides. SNpix (N) is the total number of all pixels. The natural logarithm is used to take into account the large variation in weights. When the SI is > 0.1, the correlation between the landslide occurrences and the landslide factor is high, which implies a high probability of landslide occurrence; when the SI is < 0.1, the correlation between the factor and landslides is low, indicating a low probability of landslide occurrence. After calculating the SI for each class of each landslide factor using Microsoft Excel and GIS, the SI of each factor class was assigned through a join in the ArcGIS tool. Then, after rasterization of the weighted landslide factors using the lookup tool in spatial analysis, the LSI of the study area was calculated as in Eq. 4:

$$\begin{aligned}LSI_{SI}=&\ SI_{\text{slope}}+SI_{\text{slope aspect}}+SI_{\text{lithology}}\\&+SI_{\text{land use}}+SI_{\text{distance to streams}}\\&+SI_{\text{precipitation}}+SI_{\text{distance to roads}}\\&+SI_{\text{distance to faults}}+SI_{\text{distance to lineaments}}\\&+SI_{\text{altitude}}\end{aligned}$$
(4)

where LSI is the landslide susceptibility index and SI is the statistical index of each landslide factor class. The higher value of LSI indicates the higher probability of landslide occurrence. After the calculation of LSI, the index values were classified into different landslide susceptibility degrees in order to establish the final LSM using the standard deviation method in the ArcGIS tool.
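A minimal sketch of Eq. 3 is given below, assuming per-class pixel counts are available. The function name and the counts are hypothetical. Because the logarithm of zero is undefined, the sketch leaves classes without any landslide pixel unweighted (`None`); how such classes were handled in this study is not stated, so that choice is an assumption.

```python
import math


def statistical_index(landslide_pixels, class_pixels):
    """Return SI = ln(Densclass / Densmap) (Eq. 3) for every class of one factor."""
    dens_map = sum(landslide_pixels.values()) / sum(class_pixels.values())
    si = {}
    for name, area in class_pixels.items():
        dens_class = landslide_pixels.get(name, 0) / area
        # Assumption: a class with no landslide pixel gets no weight (ln(0) is undefined).
        si[name] = math.log(dens_class / dens_map) if dens_class > 0 else None
    return si


# Hypothetical pixel counts for three slope classes (illustrative only).
slope_landslides = {"0-10": 100, "10-20": 300, "20-30": 100}
slope_area = {"0-10": 4000, "10-20": 4000, "20-30": 2000}
si_slope = statistical_index(slope_landslides, slope_area)
for name, value in si_slope.items():
    print(name, round(value, 3))
```

Since Densclass/Densmap is algebraically the same ratio as in Eq. 1, the SI of a class equals the natural logarithm of its FR.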

WoE method

The WoE method is a statistical method (Bonham-Carter et al. 1989) that uses the log-linear form of the Bayesian probability model to estimate the probability based on the concept of posterior and prior probability (P). The WoE approach is based on information obtained from the spatial correlation between landslide distribution and landslide causative factors. It calculates the spatial relationship between the distribution of landslides (L) and the landslide causative factors (B) within the area, in the form of negative (W−) and positive (W+) weights. These weights are calculated as natural logarithms of probability ratios as follows (Bonham-Carter 1994):

$$W^{+}=\ln\frac{P(B\mid L)}{P(B\mid \overline{L})}$$
(5)
$$W^{-}=\ln\frac{P(\overline{B}\mid L)}{P(\overline{B}\mid \overline{L})}$$
(6)

where P is the probability, B is the predictive factor, and L is the landslide. The overbar sign “¯” represents the absence of the class, of the landslide, or of the predictive factor. Positive W+ and negative W− weights indicate a positive and a negative correlation, respectively, between landslide occurrence and the presence of the predictive variable. ln is the natural logarithm, used in order to estimate the conditional probability of landslide occurrence.

The difference between the positive and negative weights, as computed for each class of each analyzed factor, is known as the weight contrast WC:

$$WC=W^{+}-W^{-}$$
(7)

The contrast WC represents the complete spatial relationship between the predictive variable and landslides. The value of WC typically lies between 0 and 2: when WC tends to 0, the presence of the considered factor does not affect the distribution of landslides in the area, whereas when WC is approximately equal to or greater than 2, the correlation is significant (Barbieri and Cambuli 2009).
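Equations 5 to 7 can be evaluated from four pixel counts per class. The sketch below is illustrative only: the function name and the counts are hypothetical, the probabilities are estimated as simple pixel proportions, and classes with a zero count (for which a logarithm would be undefined) are not handled.

```python
import math


def weights_of_evidence(class_landslide, class_total, all_landslide, all_total):
    """Return (W+, W-, WC) for one factor class from pixel counts."""
    n_b_l = class_landslide                    # class present, landslide present
    n_b_nl = class_total - class_landslide     # class present, landslide absent
    n_nb_l = all_landslide - class_landslide   # class absent, landslide present
    n_l = all_landslide                        # all landslide pixels
    n_nl = all_total - all_landslide           # all non-landslide pixels
    n_nb_nl = n_nl - n_b_nl                    # class absent, landslide absent
    w_plus = math.log((n_b_l / n_l) / (n_b_nl / n_nl))      # Eq. 5
    w_minus = math.log((n_nb_l / n_l) / (n_nb_nl / n_nl))   # Eq. 6
    return w_plus, w_minus, w_plus - w_minus                # Eq. 7


# Hypothetical counts: 300 of the 500 landslide pixels fall in a class
# covering 4000 of the 10000 pixels of the study area.
w_plus, w_minus, wc = weights_of_evidence(300, 4000, 500, 10000)
print(round(w_plus, 3), round(w_minus, 3), round(wc, 3))
```

In this example the class is over-represented among landslide pixels, so W+ is positive, W− is negative, and the contrast is positive.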

In this work, after calculating the WC for each class of each landslide factor using Microsoft Excel under GIS, the WC of each factor class was assigned through a join in the ArcGIS tool. Subsequently, the LSI was calculated after rasterization of the contrast WC using the lookup tool in spatial analysis, as in Eq. 8:

$$\begin{aligned}LSI_{WC}=&\ WC_{\text{slope}}+WC_{\text{slope aspect}}+WC_{\text{lithology}}\\&+WC_{\text{land use}}+WC_{\text{distance to streams}}\\&+WC_{\text{precipitation}}+WC_{\text{distance to roads}}\\&+WC_{\text{distance to faults}}+WC_{\text{distance to lineaments}}\\&+WC_{\text{altitude}}\end{aligned}$$
(8)

where each WC term is the distribution-derived weight of the corresponding factor map: slope, slope aspect (exposure), lithology, land use, distance to streams, precipitation, distance to roads, distance to faults, distance to lineaments, and altitude.

After the LSI calculation, the index values were ranked and categorized into different landslide susceptibility classes to establish the final LSM using the standard deviation method in the ArcGIS tool.

LR method

The LR is a multivariate analysis method that evaluates the regression relationship between a dependent variable (landslides) and several independent variables (landslide causative factors) (Lee and Pradhan 2007). The important advantage of LR over multiple linear regression is that, through the addition of an appropriate link function to the usual linear regression model, the variables may be discrete or continuous, or any combination of both types, and they do not necessarily have normal distributions (Lee and Pradhan 2007). In the LR, the dependent variable is a binary variable representing the presence (1) or the absence (0) of a landslide, whereas the independent variables can be continuous, discrete, dichotomous, or a mix of any of them. In order to predict the possibility of landslide occurrence in each grid cell, the probability (P) was calculated using the LR model, which is expressed as in Eq. 9 (Lee and Pradhan 2007):

$$P=\frac{1}{1+e^{-z}}$$
(9)

where P is the probability of landslide occurrence, which varies from 0 to 1 on an S-shaped curve, and z is the linear combination defined in Eq. 10, whose value varies from − ∞ to + ∞:

$$z=\text{intercept}+b_{1}X_{1}+b_{2}X_{2}+b_{3}X_{3}+\dots+b_{n}X_{n}$$
(10)

where X1, X2, X3, …, Xn are the independent variables and b1, b2, b3, …, bn are the corresponding coefficients of the logistic regression model.
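Equations 9 and 10 can be combined in a few lines. The sketch below only evaluates the probability for given coefficients; it does not fit the model. The function name, the intercept, and the coefficients are hypothetical placeholders and are not the values estimated in this study.

```python
import math


def landslide_probability(intercept, coefficients, values):
    """Evaluate Eq. 10 (linear combination z) and Eq. 9 (logistic function)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    # Two algebraically equivalent forms keep exp() from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients for two factors (illustrative only).
p = landslide_probability(-2.0, [1.0, 0.5], [1.0, 2.0])
print(p)
```

Here z equals 0, which corresponds to the midpoint of the S-shaped curve; increasingly positive z moves P toward 1 and increasingly negative z moves it toward 0.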

Validation model

Evaluating the accuracy of the landslide susceptibility model, that is, validating it, is the most important step in landslide susceptibility modeling. Without validation, the prepared LSM has no scientific relevance (Chung and Fabbri 2003). Among the available methods for determining the accuracy of landslide susceptibility models, we use in this study the ROC curve and the statistical rules for spatially effective LSMs. The accuracy and performance of the produced LSMs were assessed by comparing the known landslide location data with the obtained LSMs.

The ROC curve is one of the useful statistical methods used to represent the performance or the quality of the landslide susceptibility model. The area under curve (AUC) value is used to evaluate the efficiency of a forecast system by describing the system’s ability to predict accurately the non-occurrence or the occurrence of a landslide event (Chung and Fabbri 2003; Yesilnacar and Topal 2005). The AUC value and the correspondent performance can be rated as follows (Yesilnacar and Topal 2005): 0.5–0.6 (poor performance model), 0.6–0.7 (average performance model), 0.7–0.8 (good performance model), 0.8–0.9 (very good performance model), and 0.9–1 (excellent performance model).
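To make the AUC concrete, the sketch below computes it as the probability that a randomly chosen landslide pixel receives a higher susceptibility score than a randomly chosen stable pixel, which is equivalent to the area under the ROC curve. The function names and the scores are hypothetical. The rating function follows the classes quoted above; assigning a boundary value such as 0.8 to the upper class, and rating values below 0.5 as poor, are assumptions.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of landslide (1) and stable (0) pixels."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count for half
    return wins / (len(positives) * len(negatives))


def rate_auc(auc):
    """Map an AUC value to the performance classes of Yesilnacar and Topal (2005)."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "very good"
    if auc >= 0.7:
        return "good"
    if auc >= 0.6:
        return "average"
    return "poor"  # assumption: values below 0.5 are also rated poor


# Hypothetical susceptibility scores and observed labels (illustrative only).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3), rate_auc(auc))
```

The pairwise form is quadratic in the number of pixels, so it suits small illustrative samples rather than full rasters.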

In order to validate the used models in this study, the landslide area was partitioned randomly into two categories: 30% of landslides for model validation and 70% of landslides for training, considering their spatial allocation and using the random division technique. Then, the models were validated by using the ROC curves.
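A random 70/30 partition of a landslide inventory can be sketched as follows. The function name, the seed, and the use of integer identifiers are assumptions for illustration; the sketch performs a plain random split and does not reproduce any spatial consideration applied in this study.

```python
import random


def split_inventory(landslide_ids, training_share=0.7, seed=42):
    """Randomly split landslide identifiers into training and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(landslide_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * training_share)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical inventory of 100 landslide polygons identified by number.
training, validation = split_inventory(range(100))
print(len(training), len(validation))
```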

The resulting LSMs were also verified and validated using two statistical rules for spatially effective LSMs (Bai et al. 2010; Pradhan and Lee 2010): (i) the percentage of landslides should increase with the degree of susceptibility, so that fewer landslides fall in the low and very low susceptibility classes and more landslides fall in the high susceptibility class of the LSMs; and (ii) the high susceptibility class should cover only small areas.

Results analysis

Results of the FR model

Using the training data, the frequency ratio of each factor class was calculated from its correlation with landslide events by applying the FR model (Eq. 1), as indicated in Table 2. Then, the FR values of the layer classes were summed to yield the LSI using Eq. 2.

According to Table 2, the lithological features of the study area represent an important factor in landslide occurrence. The Quaternary scree class exhibited a high FR value (1.76, which is > 1), indicating high landslide susceptibility. However, the Numidian sandstone class has a low FR value (0.452, which is < 1), indicating a low susceptibility to landslide occurrence. The slope classes 10°–20°, 20°–30°, and 30°–40° have the highest FR values (1.176, 0.96, and 1.01, respectively), whereas the slope class 0°–10° has a low FR value (0.95). This relationship indicates that the landslide susceptibility increases as the slope gradient increases. For the slope exposure factor, the FR value is higher for the south aspect (1.35), the northeast aspect (1.45), and the north aspect (1.78), indicating high landslide susceptibility. The remaining slope aspect classes indicate a low landslide susceptibility because their FR values are < 1. As shown in Table 2, the FR is > 1 for the land use class of pasture and agriculture area (1.564), indicating a high landslide susceptibility. This is due to the cultivated land with degrading vegetation, which increases the soil moisture and the pore water pressure and thus leads to a reduction of shear strength and to slope failure. The urban area showed a low FR value, indicating a low susceptibility to landslide occurrence. This can be explained by the fact that settlement coincides with the low and gentle slope gradient parts of the study area.

For the distance to streams, the obtained FR values indicate that the probability of landslide occurrence decreases as the distance to the rivers increases (Table 2). The highest value is observed for distances between 0 and 50 m, indicating a high probability of landslide occurrence within this distance; this is due to riverbank erosion, river undercutting, regressive erosion, and gully effects. However, for distances > 150 m, the FR is < 1, indicating a low landslide susceptibility. The proximity to roads gives high values (1.26) for distances between 200 and 250 m and low values for distances between 0 and 50 m, indicating that roads have no influence on landslide occurrence. In addition, the results show that the landslide frequency decreases as the distance from faults and lineaments increases (Table 2). On the other hand, the FR analysis showed that higher FR values are distributed in higher rainfall zones (Table 2), which means that the landslide probability increases with the amount of precipitation. Lastly, regarding the altitude factor, the FR values indicate that the probability of landslide occurrence decreases as the altitude increases.

Results of the SI model

In the study area, the SI value of each landslide factor class, calculated with Eq. 3 by overlaying the landslide factor raster with the landslide raster layer, is reported in Table 3.

As presented in Table 3, the SI value is > 0.1 for the lithology class of Quaternary scree (0.567), which indicates high landslide susceptibility. For the remaining classes, the SI values are < 0.1, indicating a low susceptibility to landslide occurrence. For slope classes 10°–20° and 30°–40°, the SI values are > 0.1 (SI = 0.162 and 0.011, respectively), indicating high landslide probability, and the SI values are < 0.1 for slope classes 0°–10° and 20°–30° (SI = − 0.04 and − 0.033, respectively), indicating low landslide probability (Table 3). For the slope aspect factor, the SI values are > 0.1 for northwest facing (SI = 0.11), west facing (SI = 0.19), southwest facing (SI = 0.05), southeast facing (SI = 0.22), and east facing (SI = 0.61) slopes, indicating high landslide probability. However, the SI values are < 0.1 for the remaining slope aspect classes, indicating a low probability of landslide occurrence. As indicated in Table 3, the SI value for the pasture/agriculture land use class (0.44) is > 0.1, indicating high landslide susceptibility. The SI values for the remaining land use classes (forest and urban area) are < 0.1, indicating a low susceptibility to landslide occurrence. The SI values are > 0.1 for the precipitation classes of 900–950 mm (SI = 0.06) and 1000–1050 mm (SI = 0.149), indicating high landslide probability. However, the SI values are < 0.1 for the remaining classes, indicating low susceptibility to landslide occurrence. Regarding the distance to streams, the SI values for the 0–50 m and 50–100 m classes are > 0.1 (0.28 and 0.05, respectively), indicating high landslide susceptibility. However, for distances > 100 m, the SI values are < 0.1, indicating low landslide susceptibility.

The proximity to roads gives high values (SI > 0.1) for the distance classes of 150–200 m and 200–250 m, indicating high landslide susceptibility, while the remaining distance classes indicate a low influence on landslide occurrence. Moreover, the results showed that the landslide frequency decreases as the distance from faults and lineaments increases (Table 2). For the distance to faults, the 0–50 m, 50–100 m, 100–150 m, and 150–200 m classes show SI values > 0.1, indicating high landslide susceptibility, while for the remaining distance classes, the SI values are < 0.1, indicating low landslide susceptibility.

The distance to the lineament classes of 0–50 m, 50–100 m, 100–150 m, and 150–200 m showed that SI values are > 0.1, indicating high landslide susceptibility, while for the remaining distance classes, the SI values are < 0.1 defining low landslide susceptibility. In terms of altitude, the classes of 200–300 m, 300–400 m, and 600–700 m showed that SI values are > 0.1, indicating high landslide susceptibility, while the remaining altitude classes gave SI values < 0.1, indicating low landslide susceptibility.

Results of the WoE model

In this study, the thematic maps of the landslide-affecting factors were first overlaid on the landslide map. Then, the weights and WC values were calculated for each of the landslide-related factors using Eqs. 5, 6, and 7 (Table 4). Afterward, the conditional independence was tested and verified before integrating the predictor patterns to map the landslide susceptibility. The chi-square values, which serve to test the conditional independence between all pairs of binary patterns for each predictive factor, were evaluated at the 95% confidence level with 1 degree of freedom. The calculated chi-square values are greater than the values shown in Table 4, suggesting that the pairs are not significantly different.

The resulting contrast values (Table 4) show the importance of the conditioning factors in the occurrence of landslides. The contrast is negative for factors unfavorable to landslide occurrence and positive for favorable factors. The analysis of the contrast values (Table 4) revealed that the highest landslide susceptibility corresponded to the Quaternary scree deposits, the slope class 30°–40°, and the north slope aspect class. In contrast, the distance to lineaments, the land use, the precipitation, the distance to streams, the distance to roads, the distance to faults, and the altitude indicated a low probability of landslide occurrence, as evidenced by weights close to or below zero (Table 4).

Results of the LR model

To assess the spatial relationship between the dependent and independent variables using the logistic regression model, the spatial databases of the ten conditioning factors and landslides were converted into a grid format and then into Excel data files for logistic regression analysis in the statistical package Real Statistics. The input factors represent the independent variables, and the occurrence of landslides corresponds to the dependent variable. Firstly, the weighting of the factor classes was based on the percentage area of landslides in the homogeneous units. The percentage area of landslides for each factor was identified by calculating the ratio of the observed landslide area to the area of the homogeneous units. Then, the correlation between landslide events and landslide-affecting factors was estimated, and the logistic regression model was run to obtain the logistic regression coefficients.

The weight factor and the logistic regression coefficient for each thematic layer are shown in Table 5. The Hosmer and Lemeshow test indicated that the goodness of fit of the equation can be accepted, because the significance of chi-square is larger than 0.05 (16.635). A higher R-square value of Cox (0.75) and Snell R2 (1) and Nagelkerke R2 (1) indicates a better model. The ROC value of 0.852 indicates a good correlation between the independent and dependent variables. Finally, the binary logistic regression model and its coefficients are given in Eq. 11:

$$\begin{aligned}z=&-\mathrm{18,668}+0.848\times\text{precipitation}+9044\times\text{lithology}\\&+1509\times\text{land use}+7963\times\text{aspect}+1951\times\text{altitude}\\&+\mathrm{10,550}\times\text{slope}+\mathrm{5,247}\times\text{distance to faults}\\&-9266\times\text{distance to lineaments}-8103\times\text{distance to roads}\\&+6139\times\text{distance to streams}\end{aligned}$$
(11)

According to Eq. 11 and Table 5, the slope, the lithology, the aspect, the distance to streams, the distance to faults, the precipitation, the land use, and the altitude are positively related to landslide occurrence, as shown by their positive coefficients. However, the distance to roads and the distance to lineaments show a negative relation with landslide occurrence in the study area. In addition, the slope, lithology, and aspect coefficients show that, among the effective factors in landslide occurrence, the slope parameter has a more crucial effect than any other parameter.

Landslide susceptibility mapping

After the LSI was evaluated with the FR, SI, LR, and WoE models, it was ranked into distinct susceptibility classes according to its value. Many methods are available in the literature for dividing weight values into classes, such as the standard deviation, equal interval, and natural break methods (Ayalew and Yamagishi 2005; Yalcin et al. 2011). In this research, the standard deviation method was chosen because the LSI values obtained with the FR, SI, LR, and WoE models showed a normal distribution (Ayalew and Yamagishi 2005; Yalcin et al. 2011). This method uses the mean value to generate class breaks, and it allowed us to divide the results of this study into five categories by adding or subtracting 1 standard deviation at a time. Using the standard deviation method in ArcGIS 10.1, the LSMs of the study area obtained with the FR, SI, LR, and WoE methods were classified into five susceptibility classes. The resulting LSMs are shown in Fig. 9.
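The classification step can be sketched as follows. The exact class breaks used in ArcGIS are not given in the text, so this example assumes breaks at the mean minus 1.5, minus 0.5, plus 0.5, and plus 1.5 standard deviations, which yields five classes one standard deviation wide around the mean. The function names and the LSI values are hypothetical.

```python
import statistics

CLASSES = ["very low", "low", "moderate", "high", "very high"]


def std_dev_breaks(values):
    """Assumed class breaks: mean - 1.5 sd, - 0.5 sd, + 0.5 sd, + 1.5 sd."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation of all pixels
    return [mean + k * sd for k in (-1.5, -0.5, 0.5, 1.5)]


def classify(value, breaks):
    """Return the susceptibility class of one LSI value."""
    for label, upper in zip(CLASSES, breaks):
        if value < upper:
            return label
    return CLASSES[-1]  # above the last break


# Hypothetical LSI values of nine pixels (illustrative only).
lsi_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
breaks = std_dev_breaks(lsi_values)
print([classify(v, breaks) for v in lsi_values])
```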

Fig. 9 Landslide susceptibility maps obtained using (a) the FR method, (b) the SI method, (c) the WoE method, and (d) the LR method

In the LSM obtained with the FR model (Fig. 9a), 1% of the total area is classified as very low landslide susceptibility. Low, moderate, high, and very high susceptibility areas represent 7%, 36%, 34%, and 21% of the total area, respectively (Fig. 10). The LSM produced using the SI model (Fig. 9b) shows that the very high and high susceptibility zones represent a large share, with 34% and 20%, respectively, whereas the moderate, very low, and low susceptibility classes cover 32%, 12%, and 1%, respectively (Fig. 10). In the LSM generated with the WoE model (Fig. 9c), 1% of the total area falls in the very low susceptibility class. The low and moderate susceptibility classes cover 11% and 42% of the total area, respectively, and the high and very high zones are close to 29% and 16%, respectively (Fig. 10). In the LSM generated with the LR model (Fig. 9d), 4% and 6% of the total area have very low and low susceptibility, respectively. The moderate, high, and very high susceptibility classes cover 43%, 25%, and 20% of the total area, respectively (Fig. 10).

Fig. 10 The relative distribution of the various susceptibility classes of the different LSMs

Validation and comparison of the landslide susceptibility maps

To validate the models used in this study, the landslide area was subdivided into two sets: 30% of the landslides for model validation and 70% for training, taking into account their spatial location and using the random distribution technique. The models were then validated using ROC curves, which were obtained by comparing the landslide validation data set with the four LSMs; the area under the curve was calculated for each of the four models. The results indicate that the FR model has the highest prediction accuracy (78.40%), followed by the SI (78.10%), the WoE (73.50%), and the LR (72.10%) models (Fig. 11). It is concluded that all the models used in this study showed good accuracy in predicting the landslide susceptibility in the study area.

Fig. 11 Receiver operating characteristic (ROC) curves representing the quality of the four models used (FR, SI, WoE, and LR)

The obtained LSMs were also tested and validated using the two statistical rules for spatially effective LSMs (Bai et al. 2010; Pradhan and Lee 2010). The percentages of landslides within the five susceptibility classes were determined and are presented in Fig. 12. It is deduced that most landslides fall in the high and very high susceptibility classes, and few landslides fall in the low and very low susceptibility classes of the LSMs. Figure 12 shows that the high and very high susceptibility classes of all LSMs contain 75 to 93% of the active landslide zones, while the moderate zone contains 13 to 20% and around 3% of the active landslide zones coincide with the low susceptibility class. The very low susceptibility area contains less than 1% of the active landslide zones in all obtained LSMs. The results in Fig. 12 show clearly that the percentages of landslides increase from very low to very high susceptibility, and the high susceptibility class covers only small areas.

Fig. 12 Active landslide zones falling into the various susceptibility classes

Discussion and conclusion

The severe and progressive landslides affecting the city of Azazga, northern Algeria, constitute not only a serious threat to both the local population and the environment but also a persistent constraint on urban planning and development. Therefore, predicting and delineating landslide areas are crucial tasks for reducing the landslide hazard and the associated damage. In this study, we investigated the potential application of statistical models and GIS as a relatively new approach for landslide susceptibility mapping in the city of Azazga. Firstly, a landslide inventory map was prepared using the interpretation of aerial photographs and satellite images, supported by field surveys. The identified mass movements include falls, slides, and flows that cover a surface of about 281.6 ha, which corresponds to 31% of the total urban surface of the Azazga agglomeration. Then, ten landslide predisposing factors, including the slope, aspect, altitude, land use, lithology, precipitation, distance to drainage, distance to faults, distance to lineaments, and distance to roads, were derived from high-resolution Alsat 2A satellite images, aerial photographs, the geological map, the DEM, and the rainfall database. The LSMs were produced using four methods and classified into five susceptibility classes: very low, low, moderate, high, and very high.

For validation, the obtained LSMs were compared with known landslide locations using the ROC technique. According to the obtained AUC, the FR model has a higher prediction performance (78.40%) than the SI (78.10%), the WoE (73.50%), and the LR (72.10%) models. The results also revealed that all the models used provided good accuracy in landslide susceptibility mapping in the city of Azazga. In addition, the validation using the statistical rules showed that the density of landslides increased from the low to the very high susceptibility zone, and a high percentage of landslides occurred in the very high susceptibility area.

Compared to case studies throughout the world, such results have been observed with models such as FR, SI, WI, LR, and AHP (Lee and Min 2001; Ayalew and Yamagishi 2005; Thiery et al. 2007; Pradhan and Youssef 2010; Pradhan and Lee 2010; Yalcin et al. 2011; Tien Bui et al. 2011; Kevin et al. 2011; Shahabi et al. 2012; Ozdemir and Altural 2012; Demir et al. 2015; Bourenane et al. 2016; Hadji et al. 2017; Demir 2018; Karim et al. 2019). The LSMs obtained in this study can be considered a useful guide for future development and planning in the urban area of Azazga. Such susceptibility maps provide information on the spatial probability of landslide occurrence in the area. They are a helpful and valuable tool for risk reduction. As our results are given at a large mapping scale, the exact extent of the slope instability areas and the details of the high susceptibility areas are well determined; this will be useful for further detailed site-specific studies. The development of urbanization in landslide-prone areas can be avoided if the LSM is available.

Based on the obtained LSMs, a range of mitigation techniques is recommended in order to reduce the impact of the existing landslides: (i) restricting development planning in landslide-prone areas by using the obtained LSMs; (ii) controlling, by means of codes and urban rules, human activity in the landslide-prone areas (i.e., excavation, construction, grading, cutting slopes, landscaping, irrigation activities, vegetation clearance…); (iii) protecting the existing developments by means of physical mitigation measures (such as drainage, down counterfort berms that serve as buttresses, and protective barriers); and finally (iv) developing and implementing monitoring and warning systems.