Introduction

Mass movements are important geo-ecosystemic phenomena that occur in settings ranging from gentle hills to steep mountains (Gruber et al. 2009). The consequences of landslides for human life and for the economies of many nations are severe (Nefeslioglu et al. 2008). Landslide susceptibility is defined as a quantitative or qualitative assessment of the classification, volume (or area), and spatial distribution of landslides which exist or may potentially occur in an area (AGS 2007).

Statistical methods are now widely applied to the prediction and classification of environmental problems in various regions (Paliwal and Kumar 2009). In recent years, many international research institutes have allocated considerable funds to the evaluation of landslide hazards and to proposals for describing their spatial distribution (Guzzetti et al. 1999). Quantitative statistical approaches have been widely used in the international literature since the early 1990s. In bivariate statistical analysis, the overlay of parameter maps and the calculation of landslide densities form the core of the analysis, and the importance of each parameter, or of specific combinations of parameters, can be analyzed separately. Using normalized values (landslide density per parameter class in relation to the landslide density over the whole area), a total hazard map can be created by adding the weights of the individual parameters. The weight values can also be used to create decision rules based on the experience of the earth scientist. The first papers dealing with such a statistical approach were published almost 25 years ago (Carrara 1983, 1988); the author later adapted his original methodology to the geographic information system (GIS) environment (Carrara et al. 1990, 1991).

Landslide susceptibility assessment in a GIS environment is based on a suitable selection of the factors that play a dominant role in slope stability. The evaluated input factors reflect geological, climatic, and hydrologic conditions as well as the morphometric characteristics of the relief and the actual landscape structure of the study area. Landslide susceptibility assessment and statistical processing rest on the axiom of actualism, i.e., we presume that future landslides will occur under the same conditions as those in the past.

Several bivariate or multivariate approaches were developed for landslide susceptibility mapping. A synthesis of the available methods, their applicability, and drawbacks can be found in articles by Yin and Yan (1988), Carrara et al. (1995), Chung et al. (1995), Soeters and van Westen (1996), Atkinson and Massari (1998), Aleotti and Chowdhury (1999), Guzzetti et al. (1999), Clerici et al. (2002), Dai and Lee (2002), Iovine et al. (2003a, b), Suzen and Doyuran (2004), van Westen et al. (2005), Clerici et al. (2006), Lee and Sambath (2006), Castellanos and Van Westen (2007), Lee et al. (2007), Iovine (2008), Dewitte et al. (2010), Oh et al. (2009), Pradhan and Lee (2010), Pradhan et al. (2010), and Oh et al. (2011).

The Zab basin is one of the regions most susceptible to landslides. The aim of this work is to establish a reproducible procedure for estimating landslide susceptibility in the Zab basin using two bivariate methods (information value analysis and density-area) and two multivariate methods (linear regression analysis and discriminant analysis), together with landslide causative factor databases developed from satellite images with the aid of GIS, while limiting the collection of landslide and thematic data.

Landslide susceptibility processing and practical verification of the methodology can provide a basis for urban and land use planning and for public administration offices and insurance companies. In the preliminary stages of geological investigation, this methodical procedure offers a low-cost approach, especially for larger areas and linear structures that are endangered both by extremely slow landslides and by rapid debris flows (Greif et al. 2006).

Case study region

The study area is located in the southwest mountainous area of West Azerbaijan Province along the Zab river basin in Sardasht between the latitudes of 36° 8′ 25″ N and 36° 26′ 27″ N and the longitudes of 45° 21′ 21″ E and 45° 40′ 44″ E (Fig. 1).

Fig. 1 General position of Zab basin regarding the political provincial and national borders

The central part of the Zab river basin extends about 30 km both from north to south and from east to west and covers about 520 km² (Fig. 1). This basin is one of the most densely populated residential areas in the region and includes one city, three townships, and more than 80 villages. The zone is quite susceptible to landslides because of its climatic conditions, geology, geomorphologic characteristics, and human activities.

Tectonically, the study region lies along the main Zagros thrust, and faults are the main cause of pit formation. The morphology of the region is strongly affected by tectonic forces, and the role of tectonics in a series of landslides has been demonstrated in various studies. Because the most important faults of western Iran pass through the area, this factor deserves particular emphasis in the study region. The research zone lies in the Sanandaj-Sirjan region, and its east-northeast part lies in the Mahabad-Khoy zone (Khezri 2010). In addition, because of the steep rocks and the several faults with their different mechanisms, slope steepness and layer lengths vary (Fig. 2).

Fig. 2 Cross-section in 1:100,000 geological mapping across the Zab valley in the northern part of the research area

The sub-regional faults, together with the main Piranshahr fault and their different mechanisms, have produced different slope classes and various layer lengths. They have also produced steep slopes along the West Piranshahr faults (in the central part, above Vavan Abad to Kacgal Abad), where slopes of more than 60 degrees cause slope instability and mass movements (especially rockfalls and snow avalanches).

Geomorphological settings

The Zab basin is situated in a mountain zone characterized by a mountain climate and affected by multiple snow avalanches. The Zab River flows through the central section of the basin, while the western and eastern boundaries are formed by high crests of limestone and sandstone. The study area can be subdivided into two geomorphological units separated by a major north-south fault. The eastern unit is dominated by autochthonous sandstone outcrops, while the western unit is composed of autochthonous shale and limestone.

The eastern unit of the basin is drained by the Zab River and other rivers, which cut an asymmetric valley in highly fractured sandstones. The gentle slopes (10–30°) are covered by andesite deposits 1 to 10 m thick and shaped by erosion and cutting of hills; these slopes are affected by shallow rotational or translational slides triggered by the undercutting of torrents. In contrast, the steep slopes (20–70°) are characterized by bare soils and affected by rockfalls on sandstones (Fig. 3).

Fig. 3 Geomorphological map of Zab basin

The western unit, drained by three main torrents, has an irregular topography of alternating steep convex slopes and planar slopes. The steepest convex slopes (>35°) are carved in andesite and travertine outcrops or affected by rock-block or complex slides (Khezri 2010). The planar slopes (10–35°), composed of thick marble deposits (4 to 20 m), are very often cultivated and affected by rotational or translational slides.

Most landslides within the western unit are located along streams or on gentle slopes, where the contact of travertine deposits and black marls creates a hydrological discontinuity favorable for slope movements.

Objective

In this paper, we first introduce the general objectives, the mapping unit types, and the most commonly used hazard evaluation methods. We then discuss the experience gained from the application of GIS-based models of hazard and risk due to slope failures over test areas in Central Zab basin, ranging in size from some tens to some thousands of square kilometers, and outline the potentials and pitfalls of the approach.

The procedure adopted for this research includes four steps:

  (a) Identification of the best way to calculate landslide probabilities based on the characteristics of the landslide inventory.

  (b) Identification of the most relevant combination of predisposing terrain factors avoiding conditional dependence.

  (c) Evaluation of the degree of model fit by statistical tests and comparisons with the landslide inventory.

  (d) Evaluation of the best indirect susceptibility map in comparison with a direct susceptibility map.

Methods and materials

Materials

There are no universal guidelines regarding the selection of factors in landslide susceptibility mapping (Ayalew et al. 2005). One parameter may be an important controlling factor for landslide occurrence in a certain area but not in another one. The selection of causal factors therefore needs to take the nature of the study area and data availability into account. In total, nine parameters were studied: lithology, distance to fault, horizontal gravity acceleration of earthquake, distance to waterway, distance to road, slope, slope aspect, land use/land cover, and rainfall amount.

The data entry and production were the most cumbersome and time-consuming steps of this study. First of all, a digital elevation model (DEM) of the study area was generated from a triangulated irregular network model that was derived from digitized contours of four 1:50,000 scale topographical maps with a contour interval of 25 m. The slope and slope aspect parameters were obtained from the generated DEM.

The critical point was the selection of appropriate pixel size for positional accuracy and precision of susceptibility levels in the resultant map. The positional accuracy needed for 1:50,000 scale maps must be 150 m. For this reason, a pixel size of 50 m was selected for our DEM. The DEM was then used to describe geomorphological and geological processes in the landscape (Fig. 4).

Fig. 4 3D view of Zab basin, derived from Landsat ETM+ satellite images and the relevant digital elevation model

Fault lines were derived from 1:100,000 scale geology maps, and the orthogonal (straight-line) distance of every pixel to the fault lines was calculated. A similar process was carried out for road lines and drainage networks. In addition, the density per square kilometer of drainage networks, roads, and fault lines was used to demonstrate the importance of these features in the whole study area. To maintain a 1-km² search area, a 564-m search radius with a 100 m offset was used.
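The 564-m radius is consistent with a circular search window whose area is 1 km². A quick check of that arithmetic, with variable names chosen here purely for illustration:

```python
import math

# Area of the circular search window in square metres (1 km^2).
search_area_m2 = 1_000_000.0

# Radius r of a circle with that area, from area = pi * r**2.
search_radius_m = math.sqrt(search_area_m2 / math.pi)

print(round(search_radius_m, 1))  # 564.2
```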

To store the information of these parameter maps in a uniform thematic database, the size of each pixel for all the products was 100 × 100 m.

For comparison with other types of zoning obtained in different studies, we first interpreted aerial photographs of the region, taken in 1996 at 1:40,000 scale, by the stereoscopic method. Regions susceptible to sliding were pinpointed on the aerial photographs. In addition, landslide locations were obtained by Global Positioning System (GPS) at the center of the main scarp. As slide scars are relatively small, typically 25 m wide, they were recorded as point data in a GIS. The features of 23 slides were recorded, and each landslide was assigned an ID.

A total of 23 landslides were identified in the study area; they cover 6.41 km², or 3.07 % of the study area. The properties of the landslides were recorded on a standard landslide inventory data sheet, but the main purpose here was to map only the boundaries of the landslides. A digitized map of landslide boundaries was produced and input into the GIS. A vector-to-raster conversion was then performed to provide raster data of the landslide areas.

Since the undisturbed morphological conditions (conditions before a landslide occurs) are best extracted from the vicinity of the landslide polygon itself, we used the approach of Suzen and Doyuran in this research. Suzen and Doyuran (2004a, b) presented a new approach to generating decision rules for landslide occurrence, called seed cells. Seed cells are obtained by adding a buffer zone to the crown and flanks of the landslide. In this study, 5,681 seed cells were derived from the landslide polygon boundaries.

Considering that the best undisturbed morphological conditions (conditions before failure) and influencing factors can be extracted from the vicinity of the landslide itself, seed cells were selected using 25 m buffer zone around each landslide point. This buffer interval was chosen because the width and length of the landslide depletion zone rarely exceed 25 m (Che et al. 2011). As the factor raster maps were produced at 20 m spatial resolution, 25 m buffer resulted in at least four seed cells per landslide scar.
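The statement that a 25 m buffer on a 20 m raster yields at least four seed cells can be checked numerically. The sketch below assumes that a cell counts as a seed cell when its centre lies within the buffer distance of the landslide point; that rule, the function name, and the grid layout are illustrative assumptions, not details given in the text.

```python
import numpy as np

def seed_cell_count(x, y, cell=20.0, buffer=25.0):
    """Count raster cells whose centres lie within `buffer` metres of (x, y).

    Assumption: cell centres sit at integer multiples of `cell`.
    """
    # Range of cell indices that could possibly fall inside the buffer.
    i_min = int(np.floor((x - buffer) / cell))
    i_max = int(np.ceil((x + buffer) / cell))
    j_min = int(np.floor((y - buffer) / cell))
    j_max = int(np.ceil((y + buffer) / cell))
    ii, jj = np.meshgrid(np.arange(i_min, i_max + 1), np.arange(j_min, j_max + 1))
    # Distance from every candidate cell centre to the landslide point.
    dist = np.hypot(ii * cell - x, jj * cell - y)
    return int(np.count_nonzero(dist <= buffer))

# Sweep the landslide point across one cell (0.5 m steps) to find the worst case.
offsets = np.linspace(0.0, 20.0, 41)
min_count = min(seed_cell_count(dx, dy) for dx in offsets for dy in offsets)
print(min_count)
```

Under this assumed rule, the worst case occurs when the point sits midway between four cell centres, where exactly those four centres fall inside the buffer.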

To define the horizontal gravity acceleration of earthquakes, we used the historical and systematically recorded earthquake data for the period 1910–2009. The events with the highest magnitude in each 5-year period were extracted with the Earthquake software, and the coordinates of the earthquake centers within a 150-km radius of the center of the study region were then determined.
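The selection step (the strongest event per 5-year period within 150 km) can be sketched as follows. This is not the Earthquake software used in the study; the function names are hypothetical, the catalogue entries are invented for illustration, and the centre point is only an approximate midpoint of the bounding coordinates given for the study area.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def strongest_per_period(catalog, centre, radius_km=150.0, start_year=1910, step=5):
    """Return {period start year: strongest event} for events inside the radius."""
    strongest = {}
    for year, lat, lon, magnitude in catalog:
        # Skip events farther than the search radius from the centre.
        if haversine_km(centre[0], centre[1], lat, lon) > radius_km:
            continue
        # Start year of the 5-year bin this event belongs to.
        period = start_year + ((year - start_year) // step) * step
        if period not in strongest or magnitude > strongest[period][3]:
            strongest[period] = (year, lat, lon, magnitude)
    return strongest

# Approximate centre of the study area (assumed midpoint of its bounding box).
centre = (36.29, 45.52)

# Invented events for illustration only: (year, latitude, longitude, magnitude).
catalog = [
    (1912, 36.5, 45.3, 5.1),
    (1913, 36.0, 45.9, 5.6),
    (1931, 36.3, 45.5, 4.8),
    (1933, 30.0, 50.0, 6.9),  # far outside the 150-km radius
]

selected = strongest_per_period(catalog, centre)
```

With this invented catalogue, the 1933 event is discarded by the distance filter even though it has the largest magnitude, which is the intended behaviour of the radius constraint.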

Afterward, using the experimental formulas of Ambersiz and Melvil (2009) and the Earthquake software, the surface-wave and body-wave magnitudes and the earthquake magnitude were determined, together with the coordinates of each point (Barzegar and Maleki 1998). The maximum horizontal acceleration of the ground at the center was determined with the Donovan and Demzomer experimental formula; for greater certainty, the average horizontal gravity acceleration was used. After the necessary estimations, we drew the iso-acceleration map of the Zab basin using the Ilwis and Winsurf software.

The waterways of the Zab basin were digitized, and all the operations needed to use this information in the Ilwis environment were carried out. The rasterized pixel size was set to 25 × 25 m.

We drew the annual isohyets (equal-rainfall curves) of the Zab river basin using 30 years of records from the stations in the study region and kriging interpolation in a GIS environment.

Another dataset used was land cover, which was interpreted from a Landsat ETM+ image acquired on 21 April 2009 and calibrated using field observations. Because of significant cloud coverage, the results of the classification were edited and simplified by manual digitization. The interpreted images were then digitally processed to further refine the boundaries by supervised classification with ERDAS (Earth Resource Data Analysis System) software. The accuracy of the land cover interpretation was checked in the field. Seven main land cover types were considered: second-class pasture, first-class pasture, settled, natural forest, man-made forest, dry farm land, and barren land. Based on validation from field observations, the land cover map has an accuracy of the order of the Landsat image spatial resolution (∼30 m). After geo-referencing of the image, a combination of bands 1, 4, and 7 was used to produce color composite images, and the operational information layer was created by maximum likelihood classification (Dymond et al. 2006).

Finally, zoning and evaluation of landslide were carried out through all of the above-mentioned methods.

Methods

Landslide inventory

Landslide inventory maps show the locations and characteristics of landslides that have moved in the past but generally do not indicate the mechanism(s) that triggered them. Inventory maps therefore provide useful information about the spatial distribution of existing landslides and the potential for future landsliding. Landslides whose characteristics and boundaries are currently unclear were identified using dated satellite images, which proved very useful in preparing the landslide inventory map (Yalcin and Bulut 2007). The landslide inventories in this study were prepared using four types of inventory: geomorphological, event, seasonal, and multi-temporal (Guzzetti et al. 2012).

A landslide inventory at scales of 1:50,000 to 1:25,000 was compiled for local zoning by a geomorphologist through interpretation of SPOT 5 satellite images acquired on 25 May 2008, field surveys, and analysis of the literature for 2008 and 2009 (Thiery et al. 2003, 2004). Air photo interpretation was carried out by the stereoscopic method on photographs taken in 1996 at 1:40,000 scale. Fieldwork was carried out between June 2007 and June 2008 to complete the photo interpretation.

To reduce the uncertainty introduced by the expert in charge of mapping (Ardizzone et al. 2002; Wills and McCrinck 2002), two degrees of confidence were defined for the photo-interpretation and the information from the available literature (landslide recognized or not), while four degrees of confidence (very high, high, medium, and low) were distinguished for the field survey.

The minimum area covered by an inventoried and mapped landslide is 1,600 m². Smaller landslides are represented by a dot. Minor and lateral scarps may be distinguished, as well as up-slope deformations such as tension cracks or minor landslides. Landslides are classified, and their original mass, volume, and average velocity are recorded from direct information or expert assessment (Fig. 5).

Fig. 5 Landslide inventory map of Zab basin

In this research, bivariate statistical methods (density-area and information value) and multivariate methods (linear regression and discriminant analysis) were applied for landslide susceptibility mapping.

Bivariate statistical analysis

In bivariate statistical analysis, each factor map is combined with the landslide distribution map, and weighting values based on landslide densities are calculated for each parameter class.
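A minimal sketch of this overlay step is given below. The text does not spell out the density-area formula, so the weight used here (class landslide density minus the overall landslide density) is an assumed common formulation; the function name and the tiny example rasters are invented for illustration.

```python
import numpy as np

def class_densities(factor_classes, landslide_mask, n_classes):
    """Landslide density per factor class and over the whole map."""
    classes = np.asarray(factor_classes).ravel()
    slides = np.asarray(landslide_mask, dtype=bool).ravel()
    # Pixels per class, and landslide pixels per class.
    class_pixels = np.bincount(classes, minlength=n_classes)
    slide_pixels = np.bincount(classes[slides], minlength=n_classes)
    # Density is zero for classes that do not occur on the map.
    density = np.divide(slide_pixels, class_pixels,
                        out=np.zeros(n_classes), where=class_pixels > 0)
    overall = slides.sum() / classes.size
    return density, overall

# Invented 3 x 4 factor map with three classes and a landslide mask.
factor = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 2, 2]])
slides = np.array([[0, 0, 1, 1],
                   [0, 0, 0, 1],
                   [0, 1, 0, 0]], dtype=bool)

density, overall = class_densities(factor, slides, n_classes=3)

# Assumed density-area weight: positive where a class has more landslides
# than the map-wide average, negative where it has fewer.
weights = density - overall
```

In this toy example class 1 receives a positive weight and classes 0 and 2 receive negative weights, which matches the intuition that a class is "favourable" for landslides only when its density exceeds the density over the whole area.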

In bivariate statistical analysis, the occurrence of landslides is considered the dependent variable, and each of the environmental parameters affecting this phenomenon is considered an independent variable. The significance of each parameter for slope instability is analyzed separately from the other parameters, and the frequency of the data (such as the slide area or the number of landslides that occurred in a region) is used to predict the probability of occurrence (van Westen et al. 1999, 2002).

The nine parameter maps were crossed with the landslide inventory map (in the Ilwis environment), and the density-area of landslides in each variable class was calculated. The information value for each class was then recorded, and an information value map was drawn for each parameter. Finally, the information value maps of the parameters were added algebraically, and the resulting map was classified to produce the landslide susceptibility map.
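The information value of a class is commonly computed as the natural logarithm of the ratio between the landslide density in that class and the landslide density over the whole area. The sketch below assumes that formulation; the pixel counts are invented, and returning NaN for classes without landslides is an illustrative choice, since the logarithm of zero is undefined and the text does not say how such classes were handled.

```python
import numpy as np

def information_value(class_pixels, slide_pixels):
    """Information value per class: ln(class density / overall density)."""
    class_pixels = np.asarray(class_pixels, dtype=float)
    slide_pixels = np.asarray(slide_pixels, dtype=float)
    class_density = slide_pixels / class_pixels
    overall_density = slide_pixels.sum() / class_pixels.sum()
    ratio = class_density / overall_density
    # Classes with no landslide pixels get NaN instead of ln(0).
    values = np.full(ratio.shape, np.nan)
    positive = ratio > 0
    values[positive] = np.log(ratio[positive])
    return values

# Invented counts for one factor with three classes.
class_pixels = [400, 400, 200]   # pixels in each class
slide_pixels = [10, 30, 0]       # landslide pixels in each class

iv = information_value(class_pixels, slide_pixels)
```

A positive value marks a class with an above-average landslide density, a negative value a class with a below-average density, and values near zero carry little information either way.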

After the frequency curve of the summed (collective) weights was obtained, the threshold of each class was set at the low points of the curve. These thresholds were used to classify the map of summed weights and to produce the map of zones susceptible to landslides. Figure 6 shows the flow chart of the research.
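One way to locate class thresholds at the low points of a frequency curve is to look for histogram bins that are lower than both neighbours. The sketch below is an assumed reading of that step, with a hypothetical function name and invented weight values; real frequency curves are noisier and would probably need smoothing first, and flat-bottomed minima are not handled here.

```python
import numpy as np

def thresholds_from_histogram(values, bins):
    """Return bin centres that are strict local minima of the histogram."""
    counts, edges = np.histogram(values, bins=bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    # Interior bins whose count is lower than both neighbouring bins.
    is_minimum = (counts[1:-1] < counts[:-2]) & (counts[1:-1] < counts[2:])
    return centres[1:-1][is_minimum]

# Invented summed weights forming three clusters separated by sparse values.
summed_weights = np.array([0] * 5 + [1] * 1 + [2] * 6 + [3] * 2 + [4] * 7, dtype=float)

thresholds = thresholds_from_histogram(summed_weights, bins=5)

# Assign each value to a susceptibility class using the thresholds.
classes = np.digitize(summed_weights, thresholds)
```

With these invented values the two low points of the curve split the data into three classes; the number of classes obtained in practice depends on how many minima the frequency curve has.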

Fig. 6 Different steps of preparing the map for zones of landslide hazards

Multivariate statistical method

Multivariate statistics is a form of statistics encompassing the simultaneous observation and analysis of more than one statistical variable. The application of multivariate statistics is multivariate analysis. Methods of bivariate statistics, for example, simple linear regression and correlation, are special cases of multivariate statistics in which two variables are involved (Pranab Kumar et al. 1986).

Multivariate statistics concerns understanding the different aims and background of each of the different forms of multivariate analysis and how they relate to each other. The practical implementation of multivariate statistics to a particular problem may involve several types of univariate and multivariate analyses in order to understand the relationships between variables and their relevance to the actual problem being studied (Schervish 1987).

Multivariate statistical analyses of landslides determine the weights of landslide causal factors based on the relative contribution of each to the presence or absence of past landslide events within a defined land unit (Dai et al. 2010; Suezen and Doyuran 2004b; Ayalew and Yamagishi 2005; Nandi and Shakoor 2009).

To date, many researchers have used various multivariate statistical methods for landslide zoning. All of these methods analyze several variables simultaneously, since natural phenomena such as landslides result from the simultaneous action of several variables. Moreover, because information layers (parameter maps) are combined to create geo-units (congruent units) and the amount of information is large, the use of geographic information systems is inevitable. Each congruent unit can be used as evidence in the statistical analysis (Begueria 2006). Multivariate linear regression and discriminant analysis are the two multivariate methods used in this research.
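As an illustration of the multivariate linear regression idea, the sketch below fits an ordinary least-squares model in which each congruent unit is one observation, the factor values are the independent variables, and a landslide measure is the dependent variable. The exact regression specification used in the study is not given in the text, so the variable layout, function names, and the synthetic data are all assumptions.

```python
import numpy as np

def fit_linear_model(factors, response):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    design = np.column_stack([np.ones(len(factors)), factors])
    coefficients, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coefficients

def predict(coefficients, factors):
    """Predicted response for each unit from the fitted coefficients."""
    return coefficients[0] + factors @ coefficients[1:]

# Synthetic congruent units: two factor values per unit (illustrative only).
factors = np.array([[0.0, 0.0],
                    [1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 1.0],
                    [2.0, 1.0]])

# Synthetic response generated from known coefficients so the fit can be checked.
response = 0.1 + 0.5 * factors[:, 0] - 0.2 * factors[:, 1]

coefficients = fit_linear_model(factors, response)
fitted = predict(coefficients, factors)
```

Because the synthetic response is an exact linear function of the factors, the fit recovers the generating coefficients; with real units the residuals would indicate how much of the landslide pattern the selected factors fail to explain.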

The landslides of the study area were sorted according to their modes of occurrence. This helped in understanding that different triggering factors control different types of slope movement. Landslides in the study area are mainly rotational, with a few translational, flow, and complex types. Flow and complex types were eliminated before the analysis stage because very few were observed. Therefore, our resultant map shows susceptibility levels for slides of rotational character.

The seed cell method proposed by Suezen and Doyuran (2004b) was modified and combined with the Infoval (information value) method used by Vijith et al. (2009), using ArcGIS 9.1 software. This model is based on the following steps:

  1. Systematic documentation of the location and characteristics of past landslides in the study area and converting them into seed cells;

  2. Identification of key factors controlling slope stability, systematic mapping of these factors, and transforming them into raster maps;

  3. Calculation of zonal statistics between the seed cells and the factor maps to obtain the number of seed cells per factor class (seed cell or landslide density) which is later used to create weighted factor maps;

  4. Summation of the weighted factor maps and classification of the quantitative value into four susceptibility classes.
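Steps 3 and 4 can be sketched with plain arrays as follows. This is not the ArcGIS workflow used in the study: the function names and the small rasters are invented, the weight is taken to be the raw seed-cell density of each class, and the four classes are formed with equal intervals, which is an assumption because the classification rule is not stated at this point in the text.

```python
import numpy as np

def seed_cell_density(factor_map, seed_mask, n_classes):
    """Zonal statistic: share of seed cells in each class of one factor map."""
    classes = factor_map.ravel()
    seeds = seed_mask.ravel()
    total = np.bincount(classes, minlength=n_classes)
    seeded = np.bincount(classes[seeds], minlength=n_classes)
    return np.divide(seeded, total, out=np.zeros(n_classes), where=total > 0)

def susceptibility(factor_maps, class_counts, seed_mask):
    """Sum the weighted factor maps and split the score into four classes."""
    score = np.zeros(seed_mask.shape)
    for factor_map, n_classes in zip(factor_maps, class_counts):
        weights = seed_cell_density(factor_map, seed_mask, n_classes)
        score += weights[factor_map]   # replace each class code by its weight
    # Three interior equal-interval breaks give four classes (0 = lowest).
    breaks = np.linspace(score.min(), score.max(), 5)[1:-1]
    return score, np.digitize(score, breaks)

# Invented 2 x 4 factor rasters (class codes) and seed-cell mask.
slope = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1]])
lithology = np.array([[0, 1, 0, 1],
                      [0, 1, 0, 1]])
seeds = np.array([[0, 0, 1, 1],
                  [0, 0, 0, 1]], dtype=bool)

score, classes = susceptibility([slope, lithology], [2, 2], seeds)
```

In the toy example the cells that combine the seed-rich slope class with the seed-rich lithology class end up in the highest susceptibility class, while cells in classes with no seed cells fall in the lowest.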

Discussions

The purpose of landslide hazard or susceptibility mapping is to highlight the regional distribution of potentially unstable slopes based on a detailed study of the factors responsible for landsliding. The resulting maps are useful to establish standards and requirements for the use of land on and around slopes that are likely to fail, to assess the risk that a proposed use of land will affect the stability of an area, and to develop and review mitigation options.

Landslide influencing data layers

We selected nine factors: slope, slope aspect, distance to road, distance to drainage network, distance to fault, land use and land cover, geological factors, horizontal gravity acceleration of earthquakes, and the climatic condition of the study area. Each factor was divided into several classes, each containing some landslides. The areal extent of each class and the frequency of landslides within it determined the proportion and effect of that class on landslide occurrence.

Slope plays an undeniable role in the occurrence of landslides. To evaluate slope instability, current slope movements were related to separate slope classes. The 25–45° slope classes, which cover more than 46 % of the region, are favorable for slope movements. Some researchers consider slopes greater than 30° suitable for mass movements; a large part of the region falls in this range.

Comparison of the slope map with the zoning maps and the map of current instabilities also shows a marked slope effect, particularly in the steep Grzhal gorge (middle of the region) and the Zamzyran mountainous region (in the east). Slopes in these zones fall in the moderate to high classes. However, despite the gentle slopes in the southern part of the area (less than 15°), the relative risk there is high to very high. Construction of the main road and erosion by the Zab River, which unload the toe of the slopes, have made these slopes prone to movement (Fig. 7).

Fig. 7 Slope map of Zab basin

Slope aspect plays an accelerating role in landslide events. Aspect often controls the amount of water in slopes and hillsides, so different parts of a hillside are affected unequally. In this research, north-facing slopes had more landslides than the other aspects. Fig. 8 shows that about half of the basin's landslides (11 landslides) occurred in the north classes, while the other classes have lower proportions. In contrast, the south aspect is a non-vulnerable area because, as a consequence of the angle of the sun, it gets wet the least. The value is more negative for the SW aspect than for the south aspect over the entire study area.

Fig. 8 Slope aspect of Zab basin

Lithology and geology are considered an independent variable in landslide formation, because lithology and its varied structure lead to variation in rock stability and strength and to varied soil texture (Ayalew and Yamagishi 2005). The layer was extracted from 1:100,000 maps of the Iranian Geology Organization. It was quantified and then added to the model since it proved to be statistically significant. The study area lacks lithological consistency and uniformity, and breaks in the lithological sequence are clearly visible between different strata, which suggests the role of dynamic tectonic forces. Rock type showed a very strong influence on landslide events. Alluvium is sensitive to landslide occurrence: almost one third of the landslide points in the area fall in the alluvial class. Rock types are closely connected with lineaments, so lineaments play an effective role in landslide occurrence. As lineament density increases, rock fracturing also increases, and abundant fractures affect landslide occurrence. Erodible rock is much more prone to landslides, and because alluvium is more erodible and more deeply weathered, this class contains more landslides than the other classes (Fig. 9).

Fig. 9 Lithology map of Zab basin

Land use shows a strong relationship with landslides. Based on the contrast value, man-made forests are very sensitive to landslide occurrence compared with the other classes. Although dry farm land has the most landslides among the land use classes, its area is larger than that of the other classes, particularly the man-made forest class. Second-class pasture is the least sensitive of the land use classes because of its plant cover. In general, second-class pasture and natural forest areas are more stable against landslides, although in some areas the load of trees puts additional pressure on failing slopes, which accounts for the landslides that occur there (Fig. 10).

Fig. 10 Land use map of Zab basin

Rainfall is an important factor in landslide-prone areas. Precipitation, in particular sudden intense rain, and snowmelt are controlling factors that trigger mass movements by supplying water, thereby raising the groundwater level and the pore water pressure. When the soil is loaded in this way, the water within it cannot drain quickly and exerts an upward pressure that counteracts the load. When the pore pressure becomes equivalent to the overlying pressure resulting from loading, the shear resistance of the slope decreases, leading to failure of the mass involved. Since the study area is second only to the northern regions of Iran in precipitation rate, the 700–800 mm class contains many landslide points, although this class covers a smaller area than the classes with higher precipitation. The regional precipitation distribution pattern was mapped and incorporated into the model (Fig. 11).
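The loss of shear resistance with rising pore pressure is usually written with the Mohr-Coulomb criterion in terms of effective stress. The relation below is standard soil mechanics rather than something stated in this study, and the symbols are introduced here only for illustration: \tau_f is the shear strength on the potential failure surface, c' the effective cohesion, \sigma_n the total normal stress from the overlying load, u the pore water pressure, and \varphi' the effective friction angle.

```latex
\tau_f = c' + (\sigma_n - u)\,\tan\varphi'
```

As u rises toward \sigma_n, the frictional term shrinks toward zero and only the cohesion c' remains to resist sliding, which is why rapid infiltration that cannot drain away can bring a slope to failure.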

Fig. 11 Precipitation map of study area

Distance to road is the most crucial factor in generating slope instability. Most landslides occur near roads (0–20 m), but because this class covers a large area and contains many more pixels than the other classes, its contrast value is lower than that of another class (50–100 m). Nevertheless, the spatial relationship between landslide occurrence and road construction is very close, especially in the northwest of the basin (Fig. 12).

Fig. 12 Distance to road map of Zab basin

Distance to fault is another factor in generating slope instability. It has generally been observed that the probability of landslide occurrence increases at sites close to lineaments, which not only affect the surface material structures but also contribute to terrain permeability, causing slope instability; about half of the landslides occurred in the 0–1000 m class. In addition, the density per square kilometer of drainage networks, roads, and fault lines was used to demonstrate the importance of these features in the whole study area (Fig. 13).

Fig. 13 Distance to fault map of Zab basin

The drainage network is very important because the river undercuts the slopes and supplies moisture to them. A total of 47 landslides occurred in the 0–200 m proximity class, which has a positive weight and contains more landslides than the other classes. Farther from the river, landslides occurred at a much lower rate, and the contrast values clearly differentiate positive and negative weights (Fig. 14).

Fig. 14 Distance to waterway map of Zab basin

Horizontal gravity acceleration of earthquakes is another important factor in generating slope instability and landslides in this study area. To define it, we used the historical and systematically recorded earthquake data for the period 1910–2009. The events with the highest magnitude in each 5-year period were extracted with the Earthquake software, and the coordinates of the earthquake centers within a 150-km radius of the center of the study region were then determined. Earthquakes affected the stability of slopes in the study area in two ways. First, seismic shaking caused loss of cohesion and/or reduction of the frictional strength of the substrate through shattering of the rock mass or liquefaction of soft, saturated soils; this effect appears to be especially severe on slopes located on the hanging wall. Second, the stability of hill slopes is controlled by their topographic gradient, the mechanical properties of the substrate, pore fluid pressures, and gravitational and transient accelerations.

Analysis and results

Mitigation of landslide-related hazards can only be successfully undertaken when detailed information about the frequency, magnitude, and character of slope failures within a particular area is known (Vijith et al. 2009; Liu et al. 2006). For these reasons, the identification of landslide-prone areas through susceptibility assessment represents a cheap and fast method for understanding this hazard and for ensuring that appropriate mitigation strategies are implemented (Bai et al. 2009). Landslide susceptibility assessment involves predicting where a potentially damaging landslide may occur, without any reference to the time or the intensity of associated damage (Sorriso Valvo 2002; van Westen et al. 2005). It expresses the spatial correlation between predisposing terrain factors and the distribution of landslide scars.

In this research, most of the registered slope failures were slides, referred to here as landslides, and the zoning was carried out with reference to these existing landslides. Besides landslides, another type of mass movement, the extensive rockfalls of the studied region, was also studied and mapped; however, it was not used in the zoning exercise of this research.

Figures 15 and 16 show the landslide susceptibility zoning maps prepared by the information value and density-area methods, respectively.

Fig. 15

Landslide susceptibility map prepared by the information value method in Zab basin

Fig. 16

Landslide susceptibility map prepared by the density-area method in Zab basin

In this research, most of the landslides were triggered by the earthquakes that occurred in the study area in 2004 and 2009. Intense rainfall can be excluded as an immediate cause of the observed clustering of earthquake-triggered landslides, because it did not rain heavily during the earthquakes in question. However, longer-lived patterns of groundwater flow give rise to spatial gradients in the susceptibility to slope failure in the study area. In periodic ridge-and-valley topography, gravity-driven groundwater flow is directed vertically downward through ridges. Geology may be assumed constant, but other conditions change with rainfall and seismicity.

Processing the data for multivariate analysis

The variables used in landslide susceptibility analysis are either quantitative (slope, rainfall, maximum gravity acceleration, and distances to waterway, fault, and road) or qualitative (slope aspect, plant cover, and lithology).

The qualitative data must be converted into quantitative data, for which we used the dummy variable method (van Westen et al. 2002). Using ILWIS software, we crossed the nine parameter maps and obtained 341 congruent geo-units. The map of congruent units was then crossed with the landslide inventory map; units with sliding were given code 1, and units without sliding were given code 0. Likewise, the presence of each parameter level in a congruent unit was indicated by code 1 and its absence by code 0 (Ercanoglu et al. 2004). In total, the matrix of variables had 37 columns (36 parameter levels as independent variables and the sliding column as the dependent variable) and 341 rows (congruent units). These dummy variables were later used as expanded independent variables in the two statistical analysis methods. In Tables 1 and 2, the letters in the variables column indicate the category: A, slope direction; D, distance to waterway in meters; F, distance to fault in meters; G, horizontal gravity acceleration (in meters per square second); L, formation type; B, distance to road; R, annual rainfall in millimeters; S, slope in percent; and V, land cover and land use. The class codes are: \( Q_2^t \), Young terraces; 1, Sandstone, Quartzite; m, Dolomite, Limestone; an, Andesite; K1, Limestone; Kml, Marble; Kph, Phyllite; gn, Gneiss; ∈z, Shale; DFL, dry farm land; FCP, first-class pasture; MMF, man-made forest; SCP, second-class pasture.
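The dummy coding described above can be illustrated with a minimal Python sketch. The function name `dummy_encode`, the parameter names, and the two example units are hypothetical and chosen only for illustration; the actual matrix was built from the ILWIS map crossing, not from this code.

```python
def dummy_encode(units, levels):
    """Expand class labels into 0/1 indicator columns, one per parameter level."""
    columns = [f"{param}={level}"
               for param, param_levels in levels.items()
               for level in param_levels]
    rows = []
    for unit in units:
        # Code 1 marks the level present in this congruent unit, code 0 its absence.
        rows.append([1 if unit[param] == level else 0
                     for param, param_levels in levels.items()
                     for level in param_levels])
    return columns, rows


# Hypothetical levels and units (two parameters only, for brevity).
levels = {
    "lithology": ["Kph", "gn", "an"],
    "land_cover": ["DFL", "FCP"],
}
units = [
    {"lithology": "Kph", "land_cover": "DFL", "slide": 1},
    {"lithology": "gn", "land_cover": "FCP", "slide": 0},
]

columns, rows = dummy_encode(units, levels)
slide = [unit["slide"] for unit in units]  # dependent variable column
```

With all nine parameters and their 36 levels, the same expansion would give the 36 indicator columns plus the sliding column for each of the 341 congruent units.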

Table 1 Linear regression model with stepwise expanded data method
Table 2 Discriminant function model with discriminant analysis of discrete data

Linear regression with stepwise expanded data method

In this method, using SPSS software with the stepwise method and the 341 congruent units (repetitions) that resulted from combining the nine parameters, we determined the best equation and regressors (parameters). The best model (see Table 1) employs only 17 of the 36 variables. The correlation coefficient (R) of the equation is 54 %, and the validity level of each variable is 96 %.

Table 1 shows the coefficient of each variable together with its validity level. Positive coefficients in the regression equation indicate increased instability, and negative ones indicate reduced instability. The regression equation was determined from these coefficients using the standard multivariate regression form \( Y = B_0 + B_1 x_1 + B_2 x_2 + \ldots + B_n x_n \), where Y is sliding as the dependent variable, \( B_0 \) to \( B_n \) are the coefficients given in Table 1, and \( x_1 \) to \( x_n \) are the 17 independent variables (classes of each parameter). The primary zoning map for the whole region was then drawn (Fig. 17), and, according to the frequency levels of 25 %, 50 %, and 75 %, four classes of low, medium, high, and very high susceptibility to landslides were determined (Ohlmacher and Davis 2003; Song et al. 2008; Yesilnacar and Topal 2005).
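As a rough illustration of this step, the sketch below scores each congruent unit with the regression equation and then assigns one of the four classes. It assumes that the 25 %, 50 %, and 75 % frequency levels correspond to the quartiles of the score distribution; the coefficients and dummy rows are hypothetical and are not the values of Table 1.

```python
from statistics import quantiles

LABELS = ("low", "medium", "high", "very high")


def regression_score(intercept, coefficients, dummies):
    """Y = B0 + B1*x1 + ... + Bn*xn for one congruent unit."""
    return intercept + sum(b * x for b, x in zip(coefficients, dummies))


def classify_by_quartiles(scores):
    """Label each score by the quartile of the score distribution it falls in."""
    cuts = quantiles(scores, n=4)  # 25 %, 50 %, and 75 % cut points
    # The number of cut points exceeded (0 to 3) selects the class label.
    return [LABELS[sum(score > cut for cut in cuts)] for score in scores]


# Hypothetical intercept, coefficients, and dummy rows (assumptions).
b0, b = 0.10, [0.40, -0.20, 0.30]
units = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 1],
         [0, 1, 1], [1, 1, 0], [1, 1, 1], [0, 0, 0]]

scores = [regression_score(b0, b, unit) for unit in units]
classes = classify_by_quartiles(scores)
```

A positive coefficient raises the score of every unit in which that class is present, which is why positive coefficients are read as increasing instability.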

Fig. 17

Landslide susceptibility map prepared by linear regression with the stepwise expanded data method

Discriminant analysis of discrete data

In this method, a linear equation is sought that best differentiates between stable (code 0) and unstable (code 1) geo-units, that is, zones with landslide hazard. To this end, the discriminant equation or function was calculated (Table 2) by applying SPSS software and the stepwise method to the 341 × 37 matrix, which includes 37 dummy variables.

The mean values of the discriminant function for the unstable group (code 1) and the stable group (code 0) are 0.643 and 0.148, respectively. Moreover, the chi-square test for the discriminant function (chi-square = 556), with a validity level of 0.96, is highly significant. Using the 341 congruent units, the 37 dummy variables, and the discriminant function equation obtained from the discrete data, we prepared the landslide susceptibility map (Fig. 18).
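To make the role of the two group means concrete, the following sketch assigns a unit to the group whose mean score is closer. The midpoint cutoff is an assumption (it corresponds to treating the two groups as equally likely) and may differ from the rule SPSS applied; the function names and the example scores are hypothetical.

```python
UNSTABLE_MEAN = 0.643  # mean discriminant score of code-1 units (from the text)
STABLE_MEAN = 0.148    # mean discriminant score of code-0 units (from the text)
CUTOFF = (UNSTABLE_MEAN + STABLE_MEAN) / 2  # assumed midpoint cutoff


def classify_unit(score, cutoff=CUTOFF):
    """Return 1 (unstable) if the score lies above the cutoff, else 0 (stable)."""
    return 1 if score > cutoff else 0


def hit_rate(scores, observed):
    """Share of units whose predicted group matches the observed code."""
    predicted = [classify_unit(score) for score in scores]
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)


# Hypothetical discriminant scores and observed codes for four units.
example_scores = [0.60, 0.20, 0.50, 0.10]
example_observed = [1, 0, 0, 0]
rate = hit_rate(example_scores, example_observed)
```

The hit rate computed this way is the kind of figure usually reported as the percentage of correctly classified units.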

Fig. 18

Landslide hazard zoning map prepared by discriminant analysis with the stepwise expanded data method

The landslide susceptibility map was classified using the frequency levels of 25 %, 50 %, and 75 % (low, medium, high, and very high). Moreover, 58 % of the congruent units were classified correctly. According to Tables 1 and 2, only 17 of the 36 variable classes are significant, with validity levels of 96 % and 93 %; the rest show no significant difference. The correlation of each parameter, that is, its role in susceptibility mapping (based on the correlation among parameters in the stepwise linear regression method), is shown in Table 3.

Table 3 Correlation coefficient of the landslide parameters based on linear regression

Evaluation and comparison of methods for landslide susceptibility mapping with documentation on the collected data

We used the landslide inventory map to evaluate and compare the susceptibility mapping methods. To do so, we crossed the inventory map with each susceptibility map in the geographic information system and used the landslide index to evaluate the hazard classes on the zoning maps. The landslide index is defined as follows (Eq. 1; van Westen et al. 2002):

$$ Li = \frac{Si / Ai}{\sum\nolimits_1^n \left( Si / Ai \right)} \times 100 $$
(1)

In the above relation, the landslide index is the ratio of the sliding area in a zone to the area of that zone, divided by the sum of these ratios over all zones and expressed as a percentage. To compare the obtained zoning maps, we considered the precision of the predicted results (P) (Ayalew et al. 2005), which can be estimated by the following relation (Eq. 2).

$$ P = \frac{Ks}{S} $$
(2)

In relation 1, Li is the index of landslide occurrence in each hazard zone (percent), Si is the slide area in each hazard zone, Ai is the area of each zone, and n is the number of hazard classes. In relation 2, Ks is the slide area in the zones of medium hazard level and above, and S is the total landslide area in the region.
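Relations 1 and 2 can be checked with a short Python sketch. The function names and the areas below are hypothetical, and the hazard classes are assumed to be ordered from low to very high so that Ks sums the slide area from the medium class upward.

```python
def landslide_index(slide_areas, zone_areas):
    """Li (percent): slide density of each zone over the sum of all zone densities."""
    densities = [s / a for s, a in zip(slide_areas, zone_areas)]
    total = sum(densities)
    return [100 * density / total for density in densities]


def precision(slide_areas, first_class=1):
    """P = Ks / S, with Ks summed from class index `first_class` (medium) upward."""
    return sum(slide_areas[first_class:]) / sum(slide_areas)


# Hypothetical areas in km^2 for the low, medium, high, and very high zones.
slide_areas = [1, 3, 6, 10]
zone_areas = [100, 100, 100, 100]

li = landslide_index(slide_areas, zone_areas)  # approximately 5, 15, 30, 50
p = precision(slide_areas)                     # 19 / 20
```

In this example Li rises from the low to the very high class, which is the pattern read from a zoning map that separates the hazard levels well.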

As Table 4 shows, Li increases from the low to the very high level in all of the methods used. Therefore, all of the methods used for zoning the hazard levels yielded acceptable results. However, the classification is more suitable for the bivariate information value and density-area methods (in these methods, Li for the low and medium hazard levels is smaller). In Table 5, the values of P indicate the high efficiency of the bivariate information value and density-area methods in susceptibility mapping in the studied region.

Table 4 Comparison of the information obtained from crossing each of the zoning maps with the map of landslides distribution
Table 5 Precision of various methods of landslide susceptibility mapping

The precision (P) of the statistical methods in the medium to very high zones was 96 % for information value, 92 % for density-area, 88 % for linear regression, and 74 % for discriminant analysis, all of which are compatible with the conditions for occurrence of landslides in the region (Table 5).

In addition, according to Table 4, the bivariate methods (information value and density-area), apart from the assumed independence of information layers, conform better to the conditions for occurrence of landslides in all zones. Given the limitations of bivariate methods, especially the assumed independence of information layers despite the correlation among natural parameters (variables) in the landslide phenomenon, linear regression and discriminant analysis, with precisions (P) of 88 % and 74 %, respectively, show more suitable results than the bivariate methods in recognizing regions susceptible to landslides (high risk and very high risk).

Validation of susceptibility maps

The landslide susceptibility maps generated in this study were validated with the relative landslide density method (R-index), which assesses the relationship between a landslide susceptibility map and landslide inventory points (sample datasets). The sample data were collected by field work and GPS. Twenty-nine landslides, a consequence of heavy rainfall, were detected in field observations. Landslides of various sizes occurred throughout the region. The information value, density-area, linear regression, and discriminant analysis methods were evaluated. The linear regression and discriminant analysis methods were more consistent with the recurrent landslides that occurred in the sensitivity classes. Although the high hazard class in the discriminant analysis map is less consistent, the other classes match the distribution of landslide occurrence. Validation of the susceptibility maps was performed with the following formula (Eq. 3):

$$ R = \frac{ni / Ni}{\sum \left( ni / Ni \right)} \times 100 $$
(3)

Where ni is the number of landslides that occurred in sensitivity class i and Ni is the number of pixels in the same sensitivity class i. In the very-high hazard class, the linear regression and discriminant analysis methods work better than the information value and density-area methods. The R-index values of the sample datasets for the very-high hazard class in the information value, density-area, linear regression, and discriminant analysis maps are 148 %, 149 %, 171 %, and 175 %, respectively (Table 6). When all classes are assessed together, however, it is unlikely that discriminant analysis is more exact than the other three methods (Fig. 19).
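The R-index of Eq. 3 has the same normalized-density form as the landslide index, but it works on landslide counts and pixel counts rather than areas. A minimal sketch follows; the function name and all counts are hypothetical and do not reproduce Table 6.

```python
def r_index(landslide_counts, pixel_counts):
    """R (percent): landslides per pixel in each class over the sum across classes."""
    ratios = [n / pixels for n, pixels in zip(landslide_counts, pixel_counts)]
    total = sum(ratios)
    return [100 * ratio / total for ratio in ratios]


# Hypothetical counts for the low, medium, high, and very high classes.
landslide_counts = [1, 2, 3, 4]
pixel_counts = [1000, 1000, 500, 400]

r = r_index(landslide_counts, pixel_counts)
```

A map that concentrates the validation landslides in a small very-high class yields a large R for that class, which is the basis on which the methods are compared here.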

Table 6 Validation (R-index) of information value, density-area, linear regression, and discriminant analysis methods
Fig. 19

R-index validation of four methods for susceptibility mapping in the study area

Conclusion

This research applied and compared statistical methods for susceptibility mapping in Zab basin. We obtained results on landslide susceptibility in the basin and on the role of environmental variables, with regard to the significance, precision, and value of the methods applied.

The results show that landslide hazard in the studied region is influenced by the geographical setting and other environmental variables, such as the presence of sensitive phyllite formations and effective slopes (30 %–45 %); the upper part of the basin is highly susceptible to sliding, which increases the sediment load of the Zab River. According to the studies and evaluations, the variables that most influenced landslide occurrence in the studied region are, in order, lithology, slope, annual rainfall, land use, slope aspect, distance to waterway, distance to road, horizontal gravity acceleration, and distance to fault.

The precision for the medium class and above is 96 % for the bivariate information value method and 92 % for the bivariate density-area method. In the multivariate linear regression method, we used 17 parameters with validity levels of 96 % and 91 % in the model. The coefficient R of the regression equation was 65 %, which is quite acceptable. In the multivariate discriminant analysis method, the discriminant function was formulated using 17 parameters. According to the study and the tables presented above, the bivariate methods (information value and density-area), apart from the assumed independence of information layers, are more compatible with the conditions for occurrence of landslides in all zones.

Despite the limitations of bivariate methods, especially the hypothesis of independence of information layers and the correlation of natural parameters (variables) in the landslide phenomenon, the linear regression and discriminant analysis methods achieved precisions of 88 % and 74 %. Thus, both show better results than bivariate methods.

In conclusion, all of these methods produced acceptable results in the classification of hazard classes. The statistical methods of information value, density-area, linear regression, and discriminant analysis are all useful for susceptibility mapping of Zab basin.