Abstract
Land subsidence is a complex hazard that artificial intelligence models can represent without approximation and simplification. In this study, for the first time in land subsidence research, we applied hybrid fuzzy-gene expression programming (F-GEP) and fuzzy-artificial neural network (F-ANN) models to land subsidence susceptibility modeling in the Varamin aquifer, Iran, and compared their accuracy and efficiency. Fifteen geo-environmental and hydrogeological factors were selected from a literature review and compiled in a GIS environment: specific yield, erosion, aquifer thickness, distance from faults, bedrock level, digital elevation model (DEM), annual rainfall, clay thickness, transmissivity (T), soil type, discharge (Debi) zonation of pumping wells, DEM-derived slope, groundwater drawdown over 20 years, land use, and lithological units. The layers were first standardized with GIS fuzzy membership functions and then integrated with the GEP model, using 70% of the data (2919 pixels) for training and 30% (1251 pixels) for testing. The same procedure was repeated with the ANN model. Finally, the models were validated using several statistical criteria and radar image data. The results showed that the F-GEP model (R2 = 0.99, RMSE = 0.004) is more accurate than the F-ANN model (R2 = 0.94, RMSE = 0.056) for land subsidence susceptibility modeling in the study area.
1 Introduction
Land subsidence is a hazardous geological phenomenon that has occurred in recent years in many urban areas worldwide (Chen et al. 2019). UNESCO defines land subsidence as “settlement or gradual downward settling of the ground’s surface, which may have a slight horizontal displacement vector” (UNESCO 2018). Damage to the natural environment and economic losses are among the effects of this geological hazard (Hu et al. 2004; Waltham 1989). As one consequence of water resources mismanagement and overuse, land subsidence occurs when declining groundwater levels lead to compression of the soil (Pacheco et al. 2006). Subsidence caused by excessive groundwater pumping has been widely reported in many areas, such as Rafsanjan (Mousavi et al. 2001), Shanghai (Hu et al. 2004), Mashhad (Motagh et al. 2007), Mexico City (Calderhead et al. 2011), Tianjin (Lixin et al. 2011), California (Galloway and Burbey 2011), Su-Xi-Chang (Chen et al. 2013), Arak (Rajabi and Ghorbani 2016), Antelope Valley, Kerman (Abdollahi et al. 2019), Bangkok, Kashmar (Lashkaripour et al. 2006; Rahmati et al. 2019), and Tehran (Dehghani et al. 2013; Mahmoudpour et al. 2013; Ranjbar and Ehteshami 2019). In these areas, severe damage has occurred, including fractures in underground lines and transport routes, increased flood risk, cracking of buildings, and loss of ground level (Mohebbi Tafreshi et al. 2019).
In recent decades, the increasing damage caused by land subsidence has prompted numerous studies worldwide that have attempted to map land subsidence susceptibility and to identify the factors that affect it (Abdollahi et al. 2019; Wang et al. 2019).
Several studies have appraised the factors affecting land subsidence risk. For example, Burbey (2002) assessed the effects of faults on land subsidence in Nevada’s submarine sedimentary basins in the United States. The study showed that joints adjacent to a fault that act as a barrier to flow tend to produce horizontal deformation; conversely, where they do not act as a barrier, vertical deformation occurs.
To evaluate the factors affecting land subsidence, Oh and Lee (2010) used seven main factors: land use, groundwater depth, fault distance, geology, fault depth, the gradient obtained from topographic maps, and the capability of landing from crater data.
Putra et al. (2011) appraised the land subsidence risk in Rongkop, Indonesia. Their risk map was developed from five parameters: land use structures, distance to valley-like (cratering) features, slope, lithology, and elevation.
Park et al. (2012) utilized five main factors affecting land subsidence in Samcheok City, Korea: slope, geology, distance from faults, land use, and fault depth.
Shadfar et al. (2016) concluded that “excessive groundwater pumping” is the primary factor, and “lithology” the secondary factor, in creating land subsidence in the Buin Zahra area.
Rezaee (2016) investigated the land subsidence risk in the Kermanshah Plain. The results show that the risk is higher than elsewhere to the south and east of Deh-e-Platan village, east of Kermanshah, where the groundwater level is low and the aquifer consists of fine-grained sediments.
Behyari et al. (2017) studied the effect of tectonics on land subsidence occurrence in Marzan Abad, Iran. Their results showed that geological factors such as fault fractures and joints have weakened the soil structure and made the region unstable; they have also conveyed water to the subsurface calcareous units and created dissolution cavities, one form of subsidence.
Minderhoud et al. (2018) assessed the interaction between land subsidence and land use in the Mekong delta, Vietnam. Their results showed that land use can intensify natural subsidence, anthropogenic subsidence, or the land subsidence process as a whole. Different rates of land subsidence occurred in different land use classes, and the highest rates occurred in classes where natural conditions have been altered by human activities.
Moreover, recent studies have investigated land subsidence susceptibility using hydrogeological, climatic, geophysical, and geological data, together with methods such as statistics, genetic algorithms (GA) (Manafiazar et al. 2019; Taravatrooy et al. 2018), fuzzy algebra (Bianchini et al. 2019; Chanapathi et al. 2019; Ghorbanzadeh et al. 2018; Rafie and Samimi Namin 2015; Yu et al. 2018), artificial neural networks (ANN) (Abdollahi et al. 2019; Dehghani et al. 2013; Oh et al. 2019; Tien Bui et al. 2018; Wang et al. 2018), and random forest (RF) models in geographic information system (GIS) applications (Ilia et al. 2018; Mohammady et al. 2019; Pourghasemi and Mohseni Saravi 2019).
In recent decades, several meta-modeling techniques have emerged as promising methods for modeling high-dimensional and nonlinear processes. ANN (Tongal and Booij 2017; Zaman Zad Ghavidel and Montaseri 2014), GEP (Aziz et al. 2017; Kisi et al. 2019), fuzzy logic (Jahangoshai Rezaee et al. 2020; Wang and Chen 2015), and statistical methodologies (Barbulescu and Popescu-Bodorin 2019; Elhatip et al. 2008; Leduc and Ouldali 1990) are prominent examples. The highly accurate results of the GEP and ANN models in numerous studies led us to evaluate and compare these two models for land subsidence.
Since hybrid models (Barzegar et al. 2016; Elalfy et al. 2018; Jamshidi et al. 2019; Moeeni and Bonakdari 2017; Wang and Hu 2019), especially those combined with fuzzy models (Abass et al. 2011; Moghassem and Fallahpour 2013; Wang et al. 2010), show higher efficiency and accuracy than non-hybrid models, the hybrid forms of both the GEP and ANN models were used in this study. The advantage of such hybrid techniques is that they can deal with cases that are difficult for either component alone as a universal approximator, and in particular that they can potentially find simpler solutions than either alone, viz. a more parsimonious model.
According to the literature, no or few studies have jointly evaluated the effects of erosion, faults, rainfall, land use, clay thickness, discharge (Debi) of pumping wells, soil type, and hydrodynamic properties of the aquifer, such as T and Sy, on land subsidence susceptibility and its distribution. Simultaneously investigating the parameters identified as the main causes of land subsidence in studies from different regions of the world helps to identify and manage the most important and effective factors of subsidence in the study area. Managers can use this knowledge to adopt appropriate measures that reduce the negative effects of subsidence. Although the number of parameters affecting a phenomenon does not have a direct impact on the accuracy of the models, using more parameters widens the range of factors evaluated and yields a more comprehensive susceptibility assessment procedure. In addition, there is a significant gap in answering the question “How does the number of factors affecting a phenomenon influence the accuracy of models?”, and no study has addressed it directly. One aim of this paper, and a reason for using the maximum number of parameters affecting subsidence, is therefore to investigate and clarify the relationship between the number of parameters and the accuracy of the model.
Although hybrid ANN and GEP models, especially in combination with fuzzy logic, have been used in various studies, so far there have been no (or few) studies worldwide on the use of such models in assessing land subsidence susceptibility. Consequently, the main objective of this study is to compare the hybrid F-GEP and hybrid F-ANN models for land subsidence susceptibility modeling in the Varamin aquifer. Based on the land subsidence susceptibility map, the findings of this research can provide a scientific basis for sustainable development and for reducing human and ecological risk from land subsidence damage.
2 Study area
The Varamin aquifer, in the southeast of Tehran province, Iran (bounded by eastings of 540,000–580,000 m and northings of 3,888,000–3,930,000 m in zone 39 N of the UTM coordinate system), is part of the Varamin sub-basin (Fig. 1a). Important communication routes, such as the East–West transit road and the Mashhad-Tehran railroad (Fig. 1b), are located in this area (Mohebbi Tafreshi et al. 2019). Moreover, part of the national electricity transmission network passes through this area (Fig. 1c). The area affected by land subsidence in Varamin lies 2 km from the Mashhad-Tehran railroad and 5 km from the electricity transmission routes. Meanwhile, 670,000 people live in the area affected by land subsidence, which contains 4 villages and population centers.
Structurally, the Varamin aquifer is located in the Central zone of Iran (Berberian and King 1981). The Pishva hill, an anticline (Sadeghi et al. 2006), divides the aquifer into two parts, the mountains and the plain. Structural processes, especially the folding of Tertiary deposits, may have formed the mountains. Geological outcrops in this area (Fig. 2) include a variety of formations, mostly marl, sandstone, shale, and conglomerate of Eocene to Quaternary age (Sadeghi et al. 2006). Pliocene and Quaternary deposits are observable northeast and south of the Varamin-Eyvanekey road, in the northeastern and northern parts of Sharif Abad, and south of the villages of Shah Qazi and Yousef Abad; judging by the adjacent maps and their consistency, most of these deposits are composed of conglomerate equivalent to the Hezardareh Formation (Sadeghi et al. 2006). The formations along the northeastern and northern boundary of the area are mostly marl, Eocene volcanics, and Oligo-Miocene limestone, as well as silt and shale with Miocene evaporitic sediments (Sadeghi et al. 2006).
As observed in Fig. 3a, Sy ranges from 13 to 16% in the north of the Varamin aquifer (at the beginning of the cone, where the alluvium consists of coarse-grained sediments). It is about 10% around the city of Varamin in the middle of the plain and about 2–5% in the southern part of the plain (TRWA 2018).
A considerable part of the central and southern study area falls in the high sensitivity classes of susceptibility to erosion (Fig. 3b) (Alimohammadi 2009). Moderately susceptible and hard, erosion-resistant formations are found in most of the northern Varamin sub-basin and in a few separate patches in the northern and southern parts of the area (Alimohammadi 2009).
The Varamin aquifer is unconfined (Nakhaei et al. 2019). Its thickness is greatest, up to 280 m, in the center of the northern half (Fig. 3c) and lowest, less than 50 m, in the southwestern part (Shemshaki et al. 2006).
The tectonic movements of this region are affected by the Parchin, Kahrizak, Pishva, and Eyvanekey faults (Figs. 2, 3d). The Kahrizak and Eyvanekey faults are thrust faults that dip to the north, and the Eyvanekey fault has a northwest-southeast trend (IIEES 2010). Similarly, the Pishva fault, which dips to the northeast, is a thrust fault that forms the boundary between the mountains and the plain in Pishva city by cutting the Quaternary sediments (IIEES 2010).
The average altitude of this area (Fig. 3f) is 950 m above sea level (Mohebbi Tafreshi et al. 2019). The highest elevation is 1148 m in the northern part, and the lowest is 810 m in the south and southeast of the aquifer (Nejatijahromi et al. 2019). The topographic slope runs from the northeast to the southeast of the aquifer (Fig. 3l). The annual average rainfall of the study area (Fig. 3g) is 187.4 mm, and the annual average temperature is 16.4 °C (Nejatijahromi et al. 2019). The climate of the Varamin aquifer is often influenced by fronts arriving from the north, west, and northwest, namely Siberian and Mediterranean fronts (Mokhtari and Espahbod 2009).
The pattern of transmissivity differs between the southern and northern halves of the aquifer (Fig. 3i) and is heterogeneous (Atarzadeh et al. 2014). The maximum transmissivity estimated in the north of the aquifer reaches 3000 m2/day (Mokhtari and Espahbod 2009). It decreases toward the southern half because of a considerable change in sediment grain size or aquifer thickness, and in the eastern and southern parts of the aquifer the maximum is 150 m2/day (TRWA 2018).
Forage maize, barley, pistachio, grapes, vegetables, and alfalfa are the main crops of the Varamin aquifer (Nejatijahromi et al. 2019).
3 Input data
As shown in Table 1, 18 input layers were evaluated and prepared for use in the GIS environment. In addition, radar imagery up to 2015, as an indicator of the land subsidence rate, was used to compare and verify the results.
3.1 Land subsidence effective factors
In the present study, 15 effective factors were used for land subsidence susceptibility modeling, based on the literature review (Ayalew et al. 2005; Behyari et al. 2017; Karsli et al. 2009; Minderhoud et al. 2018; Wang et al. 2009): annual rainfall, soil type, T, discharge (Debi) zonation of pumping wells, aquifer thickness, clay thickness, DEM, Sy, groundwater drawdown over 20 years, bedrock level, lithological units, erosion, DEM-derived slope, land use, and distance from faults. Some of them are described as follows:
Slope: Slope is one of the most effective factors; it strongly affects the development and expansion of diaclases (joints) in lithostratigraphic units and can control land subsidence (Arca et al. 2018; Dai and Lee 2001; Suh et al. 2013). In areas with a gentle slope, runoff is slower, so surface water has adequate time to infiltrate to depth and form dissolution cavities, especially in calcareous units. Therefore, through the loss of calcareous material (as in karst), slope is a contributing and causative factor in karstic subsidence (Behyari et al. 2017).
Land use: Among land use classes, urban areas, rangelands, and agriculture (because of groundwater abstraction to irrigate crops) consume the most water (Taheri et al. 2018). Since increased water consumption can lower groundwater levels and increase the likelihood of subsidence, land uses that consume more water are more important in assessing subsidence (Minderhoud et al. 2018).
Distance from faults: Because fault activity (such as earthquakes) affects the possibility of land subsidence, a greater distance from faults indicates a lower likelihood of land subsidence; conversely, the likelihood is higher at shorter distances (Aalipour Erdi et al. 2017; Arca et al. 2018; Chen et al. 2016; Hu et al. 2019; Pradhan et al. 2014).
Bedrock depth: Where the bedrock is shallow, the alluvium is thin and wells cannot be drilled. Groundwater is stored in areas where the alluvium is thicker. In such areas, excessive well drilling and the resulting excessive pumping usually lead to increased subsidence and vertical displacement of layers (WRI 2014).
Drawdown: In regions covered by semi-consolidated or unconsolidated alluvial sediments, excessive groundwater pumping can lead to land subsidence (Poland 1984). In the USA, more than 80% of the identified land subsidence has resulted from mismanaged exploitation and overuse of groundwater (USGS 2019b). Excessive groundwater pumping lowers the groundwater level and consequently increases the occurrence of land subsidence (USGS 2019a).
Lithology: Formations and lithologies that contain fine-grained materials such as silt and clay enhance the subsidence rate. In addition, because water dissolves soluble materials such as carbonates and gypsum, lithological units containing these materials also erode and enhance subsidence by forming dissolution sinks.
Soil type: When the aquifer contains unconsolidated fine-grained sediment layers (such as silt and clay), a drop in hydraulic head increases the effective stress, and consolidation occurs (Terzaghi 1925). The effect becomes manifest as subsidence at the land surface (Nameghi et al. 2013).
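The consolidation mechanism can be made explicit with Terzaghi's effective stress principle. The relation below is a sketch in standard soil-mechanics notation; the symbols σ (total vertical stress), u (pore water pressure), γ_w (unit weight of water), and Δh (change in hydraulic head) are introduced here for illustration, and the total stress is assumed to remain constant while the head drops.

```latex
\sigma' = \sigma - u, \qquad
\Delta\sigma' = -\Delta u = -\gamma_w \, \Delta h \quad (\Delta\sigma = 0)
```

Under this assumption, a head decline (Δh < 0) produces an equal increase in effective stress, which is what compresses the fine-grained layers.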
Rainfall: Higher rainfall enhances water infiltration and can therefore raise the groundwater table. Consequently, increased rainfall does not intensify subsidence; rather, it can be considered a preventive or mitigating factor because it raises the groundwater table.
T: Accurate data on hydraulic properties such as transmissivity are important for reliable land subsidence predictions (Li and Zhang 2018). A lower T increases soil compressibility and subsequently enhances the land subsidence rate.
Aquifer thickness and aquifer hydraulic parameters: These parameters have a positive correlation and direct relationship with subsidence occurrence. According to the Lohman (1961) equation, land subsidence depends on the storage coefficient and its parameters, as follows:
In this equation, Δb is the rate of land subsidence, Δp is the reduction in pressure head on the aquifer, γ is the water density, n is the porosity, b is the aquifer thickness (or saturated thickness), β is the water compressibility [the inverse of Young’s modulus for water \(\left( {\beta = \frac{1}{{E_{w} }}} \right)\)], and S is the storage coefficient of a confined aquifer, which is calculated from the De Wiest (1966) equation as follows:
In this equation, α is the compressibility of the aquifer material [the inverse of Young’s modulus for the solid grain material of the aquifer \(\left( {\alpha = \frac{1}{{E_{s} }}} \right)\)].
4 Methods
4.1 Factors standardization
ArcGIS version 10 provides various fuzzy membership functions for normalizing parameters in its fuzzy logic extension, which is widely used in fuzzy logic applications (Mohebbi Tafreshi et al. 2019; Raines et al. 2010). These functions are applied using a spread factor and a midpoint. The choice of membership function for fuzzy normalization depends on the importance and nature of each criterion and its relationship with the goal (Mohebbi Tafreshi et al. 2019). In this research, three fuzzy membership functions were used to normalize the factors, described as follows:
Fuzzy Small: This function is used when small input values should have higher membership values (Mohebbi Tafreshi et al. 2018; Raines et al. 2010; Zadeh 1965). Input values below the midpoint receive higher membership values (Fig. 4a).
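The Fuzzy Small function is commonly documented for ArcGIS in the following form, where μ(x) denotes the membership value of an input value x (these two symbols are introduced here for illustration):

```latex
\mu(x) = \frac{1}{1 + \left( \dfrac{x}{f_2} \right)^{f_1}}
```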
In this equation, f1 is the spread and f2 is the midpoint, both defined by the user.
Fuzzy Linear: This function establishes a linear relationship between the minimum and maximum values defined by the user (Raines et al. 2010; Zadeh 1965). Values below the minimum are assigned 0, and values above the maximum are assigned 1 (Fig. 4b).
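For an increasing linear membership, the description above corresponds to the following piecewise form, given here as a sketch with μ(x) denoting the membership value of an input value x (symbols introduced for illustration):

```latex
\mu(x) =
\begin{cases}
0, & x \le \min \\[4pt]
\dfrac{x - \min}{\max - \min}, & \min < x < \max \\[4pt]
1, & x \ge \max
\end{cases}
```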
In this equation, min and max are user inputs.
Fuzzy Large: This function is used when large input values should have higher membership values; it is precisely the opposite of the Fuzzy Small function (Mohebbi Tafreshi et al. 2018; Zadeh 1965). Input values above the midpoint receive higher membership values (Fig. 4c).
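The Fuzzy Large function is commonly documented for ArcGIS in the following form, which differs from Fuzzy Small only in the sign of the exponent; μ(x) denotes the membership value of an input value x (symbols introduced for illustration):

```latex
\mu(x) = \frac{1}{1 + \left( \dfrac{x}{f_2} \right)^{-f_1}}
```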
In this equation, f1 is the spread entered by the user, and f2 is the midpoint.
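The three membership functions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ArcGIS implementation: the function names, the sample drawdown values, and the chosen midpoint and spread are all hypothetical, and inputs are assumed to be non-negative.

```python
def fuzzy_small(x, midpoint, spread):
    """Fuzzy Small: membership is 0.5 at the midpoint and rises toward 1 as x shrinks."""
    return 1.0 / (1.0 + (x / midpoint) ** spread)


def fuzzy_large(x, midpoint, spread):
    """Fuzzy Large: membership is 0.5 at the midpoint and rises toward 1 as x grows."""
    ratio = (x / midpoint) ** spread
    # Algebraically equal to 1 / (1 + ratio ** -1), but safe at x = 0.
    return ratio / (1.0 + ratio)


def fuzzy_linear(x, minimum, maximum):
    """Fuzzy Linear: 0 at or below minimum, 1 at or above maximum, linear in between."""
    if x <= minimum:
        return 0.0
    if x >= maximum:
        return 1.0
    return (x - minimum) / (maximum - minimum)


# Hypothetical drawdown values (m) for a handful of pixels; not data from the study.
drawdown = [0.0, 5.0, 10.0, 20.0, 40.0]
# Assumed choice: larger drawdown maps to higher membership, so Fuzzy Large is used.
standardized = [fuzzy_large(d, midpoint=10.0, spread=5) for d in drawdown]
```

For any positive input, the Fuzzy Small and Fuzzy Large memberships computed with the same midpoint and spread sum to 1, which is the sense in which one is the opposite of the other.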
4.2 Modeling using GEP
GEP is a generalized genetic algorithm, based on Darwin’s theory, that was first proposed by Ferreira in 1999 (Ferreira 2001). The first step of the gene expression algorithm is to produce an initial population of solutions, either through a random process or by using some prior information. Each chromosome is then expressed as an expression tree, and a fitness function evaluates it to determine the fitness of the solution in the problem domain (Abbasi et al. 2019). Fitness is usually evaluated by processing a number of instances of the actual problem, called fitness cases. Because the population at each stage is encoded as simple linear structures and all modifications are made only on these simple structures, relatively complex structures do not need to be manipulated at each stage (Abbasi et al. 2019). If a solution of satisfactory quality is found or a specified number of generations is reached, evolution stops and the best solution is reported (Maroufpoor et al. 2019). Otherwise, the best solution of the current generation is kept (elitism), and the remaining solutions are subjected to a selection process. Selection implements survival of the fittest, so the best individuals have a better chance of producing offspring. The whole process is repeated for several generations, and as the generations proceed, the quality of the population is expected to improve on average (Ferreira 2006). The algorithm defines an objective function in terms of quality criteria and then uses it to compare candidate solutions in a step-by-step process of correcting the data structure until it arrives at an appropriate solution. In this method, phenomena are modeled using a set of functions and a set of terminals.
The set of functions usually includes the arithmetic operators [+, −, *, /], trigonometric functions, and other mathematical or user-defined functions that the modeler believes may be appropriate for the model. The set of terminals consists of the constant values and independent variables of the problem (Ferreira 2001).
In this study, GeneXproTools software was used to develop and implement the gene expression programming model and to make predictions. One strength of gene expression programming is that creating genetic diversity is very simple because the genetic operators act at the chromosome level. Another is its multi-gene nature, which allows the evaluation of complex models composed of several sub-models. The process of modeling subsidence in the Varamin Plain was as follows:
The first step was to select an appropriate fitness function; the root mean square error was chosen (Mehdizadeh et al. 2016). The second step was to select the set of input variables and the set of functions used to generate the chromosomes. In this study, the four main operators [+, −, *, /] and the mathematical functions [Tanh, X2, Atan, Inv, 3Rt, Ln, NOT, Min2, Max2, Exp, Avg2] were used. The third step was to select the structure and architecture of the chromosomes, that is, the head length and the number of genes (Mehdizadeh et al. 2016). The fourth step was to select the linking function; in this study, addition was used to link the sub-expression trees. Finally, in the fifth step, the genetic operators and their rates were selected. Here, a combination of all genetic operators was used: mutation, inversion, three types of transposition, and three types of recombination.
In GEP, which is a development of GA, individuals are encoded as simple linear chromosomes and then translated into expression parse trees. This complete separation of genotype and phenotype makes GEP much faster (100–10,000 times) than genetic programming (GP) (Ferreira 2001; Dey et al. 2015). For instance, the expression tree of an algebraic expression (Eq. 6) is shown in Fig. 5.
In GEP, more complex technological and scientific problems can be solved with the help of linear chromosomes and expression trees (ETs) (Dey et al. 2015). A chromosome is a linear symbolic string of constant length consisting of one or more genes of equal size. A typical GEP chromosome is presented in Fig. 6. Each linear chromosome can be genetically manipulated through replication, recombination, mutation, and transposition (Ferreira 2001; Dey et al. 2015). Structurally, chromosomes are composed of genes, each of which comprises a head and a tail (Dey et al. 2015). As shown in Eq. 7, the tail length (tl) is a function of the head length (hl) and the number of arguments of the function (m):
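In Ferreira's formulation of GEP, this relation is usually written as below. Here m is read as the number of arguments of the function that takes the most arguments (the maximum arity); this reading of m is an assumption about the intended notation.

```latex
tl = hl \, (m - 1) + 1
```

For example, with hl = 10 and a function set containing only one- and two-argument functions (m = 2), tl = 11 and each gene is 21 symbols long. This tail length guarantees that every function in the head can always be supplied with enough terminals.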
Although all genes in GEP have the same size, they code for expression trees of different sizes (Alkroosh and Ammash 2015). The trees are a spatial illustration of the interactions among the gene’s components on the map of the solution (Alkroosh and Ammash 2015). Figure 7 presents the expression trees of the genes of the chromosome in Fig. 6.
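How a fixed-length gene codes for an expression tree, and how that tree is scored with an RMSE fitness function, can be sketched as follows. This is a minimal illustration and not the GeneXproTools implementation: the reduced function set, the protected division and square root, the example gene, and the fitness cases are all assumptions chosen for clarity. The gene is read breadth-first (Ferreira's K-expression), so each function takes its arguments from the next unread symbols.

```python
import math

# Function set: symbol -> (number of arguments, implementation).
# "Q" stands for square root, following Ferreira's notation.
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "/": (2, lambda a, b: a / b if b != 0 else 1.0),  # protected division (assumption)
    "Q": (1, lambda a: math.sqrt(abs(a))),            # protected square root (assumption)
}


def evaluate_gene(gene, terminals):
    """Evaluate one gene written as a K-expression (breadth-first tree encoding)."""
    children = {}
    next_free = 1  # index of the next symbol not yet attached to a parent
    i = 0
    while i < next_free:
        arity = FUNCTIONS[gene[i]][0] if gene[i] in FUNCTIONS else 0
        children[i] = list(range(next_free, next_free + arity))
        next_free += arity
        i += 1
    # Symbols from index next_free onward are the non-coding part of the gene.

    def value(node):
        symbol = gene[node]
        if symbol in FUNCTIONS:
            func = FUNCTIONS[symbol][1]
            return func(*(value(child) for child in children[node]))
        return terminals[symbol]

    return value(0)


def rmse_fitness(gene, fitness_cases):
    """Root mean square error of a gene over (terminals, target) fitness cases."""
    squared = [(evaluate_gene(gene, t) - y) ** 2 for t, y in fitness_cases]
    return math.sqrt(sum(squared) / len(squared))


# Head "Q*+-" (length 4) and tail "abcda" (length 5): the gene decodes to
# sqrt((a + b) * (c - d)), and the final "a" is never read.
gene = "Q*+-abcda"
cases = [({"a": 1.0, "b": 3.0, "c": 5.0, "d": 1.0}, 4.0),
         ({"a": 2.0, "b": 2.0, "c": 2.0, "d": 1.0}, 2.0)]
error = rmse_fitness(gene, cases)
```

Because the unread tail symbols do not affect the result, mutation or transposition anywhere in the gene always yields a syntactically valid tree, which is the property that lets the genetic operators work directly on the linear chromosome.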
4.3 Modeling using ANN
ANNs are computational methods that learn a model from the available input samples by adjusting the weights of processing units called neurons. The neuron is the smallest information processing unit and forms the basis of neural network operation. As shown in Fig. 8, a neuron consists of three main parts (Arjun and Kumar 2011). The set of synapses connects each input xj to the neuron through the weights wkj. The summing junction uk adds up the weighted input signals. An activation function [\(\varphi \left( . \right)\)] constrains the output range. The bias constant bk is used to decrease or increase the output of the neuron.
Equations 8 and 9 represent the neural network structure mathematically:
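In the notation of the preceding paragraph, the usual textbook form of the neuron model reads as follows. This is a sketch of the standard formulation; the number of inputs m and the neuron output y_k are symbols assumed here.

```latex
\begin{aligned}
u_k &= \sum_{j=1}^{m} w_{kj} \, x_j \\
y_k &= \varphi\left( u_k + b_k \right)
\end{aligned}
```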
The network is trained with input–output samples k, where the input vectors are x1, x2, …, xn and the corresponding output vectors are y1, y2, …, yn. wkj and bk are the weight and bias vectors of the hidden and output layers. Each neuron receives all outputs of the neurons in the previous layer, each with its own weight. After the network is created and the number of hidden layers and neurons is set, it is trained on the available input–output samples using a learning rule that updates the weight vector (Ross 2005). The activation function of each neuron determines its output from the sum of its weighted inputs. Generally, the same activation function is chosen for all neurons in a layer, although this is not required (Ross 2005).
Figure 9 shows the structure of a multi-layer perceptron (MLP) network with I inputs, one hidden layer (with H units), and one output layer. As Fig. 9 shows, the layers can be divided by type and location into input, hidden, and output layers. The input layer receives the information and provides it to the system. The output layer sends the computed values out of the system. The hidden layers are those whose inputs and outputs remain within the system. I is the number of input variables, H is the number of hidden layer nodes, and O is the number of output variables. One of the essential learning algorithms for ANNs, which is also used in this research, is the error back-propagation rule. It is used to train multilayer feedforward neural networks, commonly referred to as multilayer perceptron (MLP) networks (Fig. 10). The error back-propagation rule consists of two main passes. In the first, the forward pass, the input provided to the input layer is propagated through the network, layer by layer, to the output layer; during this pass, the network parameters are held constant. The objective function designed for network training is usually defined as the sum of the squared errors. In the second, the backward pass, the computed error is propagated from the output layer back through the network layers. In this pass, the weights of the MLP network are adjusted to minimize the sum of squares of the network error.
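The forward and backward passes described above can be sketched for a network with I inputs, H hidden units, and a single output. This is a minimal pure-Python illustration, not the configuration used in the study: the tanh hidden activation, the linear output unit, the learning rate, the number of epochs, and the synthetic training samples are all assumptions.

```python
import math
import random


def train_mlp(samples, n_hidden=4, lr=0.1, epochs=500, seed=0):
    """Train an I-H-1 perceptron with batch error back-propagation.

    samples: list of (inputs, target) pairs, where inputs is a list of floats.
    Returns (predict, history); history holds the mean squared error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    n = len(samples)
    # Hidden weights w[h][i], hidden biases b[h], output weights v[h], output bias c.
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    v = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    c = 0.0

    def forward(x):
        hidden = [math.tanh(sum(wi * xi for wi, xi in zip(w[h], x)) + b[h])
                  for h in range(n_hidden)]
        out = sum(vh * hh for vh, hh in zip(v, hidden)) + c
        return hidden, out

    history = []
    for _ in range(epochs):
        gw = [[0.0] * n_in for _ in range(n_hidden)]
        gb = [0.0] * n_hidden
        gv = [0.0] * n_hidden
        gc = 0.0
        squared_error = 0.0
        for x, target in samples:
            hidden, out = forward(x)          # forward pass: weights held fixed
            err = out - target
            squared_error += err * err
            gc += err                         # backward pass: output layer first
            for h in range(n_hidden):
                gv[h] += err * hidden[h]
                delta = err * v[h] * (1.0 - hidden[h] ** 2)  # tanh derivative
                gb[h] += delta
                for i in range(n_in):
                    gw[h][i] += delta * x[i]
        history.append(squared_error / n)
        # Gradient-descent update using the batch-averaged gradients.
        c -= lr * gc / n
        for h in range(n_hidden):
            v[h] -= lr * gv[h] / n
            b[h] -= lr * gb[h] / n
            for i in range(n_in):
                w[h][i] -= lr * gw[h][i] / n

    return (lambda x: forward(x)[1]), history


# Synthetic samples: two standardized inputs in [0, 1] and a made-up target.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
samples = [([a, b], 0.6 * a + 0.4 * b) for a in grid for b in grid]
predict, history = train_mlp(samples)
```

The list `history` records the training error before each weight update, so comparing its first and last entries shows how far the backward pass has reduced the squared error.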
4.4 Theoretical comparison between the ANN and the GEP methods and the conditions for their applications
An ANN, one of the best-known artificial intelligence-based techniques, is a flexible mathematical method that can recognize intricate nonlinear relationships between input and output data sets.
The main advantage of ANN models over statistical methods is that the latter assume linear relationships and/or normal distributions, whereas reality is nonlinear and non-normal. The ANN model can therefore conform to the real world. Another important advantage of ANNs is their capability to handle large and intricate systems with many interrelated parameters (Nourani et al. 2011). The no free lunch theorem states that, averaged uniformly over all target functions, the expected error is the same for any two algorithms. Nonetheless, there are other reasons to claim advantages for ANNs over other algorithms. For example, ANNs degrade gracefully: the input data may be noisy, or units may even be removed, and the ANN still functions. Another advantage is the inherently distributed nature of ANNs, which allows better implementations across a distributed environment. The ANN is a non-parametric model and thus eliminates the error in parameter estimation, whereas most statistical methods (MLR, etc.) are parametric models that require a stronger background in statistics (Singh and Su 2016).
The drawback of this method is that the final product is not a mathematical equation that can be easily implemented. A major limitation of common soft computing techniques is that they provide no closed-form prediction equation (Mohammadzadeh et al. 2019). In the last decade, owing to the importance of the research topic, numerous studies have concentrated on linear and nonlinear regression equations (Pham et al. 2016). Modeling with artificial intelligence (AI) has been a very active research area (Pham et al. 2016). According to previous research, although AI techniques such as ANN have demonstrated superior capability over traditional modeling methods, and ANN has been a successful choice for prediction problems, ANN has the following limitations: (1) it does not provide information about the relative significance of the various parameters (Samui 2008); (2) a common criticism of neural networks is that they require a large diversity of training for operation (Saberi et al. 2013); (3) the knowledge acquired during training is stored implicitly, so it is hard to arrive at a reasonable interpretation of the overall structure of the network (Samui 2014); and (4) for the ANN to learn, examples must be defined and presented to the network together with the desired output. The network’s success is directly proportional to the selected instances, and if the event cannot be presented to the network in all its aspects, the network can produce false output. In addition, ANN has some intrinsic disadvantages, such as poorer generalization performance, convergence to local minima, over-fitting, and a slow convergence pace (Samui 2014).
GEP is another artificial intelligence-based technique commonly used for nonlinear systems, and it is a newer technique than ANN. The advantages of GEP are, first, that the chromosomes are simple entities (linear, compact, relatively small, and easy to manipulate genetically: replicate, mutate, recombine, transpose) and, second, that the expression trees are exclusively the expression of their respective chromosomes (Moghassem and Fallahpour 2013). An important and powerful property of GEP is that the user can easily obtain an explicit formula for the relation between the inputs and the output, which makes GEP more attractive (Guven and Kisi 2013; Parasuraman et al. 2007).
Unlike ANN, GEP is self-parameterizing: it creates the model’s structure without any user tuning (Danandeh Mehr et al. 2014). Moreover, unlike ANNs, which are black-box models that do not describe the physical relationships among the various process components (Alavi et al. 2011; Moghassem and Fallahpour 2013), GEP models are capable of giving explicit expressions of the relationships between dependent and independent variables (Wang et al. 2016). Technicians with less skill can use those expressions more easily than ANN models (Wang et al. 2016).
In summary, the two methods have similar capabilities, and which one fits better depends on the problem: ANNs are usually straightforward to implement and work well, but their black-box nature makes them less user-friendly (Wolpert and Macready 1997). GEP results, on the other hand, are often human-readable, but coding such an algorithm from scratch can be painstaking (Wolpert and Macready 1997). Nevertheless, the no free lunch theorem (NFLT) should be kept in mind: it states that any two algorithms are equivalent when their performance is averaged across all possible problems (Wolpert and Macready 1997).
4.5 Performance evaluation
To evaluate model performance, seven statistical criteria based on observed land subsidence were used. These criteria are described below:
The coefficient of determination (R2) shows what percentage of the variation in the dependent variable is explained by the independent variables; the remaining variation in the dependent variable is attributable to other factors. R2 always lies between 0 and 100%.
where Ft is the forecast data, At is the actual data (observed land subsidence), and n is the number of data.
The Mean Squared Error (MSE) is defined as the average of the squared deviations of the estimates from the actual values. This statistical criterion is widely used among statisticians (Lehmann and Casella 1998).
where Ft is the forecast data, At is the actual data (observed land subsidence), and n is the number of data.
The Median Absolute Error (MAE) criterion is a robust measure of error. It is more resistant to outliers in the data than the standard deviation (Willmott and Matsuura 2005).
where At is the actual data (observed land subsidence), Ft is the forecast data, and n is the number of data.
The root mean square error (RMSE) measures the magnitude of the deviations of the estimated values from the observed values. In other words, this criterion reflects the dispersion of the errors: the smaller the RMSE and the closer it is to zero, the better the performance of the model (Hyndman and Koehler 2006).
where Ft is the forecast data, At is the actual data (observed land subsidence), and n is the number of data.
The other three statistical criteria that the GeneXproTools software specifically uses to evaluate model performance are the Relative Absolute Error (RAE) (Eq. 14), the Relative Squared Error (RSE) (Eq. 15), and the Root Relative Squared Error (RRSE) (Eq. 16).
In the above equations, At is the actual data (measured subsidence from the radar images), Ft is the data estimated by the model, and \(\bar{A}\) is the average of real data.
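The seven criteria can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard textbook forms of each criterion, not a reproduction of Eqs. 10–16. Two choices in particular are assumptions: R2 is computed as the squared Pearson correlation between actual and forecast values (consistent with the reported R and R2 values, where R2 is approximately the square of R), and MAE is computed as the mean of the absolute errors (replace the mean with a median if the median form is intended). All function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r_squared(actual, forecast):
    """Squared Pearson correlation between actual and forecast (assumed form of R2)."""
    a_bar, f_bar = _mean(actual), _mean(forecast)
    cov = sum((a - a_bar) * (f - f_bar) for a, f in zip(actual, forecast))
    var_a = sum((a - a_bar) ** 2 for a in actual)
    var_f = sum((f - f_bar) ** 2 for f in forecast)
    return cov ** 2 / (var_a * var_f)


def mse(actual, forecast):
    """Mean of the squared errors."""
    return _mean([(f - a) ** 2 for a, f in zip(actual, forecast)])


def mae(actual, forecast):
    """Mean of the absolute errors (assumption: mean, not median)."""
    return _mean([abs(f - a) for a, f in zip(actual, forecast)])


def rmse(actual, forecast):
    """Square root of the MSE."""
    return math.sqrt(mse(actual, forecast))


def rae(actual, forecast):
    """Total absolute error relative to that of always predicting the mean of the actual data."""
    a_bar = _mean(actual)
    return (sum(abs(f - a) for a, f in zip(actual, forecast))
            / sum(abs(a - a_bar) for a in actual))


def rse(actual, forecast):
    """Total squared error relative to that of always predicting the mean of the actual data."""
    a_bar = _mean(actual)
    return (sum((f - a) ** 2 for a, f in zip(actual, forecast))
            / sum((a - a_bar) ** 2 for a in actual))


def rrse(actual, forecast):
    """Square root of the RSE."""
    return math.sqrt(rse(actual, forecast))
```

For all criteria except R2, smaller values indicate a better fit; RAE, RSE, and RRSE below 1 indicate that the model outperforms a constant prediction equal to the mean of the actual data.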
5 Results and discussions
5.1 Factors standardization with GIS fuzzy memberships
Because lower values of the DEM, Sy, distance of fault, T, rain, and slope parameters increase land subsidence, the “Small” function was used for the fuzzy standardization of these factors (Mohebbi Tafreshi et al. 2019). Figure 11 shows the fuzzy standardization procedure for one of these parameters using the fuzzy “small membership” function.
The “large membership” function was used for those parameters whose higher values have a greater effect on the rate of land subsidence (Mohebbi Tafreshi et al. 2019). Accordingly, the aquifer thickness, bedrock depth, Debi, and G.W. drawdown parameters were fuzzy standardized with this membership function (Mohebbi Tafreshi et al. 2019). Figure 12 shows the fuzzy standardization procedure for one of these parameters using the fuzzy “large membership” function.
Since land use, geology, erosion, and soil type have qualitative classes, the “linear membership” function was used to fuzzy standardize these parameters after a numerical value had been assigned to each qualitative class (Table 2). A larger numerical value represents a greater effect on land subsidence (Mohebbi Tafreshi et al. 2019). Figure 13 shows the fuzzy standardization procedure for one of these parameters using the fuzzy “linear membership” function. Figure 14 presents all fuzzified factors.
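The three membership functions can be sketched as follows. This is a minimal illustration that assumes the forms commonly used for the “small”, “large”, and “linear” functions in GIS software (for example, in the ArcGIS Spatial Analyst Fuzzy Membership tool); the function names and the midpoint, spread, and bound values are hypothetical and are not the settings used in this study.

```python
def fuzzy_small(x, midpoint, spread=5.0):
    """Membership near 1 for small inputs; equals 0.5 at the midpoint."""
    if x <= 0:
        return 1.0  # assumption: non-positive inputs get full membership
    return 1.0 / (1.0 + (x / midpoint) ** spread)


def fuzzy_large(x, midpoint, spread=5.0):
    """Membership near 1 for large inputs; equals 0.5 at the midpoint."""
    if x <= 0:
        return 0.0  # assumption: non-positive inputs get zero membership
    return 1.0 / (1.0 + (x / midpoint) ** (-spread))


def fuzzy_linear(x, low, high):
    """Membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


# Hypothetical example: a slope of 2 against a midpoint of 10 receives a high
# "small" membership, i.e. low slopes are treated as favoring subsidence.
slope_membership = fuzzy_small(2.0, midpoint=10.0)
```

Each function maps a raw factor value onto the interval from 0 to 1, so that all layers share a common scale before they are passed to the GEP or ANN model.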
5.2 Land subsidence susceptibility modeling with GEP
In this study, 70% of the data (2919 pixels) were randomly selected for training and the remaining 30% (1251 pixels) for testing, and both sets were entered into the model. The best fitness, R, R2, and RMSE statistical measures were used to evaluate the performance of the model. The parameters and their values at the various stages of using the GeneXproTools software to estimate the subsidence are summarized in Table 3.
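The random partition can be sketched as follows. This is a minimal illustration, assuming that pixels are identified by an integer index and shuffled once with a fixed seed; the function name, the seed, and the use of plain indices are hypothetical choices and do not describe how the software performs the split.

```python
import random


def split_pixels(n_pixels, train_percent=70, seed=42):
    """Randomly partition pixel indices into training and testing sets."""
    indices = list(range(n_pixels))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding in the set sizes.
    n_train = (n_pixels * train_percent) // 100
    return indices[:n_train], indices[n_train:]


# 2919 training pixels + 1251 testing pixels = 4170 pixels in total.
train_idx, test_idx = split_pixels(4170)
```

Fixing the seed makes the partition reproducible, which is useful when the same training and testing sets must be reused for both models.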
Table 4 shows the best model in the training and testing phases (Figs. 15, 16). This result shows that using the bedrock level, slope, soil, geology, and aquifer thickness parameters together with the +, −, *, /, Tanh, X2, Atan, Inv, 3Rt, Ln, NOT, Min2, Max2, Exp, and Avg2 operators improves model performance and yields modeling results in excellent agreement with the real data.
Figure 17 shows the effect of each parameter on land subsidence in the F-GEP modeling. The G.W. drawdown parameter had the highest impact, and the Debi of pumping wells parameter had the least effect on land subsidence in the study area. The GEP modeling results on the high influence of the G.W. drawdown parameter on land subsidence are in line with the results of Shadfar et al. (2016) and Shemshaki et al. (2006). They are also in line with the results of Sundell et al. (2019), who reported the high impact of the groundwater and clay thickness parameters on subsidence and its associated hazards.
Since the GEP model can obtain the mathematical relationship between the input and output variables, the mathematical and numerical relations are shown in Table 5. Numerical constants are randomly generated for each gene of the chromosome and help simplify the equation (Table 6). Each of the four genes used here has its own sub-tree and its own equation, and these are ultimately combined into the final equation through the linking function. Figure 18 shows the tree structure of the output model.
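How genes are decoded into sub-trees and then linked can be sketched as follows. This is a minimal illustration of the general GEP scheme (Ferreira 2001), in which each gene is read breadth-first into an expression tree and the gene outputs are combined by a linking function, assumed here to be addition. The two genes, the reduced function set, and the input names a, b, and c are hypothetical and are not the fitted model of Table 5.

```python
import math
import operator

# Function set: symbol -> (arity, implementation). Any other symbol is a terminal.
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "tanh": (1, math.tanh),
}


def decode(gene):
    """Read a gene breadth-first into a nested (symbol, children) tree."""
    nodes = [(symbol, []) for symbol in gene]
    next_free = 1  # index of the next symbol not yet attached to a parent
    position = 0
    while position < next_free:
        symbol, children = nodes[position]
        arity = FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0
        children.extend(nodes[next_free:next_free + arity])
        next_free += arity
        position += 1
    # Symbols beyond next_free form the non-coding part of the gene.
    return nodes[0]


def evaluate(node, inputs):
    """Evaluate an expression tree for a dictionary of input values."""
    symbol, children = node
    if symbol not in FUNCTIONS:
        return inputs[symbol]
    args = [evaluate(child, inputs) for child in children]
    return FUNCTIONS[symbol][1](*args)


def model_output(genes, inputs):
    """Link the gene sub-expressions by addition."""
    return sum(evaluate(decode(gene), inputs) for gene in genes)


# Hypothetical genes with head size 3 and tail size 4.
GENE_1 = ["+", "*", "tanh", "a", "b", "c", "a"]  # decodes to a*b + tanh(c)
GENE_2 = ["-", "a", "c", "b", "b", "a", "c"]     # decodes to a - c
result = model_output([GENE_1, GENE_2], {"a": 2.0, "b": 3.0, "c": 0.0})
```

Because every gene decodes to an explicit expression, the linked model can be written out as a closed-form equation, which is the property that distinguishes GEP from the ANN.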
Since the linking function is addition, the sub-expressions of the genes are summed to obtain the final equation (Eq. 17), as follows:
Finally, Fig. 19 presents the land subsidence susceptibility map based on the F-GEP model.
5.3 Land subsidence susceptibility modeling with ANN
In this study, the ANN was used to model the subsidence. The ANN receives input information comprising bedrock level, T, clay thickness, annual rainfall, aquifer thickness, slope based on DEM, Debi zonation of pumping wells, soil type, groundwater drawdown in 20 years, erosion, distance of fault, Sy, land use, and lithological units, and relates these inputs through a mathematical mapping to the observed responses at locations where subsidence has occurred. Figure 20 shows the structure of the neural network used in this study, with 14 inputs, two hidden layers (eight units in the first layer and six in the second), and one output layer (Table 7). The network used 70% of the data (2919 pixels) for training and 30% (1251 pixels) for testing. The hyperbolic tangent function was used for the processing elements (neurons) in the hidden layers. The R2 and RMSE statistical criteria were used to select the appropriate number of neurons in the hidden layers and the number of iterations, to evaluate neural network learning, and to obtain the best results. To find the optimal state of the network, various activation (threshold) functions, such as the logistic sigmoid, linear, and hyperbolic tangent sigmoid functions, were tested. For each ANN, in the default configuration and with different numbers of iterations, the values of R2 and RMSE were examined. The number of iterations for which the RMSE of the test data was lowest and R2 was highest was selected as the initial number of iterations.
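The forward pass of a network with this layout can be sketched as follows. This is a minimal illustration of a 14-8-6-1 feed-forward network with hyperbolic tangent hidden units; the linear output unit, the random untrained weights, the zero biases, and all names are assumptions made for the sketch and do not represent the trained network of Table 7.

```python
import math
import random

LAYER_SIZES = [14, 8, 6, 1]  # inputs, first hidden, second hidden, output


def init_network(sizes, seed=0):
    """Create (weights, biases) per layer with small random weights (untrained)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate one input vector; tanh in hidden layers, linear output (assumed)."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output_layer = index == len(layers) - 1
        activations = sums if is_output_layer else [math.tanh(s) for s in sums]
    return activations[0]


def count_parameters(layers):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


network = init_network(LAYER_SIZES)
prediction = forward(network, [0.5] * 14)  # 14 fuzzified factor values in [0, 1]
```

With this layout the network has 181 adjustable parameters (120 in the first hidden layer, 54 in the second, and 7 in the output layer), all of which would be fitted during training.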
Figure 21 shows the desired output and the actual network output, and Fig. 22 shows the correlation coefficient between observed and computed subsidence, the error column for each learning process, and the error value for each sample in the training, validation, and test data. These results (Table 8) indicate that the network provides an excellent approximation for this study (over 94%).
Figure 23 illustrates the importance of the input variables to the neural network in predicting subsidence. According to Fig. 23, G.W. drawdown is the most important variable, followed by clay thickness, T, Sy, and geology. The ANN modeling results on the high influence of the G.W. drawdown parameter on land subsidence are in line with the results of Li and Zhang (2018), who reported the high impact of G.W. drawdown, clay thickness, and hydraulic properties such as transmissivity on subsidence and its associated hazards.
Finally, Fig. 24 presents the land subsidence susceptibility map based on the F-ANN model.
5.4 Empirical comparison between the F-ANN and the F-GEP methods
As shown in Fig. 25, the overall accuracy of the GEP model, with higher values of R (0.99861) and R2 (0.99722) and lower values of MAE (0.00321), MSE (0.00021), and RMSE (0.01461), is greater than that of the ANN model. Based on these results, it seems that for non-linear geologic events such as land subsidence, landslides, and floods, which depend on several independent parameters of geology, hydrogeology, hydrology, soil, and so on, the GEP model achieves better concordance with the actual data and gives more accurate results than the ANN model. This result is in line with the studies of Nourani et al. (2014), Luo et al. (2019), and Pashazadeh and Javan (2020), in which the GEP model showed higher concordance with actual data than other models, including ANN.
Table 9 shows the accuracy of the two models in each of the susceptibility classes. According to this table, the highest degree of conformity for the ANN model is observed in the very low class, followed by the low, very high, high, and moderate classes, in that order. For the GEP model, the highest degree of conformity is observed in the low class, followed by the very high, moderate, high, and very low classes, in that order.
As can be seen in Table 9 and Fig. 26, despite the higher accuracy of the GEP model in most classes, in the very low class, the fit of the ANN model based on the R and R2 statistical criteria is higher (red dashed line). However, according to the RMSE, MAE, and MSE statistical criteria, it is still the GEP model that has higher accuracy (blue dashed line).
Based on the model validation results, the GEP model using 10 parameters yields better results than the ANN model using 14 parameters. This can be attributed to the “tree-based” nature of the GEP model. Models of this type (like the support vector machine model) have advantages such as feature selection and pruning (Naghibi et al. 2018) and are very robust to noise (Tien Bui et al. 2016). Feature selection identifies the most important factors, which can be used for splitting and decision making, and makes the results more acceptable (Naghibi et al. 2018).
6 Conclusions
In this research, we evaluated the accuracy of GIS-based hybrid F-GEP and F-ANN models for estimating the risk of land subsidence in the Varamin aquifer based on radar image data. To standardize and fuzzify the factors before importing them into the ANN and GEP models, the factors were divided into three groups according to their nature, and three fuzzy membership functions (“large”, “small”, and “linear”) were used. Accordingly, the DEM, Sy, distance of fault, T, rain, and slope parameters were standardized with the “small” membership function; the aquifer thickness, bedrock depth, Debi, and G.W. drawdown parameters with the “large” membership function; and the land use, geology, erosion, and soil type parameters with the “linear” membership function. For modeling with the F-GEP model, fourteen inputs and the +, −, *, /, Tanh, X2, Atan, Inv, 3Rt, Ln, NOT, Min2, Max2, Exp, and Avg2 operators were used, with thirty chromosomes, a head size of seven, and four genes. For modeling with the F-ANN model, fourteen inputs, two hidden layers (eight units in the first layer and six in the second), and one output layer were used. In both models, 70% of the data were used for training and 30% for testing. The results showed that the overall accuracy of the F-GEP model, based on the R, R2, MSE, MAE, and RMSE statistical criteria, is better than that of the F-ANN model. Accordingly, the F-GEP model is more accurate than the F-ANN model in land subsidence susceptibility modeling. Despite the clear superiority of the F-GEP model based on the R and R2 statistical criteria, the comparison of accuracy across the susceptibility classes shows that this model did not perform well in zoning and estimating the “Very low sensitive regions” class, where the F-ANN model performed better. However, the model outputs show that both models perform very well in estimating and zoning areas in the “Very high” and “Low” subsidence risk classes.
The results also showed that, in both the F-ANN and F-GEP models, the groundwater drawdown and clay thickness parameters had the highest effect on land subsidence in the Varamin aquifer. This result is in line with previous studies in the Varamin aquifer.
This study showed that F-GEP is a powerful programming algorithm for land subsidence susceptibility modeling. It seems that the “tree-based” nature of the F-GEP model makes the results more accurate.
Using support vector machine (SVM), random forest, and other tree-based algorithms and comparing them with the results of the current research is a suggestion for future work, which may further improve the modeling accuracy, especially in susceptibility classes.
References
Aalipour Erdi M, Malekmohammadi B, Jafari HR (2017) Risk zoning of land subsidence due to groundwater level declining using fuzzy analytical hierarchy process. Iran J Watershed Manag Sci Eng 11:25–34
Abass SA, Mervat ZS, Abdallah AS (2011) Integer programming model for generation expansion planning problem under fuzzy environment. Int J Manag Sci Eng Manag 6:323–327. https://doi.org/10.1080/17509653.2011.10671180
Abbasi A, Khalili K, Behmanesh J, Shirzad A (2019) Drought monitoring and prediction using SPEI index and gene expression programming model in the west of Urmia Lake. Theor Appl Climatol 138:553–567. https://doi.org/10.1007/s00704-019-02825-9
Abdollahi S, Pourghasemi HR, Ghanbarian GA, Safaeian R (2019) Prioritization of effective factors in the occurrence of land subsidence and its susceptibility mapping using an SVM model and their different kernel functions. Bull Eng Geol Environ 78:4017–4034. https://doi.org/10.1007/s10064-018-1403-6
Alavi AH, Aminian P, Gandomi AH, Esmaeili MA (2011) Genetic-based modeling of uplift capacity of suction caissons. Expert Syst Appl 38:12608–12618. https://doi.org/10.1016/j.eswa.2011.04.049
Alimohammadi A (2009) Provision and preparation of provincial planning plan, Studies of natural and environmental resources, analysis of the status of geology, mineral resources and soil. Deputy of Planning, Tehran Governorate
Alkroosh I, Ammash H (2015) Soft computing for modeling punching shear of reinforced concrete flat slabs. Ain Shams Eng J 6:439–448. https://doi.org/10.1016/j.asej.2014.12.001
Arca D, Kutoğlu HŞ, Becek K (2018) Landslide susceptibility mapping in an area of underground mining using the multicriteria decision analysis method. Environ Monit 190:1–14. https://doi.org/10.1007/s10661-018-7085-5
Arjun CR, Kumar A (2011) Neural network estimation of duration of strong ground motion using Japanese earthquake records. Soil Dyn Earthq Eng 31:866–872. https://doi.org/10.1016/j.soildyn.2011.01.001
Atarzadeh AA, Tavana B, Abrazi B (2014) Quantitative and contamination studies of Varamin aquifer (Groundwater studies). Yekom Consulting Engineering
Ayalew L, Yamagishi H, Marui H, Kanno T (2005) Landslides in Sado Island of Japan: Part II. GIS-based susceptibility mapping with comparisons of results from two methods and verifications. Eng Geol 81:432–445. https://doi.org/10.1016/j.enggeo.2005.08.004
Aziz K, Haque MM, Rahman A, Shamseldin AY, Shoaib M (2017) Flood estimation in ungauged catchments: application of artificial intelligence based methods for Eastern Australia. Stoch Environ Res Risk Assess 31:1499–1514. https://doi.org/10.1007/s00477-016-1272-0
Barbulescu A, Popescu-Bodorin N (2019) Assessing the history-based predictability of regional monthly precipitation data using statistical and fuzzy methods. Stoch Environ Res Risk Assess 33:1435–1451. https://doi.org/10.1007/s00477-019-01702-1
Barzegar R, Adamowski J, Moghaddam AA (2016) Application of wavelet-artificial intelligence hybrid models for water quality prediction: a case study in Aji-Chay River, Iran. Stoch Environ Res Risk Assess 30:1797–1819. https://doi.org/10.1007/s00477-016-1213-y
Behyari M, Alizadeh A, Mahmoodi S (2017) Evaluation of the effect active structures on land subsidence risk using multi-criteria decision models. J Adv Appl Geology 7(49–5):6. https://doi.org/10.22055/aag.2017.13229
Berberian M, King GCP (1981) Towards a paleogeography and tectonic evolution of Iran. Can J Earth Sci 18:210–265. https://doi.org/10.1139/e81-019
Bianchini S, Solari L, Del Soldato M, Raspini F, Montalti R, Ciampalini A, Casagli N (2019) Ground subsidence susceptibility (GSS) mapping in Grosseto Plain (Tuscany, Italy) based on satellite InSAR data using frequency ratio and fuzzy logic. Rem Sens 11:2015. https://doi.org/10.3390/rs11172015
Burbey TJ (2002) The influence of faults in basin-fill deposits on land subsidence, Las Vegas Valley, Nevada, USA. Hydrogeol J 10:525–538. https://doi.org/10.1007/s10040-002-0215-7
Calderhead AI, Therrien R, Rivera A, Martel R, Garfias J (2011) Simulating pumping-induced regional land subsidence with the use of InSAR and field data in the Toluca Valley, Mexico. Adv Water Resour 34:83–97. https://doi.org/10.1016/j.advwatres.2010.09.017
Chanapathi T, Thatikonda S, Pandey VP, Shrestha S (2019) Fuzzy-based approach for evaluating groundwater sustainability of Asian cities. Sustain Cities Soc 44:321–331. https://doi.org/10.1016/j.scs.2018.09.027
Chen Y, Shu L, Burbey TJ (2013) Composite subsidence vulnerability assessment based on an index model and index decomposition method. Hum Ecol Risk Assess 19:674–698. https://doi.org/10.1080/10807039.2012.691405
Chen B, Gong H, Li X, Lei K, Zhu L, Gao M, Zhou C (2016) Characterization and causes of land subsidence in Beijing, China. Int J Rem Sens 38:808–826. https://doi.org/10.1080/01431161.2016.1259674
Chen B et al (2019) Land subsidence lagging quantification in the main exploration aquifer layers in Beijing plain, China. Int J Appl Earth Obs Geoinf 75:54–67. https://doi.org/10.1016/j.jag.2018.09.003
Dai FC, Lee CF (2001) Terrain-based mapping of landslide susceptibility using a geographical information system: a case study. Can Geotech J 38:911–923. https://doi.org/10.1139/t01-021
Danandeh Mehr A, Kahya E, Yerdelen C (2014) Linear genetic programming application for successive-station monthly streamflow prediction. Comput Geosci 70:63–72. https://doi.org/10.1016/j.cageo.2014.04.015
De Wiest RJM (1966) On the storage coefficient and the equations of groundwater flow. J Geophys Res 1896–1977(71):1117–1122. https://doi.org/10.1029/JZ071i004p01117
Dehghani M, Zoej MJV, Entezam I (2013) Neural network modelling of Tehran land subsidence measured by persistent scatterer interferometry. Photogrammetrie Fernerkundung Geoinf 2013:5–17. https://doi.org/10.1127/1432-8364/2013/0154
Dey P, Sarkar A, Kumar Das A (2015) Prediction of unsteady mixed convection over circular cylinder in the presence of nanofluid—a comparative study of ann and gep. J Nav Architect Mar Eng 12:57–71. https://doi.org/10.3329/jname.v12i1.21812
Elalfy D, Gad W, Ismail R (2018) A hybrid model to predict best answers in question answering communities. Egypt Inform J 19:21–31. https://doi.org/10.1016/j.eij.2017.06.002
Elhatip H, Hınıs MA, Gülbahar N (2008) Evaluation of the water quality at Tahtali dam watershed in Izmir-Turkey by means of statistical methodology. Stoch Environ Res Risk Assess 22:391–400. https://doi.org/10.1007/s00477-007-0127-0
Ferreira C (2001) Gene expression programming: a new adaptive algorithm for solving problems. arXiv preprint cs/0102027 http://www.gene-expression-programming.com/webpapers/GEP.pdf
Ferreira C (2006) Gene expression programming: mathematical modeling by an artificial intelligence vol 21. Studies in computational intelligence. Springer, Berlin. https://doi.org/10.1007/3-540-32849-1
Galloway DL, Burbey TJ (2011) Review: regional land subsidence accompanying groundwater extraction. Hydrogeol J 19:1459–1486. https://doi.org/10.1007/s10040-011-0775-5
Ghorbanzadeh O, Blaschke T, Aryal J, Gholaminia K (2018) A new GIS-based technique using an adaptive neuro-fuzzy inference system for land subsidence susceptibility mapping. Spat Sci. https://doi.org/10.1080/14498596.2018.1505564
Guven A, Kisi O (2013) Monthly pan evaporation modeling using linear genetic programming. J Hydrol 503:178–185. https://doi.org/10.1016/j.jhydrol.2013.08.043
Hu RL, Yue ZQ, Wang LC, Wang SJ (2004) Review on current status and challenging issues of land subsidence in China. Eng Geol 76:65–77. https://doi.org/10.1016/j.enggeo.2004.06.006
Hu L et al (2019) Land subsidence in Beijing and its relationship with geological faults revealed by Sentinel-1 InSAR observations. Int J Appl Earth Obs Geoinf 82:101886. https://doi.org/10.1016/j.jag.2019.05.019
Hyndman RJ, Koehler AB (2006) Another look at measures of forecast accuracy. Int J Forecast 22:679–688. https://doi.org/10.1016/j.ijforecast.2006.03.001
IIEES (2010) An analysis of source parameters of earthquakes in Tehran region. International Institute of Earthquake Engineering and Seismology. http://www.iiees.ac.ir/en/?s=varamin. Accessed 6 July 2019
Ilia I, Loupasakis C, Tsangaratos P (2018) Land subsidence phenomena investigated by spatiotemporal analysis of groundwater resources, remote sensing techniques, and random forest method: the case of Western Thessaly, Greece. Environ Monit 190:623. https://doi.org/10.1007/s10661-018-6992-9
Jahangoshai Rezaee M, Yousefi S, Eshkevari M, Valipour M, Saberi M (2020) Risk analysis of health, safety and environment in chemical industry integrating linguistic FMEA, fuzzy inference system and fuzzy DEA. Stoch Environ Res Risk Assess 34:201–218. https://doi.org/10.1007/s00477-019-01754-3
Jamshidi S et al (2019) Combining gene expression programming and genetic algorithm as a powerful hybrid modeling approach for pear rootstocks tissue culture media formulation. Plant Methods 15:136. https://doi.org/10.1186/s13007-019-0520-y
Karsli F, Atasoy M, Yalcin A, Reis S, Demir O, Gokceoglu C (2009) Effects of land-use changes on landslides in a landslide-prone area (Ardesen, Rize, NE Turkey). Environ Monit 156:241. https://doi.org/10.1007/s10661-008-0481-5
Kisi O, Khosravinia P, Nikpour MR, Sanikhani H (2019) Hydrodynamics of river-channel confluence: toward modeling separation zone using GEP, MARS, M5 Tree and DENFIS techniques. Stoch Environ Res Risk Assess 33:1089–1107. https://doi.org/10.1007/s00477-019-01684-0
Lashkaripour G, Rostami Barani H, Kohandel A, Torshizi H (2006) Decline in groundwater levels and land subsidence in the Kashmar plain. Paper presented at the international conference on earth sciences, Tehran, Iran. https://www.researchgate.net/publication/294688542_Decline_in_groundwater_levels_and_land_subsidence_in_the_Kashmar_plain. Accessed 6 July 2019
Leduc R, Ouldali S (1990) Probabilistic modeling of aerated lagoons: a comparison of methodologies. Stoch Hydrol Hydraul 4:65–81. https://doi.org/10.1007/BF01547733
Lehmann EL, Casella G (1998) Theory of point estimation, 2nd edn. Springer, New York. https://doi.org/10.1007/b98854
Li L, Zhang M (2018) Inverse modeling of interbed parameters and transmissivity using land subsidence and drawdown data. Stoch Environ Res Risk Assess 32:921–930. https://doi.org/10.1007/s00477-017-1396-x
Lixin Y, Fang Z, He X, Shijie C, Wei W, Qiang Y (2011) Land subsidence in Tianjin, China. Environ Earth Sci 62:1151–1161. https://doi.org/10.1007/s12665-010-0604-5
Lohman S (1961) Compression of elastic artesian aquifers. US Geol Surv Prof Pap 424-B:47–49
Luo Z, Luo Z, Qin Y, Wen L, Ma S, Dai Z (2019) Developing new tree expression programing and artificial bee colony technique for prediction and optimization of landslide movement. Eng Comput. https://doi.org/10.1007/s00366-019-00754-9
Mahmoudpour M, Khamehchiyan M, Nikudel M, Gassemi M (2013) Characterization of regional land subsidence induced by groundwater withdrawals in Tehran, Iran. Geopersia 3:49–62. https://doi.org/10.22059/jgeope.2013.36014
Manafiazar A, Khamehchiyan M, Nadiri A (2019) Comparison of Vulnerability of the Southwest Tehran Plain Aquifer with Simple Weighting Model (ALPRIFT Model) and Genetic Algorithm (GA). Kharazmi J Earth Sci 4:199–212
Maroufpoor S, Shiri J, Maroufpoor E (2019) Modeling the sprinkler water distribution uniformity by data-driven methods based on effective variables. Agric Water Manag 215:63–73. https://doi.org/10.1016/j.agwat.2019.01.008
Mehdizadeh S, Behmanesh J, Khalili K (2016) Comparison of artificial intelligence methods and empirical equations to estimate daily solar radiation. J Atmos Sol Terr Phys 146:215–227. https://doi.org/10.1016/j.jastp.2016.06.006
Minderhoud PSJ, Coumou L, Erban LE, Middelkoop H, Stouthamer E, Addink EA (2018) The relation between land use and subsidence in the Vietnamese Mekong delta. Sci Total Environ 634:715–726. https://doi.org/10.1016/j.scitotenv.2018.03.372
Moeeni H, Bonakdari H (2017) Forecasting monthly inflow with extreme seasonal variation using the hybrid SARIMA-ANN model. Stoch Environ Res Risk Assess 31:1997–2010. https://doi.org/10.1007/s00477-016-1273-z
Moghassem A, Fallahpour A (2013) Yarn strength modelling using adaptive neuro-fuzzy inference system (ANFIS) and gene expression programming (GEP). J Eng Fibers Fabr. https://doi.org/10.1177/155892501300800409
Mohammady M, Pourghasemi HR, Amiri M (2019) Land subsidence susceptibility assessment using random forest machine learning algorithm. Environ Earth Sci 78:503. https://doi.org/10.1007/s12665-019-8518-3
Mohammadzadeh D, Kazemi S-F, Mosavi A, Nasseralshariati E, Tah J (2019) Prediction of compression index of fine-grained soils using a gene expression programming model. Infrastructures 4:26. https://doi.org/10.3390/infrastructures4020026
Mohebbi Tafreshi A, Mohebbi Tafreshi G, Bijeh Keshavarzi MH (2018) Qualitative zoning of groundwater to assessment suitable drinking water using fuzzy logic spatial modelling via GIS. Water Environ J 32:607–620. https://doi.org/10.1111/wej.12358
Mohebbi Tafreshi G, Nakhaei M, Lak R (2019) Land subsidence risk assessment using GIS fuzzy logic spatial modeling in Varamin aquifer, Iran. GeoJournal. https://doi.org/10.1007/s10708-019-10129-8
Mokhtari H, Espahbod M (2009) The Investigation of hydrodynamic parameters potentiality of the Varamin Plan regarding the variation of salinity gradient. J Earth 4:27–47
Motagh M, Djamour Y, Walter TR, Wetzel H-U, Zschau J, Arabi S (2007) Land subsidence in Mashhad Valley, northeast Iran: results from InSAR, levelling and GPS. Geophys J Int 168:518–526. https://doi.org/10.1111/j.1365-246X.2006.03246.x
Mousavi SM, Shamsai A, Naggar MHE, Khamehchian M (2001) A GPS-based monitoring program of land subsidence due to groundwater withdrawal in Iran. Can J Civ Eng 28:452–464. https://doi.org/10.1139/l01-013
Naghibi SA, Pourghasemi HR, Abbaspour K (2018) A comparison between ten advanced and soft computing models for groundwater qanat potential assessment in Iran using R and GIS. Theor Appl Climatol 131:967–984. https://doi.org/10.1007/s00704-016-2022-4
Nakhaei M, Mohebbi Tafreshi A, Mohebbi Tafreshi G (2019) Modeling and predicting changes of TDS concentration in Varamin aquifer using GMS software. J Adv Appl Geol 9:25–37. https://doi.org/10.22055/aag.2019.27539.1903
Nameghi H, Hosseini SM, Sharifi MB (2013) An analytical procedure for estimating land subsidence parameters using field data and InSAR images in Neyshabur plain. Sci Q J Iran Assoc Eng Geol 6:33–50
Nejatijahromi Z, Nassery HR, Hosono T, Nakhaei M, Alijani F, Okumura A (2019) Groundwater nitrate contamination in an area using urban wastewaters for agricultural irrigation under arid climate condition, southeast of Tehran, Iran. Agric Water Manag 221:397–414. https://doi.org/10.1016/j.agwat.2019.04.015
NGOI (2008) Topography map (1:50000). National Geographic Organization of Iran. http://www.ngo-org.ir/. Accessed 6 July 2019
Nourani V, Kisi Ö, Komasi M (2011) Two hybrid artificial intelligence approaches for modeling rainfall–runoff process. J Hydrol 402:41–59. https://doi.org/10.1016/j.jhydrol.2011.03.002
Nourani V, Pradhan B, Ghaffari H, Sharifi SS (2014) Landslide susceptibility mapping at Zonouz Plain, Iran using genetic programming and comparison with frequency ratio, logistic regression, and artificial neural network models. Nat Hazards 71:523–547. https://doi.org/10.1007/s11069-013-0932-3
Oh HJ, Lee S (2010) Assessment of ground subsidence using GIS and the weights-of-evidence model. Eng Geol 115:36–48. https://doi.org/10.1016/j.enggeo.2010.06.015
Oh HJ, Syifa M, Lee CW, Lee S (2019) Land subsidence susceptibility mapping using bayesian, functional, and meta-ensemble machine learning models. Appl Sci 9:1–17. https://doi.org/10.3390/app9061248
Pacheco J, Arzate J, Rojas E, Arroyo M, Yutsis V, Ochoa G (2006) Delimitation of ground failure zones due to land subsidence using gravity data and finite element modeling in the Querétaro valley, México. Eng Geol 84:143–160. https://doi.org/10.1016/j.enggeo.2005.12.003
Parasuraman K, Elshorbagy A, Carey SK (2007) Modelling the dynamics of the evapotranspiration process using genetic programming. Hydrol Sci J 52:563–578. https://doi.org/10.1623/hysj.52.3.563
Park I, Choi J, Jin Lee M, Lee S (2012) Application of an adaptive neuro-fuzzy inference system to ground subsidence hazard mapping. Comput Geosci 48:228–238. https://doi.org/10.1016/j.cageo.2012.01.005
Pashazadeh A, Javan M (2020) Comparison of the gene expression programming, artificial neural network (ANN), and equivalent Muskingum inflow models in the flood routing of multiple branched rivers. Theor Appl Climatol 139:1349–1362. https://doi.org/10.1007/s00704-019-03032-2
Pham A-D, Hoang N-D, Nguyen Q-T (2016) Predicting compressive strength of high-performance concrete using metaheuristic-optimized least squares support vector regression. J Comput Civ Eng 30:06015002. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000506
Poland JF (1984) Guidebook to studies of land subsidence due to groundwater withdrawal. United Nations Educational, Scientific and Cultural Organization, Paris, Studies and Reports in Hydrology 40:305. https://unesdoc.unesco.org/in/rest/annotationSVC/DownloadWatermarkedAttachment/attach_import_4d651c8f-42bd-478e-8f0e-318b0ef13ec2?_=065167engo.pdf
Pourghasemi HR, Mohseni Saravi M (2019) 6-Land-subsidence spatial modeling using the random forest data-mining technique. In: Pourghasemi HR, Gokceoglu C (eds) Spatial modeling in GIS and R for Earth and environmental sciences. Elsevier, Amsterdam, pp 147–159. https://doi.org/10.1016/B978-0-12-815226-3.00006-5
Pradhan B, Abokharima MH, Jebur MN, Shafapour Tehrany M (2014) Land subsidence susceptibility mapping at Kinta Valley (Malaysia) using the evidential belief function model in GIS. Nat Hazards 73:1019–1042. https://doi.org/10.1007/s11069-014-1128-1
Putra DPE, Setianto A, Keokhampui K, Fukuoka H (2011) Land subsidence risk assessment in Karst Region, Case Study: Rongkop, Gunung Kidul, Yogyakarta-Indonesia. In: Mitteilungen zur Ingenieurgeologie und Hydrogeologie-Festschrift zum 60. Geburtstag von Univ.Prof. Dr. Rafig Azzam. RWTH Aachen University, Germany, pp 39–50. https://repository.ugm.ac.id/id/eprint/134971. Accessed 6 July 2019
Rafie M, Samimi Namin F (2015) Prediction of subsidence risk by FMEA using artificial neural network and fuzzy inference system. Int J Min Sci Technol 25:655–663. https://doi.org/10.1016/j.ijmst.2015.05.021
Rahmati O, Golkarian A, Biggs T, Keesstra S, Mohammadi F, Daliakopoulos IN (2019) Land subsidence hazard modeling: machine learning to identify predictors and the role of human activities. J Environ Manag 236:466–480. https://doi.org/10.1016/j.jenvman.2019.02.020
Raines GL, Sawatzky DL, Bonham-Carter GF (2010) New fuzzy logic tools in ArcGIS 10. http://www.esri.com/news/arcuser/0410/files/fuzzylogic.pdf. Accessed 6 July 2019
Rajabi AM, Ghorbani E (2016) Land subsidence due to groundwater withdrawal in Arak plain, Markazi province, Iran. Arab J Geosci 9:1–7. https://doi.org/10.1007/s12517-016-2753-7
Ranjbar A, Ehteshami M (2019) Development of an Uncertainty Based Model to Predict Land Subsidence Caused by Groundwater Extraction (Case Study: Tehran Basin). Geotech Geol Eng 37:3205–3219. https://doi.org/10.1007/s10706-019-00837-w
Rezaee P (2016) Forecast locations at risk of subsidence plain Kermanshah. J Spat Plan 20:235–251
Ross TJ (2005) Fuzzy logic with engineering applications. Wiley, New York
Saberi M, Mirtalaie MS, Hussain FK, Azadeh A, Hussain OK, Ashjari B (2013) A granular computing-based approach to credit scoring modeling. Neurocomputing 122:100–115. https://doi.org/10.1016/j.neucom.2013.05.020
Sadeghi A, Fonodi M, Davari M, Nourozi M, Zakili F, Keihani A (2006) One hundred thousandth geology map of Varamin. Geological Survey and Mineral Exploration of Iran (in Persian). https://gsi.ir/fa/map/207/-%D9%88%D8%B1%D8%A7%D9%85%DB%8C%D9%86. Accessed 6 July 2019
Samui P (2008) Prediction of friction capacity of driven piles in clay using the support vector machine. Can Geotech J 45:288–295. https://doi.org/10.1139/T07-072
Samui P (2014) Vector machine techniques for modeling of seismic liquefaction data. Ain Shams Eng J 5:355–360. https://doi.org/10.1016/j.asej.2013.12.004
SCWMRI (2010) Erosion, land use and soil maps (1:250000). Soil Conservation and Watershed Management Research Institute. https://www.environmental-expert.com/companies/soil-conservation-and-watershed-management-research-institute-scwmri-24937. Accessed 6 July 2019
Sentinel-1 (2015) https://sentinel.esa.int/web/sentinel/missions/sentinel-1. Accessed 6 July 2019
Shadfar S, Nasiri E, Chitgar S, Ahmadi A (2016) Hazard zonation of land subsidence using analytical hierarchy process (AHP) case study (city of Buin Zahra). Territory 12:101–116
Shemshaki A, Boulourchi MJ, Entezam Soltani I (2006) The study of land subsidence in Tehran plain and its casual factors. Paper presented at the 24th Earth Sciences meeting, Geological survey and mineral explorations of Iran. https://www.civilica.com/Paper-GSI24-GSI24_071.html. Accessed 6 July 2019
Singh O, Su EC-Y (2016) Prediction of HIV-1 protease cleavage site using a combination of sequence, structural, and physicochemical features. BMC Bioinform 17:478. https://doi.org/10.1186/s12859-016-1337-6
Suh J, Choi YE, Park H-D, Yoon S-H, Go W-R (2013) Subsidence hazard assessment at the Samcheok Coalfield, South Korea: a case study using GIS. Environ Eng Geosci 19:69–83
Sundell J, Haaf E, Tornborg J, Rosén L (2019) Comprehensive risk assessment of groundwater drawdown induced subsidence. Stoch Environ Res Risk Assess 33:427–449. https://doi.org/10.1007/s00477-018-01647-x
Taheri Z, Barzghari G, Dideban K (2018) A framework to estimation of potential subsidence of the aquifer using algorithm genetic. Iran Water Resour Res 14:182–194
Taravatrooy N, Nikoo MR, Sadegh M, Parvinnia M (2018) A hybrid clustering-fusion methodology for land subsidence estimation. Nat Hazards 94:905–926. https://doi.org/10.1007/s11069-018-3431-8
Terzaghi K (1925) Principles of soil mechanics, IV, Settlement and consolidation of clay. Eng News Rec 95(3):874–878. http://scholar.google.com/scholar_lookup?hl=en&volume=95&publication_year=1925&pages=874-878&journal=Eng.+News+Rec.&issue=3&author=K.+Terzaghi&title=Principles+of+soil+mechanics%2C+IV%2C+Settlement+and+consolidation+of+clay. Accessed 6 July 2019
Tien Bui D, Pham BT, Nguyen QP, Hoang N-D (2016) Spatial prediction of rainfall-induced shallow landslides using hybrid integration approach of least-squares support vector machines and differential evolution optimization: a case study in Central Vietnam. Int J Digit Earth 9:1077–1097. https://doi.org/10.1080/17538947.2016.1169561
Tien Bui D et al (2018) Land subsidence susceptibility mapping in South Korea using machine learning algorithms. Sensors (Basel) 18:1–20. https://doi.org/10.3390/s18082464
Tongal H, Booij MJ (2017) Quantification of parametric uncertainty of ANN models with GLUE method for different streamflow dynamics. Stoch Environ Res Risk Assess 31:993–1010. https://doi.org/10.1007/s00477-017-1408-x
TRWA (2018) Report of groundwater resources studies in Varamin Area (in Persian). Tehran Regional Water Authority
UNESCO (2018) Proposal for the establishment of the land subsidence international initiative (LaSII). United Nations Educational, Scientific and Cultural Organization. International Hydrological Programme, Paris. https://en.unesco.org/sites/default/files/ic-xiii_ref_5_land_subsidence.pdf. Accessed 6 July 2019
USGS (2019a) Land subsidence in California. Cause and effect. United States Geological Survey. https://www.usgs.gov/centers/ca-water-ls/science/cause-and-effect. Accessed 6 July 2019
USGS (2019b) Land subsidence. United States Geological Survey. https://www.usgs.gov/special-topic/water-science-school/science/land-subsidence?qt-science_center_objects=0#qt-science_center_objects
Waltham AC (1989) Ground subsidence. Blackie, Glasgow. https://scholar.google.com/scholar_lookup?title=Ground%20subsidence&author=AC.%20Waltham&publication_year=1989. Accessed 6 July 2019
Wang B, Chen Z (2015) A model-based fuzzy set-OWA approach for integrated air pollution risk assessment. Stoch Environ Res Risk Assess 29:1413–1426. https://doi.org/10.1007/s00477-014-0994-0
Wang P, Hu JC (2019) A hybrid model for EEG-based gender recognition. Cogn Neurodyn 13:541–554. https://doi.org/10.1007/s11571-019-09543-y
Wang G, Qin L, Li G, Chen L (2009) Landfill site selection using spatial information technologies and AHP: a case study in Beijing, China. J Environ Manag 90:2414–2421. https://doi.org/10.1016/j.jenvman.2008.12.008
Wang W, Ruan W, Li Q (2010) Fuzzy decision tree construction with gene expression programming. In: 2010 IEEE international conference on intelligent systems and knowledge engineering, 15–16 Nov. 2010, pp 244–248. https://doi.org/10.1109/ISKE.2010.5680877
Wang S, Fu Z-y, Chen H-s, Nie Y-p, Wang K-l (2016) Modeling daily reference ET in the karst area of northwest Guangxi (China) using gene expression programming (GEP) and artificial neural network (ANN). Theor Appl Climatol 126:493–504. https://doi.org/10.1007/s00704-015-1602-z
Wang HW, Lin CW, Yang CY, Ding CF, Hwung HH, Hsiao SC (2018) Assessment of land subsidence and climate change impacts on inundation hazard in Southwestern Taiwan. Irrigat Drain 67:26–37. https://doi.org/10.1002/ird.2206
Wang Y, Wang Z, Cheng W (2019) A review on land subsidence caused by groundwater withdrawal in Xi’an, China. Bull Eng Geol Environ 78:2851–2863. https://doi.org/10.1007/s10064-018-1278-6
Willmott CJ, Matsuura K (2005) Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res 30:79–82. https://doi.org/10.3354/cr030079
Wolpert DH, Macready WG (1997) No free lunch theorems for optimization. IEEE Trans Evol Comput 1:67–82. https://doi.org/10.1109/4235.585893
WRI (2014) Prediction of subsidence due to groundwater resource utilization using combined modeling and interferometric technique in radar satellite imagery. Water Research Institute. Iran Ministry of Energy. http://wrr-wri.ir/wp-content/uploads/2017/12/Qom.pdf. Accessed 6 July 2019
Yu HM, Wu YX, Shen JS, Zhou AN (2018) Assessment of social-economic risk of Chinese dual land use system using fuzzy AHP. Sustainability 10:2451. https://doi.org/10.3390/su10072451
Zadeh LA (1965) Fuzzy sets. Inf Control 8:338–353. https://doi.org/10.1016/S0019-9958(65)90241-X
Zaman Zad Ghavidel S, Montaseri M (2014) Application of different data-driven methods for the prediction of total dissolved solids in the Zarinehroud basin. Stoch Environ Res Risk Assess 28:2101–2118. https://doi.org/10.1007/s00477-014-0899-y
Acknowledgements
This study was supported by the Research Institute for Earth Sciences (RIES), Geological Survey of Iran (GSI) (No. 98-P-T-114). The authors are also thankful to Kharazmi University, Dr. Shemshaki and Dr. Morsali at GSI, and Dr. Heydarian and Dr. Mokhtari at the Regional Water Company of Tehran (RWCT) for providing the data needed to carry out this work.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical standards
The authors confirm that this manuscript is their original work and that it has not been published, and is not under review, in any other refereed journal.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Mohebbi Tafreshi, G., Nakhaei, M. & Lak, R. A GIS-based comparative study of hybrid fuzzy-gene expression programming and hybrid fuzzy-artificial neural network for land subsidence susceptibility modeling. Stoch Environ Res Risk Assess 34, 1059–1087 (2020). https://doi.org/10.1007/s00477-020-01810-3
DOI: https://doi.org/10.1007/s00477-020-01810-3