Introduction

Research on land subsidence is topical in engineering and environmental research, as its impacts are observed in many countries including China, Iran, Japan, Singapore, Thailand and the USA (Ye et al. 2015; Nadiri et al. 2018a; Hayashi et al. 2009; Phien-wej et al. 2006; Zhang et al. 2018; Goh et al. 2019; Galloway et al. 1999). Some engineering impacts of subsidence are outlined by Kihm et al. (2007), which include the following: (i) a collapse of water wells and the breakage of their casings and ancillary works by compacted soil due to decline of the water table in aquifers; (ii) the requirement for realignment of diversion structures in open channels; (iii) redistribution of stresses and strains in building works and structures creating the potential for excessive forces and subsequent failures. Arguably, aquifer management practices should feed information to planning stages of development works and the paper investigates accuracies of identifying subsidence hotspots, with potential to serve as a planning and management tool.

Studies on land subsidence are fragmented, as their focus is on settlements of ground surface triggered by movements or removals of groundwater by anthropogenic activities (Poland et al. 1972; Anumba and Scot 2001). These studies are triggered by a host of processes including the following: (i) excessive water abstraction from aquifers, e.g. in Thailand, Spain, Iran and China (Lorphensri et al. 2011; Schmid et al. 2014; Mateos et al. 2017; Nadiri et al. 2018a; Wang et al. 2019); (ii) degradation in organic soils, e.g. Venice, Italy (Tosi et al. 2013); (iii) extraction of underground fossil fuels, e.g. Wilmington oil reservoir, California (Colazas and Strehle 1995); (iv) karstification due to dissolution of limestone, e.g. Turkey and Spain (Desir et al. 2018; Doğan 2005); (v) subsurface mining, e.g. Bethlehem Mines Corporation, central Pennsylvania (Sossong 1973) and Kamptee Colliery, India (Soni et al. 2007); (vi) braced excavation in residual soils with groundwater drawdown (Zhang et al. 2018). Subsidence is a feature since the Industrial Revolution (1750–1950) that often impacts groundwater resources by withdrawals through aquifer pumpage. There is a gap in the state-of-the-art due to fragmentations in the techniques in each field of study with no cross-cutting techniques.

Land subsidence hazards often stem from two main processes: (i) geogenic processes, e.g. Avila-Olivera and Garduño-Monroy 2008; Cui and Tang 2010; Gu et al. 2018; and (ii) anthropogenic processes, e.g. Anumba and Scot 2001; Poland et al. 1972; and Wang et al. 2019). Each process often depends on a set of variables and their collective product may give rise toward subsidence. Past researches use different techniques to investigate and monitor subsidence and settlement by a range of approaches, e.g. Interferometric Synthetic Aperture Radar (InSAR) (Fernández-Camacho et al. 2015), global positioning system (GPS) (Sato et al. 2007), field evidence and historical data (Psimoulis et al. (2007), ground-penetrating radar (GPR) (Avila-Olivera and Garduño-Monroy 2008). These techniques are not analytical and therefore their efficacy is limited to past incidents but the paper explores a technique that learns from site data.

The ALPRIFT framework, introduced recently by Nadiri et al. (2018a), integrates both geogenic and anthropogenic dimensions to identify potential hotspots for subsidence from local data. It is a mapping technique, which models Subsidence Vulnerability Indices (SVI) using seven general-purpose data layers, as well as those from satellite remote sensing, including direct measurements for ground-truthing. ALPRIFT is particularly suitable to study areas with sparse data to map proactively hotspots vulnerable to subsidence by a modelling strategy to produce defensible information from the available data, though it is not capable of predicting future incidents.

The authors’ research on subsidence is innovative by introducing the ALPRIFT framework and investigating its performances, which parallels their ongoing program of research on groundwater vulnerability framework using DRASTIC by Aller et al. (1987). Both are mapping techniques but DRASTIC maps aquifer vulnerability caused by anthropogenic contaminants, whereas ALPRIFT does that for aquifer subsidence. Both ALPRIFT and DRASTIC frameworks involve a scoring system in terms of rates to account for relative significance of local conditions and by weights to account for relative significance of each data layer. The rates and weights of both frameworks are not transferrable to each other and they have no theoretical or empirical basis but are tested using performance metrics. The authors’ experience in implementing DRASTIC frameworks is that more defensible vulnerability mapping is feasible by learning the values of the weights by formulating strategies using artificial intelligence (AI). As ALPRIFT is a field of research at its infancy, investigating various strategies to learn the values of rates and/or weights from local data is now wide open. The paper investigates one such strategy by using AI to run multiple models (AIMM) at two levels and this strategy in itself is topical, as follows.

Learning at Level 1

This comprises the application of three variations of fuzzy logic (FL), which evolved as follows: (i) Zadeh (1965) presented the original of FL in 1965; (ii) this was transformed into a modelling capability using subjectively prescribed rules; and (iii) various algorithms emerged to learn the rules from the data. The paper uses three such algorithms, which are (i) Sugeno FL (SFL), see Sugeno (1985); (ii) Mamdani FL (MFL), see Mamdani (1976); and Larsen FL (LFL), see (Larsen 1980). Traditional FL modelling practices select one of these models as the ‘superior’ one but the paper regards them as fit-for-purpose and reuses them in the next levels. Notably, Khatibi et al. (2011a, b, 2012, 2017) criticise traditional approaches preoccupied with selecting best models.

Learning at Level 2

The authors have devised various further learning strategies (Nadiri et al. 2013, 2017a, 2018a, b, c) and the paper investigates the following: (i) at Level 1, SFL, MFL and LFL are adopted to optimise the weight values and make predictions; (ii) at Level 2, these are regarded as multiple models (MM) and are reused as inputs to another AI model to run MMs and, hence, the AIMM strategy. This study uses GEP at Level 2 and this is to be outlined in due course.

The need for identifying hotspots is increasingly necessary for Ardabil plain, Ardabil province, northwest Iran, for several reasons, as follows: (i) Based on the data provided by the Ardabil Regional Water Authority (ARWA), average decline of groundwater level is greater than 16 m in the past 10 years (from 2005 to 2015) at an approximate rate of 2 m/annum. (ii) One of the main economic sectors of this region is agriculture but traditional irrigation practices have transformed without any plans since the availability of tubewell pumping in 2000 and this corresponds to over-abstraction. Owing to the absence of an effective planning system with participation, the situation has given rise to the emergence of ‘the tragedy of the commons’, a concept that is introduced by Hardin (1968). With no effective basin management plans, anthropogenic encroachments have concealed opportunities but distressed water table by pumpage. These collectively increase vulnerability to induce land subsidence by exposing soil characteristics to the risk of collapsing under upper soil load. Subsidence risk in Ardabil plain is now comparable to other anthropogenic risks and the detection of hotspot areas vulnerable to subsidence in this research can be the first step towards their possible management in the future.

Methodology

Basic ALPRIFT framework for subsidence and its measurement

The heuristic capability of the basic ALPRIFT framework (BAF), introduced by Nadiri et al. (2018a), is capable of mapping Subsidence Vulnerability Indices (SVI) of any aquifer/plain system by pooling together seven data layers, which can serve as a proxy of subsidence. These data layers and the mathematical formulation for BAF SVI are detailed by Nadiri et al. (2018a), which may be referred to for their basic information. Their reproduction here and in the “Input data layers” section.

Soil texture and structure of a study area are captured by reducing the overall geomorphological, hydrogeological characteristics of the set of seven data layers, which are assumed to be independent of one another, comprising aquifer media (A), land use (L), pumping of groundwater (P), recharge (R), impacts instigated by aquifer thickness (I), fault distance (F) and water table variation (T). They are further categorised as follows: A, the medium, where subsidence may prevail; LP, these account for anthropogenic contributions to subsidence; RIFT, these account for intrinsic tendencies towards subsidence. The assigned BAF weights and rates are shown in Fig. 4, but for more details, see Nadiri et al. (2018a).

The methodology is captured in the flowchart in Fig. 1, which presents the procedure for assessing SVI through the following steps: (i) discretise the plain under study in GIS into pixels; (ii) set up an array of seven raw data layers for each pixel, each of which comprises their prescribed values; (iii) assign the values of the rates at each pixel; and (iv) assign weights to each data layer. SVI is expressed as follows:

$$ \mathrm{SVI}={\mathrm{A}}_{\mathrm{r}}{\mathrm{A}}_{\mathrm{w}}+{\mathrm{L}}_{\mathrm{r}}{\mathrm{L}}_{\mathrm{w}}+{\mathrm{P}}_{\mathrm{r}}{\mathrm{P}}_{\mathrm{w}}+{\mathrm{R}}_{\mathrm{r}}{\mathrm{R}}_{\mathrm{w}}+{\mathrm{I}}_{\mathrm{r}}{\mathrm{I}}_{\mathrm{w}}+{\mathrm{F}}_{\mathrm{r}}{\mathrm{F}}_{\mathrm{w}}+{\mathrm{T}}_{\mathrm{r}}{\mathrm{T}}_{\mathrm{w}} $$
(1)

where SVI is the Subsidence Vulnerability Index, uppercase acronyms represent the ALPRIFT data layers, subscripts ‘r’ denote rates and ‘w’ denotes weights.

Fig. 1
figure 1

Flowchart illustrating BAF and AIMM (FLs-GEP)

Microwave remote sensing satellite images are also employed in this study for detecting land subsidence based on interferometry technique. For this goal, Sentinel-1A (Interferometric Synthetic Aperture Radar (InSAR)) was obtained for the investigation period from August 2015 to August 2016. The initial InSAR images were acquired from https://scihub.copernicus.eu/, where the site provides open access; the images have a vertical resolution of 10 m, and as such, they are unsuitable for a direct assessment of subsidence on their own. The interferometry approach was applied to drive land subsidence between August 2015 and August 2016. The research implemented interferometry technique, which is discussed in greater detail by Nadiri et al. (2018a). In this context, the methodological scheme for InSAR data processing included four main steps as follows: (i) analysing and computing a set of data points at the observation wells with known geo-referenced locations; (ii) comparing land subsidence at known points and the InSAR image in GIS environment in order to obtain the InSAR pixel values at each known point; (iii) adjusting automatically InSAR values at each pixel by applying the ENVI ground-truth modules through a set of sparsely measured subsidence values; (iv) carrying out ground-truthing and normalising the subsidence data layer (i.e. calculate \( \frac{{\mathrm{SGT}}_{\mathrm{i}}}{{\mathrm{SGT}}_{\mathrm{max}}} \)); (iv) conditioning the output, as:

$$ {\mathrm{CSVI}}_{\mathrm{i}}=\frac{{\mathrm{SGT}}_{\mathrm{i}}}{{\mathrm{SGT}}_{\mathrm{max}}}\times {\mathrm{SVI}}_{\mathrm{max}} $$
(2)

where SGTi = subsidence at grid cell i after ground-truthing (in cm) and SGTmax is its maximum; CSVI is another data layer created for conditioned SVI at pixel i, SVImax is the BAF maximum and hence the term conditioning.

Three models based on fuzzy logic

Three models are developed using fuzzy logic (FL) to learn the weighting values in lieu of prescribed weights in BAF. Models using the fuzzy set theory are a strategy to develop a relationship between input data and output data. Each fuzzy set uses membership function (MF), which may have different types, e.g. S-shape, Gaussian, triangular, sigmoid, trapezoidal and Z-shape and these are decided by trial-and-error. Fuzzy sets use partial membership, in the range from 0 and 1.

The first-generation FL models used prescriptive rules but the paper uses the second-generation approach of clustering techniques, developed since the 1980s, which are capable of automatically identifying the possible structures within the data and identifying optimum rules. Different clustering methods have been developed to identify clusters within the data and their optimum numbers (Chiu 1994, Asadi et al. 2014). Most appropriate clustering methods are subtractive clustering (SC) (Li et al. 2001) and fuzzy C-means (FCM) (see Bezdek 1981). In this study, zero or first-order models were used to learn the weight values from the site data, in association with Sugeno FL (SFL), Mamdani FL (MLF) and Larsen FL (LFL).

The paper uses three approaches to implement models of FL as illustrated in Fig. 1 and outlined as follows. Sugeno FL (SFL) models use (i) constant or linear output membership functions, referred to as zero and first-order SFL, respectively (Sugeno 1985), and (ii) SC to extract fuzzy if-then rules (Fijani et al. 2013; Tayfur et al. 2014; Nadiri et al. 2014, 2015; Moazamnia et al. 2020). In the SC methods, the number of clusters is the same as the number of rules and controlled by the clustering radius, which takes values between 0 and 1.

Mamdani FL (MFL) models use (i) the output membership functions of the ‘MIN’ operation for their fuzzy implications (Mamdani 1976; Nadiri et al. 2019), and (ii) rules are identified by the FCM clustering method (Lee 2004).

Larsen FL (LFL) operates the same as MFL with the exception that the ‘PRODUCT’ operator is used by LFL in the implication operations (Larson et al. 2001). The methodology presented above is illustrated in Fig. 1 by an example with two dependent parameters of A and B, each with triangular MFs. The figure also illustrates its implication and aggregation.

Artificial intelligence running multiple models by GEP

The three FL models of MFL, LFL and SFL are AI models at Level 1, which mark the limit of conventional practices in terms of extracting information from the data. Sufficing to one level of learning by AI technics is the norm but Khatibi et al. (2017) criticise limiting the learning to one level. Nadiri et al. (2013, 2017a, b, c, 2018b)) formulate a strategy of learning at two levels, as follows: The outputs of Level 1 AI models are reused as inputs to yet another AI model, which also uses the same target values as the Level 1 models. The AI model at Level 2 in this study are constructed by the genetic expression programming (GEP) strategy and hence it forms a Level 2 AI strategy to run multiple models (MM), a strategy that the authors refer to as artificial intelligence running multiple models (AIMM), see Fig. 1. The strategy comprises: (i) formulate three supervised SFL, MFL and LFL models for modelling the Subsidence Vulnerability Index (SVI) of a study area using ALPRIFT data layers; (ii) produce output of the three models to reuse; and (iii) construct a GEP model, in which the target values are as those for Level 1 models but its input data are the three output results from the Level 1 SFL, MFL and LFL models, see Fig. 1.

GEP, introduced by Koza (1994), uses ‘parse tree’ structures for identifying optimum solutions, which comprise terminal and function sets. The function comprises the basic operators such as {−, +, ×, ÷, √, sin, log, …} and the terminal set, which forms the real components of the functions and their parameters. Notably, both of these components together emulate the role of chromosomes as in biological systems. The modelling strategy investigates the types of functions through trial-and-error for each system by tree structures.

GEP is used to combine the models at Level 2 for modelling subsidence vulnerability. The modelling procedure in implementing GEP are the following: (i) select the fitness function, (ii) select terminal and function sets to generate the initial set of chromosomes, (iii) select the structure of the chromosomes, (iv) set their linking function and (v) select genetic operators (Ghorbani and Khatibi 2012).

In this research, GEP was implemented by the GeneXproTools 4.0 software. The particulars of the model running in this study include different programs in terms of their sizes and shapes and the use of fixed length and linear chromosomes. AIMM in Fig. 1 is expressed mathematically as follows:

$$ {\mathrm{CSVI}}_{\mathrm{i}}^{\mathrm{FL}}={\mathrm{FL}}_{\mathrm{i}}\left(\mathrm{A},\mathrm{L},\mathrm{P},\mathrm{R},\mathrm{I},\mathrm{F},\mathrm{T}\right) $$
(3)
$$ {\mathrm{CSVI}}_{\mathrm{AIMM}}=\mathrm{GEP}\left({\mathrm{FL}}_1,{\mathrm{FL}}_2,{\mathrm{FL}}_3\right) $$
(4)

where\( {\mathrm{CSVI}}_{\mathrm{i}}^{\mathrm{FL}} \) is the output from each FL model (i = 1: SFL; i = 2: MFL; i = 3: LFL);

Study area

The study area is approx. 990 km2 (Fig. 2) and comprises the Ardabil plain in central parts of the Ardabil province. The aquifer is the outcome of geomorphological processes in the Qarasu (blackwater) basin, which drains three major mountain ranges: (i) Mount Savalan (Sabalan) at its west, which is 4811 m AMSL with a permafrost crater at the peak; (ii) Qushchu mountains ranges at its southern side; and (iii) Baghari (also known as Talesh) mountains at its east. The main tributaries of Qarasu include Balikhli Chay, Qara Chay, Hire Chay, Nemin Chay, Naqiz Chay, Sulaya Chay and Nuran Chay.

Fig. 2
figure 2

Location and geological map of the Ardabil basin

Geological context

Mount Savalan and Talesh mountains of the study area are volcanic and largely of the Eocene period, which is the result of the expansion of the Laramide orogeny phase. Geological formations in Ardabil plain comprise (i) volcanic formations composed of basalt (Cretaceous), andesite and tracyandesite (Quaternary) in the eastern part as the dominant formation of the basin; (ii) limestone formation mostly occurred in the Jurassic period located in the northern part; (iii) marl, sandstone and tuff formations mostly occurred in Neogene and located in the northwest of the area; (iv) conglomerates during the Cretaceous epoch and volcanic conglomerate, tuff conglomerate exposed during Neogene and Quaternary; and (v) traces of recent alluvium related to Quaternary mostly at the north of the study area (Nadiri et al. 2017a).

At the west, fractures are likely to serve as the main source of recharging the alluvial aquifer of Ardabil Plain, which are related to the Igneous and pyroclastic rocks originating from volcanic activity of Mount Savalan. Also, in the western part of Ardabil plain, non-carbonate rocks of Lahar and conglomerate outcrops are likely to feed the aquifer in Ardabil plain, which expands approx. 430 km2. The plain is mainly covered with alluvial deposits of the sand, silt and clay, a composition rife to subsidence risk in an agriculturally well-developed region.

The southern part of the region is exposed to the activity of the main faults running in the northeast to southwest direction belonging to the upper Eocene volcanic activities. The area is largely formed through tertiary tectonic processes and the Savalan volcano in the Quaternary period became the primary feature. The faults in the area include (i) the East Ardabil Fault along the north to south direction parallel to the Astara fault and continues towards the Masouleh fault; (ii) the Talesh fault runs for approx. 400 km along many series largely in the northeast to southwest direction; (iii) Savalan faults are outcomes of volcanic activities but the subsequent lava flows cover fractures and sometimes deep faults, which run in different directions in the vicinity of Mount Savalan, resulting in hot water mineral springs along their paths.

Hydrogeology of study area

The aquifer under the Ardabil plain is unconfined. The larger basin area related to Ardabil plain is approx. 4003 km2, of which 75% are mountainous and 25% are plain. The main aquifer recharge sources at Ardabil plain are as follows: (i) groundwater flows through the joints and fractures at the mountainous upper catchment areas and crosses the aquifer boundary and thereby recharges it; (ii) rainfall directly falling on the plain infiltrates soil and thereby percolates groundwater; and (iii) interactions between the watercourses and the aquifer with net gains and losses vary through time. According to geo-electric studies, changes in alluvium thicknesses at the margins of the plain are often low but gradually increase towards its middle so that higher thicknesses are noted at eastern and the middle parts of the plain (Nadiri et al. 2017a). Groundwater in the aquifer is withdrawn through 3669 deep and semi-deep withdrawal wells, 391 natural springs and 48 qanats (Fig. 3).

Fig. 3
figure 3

Location of abstraction wells in Ardabil plain

Land use

Ardabil is located in the study area and it is a historic city with a population of more than half a million but that at the plain is more than one million. Agriculture in the study area is the main economic activity and the region is well-known for its fertility and high agricultural activities. There are some processing industries around Ardabil including food and drink packing, electronic devices producing, paper producing and plastic tools producing factories. The land use classification is discussed further in the next section.

Dataset preparation

The study area is divided into pixels of dimensions 500 m × 500 m, which leads to generating 3760 number of pixels. The databank of the study area shown in Fig. 3 comprises a provision of 55 observation wells for recording groundwater levels, 34 number of wells with geological logs and 3669 number of abstraction wells. The tedious data preparation involves a set of decisions, which are presented in detail by Nadiri et al. (2018a). This section presents a sufficient amount of detail only to ensure the reproducibility of the models presented in this study.

Some of the basic information about these data layers are reproduced in the section: Input data layers

ALPRIFT with its seven data layers and its associated best practice procedure is detailed by Nadiri et al. (2018a, b, c). The preparation of the data layers is outlined in Fig. 1 and this section presents the main decisions appropriate for this study and specifies data sources.

Aquifer media (A)

It expresses soil texture and structure as follows: (i) use the 34 geological logs available for the study area, including their qualitative accounts of soil composition in terms of clay, sand and gravel (ARWA 2016); (ii) prescribe the rate values as per Fig. 4; (iii) estimated average-rated proportions at each well distribute these values using inverse distance weights (IDW) for the study area. The rate for each class of the A-data layer varies from 5 to 10 (Nadiri et al. 2018a). The rated A-data layer together with their proportions is shown in Fig. 4a.

Fig. 4
figure 4

Raster BAF data layers—rated data; a Aquifer media (weight, 5); b land use (weight, 4); c pumping of groundwater (weight, 4); d recharge (weight, 3); e impacts in terms of thickness (weight, 2); f fault distance (weight, 5); g water table decline (weight, 1). Note: The A, P, I and T data layers use the IDW interpolation technique

Land use (L)

The steps for processing this data layer with 28 m spatial resolution (Iran National Cartographic Centre 2013) is as follows: (i) produce a polygon vector from the raster map, (ii) superimpose the grid on the vector map and (iii) assign the rate values as per Nadiri et al. (2018a, b, c) to each pixel. The rated L-data layer together with their proportions is shown in Fig. 4b.

Pumping of groundwater (P)

The measured monthly discharge data from 3669 available abstraction wells (October 2015–October 2016) obtained from ARWA (2016) used for calculating the Annual discharge values were as follows: (i) the Theisen polygons were drawn for the 34 observation wells; (ii) withdrawal wells were identified within each polygon; (iii) the abstracted groundwater was summed within each polygon and was divided by the area within the polygon; (iv) contours of abstraction area were drawn as per algorithm specified in Fig. 4c; (v) the value of abstracted groundwater at each pixel was taken off using the contour maps; and (vi) the data layer was reclassified into six classes as per P-data layer specified by Nadiri et al. (2018a). Groundwater pumpage ranges from 0.42 to 48.75 cm/yr and the rated P-data layer together with their proportions is shown in Fig. 4c.

Recharge (R)

The data layer for net recharge was prepared as per Scanlon et al. (2002) using three raster layers as follows: (i) use rainfall values at Benis-Sanjan station, provided by ARWA (2016), and its value is 317 mm for the period from October-2015 to October-2016; (ii) use slope values and its values vary in the range of 0–19% in the basin and is less than 2% for plain; and (iii) use soil permeability provided by ARWA (2016), where its values vary from very slow to high. The net recharge map is the outcome of combining these three layers (0.18–44 cm/year) after dividing into three categories as per Nadiri et al. (2018a). The rated R-data layer together with their proportions is shown in Fig. 4d.

Impact of aquifer thickness (I)

Thicknesses of the sediments in the study area were measured by geo-electric surveys. The survey data from 97 wells were used and their depth ranges were calculated to be in the range of 13–213 m. This was divided into six categories as per Nadiri et al. (2018a). The rated I-data layer together with their proportions is shown in Fig. 4e.

Fault distance (F)

The distances between fault lines and every pixel of the study area were calculated by the Euclidean distance method in a GIS environment using the map by the Geological Survey and Mineral Exploration of Iran. The output map was reclassified into six classes as per Nadiri et al. (2018a). The rated F-data layer together with their proportions is shown in Fig. 4f.

Water table decline (T)

The water table (T) declination at 34 observation wells was calculated by subtracting the water table for a 1-year period (October 2015–October 2016) (ARWA 2016). The declined range is from 0.85 to 2.59 m for 2015–2016. The reclassified map uses four classes (Nadiri et al. 2018a). The rated T-data layer together with their proportions is shown in Fig. 4g.

Target data

Figure 5a shows the CSVI data layer but reclassified into four designated bands as per Fig. 4. The data layer serves as target to the three AI models to optimise the ALPRIFT weights and as to the supervised GEP model at Level 2. As the maximum CSVI from FL models is not constrained, AI models may slightly exceed those from BAF. The average values are given in Fig. 5a for the period August 2015–August 2016. The maximum subsidence over the study area is 2.14 cm for the period of August 2015–August 2016 but the evidence from observation wells shows this to be 9 cm for the 10-year period (average of 0.9 cm) based on the sparse observation well. Risk exposures to subsidence are probably accelerating at the study area and this is expected but more data is needed to confirm it.

Fig. 5
figure 5

Subsidence mapping to identify hotspots: a measured subsidence after ground-truthing; b mapping by BAF; c mapping by SFL; d mapping by MFL; e mapping by LFL; f mapping by AIMM (GEP)

Implementation of the framework and three models

The input and output datasets for implementing the SFL, MFL and LFL models require the division of the data into two sets of training and testing phases. Thus, the training phase of the models use randomly selected 80% of the data and the testing phase uses the remaining 20% of the data. Model performances are measured by comparing the modelled values with their corresponding measured values using the following performance metrics: correlation coefficient (r), Nash and Sutcliffe Efficiency coefficient (NSE) and correlation index (CI) Nadiri et al. (2018c), and receiver operating characteristics (ROC) curves together with area under curve (AUC) (Swets 1988).

Correlation coefficient varies from 1 to − 1, at which the measured metric is indicative of a perfect performance but if r is closer to zero, it indicates poor performances. NSE varies in the range of 1 to − ∞, where (i) NSE = 1 reflects a perfect match between the modelled and measured values, (ii) NSE = 0 is a reflection of the cases when information content of the modelled results to be of a similar order of the mean of the observed data and (iii) NSE < 0 is a reflection of the residual errors to be larger than that of the measurements (Nourani et al. 2008). NSE is known to be sensitive to outliers.

The paper uses another performance metric, referred to as correlation index (CI), which employs only those data and results at the observation points. The procedure for calculating CI-values is as follows: (i) the subsidence target values are grouped into four subsidence bands (SB 1, SB 2, SB 3 and SB 4); (ii) the modelled subsidence values are also grouped into four modelled bands (MB 1, MB 2, MB 3 and MB 4); (iii) a matrix is formed and the score for each field in the matrix is counted; (vi) weights of 1, 2, 3 and 4 are assigned for the band differences being 3, 2, 1 and 0, respectively; (v) the weighted scores are summed together.

The minimum SVI by ALPRIFT is 24 and its maximum is 240. However, the results for SFL, MFL, LFL and AIMM are not constrained and therefore they can produce values outside these limits.

Receiver Operating Characteristic (ROC) is often used for spatial goodness-of-fit. It is an analysis tool for radar images and is known as “Signal Detection Theory” to differentiate between the blips from a friendly ship, an enemy target or noise. The signal detection theory is now used in wider modelling practices. The ROC curve accuracy is measured by the Area Under Curve (AUC) and its value of 1 represents a perfect test but of 0.5 reflect existence of strong noise.

Results

Basic ALPRIFT framework

The basic ALPRIFT framework (BAF) vulnerability indexing is only used in this investigation for processing the data layers and benchmarking. GIS is used to process the raw data and Figs. 4 maps their rated values. SVI values were mapped by Eq. (1) through the procedure illustrated in Fig. 1 with prescribed rate and weight values as per Nadiri et al. (2018a). The SVI classification is mapped in Fig. 5b, which also gives the proportion of areas within each band (Band 1 has 3% of subsidence potential; Band 2 has 57% of subsidence potential; Band 3 has 37% of subsidence potential and Band 4 has 5% subsidence potential).

The performance measure for BAF is based on the correlation coefficient and NSE through the comparison of SVI values (Fig. 5b) with measured subsidence (Fig. 5a). It is visually apparent that there are convergences and divergences between the respective results, but this is natural as BAF is not a prediction but identification of potentials. The results are given in Table 1, according to which r and NSE values for BAF are 0.55 and 0.21 respectively, and the values signify that the information extracted from the data layers by BAF is significant but not yet defensible. This is a notable similarity with the authors’ reported studies on the DRASTIC vulnerability indices, see Nadiri et al. (2017a, b, c); as well as with that of BAF (Nadiri et al. 2018a). Thus, the low correlation justifies seeking for further improvement by using the SFL, MLF and LFL models as detailed below.

Table 1 Results of the four models and BAF for the testing phase

Sugeno, Mamdani and Larsen fuzzy logic

The learning strategy at Level 1 in this investigation is the use of SFL, MLF and LFL. The SFL model was implemented by optimising the weights for each ALPRIFT data layer by matching the modelled values with the CSVI values. The subtractive clustering (SC) technique was used by systematically increasing the cluster radii from 0 to 1. The results show that the optimum clustering radius is of 0.4, at which value the RMSE value was minimised and rendered the value of 0.1. The outcome is the generation of 9 membership functions (fuzzy if-then rules). This is then used in the SVI assessment by the SFL model by formulating fuzzy if-then rule i, expressed as follows:

$$ \mathrm{Rule}\ i:\mathrm{If}\left(\genfrac{}{}{0pt}{}{\begin{array}{c}\begin{array}{c}\mathrm{A}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{A}}^{\mathrm{i}}\ \mathrm{and}\\ {}\mathrm{L}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{L}\kern0.5em \mathrm{and}}^{\mathrm{i}}\end{array}\\ {}\mathrm{P}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{P}}^{\mathrm{i}}\ \mathrm{and}\end{array}}{\begin{array}{c}\mathrm{R}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{R}}^{\mathrm{i}}\kern0.5em \mathrm{and}\\ {}\mathrm{I}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{I}}^{\mathrm{i}}\ \mathrm{and}\\ {}\mathrm{F}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{F}}^{\mathrm{i}}\ \mathrm{and}\\ {}\mathrm{T}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{T}}^{\mathrm{i}}\ \mathrm{and}\end{array}}\right)\mathrm{Then}\ {\mathrm{SVI}}_{\mathrm{i}}={m}_{\mathrm{i}}A+{n}_{\mathrm{i}}L+{o}_{\mathrm{i}}P+{p}_{\mathrm{i}}R+{q}_{\mathrm{i}}I+{r}_{\mathrm{i}}F+{s}_{\mathrm{i}}T+t $$
(5)

where \( {\mathrm{MF}}_{\mathrm{A}}^{\mathrm{i}} \) is membership function of the ith cluster of input A,\( {\mathrm{MF}}_{\mathrm{L}}^{\mathrm{i}} \) is membership function of the ith cluster of input L and so forth. “AND” is a fuzzy operator, which connects the various MFs together as per rule. SVIi is the output of rule i and the linear function of the ith cluster of output, and mi, ni, oi, pi, qi ri, si and t are constant coefficient values.

The aggregation process comprises the output (SVI) as the weighted average of all outputs:

$$ \mathrm{SVI}=\frac{\sum_{\mathrm{i}}{w}_{\mathrm{i}}{\mathrm{SVI}}_{\mathrm{i}}}{\sum_{\mathrm{i}}{w}_{\mathrm{i}}} $$
(6)

where wi is the firing strength of rule i, assessed via the “and” operator.

Table 1 includes the presentation of the SFL modelling results in terms of r, NSE and RMSE, where these values for the training phase are 0.95, 0.9 and 7.23, and for the testing phase they are 0.86, 0.93 and 7.99, respectively. The results show that SFL produces significant improvements over that of BAF. The CSVI mapping of Ardabil plain from the SFL model is shown in Fig. 5c.

MFL and LFL models were implemented by optimising the BAF weights to match CSVI values. The fuzzy C-mean (FCM) clustering method was adopted to both extract the clusters and fuzzy rules. The optimum number of if-then rules was identified by trial-and-error to be equal to 16, in which the number of if-then rules is the same for the input and output data clusters. A fuzzy if-then rule i is formulated for mapping SVI by both MFL and LFL models and expressed as:

$$ \mathrm{Rule}\ i:\mathrm{If}\left(\genfrac{}{}{0pt}{}{\begin{array}{c}\begin{array}{c}\mathrm{A}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{A}}^{\mathrm{i}}\ \mathrm{and}\\ {}\mathrm{L}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{L}\kern0.5em \mathrm{and}}^{\mathrm{i}}\end{array}\\ {}\mathrm{P}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{P}}^{\mathrm{i}}\ \mathrm{and}\end{array}}{\begin{array}{c}\mathrm{R}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{R}}^{\mathrm{i}}\kern0.5em \mathrm{and}\\ {}\mathrm{I}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{I}}^{\mathrm{i}}\ \mathrm{and}\\ {}\mathrm{F}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{F}}^{\mathrm{i}}\ \mathrm{and}\\ {}\mathrm{T}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\mathrm{T}}^{\mathrm{i}}\ \mathrm{and}\end{array}}\right)\mathrm{Then}\ {\hat{\mathrm{CSVI}}}_{\mathrm{j}}\ \mathrm{belongs}\ \mathrm{to}\ {\mathrm{MF}}_{\hat{{\mathrm{CSVI}}_{\mathrm{j}}}}^{\mathrm{i}} $$
(7)

where \( {\hat{\mathrm{CSVI}}}_{\mathrm{j}} \) is the output; \( {\mathrm{MF}}_{\hat{{\mathrm{CSVI}}_{\mathrm{j}}}}^{\mathrm{i}} \) corresponds to the membership function of the output of rule i. Table 1 includes the presentation of the MFL and LFL results; the CSVI mapping of the plain obtained from the MFL model is shown in Fig. 5d and that from LFL is shown in Fig. 5e.

Comparing BAF, SFL, MLF and LFL performances

Performance measures are calculated by comparing the SVI results with the target values by BAF, SFL, MLF and LFL. The results for processing r, NSE and RMSE use all the pixel data but for correlation index (CI), only the values at the observation wells are used. Table 1 lays out the procedure for calculating CI-values together with their values, according to which the SFL strategy improves on the performance of BAF.

Measurements at the observation wells directly assess the performances and hence used to calculate CI-values but r, NSE and RMSE values are calculated using CSVI values from all the grid points. Table 1 shows that r value for BAF is 0.55 but based on statistical tests at 95% level of significance, the signal in this typical result is significant, even though its value is low. Thus, the correlation between the modelled values and CSVI values is significant; and for SFL MFL and LFL, the better performing model is SFL compared with MFL and LFL in terms of all of the performance measures. The authors do not intend to rank the models at Level 1, as they are not the ends but the means to an end at Level 2. Although SFL improves the performance metrics compared with BAF to 0.9 from 0.21, notably NSE value of 0.9 indicates that there is room for further improvements and this is archived by GEP at Level 2, as below.

AIMM by gene expression programming

Gene expression programming (GEP) is used to implement the AIMM strategy by learning from the models at Level 1 (SFL, MFL and LFL) for the assessment of SVIs. Its inputs comprise the results of SFL, MFL and LFL models and its targets comprise distributed CSVIs. The GEP modelling requires the selection of the appropriate basic operators to build its parse tree and some of the investigated ones are given in Table 2.

Table 2 Implementation of GEP: basic functions for parse tree, default parameters and results

Model runs test the performances of F1 and F2 functions, defined in Table 2, by minimising their RMSE. The RMSE values are also presented in Table 2, together with default values of model runs. The performance of the F1 function is selected in terms of RMSE and it is preferred for being parsimonious. The outputs of the GEP model are SVI values at all of the pixels. The equation of GEP model to assess the subsidence vulnerability indices is as follows:

$$ \mathrm{CSVIAIMM}=\left(\frac{\mathrm{MLF}}{\left(\left(\left(-6.6\times \mathrm{LLF}\right)-\mathrm{SLF}\right)\times \left(\mathrm{SLF}+\mathrm{MLF}\right)\right)}\times \mathrm{LLF}\right)+\left(\left(\left(\mathrm{SFL}-\left(\mathrm{MFL}-1.3\right)+\mathrm{LFL}\right)\right)\times \mathrm{MFL}\right)+\left(\left(\left(\mathrm{MFL}+\mathrm{SFL}\right)-\left(\frac{-1.4}{8.9}\right)\right)\times \left(\left(\mathrm{LFL}-\mathrm{SFL}\right)-\mathrm{MFL}\right)\times \mathrm{MFL}\right) $$
(8)

where SFL, MFL and LFL in the above equation are terminal set (actual parameters).

The vulnerability mapping by AIMM is given in Fig. 5f and its performance measures in terms of r, NSE and RMSE are given in Table 1, in which the modelled CSVI values are compared with their corresponding conditioned values of measurements as defined in the “Implementation of the framework and three models” section. The improvements in the performance metrics of GEP compared with those at Level 1 provide evidence that the AIMM model driven by GEP is capable of learning from SFL, MFL and LFL models as well as from the target values. The performance of the AIMM model is further tested by AOC/RUC, which renders a ROC value of 0.964. Also, the variation of sensitivity against specificity is shown in Fig. 6, according to which the signal captured by GEP is strongly defensible.

Fig. 6
figure 6

Evaluation of AIMM performance based on AUC criterion

Inter-comparison of the mapping results

A visual inter-comparison of the mapped results in Fig. 5a–e is indicative of a greater convergence between those at Fig. 5a and Fig. 5f but divergences are also observed, as follows: (i) as per Fig. 5a, there is a highly vulnerable small patch of area of Band 4 (0.37% of the study area) in the middle of the study area but none of the mapping results (Figs. 5b–f) are able to match its severity; this probably indicates that the models are not sensitive to small local variations. (ii) The southern Band 3 in Fig. 5a (22% of the study area) is identified as Band 4 (less than 28% of the study area) by BAF, see Fig. 5b; Band 4 (23% of the study area) by SFL, see Fig. 5c; Band 4 (19% of the study area) by MFL, see Fig. 5d; Band 4 (19% of the study area) by LFL, see Fig. 5e; and Band 4 (19% of the study area) by AIMM, see Fig. 5f. This divergence between the mapped results and observed values may be explained as the ability to anticipate deteriorating future conditions and the necessity for more conservative planning decisions in the future.

Discussion

Topical research on impacts of land subsidence is wide and existing activities are reflected by (Mateos et al. 2017; Yin et al. 2016; Hayashi et al. 2009; Stramondo et al. 2007) among others. These studies focus on managing impacts of land subsidence, mitigation measures and recovery of impacted lands. Land subsidence causes degradation as the loss of utility in various forms including damage to physical, social, cultural or economic functions and can undermine ecosystem diversity. Subsidence is often triggered by over-abstraction of groundwater resources during water shortage or drought episodes. Remediation methods include recharging aquifers using surface water from alternative sources if feasible, developing basin and drought management plans. In the last few decades, sustainable drainage systems (SuDS) have also become viable techniques to recharge groundwater aquifers. The BAF is an approach introduced by Nadiri et al. (2018a), which is suitable for planning as a management tool by drawing up a map of hotspots in Ardabil plain, as given in Fig. 5b. However, the response of the authorities responsible for planning water usages in Iran remains to be seen. Without a planned approach, international experience shows that it is unlikely to resolve the conflict between achieving equitable water allocations and mitigating the decline in groundwater levels. Conversely, without a planned approach, permanent damage is almost assured.

Nadiri et al. (2018a) sufficed to using SFL alone to learn from local data to set the rating and weighting of the ALPRIFT data layers, even though the NSE metric was found to be 0.74. As they did not explore the application of the AIMM strategy, this is taken up in this study but for a different case study. The AIMM strategy is open to diverse selection of models both at Levels 1 and 2 of learning by AI. The paper investigates the use of multiple models of fuzzy logic at Level 1 and GEP at Level 2.

Although there may be room for further learning and testing different strategies, currently, the authors are focussed on testing the application of ALPRIFT to different study areas, during which the impacts of different selections of AI techniques using the AIMM strategy will be explored. The results presented in Fig. 5 show that the improvements by extracting further information make the results defensible. The results presented in the paper also confirms the authors’ overall conclusion that AI techniques and two levels of learning are better placed to identify the weight values of a framework than using their prescribed values.

The classification of the land use data layer in the paper remains the same as that reported by Nadiri et al. (2018a). However, the authors are anticipating revising this data layer by considering many internationally available land-use classification schemes.

Proof-of-concept for the AIMM strategy is supported by the results which are presented in the research but it adds further to the defensibility of ALPRIFT to enhance its Technology Readiness Level, as defined by the NASA classification (see: https://www.nasa.gov/sites/default/files/trl.png). Therefore, the results reported by the paper take the TRL from Level 4 higher but further studies are needed to ensure that it is at TRL 9. The authors’ have formulated a program for this purpose until a working tool emerges for the service of practitioners. Further investigations are underway for Salmas plain in the west Azerbaijan province adjacent to the Lake Urmia basin. The authors are also considering various techniques to transform a vulnerability framework into a risk framework.

Conclusion

ALPRIFT is a consensual integration of the following data layers: aquifer media (A), land use (L), pumping of groundwater (P), recharge (R), impacts instigated by aquifer thickness (I), fault distance (F) and water table variation (T). Each data layer is processed and assigned with appropriate rating values to account for local variations and weights to account for the relative importance of each ALPRIFT data layer. A further data layer is also developed based on ground-truthing of the data downloaded from Sentinel-1 satellites.

The ALPRIFT framework is investigated by applying it to Ardabil plain, in the Ardabil province, northwest Iran, where the groundwater level decline is approx. 16 m during 2005–2015 period. This is a strong telltale sign for subsidence, as evidenced by measured settlements at 34 observation wells. The paper identifies hotspots by a modelling strategy at two levels of learning as follows: (i) without a strategy, the mapping results by using the basic ALPRIFT framework are found to be fit-for-purpose in terms of performance metrics. (ii) The learning at Level 1 is based on identifying the values of weights by the strategy of using three fuzzy logic (FL) models, and their results in terms of performance metrics show improvements toward the defensibility of the mapping results but with room for improvements. (iii) The learning at Level 2 is based on genetic expression programming (GEP), which reuses multiple models at Level 1 as inputs together with target values (Sentinel-1 data with ground-truthing), and their results in terms of a set of performance metrics show the mapping results to be defensible. The results produce an enhancement for the Technology Readiness Level of the ALPRIFT framework.