Introduction

Groundwater is one of the most precious freshwater resources in many countries, especially in arid and semiarid regions (Neshat et al. 2014). Groundwater quality, especially in unconfined aquifers, is sensitive to surface contamination caused by agricultural and industrial activities (Al-Hanbali and Kondoh 2008; Javadi et al. 2011a, b). There is a clear and urgent need to treat contaminated groundwater and, furthermore, to delineate potentially vulnerable areas and apply effective policies to prevent groundwater from being polluted. However, traditional methods such as field investigations at regional scales are often ineffective, time-consuming, and costly. Consequently, the assessment of groundwater vulnerability has been the subject of intensive research, and a variety of methods have been developed (Neukum and Azzam 2009), such as the DRASTIC model (Aller et al. 1987).

The DRASTIC model is widely used to estimate groundwater vulnerability (Babiker et al. 2005; Chitsazan and Akhtari 2008; Huan et al. 2012; Sener et al. 2009; Yin et al. 2012). This method is gradually becoming a standardized approach for the assessment of groundwater vulnerability because it is easy to apply at regional scales with Geographic Information Systems (GIS). Seven hydrogeological factors, namely depth to groundwater, net recharge, aquifer media, soil media, topography, impact of vadose zone, and hydraulic conductivity, are used to evaluate the vulnerability index. As a relative measure, the vulnerability index is dimensionless. Higher values indicate greater vulnerability to contamination.

Generally, the ratings and weightings of the seven features used in the DRASTIC model are assigned fixed values, as shown in Table 1, without considering the regional hydrogeological conditions, which makes the DRASTIC model an easy target for criticism. Moreover, this technique disregards data on human activities, such as land use. Thus, many researchers have modified the traditional model to further develop this method (Al-Hanbali and Kondoh 2008; Hernández-Espriú et al. 2014; Neshat et al. 2013, 2014; Secunda et al. 1998; Sener and Davraz 2012; Thirumalaivasan et al. 2003).

Table 1 Original DRASTIC weightings and ratings (Aller et al. 1987)

A notable modification is the application of the analytic hierarchy process (AHP) to determine the weightings and/or ratings for the parameters used in the DRASTIC model (Thirumalaivasan et al. 2003). AHP is a powerful technique for multicriteria decision-making. A set of pairwise comparison matrices (PCMs) is used to obtain the weights with regard to the impact of the decision and subdecision criteria (Yalcin et al. 2011). Accordingly, by using the AHP technique, the ratings and/or weightings of geodata layers are optimized by transferring knowledge acquired from experts to a model.

While the AHP method is practically useful, subjectivity from experts may distort the results of estimation. Genetic algorithms (GAs) are among the most prominent and widely used adaptive optimization search approaches, based on a direct analogy to Darwinian natural selection and genetics in biological systems (Holland 1975). This method shows a strong ability in global optimization. Rahimi et al. (2014) applied GAs to flood spreading site selection in the Gareh Bygone plain, Iran. Giacobbo et al. (2002) demonstrated the feasibility of using GAs for estimating the parameters of a groundwater contaminant transport model. Zhang et al. (2014b) employed GAs to optimize the initial weights determined by artificial neural networks (ANN) to assess the occurrence of earth fissures in the Su-Xi-Chang land subsidence area.

Jianghan Plain, located in the middle reaches of the Yangtze River, is a major farming area in central China. Recent studies indicate that groundwater is subject to pollution in many areas due to different anthropogenic activities, such as extensive fertilizer use and industrialization (Duan et al. 2015; Zhou et al. 2012). Given the abovementioned considerations, this study proposes a new methodology that modifies the DRASTIC model by integrating land use, AHP and GAs methods, and then evaluates the groundwater vulnerability in Jianghan Plain.

The study area

Location and climate

Jianghan Plain, situated between latitude 29.4°–31.3°N and longitude 111.4°–114.5°E, is an alluvial plain deposited by the Yangtze and Han rivers. It is located in the central and southern regions of Hubei Province (Fig. 1). It is well known as the “national base of fish and rice” in China, and agriculture is the main human activity in most of the study area. Under a north subtropical monsoonal climate, the annual average temperature is 15–17 °C. The annual average precipitation is 1269 mm, increasing from 800 mm in the northwest to 1500 mm in the southeast, 30–50% of which falls in summer.

Fig. 1 Location map and digital elevation model of study area

Topography

Jianghan Plain is a semiclosed basin with a higher elevation in the northwest and a lower elevation in the southeast (Fig. 1). The middle region is a low alluvial plain with elevations ranging from 20 m in the southeast to 40 m in the northwest. The outlying areas mostly consist of two-level terraces with elevations of 40–80 m and 80–120 m, respectively. The outer area of the terraces is primarily composed of craggy terrain and low hills above 120 m. The slope ranges between 0° and 22° with an average value of 2°–3° in the central flat plain (Zhou et al. 2012).

Geology

The geological map of the study area was prepared based on lithological units, recent sediments, and landform (Fig. 2). The central plain is composed of Holocene alluvial–lacustrine and upper Pleistocene sediments. The thickness of the alluvial–lacustrine sediments varies greatly between regions, from 100–200 m in the center to 20–100 m in the outlying areas. The outlying hills, which have undergone erosion, are formed of porous and fissured media of mid Pleistocene, lower Pleistocene, and Pliocene sediments. The lithology of Jianghan Plain is dominated by Quaternary lacustrine sediments, which are mainly composed of sandy clay, clayey sand and clay.

Fig. 2 Geological map of study area

Hydrogeology

The hydrogeological characteristics are closely related to the topography. The hilly areas primarily consist of aquitards, while the central areas consist of unconsolidated water-bearing sediment layers. These aquitards and unconsolidated layers constitute a complete hydrogeological unit. Based on the geological and hydrogeological settings, the aquifer system can be subdivided into an unconfined aquifer and a confined aquifer (Gan et al. 2014). The unconfined aquifer is mainly composed of clay, clayey silt, and sandy silt. The depth to groundwater is usually 0.5–3.5 m below ground surface. There are two types of confined aquifer: the confined aquifer in Quaternary porous media and the confined aquifer in porous-fissured media of Pliocene and lower Pleistocene sediments. The former is mainly distributed in the lower plain, while the latter occurs in the hilly areas.

Materials and methods

Data collection

Table 2 presents the data used in this study. The features are obtained from different sources to establish thematic layers.

Table 2 Data and its sources used in this study

Depth to groundwater (D)

Depth to groundwater represents the depth from the surface to the groundwater table. It determines the thickness of the unsaturated zone through which the infiltrating water must travel before reaching the aquifer. The depth to groundwater was measured in August 2014 in 786 wells in the study area. In addition, available data from 43 monitoring wells, in which the groundwater level (depth to groundwater) is measured every five days, were collected for August 2014. The minimum depth to groundwater occurs around August, and fertilizers are extensively used in this month (Zhang et al. 2014a). Therefore, this month is selected to represent the worst-case scenario. A log-normal distribution of the depth to groundwater is confirmed by the Kolmogorov–Smirnov test performed in SPSS 20.0 software. Kriging interpolation is then employed to estimate the distribution of depth to groundwater (Fig. 3a).

Fig. 3 Geodata layers involving: a Depth to groundwater. b Net recharge. c Aquifer media. d Soil media. e Topography. f Impact of vadose zone. g Hydraulic conductivity. h Land use

Net recharge (R)

Net recharge is the amount of water that penetrates into the aquifer from the surface. It is the principal vehicle that transports contaminants vertically to the groundwater and horizontally within the aquifer. Net recharge results not only from rainfall and irrigation return flow, but also from recharge by rivers and lakes (Guan et al. 2016). However, the current study focuses on the vulnerability to pollution from the surface, particularly from human activities. Therefore, recharge from rivers and lakes is not considered in this study. The net recharge map is obtained from the rainfall and the irrigation return flow data and calculated by:

$${\text{Net}}\;{\text{recharge}} = {\text{rainfall}} \times {\text{recharge}}\;{\text{rate}} + {\text{irrigation}}\;{\text{return}}\;{\text{flow}}$$
(1)

The rainfall map is obtained by interpolating a 20-year mean of annual precipitation (mm/year) from seven representative rainfall stations (Wuhan, Jingzhou, Jingmen, Qianjiang, Tianmen, Xiantao, and Xiaogan). The recharge rate and irrigation return flow maps are obtained from the Wuhan Center of China Geological Survey. The net recharge layer is presented in Fig. 3b.

Aquifer media (A)

Aquifer media is an important factor controlling groundwater flow path. A weakly permeable aquifer with relatively low recharge rates may result in a low vulnerability, while a highly permeable aquifer with great recharge potential may lead to a high vulnerability. Classification of this parameter is based on a report obtained from Hubei Institute of Hydrogeology and Engineering Geology (Chen et al. 1985) (Fig. 3c).

Soil media (S)

Soil media is identified as the uppermost part of the vadose zone, and it influences the infiltration of contaminants (Baalousha 2010). The soil media map is derived from previous studies (Zhao et al. 2007) and presented in Fig. 3d.

Topography (T)

Topography in the DRASTIC model refers to the slope of the land surface and its variation. It controls the runoff and determines the residence time of pollutants after precipitation. In addition, the topography has a significant effect on groundwater flow rate and flow direction (Al-Hanbali and Kondoh 2008). The topography of the study area is categorized into five groups according to the digital elevation model (DEM) provided by the Chinese Academy of Sciences, using the “Slope” tool in ArcGIS 10.0 software (Fig. 3e).

Impact of vadose zone (I)

The vadose zone, also referred to as the unsaturated zone, lies between the land surface and the top of the phreatic zone. It controls the length or duration of the hydrologic flow path and thus the delivery of contaminants. Based on data from 1358 boreholes collected in the study area, the impact of vadose zone layer is generated as shown in Fig. 3f. The drilling logs show a large complexity and variety of vadose media, which indicates a multilayer vadose system in Jianghan Plain. The thickest medium above the groundwater level in each drilling log is taken as the vadose material. Thiessen polygons are then constructed so that the vadose medium identified at each borehole is assigned to the surrounding area.

Hydraulic conductivity (C)

The hydraulic conductivity reflects the ability of aquifer materials to transmit water, which in turn controls the rate at which contaminants will flow with groundwater under a given hydraulic gradient (Neshat et al. 2014). The hydraulic conductivity distribution map is generated using the pumping test results of the study area. Kriging interpolation algorithm is applied to interpolate the hydraulic conductivity and to create the raster layer as shown in Fig. 3g.

Land use (L)

Landsat Thematic Mapper (TM) satellite images of August 2014 are used to produce the land use map. TM satellite images are widely applied for land use classification owing to their relatively high spatial resolution (Liu et al. 2014). A supervised classification method with the maximum likelihood technique is employed to classify the land use into six classes: urban, agriculture, water, bare land, grass, and forest, as illustrated in Fig. 3h.

Nitrate measurements

The nitrate concentration (expressed as mg/L NO3–N) associated with groundwater was measured in August 2014 from 97 wells distributed in the study area (Fig. 4). Outliers are detected using the Z-scores (Haddad et al. 2015). The Z-scores of the observations could be calculated as:

$$Z_{i} = \frac{{x_{i} - \bar{x}}}{s},\quad {\text{with}}\;s = \sqrt {\frac{1}{n - 1}\sum\limits_{i = 1}^{n} {\left( {x_{i} - \bar{x}} \right)^{2} } }$$
(2)

where \(x_{i}\) is the observation value, \(\bar{x}\) is the mean, and s is the standard deviation. The nitrate concentration values obtained by the laboratory are considered acceptable for Z-scores between −2 and 2 (Chakravarty et al. 2013). The distribution of Z-scores and the corresponding nitrate concentration for the groundwater samples is shown in Fig. 5. As can be seen, two samples are rejected because their Z-scores fell outside the acceptable limit of ±2.
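The screening in Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function names and the sample concentrations are hypothetical, and the ±2 limit is the one quoted above.

```python
import math


def z_scores(values):
    """Z-score of each observation, using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / s for x in values]


def reject_outliers(values, limit=2.0):
    """Keep only the observations whose |Z| does not exceed the limit."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) <= limit]


# Hypothetical NO3-N concentrations in mg/L (not data from the study).
nitrate = [1.2, 0.8, 2.5, 1.9, 3.1, 0.5, 2.2, 1.4, 1.1, 25.0]
kept = reject_outliers(nitrate)
```

In this made-up sample only the value 25.0 lies more than two standard deviations from the mean, so it is the only one dropped.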

Fig. 4 Nitrate sampling locations in the study area

Fig. 5 Z-scores and NO3–N concentration for groundwater samples

Description of the proposed methodology

Figure 6 illustrates the methodological flowchart. A combination of land use, AHP and GAs methods is proposed to modify the DRASTIC model to assess the groundwater vulnerability. Detailed steps are described as follows:

Fig. 6 Flowchart of proposed methodology

  1. The conventional DRASTIC method is applied (DRASTIC model).

  2. The land use map is overlaid on the resultant DRASTIC model (DRASTICL model).

  3. PCMs are used to develop the decision hierarchy structures and determine the priorities of ranges for parameters. New ratings are determined by the AHP method (AHP–DRASTICL model).

  4. A GAs program is developed to re-ascertain the weightings of the eight factors (DRASTICL–GAs model).

  5. AHP and GAs methods are coupled to generate a new technique (AHP–GAs model).

  6. Statistical analyses including Pearson’s correlation coefficient and the analysis of variance (ANOVA) F statistic are adopted to verify the vulnerability mapping by taking nitrate as the typical pollutant.

  7. The single-parameter sensitivity analysis is performed to analyze the “effective weights” of the parameters against their theoretical weights.

Groundwater vulnerability assessment

DRASTIC model

DRASTIC model involves the following seven hydrogeological features: D (Depth to groundwater), R (Net Recharge), A (Aquifer media), S (Soil media), T (Topography), I (Impact of vadose zone), and C (Hydraulic conductivity). The model calculates an intrinsic vulnerability index with different weighting factors based on the following equation:

$${\text{DRASTIC}}\;{\text{Index}} = \sum\limits_{i = 1}^{7} {\left( {W_{i} \times R_{i} } \right)}$$
(3)

where \(W_{i}\) is the weighting for parameter i with an associated rating of \(R_{i}\). Each parameter is rated from 1 to 10, with a rating of 10 indicating the highest pollution potential, and each parameter is weighted from 1 to 5 using a Delphi (consensus) method to express its importance relative to the others. The ratings and weightings of Aller et al. (1987) are shown in Table 1.
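As a worked illustration of Eq. (3), the sketch below computes the index for a single raster cell. The weightings are the values commonly cited from Aller et al. (1987) and are assumed here to match Table 1, which is not reproduced in this text. The function name and the example ratings are hypothetical.

```python
# Weightings commonly cited from Aller et al. (1987); assumed to match Table 1.
DRASTIC_WEIGHTS = {"D": 5, "R": 4, "A": 3, "S": 2, "T": 1, "I": 5, "C": 3}


def drastic_index(ratings, weights=DRASTIC_WEIGHTS):
    """Weighted sum of the seven parameter ratings, as in Eq. (3)."""
    for name, rating in ratings.items():
        if not 1 <= rating <= 10:
            raise ValueError("rating for %s must lie between 1 and 10" % name)
    return sum(weights[p] * ratings[p] for p in weights)


# Hypothetical ratings for one cell (not taken from the study area).
cell = {"D": 9, "R": 8, "A": 5, "S": 3, "T": 10, "I": 3, "C": 1}
index = drastic_index(cell)  # 45 + 32 + 15 + 6 + 10 + 15 + 3 = 126
```

With these weightings the index can range from 23 (all ratings equal to 1) to 230 (all ratings equal to 10).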

Development of DRASTIC model

Incorporating the land use to DRASTIC model

Land use is the primary factor causing habitat degradation and poor water quality. It can affect the groundwater quality and increase the pollution risk, especially in agricultural areas. For this reason, it should be considered and incorporated into the DRASTIC vulnerability index when evaluating the groundwater vulnerability. The new model is named DRASTICL, an acronym of the eight factors. By converting the land use map into a raster image and overlaying it on the DRASTIC index, the results of the DRASTICL model are expressed by the following equation:

$${\text{DRASTICL}}\;{\text{Index}} = {\text{DRASTIC}}\;{\text{Index}} + L_{\text{w}} \times L_{\text{r}}$$
(4)

where \(L_{\text{w}}\) and \(L_{\text{r}}\) correspond to the weighting and rating of the land use factor, respectively. The land use factor is given a weighting of 5, due to the potential impact of this parameter on groundwater. Detailed ratings are given in Table 3, as illustrated in previous studies (Shirazi et al. 2013).
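Eq. (4) is a single additive step, sketched below for one cell. The weighting of 5 is taken from the text. The function name, the DRASTIC index value and the land use rating are hypothetical, since the ratings of Table 3 are not reproduced here.

```python
LAND_USE_WEIGHT = 5  # weighting assigned to the land use factor in the text


def drasticl_index(drastic_value, land_use_rating, land_use_weight=LAND_USE_WEIGHT):
    """Add the weighted land use rating to an existing DRASTIC index, as in Eq. (4)."""
    return drastic_value + land_use_weight * land_use_rating


# Hypothetical cell: DRASTIC index of 126 and a land use rating of 8.
value = drasticl_index(126, 8)  # 126 + 5 * 8 = 166
```

Because the added term is never negative, the DRASTICL index of a cell is at least as large as its DRASTIC index.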

Table 3 Description and rating for each land use class

Correcting the ratings using AHP method

AHP, developed by Saaty (1980), is often used to solve strategic decision problems. This method is based on the construction of a series of PCMs. By ranking the importance of the criteria, a set of PCMs is established to calculate the relative weights. Each PCM is a square matrix in which each element ranges from 1/9 (absolute unimportance) to 9 (absolute importance), as described in Table 4. Priorities are then computed by normalizing each column of the matrix to derive the normalized principal right eigenvector, that is, the relative weight vector, which satisfies the following equation:

$$A \times w = \lambda_{ \hbox{max} } \times w$$
(5)

where A is the PCM; w is the principal eigenvector; \(\lambda_{ \hbox{max} }\) is the largest eigenvalue of matrix A. The consistency index (CI) is calculated to determine the quality of the result of the AHP. The formula for the CI is as follows:

$${\text{CI}} = \frac{{\lambda_{ \hbox{max} } - n}}{n - 1}$$
(6)

where n is the dimension of matrix A. The consistency ratio (CR) is then obtained from CI by:

$${\text{CR}} = \frac{\text{CI}}{\text{RI}}$$
(7)

where RI is the random index (Lin and Yang 1996). As a general rule, the judgment matrix is acceptable if CR ≤ 0.1. In contrast, if CR is greater than 0.1, the evaluation procedures have to be reviewed and reconsidered.
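Eqs. (5) to (7) can be illustrated with a short Python sketch. It approximates the principal eigenvector by power iteration, which is one common numerical route and not necessarily the one used in this study. The RI values are the ones commonly tabulated from Saaty and are an assumption here (the value 1.24 for n = 6 agrees with the land use example reported in the text). The function names and the 3 × 3 matrix are hypothetical.

```python
# Random index values commonly tabulated from Saaty (assumed, not from the source).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def ahp_priorities(pcm, iterations=100):
    """Approximate the normalized principal eigenvector and lambda_max of a PCM."""
    n = len(pcm)
    w = [1.0 / n] * n
    for _ in range(iterations):
        aw = matvec(pcm, w)
        total = sum(aw)
        w = [v / total for v in aw]  # renormalize so the weights sum to 1
    aw = matvec(pcm, w)
    lambda_max = sum(a / b for a, b in zip(aw, w)) / n
    return w, lambda_max


def consistency_ratio(lambda_max, n):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1); 0 for n < 3."""
    if n < 3:
        return 0.0
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical 3 x 3 PCM: class a versus b = 3, a versus c = 5, b versus c = 3.
pcm = [[1.0, 3.0, 5.0],
       [1.0 / 3.0, 1.0, 3.0],
       [1.0 / 5.0, 1.0 / 3.0, 1.0]]
weights, lambda_max = ahp_priorities(pcm)
cr = consistency_ratio(lambda_max, len(pcm))
accepted = cr <= 0.1
```

For a perfectly consistent matrix, lambda_max equals n and CR is zero; the further the expert judgments depart from consistency, the larger both become.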

Table 4 Scale for PCMs used in AHP (Saaty 1980)

In this study, the AHP method is employed to correct the initial ratings used in the DRASTICL model. To improve the accuracy and objectivity of decision-making, three hydrogeologists were invited to fill in the PCMs to evaluate the ratings of the eight parameters. The experts were asked to specify their judgments of the relative importance of each class through questions such as “with respect to parameter x, how much more important is class a than class b?”. Not only normal comparison values, that is, 1, 2,…, 9 and 1/2, 1/3,…, 1/9, are used to construct the PCMs, but also non-normal comparison values such as 1.1, 2.3,…, 9.6 and 1/1.1, 1/2.3,…, 1/9.6 are used when the experts have conflicting opinions. Table 5 reports an example of the inputs of the PCM used to derive the optimized ratings for the land use parameter. For this matrix, RI = 1.24 and CR = 0.06 < 0.10, which indicates that the judgment matrix passes the consistency check. The weights produced by the AHP method range from 0.04 to 0.33, where 0.04 denotes the least susceptible class and 0.33 denotes the most susceptible class. The same assessments are applied to compare alternatives with respect to the other parameters. The final alternative ratings derived using the AHP method are shown in Table 6.

Table 5 PCM for land use parameter

Table 6 Alternative ratings optimized by AHP method

Optimizing the weightings using GAs method

GAs are iterative methods based on the process of genetic selection and natural elimination in biological evolution. A GA handles a population of possible solutions to an optimization problem using techniques inspired by natural evolution, such as selection, crossover, and mutation. It is a useful tool for search and optimization problems. Candidate solutions are retained and ranked at each iteration according to their fitness. A fitness function is used to remove the unqualified solutions. The algorithm stops when the termination condition is met. In GAs, the design variable is a string of numbers, usually called a chromosome, which represents the solution. The most common encoding is a binary string of 1s and 0s. The length of the string depends on the required accuracy. In this study, Python 2.6 is used to implement the GAs program.

The goal of this procedure is to find the optimal weightings of the eight parameters, which maximize the objective function, namely Pearson’s correlation coefficient between the vulnerability index and the nitrate concentration. The algorithm uses a population of 100 chromosomes, each of which is an 80-bit string (each 10-bit substring encoding one of the eight weighting values), and 200 generation steps. Table 7 presents the GAs features used in this study. The GAs stop once the maximum number of generations has been reached or if the population’s best fitness has not changed for 50 iterations.
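The procedure described above can be sketched as a compact binary GA. The population size, chromosome layout, generation limit and stall limit follow the text. Everything else is an assumption made for illustration, because Table 7 is not reproduced here: tournament selection, single-point crossover, bit-flip mutation, elitism, the crossover and mutation rates, and the decoding of each 10-bit substring to a weighting between 1 and 5. All names are hypothetical, and the code targets Python 3 rather than the Python 2.6 used in the study.

```python
import math
import random

N_PARAMS, BITS = 8, 10    # eight weightings, 10 bits each (80-bit chromosome)
W_MIN, W_MAX = 1.0, 5.0   # assumed decoding range of a weighting


def decode(chromosome):
    """Map each 10-bit slice of the chromosome to a weighting in [W_MIN, W_MAX]."""
    weights = []
    for k in range(N_PARAMS):
        bits = chromosome[k * BITS:(k + 1) * BITS]
        value = int("".join(str(b) for b in bits), 2)
        weights.append(W_MIN + (W_MAX - W_MIN) * value / (2 ** BITS - 1))
    return weights


def pearson(x, y):
    """Pearson's correlation coefficient; 0.0 if either series is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def fitness(chromosome, ratings, nitrate):
    """Correlation between the weighted index at each well and its nitrate value."""
    weights = decode(chromosome)
    index = [sum(w * r for w, r in zip(weights, row)) for row in ratings]
    return pearson(index, nitrate)


def run_ga(ratings, nitrate, pop_size=100, generations=200, patience=50,
           crossover_rate=0.8, mutation_rate=0.01, seed=0):
    """Return (best weightings, best fitness); the rates are assumed values."""
    rng = random.Random(seed)
    length = N_PARAMS * BITS
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best, best_fit, stale = None, -2.0, 0
    for _ in range(generations):
        scored = sorted(((fitness(c, ratings, nitrate), c) for c in population),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best, stale = scored[0][0], scored[0][1][:], 0
        else:
            stale += 1
            if stale >= patience:   # best fitness unchanged for `patience` iterations
                break
        next_pop = [scored[0][1][:]]  # elitism: carry the best chromosome over
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]  # tournament
            p2 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            child = p1[:]
            if rng.random() < crossover_rate:
                cut = rng.randint(1, length - 1)                    # single-point crossover
                child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        population = next_pop
    return decode(best), best_fit
```

Here `ratings` is expected to hold one row of eight parameter ratings per sampled well and `nitrate` the matching concentrations. A call such as `run_ga(ratings, nitrate)` would then use the population size and generation limits quoted in the text.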

Table 7 GAs features used in this study

Combining AHP and GAs methods

AHP method is proposed to correct the ratings of eight parameters, while GAs method is proposed to produce an optimal weighting combination. Additionally, AHP and GAs methods are combined to generate a new technique. The eight factors are reclassified according to the modified ratings optimized by AHP method. Then the GAs program is executed again to find new solutions by using the same features as shown in Table 7.

Normalization of vulnerability index

The vulnerability indices obtained from these models span different ranges; for example, the DRASTICL index is always higher than the DRASTIC index because the land use term is added directly. Since the groundwater vulnerability index is a relative measure, a unity-based normalization is employed for presenting the vulnerability maps:

$${\text{Normalized}}\;{\text{Index}} = \frac{{V - V_{ \hbox{min} } }}{{V_{ \hbox{max} } - V_{ \hbox{min} } }}$$
(8)

where V is the vulnerability index, \(V_{\min}\) is the minimum vulnerability index, and \(V_{\max}\) is the maximum vulnerability index. The normalized vulnerability index, ranging from 0 (lowest potential for contamination) to 1 (highest potential for contamination), is then divided into four groups of equal interval: very low (<0.25), low (0.25–0.50), high (0.50–0.75), and very high (0.75–1). This is helpful in comparing the vulnerability maps produced by different models.
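A minimal sketch of Eq. (8) and the four equal-interval classes follows. The function names and sample values are hypothetical. Assigning a value that falls exactly on a class boundary to the upper class is an assumption, since the text does not state how boundaries are handled.

```python
def normalize(values):
    """Unity-based (min-max) normalization of a list of index values, as in Eq. (8)."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("all index values are equal; normalization is undefined")
    return [(v - v_min) / (v_max - v_min) for v in values]


def vulnerability_class(normalized_value):
    """Equal-interval class; boundary values go to the upper class (assumed)."""
    if normalized_value < 0.25:
        return "very low"
    if normalized_value < 0.50:
        return "low"
    if normalized_value < 0.75:
        return "high"
    return "very high"


# Hypothetical index values for three cells.
normalized = normalize([60, 100, 140])   # [0.0, 0.5, 1.0]
classes = [vulnerability_class(v) for v in normalized]
```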

Model validation

Nitrate, one of the major contaminants from anthropogenic activities, is not usually found in natural groundwater (Baalousha 2010). Because this contaminant poses a high risk to human health and mainly comes from nitrogen fertilizers, which are frequently used in the agricultural areas of the study area to enhance crop production, the nitrate concentration is selected as an indicator of initial contamination. Statistical analyses including Pearson’s correlation coefficient and the ANOVA F statistic are performed to validate the vulnerability results of the different models.

Pearson’s correlation

Pearson’s correlation analysis is used to check the degree of association between the vulnerability index and the nitrate concentration. The sampling locations map is overlaid on the vulnerability index maps, and the corresponding value for each point is extracted within the GIS environment. The attribute file of the sampling locations is updated with the data from the vulnerability index map based on the spatial relationship between the features. Pearson’s correlation coefficient is then calculated by:

$$\rho = \frac{{{\text{cov}}({\mathbf{X}},{\mathbf{Y}})}}{{\sigma_{{\mathbf{X}}} \sigma_{{\mathbf{Y}}} }}$$
(9)

where ρ is the correlation coefficient, cov is the covariance, and σ is the standard deviation.
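For a sample of n wells, Eq. (9) is commonly evaluated in its sample form. The notation below is assumed for illustration and is not taken from the source: \(x_{i}\) is the vulnerability index extracted at well i, \(y_{i}\) is the measured nitrate concentration at that well, and r is the sample estimate of ρ.

$$r = \frac{\sum\limits_{i = 1}^{n} {\left( {x_{i} - \bar{x}} \right)\left( {y_{i} - \bar{y}} \right)} }{{\sqrt {\sum\limits_{i = 1}^{n} {\left( {x_{i} - \bar{x}} \right)^{2} } } \sqrt {\sum\limits_{i = 1}^{n} {\left( {y_{i} - \bar{y}} \right)^{2} } } }}$$

The coefficient lies between −1 and 1. The percentages reported for the models appear to correspond to this coefficient multiplied by 100.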

ANOVA F statistic

The F statistic is the standard test statistic in ANOVA for comparing the means of samples from different levels. Larger values of F lead to rejection of the null hypothesis that the means are equal (Montgomery 2008). The larger the ANOVA F statistic, the less overlap there is between the nitrate values in different vulnerability classes (Huan et al. 2012). The ANOVA F statistic is calculated by:

$$F = \frac{{{\text{SST}}/\left( {k - 1} \right)}}{{{\text{SSE}}/\left( {n - k} \right)}}$$
(10)

where SST and SSE are, respectively, the sum of squares for treatment and the sum of squares for error, and k − 1 and n − k are, respectively, the degrees of freedom for treatment and for error.
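Eq. (10) can be computed directly from the nitrate values grouped by vulnerability class. The sketch below is a plain one-way ANOVA written from the definitions above. The function name and the sample groups are hypothetical and are not data from the study.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of values)."""
    k = len(groups)                              # number of classes (treatments)
    n = sum(len(g) for g in groups)              # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # SST: between-class (treatment) sum of squares, as the term is used in Eq. (10).
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # SSE: within-class (error) sum of squares.
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (sst / (k - 1)) / (sse / (n - k))


# Hypothetical nitrate values (mg/L) in three vulnerability classes.
f_value = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])  # SST = 26, SSE = 6, F = 13
```

Well-separated class means relative to the scatter within each class give a large F, which is the sense in which a larger F indicates less overlap between classes.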

Sensitivity analysis

Subjectivity is inevitable in the selection of ratings and weightings in all parametric techniques, and it can strongly affect the vulnerability assessment. Sensitivity analysis provides valuable information on the contribution of the ratings and weightings of the input parameters and helps the analyst to judge the significance of subjective elements. There are two types of sensitivity analysis: the map removal sensitivity analysis and the single-parameter sensitivity analysis (Gogu and Dassargues 2000). In this study, single-parameter sensitivity analysis is conducted to evaluate the “effective weight” of each parameter.

Single-parameter sensitivity analysis checks the spatial significance of the parameters in the index computation. The effective weight could be defined as follows:

$$W_{{{\text{p}}i}} = \frac{{P_{{{\text{R}}i}} \times P_{{{\text{W}}i}} }}{{V_{i} }} \times 100\%$$
(11)

where \(W_{{\text{p}}i}\) is the effective weight for each unique condition subarea i, \(V_{i}\) is the vulnerability index, and \(P_{{\text{R}}i}\) and \(P_{{\text{W}}i}\) are the rating and weighting values of parameter P assigned to subarea i. A “unique condition subarea” is one or more polygons in the vulnerability map with a unique combination of rating values of the factors used to compute the vulnerability index (Napolitano and Fabbri 1996).
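The following sketch applies Eq. (11) to one subarea and sets the result against the theoretical weights. The weightings are the commonly cited DRASTIC values, assumed here to match Table 1. The function names and the ratings are hypothetical.

```python
# Commonly cited DRASTIC weightings (assumed to match Table 1).
WEIGHTS = {"D": 5, "R": 4, "A": 3, "S": 2, "T": 1, "I": 5, "C": 3}


def effective_weights(ratings, weights=WEIGHTS):
    """Effective weight (%) of each parameter in one subarea, as in Eq. (11)."""
    index = sum(weights[p] * ratings[p] for p in weights)
    return {p: weights[p] * ratings[p] / index * 100.0 for p in weights}


def theoretical_weights(weights=WEIGHTS):
    """Theoretical weight (%) of each parameter: its share of the summed weightings."""
    total = sum(weights.values())
    return {p: w / total * 100.0 for p, w in weights.items()}


# Hypothetical ratings for one unique condition subarea.
subarea = {"D": 9, "R": 8, "A": 5, "S": 3, "T": 10, "I": 3, "C": 1}
effective = effective_weights(subarea)
theoretical = theoretical_weights()
```

In this made-up subarea the depth to groundwater contributes 45 of the 126 index points, so its effective weight is well above its theoretical share of 5/23.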

Results and discussion

Application of DRASTIC model

The layers for the seven DRASTIC parameters are constructed in raster format with a pixel size of 30 m and are briefly discussed below.

The maximum depth to groundwater is found in the northwest of the study area. Depth generally increases from southeast to northwest. According to Aller et al. (1987), ratings of 10, 9, 7, and 5 are assigned to the corresponding four classes, respectively.

The net recharge, calculated by Eq. (1), varies between 92.8 and 326.8 mm/year. Net recharge in 40% of study area is greater than 254 mm/year. DRASTIC standard ratings of 3, 6, 8, 9 are assigned to the corresponding four classes.

As for the aquifer media of the study area, most parts are underlain by clay. Deposits of silt are mainly located in the middle region of the study area. Deposits of sand and gravel account for a very low percentage of the study area. The higher the permeability of the aquifer, the greater the potential for pollution. In this study, a rating of 8 is assigned to sand and gravel, 5 to silt, and 3 to clay.

Based on previous studies, the soil media mainly consist of clay, silty clay, silt, and sand. Silt and clay are found in most parts of the study area. Sand, with its high permeability, is assigned a rating of 9 following the DRASTIC classification. A rating of 3 is assigned to silt, 2 to silty clay, and 1 to clay.

The topography is divided into five classes according to Aller et al. (1987), and ratings from 1 to 10 are assigned to the different classes.

The vadose zone of the study area is subdivided into five groups: clay, mild clay, silty clay, silt, and sand and gravel, with ratings of 1, 2, 3, 4, and 5, respectively.

The hydraulic conductivity of the study area is smaller than 4.1 m/d. According to Aller et al. (1987), a rating of 1 would be assigned to all subareas. However, a uniform rating is uninformative because the spatial variation of this factor would not be reflected. A simple modification of the rating values of hydraulic conductivity is therefore proposed: 1 for conductivity <1 m/d, 3 for conductivity 1–2 m/d, 5 for conductivity 2–3 m/d, and 7 for conductivity 3–4 m/d.

Figure 7a and Fig. 8a present the normalized DRASTIC vulnerability index and the corresponding percentages, respectively. By construction, the normalized vulnerability index ranges from 0 to 1 for the whole plain. The DRASTIC results illustrate that 3.98% of the total area has “very high” vulnerability, 45.09% has “high” vulnerability, and more than half (50.93%) has “low” or “very low” vulnerability.

Fig. 7 Vulnerability maps: a DRASTIC. b DRASTICL. c AHP–DRASTICL. d DRASTICL–GAs. e AHP–GAs

Fig. 8 Percentage of vulnerability areas: a DRASTIC. b DRASTICL. c AHP–DRASTICL. d DRASTICL–GAs. e AHP–GAs

Development of DRASTIC model

DRASTICL model

Figure 7b shows the normalized vulnerability map for DRASTICL model. The difference between the DRASTICL and DRASTIC results is obvious. Figure 8b illustrates that 8.80% of the area is under a “very high” contamination risk, and 57.73% is covered by “high” class of vulnerability to pollution. The “low” and “very low” classes are at 27.66 and 5.81%, respectively.

AHP–DRASTICL model

Figure 7c presents the normalized vulnerability map for AHP–DRASTICL model. Figure 8c indicates that the percentage for “very high” and “high” classes is at 34.87%, while the percentage for “low” and “very low” classes is at 65.13%.

DRASTICL–GAs model

By taking Pearson’s correlation coefficient between the vulnerability index and the nitrate concentration as the objective function and the weightings of the eight parameters as the optimal variables, the GAs stopped when the highest value of fitness function reached 58.17%. After decoding the corresponding weighting values, the vulnerability mapping procedure was carried out. Figure 7d and Fig. 8d illustrate the vulnerability map and percentages of the areas for DRASTICL–GAs model, respectively. As shown in Fig. 8d, the percentages for “very high” and “high” classes are at 24.05 and 55.53%, respectively. 4.76% of the total area is covered with “very low” class of vulnerability to pollution, and 15.66% is covered with “low” class.

AHP–GAs model

The GAs stopped when the highest value of the fitness function reached 75.31%. The results of the AHP–GAs model indicate that 7.11% of the study area belongs to the “very high” class. The percentage for the “high” class is 60.16%. The “low” and “very low” classes are 26.81 and 5.92%, respectively (Fig. 8e). The vulnerability map illustrates that the “high” and “very high” vulnerability zones mostly occur in the central Jianghan Plain, particularly along the Yangtze River (Fig. 7e). These areas are mainly farmlands where fertilizers are extensively used. In addition, the depth to groundwater is shallower in these areas, which results in a higher groundwater pollution risk.

Validation of the models

Table 8 reports the validation results of the five models. The mean nitrate concentration values associated with the four classes are calculated for different models. As can be seen, the mean values increase from low vulnerability to high vulnerability. ANOVA F statistic shows that there is a markedly significant difference across vulnerability classes for all models (P < 0.01). In other words, the null hypothesis of equal means should be clearly rejected and there is a significant difference in means between different classes. High nitrate concentration could be observed in the high groundwater vulnerability areas and vice versa. The results demonstrate that the vulnerability assessment is an efficient approach to evaluate the groundwater vulnerability.

Table 8 Validation of groundwater vulnerability mapping

In most cases, correlating the vulnerability results with actual pollution occurrence is the technique used to validate the accuracy of groundwater vulnerability mapping (Huan et al. 2012). It is important to note that a good performance on only a single contaminant may involve uncertainties. However, considering the extensive agricultural activities and the immoderate use of nitrogen fertilizers in the study area, the nitrate concentration is taken as the primary indicator to validate the accuracy of the vulnerability mapping. Pearson’s correlation value is calculated at 41.07% for the original DRASTIC model. This value is relatively low because the original model does not consider the regional hydrogeological conditions or the land use factor. The DRASTIC model therefore needs to be improved to reflect the actual groundwater vulnerability. The correlation coefficient increases to 55.60% when the land use map is introduced. The results indicate that the land use factor has a significant effect on the groundwater vulnerability. Fertilizers used in agriculture, industrial activities, septic tanks, and sewer systems within urban areas are important sources of nitrate contamination, and these hazards increase the groundwater pollution risk. A correlation value of 58.05% is found for the AHP–DRASTICL model, and a correlation value of 59.83% is obtained for the DRASTICL–GAs model. These observations lead to the conclusion that the vulnerability map can be made more realistic by optimizing both the ratings and the weightings. Pearson’s correlation value is calculated at 75.31% for the AHP–GAs model, indicating a strong correlation between the vulnerability index and the measured nitrate concentration. Generally, the correlation coefficient increases considerably after each modification, and it improves by more than 30 percentage points (from 41.07 to 75.31%) compared with the original model.

Furthermore, when the land use factor is incorporated, the ANOVA F statistic increases to 12.47, which is twice as large as that obtained by the original DRASTIC model. The AHP–GAs model shows the largest ANOVA F statistic of 13.10, which indicates that the least overlap exists between the nitrate concentration values in different vulnerability classes. Statistically, it can be argued that the modifications proposed in this study are effective and that the AHP–GAs model may be the most suitable model for the study area.

Single-parameter sensitivity analysis

A total of 9892 unique condition subareas are used to calculate the “effective weight” of each parameter according to Eq. (11). The statistical analysis of the results is shown in Table 9. In Jianghan Plain, the results reveal that the depth to groundwater has the greatest influence on the vulnerability in terms of both theoretical and effective weights. The results confirm that land use, with a mean effective weight of 19.86%, is a parameter that strongly affects the vulnerability index. The most unexpected result in Table 9 is that the net recharge has a much higher effective weight, with an average value of 23.87%, than its theoretical weight of 17.60%. The topography seems to be of low importance for groundwater vulnerability since it has the lowest effective weight. The significance of depth to groundwater, net recharge, and land use highlights the importance of obtaining accurate, detailed, and representative information about these factors.
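The comparison of theoretical and effective weights can be sketched as follows. The sketch assumes that Eq. (11) takes the form commonly used in single-parameter sensitivity analysis, W = (Pr × Pw / V) × 100, where Pr and Pw are the rating and weight of a parameter and V is the vulnerability index of the subarea. The weights and ratings below are hypothetical placeholders and are not the optimized AHP–GAs values.

```python
PARAMS = ["D", "R", "A", "S", "T", "I", "C", "L"]

# Placeholder weights (assumed for illustration only).
WEIGHTS = [5, 4, 3, 2, 1, 5, 3, 5]

# Hypothetical ratings of the eight parameters in three unique condition subareas.
SUBAREAS = [
    [9, 8, 6, 5, 10, 6, 4, 9],
    [5, 6, 4, 4, 9, 4, 2, 5],
    [2, 3, 3, 2, 3, 2, 1, 2],
]


def effective_weights(ratings, weights):
    """Effective weight (%) of each parameter in one subarea: Pr * Pw / V * 100."""
    index = sum(r * w for r, w in zip(ratings, weights))  # vulnerability index V
    return [100.0 * r * w / index for r, w in zip(ratings, weights)]


# Theoretical weight (%): share of each weight in the sum of all weights.
theoretical = [100.0 * w / sum(WEIGHTS) for w in WEIGHTS]

# Effective weights per subarea, then the mean over all subareas.
per_subarea = [effective_weights(r, WEIGHTS) for r in SUBAREAS]
mean_effective = [sum(col) / len(col) for col in zip(*per_subarea)]

for name, t, e in zip(PARAMS, theoretical, mean_effective):
    print(f"{name}: theoretical {t:.2f}%, mean effective {e:.2f}%")
```

Within each subarea the effective weights sum to 100%, so a parameter whose mean effective weight exceeds its theoretical weight contributes more to the index than its assigned weight alone would suggest.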

Table 9 Statistical analysis of the parameter sensitivity of unique condition subareas for AHP–GAs model

Conclusion

The DRASTIC model has been widely applied to evaluate the contamination risk of an aquifer. However, it may not be appropriate for accurate assessment of groundwater vulnerability because it ignores the effect of local hydrogeological conditions. The present study developed a new DRASTIC-based model, together with new techniques to optimize the ratings and weightings of the selected parameters, and applied it to assess the groundwater vulnerability in Jianghan Plain, China. The results showed that the modified model offered a large improvement over the conventional DRASTIC method. Of the study area, 7.11% has “very high” pollution potential, 60.16% has “high” pollution potential, 26.81% has “low” pollution potential, and the remainder (5.92%) has “very low” risk of groundwater pollution. The depth to groundwater was identified as the most significant factor affecting the groundwater vulnerability, with a mean effective weight of 39.28%, followed by net recharge (23.81%) and land use (19.86%). This highlights the importance of obtaining accurate, detailed, and representative information about these factors for a more efficient interpretation of the vulnerability index. Detailed and frequent monitoring should be carried out to observe the changing level of pollutants in groundwater. Further investigations are also required to understand the relationship between the groundwater vulnerability and other contaminants.

In conclusion, the proposed method can be considered a promising tool, used in coordination with the original DRASTIC method, for groundwater vulnerability assessment in other study areas.