Introduction

Groundwater of sufficient quantity and quality is often available within the subsurface of large geological systems. However, the uncontrolled or unsustainable exploitation of groundwater in combination with anthropogenic activities on the land surface—especially if linked with inadequate changes in land use—can lead to a rapid and severe deterioration of groundwater quality (Bouchaou et al. 2008; Valle Junior et al. 2014; Erostate et al. 2018; Zendehbad et al. 2019; Boumaiza et al. 2020a). To determine the sensitivity of groundwater to potential anthropogenic contamination, resource managers often rely on aquifer vulnerability assessment. This evaluation is essential for implementing effective groundwater management strategies and for raising public awareness about the risk of groundwater contamination. Aquifer vulnerability is related to potential contamination pathways or any other pressures between the groundwater source and potential receptors (Foster et al. 2013). This vulnerability may be divided into two categories: (1) intrinsic, which considers the physical properties of the aquifer system, i.e., intrinsic geological and hydrogeological characteristics, independent of the nature of the contaminants (Gogu and Dassargues 2000); and (2) specific, which is dependent on contaminant properties, i.e., physical and biogeochemical attenuation processes, and the physical properties of the aquifer system (Doerfliger et al. 1999). Various methods for assessing aquifer vulnerability have been developed, and these methods have been reviewed in multiple papers (Gogu and Dassargues 2000; Shirazi et al. 2012; Kumar et al. 2015; Wachniew et al. 2016; Iván and Mádl-Szőnyi 2017; Machiwal et al. 2018). Machiwal et al. (2018), in a comprehensive review, identified three types of aquifer vulnerability assessment: (1) index-based methods, (2) statistical-based methods, and (3) process-based methods. 
Index-based methods can be further separated into two groups, depending on their applicability to either granular or karst aquifers. The commonly used granular porous aquifer index-based methods include DRASTIC (Aller et al. 1987), GOD (Foster 1987), AVI (Van Stempvoort et al. 1993), SINTACS (Civita and De Maio 2004), ISIS (Gogu and Dassargues 2000), and SEEPAGE (Moore and John 1990). Index-based methods incorporate various factors that are related to the characteristics of contaminant transport through the unsaturated and saturated zones.

The ratings and relative weights of factors, used to assess the aquifer vulnerability, are subjective and have been modified for different case studies; such modifications also included adding and/or ignoring some factors. DRASTIC is one of the most widely used index-based approaches for assessing aquifer vulnerability (e.g., Fritch et al. 2000; Ibe et al. 2001; Baalousha 2006; Saibi and Ehara 2008; Awawdeh and Jaradat 2010; Brindha and Elango 2015; Sadiki et al. 2018). The DRASTIC factors are depth to water table (D), recharge (R), aquifer media (A), soil media (S), topography (T), the impact of the vadose zone (I), and hydraulic conductivity (C) (Aller et al. 1987). The use or interpretation of these factors varies markedly among case studies. For example, Zhou et al. (2010) proposed an adapted DRAV index in which they removed the topography factor (T) and replaced soil type (S) and hydraulic conductivity (C) by a vadose zone lithology factor (V). This DRAV index was considered more adapted to arid regions characterized by limited runoff. Liggett and Allen (2011) modified the DRASTIC factor ratings to account for a site-specific lithology; the soil drainage was incorporated into the soil media factor (S) and topography (T). These modifications provided a more detailed map of aquifer vulnerability to contamination for a given study aquifer than when using the original DRASTIC index. The DRASTIC factors have also been adjusted for specific settings. For example, Wang et al. (2007) developed DRAMIC (a DRASTIC-derived index) for use in urban settings. In DRAMIC, the soil type factor (S) and topography (T) are substituted by the factor (M), which considers aquifer thickness. They also replaced the hydraulic conductivity factor by contaminant impact, denoted by the letter (C). 
Justification for the use of this adapted index includes (1) cities often being built on relatively flat areas, thereby reducing the importance of topography and (2) the concrete ground-surface covering in urban areas that often limits the available information about the characteristics of the underlying soil. Singh et al. (2015) introduced an anthropic factor (A) to the original DRASTIC index to incorporate the anthropogenic influence in urbanized environments. Their adapted index, DRASTICA, assigned a high weight to the added anthropic factor (A) because they assumed its effect on vulnerability was similar to that of the water-depth and vadose zone material factors. The factors and weightings used in a DRASTIC-based vulnerability assessment can also be varied to take into account the effects of land use on groundwater contamination. For example, Panagopoulos et al. (2006) used the correlation coefficient of each DRASTIC factor with the nitrate concentration in groundwater to evaluate the ratings and weights of all DRASTIC factors. They observed, similar to other studies (Rosen 1994; McLay et al. 2001), that the factors of hydraulic conductivity (C) and soil type (S) had no influence on nitrate concentrations in groundwater. Hence, they removed the hydraulic conductivity and soil type factors from the DRASTIC index and incorporated land use. Ruopu et al. (2014) also deemed land use to be relevant when assessing aquifer vulnerability and proposed the DRASTIL index (where L refers to land use). Other researchers have also added land use as a factor to the original DRASTIC index (Al-Hanbali and Kondoh 2008; Heiß et al. 2020). Chenini et al. (2015) studied the vulnerability of aquifers to contamination by assuming that only factors related to the vadose zone are involved in vertical contaminant transport. They therefore adopted the DRIST index, derived from DRASTIC by eliminating aquifer type (A) and hydraulic conductivity (C).
Guo et al. (2007) numerically evaluated the rating values and weights of DRASTIC and other related factors. They identified soil type (S), topography (T), and the impact of the vadose zone (I) as factors that could be ignored, whereas they found other factors, including the ratio of cumulative thickness of clay layers to the total thickness of the vadose zone and the contaminant adsorption coefficient of sediment in the vadose zone, to be relevant when assessing aquifer vulnerability to contamination. Other DRASTIC-derived alternatives are vulnerability indices better adapted to specific contaminants, e.g., a modified DRASTIC for pesticide contamination and a modified DRASTIC specific to nitrate in aquifers, for which factor weights were modified from those in the original DRASTIC index (Huan et al. 2012; Neshat et al. 2014; Saha and Alam 2014; Fusco et al. 2020). Several studies have relied on the establishment of a linear correlation between nitrate concentration and vulnerability maps to validate the approach used to assess aquifer vulnerability (e.g., Panagopoulos et al. 2006; Kazakis and Voudouris 2015; Arauzo 2017; Shrestha et al. 2017). Pacheco et al. (2018) demonstrated the poor applicability of such a correlation as a validation process, because the dynamics of nitrate within aquifers can be dominated by lateral flow, a process assumed to be negligible in vulnerability assessment methods.

Modifications of factors and weights for the DRASTIC-derived indices can be justified given that the assessed vulnerability based on the original DRASTIC index is often considered to be unsatisfactory (Al-Zabet 2002; Pacheco and Sanches Fernandes 2013; Pacheco et al. 2015). A variety of statistical techniques, ranging from simple linear regression to more complex approaches, have been used to describe the importance of a factor relative to the others. Some studies (e.g., Javadi et al. 2011) are based on correlation analysis between factors and nitrate concentrations in groundwater. Others have used single-parameter sensitivity analysis based on subarea conditions that can be identified by GIS (e.g., Hasiniaina et al. 2010). The Analytic Hierarchy Process assessment method has also been used in many studies (e.g., Sener and Davraz 2013). This technique normalizes the weights assigned to factors using the eigenvector technique, which reduces the subjectivity involved in the initially assigned weights. Fuzzy logic methods are also increasingly applied in vulnerability assessments using the DRASTIC index (e.g., Pathak and Hiratsuka 2011; Rezaei et al. 2013). This method can be used to cope with vaguely defined classes or categories by making it possible to define the “membership degree” of an element in a set by means of a membership function. Pacheco and Sanches Fernandes (2013) used a multivariate statistical method (called Correspondence Analysis) in which the rationale for the adjustment of factor weights is the minimization of redundancy between factors.
Despite efforts to improve the DRASTIC index and make it more adaptable to the particularities of the studied regions, no attempt has yet been made to develop an operational methodology that evaluates the potential effect of land use on overall groundwater quality, thus making it possible to select the relevant factors for assessing the intrinsic aquifer vulnerability and to determine their relative importance. This challenge is accentuated when groundwater samples are collected under variable aquifer conditions (i.e., various hydraulic conductivity levels, various soil types, etc.). Barbulescu (2020) underlined the necessity of an informed selection of relevant factors when assessing aquifer vulnerability, the proper validation of factor weighting, and the value ranges allocated to factor categories. Consequently, this study aims to develop a reliable methodology for (1) selecting the relevant factors when assessing the intrinsic aquifer vulnerability and (2) determining the relative importance of the selected factors. Our proposed methodology has been developed using a large data set of porous granular aquifers, which integrates, in addition to some DRASTIC factors, data related to groundwater quality and land use. If the DRASTIC method is used to assess aquifer vulnerability to contamination for a given region, our approach serves as an initial step to determine the relevant factors related to the aquifer in question, while the determined relative importance of these factors is used to validate factor weighting. This study does not validate the original DRASTIC ratings and ranges of the factor categories (Aller et al. 1987); thus, developing a comprehensive adapted vulnerability index for the study region, as well as mapping vulnerability, lies beyond the scope of this study.

Data sources

We have based our study on a regional-scale data set acquired during the Quebec government’s hydrogeology characterization program—Programme d’acquisition de connaissances sur les eaux souterraines (PACES)—undertaken in the Saguenay-Lac-Saint-Jean (SLSJ) region of Quebec, Canada (13,200 km2) (Fig. 1). A major output of the PACES-SLSJ Project was the development of a comprehensive groundwater data set generated through (1) the cataloging and digitizing of existing relevant information related to regional groundwater; (2) a regional groundwater sampling campaign; and (3) the application of a quality control process to screen the data for accuracy and quality (CERM-PACES 2013). This hydrogeological data set has been used, among others, to build 3D-subsurface hydro-structural models (Chesnaux et al. 2011; Hudon-Gagnon et al. 2015; Foulon et al. 2018); understand the chemical evolution of regional groundwater systems (Walter et al. 2017, 2018, 2019); quantify regional groundwater recharge and water transit times (Chesnaux 2013; Huet et al. 2016; Chesnaux and Stumpp 2018; Boumaiza et al. 2020b, c; Labrecque et al. 2020); characterize the internal architecture of granular aquifers (Boumaiza et al. 2015, 2017, 2019a); perform more realistic analyses of heterogeneous—non purely Theissian—flow systems (Ferroud et al. 2018, 2019); and to identify field evidence of hydraulic connections between bedrock aquifers and the overlying granular aquifers (Richard et al. 2014, 2016a, b). These granular systems in the SLSJ region were deposited following the last deglacial episode, some 11,800 years ago when the SLSJ lowlands were invaded by the Laflamme Sea. The regional SLSJ graben physiography is marked by large accumulations of Quaternary deposits (sand, gravel, and clay-silt) to a thickness of 180 m in the central SLSJ lowlands (Dionne and Laverdière 1969; Lasalle and Tremblay 1978). 
In this study, we used groundwater samples collected as part of the PACES-SLSJ project from the granular unconfined aquifers (Fig. 1).

Fig. 1 Location of the study area and the groundwater sampling network sites used to generate the data

Description of the developed operational methodology

Our methodology for identifying the relevant factors and their relative importance for assessing aquifer vulnerability is summarized in Fig. 2. We detail the ten methodological steps of Fig. 2 in the following subsections.

Fig. 2 Successive steps of the developed operational methodology. The confirmations YES and NO indicate where a preceding step has been either completed (YES) or remains to be completed (NO)

Step 1: Selecting an aquifer vulnerability index method

We selected the DRASTIC index for applying our proposed methodology because of the available information related to DRASTIC factors. Nonetheless, our evaluation process is limited to only five DRASTIC factors: the depth of the water table from the ground surface (D); the average annual recharge (R); the dominant aquifer soil type (S); the surface topography expressed as terrain-slope (T); and the average hydraulic conductivity of the aquifer (C). Aquifer type (A) is excluded because our study is restricted to unconfined granular aquifers. Likewise, the impact of the vadose zone (I) was not considered in the present study because the vadose zones of all studied aquifers are dominated by granular material. Information related to the considered factors at the groundwater sampling sites is compiled in Supplementary data (Appendix 1). The methods used for estimating these factors are described in CERM-PACES (2013).

Step 2: Preparing the data set, including groundwater quality and land use

Assessing groundwater quality

We evaluated groundwater quality at the sampling sites using the water quality index (WQI) (Horton 1965). The chemical concentrations of Ca2+, Mg2+, Na+, HCO3−, SO42−, Cl−, F−, NO3−, Fe, Mn, and Zn in 98 groundwater samples were considered (Supplementary data, Appendix 1). These chemical parameters were selected because their concentrations in groundwater are rarely below the detection limit. We combined these chemical parameters with total dissolved solids (TDS) and pH to evaluate the WQI of each sample for drinking purposes (Horton 1965). To compute WQI, we first calculated the relative weight of each considered chemical parameter using Eq. 1.

$$W_i = \frac{w_i}{\sum_{i=1}^{n} w_i},$$
(1)

where Wi is the relative weight of the chemical parameter, wi is the weight attributed to the chemical parameter (Table 1), and n is the number of chemical parameters. We assigned each chemical parameter a value for wi from 1 (least effect on water quality) to 5 (greatest effect on water quality) on the basis of its perceived health effects and its relative importance for drinking water quality. We assigned the highest weight of 5 to NO3 because this parameter has important health effects, whereas the lower weights of 1 and 2 were assigned to Zn, Ca2+, Mg2+, and Na+ due to their minimal importance in water quality assessments (Ramakrishnaiah et al. 2009; Şener et al. 2017; Sethy et al. 2017).

Table 1 Data used to calculate the relative weights of the chemical parameters

We then calculated the quality rating scale for each chemical parameter using Eq. 2, where qi is the quality rating scale, Ci is the measured concentration of the chemical parameter (mg/L) in the groundwater sample, and Si is the drinking water standard concentration (mg/L) for each chemical parameter (Table 1). Finally, we calculated the sub-index (SIi) for each chemical parameter using Eq. 3, and from this we determined WQI as the sum of the SIi values (Eq. 4) (Horton 1965). Thus,

$$q_i = \left(\frac{C_i}{S_i}\right) \times 100,$$
(2)
$$\text{SI}_i = W_i \cdot q_i,$$
(3)
$$\text{WQI} = \sum_{i=1}^{n} \text{SI}_i.$$
(4)
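
As an illustration of Eqs. 1–4, the following minimal Python sketch computes a WQI for a single sample. The weights, standards, and concentrations in it are hypothetical placeholders and are not the values of Table 1; the function names `relative_weights` and `wqi` are likewise introduced here only for illustration.

```python
# Hypothetical weights (w_i) and drinking-water standards (S_i, mg/L).
# Placeholders for illustration only, NOT the values of Table 1.
PARAMS = {
    "NO3": {"w": 5, "S": 45.0},
    "Cl": {"w": 3, "S": 250.0},
    "Na": {"w": 2, "S": 200.0},
}


def relative_weights(params):
    """Eq. 1: W_i = w_i / (sum of all w_i)."""
    total = sum(p["w"] for p in params.values())
    return {name: p["w"] / total for name, p in params.items()}


def wqi(concentrations, params):
    """Eqs. 2-4: WQI = sum over i of W_i * (C_i / S_i) * 100."""
    weights = relative_weights(params)
    return sum(
        weights[name] * (concentrations[name] / params[name]["S"]) * 100.0
        for name in params
    )


# Made-up concentrations (mg/L) for one groundwater sample.
sample = {"NO3": 9.0, "Cl": 25.0, "Na": 20.0}
print(round(wqi(sample, PARAMS), 2))
```

The relative weights sum to 1 by construction, so the WQI is a weighted average of the quality ratings qi.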

The evolution of groundwater chemistry within geological systems is complex; chemical elements can be present at naturally high concentrations and/or be increased because of anthropogenic stressors (Appelo and Postma 2005). To determine whether the groundwater quality of a site is predominantly affected by geological (i.e., natural) or anthropogenic sources, we applied a Spearman rank correlation coefficient, as this coefficient is not restrained by sample size or the general distribution of variables (Huan et al. 2012). A Spearman rank value of +1.0 or −1.0 indicates, respectively, a perfect positive or negative monotonic correlation between the two examined variables, whereas a value of 0.0 indicates no correlation. For all of our sampling sites, we correlated the WQI with the DRASTIC index (CERM-PACES 2013), assuming that the DRASTIC index represents the geological influence (Heiß et al. 2020). The calculated Spearman rank correlation coefficient was 0.06, suggesting that groundwater contamination within the unconfined aquifers of the SLSJ region is not primarily linked to regional geology. Thus, we can assume that the WQI of our study region is affected mainly by surface anthropogenic sources, from which the potential contamination is assumed to be transported only vertically, as lateral contaminant transport is neglected in the DRASTIC method. The calculated WQI values for the groundwater samples collected from the unconfined aquifers of the SLSJ region are presented in Supplementary data (Appendix 1) and were classified according to Table 2.
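
The Spearman rank correlation used here can be sketched as the Pearson correlation of the ranks of the two variables, with tied values receiving their average rank. The snippet below is a plain-Python illustration under that standard definition; the WQI and DRASTIC values are invented for the example and are not taken from Appendix 1.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented values for five sampling sites (illustration only).
wqi_values = [22.0, 35.5, 18.2, 60.1, 41.0]
drastic_index = [110, 95, 130, 120, 100]
print(spearman(wqi_values, drastic_index))
```

A coefficient close to zero, such as the 0.06 reported above, indicates that the ranking of sites by WQI is essentially unrelated to their ranking by DRASTIC index.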

Table 2 Water quality classification

Classifying the land-use effect

Comprehensive information on land use across the SLSJ region is available from CERM-PACES (2013). The land-use map, which was produced at the same time as the groundwater sampling campaign, indicated that SLSJ surface areas are occupied by four main sectors: forestry, agricultural, urban, and industrial. Some areas represent mixed agricultural-urban sectors. We propose a subjective land-use effect level to rate the overall effect of land use on groundwater quality (Table 3). Higher values of the land-use effect level (on a scale from 1 to 5) reflect a greater land-use influence. The assigned land-use effect levels for the groundwater sampling sites are presented in Supplementary data (Appendix 1). Despite the presence of an industrial sector in the SLSJ region, we did not record a level 5 land-use effect at the locations where the groundwater samples were collected.

Table 3 Land-use effect level codification

Step 3: Selecting and ranging a factor

Step 3 consists of selecting one factor, e.g., hydraulic conductivity, that will be further evaluated in steps 4–9 (Fig. 2). Step 3 also involves, for the selected factor, establishing categories that represent different value ranges, e.g., slope category 1, 0–6%; slope category 2, 6–12% (Table 4). Here, our factor ranges are derived from existing DRASTIC ranges (Aller et al. 1987). A factor range is introduced by dividing each factor into intervals (Table 4) with the aim of determining whether groundwater quality is affected within different factor ranges. For example, aquifers having a lower hydraulic conductivity (Range 1, Table 4) are expected to be less vulnerable than aquifers having a higher hydraulic conductivity (Range 3, Table 4).

Table 4 Ranges of the aquifer factors

Steps 4 and 5: Determining the potential impact of land use on groundwater quality

To identify the variation in groundwater quality in relation to land use, we introduce a mean-weighted WQI (MWWQI) in Step 4. The MWWQI provides a representative indicator of groundwater quality for a set of groundwater samples collected from various locations within the same land-use category. The MWWQI for a given land-use category is calculated using Eq. (5), adapted from Boumaiza et al. (2019a). In this study, Di is the weight (1–5) of WQI class i (Table 5), and Pi is the occurrence probability of that WQI class for a given land-use effect level. Pi is calculated according to Eq. (6), where ni is the number of groundwater samples within WQI class i, and nt is the total number of groundwater samples, both considered for each land-use effect level. An example of the calculation of MWWQI is presented in Table 5.

$$\text{MW}_{\text{WQI}} = \sum_{i=1}^{5} P_i \cdot D_i,$$
(5)
$$P_i = \frac{n_i}{n_t}.$$
(6)
Table 5 Example of calculating MWWQI

MWWQI is initially calculated for the set of groundwater samples belonging to the same range for a given factor, e.g., groundwater samples collected from sites having a terrain-slope range of 0–6% (Category 1; Table 4), and found within the same land-use category, e.g., a level 1 land-use effect, Table 3. The identical calculation is then undertaken for the set of groundwater samples belonging to the other land-use categories (2–4) of the same factor range, e.g., Category 1. A best-fit curve representing the calculated MWWQI versus the land-use effect levels is then traced. This traced best-fit curve shows the variation in groundwater quality—evaluated using the MWWQI—on the basis of the variation in land-use effect levels within a single factor’s range. Finally, Step 5 (Fig. 2) performs the identical Step 4 process, although for all ranges of this factor.
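
Step 4 can be sketched in a few lines of Python. The example below computes the MWWQI of Eqs. 5–6 for each land-use effect level within one factor range and then fits a line through the results. The sample data are invented, and the use of an ordinary least-squares straight line as the best-fit curve is an assumption made for this sketch.

```python
from collections import Counter


def mw_wqi(class_weights):
    """Eqs. 5-6: sum of P_i * D_i, with P_i = n_i / n_t.

    class_weights holds one WQI class weight D_i (1-5) per sample.
    """
    n_t = len(class_weights)
    counts = Counter(class_weights)
    return sum((n_i / n_t) * d_i for d_i, n_i in counts.items())


def linear_fit(x, y):
    """Ordinary least-squares line y = a*x + b; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return a, my - a * mx


# Invented WQI class weights of the samples in ONE factor range,
# keyed by land-use effect level (illustration only).
samples = {1: [5, 5, 4], 2: [5, 4, 4], 3: [4, 4, 3], 4: [4, 3, 3]}

levels = sorted(samples)
mw_values = [mw_wqi(samples[level]) for level in levels]
slope, intercept = linear_fit(levels, mw_values)
print(mw_values, slope, intercept)
```

Because Pi is a relative frequency, Eq. 5 is equivalent to the arithmetic mean of the class weights of the samples in the set. Repeating the calculation for every range of the factor corresponds to Step 5.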

Steps 6 and 7: Illustrating the impact of land use on groundwater quality

Once the calculations are completed in Step 5, the next step consists of drawing all the best-fit curves that represent the calculated MWWQI versus the land-use effect levels. Curves are drawn according to the different ranges of the factor (see an example of the generated best-fit curves in Fig. 3). These best-fit curves represent the groundwater contamination sensitivity curves based on the factor (GCSC/AF sensitivity curves). Step 7 (Fig. 2) involves the identical process—described in steps 3 to 6—for each of the analyzed factors (water table depth, average annual recharge, soil type, terrain-slope, and aquifer hydraulic conductivity). For a given factor, GCSC/AF sensitivity curves should plot in a sequential order related to the factor ranges, and show a logical sequence, as observed in Fig. 3. Here, we refer to a logical sequence as GCSC/AF sensitivity curves plotting in an increasing or decreasing order relative to the factor’s ranges. If the GCSC/AF sensitivity curves of a given factor do not show a logical sequence, we reject this factor and do not analyze it in the subsequent steps 8–10 (Fig. 2).
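
The logical-sequence criterion can be expressed as a simple check. The sketch below is one possible formalization and is an assumption of this example, not a rule stated in the methodology: the fitted lines of a factor, ordered by factor range, must give strictly increasing (or strictly decreasing) MWWQI values at every land-use effect level, with the same direction at all levels. The line coefficients are invented.

```python
def is_logical_sequence(curves, levels=(1, 2, 3, 4)):
    """Check whether fitted lines keep one consistent order across ranges.

    curves: list of (slope, intercept) pairs ordered by factor range.
    Returns True only if, at every land-use effect level, MWWQI is
    strictly monotonic across the ranges and the direction never changes.
    """
    directions = set()
    for level in levels:
        values = [a * level + b for a, b in curves]
        diffs = [v2 - v1 for v1, v2 in zip(values, values[1:])]
        if all(d > 0 for d in diffs):
            directions.add("increasing")
        elif all(d < 0 for d in diffs):
            directions.add("decreasing")
        else:
            return False  # curves intersect or are out of order here
    return len(directions) == 1


# Invented (slope, intercept) pairs, ordered from range 1 to range 3.
ordered = [(-0.6, 5.0), (-0.5, 5.2), (-0.4, 5.4)]
crossing = [(-0.2, 4.5), (-0.6, 5.2), (-0.3, 4.4)]
print(is_logical_sequence(ordered), is_logical_sequence(crossing))
```

A factor whose curves fail such a check would be rejected before steps 8–10.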

Fig. 3 Groundwater contamination sensitivity curves based on water table depth. Each best-fit line and its equation correspond to the range symbol of the same color in the legend

Steps 8 and 9: Illustrating the relationship between factor and groundwater quality

In Step 8, we illustrate the relationship between the factor and groundwater quality; such a process permits the comparison of all analyzed factors in Step 10 (Fig. 2). For this step, the produced GCSC/AF sensitivity curves are converted to groundwater contamination sensitivity curves according to land use (GCSC/LU sensitivity curves). Here, the MWWQI is evaluated from the fitted GCSC/AF sensitivity curves, rather than using the original values, for each land-use effect level (1–4) according to each factor’s range. Afterward, we plot the assessed MWWQI values (for each land-use effect level) against the factor’s ranges to produce the GCSC/LU sensitivity curves. Each range of the factor in the GCSC/LU sensitivity curves is represented by a single average value (see an example of the generated GCSC/LU sensitivity curves in Fig. 8a). The process is repeated for the other factors, except for the dominant aquifer soil type (Table 4). This latter factor is represented by single values of 1, 2, and 3, which correspond to gravel, sand, and loam-clay, respectively. As the factors are expressed in the GCSC/LU sensitivity curves at different scales, i.e., water table depth (m), recharge (mm/year), soil type (grade number), terrain-slope (%), and hydraulic conductivity (cm/s), Step 9 (Fig. 2) involves mathematically normalizing these differing scales. The normalization is performed by subtracting the mean value from each value and then dividing the result by the standard deviation. With a normalized scale, the GCSC/LU sensitivity curves of the various factors can be compared with each other in Step 10.
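
The normalization of Step 9 is a standard z-score transformation. A minimal Python sketch follows; the representative range values are hypothetical, and the choice of the sample standard deviation (rather than the population standard deviation) is an assumption, as the text does not specify which is used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Subtract the mean, then divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - m) / s for v in values]


# Hypothetical representative values (m) for four water table depth ranges.
depth_values = [2.25, 9.75, 19.0, 30.0]
print(z_scores(depth_values))
```

After this transformation each factor has zero mean and unit standard deviation, so the slopes of the GCSC/LU sensitivity curves of different factors are expressed on a common scale.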

Step 10: Determining the impact of the factors on groundwater quality

If we assume that groundwater quality is related to the factor, e.g., an increase in water table depth causes the MWWQI to increase, the curves linking a factor to groundwater quality (GCSC/LU sensitivity curves) should plot as sloped lines. A steeper GCSC/LU sensitivity curve, which spans a larger MWWQI interval, indicates that the associated factor is more sensitive to the land-use effect level than a factor with a lower-slope curve. Hence, the factor showing the steepest GCSC/LU sensitivity curve has the greatest effect on groundwater quality. Such slopes vary for each factor, however, as a function of land-use category. Our approach differentiates the effect of each factor, and also considers slope variations in relation to the land-use category, by determining the slope for each GCSC/LU sensitivity curve. Subsequently, plotting the calculated slope versus the considered land-use effect levels, i.e., 1, 2, 3, and 4, identifies the relative impact of the factors (the generated view is shown in Fig. 10).
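
The comparison in Step 10 can be sketched as follows. For each factor and each land-use effect level, the slope of the normalized GCSC/LU sensitivity curve is estimated by least squares, and the factor with the steepest curve is identified per level. The curve values are invented, and ranking by the absolute value of the slope is an assumption made here so that increasing and decreasing relationships are treated alike.

```python
def fitted_slope(x, y):
    """Least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


# Invented normalized factor values (x) and MWWQI values (y);
# curves[factor][land_use_level] = (x, y). Illustration only.
x_norm = [-1.0, 0.0, 1.0]
curves = {
    "D": {1: (x_norm, [4.6, 4.8, 5.0]), 4: (x_norm, [3.0, 3.6, 4.2])},
    "R": {1: (x_norm, [4.9, 4.8, 4.7]), 4: (x_norm, [4.4, 4.1, 3.8])},
}

# Slope of every GCSC/LU curve, per factor and land-use effect level.
slopes = {
    factor: {level: fitted_slope(x, y) for level, (x, y) in by_level.items()}
    for factor, by_level in curves.items()
}

# Factor with the steepest curve (largest |slope|) at each level.
steepest = {
    level: max(slopes, key=lambda f: abs(slopes[f][level]))
    for level in (1, 4)
}
print(slopes, steepest)
```

Plotting the entries of `slopes` against the land-use effect level would give a view comparable in structure to the one described for Fig. 10.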

Results

We applied our approach to the regional-scale groundwater data set of the SLSJ region. Our methodology produced: (1) GCSC/AF sensitivity curves in Step 6 (Figs. 3, 4, 5, 6, 7); (2) GCSC/LU sensitivity curves in Step 8 (Fig. 8); (3) normalized GCSC/LU sensitivity curves in Step 9 (Fig. 9); and (4) curves used to determine the relative effect of the factors in Step 10 (Fig. 10).

Groundwater contamination sensitivity curves per factor

Water table depth

Aquifers characterized by a water table depth of category 1 (0–4.5 m) are considered more vulnerable to groundwater contamination than aquifers having a greater water table depth, e.g., category 4 (> 23 m). Groundwater contamination sensitivity curves based on water table depth (Table 4) show an inverse relationship between MWWQI and the land-use effect level, i.e., when land-use effect level increases, the MWWQI decreases (Fig. 3). This pattern is expected because a heightened land-use effect corresponds to an increased potential for groundwater contamination (Table 3), whereas the MWWQI decrease corresponds to a degradation of groundwater quality (see Di in Table 5). Furthermore, these groundwater contamination sensitivity curves (Fig. 3) follow a logical sequence, i.e., groundwater samples of category 4 (water table depth > 23 m; Table 4) are least sensitive to contamination, and sensitivity is subsequently greater as water table depth decreases. For the land-use effect level 4, for example, a category 1 water table depth (0–4.5 m) has a MWWQI of approximately 3 (poor; Table 5), whereas a category 3 water table depth (15–23 m) has a MWWQI of approximately 4 (good; Table 5). Given that the groundwater contamination sensitivity curves based on water table depth show a logical sequence and demonstrate an inverse relationship between MWWQI and land-use effect level, we retained water table depth as a relevant factor for assessing aquifer vulnerability to contamination, and this factor is processed in steps 8–10 of our proposed methodology (Fig. 2).

Average annual recharge

Groundwater contamination sensitivity curves based on average annual recharge (Table 4) are aligned in a logical sequence (Fig. 4). Groundwater samples collected from locations having a lower average annual recharge (0–100 mm/year) are least sensitive to contamination, and sensitivity to contamination increases as average annual recharge increases. For a land-use effect level 3, for example, groundwater sampling locations characterized by an average annual recharge of category 1 (0–100 mm/year) have a MWWQI of approximately 5 (excellent; Table 5), whereas those having an average annual recharge of category 4 (> 250 mm/year) have a MWWQI of approximately 4 (good; Table 5). In a manner similar to the effect of varying water table depth, Fig. 4 shows that when the land-use effect level increases, the MWWQI decreases. That is, there is an inverse relationship between the MWWQI and land-use effect level. As groundwater contamination sensitivity curves show a logical sequence and express a clear inverse relationship between the MWWQI and land-use effect level, we considered the average annual recharge to be a relevant factor in the assessment of vulnerability to contamination. We, therefore, processed this factor in steps 8–10 (Fig. 2).

Fig. 4 Groundwater contamination sensitivity curves based on the average annual recharge. Each best-fit line and its equation correspond to the range symbol of the same color in the legend

Dominant vadose zone soil type

In Fig. 5, groundwater contamination sensitivity curves based on the dominant vadose zone soil type (Table 4) illustrate that groundwater samples collected from locations having a vadose zone dominated by clay-loam are least sensitive to contamination, and sensitivity increases (as MWWQI decreases) as the vadose zone becomes increasingly dominated by highly permeable material. We observe a logical sequence of the traced sensitivity curves to contamination: clay-loam is least sensitive, followed by sand. Gravel is most sensitive to contamination (Fig. 5). Groundwater contamination sensitivity curves based on the dominant vadose zone soil type show an inverse relationship between the MWWQI and land-use effect level (Fig. 5), meaning that the increase of the land-use effect level involves a decrease of MWWQI. We, therefore, retained the dominant vadose zone soil type as a relevant factor and processed this factor in steps 8–10 (Fig. 2).

Fig. 5 Groundwater contamination sensitivity curves based on the dominant vadose zone soil type. Each best-fit line and its equation correspond to the range symbol of the same color in the legend

Terrain-slope

Groundwater contamination sensitivity curves based on the slope of the land surface (Table 4) show an inverse relationship: when the land-use effect level increases, the MWWQI decreases (Fig. 6). However, we also observe in Fig. 6 that the category 3 curve (12–18%) has the lowest MWWQI and is therefore the most sensitive to contamination. This result is not expected, as steeper slopes should produce less recharge and more overland flow and, consequently, less contaminant transport and a lower sensitivity to contamination. In addition, the least sensitive curve cannot be clearly identified in Fig. 6, as the curves for categories 1 (0–6%) and 4 (> 18%) intersect. We also observe random patterns between categories 1 (0–6%) and 2 (6–12%). Hence, the curves do not follow a logical sequence relative to the slope ranges. From these observations, we eliminated slope as a relevant factor and did not process this factor in steps 8–10 (Fig. 2).

Fig. 6 Groundwater contamination sensitivity curves based on terrain-slope (topography). Each best-fit line and its equation correspond to the range symbol of the same color in the legend

Mean hydraulic conductivity

In Fig. 7, the groundwater contamination sensitivity curves based on the average hydraulic conductivity of the aquifer (Table 4) show an inverse relationship between the MWWQI and the land-use effect level, i.e., when the land-use effect level increases, the MWWQI decreases. Furthermore, the resulting sensitivity curves follow a logical sequence, i.e., groundwater samples collected from aquifer locations characterized as category 1 (0 to 5.2 × 10⁻³ cm/s; Table 4) are the least sensitive to contamination, and sensitivity increases as the aquifer's hydraulic conductivity increases. For land-use effect level 2, for example, a category 1 hydraulic conductivity would produce a MWWQI of approximately 5 (excellent; Table 5), whereas a category 3 hydraulic conductivity (5.2 × 10⁻³ to 9.5 × 10⁻² cm/s) results in a MWWQI of approximately 4 (good; Table 5). Given the logical sequence of these curves and the clear inverse relationship between the MWWQI and the land-use effect level, we retained average hydraulic conductivity as a relevant factor and processed this factor in steps 8–10.

Fig. 7 Groundwater contamination sensitivity curves based on the mean hydraulic conductivity of the aquifer. Each best-fit line and its equation correspond to the range symbol of the same color in the legend
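The two screening criteria applied to each factor above (an inverse relationship between MWWQI and land-use effect level, and curves that follow a logical, non-crossing sequence across the factor categories) can be illustrated with a minimal sketch. The MWWQI values below are hypothetical and are not taken from Figs. 3–7; the function names `fit_curve` and `follows_logical_sequence` are illustrative only, and the straight-line fit is an assumption based on the best-fit lines mentioned in the figure captions.

```python
import numpy as np

# Land-use effect levels used on the x-axis of the sensitivity curves.
LEVELS = np.array([1.0, 2.0, 3.0, 4.0])


def fit_curve(mwwqi):
    """Least-squares line MWWQI = slope * level + intercept."""
    slope, intercept = np.polyfit(LEVELS, np.asarray(mwwqi, dtype=float), deg=1)
    return float(slope), float(intercept)


def follows_logical_sequence(curves):
    """True if each fitted curve lies at or below the previous one at every level.

    `curves` must be ordered from the category expected to be least
    sensitive to the category expected to be most sensitive.
    """
    fitted = np.array([s * LEVELS + b for s, b in curves])
    return bool(np.all(np.diff(fitted, axis=0) <= 1e-9))


# Hypothetical MWWQI values per category (5 = excellent, 1 = poorest).
ordered = [[5.0, 4.8, 4.5, 4.3], [4.8, 4.4, 4.0, 3.7], [4.6, 4.0, 3.5, 3.0]]
crossing = [[4.5, 4.0, 3.5, 3.0], [4.0, 3.9, 3.8, 3.7]]

ordered_fits = [fit_curve(v) for v in ordered]
crossing_fits = [fit_curve(v) for v in crossing]

print("inverse relationship:", all(s < 0 for s, _ in ordered_fits))
print("ordered set retained:", follows_logical_sequence(ordered_fits))
print("crossing set retained:", follows_logical_sequence(crossing_fits))
```

In this sketch, a factor whose curves intersect, as reported for terrain-slope, fails the second check even though every individual curve still has a negative slope.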

Groundwater contamination sensitivity curves according to land use

From the retained GCSC/AF sensitivity curves (Figs. 3, 4, 5, 7), we determined the MWWQI for each land-use effect level (1–4) according to the range of each factor. We then traced the GCSC/LU sensitivity curves by plotting, for each land-use effect level, the determined MWWQI values versus the range of each factor. Note that each factor range in the GCSC/LU sensitivity curves is represented by a single mean value, except for the dominant vadose zone soil type (Table 4), which is represented by single values of 1, 2, and 3, corresponding to gravel, sand, and clay-loam, respectively. We then normalized the scale of each factor. The results of this processing (steps 8 and 9) are shown in Figs. 8 and 9.

Fig. 8 Groundwater contamination sensitivity curves according to the land-use effect level for (a) water table depth, (b) average annual recharge, (c) dominant vadose zone soil type, and (d) mean hydraulic conductivity of the aquifer. Note that for the sensitivity curves of the average annual recharge and the average hydraulic conductivity of the aquifer, the x-axis values are presented from higher to lower values

Fig. 9 Normalized groundwater contamination sensitivity curves according to the land-use effect level for (a) water table depth, (b) average annual recharge, (c) dominant vadose zone soil type, and (d) average hydraulic conductivity of the aquifer. Note that the x-axis values are presented from higher to lower values for the average annual recharge and the average hydraulic conductivity of the aquifer
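The normalization step matters because the factors are expressed in different units (m, cm/s, mm/year, or a soil-type code), so the slopes of the raw curves are not comparable. The sketch below shows one way this could be done. The text does not state which normalization was used, so min-max scaling to the interval 0–1 is an assumption here, and the numerical values and function names are hypothetical.

```python
import numpy as np


def min_max_normalize(values):
    """Rescale factor values to the interval [0, 1] (assumed normalization)."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        raise ValueError("factor values must not all be identical")
    return (values - values.min()) / span


def gcsc_lu_slope(factor_means, mwwqi):
    """Slope of the best-fit line of MWWQI versus the normalized factor."""
    x = min_max_normalize(factor_means)
    slope, _intercept = np.polyfit(x, np.asarray(mwwqi, dtype=float), deg=1)
    return float(slope)


# Hypothetical mean water table depth of three ranges (m) and the MWWQI
# read from the sensitivity curves for a single land-use effect level.
depth_means_m = [2.25, 6.75, 11.25]
mwwqi_level = [4.0, 4.4, 4.5]

slope_m = gcsc_lu_slope(depth_means_m, mwwqi_level)
# Same depths expressed in centimetres: the normalized slope is unchanged.
slope_cm = gcsc_lu_slope([225.0, 675.0, 1125.0], mwwqi_level)
print(slope_m, slope_cm)
```

Because the factor axis is rescaled before fitting, the resulting slope no longer depends on the unit of the factor, which is what allows slopes of different factors to be compared.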

Determining the relative impact of the factors

The slopes determined for each land-use effect level in Fig. 9 are presented in Table 6. For each factor, we then plotted the calculated slope versus the land-use effect level, i.e., 1, 2, 3, and 4, to determine the relative impact of the factors (Fig. 10). For land-use effect level 1, the slopes of the GCSC/LU sensitivity curves (Table 6) of the retained factors ranked, from highest to lowest, as water table depth (0.26), average hydraulic conductivity (0.15), dominant vadose zone soil type (0.13), and average annual recharge (0.11). We observed the same relative ranking for land-use effect levels 2, 3, and 4 (Table 6; Fig. 10). Water table depth produced the highest slope values (Fig. 10); consequently, this factor was considered to have a greater role in controlling groundwater vulnerability to contamination than the other retained factors. Each factor also showed a distinct slope-variation curve, i.e., the line connecting the slopes determined for each land-use effect level (Fig. 10). Each of these curves has its own slope across the land-use effect levels (1–4), and these slopes can be ranked (from highest to lowest) as water table depth (0.13), average hydraulic conductivity (0.09), dominant vadose zone soil type (0.08), and average annual recharge (0.07). This ranking agrees with the relative importance ranking of the factors obtained from Table 6. In summary, the analyzed factors were ranked in relative importance (from highest to lowest) as (1) water table depth, (2) average hydraulic conductivity, (3) dominant vadose zone soil type, and (4) average annual recharge.

Table 6 Determined slope of the normalized GCSC/LU sensitivity curves
Fig. 10 Slope-variation curves for the determined relevant factors
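As a small illustration of the ranking step, the sketch below orders the retained factors by the slope values quoted in the text for land-use effect level 1 and for the slope-variation curves, and confirms that both orderings coincide. Only the eight values reported in the text are used; the remaining entries of Table 6 are not reproduced, and the helper name `rank_factors` is hypothetical.

```python
# Slopes of the normalized GCSC/LU curves at land-use effect level 1 (from the text).
level1_slopes = {
    "water table depth": 0.26,
    "average hydraulic conductivity": 0.15,
    "dominant vadose zone soil type": 0.13,
    "average annual recharge": 0.11,
}

# Slopes of the slope-variation curves across land-use effect levels 1-4 (from the text).
variation_slopes = {
    "water table depth": 0.13,
    "average hydraulic conductivity": 0.09,
    "dominant vadose zone soil type": 0.08,
    "average annual recharge": 0.07,
}


def rank_factors(slopes):
    """Return factor names ordered from highest to lowest slope."""
    return sorted(slopes, key=slopes.get, reverse=True)


ranking_level1 = rank_factors(level1_slopes)
ranking_variation = rank_factors(variation_slopes)
print(ranking_level1)
print("rankings agree:", ranking_level1 == ranking_variation)
```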

Validation

We validated our methodology by applying the developed GCSC/LU sensitivity curves (Fig. 9) to external data to verify whether the GCSC/LU obtained for a given factor can reliably predict MWWQI. For this purpose, we selected the GCSC/LU sensitivity curves related to water table depth (Fig. 8a) and used data from 22 groundwater samples collected from the granular unconfined aquifers of the Charlevoix-Haute-Côte-Nord (CHCN) region of Quebec (CERM-PACES 2015). This region experiences climatic conditions similar to those of the nearby SLSJ region, and the available data set provided complete information for each site, including water chemistry, land-use effect level, and water table depth (see Supplementary data, Appendix 1; groundwater samples used for this validation are identified by the prefix CHCN). We then calculated the WQI of the 22 selected groundwater samples from the chemistry data and expressed the results according to the ranking proposed in Table 5 (this weighted WQI is hereinafter called the measured-WWQI). Using the water table depth and land-use effect level of each of the 22 groundwater samples, we then determined from Fig. 8a a synthetic value for WQI (this synthetic weighted WQI is hereinafter called the predicted-WWQI). The predicted-WWQI agrees with the measured-WWQI for 15 (68%) groundwater samples (Table 7), whereas seven (32%) groundwater samples have a predicted-WWQI that differs from the measured-WWQI. We used the root mean square error (RMSE) to evaluate the quality of these predictions, as RMSE is often used in geosciences to assess test quality in terms of both accuracy and precision (e.g., Chesnaux et al. 2017; Boumaiza et al. 2019c). Here, RMSE is the square root of the mean squared difference between the predicted-WWQI, determined through our developed GCSC/LU sensitivity curves, and the measured-WWQI, computed from the water chemistry data (Eq. 7).

Table 7 The calculated RMSE as part of the validation process
$$\text{RMSE} = \left( \frac{1}{n} \sum_{i=1}^{n} \left( \text{Predicted } W_{\text{WQI},i} - \text{Measured } W_{\text{WQI},i} \right)^{2} \right)^{1/2}.$$
(7)

The calculated RMSE (named real-RMSE) quantifies the error actually produced. To determine the maximum possible error (named max-RMSE), we replaced each predicted-WWQI with the maximum predicted-WWQI, i.e., the value on the 1–5 scale that lies farthest from the measured-WWQI. For example, if the measured-WWQI is 2, the maximum predicted-WWQI value that produces the max-RMSE is 5. Table 7 presents the real-RMSE and max-RMSE, which are 0.7 and 3.8, respectively. The real-RMSE thus represents 18.5% of the max-RMSE. Given this low ratio, the error actually produced is considered acceptable relative to the maximum possible error.
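The real-RMSE and max-RMSE calculations can be sketched as follows. This is a minimal illustration under stated assumptions: the sample values are hypothetical (they are not the 22 CHCN samples of Table 7), the function names are illustrative, and the worst-case prediction is taken as the endpoint of the 1–5 scale farthest from each measured value, which is consistent with the example given in the text (measured 2, worst case 5).

```python
import numpy as np

SCALE_MIN, SCALE_MAX = 1, 5  # WWQI ranking scale (Table 5)


def rmse(predicted, measured):
    """Root mean square error between predicted and measured WWQI (Eq. 7)."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))


def worst_case_prediction(measured):
    """Scale endpoint farthest from each measured value (assumed definition)."""
    measured = np.asarray(measured, dtype=float)
    farther_from_min = (measured - SCALE_MIN) >= (SCALE_MAX - measured)
    return np.where(farther_from_min, SCALE_MIN, SCALE_MAX).astype(float)


# Hypothetical measured and predicted WWQI values for four samples.
measured_wwqi = [2, 4, 5, 3]
predicted_wwqi = [2, 3, 5, 3]

real_rmse = rmse(predicted_wwqi, measured_wwqi)
max_rmse = rmse(worst_case_prediction(measured_wwqi), measured_wwqi)
print(f"real-RMSE = {real_rmse:.2f}")
print(f"max-RMSE  = {max_rmse:.2f}")
print(f"ratio     = {100 * real_rmse / max_rmse:.1f}%")
```

Expressing the real-RMSE as a fraction of the max-RMSE gives a bounded measure of error that accounts for the limited 1–5 range of the ranking scale.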

Discussion

Our identification of relevant and irrelevant factors agrees with Rupert (1999), who found water table depth, recharge, and soil type to be relevant factors for assessing aquifer vulnerability to contamination. Tesoriero and Voss (1997) also underlined surficial geology as a relevant factor; this factor can be viewed as equivalent to the dominant vadose zone lithology in our study area. Babiker et al. (2005) used the DRASTIC index to assess aquifer vulnerability; their map sensitivity analysis indicated that the vulnerability index is highly sensitive to recharge, soil type, and topography. This outcome partially agrees with our study, as we also determined recharge and soil type to be relevant factors; however, we did not find topography to be relevant. As in this study, Guo et al. (2007) found topography to be irrelevant. Hydraulic conductivity and soil type are considered irrelevant in some studies (e.g., Panagopoulos et al. 2006), whereas we found them to be relevant for assessing aquifer vulnerability in the SLSJ region. The inclusion or exclusion of factors in the vulnerability assessment process is often related to site-specific conditions and is often based on expert opinion (Worrall et al. 2002). Contradictory evidence regarding the factors that control aquifer vulnerability has already been noted, particularly when DRASTIC results are validated (Close 1993; Maas et al. 1995). Nonetheless, our ordering of the relative importance of factors (1: water table depth; 2: average hydraulic conductivity; 3: soil type; 4: recharge) largely agrees with the weights in DRASTIC, which weights water table depth at 5, hydraulic conductivity at 3, and soil type at 2. Pacheco et al. (2015) found that the weight of water table depth could be reduced (from 5 to 3), whereas the weights of the other factors remain unaltered.
In our study, recharge ranks last, whereas in DRASTIC, this factor is the second-most important, having a weight of 4 within a weight range of 1 to 5 (Aller et al. 1987). An adapted DRASTIC index for pesticide contamination also allocated a relatively high weight (4) to recharge, but this weight lies in the middle of the pesticide-DRASTIC weights, which range from 3 to 5 (Al-Zabet 2002). A common principle is that greater recharge heightens the possibility of contaminant transport within the aquifer. This view is a simplifying assumption that ignores potential contaminant- and sediment-specific sorption and reaction rates (e.g., Kiecak et al. 2019, 2020). The data used in our study were collected from the SLSJ region; this northern humid region usually experiences heavy snow accumulation with limited water infiltration during winter and early spring (from November to April). This climatic regime reduces the effect of recharge on the eventual transport of contaminants, as the water available for infiltration into the subsurface is limited for almost half of the year.

We also found terrain-slope (topography) to be an irrelevant factor for assessing aquifer vulnerability to contamination in the SLSJ region. This factor has commonly been excluded from the DRASTIC index in studies where the topography of the investigated area is generally flat (e.g., Khan et al. 2014; Wu et al. 2014; Wang et al. 2017); nonetheless, this exclusion does not hold for regions characterized by a variable topography. The limited effect of topography in the SLSJ region can be related to the regional climate conditions. Aquifer recharge in the SLSJ region is generally dominated by snowmelt, when 5–6 months of accumulated snowpack melts in the spring and becomes available for infiltration. Assuming that the snow-melting process first affects the upper snowpack layers (exposed to the sun), the lower snowpack layers can limit water infiltration on both sloped and flat areas. This mechanism causes the entire region to be, at least temporarily, under similar infiltration conditions. This scenario therefore limits the importance of surface slope as a controlling factor on aquifer recharge and minimizes the role of topography in aquifer vulnerability.

Our operational methodology is intended as an initial step for selecting the relevant factors when assessing aquifer vulnerability and for determining the relative importance of these selected factors. Nonetheless, because the original DRASTIC factor category ranges adopted in our study (Table 4) have not been validated, we cannot propose a DRASTIC-derived index that is adapted to the SLSJ region, nor can we map regional vulnerability. We observed that the sensitivity curves of Figs. 3, 4, 5, and 7 plot in sequential order relative to the factor ranges, although in some cases the sensitivity curves are quite similar, i.e., nearly overlapping. In Fig. 3, for example, we observe an obvious difference for the water table depth range of 0–4.5 m, whereas the other depth ranges show quite similar curves.
Such an observation potentially justifies a re-evaluation of all factor ranges (Table 4) depending on the study site, to validate (1) the number of ranges required for each factor and (2) the limit values attributed to each factor range. We recommend complementary statistically based studies to determine these range properties. Finally, our study relied on data collected from unconfined aquifers in a northern region characterized by humid climate conditions. The developed operational methodology could, however, be easily adapted to other regions that differ in geological setting (e.g., confined or karstic aquifers) and climate conditions, and our approach could also incorporate other available data and factors related to aquifer vulnerability indices. It is therefore possible to evaluate whether aquifer vulnerability assessments are affected by other factors not tested here.

Conclusions

In this study, we have presented an operational approach for (1) selecting the relevant factors when assessing aquifer vulnerability to contamination and (2) determining the relative importance of the selected factors. We developed our methodology using a large data set from granular aquifers of the Saguenay-Lac-Saint-Jean region of Quebec, Canada. The available data included water table depth, average aquifer hydraulic conductivity, topography (terrain-slope), dominant vadose zone soil type, and average annual recharge. We combined these data with information related to groundwater quality and land use. We found topography to be an irrelevant factor for assessing aquifer vulnerability in our study region. The relevant factors, ranked by relative importance (from highest to lowest), were (1) water table depth; (2) average aquifer hydraulic conductivity; (3) dominant vadose zone lithology; and (4) average annual recharge. This ranking is representative only of this study region because it is established as a function of the applied data set, and it cannot be considered a standard factor ranking. Our results partially agree with the weights of factors in DRASTIC, in which the weight of water table depth is 5, that of aquifer hydraulic conductivity is 3, and that of soil type is 2. Nonetheless, our result for the recharge factor, which we ranked last, differs from that of the DRASTIC index, where recharge is the second-most important factor. This study provides an original approach for integrating groundwater quality data and land-use effects with a data set of the characterized factors, a process that has not been previously applied in the weighting of DRASTIC factors. We do not intend this study to be a comparison with DRASTIC weighting or other methods, or an evaluation of the DRASTIC index. Rather, we have introduced an operational methodology that can serve as an initial step for determining the relevant DRASTIC factors for assessing regional aquifer vulnerability.
The second step, determining the relative importance of the relevant factors, serves to validate the weight attributed to each DRASTIC factor.