Introduction

Groundwater vulnerability can be considered a latent variable (i.e. a non-observable variable), which can be inferred considering other measurable dependent variables (Gogu and Dassargues 2000).

Assessment of groundwater vulnerability has experienced some important changes over the last 15 years as the availability of geo-environmental data has increased and interest in groundwater monitoring and protection has grown. Moreover, in these years, the development of GIS has simplified and optimised the application and comparison of methods.

A wide review of the different understandings of the groundwater vulnerability concept and of methods for assessing groundwater vulnerability is provided by Wachniew et al. (2016).

Two major categories of methods exist for assessing vulnerability: objective (physically based and statistical) and subjective methods. Moreover, we can distinguish between intrinsic vulnerability, based on the intrinsic properties of the aquifer system, and specific vulnerability, related to one or more contaminants.

The physically based methods, not widely used, take into account flow and transport processes, but they are not necessarily based on deterministic calculations. Often, the vulnerability is estimated based on the contaminant residence time (Zwahlen 2004). Statistical methods are more oriented towards the specific vulnerability and predict probabilities of contamination on the basis of correlations between the properties of the aquifer, the origin of the contamination and the pollution occurrence, verified by groundwater chemistry monitoring studies (Masetti et al. 2008).

Subjective methods, widely identified with the parametric or overlay and index methods, are based on geological and hydrogeological factors influencing the vulnerability, usually represented as GIS layers. These factors or parameters are transformed from the physical range scale to a relative scale (i.e. rating). The rating process and the process of weighting of each layer on the basis of the importance of the physical parameter are subjective processes, often requiring the opinion of a hydrogeologist with expertise in the study area.

The two types of methods, objective and subjective, have their own advantages and disadvantages, and their choice usually depends on a series of factors: the scale of the problem, the hydrogeological characteristics of the area, data availability and the presence or absence of a groundwater quality network. In the literature, there have been many proposals to modify existing rating methods and statistical methods for assessing groundwater vulnerability in order to obtain improved methods bringing a reliable and scientifically defensible endpoint (Sorichetta et al. 2011; Ducci and Sellerino 2013).

The purpose of this work is to link the two types of methods for a specific vulnerability assessment, refining the rating methods using the greater detail provided by the statistical analysis. To develop procedures that cope with the disadvantages and emphasise the advantages of the two types of methods, we attempt to combine the two approaches to perform a groundwater vulnerability assessment in a case study located in an area of northern Italy with extensive urban and agricultural sectors. Keeping the structure of an already existing rating method, we use a spatial statistical approach to calibrate weights and ratings of a series of variables potentially affecting groundwater vulnerability.

Study area

The study area is contained completely within the Po Plain area, in northern Italy (Fig. 1), and covers approximately 2000 km², where urban areas and agricultural activities are extensively and almost equally present (ERSAF 2014). The study was focused on the upper hydrogeological unit (Lombardia and Agip 2002) that corresponds to a shallow unconfined aquifer composed of gravels and sands. The aquifer has an average thickness of approximately 60 m, the transmissivity is higher than 10⁻² m²/s, and the hydraulic conductivity ranges from 10⁻⁴ to 10⁻³ m/s. Recharge conditions are influenced by the presence of an extensive irrigation network and a variety of land uses and soil types.

Fig. 1 a Location of the study area in light grey; b land cover in 2012

The shallow aquifer has been classified as a Nitrate Vulnerable Zone by the European Union (Nitrate Directive, 91/676/EEC). Indeed, nitrate is commonly and historically present in shallow groundwater all over the Po Plain area (Cinnirella et al. 2005). Nitrate presence in groundwater is related to by-products of both agricultural activities and urban waste (Howard 1997; Wick et al. 2012). Nitrate has two main features that make it an excellent environmental indicator of groundwater vulnerability to contamination (e.g. Tesoriero and Voss 1997): (a) high mobility and (b) widespread presence in groundwater. In addition, nitrate generally has a long history of monitoring: in the shallow aquifer of the study area, it was periodically monitored by the Province of Milan through a network of more than 300 wells. Monitoring results showed small changes of concentrations (in mg/L as nitrate: NO₃⁻), in a range between 1 and 5 mg/L, due to very local and transitory episodes of contamination. These changes did not affect the general spatial pattern of nitrate distribution in groundwater (Sorichetta et al. 2013), with a general concentration decrease from 50 mg/L in the northern sector to 5 mg/L in the southern one.

Methods: overview and the new proposed approach

Considering the major features of the area and with the purpose of performing a specific vulnerability assessment for nitrate, a classic pesticide DRASTIC (Aller et al. 1987) and a version of DRASTIC revised through a spatial statistical method have been compared. The classic and revised DRASTIC maps were then compared to a map obtained with the Time–Input method (Kralik and Keimel 2003; Kralik 2008).

According to Baker (1990), due to the wide range and distribution of nitrate sources, and the frequency with which it is found in groundwater, nitrate can be considered a good indicator to assess overall groundwater vulnerability to non-point source contaminants and to evaluate the factors that influence groundwater vulnerability in general in areas with prevalent urban and agricultural land use.

DRASTIC method

Parametric system methods are developed with the specific purpose of identifying areas of relative vulnerability to anthropogenic contamination based on hydrogeological characteristics. All parametric system methods adopt almost the same procedure. The system definition depends on the selection of some parameters considered to be representative for groundwater vulnerability assessment. Each parameter has a defined natural range divided into discrete hierarchical intervals. Each interval is assigned a specific value reflecting the relative degree of sensitivity to contamination.

In this study, DRASTIC (Aller et al. 1987), the most widely applied parametric method, developed by the U.S. Environmental Protection Agency (EPA), was chosen for assessing groundwater pollution potential.

This method considers the following seven parameters: depth to water, net recharge, aquifer media, soil media, topography, impact of the vadose zone and hydraulic conductivity. Each mapped factor is classified either into ranges (for continuous variables) or into significant media types (for thematic data), which have an impact on pollution potential.

The typical rating range is from 1 to 10. Weight factors are used for each parameter to balance and enhance their importance. The final vulnerability index (\(D_i\)) is a weighted sum of the seven parameters and can be computed using the formula:

$${D_i} = \sum_{j = 1}^{7}\left( {{W_j} \cdot {R_j}} \right)$$
(1)

where \(D_i\) = DRASTIC Index for a mapping unit, \(W_j\) = weight factor for parameter j, and \(R_j\) = rating for parameter j. When aimed towards specific vulnerability to nitrate contamination, DRASTIC proposes the use of a selected string of weights that is identified as pesticide DRASTIC (Anane et al. 2013).
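As an illustration of Eq. (1), the following minimal Python sketch computes the index for one mapping unit. The weight string shown is the pesticide DRASTIC set as commonly cited from Aller et al. (1987) and should be checked against that report; the function name, the one-letter parameter keys and the example ratings are hypothetical and are not taken from this study.

```python
# Pesticide DRASTIC weights as commonly cited from Aller et al. (1987).
# Assumption: verify these values against the original report before use.
PESTICIDE_WEIGHTS = {"D": 5, "R": 4, "A": 3, "S": 5, "T": 3, "I": 4, "C": 2}


def drastic_index(ratings, weights=PESTICIDE_WEIGHTS):
    """Weighted sum of Eq. (1): D_i = sum_j W_j * R_j for one mapping unit."""
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same parameters")
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical ratings for a single mapping unit (illustration only).
example_ratings = {"D": 7, "R": 6, "A": 8, "S": 6, "T": 10, "I": 8, "C": 6}
example_index = drastic_index(example_ratings)
```

Passing a different `weights` dictionary lets the same function serve for a revised weight string.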

Weights of Evidence method

Weights of evidence (WofE) can be defined as a data-driven Bayesian method, expressed in a log-linear form, that uses known occurrences as training points (TPs) to generate predictive probability outputs (i.e. response themes) from multiple weighted explanatory evidences (i.e. evidential themes: Bonham-Carter 1994). Evidential themes represent a set of continuous or categorical spatial data, which may influence the spatial distribution of the occurrences in the study area (Raines 1999). Groundwater nitrate concentrations can be used as the response variable in a spatial statistical model by identifying a threshold value of concentration that makes the variable binary. Only the monitoring wells (or a part of them) having a concentration above the threshold value are considered TPs (Sorichetta et al. 2013).

The TPs are used to calculate the prior probability of occurrence of an event, and the positive and the negative weights (and thus the contrast and its studentised confidence value; see below for definitions) of each evidence class.

Positive (\(W^+\)) and negative (\(W^-\)) weights are computed for each evidence class based on the location of the TPs. Thus, for a given class B, \(W^+\) and \(W^-\) are positive and negative or negative and positive, respectively, depending on whether there are more or fewer TPs in B than would be expected by chance.

The weights can be expressed as:

$${W^+ } = {\log_e}\frac{{P\left\{ {B|E} \right\}}}{{P\left\{ {B|\overline E } \right\}}}$$
(2)
$${W^- } = {\log_e}\frac{{P\left\{ {\overline B |E} \right\}}}{{P\left\{ {\overline B |\overline E } \right\}}}$$
(3)

where \(\;P\left\{ {B|E} \right\}\) and \(P\left\{ {B|\overline E } \right\}\) are the respective probabilities of a pixel being within B when it either contains or does not contain a TP, while \(P\left\{ {\overline B |E} \right\}\) and \(P\left\{ {\overline B |\overline E } \right\}\) are the respective probabilities of a pixel not being within B when it either contains or does not contain a TP (Sorichetta et al. 2012).

The contrast, defined as \(W^+\) minus \(W^-\), represents the overall degree of spatial association between each evidence class and the TPs, and can be considered as a measure of the class usefulness in predicting the location of the occurrences (Masetti et al. 2008). Positive contrast values mean a direct relationship (or a positive correlation) between the presence of the class and the training points, whereas negative contrast values mean an inverse relationship (or a negative correlation); values close to zero indicate a generally low correlation. The studentised confidence value, defined as the contrast divided by its standard deviation, corresponds approximately to the statistical level of significance defined by standard z-tests and provides a useful measure of the significance of the contrast (Raines 1999). Bonham-Carter (1994) gives a complete mathematical description of the WofE and a detailed discussion of its assumptions and limitations (including sources of error and uncertainty).
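Eqs. (2) and (3), the contrast and its studentised value can be sketched from simple pixel counts, as below. This is a minimal illustration, not the software used in this study: it assumes at most one TP per pixel and uses the usual count-based variance approximation described by Bonham-Carter (1994). The function name, argument names and example counts are hypothetical.

```python
import math


def wofe_class_stats(n_pixels, n_class_pixels, n_tp, n_tp_in_class):
    """W+, W-, contrast and studentised contrast for one evidence class B.

    Assumes each training point (TP) occupies exactly one pixel.
    """
    n_tp_out = n_tp - n_tp_in_class              # TPs outside B
    n_non_tp = n_pixels - n_tp                   # pixels without a TP
    n_non_tp_in = n_class_pixels - n_tp_in_class # non-TP pixels inside B
    n_non_tp_out = n_non_tp - n_non_tp_in        # non-TP pixels outside B
    if min(n_tp_in_class, n_tp_out, n_non_tp_in, n_non_tp_out) <= 0:
        raise ValueError("all four pixel counts must be positive")

    # Eq. (2): ln( P{B|E} / P{B|not E} )
    w_plus = math.log((n_tp_in_class / n_tp) / (n_non_tp_in / n_non_tp))
    # Eq. (3): ln( P{not B|E} / P{not B|not E} )
    w_minus = math.log((n_tp_out / n_tp) / (n_non_tp_out / n_non_tp))
    contrast = w_plus - w_minus

    # Count-based variance approximation for each weight.
    var_plus = 1.0 / n_tp_in_class + 1.0 / n_non_tp_in
    var_minus = 1.0 / n_tp_out + 1.0 / n_non_tp_out
    contrast_std = math.sqrt(var_plus + var_minus)

    return {
        "w_plus": w_plus,
        "w_minus": w_minus,
        "contrast": contrast,
        "contrast_std": contrast_std,
        "studentised_contrast": contrast / contrast_std,
    }


# Hypothetical counts: 1000 pixels, class B covers 200, 50 TPs, 20 of them in B.
stats = wofe_class_stats(1000, 200, 50, 20)
```

In this example the class holds more TPs than expected by chance, so \(W^+\) is positive, \(W^-\) is negative and the contrast is positive.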

Time–Input method

In addition to the DRASTIC method, the Time–Input method (Kralik and Keimel 2003; Kralik 2008) was also applied. The Time–Input method is defined as a hybrid method in Wachniew et al. (2016) and provides the assessment of groundwater vulnerability on the basis of two factors: travel time and input, i.e. groundwater recharge. Vulnerability is expressed as the ratio of the thickness of the layers of the unsaturated zone to their hydraulic conductivity, measured in seconds [s], modified by the input correction factor (f) based on groundwater recharge measured in millimetres per year [mm/yr]:

$${\text{Vulnerability index = TIME[s]}} \times {\text{INPUT[f(mm/yr)]}}$$
(4)
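A minimal sketch of Eq. (4) is given below, assuming that the travel time is the sum over unsaturated-zone layers of thickness divided by hydraulic conductivity. The correction factor is passed in as a plain number because its tabulated values are not reproduced here; the function names and the example layers and factor are hypothetical.

```python
def travel_time_seconds(layers):
    """Sum of thickness / hydraulic conductivity over unsaturated-zone layers.

    layers: iterable of (thickness_m, conductivity_m_per_s) pairs.
    """
    total = 0.0
    for thickness_m, conductivity in layers:
        if conductivity <= 0:
            raise ValueError("hydraulic conductivity must be positive")
        total += thickness_m / conductivity
    return total


def time_input_index(layers, input_factor):
    """Eq. (4): TIME [s] multiplied by the recharge-based correction factor."""
    return travel_time_seconds(layers) * input_factor


# Hypothetical two-layer profile and correction factor (illustration only).
example_layers = [(10.0, 1e-4), (5.0, 1e-3)]
example_time = travel_time_seconds(example_layers)
example_vulnerability = time_input_index(example_layers, 1.5)
```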

Approach to improve rating methods using statistical analyses

The monitoring wells of the study area have been classified into two subsets containing the same number of wells, by identifying the median nitrate concentration (19.5 mg/L). The median value has been shown to be the most reliable for defining the two subsets in these statistical analyses (Masetti et al. 2009). Only monitoring wells showing nitrate concentrations higher than 19.5 mg/L have been considered selectable as TPs to evaluate positive and negative weights through Eqs. (2) and (3).
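The median split described above can be sketched as follows. The well identifiers and concentrations are hypothetical, chosen only so that the median equals 19.5 mg/L; they are not monitoring data from the study area.

```python
from statistics import median


def select_training_points(nitrate_by_well):
    """Return the median concentration and the wells strictly above it.

    nitrate_by_well: mapping of well id -> nitrate concentration (mg/L).
    """
    threshold = median(nitrate_by_well.values())
    candidates = sorted(
        well for well, conc in nitrate_by_well.items() if conc > threshold
    )
    return threshold, candidates


# Hypothetical wells (illustration only).
example_wells = {"W1": 5.0, "W2": 12.0, "W3": 27.0, "W4": 41.0}
example_threshold, example_tps = select_training_points(example_wells)
```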

In the WofE analysis, the process of generalisation of evidential themes has been performed following the objective (semi-guided) procedure developed by Sorichetta et al. (2012), which allows us to obtain the maximum number of statistically significant classes for each evidential theme.

Contrasts obtained from the WofE analysis have been used to determine the DRASTIC indices in greater, site-specific detail, by deriving for each observed variable an equation that relates its values to its scores.

For each studied parameter, a score was assigned to each class on the basis of its contrast value; considering the maximum DRASTIC rating and observing the direction of growth or decline of contrasts, the maximum score was divided by the number of classes counted in the contrast analysis, consistent with the direction of growth or decline. Therefore, each class has a rating value representative of the contrast value, which can be used to create a graph descriptive of the observed parameter.

The graph plots values on the x-axis and scores on the y-axis, so a trend line and a corresponding formula can express the distribution of points. The trend line obtained from the contrast scores can be compared to that of the DRASTIC graph, built with average values of the range criteria and rating values.
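One possible reading of this scoring step is sketched below: the maximum DRASTIC rating is split into equal steps, one per class, and the classes receive increasing multiples of that step in order of increasing contrast. This interpretation, the function name and the example contrasts are assumptions made for illustration; ties between contrasts are not handled.

```python
def ratings_from_contrasts(contrasts, max_rating):
    """Assign equally spaced ratings to classes, ranked by contrast value.

    The class with the lowest contrast gets max_rating / n and the class
    with the highest contrast gets max_rating (n = number of classes).
    """
    n_classes = len(contrasts)
    if n_classes == 0:
        raise ValueError("at least one class is required")
    step = max_rating / n_classes
    # Indices of the classes sorted from lowest to highest contrast.
    order = sorted(range(n_classes), key=lambda i: contrasts[i])
    ratings = [0.0] * n_classes
    for rank, class_index in enumerate(order, start=1):
        ratings[class_index] = rank * step
    return ratings


# Hypothetical contrasts for four classes with a maximum rating of 10.
example_ratings_by_class = ratings_from_contrasts([-1.2, -0.4, 0.3, 0.9], 10)
```

Because the ratings follow the rank of the contrast rather than the class order, a non-monotonic contrast pattern yields a non-monotonic rating curve.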

The comparison permits a detailed interpretation of the parameter considered for the present analysis; often, the two curves have the opposite trend, which depends on the site-specific processes prevailing in the study area. However, it is important to discuss carefully the parameter and its characteristics in the observed area, because the trend is strictly dependent on the hydrogeological characteristics of the studied area.

For this reason, in this pilot project, we decided to consider parameters, strictly hydrogeologically dependent, which were the most representative for the purpose of this work. Therefore, we used the following variables:

  • Groundwater depth (m)

  • Groundwater velocity (m/s)

  • Infiltration (mm/yr)

  • Vadose zone velocity (m/s)

Results and discussion

In the study area, groundwater depth (m) is characterised by an increasing trend of contrasts (Fig. 2a), with a classification in six classes.

Fig. 2 a Contrasts of the statistically significant classes of the evidential theme representing groundwater depth; b groundwater depth ratings, obtained through the WofE (blue dots) and DRASTIC (orange squares) methods, and interpolated curves (dotted blue line and dashed orange line, for WofE and DRASTIC, respectively), with representative equations

The DRASTIC method defines seven classes for the water table depth parameter, with a maximum rating of 10; therefore, the classification of contrasts is based on the maximum score of 10, which has been equally divided into six parts.

Building a graph with scores (y-axis) and average values representative of the reference class (x-axis), it is possible to deduce the equation of the trend line of the groundwater depth parameter, as shown in Fig. 2b.

The graph shows an increasing trend, represented by a 4th-order polynomial regression equation, and a consecutive decreasing trend, represented by a linear equation. This is a good picture of the real situation in a detailed area, where the increase of water table depth could be related to the increase of oxygen amount in soil, which limits the denitrification process in the vadose zone (Nolan 2001). This amount increases with depth up to a constant value, or decreases. This behaviour has been observed at different scales, from the field dimension (Best et al. 2015) to regional and country scales (Kolpin et al. 1999; Nolan et al. 2002).

This examined variation can be used to provide better detail that can improve the DRASTIC classification.

The contrast trend for groundwater velocity (m/s) increases and is composed of four classes (Fig. 3a).

Fig. 3 a Contrasts of the statistically significant classes of the evidential theme representing groundwater velocity; b groundwater velocity ratings, obtained through the WofE (blue dots) and DRASTIC (orange squares) methods, and interpolated curves (dotted blue line and dashed orange line, for WofE and DRASTIC, respectively), with representative equations

The DRASTIC method categorises this parameter into six classes, giving each a score from 1 to 10; therefore, the classification of contrasts is based on the maximum score of 10, which has been equally divided into four classes.

The graph in Fig. 3b illustrates that, in the studied area, the representative equation of groundwater velocity is a power equation.

In this case, the trend of average values is directly proportional to the decrease of rating; this could be linked to the prevalence of the dilution effect in the transport of nitrate in groundwater.

In the district of Milan, effective infiltration (mm/yr) shows a decreasing contrast trend composed of five classes (Fig. 4a).

Fig. 4 a Contrasts of the statistically significant classes of the evidential theme representing effective infiltration; b effective infiltration and net recharge ratings, obtained through the WofE (blue dots) and DRASTIC (orange squares) methods, respectively, and interpolated curves (dotted blue line and dashed orange line, for WofE and DRASTIC, respectively), with representative equations

The effective infiltration parameter is compared to the net recharge, considered for the application of the DRASTIC method.

Net recharge is divided into five classes, with a maximum rating of 9; therefore, the classification of contrasts is based on the maximum score of 9, which has been equally divided into five classes.

Figure 4b shows a comparison between the trend of the reclassified contrasts and the trend of the DRASTIC method. The representative equation of effective infiltration is a power equation, which highlights the strong dilution effect, increasing with infiltration amount.

In the studied area, the hydraulic conductivity of the vadose zone (k; m/s) is characterised by an increasing contrast trend, which is divided into three classes (Fig. 5a).

Fig. 5 a Contrasts of the statistically significant classes of the evidential theme representing hydraulic conductivity of the vadose zone; b hydraulic conductivity of the vadose zone (k) and impact of the vadose zone media ratings, obtained through the WofE (blue dots) and DRASTIC (orange rectangles) methods, respectively. k curve (dotted blue line) has its representative logarithmic equation

This parameter can be compared to a variable of the DRASTIC method: the impact of vadose zone media, which determines the attenuation characteristics of the material below the typical soil horizon and above the water table.

The impact of the vadose zone is classified into 10 classes and has a maximum score of 10; so, similar to what was described for the previous variables, the classification of contrasts is based on the maximum score of 10, which has been equally divided into three classes.

Figure 5b illustrates that, in the studied area, the representative equation of hydraulic conductivity of the vadose zone is a logarithmic formula.

Regarding the previous comparison between hydraulic conductivity and soil characteristics of the vadose zone, it is worth noting that flow in the vadose zone tends to be vertical, from the surface to depth; therefore, k influences both the transport and dilution of contaminants.

After the analysis and correction of ratings in the previous variables, the weight for each parameter was calculated. The purpose of this process is to refine the DRASTIC method with more detailed variables and so to give a suitable importance (i.e. weight) to the evaluated variables.

The variable weight is calculated from the ratio between the sum of the absolute values of contrasts and the number of classes. In Table 1, the evaluated weights are schematised.
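The weight calculation described above reduces to a mean of absolute contrasts, sketched below. The function name and the example contrasts are hypothetical and do not reproduce the values of Table 1.

```python
def variable_weight(contrasts):
    """Sum of the absolute contrast values divided by the number of classes."""
    if not contrasts:
        raise ValueError("at least one class contrast is required")
    return sum(abs(c) for c in contrasts) / len(contrasts)


# Hypothetical contrasts for a three-class variable (illustration only).
example_weight = variable_weight([-1.0, 0.5, 1.5])
```

Taking absolute values means that a strongly inverse relationship contributes as much to the weight as a strongly direct one.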

Table 1 Scheme of evaluated weights

The mean of the absolute contrast values for each of the four variables shows that infiltration has the highest weight, followed by groundwater depth, hydraulic conductivity of the vadose zone and groundwater velocity. These values were used to adjust the original DRASTIC weights for these four parameters, while for the other three parameters, the original DRASTIC scores and weights were retained.

Therefore, a classic pesticide DRASTIC map was compared to that obtained using the revised weights and scores for the four hydrogeological parameters. Typically, vulnerability maps represent a limited number of classes; subjective rating methods, such as GOD (Foster 1987) and EPIK (Doerfliger and Zwahlen 1997), identify four to five classes. Vulnerability scores, ranging from 23 to 226, obtained through DRASTIC (Aller et al. 1987), are generally reclassified into fewer classes (Rupert 2001; Hamza et al. 2007). Also, the final maps obtained from statistical methods tend to be represented by a limited number of classes, rarely more than six (Sorichetta et al. 2011). An excessive number of classes is inappropriate for land-use regulations (Foster et al. 2013) and prescriptive purposes, and exceeds the limits of our visual analytics abilities (Cowan 2001). Both classic and revised DRASTIC maps were categorised into six classes by dividing the range of 23–226 into six equal interval classes.
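The equal-interval reclassification can be sketched as below, using the 23–226 range and six classes stated in the text. The function name and the rule that an index lying exactly on a boundary falls into the upper class are assumptions.

```python
def equal_interval_class(index, low=23.0, high=226.0, n_classes=6):
    """Map a DRASTIC index to a class number from 1 (lowest) to n_classes."""
    if not low <= index <= high:
        raise ValueError("index outside the expected range")
    width = (high - low) / n_classes
    # The maximum index would otherwise land in class n_classes + 1.
    return min(n_classes, int((index - low) / width) + 1)


example_classes = [equal_interval_class(v) for v in (23, 56, 57, 100, 226)]
```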

Figure 6 shows the classic (a) and the revised (b) maps together with the distribution of monitoring points with nitrate concentration higher than 25 mg/L. The value has been chosen because it represents a sort of guideline value defined by the EU standard (91/676/EEC) to identify potential critical areas. Moreover, the value is higher than the threshold value used for the statistical analysis and should be more easily associated with the most vulnerable area. The revised map shows clearly better agreement between vulnerability classes and the presence of a high concentration of nitrate, which represents the most important diffuse contaminant in this highly urbanised area. In fact, the classic map shows the highest frequency of wells with a nitrate concentration higher than 25 mg/L in class 4, whereas, in the revised map, frequency increases monotonically as the degree of vulnerability increases, as expected, with the highest frequency corresponding to the most vulnerable class (Fig. 7a, b).
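The check described above, whether the frequency of impacted wells rises with vulnerability class, can be sketched as follows. The function names and the example class assignments are hypothetical and do not reproduce the histograms of Fig. 7.

```python
from collections import Counter


def impacted_frequency_by_class(well_classes, n_classes=6):
    """Fraction of impacted wells falling in each vulnerability class.

    well_classes: list with the class number (1..n_classes) of each
    well whose nitrate concentration exceeds the chosen value.
    """
    if not well_classes:
        raise ValueError("at least one impacted well is required")
    counts = Counter(well_classes)
    total = len(well_classes)
    return [counts.get(c, 0) / total for c in range(1, n_classes + 1)]


def is_monotonic_increasing(values):
    """True when each value is at least as large as the previous one."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical class assignments for two maps (illustration only).
freq_rising = impacted_frequency_by_class([1, 2, 2, 3, 3, 3], n_classes=3)
freq_peaked = impacted_frequency_by_class([4, 4, 4, 5, 6], n_classes=6)
```

A map whose frequencies peak in an intermediate class, as in the second example, fails the monotonicity check.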

Fig. 6 Classic (a) and revised (b) DRASTIC maps and the spatial distribution of wells, showing nitrate concentration higher than 25 mg/L

Fig. 7 Histograms of frequency of wells showing nitrate concentration higher than 25 mg/L for each vulnerability class of a classic, b revised DRASTIC and c Time–Input maps

The comparison shows that the revised DRASTIC method identifies many areas where the final score falls in the two lowest vulnerability classes (i.e. light green and green), so that all six vulnerability classes are represented. The classic method, by contrast, gives a minimum score falling in the medium–low vulnerability class (yellow) in a limited area in the north-eastern sector; only the four highest vulnerability classes are present, and no areas fall into the two lowest classes. Observing the two maps, it is evident that the revised map provides a more accurate representation of the distribution of the different vulnerability classes compared to the distribution of wells impacted by high nitrate concentration. Specifically, the revised map shows that: (a) most of the impacted wells are contained within the highest vulnerability class in the northern central part of the study area; (b) classes 4 and 5 contain almost all the remaining impacted wells; and (c) classes 1–3 have a limited number of impacted wells, with classes 1 and 2 having most of the area without any impacted wells. On the other hand, the classic map shows that: (d) class 6 contains very few impacted wells; (e) class 4 has the densest presence of impacted wells; and (f) a large sector in class 5 does not have any impacted wells.

Therefore, the revised method, by maintaining the same number (six) and meaning of the different vulnerability classes, in terms of low or high vulnerability, provides a different spatial distribution of classes, which can be considered more detailed and efficient than the distribution of classes in the classical method. This is due to the selection of specific hydrogeological parameters and greater detail in the attribution of weights, based on a statistical method that allows us to better emphasise the importance of local processes affecting groundwater vulnerability in the area.

For the application of the Time–Input method, the travel time factor was obtained as the ratio of two DRASTIC layers: groundwater depth (m) and vadose zone velocity (m/s); the input factor, classified as a correction factor, depends on the DRASTIC layer infiltration (mm/yr) (Table 2).

Table 2 Correction factors for the Time–Input method (groundwater recharge by the amount of infiltrating water in mm/yr) (Kralik 2008)

The Time–Input method was developed for mountainous areas (Zwahlen 2004); therefore, it provides different correction factors for tectonics or for bedding inclination (Kralik and Keimel 2003), not necessary in this study dealing with a porous aquifer (Sect. 2).

The application of the Time–Input method and the comparison to previous maps were not trivial, requiring the conversion of the 10 classes provided by the method (Kralik and Keimel 2003) into six classes (Table 3). The final map (Fig. 8) shows the lower classes in the central and northern part, and a lack of intermediate classes; considering this difference, the map seems more in accordance with the classic DRASTIC; in fact, the correspondence with the nitrate content is low in the northern part of the study area. Although more than half of the wells with a nitrate concentration higher than 25 mg/L are contained in the highest vulnerability classes (4, 5 and 6) and the highest frequency corresponds to the highest vulnerability class, the distribution of wells is almost uniform across all the classes (Fig. 7c). Thus, the Time–Input method shows a level of performance between the classic and revised DRASTIC.

Table 3 Vulnerability classes for the Time–Input method (modified from Kralik 2008)

Fig. 8 Map drawn up by the Time–Input method, and the spatial distribution of wells showing nitrate concentration higher than 25 mg/L

Conclusions

Many hydrogeological parameters can have different impacts on groundwater vulnerability according to site-specific conditions. This condition can alter both the type (direct or inverse) and the strength of relationships existing between each parameter and vulnerability. This implies that the weights and scores of each parameter should be determined for the study area through objective procedures that must scientifically support the use of site-specific values. The use of statistical methods for assessing groundwater vulnerability to contamination is an effective tool to better determine the factors having the highest influence on groundwater vulnerability. In this study, we have proposed a procedure to cope with this task by using a Bayesian-based model applied at a spatial scale. This method allows us to attribute a rating value to each hydrogeological parameter selected for this work (viz. groundwater depth, groundwater velocity, infiltration and vadose zone velocity), which was used to create a graph descriptive of the observed parameters. The trend of these graphs was compared to that of the DRASTIC graphs, built using the classic DRASTIC method.

Through the comparison with the classical DRASTIC method and the Time–Input method, the revised method shows a more realistic distribution of vulnerability classes in accordance with the distribution of wells impacted by high nitrate concentration.

Therefore, the revised method provides a different spatial distribution of classes, which is more detailed and efficient than the distribution in the classical method, due to the selection of specific hydrogeological parameters and greater detail in the attribution of weights, based on a local scale.

In addition, the use of the Time–Input method, and its unsatisfactory results, demonstrated the importance in all vulnerability assessment methods of taking into account the specific hydrogeological conditions of the area.

In conclusion, our findings suggest that, to produce a map of groundwater vulnerability to contamination, the assessment at local detail of the hydrogeological parameters involved in the analysis and the attribution of adequate weights to these parameters are both essential, and both should be as objective as possible.