Introduction

Nitrate (NO₃⁻) is an abundant contaminant in groundwater. Given its chemical stability in water, high mobility, wide range of sources, and the frequency with which it has been measured in groundwater, it is a natural choice as an indicator of the vulnerability of groundwater to surface contamination (Cohen and others, 1984). Groundwater pollution caused by diffuse nitrate sources may pose a risk to human health and to the environment (Harrison, 1992; Lake and others, 2003). For this reason, Council Directive 91/676/EEC (Nitrates Directive) was drafted to protect surface water and groundwater against pollution caused by nitrate from agricultural sources. The same Directive obliges Member States to designate nitrate vulnerable zones (NVZ) and to take measures to reduce nitrate sources that can potentially cause contamination. Although agricultural practices may be responsible for high nitrate concentrations in groundwater, nonagricultural sources such as septic systems and leaking municipal sewers can significantly affect aquifers (Fetter, 1999; Nas and Berktay, 2006) and should be considered when the aim of a study is to determine NVZ in densely populated areas.

In this study, the occurrence of elevated nitrate concentrations in the groundwater of the Province of Milan (northern Italy) is related to both natural and anthropogenic variables and analyzed using the weights-of-evidence (WofE) modeling technique (Bonham-Carter, Agterberg, and Wright, 1988; Agterberg, Bonham-Carter, and Wright, 1989). These variables describe either the potential sources of nitrate or the relative ease with which it may migrate to groundwater: population density, nitrogen fertilizer loading, precipitation, irrigation, protective capacity of soils, land use, vadose zone permeability, groundwater depth, and groundwater velocity.

Study area

The study area covers about 1,989 km² and is located in the Province of Milan (northern Italy) (Fig. 1), where both agricultural and industrial activities are widespread. This area has a complex hydrogeological setting characterized by many aquifers with different degrees of mutual interaction. Recharge conditions are also complex owing to the presence of an extensive irrigation network and a great number of soil and land use types.

Figure 1 Study area location around the city of Milan

The Milan plain subsoil is characterized by Pliocene-Pleistocene sediments; the upper unit forms the Traditional Aquifer (TR), which is the object of this study. Its water has been used mainly for public and industrial supplies since the beginning of the 20th century. The TR is an unconfined aquifer with transmissivity ranging from 5 × 10⁻² to 1 × 10⁻³ m²/s and permeability from 5 × 10⁻³ to 1 × 10⁻⁴ m/s. Its thickness ranges from 60 to 120 m. Sediments are mainly gravels and sands, although the presence of clay-silt layers increases from north to south. From the middle part of the study area southward, a discontinuous aquitard (5–10 m thick) of clayey silt separates the shallow unconfined aquifer (TRa) from a semiconfined aquifer (TRb) not considered in this study. The regional flow is from north to south (Fig. 2A), and groundwater depth decreases from 30 m in the northern area to 5 m in the southern sector. Nitrate is monitored at least four times a year in a network of about 300 wells covering the whole area with an almost uniform spatial distribution (Fig. 2A). The minimum nitrate concentration is lower than 1.0 mg/l, whereas the maximum is higher than 70 mg/l; the median concentration is about 20 mg/l. Forty-two percent of the monitored wells show concentrations exceeding the guide value of 25 mg/l established in European Community (1991) standards. The most impacted sector is located in the northeastern part of the study area (Fig. 2B), where concentrations exceed 50 mg/l, although the whole northern sector shows values close to or greater than the guide value of 25 mg/l. Nitrate concentration progressively decreases from north to south, where measured values are always lower than 10 mg/l. Note the southward inflection of the isoconcentration curves in the area of the city of Milan, which represents a local anomaly within the spatial trend of nitrate in groundwater.

Figure 2 Piezometric levels and well location (A); nitrate concentration (mg/l) in groundwater (B)

Analysis of historical data confirms that this is the normal pattern of nitrate concentration in groundwater. Changes are probably the result of local and transitory episodes of contamination that do not affect the general pattern of nitrate distribution. Mean nitrate concentrations in the city of Milan and in four different sectors of the District are almost constant during the period of record, so no specific temporal trend can be identified in the study area (Fig. 3). Once again, note the great difference between the concentrations observed in the northern and southern sectors.

Figure 3 Trend of nitrate concentration in the city of Milan (center of the District) and in four other sectors

Method

Recent years have seen an increase in the use of statistical methods for assessing groundwater vulnerability in regional studies (Lichtenberg and Shapiro, 1997; Tesoriero and Voss, 1997; Nolan, Hitt, and Ruddy, 2002; Worral and Besien, 2005). These methods correlate contaminant occurrence with natural and anthropogenic factors characterizing the area (explanatory variables). The advantages of statistical approaches have been outlined by Worral and Besien (2005) and can be summarized as follows: (a) the weights of each factor affecting vulnerability are not arbitrarily assigned; (b) the most appropriate explanatory variables may be objectively selected; (c) the influence of subjective opinions may be reduced progressively and the repeatability of tests can be guaranteed; (d) the results may be updated automatically as new information becomes available; and (e) validation procedures may be performed easily if new and updated groundwater observations are available.

In this work (Fig. 4) a prospective-specific vulnerability map has been derived according to the Bayesian WofE modeling technique, together with an analysis of the factors influencing vulnerability in the study area. In particular, the results described in this paper represent an improvement on previous work carried out in the same area (Masetti, Poli, and Sterlacchini, 2005). WofE is a data-driven model that uses known occurrences as model training sites to create maps from weighted continuous or categorical input data layers. A final response may be derived by combining diverse spatial evidential themes in support of a hypothesis.

Figure 4 Sequential steps followed in the study

The model is based on the idea of prior and posterior probability. The former is the probability that a terrain unit contains an event before considering any existing predictor variables (the favorable conditioning factors); the latter is an "adjustment" of the prior probability that takes into account the evidence from one or more spatial patterns. Positive and negative weights (W+ and W−) are initially calculated; the magnitude of these weights depends on the measured association between the response variable and each class of each predictor variable. The difference between W+ and W− defines the contrast (C), an overall measure of the degree of spatial association between each class of the predictor variables and the response variable. The free extension ArcSDM, implemented first by Kemp and others (2001) in ArcView® 3.2 and then by Sawatzky and others (2004) in ArcGIS® 9.1, has been used in this study to calculate these parameters automatically. Among them, the Studentized Contrast (StudC) is the ratio of the contrast to the standard deviation of the contrast (Raines, 1999). It provides a useful measure of the significance of the contrast, given the uncertainties inherent in the weights and in missing data. In this work, a StudC value of 1.645, corresponding to approximately 95% confidence, has been taken as the minimum acceptable confidence level.
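The weight calculation described above can be sketched in a few lines of Python. This is a minimal illustration of the standard WofE formulation for a single binary class, with the usual asymptotic variance approximation; it is not the ArcSDM implementation, and the function name and the unit-cell counts in the usage example are hypothetical.

```python
import math

def wofe_weights(n_total, n_class, n_events, n_events_in_class):
    """Return (W+, W-, contrast, studentized contrast) for one binary class.

    All arguments are counts of unit cells (hypothetical inputs):
      n_total           -- cells in the study area
      n_class           -- cells where the predictor class is present
      n_events          -- cells containing a training point
      n_events_in_class -- training-point cells that fall inside the class
    """
    b_d = n_events_in_class              # class present, event present
    b_nd = n_class - n_events_in_class   # class present, event absent
    nb_d = n_events - n_events_in_class  # class absent, event present
    nb_nd = n_total - n_class - nb_d     # class absent, event absent
    if min(b_d, b_nd, nb_d, nb_nd) <= 0:
        raise ValueError("every cell of the 2x2 table must be positive")

    n_no_events = n_total - n_events
    # W+ = ln[P(class | event) / P(class | no event)]
    w_plus = math.log((b_d / n_events) / (b_nd / n_no_events))
    # W- = ln[P(no class | event) / P(no class | no event)]
    w_minus = math.log((nb_d / n_events) / (nb_nd / n_no_events))
    contrast = w_plus - w_minus

    # Asymptotic variance of the contrast: sum of reciprocal cell counts
    var_contrast = 1 / b_d + 1 / b_nd + 1 / nb_d + 1 / nb_nd
    stud_c = contrast / math.sqrt(var_contrast)
    return w_plus, w_minus, contrast, stud_c


# Hypothetical example: 2000 cells, 500 in the class, 100 training cells,
# 50 of which fall inside the class.
w_plus, w_minus, contrast, stud_c = wofe_weights(2000, 500, 100, 50)
print(f"W+={w_plus:.3f} W-={w_minus:.3f} C={contrast:.3f} StudC={stud_c:.2f}")
```

A class would be retained as significant under the criterion used here when the returned StudC is at least 1.645.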

Data sources

Evidential themes used in this study have been derived from many sources of information, different in type, accuracy, and survey scale.

Hydraulic conductivity has been obtained from boreholes where hydraulic tests have been performed. Data from 606 wells (about 0.3 wells per km²) have been used to map the distribution of this variable.

Since 1998, piezometric levels have been monitored quarterly by the Province of Milan. Data from 364 wells (about 0.18 wells per km²) have been used to map groundwater piezometric levels of the TRa aquifer.

Groundwater velocity has been computed by means of a numerical model (Alberti and others, 2000) to simulate the groundwater flow and to predict the variation of groundwater levels in the Milan District area. For this reason, internal conditions, such as water well abstractions and recharge (rainfall and irrigation), also have been considered and analyzed. The use of groundwater velocity is related to the concept of “instantaneous” dilution, occurring when the contaminant reaches groundwater, according to the simple approach of contaminant mass balance by Calabrese and Kostecki (1988).

The Hydrographic Service of Po River made data available for the construction of the rainfall map. After identifying rainfall stations with records longer than 20 years, mean annual values have been calculated as input for the map. Seven stations have been used to produce the final map.

More than half of the investigated area lies within an extensive irrigation network, mainly located in the central-southern sector of the District. Mean annual irrigation rates have been calculated using the average incoming and outgoing discharge recorded at flow gauges by each Consortium, which regulates the availability of water in the canals within its jurisdiction. Recording stations were not homogeneously distributed; on the basis of the number of stations in each Consortium and its extent, data have been used to produce local maps of mean annual irrigation at scales ranging from 1:10,000 to 1:25,000. The final map has been obtained by aggregating the single maps at a 1:25,000 scale.

Soil protective capacity (SPC) has been obtained from the local Agency of Services for Agriculture and Forest (ERSAF, 2004). This variable has been mapped at a 1:25,000 scale and divides soils into three protective capacity classes: high, moderate, and low. The variable describes the capacity of the soil to reduce the leaching of water-soluble pollutants from the surface, and it is related to filtering and buffering capacity, which result from both mechanical and biological/microbiological activity allowing degradation (ERSAF, 2004).

Land use has been derived from the regional technical map at a 1:10,000 scale. Three different uses have been distinguished: urban, agricultural, and woods. As a whole, they cover about 97–98% of the study area. More specific subdivisions have not been taken into account because of the working scale adopted in this study and the lack of more detailed data.

Evaluation of nitrogen fertilizer loadings has been carried out considering both organic and chemical load (Acutis and Provolo, 2003), which can be correlated to breeding and agricultural activities, respectively. Therefore, this factor only considers anthropogenic sources of nitrate that are not related to urban areas. Data have been mapped at a 1:25,000 scale.

Population density (ISTAT, 2001) has been selected to account for nitrate sources related to urban areas, mainly septic tanks and leakage from sewer systems (Nolan, Hitt, and Ruddy, 2002). The population density map is at a 1:25,000 scale.

Except for SPC and land use (considered as categorical variables, each one divided into three classes) all the other ordered evidential layers have been reclassified into categories before starting analysis. In effect, by grouping large data sets into fewer and more manageable classes, statistical significances and trends may be discovered and analyzed. In Table 1, the number of classes, the original range of raw data, and the reclassification strategies used in the study are reported for each evidential theme.

Table 1 Number of classes, range of raw data and method used to reclassify theme

The threshold concentration of nitrate in groundwater used to distinguish impacted from nonimpacted wells has been set equal to the median value of 20 mg/l; wells with NO₃⁻ values >20 mg/l have been considered as training points. The value of prior probability is equal to 0.023.
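As a rough sketch of this step, the selection of training points and the prior probability can be written as follows. The prior is taken, as is usual in WofE, as the number of training points divided by the number of unit cells in the study area. The unit-cell area is not stated in the text, so the values in the usage example are purely illustrative and are not meant to reproduce the prior of 0.023; the function names and well identifiers are hypothetical.

```python
def select_training_points(well_nitrate, threshold=20.0):
    """Return the identifiers of wells whose nitrate exceeds the threshold.

    well_nitrate maps a (hypothetical) well identifier to its nitrate
    concentration in mg/l. Wells exactly at the threshold are not selected.
    """
    return [well for well, conc in well_nitrate.items() if conc > threshold]


def prior_probability(n_training, study_area_km2, unit_cell_km2):
    """Prior probability = training points / number of unit cells."""
    n_cells = study_area_km2 / unit_cell_km2
    return n_training / n_cells


# Illustrative data only
wells = {"w1": 5.0, "w2": 20.0, "w3": 35.0, "w4": 52.0}
training = select_training_points(wells)
print(training, prior_probability(len(training), 100.0, 1.0))
```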

Initially, weights and contrasts have been calculated for all the predictor classes. In this first step, the influence of some classes in determining low or high vulnerability values has been immediately identified. A generalization strategy has then been applied to the evidential themes in an effort to spatially define the sites where each evidential theme shows a stronger association with the locations of training sites.

Weights and contrasts for each class have then been calculated by three distinct techniques: categorical, cumulative ascending, and cumulative descending (Carranza, 2004). Taking 1.645 as the threshold value of StudC for evaluating the significance of a measure, and exploiting the hydrogeological knowledge of the study area, it has been possible to obtain two distinct generalizations for each evidential theme, characterized by different degrees of complexity (Table 2).

Table 2 Number of classes for each theme according to the applied generalization

With the cumulative techniques, the break values that define the limits of each class have been set on the basis of the contrast value: the highest contrasts supported by a StudC equal to or greater than 1.645 have been selected as break values. For every continuous factor, cumulative contrasts have been calculated both ascending and descending. Finally, only the generalization method giving the most interpretable results in terms of contrast trend and StudC values has been selected.
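The break selection described above can be illustrated with a small sketch. It accumulates ordered classes in ascending or descending order, computes the contrast and StudC of each cumulative pattern, and keeps the break with the highest contrast among those with StudC of at least 1.645. This is an assumed, simplified reading of the procedure (one break per theme, asymptotic variances); the function names and the class counts are hypothetical.

```python
import math

def contrast_and_studc(n_total, n_class, n_events, n_events_in_class):
    """Contrast and studentized contrast of a binary pattern, or None."""
    b_d = n_events_in_class
    b_nd = n_class - b_d
    nb_d = n_events - b_d
    nb_nd = n_total - n_class - nb_d
    if min(b_d, b_nd, nb_d, nb_nd) <= 0:
        return None  # weights undefined when a cell of the 2x2 table is empty
    contrast = math.log((b_d * nb_nd) / (b_nd * nb_d))
    std = math.sqrt(1 / b_d + 1 / b_nd + 1 / nb_d + 1 / nb_nd)
    return contrast, contrast / std


def best_break(class_cells, class_events, descending=False, min_studc=1.645):
    """Find the cumulative break with the highest significant contrast.

    class_cells / class_events: unit cells and training points per class,
    ordered by increasing value of the evidential theme.
    Returns (class_index, contrast, studc) or None; the pattern runs from
    the first accumulated class up to and including class_index.
    """
    order = list(range(len(class_cells)))
    if descending:
        order.reverse()
    n_total, n_events = sum(class_cells), sum(class_events)
    best = None
    cum_cells = cum_events = 0
    for idx in order[:-1]:  # the last step would cover the whole area
        cum_cells += class_cells[idx]
        cum_events += class_events[idx]
        result = contrast_and_studc(n_total, cum_cells, n_events, cum_events)
        if result is None:
            continue
        contrast, studc = result
        if studc >= min_studc and (best is None or contrast > best[1]):
            best = (idx, contrast, studc)
    return best


# Hypothetical theme with four classes; training points increase with class
cells = [500, 500, 500, 500]
events = [5, 10, 30, 55]
print(best_break(cells, events, descending=True))
print(best_break(cells, events, descending=False))
```

In this toy case only the descending accumulation yields a significant positive contrast, which mirrors the choice between the ascending and descending results described in the text.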

With the categorical technique, classes whose contrast was not sufficiently significant have been merged according to hydrogeological considerations in order to obtain fewer classes with an improved level of confidence.

An example of the application of these techniques is shown in Figure 5. Groundwater velocity has been generalized on the basis of cumulative ascending contrasts, whereas population density has been generalized on the basis of categorical contrasts. In both cases, the generalized classes show improved statistical significance, as shown by StudC values greater than 1.645 and by the generally increasing trend of weights, with the minor exception of class 7 of population density. This class has been kept separate from the others because it represents a well-defined geographic area comprising the city of Milan and its surroundings. The practical purpose of this choice is to better delineate the contribution of this specific sector, the most important urban area in the District, to nitrate contamination in groundwater.

Figure 5 Example of generalization technique

Moreover, particular attention has been paid to the treatment of SPC: analyzed as a stand-alone variable, it did not reach relevant significance, in spite of the hydrogeological importance attributed to this factor.

An attempt has therefore been made to include this parameter in one analysis by combining irrigation (I), rainfall (R), and SPC through a simple mathematical operation to obtain a term equivalent to groundwater recharge (GWR): GWR = (R + I) × Ci, where Ci is an infiltration coefficient that is a function of SPC (U.S. S.C.S., 1964; Schwab and others, 1993).
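A minimal sketch of the recharge term follows. The infiltration coefficients assigned to each SPC class are not given in the text; the values below are hypothetical placeholders chosen only so that a higher protective capacity corresponds to lower infiltration, and the units (mm/year) are likewise an assumption.

```python
# Hypothetical infiltration coefficients per SPC class (not from the study)
INFILTRATION_COEFF = {"high": 0.2, "moderate": 0.4, "low": 0.6}


def groundwater_recharge(rainfall, irrigation, spc_class,
                         coeff=INFILTRATION_COEFF):
    """GWR = (R + I) * Ci, with Ci looked up from the SPC class.

    rainfall and irrigation are mean annual values in the same units
    (assumed mm/year here); the result is in those units.
    """
    return (rainfall + irrigation) * coeff[spc_class]


# Illustrative cell: 1000 mm/year of rain, 500 mm/year of irrigation
for spc in ("high", "moderate", "low"):
    print(spc, groundwater_recharge(1000.0, 500.0, spc))
```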

WofE is based on an assumption of conditional independence (CI) among the factors used. Conditional dependence can inflate the final values of probability. In these experiments, possible violations of CI involving population density, nitrogen fertilizer loading, and land use have not been tested, because the final vulnerability value is to be regarded only as a relative index for comparing pixels and not as an absolute value of probability.
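The inflation effect can be seen in a short sketch of the usual WofE update, in which the weights of the evidential themes present at a location are added to the prior logit. The weight value used below is hypothetical; counting the same evidence twice stands in for two strongly dependent themes.

```python
import math

def posterior_probability(prior, weights):
    """Posterior probability from the prior and a list of WofE weights.

    logit(posterior) = logit(prior) + sum of the weights (W+ or W-)
    of the evidential themes at the location.
    """
    logit = math.log(prior / (1.0 - prior)) + sum(weights)
    odds = math.exp(logit)
    return odds / (1.0 + odds)


prior = 0.023            # prior probability reported in the text
w = 0.8                  # hypothetical positive weight of one theme

single = posterior_probability(prior, [w])
doubled = posterior_probability(prior, [w, w])  # same evidence counted twice
print(f"one theme: {single:.4f}  dependent duplicate: {doubled:.4f}")
```

The duplicated evidence raises the posterior value without adding information, which is why the output is better read as a relative index when CI has not been tested.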

Results

Different response themes have been calculated. The most interesting tests (in terms of final results) are presented here:

  • Response Theme Sdmuc 12: the generalizations used have been the simplest for each variable;

  • Response Theme Sdmuc 13: the generalizations used have been the most complex for each variable;

  • The influence on the final result of using different types of generalization for each variable has been tested by mixing variables with different generalizations. Response Theme Sdmuc 14 represents one of these experiments, as shown in Table 3. It produces the best results, in terms of success rate, among the themes generated by combining the simplest and the most complex generalizations; and

  • Response Theme Sdmuc 17: same as Sdmuc 14 but combining rainfall, irrigation, and SPC to represent GWR.

Table 3 Generalization used in Sdmuc 14

In order to evaluate the quality of the different final maps in terms of their capacity to properly classify the training sites, the success rate curve has been calculated (Chung and Fabbri, 1999). The curve plots the cumulative percentage of susceptible area on the X-axis against the cumulative percentage of wells used as training points on the Y-axis. An example of such a curve is shown in Figure 6. For each curve, the efficiency of classification and the percentage of wells correctly classified (training points in areas where posterior probability > prior probability) have been estimated (Table 4).

Figure 6 Success rate curve

Table 4 Basic statistic of the main models

The value of efficiency is represented by the area under the curve. A random classification corresponds to a 45° line, which has an efficiency equal to 0.5. Greater efficiency values indicate that the model classifies the training sites more accurately. The Sdmuc 14 model gives the best results (Table 4); however, notable differences are not evident among the four tests. The success rate curve for the Sdmuc 14 model is shown in Figure 6.
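The construction of the success rate curve and its efficiency can be sketched as follows, assuming the map is summarized as a list of classes with their posterior probability, area, and number of training points. The trapezoidal rule is used for the area under the curve; the function names and the numbers in the example are hypothetical.

```python
def success_rate_curve(classes):
    """Return cumulative area and training-point fractions.

    classes: list of (posterior_probability, area, n_training_points).
    Classes are ranked from most to least susceptible before accumulating.
    """
    ranked = sorted(classes, key=lambda c: c[0], reverse=True)
    total_area = sum(c[1] for c in ranked)
    total_points = sum(c[2] for c in ranked)
    xs, ys = [0.0], [0.0]
    area = points = 0.0
    for _, class_area, class_points in ranked:
        area += class_area
        points += class_points
        xs.append(area / total_area)      # cumulative fraction of area
        ys.append(points / total_points)  # cumulative fraction of wells
    return xs, ys


def efficiency(xs, ys):
    """Area under the success rate curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


# Hypothetical map with three classes
classes = [(0.10, 1.0, 6), (0.02, 1.0, 3), (0.005, 2.0, 1)]
xs, ys = success_rate_curve(classes)
print(xs, ys, efficiency(xs, ys))
```

A map whose training points are spread in proportion to area gives an efficiency of 0.5, matching the 45° line of a random classification.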

Validation procedures

Statistical techniques have been used (Chowdury and others, 2003) to test the validity of the final maps obtained for the response themes listed above, by comparing the spatial distribution of the probability values with that of wells characterized by different values of nitrate concentration. This strategy is based on two well populations extracted from the database: wells with a nitrate concentration < 20 mg/l, termed "nonimpacted wells," and wells with a nitrate concentration > 20 mg/l, termed "impacted wells" and previously used in the WofE analysis. Four classes of groundwater vulnerability have been considered: two with values lower and two with values higher than the prior probability. The calculated vulnerability value has then been associated with each well of the impacted and nonimpacted samples. Histograms have been derived from the frequency of occurrence of impacted and nonimpacted wells within each vulnerability class, given by

$$ \frac{W_{ci}}{W_{i}} \times 100 $$

where W_ci is the number of impacted (or nonimpacted) wells in vulnerability class i and W_i is the total number of wells in the same vulnerability class i. This parameter allows comparison of the capability of each model to identify vulnerability classes that fit the spatial distribution of nitrate well.
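The frequency defined above can be computed with a short sketch. Each well is assigned to a vulnerability class from its posterior probability and a list of class breaks; the break values and the well data in the example are hypothetical, and only the 20 mg/l threshold comes from the text.

```python
import bisect

def impacted_frequency(wells, class_breaks, threshold=20.0):
    """Percentage of impacted wells in each vulnerability class.

    wells: iterable of (posterior_probability, nitrate_mg_l).
    class_breaks: ascending probability values separating the classes
                  (three breaks give four classes).
    Returns one percentage per class, or None for a class with no wells.
    """
    n_classes = len(class_breaks) + 1
    totals = [0] * n_classes     # W_i: all wells in class i
    impacted = [0] * n_classes   # W_ci: impacted wells in class i
    for probability, nitrate in wells:
        i = bisect.bisect_right(class_breaks, probability)
        totals[i] += 1
        if nitrate > threshold:
            impacted[i] += 1
    return [100.0 * k / n if n else None for k, n in zip(impacted, totals)]


# Hypothetical wells; the middle break is the prior probability (0.023)
breaks = [0.01, 0.023, 0.05]
wells = [(0.005, 8), (0.005, 12), (0.008, 5), (0.009, 25),
         (0.02, 12), (0.02, 30),
         (0.03, 28), (0.03, 15), (0.04, 40), (0.045, 33),
         (0.08, 55)]
print(impacted_frequency(wells, breaks))
```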

A further check has been performed by analyzing the mean nitrate concentration of wells belonging to each class. All histograms have confirmed the generally good quality of the results; however, a specific ranking of the models has been possible considering the following parameters:

  (1) for the frequency of impacted (or nonimpacted) wells: (a) the angular coefficient of the regression line (the higher the absolute value, the better the classification), and (b) the regression coefficient;

  (2) for the mean concentration: (a) the angular coefficient of the regression line, (b) the regression coefficient, and (c) the agreement with the threshold concentration value, meaning that classes with posterior probability lower than the prior probability should have nitrate concentrations lower than the threshold value, whereas classes with posterior probability higher than the prior probability should have nitrate concentrations higher than the threshold value.
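The ranking criteria listed above can be sketched with an ordinary least-squares fit. The sketch assumes that the "regression coefficient" corresponds to the R² reported with the plots; the class values, function names, and example numbers are hypothetical, while the prior (0.023) and threshold (20 mg/l) are those given in the text.

```python
def linear_fit(xs, ys):
    """Least-squares slope (angular coefficient), intercept, and R^2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def agrees_with_threshold(class_probs, class_means, prior=0.023,
                          threshold=20.0):
    """Criterion (c): a class lies above the threshold concentration
    exactly when its posterior probability lies above the prior."""
    return all((mean > threshold) == (prob > prior)
               for prob, mean in zip(class_probs, class_means))


# Hypothetical results for four vulnerability classes
class_index = [1, 2, 3, 4]
frequency = [25.0, 50.0, 75.0, 100.0]   # % of impacted wells per class
class_probs = [0.005, 0.015, 0.04, 0.08]
mean_nitrate = [8.0, 15.0, 27.0, 41.0]  # mg/l

print(linear_fit(class_index, frequency))
print(linear_fit(class_index, mean_nitrate))
print(agrees_with_threshold(class_probs, mean_nitrate))
```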

The plots in Figure 7 represent the results of simulations Sdmuc 17 and Sdmuc 14. All plots show good agreement among the represented variables; the increase in vulnerability is positively correlated with the frequency of occurrence of impacted wells, with a high R² value (R² = 0.99). This result supports the validity of the method in identifying areas with different degrees of vulnerability. The correlation between vulnerability classes and mean nitrate concentrations shows a strong positive linear relationship with a high R², again indicating the good quality of the results.

Figure 7 Correlation between vulnerability classes, based on posterior probability values, and frequency of occurrence of impacted wells and mean nitrate concentration for Sdmuc 17 (A) and Sdmuc 14 (B)

The comparative analysis of the plots, according to the given criteria, shows that Sdmuc 17 has the highest angular coefficient both for the frequency of impacted wells (albeit with a lower R²) and for the mean concentration. Note also the agreement with the threshold nitrate concentration of 20 mg/l for classes with posterior probability both lower and higher than the prior probability (p = 0.023), which is not respected by Sdmuc 14.

According to these criteria, the best validation results have been obtained for Sdmuc 17, a ranking that differs slightly from the one derived from the success rate curve. This result can be explained by the better performance of model Sdmuc 17 in correctly classifying wells not used as training points, together with a better efficiency in assigning wells to different vulnerability classes according to the measured concentration.

The results of this validation procedure have been considered more reliable than the success rate curve for the following reasons: (a) all wells are used, not only training points; and (b) measured concentrations are considered, giving a more quantitative insight into the analysis of results. It is, however, important to note that this validation procedure gives good results only if the number of measurements of the explanatory variable is large enough to make the statistical approach significant.

The final map obtained with the Sdmuc 17 model (Fig. 8) represents the distribution of vulnerability classes and highlights the presence of the highest vulnerability class in the north-central sector of the area. Vulnerability generally decreases from the center toward the eastern, western, and southern sectors, where almost all the area is covered by the lowest vulnerability class.

Figure 8 Final vulnerability map

Conclusions

In this study the WofE model has been used for assessing groundwater vulnerability considering both natural and anthropogenic variables describing either potential sources of nitrate or the relative ease with which it may migrate to groundwater. The WofE model belongs to the statistical methods used for assessing groundwater vulnerability, and it correlates contaminant occurrence with natural and anthropogenic factors (explanatory variables) characterizing a specific area. The modeling technique has great flexibility in terms of selection and testing of the most appropriate explanatory variables, validation techniques, and iterative refinements of results.

The final map (Sdmuc 17, obtained by combining factors as shown in Table 3), which represents groundwater vulnerability to nitrate in terms of vulnerability classes, shows that highly susceptible zones are concentrated in the northern area of the Province of Milan. Comparison between the spatial distribution of these classes and the frequency of occurrence of nitrate in wells shows a high degree of correlation.

Posterior probability classes and the mean nitrate concentrations of the wells located in each class show a strong positive correlation (R² = 0.99). Mean values are consistent with the threshold concentration of nitrate used to perform the analyses.

According to the contrast values (Table 5), population density is the most important source of nitrate, whereas nitrogen fertilizer loading seems to have the weakest influence; urban areas, with their septic systems and leaking municipal sewers, can therefore be considered the main source of nitrate in the area. In particular (Fig. 7), areas with more than 1,100 inhabitants/km² show a high positive contrast.

Table 5 Contrast and confidence of the evidential themes used for generating test Sdmuc 17

The different simulations performed in the study have shown that the simplest generalization of input variables, respecting the appropriate statistical significance, may be sufficient to obtain a good vulnerability map.