Introduction

The application of stochastic approaches to hydrogeology is emerging from some parts of academia and starting to appear more frequently in discussion meetings of water commissions and regulatory bodies involving decision makers, policy makers and stakeholders. In a changing world, professionals are becoming more open to accepting how uncertain future scenarios are. Uncertainty analysis and scenario evaluation can be taken into account during land-use planning processes and evaluation of natural resources. It is common now to describe events as probable and possible, referring to various degrees of probability (Rubin 2004). Stochastic hydrology theories and approaches have developed considerably during the last 30 years. However, stochastic methods have not been used very frequently in hydrological applications, despite the impressive success of essentially the same approaches in other scientific disciplines like atmospheric modeling, meteorological forecast, ocean monitoring and ecological assessment (Christakos 2004). So far, their application to real-world problems has been limited and they have not become a routine tool for hydrological modeling (Dagan 2002; Zhang and Zhang 2004; Rubin 2004; Renard 2007). One reason is that these methods generally fail when trying to represent complexities inherent to processes that cannot be modeled without rigorous rules. There is a need to balance these rigid theories and their application with quantitative accuracy in field engineering problems involving the subsurface environment (Hunt and Doherty 2011; Ginn 2004). Renard (2007) presented a survey of publications in the field of stochastic hydrogeology over the last two decades. While the total number of publications has increased in all fields of research, the author was surprised by the observation that the proportion of stochastic hydrogeology papers remained constant, suggesting that interest in these techniques within the research community is not increasing. 
Pappenberger and Beven (2006) discuss reasons why uncertainty analysis should not be used and Clarke (2010) points out the misuse of statistical methods in reporting hydro-climatological data. Neuman (2004) believes that the level of mathematical understanding required by stochastic theories is far beyond that of most hydrogeologists, in academia and in practice. Sudicky (2004) agrees, and adds that regulations are based on single numbers, and the uncertainty in the prediction of such numbers (e.g. the probability of exceeding it) is not something regulators can readily deal with. Nourani et al. (2011) believe that problems in data interpretation given by the lack of strong predictive tools (or lack of experienced users of those tools) contribute to a failure to reach consensus about the need for key water-management actions.

The motivation of this work is to present an integration of stochastic methods based on time-series modeling and geostatistics applied to field data collection of water heads and climatological observations, in order to provide reliable information for decision making in groundwater management and land-use planning. The aim is to demonstrate, through a case study, the use of stochastic approaches to explain observed phenomena, and communicate the results in a clear and understandable form, via simulated scenarios of variations in expected water-table levels. The study area is an outcrop zone of the Guarani Aquifer System (GAS) in Brazil, one of the major water reservoirs in the world, although little is known about its dynamics and recharge processes.

Materials and methods

The nature of the problem

The GAS is a transboundary aquifer covering up to 1.2 million km² of the Argentinean, Brazilian, Paraguayan and Uruguayan territories, encompassing almost all the Paraná and Chaco-Paraná sedimentary basins. During the past decade, efforts from the four countries were concentrated on investigating this reservoir and obtaining knowledge about its recharge and discharge mechanisms, in order to achieve sustainable and optimal exploitation of the groundwater. Due to the strategic, social and economic importance of this aquifer to the four countries, it is necessary that use of this water resource is coordinated (OAS 2009).

The implementation of a monitoring system, which allows the acquisition of detailed information about the condition and behaviour at specific sites within the GAS, is a prerequisite for efficient management of the aquifer (Wendland et al. 2007). Stochastic approaches applied to reservoirs as large as the GAS can give some information to decision makers and water commissioners, which allows them to use numerical intervals in risk analysis to solve real groundwater problems. Predicting water levels at unvisited locations and predicting extreme levels on specific dates allows inferences to be made in both spatial and temporal dimensions of a physically based water system (Manzione et al. 2010). There are still many gaps in monitoring network coverage for water levels and quality in the GAS, and there is also a lack of local knowledge about aquifer dynamics. With such general deficiency in data acquisition, stochastic methods represent an alternative way of dealing with uncertainties about water quantification. Accounting for model uncertainty is important for sustainable management of the enormous area covered by the GAS, and for the protection of its recharge areas.

Study area

The GAS was formed during the Jurassic (Botucatu Formation) and Triassic (Piramboia Formation) periods. The Piramboia Formation consists of silty-clayey sandstones of aeolian and restricted fluvial origins, and the overlying Botucatu Formation consists of well-sorted sandstones of aeolian origin (Sracek and Hirata 2002). The Triassic sandstones usually have a larger amount of clay in their lower layers, which diminishes, in relative terms, their hydraulic efficiency (Rabelo and Wendland 2009). The sandstone outcrops on the borders of the Paraná sedimentary basin are responsible for most of the aquifer recharge, and the sandstone is mostly confined by the Serra Geral basalt layer (Cretaceous). The region chosen for the present monitoring work is the Onça Creek watershed, which is located in the outcrop area of the GAS, in the central region of the province of São Paulo, Brazil (Fig. 1). This area of 5,800 ha possesses representative characteristics for the outcrop zone of the GAS and has land uses that are typical of São Paulo province, like sugarcane, reforestation (eucalyptus), citrus, pasture, and some patches of natural Cerrado vegetation, an extensive woodland-savannah present in several regions of South America. The Onça Creek watershed is located between 22°10’ and 22°15’ south and 47°55’ and 48°00’ west, and the creek is a tributary of the Jacaré-Guaçú River. The Onça Creek flows mainly over sandstone of the Botucatu Formation, while at the basin outlet it flows over the Botucatu-basalt complex. Cenozoic soils present in the area result from sandstone weathering, showing homogeneous composition with almost no loam (Wendland et al. 2007). Climatic classification for the region, following Koeppen, indicates a humid subtropical climate (Cwa) with summer rains, showing a variation to a tropical climate with dry winter (Wendland et al. 2007). The mean annual precipitation is about 1,300–1,400 mm, while the annual mean temperature in the region is 20.5 °C.

Fig. 1 Map of South America showing the Guarani Aquifer System outcrop areas and inset map of the Onça Creek watershed, located in the central part of São Paulo province, Brazil

Data sets from the Onça Creek watershed

The Onça Creek watershed is equipped with 23 monitoring wells (Fig. 1, with four wells outside the watershed boundaries), three compact climatological stations and a river station installed in the watershed for groundwater, weather and creek-level monitoring. Water-table depths were observed manually every 15 days, from April 2004 until July 2011, totalling more than 7 years of monitoring. These wells were deliberately selected to cover the range of land uses and possible hydrogeological domains in the area, in an attempt to characterize the different responses of water-table depths in the watershed. The length of the time series of water-table depths is 2,638 days. The filter levels of the wells vary with soil depth, but are mostly around 25 m below ground level. Also, 35-year datasets for precipitation and potential evapotranspiration were available from CRHEA/USP (Centre for Water Resources and Applied Ecology of the University of São Paulo), where climatologic data are collected continuously. These data were available from 1974 until the present date, with a daily frequency. The watershed is located approximately 1.5 km away from the CRHEA station.

Combining time-series models with geostatistics on monitoring data

The characteristics of hydrogeologic systems can vary greatly in space and time, but they are usually sparsely sampled. Knowledge of system parameters is therefore partial at best, and the most that can usually be done is to quantify uncertainty through stochastic, or related, models (Winter 2004). Time-series modeling provides an empirical stochastic method to model monitoring data from observation wells, without the complexity of physical mechanistic models. Also, the stochastic component in the model allows model uncertainty to be taken into account. In time-series analysis, many authors refer to transfer function-noise models to describe the dynamic relationship between climatological inputs and water-table depths (Box and Jenkins 1976; Hipel and McLeod 1994; Tankersley and Graham 1994; Van Geer and Zuur 1997; Yi and Lee 2003; Von Asmuth and Knotters 2004; Von Asmuth et al. 2008). In the same way, geostatistical methods are used to make probabilistic statements about quantities of interest at non-measured locations (Kitanidis 1997; Pebesma 2006). In general, these studies measure an attribute in relation to its spatial coordinates, but other GIS datasets can provide additional information for spatial prediction purposes, entering the prediction equations as predictors in a regression model, or as correlated measures. Pebesma (2006) states that in an environmental context, policy makers may be interested in aggregated, regional estimates from certain smaller or larger regions.

Stochastic simulation of time-series models and calculation of groundwater statistics

The first step in time-series analysis is to identify a model able to describe the time processes (Hipel and McLeod 1994). Ideally, the time-series model should not be based just on statistical methods, but should also include some physical insight about water-table dynamics. The PIRFICT model is an alternative to discrete-time transfer function-noise models (Von Asmuth et al. 2002). PIRFICT is an acronym for “predefined IR function in continuous time”. The predefined IR (impulse response) function is a Pearson III distribution function (PIII df), which describes the model response (water-table variation) to the system inputs. In addition, the PIRFICT model has a stochastic component (noise term) that accounts for model uncertainty. The parameters of the adjusted IR functions have a physical meaning, since they represent the influence of drainage resistance, storage coefficient of the soil and the dispersion time of precipitation through the unsaturated zone (Von Asmuth and Knotters 2004).

In the simple case of an undisturbed phreatic system, the relationship between water-table dynamics and system inputs can be modelled as:

$$ h(t) = \sum_{i=1}^{N} h_{i}^{*}(t) + d + r(t) $$
(1)
$$ h_{i}^{*}(t) = \int_{-\infty}^{t} R_{i}(\tau)\, \theta_{i}(t - \tau)\, d\tau $$
(2)
$$ r(t) = \int_{-\infty}^{t} \varphi(t - \tau)\, dW(\tau) $$
(3)

where $h$ [L] is the observed groundwater level at time $t$ [T]; $h_{i}^{*}$ [L] is the predicted groundwater-level change at time $t$ credited to stress $i$, written as a convolution integral; $d$ [L] is the groundwater level corresponding to the local drainage level, estimated from the data; and $r$ [L] is the residual series. In the second and third equations, $R_{i}$ [L/T] is the value of stress $i$ at time $\tau$, $\theta_{i}$ is the IR function [−], $\varphi$ is the noise IR function [−] and $W$ [L] is a continuous white-noise (Wiener) process with properties $E\{dW(t)\} = 0$, $E[\{dW(t)\}^{2}] = dt$ and $E[dW(t_{1})\,dW(t_{2})] = 0$ for $t_{1} \ne t_{2}$.
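The noise term of Eq. 3 can be illustrated with a short simulation. The sketch below assumes an exponential noise IR function, φ(t) = sqrt(2ασ²)·exp(−αt), which is not stated in this section; under that assumption r(t) is a stationary Ornstein-Uhlenbeck process with standard deviation σ that can be sampled exactly at irregular observation times. The names `simulate_residuals`, `alpha` and `sigma`, and the parameter values, are hypothetical.

```python
import math
import random

def simulate_residuals(times, alpha, sigma, seed=0):
    """Sample r(t) of Eq. 3 at (possibly irregular) times, in days.

    Assumption: phi(t) = sqrt(2*alpha*sigma**2) * exp(-alpha*t), so r(t) is a
    stationary Ornstein-Uhlenbeck process with standard deviation sigma.
    """
    rng = random.Random(seed)
    r = [rng.gauss(0.0, sigma)]  # start from the stationary distribution
    for t_prev, t_now in zip(times, times[1:]):
        decay = math.exp(-alpha * (t_now - t_prev))
        # Exact innovation spread for a time step of arbitrary length.
        innovation_sd = sigma * math.sqrt(1.0 - decay ** 2)
        r.append(decay * r[-1] + rng.gauss(0.0, innovation_sd))
    return r

# Hypothetical irregular observation days and noise parameters.
times = [0, 15, 30, 44, 61, 75]
residuals = simulate_residuals(times, alpha=0.05, sigma=0.3)
```

Because the update uses the actual interval between observations, the same code handles the roughly fortnightly, slightly irregular sampling that is typical of manual monitoring.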

Von Asmuth et al. (2008) also introduced a more complex approach using multiple input stress series. They distinguished several types of stress, including precipitation, evapotranspiration, groundwater withdrawal (or injection), surface-water levels, barometric pressure and hydrological interventions. In the present study, $h_{i}^{*}(t)$ is credited to the precipitation and evapotranspiration stresses, modelled as transfer function noise by the PIRFICT model as follows:

$$ h_{p}^{*}(t) = \int_{-\infty}^{t} p(\tau)\, \theta_{p}(t - \tau)\, d\tau $$
(4)
$$ h_{et}^{*}(t) = \int_{-\infty}^{t} -et(\tau)\, f\, \theta_{p}(t - \tau)\, d\tau $$
(5)

where $p$ is the precipitation intensity at time $t$ [L/T], $et$ is the evapotranspiration intensity at time $t$ [L/T], and $f$ is a reduction function for $et$ [−] depending on soil and land cover. The effect of evapotranspiration $et$ on the groundwater levels is the same as the effect of precipitation $p$, but negative. When Eqs. 4 and 5 are summed as $h_{i}^{*}(t)$ in Eq. 1, evapotranspiration influence is subtracted from precipitation, similar to a water balance. The calibrated IR function $\theta$ and the noise IR function $\varphi$ are responsible for describing the local hydrogeological conditions found at each well location. An important advantage in the use of the PIRFICT model compared with discrete-time transfer function-noise models (Von Asmuth and Bierkens 2005) is that it can deal with input and output series which have different observation frequencies and irregular time intervals, as found with typical monitoring data.
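The deterministic part of Eqs. 1, 4 and 5 can be sketched as a discrete convolution with daily steps. The Pearson type III form used below, θ(t) = A·aⁿ·tⁿ⁻¹·exp(−at)/Γ(n), is an assumption about the parameterization, and the function names, parameter values and input series are hypothetical; the noise term r(t) is left out.

```python
import math

def pearson3_ir(t, A, a, n):
    """Assumed Pearson type III impulse response theta(t); zero for t <= 0."""
    if t <= 0.0:
        return 0.0
    return A * a ** n * t ** (n - 1.0) * math.exp(-a * t) / math.gamma(n)

def simulate_head(precip, evap, A, a, n, f, d, dt=1.0):
    """Discretise Eqs. 1, 4 and 5 on a regular grid, without the noise term."""
    # Evaluate theta at interval midpoints and weight by the step length.
    theta = [pearson3_ir((k + 0.5) * dt, A, a, n) * dt for k in range(len(precip))]
    heads = []
    for t in range(len(precip)):
        total = 0.0
        for tau in range(t + 1):
            net = precip[tau] - f * evap[tau]  # Eq. 4 plus Eq. 5
            total += net * theta[t - tau]
        heads.append(d + total)
    return heads

# Hypothetical constant forcing: 4 mm/day rain, no evapotranspiration.
days = 400
heads = simulate_head([0.004] * days, [0.0] * days,
                      A=200.0, a=0.1, n=1.0, f=0.8, d=10.0)
```

With constant net input the simulated level approaches d plus the input rate times A, since A is the area under the assumed response function; this gives a quick sanity check on calibrated parameters.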

Time-series models, calibrated with observed water-table depths for a limited number of years, enable us to simulate series of extensive length using precipitation surplus/deficit as the input variable (Knotters and Van Walsum 1997). The stochastic simulation of the PIRFICT model is applied to simulate random realizations of the model to predict extreme water-table levels over time. The simulation generates long-term realizations, reconstructed from longer climatological series of precipitation and potential evapotranspiration (e.g. 30 years), or other hydrological records, as inputs. Statistics of groundwater levels can be estimated for the series generated by the PIRFICT model simulation. In this procedure, the PIRFICT model works as a filter in time, given by a convolutional process. Basically, the hypothesis is that signals of finite duration, transmitted through a series of linear systems, eventually assume a simple shape characterized by the IR function adjusted for the system (Mass 1994). This operation considers the shape of the PIII df adjusted from the parameters of each model. The interactions between the input stress signals and the system can be regarded as an operation in the time domain between the frequency expansion of each input signal and the frequency response of the groundwater system. The word signal means an observed and measurable phenomenon, which changes its magnitude in the course of time, and which has a property that is propagated. The signal changes its shape while passing through a system, so that it might be said that the system transforms the signal and the IR function of the composite system can be derived from those subsystems by convolution. The resulting series will represent the prevailing hydrological and climatic conditions, rather than specific meteorological circumstances during the monitoring period of water-table levels (Manzione et al. 2008). 
With this procedure, it is possible to avoid short-term disturbances from extreme climate events in the simulated time series, because the noise in the series has been filtered out, based on the aquifer system response to model climatological inputs (which is much slower than to climate variations). Then, probability density functions (PDF) are calculated for any date of interest in the agricultural calendar of the region to account for extreme levels. These levels are indicated by percentiles in the PDFs. The assessment of risk requires an understanding of extreme events, which are far from average by definition (Winter 2004). The uncertainty is taken into account by the probability thresholds calculated from the PDFs and established in risk management by the decision makers.

In this study, random sampling from a normal distribution was applied, with March 18 and October 12 being chosen for the calculation of groundwater-level statistics. For these dates, the mean highest and mean lowest levels, respectively, were obtained from the monitoring series. To enable risk management of extreme water-table levels, two different levels of probability were calculated, one for each selected date. First, a 95 % probability level scenario was evaluated for March 18 as a measure for shallow water-table risk. A shallow water table can be a problem during the rainy season because it can stop machinery and make ploughing and planting operations impossible. Also, it can influence soil conditions, decreasing soil redox potential, increasing pH in acid soils, decreasing pH in alkaline soils and increasing conductivity and ion-exchange reactions. These modifications in the soil conditions might influence plant growth, by affecting the availability of nutrients and regulating uptake in the rhizosphere. Using the 0.95 percentile of the PDF, it can be said that in this area, the probability of the water levels being higher than the values of the resulting map is just 5 %, and 95 % for deeper water levels. The limit established for risk of shallow water-table levels on March 18 was 0.5 m below the ground surface. Second, a 5 % probability level was considered for October 12 as a measure for water-shortage risk. Using the 0.05 percentile, in this area, there is a 5 % probability that water levels will be deeper than the values of the resulting map, and 95 % for higher water levels. The limit established for water-shortage risk on October 12 was 25 m, exceeding the depth of all wells (dry wells characterize a scenario of water shortage in the area). This can be a problem during the beginning of the plants’ development, affecting water availability and resulting in production losses. 
Mapping these extreme levels, it is possible to determine areas susceptible to extreme shallow and extreme deep levels in the watershed.
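The percentile step can be illustrated for a single well and a single calendar date. This is a minimal sketch: the simulated levels are made-up numbers, expressed relative to an arbitrary datum so that larger values mean a higher (shallower) water table, and the helper names `percentile` and `fraction_above` are hypothetical.

```python
def percentile(values, q):
    """Empirical q-quantile (0 <= q <= 1) with linear interpolation."""
    data = sorted(values)
    pos = q * (len(data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (pos - lo) * (data[hi] - data[lo])

def fraction_above(values, threshold):
    """Share of realizations in which the level is above the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)

# Hypothetical simulated levels (m) for one date, one value per realization.
levels = [700.0 + 0.05 * k for k in range(101)]

p95 = percentile(levels, 0.95)  # level exceeded in about 5 % of realizations
p05 = percentile(levels, 0.05)  # level exceeded in about 95 % of realizations
share_higher = fraction_above(levels, p95)
```

Comparing `p95` with the ground elevation minus 0.5 m, or `p05` with the ground elevation minus 25 m, would then flag the well as a shallow-level or water-shortage risk under the limits quoted above.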

Geostatistical analysis: predicting water-table levels in time and space

The transfer function-noise models, once calibrated to a set of time series observed in various wells in an area, can have their predictions interpolated spatially, using ancillary information related to the physical basis of these models (Knotters and Bierkens 2001). Ancillary information can be used to improve model predictions since there is a strong relationship between the secondary variables and the target variable. Also, using geostatistical interpolators is another approach to aggregating physical knowledge about the processes into the statistical model. When abundant ancillary information related to the processes is available, it is possible to not only improve the accuracy of the spatial predictions, but also yield more plausible spatial patterns and enhance the physical meaning of the maps, using digital elevation models (DEM) and/or classified satellite images (Hengl et al. 2007; Manzione et al. 2010; Peeters et al. 2010).

In this approach, the spatial differences in water-table dynamics are determined by the spatial variation in the system properties, while its temporal variation is driven by the dynamics of the input into the system. In this case study, the values of simulated water-table levels are interpolated spatially using geostatistical techniques (Pebesma 2004). The variability of n observations Z(s), with s denoting spatial location, is the sum of a trend m(s) and a residual e(s):

$$ Z(\mathbf{s}_{i}) = m(\mathbf{s}_{i}) + e(\mathbf{s}_{i}), \qquad i = 1, \ldots, n $$
(6)

where the trend m(s) is a deterministic linear function of p + 1 unknown constants $\beta_{k}$ and known covariates $f_{k}(\mathbf{s})$, fitted by linear-regression analysis:

$$ m(\mathbf{s}_{i}) = \beta_{0} + \beta_{1} \cdot f_{1}(\mathbf{s}_{i}) + \ldots + \beta_{p} \cdot f_{p}(\mathbf{s}_{i}) $$
(7)

and the residual e(s) is a zero-mean random variable with stationary covariance that depends only on the separation vector $\mathbf{s}_{i} - \mathbf{s}_{j}$; it is interpolated using kriging.

Complete coverage of the covariates f(s) in the database, including the unobserved locations $\mathbf{s}_{0}$ where $Z(\mathbf{s}_{0})$ is to be predicted, helps to use all available information optimally, to find satisfactory agreement between model and data, and to yield adequate predictions (Pebesma 2006). The universal kriging predictor of $Z(\mathbf{s}_{0})$ is:

$$ \widehat{Z}(\mathbf{s}_{0}) = \sum_{k=0}^{p} \widehat{\beta}_{k} \cdot f_{k}(\mathbf{s}_{0}) + \sum_{i=1}^{n} \lambda_{i} \cdot e(\mathbf{s}_{i}), \qquad f_{0}(\mathbf{s}_{i}) = 1 $$
(8)

The $\widehat{\beta}_{k}$ are trend-model coefficients estimated using generalized least squares (GLS), $f_{k}(\mathbf{s}_{0})$ is the k-th external explanatory variable (predictor) at location $\mathbf{s}_{0}$, p is the number of predictors, $\lambda_{i}$ are the kriging weights and $e(\mathbf{s}_{i})$ are the zero-mean regression residuals. The model is considered to be the best linear predictor for spatial data (Christensen 2001). The zero-mean spatially correlated residual has its spatial-correlation structure characterized by a semivariogram. The semivariogram is the spatial estimator of dependence between observation points and provides information to the kriging system when performing spatial interpolation. The results of spatial interpolation are evaluated using cross-validation (Chilès and Delfiner 1999; Pebesma 2004).
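The predictor of Eq. 8 can be sketched with NumPy as a GLS trend plus kriged residuals. This is an illustration under stated assumptions rather than the software used in the study: the covariance function `exp_cov`, the coordinates, elevations and levels are all hypothetical, and a single covariate (elevation) stands in for the predictors.

```python
import numpy as np

def universal_kriging(xy, z, covariates, xy0, covariates0, cov_fn):
    """Predict Z(s0) as a GLS trend plus kriged residuals (form of Eq. 8)."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    # Design matrix with f_0 = 1 and the external predictors (e.g. elevation).
    F = np.column_stack([np.ones(len(z)), covariates])
    f0 = np.concatenate([[1.0], np.atleast_1d(covariates0)])

    # Covariances between observation sites, and between the sites and s0.
    dists = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    C = cov_fn(dists)
    c0 = cov_fn(np.linalg.norm(xy - np.asarray(xy0, dtype=float), axis=1))

    # GLS estimate of the trend coefficients beta_k.
    Cinv_F = np.linalg.solve(C, F)
    beta = np.linalg.solve(F.T @ Cinv_F, Cinv_F.T @ z)

    # Kriging weights lambda_i applied to the regression residuals e(s_i).
    residuals = z - F @ beta
    weights = np.linalg.solve(C, c0)
    return float(f0 @ beta + weights @ residuals)

def exp_cov(h):
    """Hypothetical exponential covariance: variance 20, length scale 300 m."""
    return 20.0 * np.exp(-np.asarray(h, dtype=float) / 300.0)

# Hypothetical wells: coordinates (m), ground elevation (m) and level (m).
xy = [(0, 0), (400, 0), (0, 400), (400, 400), (200, 150)]
elev = [700.0, 720.0, 735.0, 760.0, 728.0]
level = [695.0, 712.0, 724.0, 745.0, 719.0]
prediction = universal_kriging(xy, level, elev, (250, 250), 740.0, exp_cov)
```

At an observation site the weights select that site's own residual, so the predictor returns the observed value; away from the wells the prediction is pulled towards the elevation trend, which is how the DEM carries information into unsampled parts of a watershed.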

The DEM with 10-m resolution was used as ancillary information to predict water-table depths for the whole Onça Creek watershed (Fig. 2). The resulting maps can be used to evaluate possible states of the phreatic surface or risk areas for extreme deep or shallow water-table levels during any period of the year. This is valuable information for decision making and water-management policies, in order to balance economic and ecological purposes in groundwater exploitation. The method presented has demonstrated potential for unconfined and porous aquifers in Brazil. Manzione et al. (2010) presented the results of stochastic modeling in the Brazilian Cerrados using maps with physical meaning determined by the relationship between water-table levels and elevation, considering them more understandable for the general public, rather than tables containing statistical intervals.

Fig. 2 a Digital elevation model with 10-m resolution in m above sea level (asl); and b Landsat 5 R(5)G(4)B(3) image composition for 2008’s land use for the Onça Creek watershed

Results and discussion

Calibration and simulation of time-series models in the Onça Creek watershed

The calibrations and simulations of the PIRFICT model were performed for 21 wells. Two wells of the monitoring network (wells 1 and 6) were removed from the analysis due to problems in the wells and discontinued monitoring. The calibration statistics calculated for these 21 wells are presented in Table 1.

Table 1 Statistics of PIRFICT-model calibrations for 21 monitoring wells in the Onça Creek watershed from April 2004 to July 2011

The calibrations suggest reasonable model performance, with an average percentage of the variance explained by the models (EVP) of 78.80 %; average root mean square error (RMSE) of 0.530 m; and average root mean squared innovation (RMSI) of 0.271 m. The RMSI values represent the mean model error due to time lag (t versus t – 1), and are understood as a measure of model accuracy (Von Asmuth and Bierkens 2005). The RMSI values are smaller than the RMSE values, which is consistent with the PIRFICT-model assumptions, because the model preserves the memory of previous time steps in order to adjust the subsequent predictions to actual observations. Considering the length of the time series, with 7 years of monitoring data, the results are satisfactory. In some wells, the results were affected by changes in land use. At well 8, for example, the land use changed from reforestation with eucalyptus to sugar cane. The water demand from these fields is different, and in this case, precipitation and evapotranspiration are not the only influences on the response of the aquifer system. Wells 9 and 10 are located under reforestation with eucalyptus. During the monitoring period the forest was cleared, and without the trees, the water levels rose until the plants sprouted again and started to intercept the meteoric water that reaches the soil and infiltrates. In these cases, Von Asmuth et al. (2008) recommend introducing other inputs into the PIRFICT model in order to characterize the aquifer response due to multiple stresses.
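The three calibration statistics can be sketched as follows. The definitions are assumptions, since the section does not spell them out: EVP is taken as the relative reduction of the observation variance by the residual variance, and the innovations are taken as the part of each residual not predicted from the previous one under an exponential noise decay with rate `alpha`. All names and numbers are hypothetical.

```python
import math

def calibration_stats(observed, residuals, times, alpha):
    """Return EVP (%), RMSE and RMSI under the assumed definitions."""
    n = len(observed)
    mean_h = sum(observed) / n
    var_h = sum((h - mean_h) ** 2 for h in observed) / n
    mean_r = sum(residuals) / n
    var_r = sum((r - mean_r) ** 2 for r in residuals) / n
    evp = max(0.0, 100.0 * (var_h - var_r) / var_h)

    rmse = math.sqrt(sum(r ** 2 for r in residuals) / n)

    # Innovation: residual minus its prediction from the previous residual.
    innovations = [
        residuals[i] - math.exp(-alpha * (times[i] - times[i - 1])) * residuals[i - 1]
        for i in range(1, n)
    ]
    rmsi = math.sqrt(sum(v ** 2 for v in innovations) / len(innovations))
    return {"evp": evp, "rmse": rmse, "rmsi": rmsi}

# Hypothetical observations (m), residuals (m) and observation days.
stats = calibration_stats(
    observed=[10.0, 10.5, 11.0, 10.4],
    residuals=[0.2, -0.1, 0.1, -0.2],
    times=[0, 15, 30, 45],
    alpha=0.05,
)
```

When successive residuals are positively correlated, subtracting the predicted carry-over shrinks the innovations, which is why an RMSI below the RMSE indicates that the noise model is capturing memory in the residuals.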

Figures 3 and 4 present the calibrated models for wells 16, 17, 18 and 19, and their adjusted IR functions, respectively. These wells are located along a transect, where well 16 is the closest to the drainage and well 19 the most distant, with wells 17 and 18 between them. The levels are shallower and react faster close to the river, as can be seen in the behaviour of well 16. Farther from the river, the soil coverage gets thicker and the memory of the aquifer system increases.

Fig. 3 Examples of PIRFICT-model calibrations for water-table level time series at wells 16, 17, 18 and 19 in the Onça Creek watershed (symbols = observations; lines = modeling results)

Fig. 4 Impulse response functions adjusted for the input series of precipitation, for wells 16, 17, 18 and 19 in the Onça Creek watershed

Once the dynamic relationship between precipitation surplus and water-table levels was established and calibrated for all wells, the PIRFICT model was simulated 1,000 times. The resulting series are 1,000 realizations of the PIRFICT model with 30-year lengths, which were sampled for the dates March 18 and October 12. The procedure reconstructed the series of water-table levels for the unobserved period, based on the longer climatological series of precipitation and evapotranspiration that were available. The simulated series preserved the seasonal patterns of water-table oscillation in each of the wells, and the confidence intervals of each series were narrow, denoting a good performance for the simulation.

Mapping simulated water-table levels for selected dates

To estimate the water-table levels for the whole area of the Onça Creek watershed, the spatial correlation structure of the selected percentiles of the simulated time series for March 18 and October 12 was characterized by semivariograms with a linear trend given by elevation. The interpolation considered a linear combination of the observed values and a deterministic trend derived from the DEM with 10-m resolution in the universal kriging system. The coefficient of determination (r²) of the linear relationship between elevation and the predicted water-table levels was 0.78 for March 18 and 0.76 for October 12. This supports the hypothesis that in the upper elevations, the water levels are deeper than in the lower elevations (close to the drainage), where the levels are shallow, following the geomorphological patterns of the watershed (Kitanidis 1997). The relationship between variables not only improved the estimations but also aggregated information for the western part of the watershed where there were no wells (see Fig. 1). The spatial structure modelled for both scenarios was similar. For March 18, a spherical model was used with 0.5 nugget effect, 20.5 sill and 1,000-m range, and for October 12, a spherical model was used with 0.5 nugget effect, 21.5 sill and 1,000-m range. The semivariance decreased by 80 and 71 % for the two models, respectively, when elevation was added as ancillary information to estimate water-table levels, compared with semivariograms that used only water-level observations to model the spatial structure of the simulated levels.
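The fitted spherical model can be evaluated directly from the reported parameters. The sketch below assumes that the quoted sill is the total sill (nugget included) and uses the standard spherical semivariogram formula; the function name is hypothetical.

```python
def spherical_semivariogram(h, nugget, sill, range_m):
    """Spherical semivariogram; `sill` is assumed to be the total sill."""
    if h <= 0.0:
        return 0.0
    if h >= range_m:
        return sill
    ratio = h / range_m
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)

# Parameters reported for March 18: nugget 0.5, sill 20.5, range 1,000 m.
lags = [0.0, 250.0, 500.0, 1000.0, 1500.0]
gamma_march = [spherical_semivariogram(h, 0.5, 20.5, 1000.0) for h in lags]
```

Under this reading, pairs of locations more than 1,000 m apart are treated as spatially uncorrelated in the residuals, so beyond that distance the prediction relies on the elevation trend alone.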

The maps with the 0.95 percentile of the water-table level PDF simulated for March 18 and detected risk areas of shallow water-table levels are presented in Fig. 5. In wet years, the map of 0.95 percentiles can be used, for instance, to select areas with risk of shallow levels that can stop machinery and delay field operations (Manzione et al. 2010). In the case of March 18, the two risk spots shown in Fig. 5b correspond to a wetland in the outlet of the watershed where no agricultural practice takes place and to a barrage built for irrigation purposes. The maps with the 0.05 percentile of the water-table level PDF simulated for October 12 and detected risk areas of water-table levels deeper than 25 m are presented in Fig. 6. In dry years, the map of 0.05 percentiles can be used to select areas with risk of water shortage and dry wells (Manzione et al. 2008). In Fig. 6, it is possible to detect four spots with risk of levels deeper than 25 m. The spots on the borders of the watershed deserve attention because of the high uncertainty about the predictions, and the spot in the eastern part of the watershed because it is close to irrigated citrus fields. These limits are just examples and can be changed and adjusted for different purposes.

Fig. 5 a Estimated water-table depths that will be exceeded with 5 % probability; and b areas with risk of water-table depths shallower than 0.5 m; on March 18, for the Onça Creek watershed

Fig. 6 a Estimated water-table depths that will be exceeded with 95 % probability; and b areas with risk of water-table depths deeper than 25 m; on October 12, for the Onça Creek watershed

The scenarios presented for March 18 and October 12 contain valuable information about groundwater dynamics in the watershed, even with all the sources of uncertainty already mentioned. From these maps, water boards and land-use planners can start a discussion about strategies for the next cultivation season and/or harvest planning, based on the prevailing climatological condition and soil-water balance. It is important to mention that predictions for areas with shallow levels are more reliable than those for areas with deep levels, since the range of the variation is much smaller and less uncertain. It is also an indicator that efforts to improve the monitoring network should be concentrated on those areas with large uncertainty, if it is desired to improve model results. Some would say that scenarios for extreme percentiles are unrealistic, since it would be impossible to imagine that in all spots in an area the water-table levels are reaching such extreme thresholds. However, for planning activities, these maps could reveal potential risk areas that deserve attention during land-use regulation and when assessing the resilience of agricultural activities. Resilient systems can have a higher probability of success in these uncertain times of climate change.

The results of the spatial interpolation for March 18 and October 12 were evaluated by cross-validation (Table 2). For March 18, the standard deviation (SD) values for the observations are higher than those of the predictions, pointing to smoothed interpolation values (for example, 8.38 versus 7.33). The mean predicted values reflect the mean observed water-table levels. Also, the mean interpolation errors are small (−0.04 m). The mean and standard deviation of the Z-score had values close to zero and one, respectively, characteristics of the kriging system (Chilès and Delfiner 1999; Christensen 2001) as a BLUE (Best Linear Unbiased Estimator). For October 12, the SD values for the observations are also higher than those of the predictions (for example, 8.10 versus 6.89). The mean predicted values reflect the mean observed water-table levels and the mean interpolation errors are small (−0.01 m). The mean and standard deviation of the Z-score had values close to zero and one, respectively. The results for both dates suggest good performances of the kriging system and the estimations.
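The cross-validation summary can be reproduced from three columns per well: the observed value, the value predicted with that well left out, and the kriging standard deviation of that prediction. The sketch below assumes the error is defined as predicted minus observed, which the section does not state; the data and the function name are hypothetical.

```python
import statistics

def cross_validation_summary(observed, predicted, kriging_sd):
    """Summarise leave-one-out results (error assumed = predicted - observed)."""
    errors = [p - o for p, o in zip(predicted, observed)]
    z_scores = [e / s for e, s in zip(errors, kriging_sd)]
    return {
        "sd_observed": statistics.stdev(observed),
        "sd_predicted": statistics.stdev(predicted),
        "mean_error": statistics.mean(errors),
        "z_mean": statistics.mean(z_scores),
        "z_sd": statistics.stdev(z_scores),
    }

# Hypothetical leave-one-out results for three wells (m).
summary = cross_validation_summary(
    observed=[10.0, 12.0, 14.0],
    predicted=[11.0, 12.0, 13.0],
    kriging_sd=[1.0, 1.0, 1.0],
)
```

A predicted standard deviation below the observed one reflects the smoothing effect of kriging, while a Z-score mean near zero and standard deviation near one suggest that the kriging variances are realistic estimates of the actual prediction errors.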

Table 2 Summary of cross-validation for spatial interpolation of water-table levels for March 18 and October 12
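The cross-validation statistics discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to produce Table 2: the function name, the sign convention for the error (predicted minus observed), the definition of the Z-score as the error divided by the kriging standard deviation, and all numerical values are hypothetical.

```python
import numpy as np


def cross_validation_summary(observed, predicted, kriging_sd):
    """Summarise cross-validation results of a kriging interpolation.

    observed   : water-table levels measured at the wells
    predicted  : levels predicted at each well with that well left out
    kriging_sd : kriging standard deviation of each prediction
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    kriging_sd = np.asarray(kriging_sd, dtype=float)

    errors = predicted - observed   # assumed sign convention
    z_scores = errors / kriging_sd  # standardised errors

    return {
        "sd_observed": observed.std(ddof=1),
        "sd_predicted": predicted.std(ddof=1),
        "mean_error": errors.mean(),
        "z_mean": z_scores.mean(),
        "z_sd": z_scores.std(ddof=1),
    }


# Hypothetical values for illustration only (not the data behind Table 2).
observed = [12.0, 8.5, 20.1, 15.3, 5.2]
predicted = [11.4, 9.3, 18.7, 15.9, 6.5]
kriging_sd = [1.1, 0.9, 1.6, 0.8, 1.2]

summary = cross_validation_summary(observed, predicted, kriging_sd)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A predicted SD lower than the observed SD reflects the smoothing effect of the interpolation, while a Z-score mean near zero and a Z-score SD near one suggest that the kriging variances are consistent with the actual errors.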

Because the information needed to obtain better spatial interpolation results is lacking, the degree of uncertainty in the maps should be considered in further analysis. The uncertainty arises from several sources: uncertainties related to the data (observed water-table levels, climatic database, measured and estimated quantities); uncertainty associated with time-series modeling (model calibrations); and uncertainty associated with the model of spatial variation (lack of data to cover the whole watershed and description of the spatial structure of the phenomena). However, understanding these errors gives a margin for water planning; for example, installing longer cables and connections during well construction and positioning the water pump 10 m deeper for operation. Also, for more than half of the watershed area, there was previously no information about the water-table dynamics. Now it is possible to infer its behavior from the geomorphology of the region, based on the physical meaning of the final maps.

Integration of stochastic methods as a framework to predict water-table levels for GAS

This kind of integration of methods deserves attention because it can offer a framework for establishing a dialogue between basic research and applied technology (Christakos 2004). Stochastic models can capture the evolution of risk as a dynamic concept, essentially a probabilistic concept that changes in space and time (Winter 2004). For Freeze (2004), there is a need to adjust the complexity of modeling to decision-makers' expectations, and he remarks that uncertainty analysis feeds risk analysis and consequently economic analysis. In developing countries like Brazil, where data scarcity is often a problem, model simplicity is sometimes required if modelling results are to be applied routinely in decision systems. It helps clients to select the most cost-efficient strategies for their purposes. Pappenberger and Beven (2006) said that users need predictions for future decisions based on an assessment of potential future states and the risk associated with them, products of uncertainty estimation. For Neuman (2004), real-world situations must be analyzed using methods that incorporate realistic combinations of randomness (stochasticity) and determinism (via conditioning). Dagan (2002), Neuman (2004) and Renard (2007) highlight the promotion of stochastic software to facilitate application of these concepts by practitioners. In addition, it is recommended that visualizations of evaluated scenarios be provided, to enhance the interpretation of the results via maps. The fact that many decision makers cannot deal with uncertainty, a topic addressed by Pappenberger and Beven (2006), Christakos (2004), Freeze (2004), Sudicky (2004), Renard (2007) and others, motivates efforts to familiarize the general public with probabilistic concepts and to develop tools which allow the results to be visualized.
In this process, it is necessary to continue and increase the communication of these kinds of results from academia and from professionals familiar with stochastic theories (Ginn 2004; Renard 2007). Maps containing physical information about the underlying problem are important, since the visualization of groundwater scenarios is difficult for many people. The degree of uncertainty contained in a map should be evaluated and taken into account. Depending on the purpose of the study, the level of uncertainty can be established and used as a risk measure.

The Brazilian government and local authorities are making several efforts to improve the GAS monitoring network coverage for hydro-climatological purposes. In the near future, a wide range of time series will be available for practitioners, so that they can analyse the data for specific studies. The results of the Environmental Protection and Sustainable Development of the Guarani Aquifer System project (OAS 2009) are already available, and present new discoveries about the reservoir. New agreements have been reached to continue these studies. All sources of information can be aggregated in the decision-making processes, to help groundwater management at different scales, from the watershed scale (which forms the basis of the Brazilian National Water Law) to the regional and continental domains of the GAS. In particular, for groundwater and for the GAS, it is an opportunity to reset what Hunt and Doherty (2011) called the balance of theory and application. The authors ask for a renewed emphasis on applications designed to test theoretical constructs through collaboration between theoretical and applied researchers, with training elements explicit in the work plan, not left to chance. Sustainable management and protection of GAS resources can be achieved by analysing the aquifer globally but acting locally on water-management issues. Accurate predictions will be difficult to reach for the whole aquifer domain. Local actions should be based on local analysis of aquifer dynamics, with possible states or scenarios for groundwater management evaluated, discussed and considered. Stochastic methods appear to be a useful alternative for introducing research and field monitoring results into the discussion meetings. Understanding model uncertainty and experimenting with these results is desirable for this purpose.

Conclusions

Time-series analysis, combined with geostatistics, modelled the dynamic behaviour of water-table levels in the Onça Creek watershed, resulting in maps with possible water-table levels for specific dates. The PIRFICT model described different hydrological behaviours within a watershed, from the same data set, and simulated water levels for the selected dates of March 18 and October 12. Stochastic hydrogeology applications can provide useful information for real-world problems and assist the decision-making processes in groundwater management and long-term water policy. The visualization of the final results in the form of maps enhanced the comprehension of the stochastic results, as is routine in land-use planning and evaluation of natural resources. For large aquifers like the GAS, it is important to explore and aggregate different sources of information when modeling monitoring data, and to consider uncertainty in the decision-making processes for groundwater management and quantitative estimations.