1 Introduction

The presence of pollutants in groundwater and their impact on human health and ecosystem services have led to efforts in developing physics-driven models that aim to simulate the dynamics of dissolved substances. Aquifers are complex systems to model given that hydraulic properties are spatially heterogeneous over a multitude of scales Rubin (2003); Sahimi (2011). The spatial fluctuations of these properties impact the overall spreading and mixing behavior of a solute body Berkowitz et al. (2002); Dentz et al. (2011) and, therefore, environmental performance metrics such as solute arrival times and peak concentrations, which are critical for probabilistic risk analysis Andričević and Cvetković (1996); Maxwell et al. (2008); de Barros et al. (2012); Henri et al. (2016); Moslehi and de Barros (2017). Furthermore, due to multiple factors such as sparse site characterization measurements, our inability to resolve the variability of hydrogeological properties at all scales, and model uncertainties, groundwater contaminant predictions are subject to uncertainty Carrera (1993); Rubin (2003). To tackle these uncertainties, groundwater hydrologists have resorted to the use of probabilistic tools. These stochastic methods have been receiving increasing attention in both the scientific community and environmental regulatory bodies Kelly and Campbell (2000); USEPA (2001); Verdonck et al. (2005).

The stochastic characterization of solute transport in heterogeneous porous media flows has been considered in several theoretical studies Matheron and De Marsily (1980); Dagan et al. (1992); Fiori et al. (2002); Morales-Casique et al. (2006); Andricevic (2008). Through the use of analytical methods, many works related the spatial moments of a dissolved constituent with the geostatistical structure of the hydraulic conductivity field Kitanidis (1988); Dagan (1991); Fiori (1998); Attinger et al. (2004); Dentz and de Barros (2015). The mean and variance of the solute resident concentration were also investigated for stratified media Fiori and Dagan (2002); Fernàndez-Garcia et al. (2008) and multivariate Gaussian logconductivity fields Rubin et al. (1994); Kapoor and Kitanidis (1998); Fiori and Dagan (2000); Tonina and Bellin (2008); de Barros et al. (2011). Methods aimed at computing the full probabilistic description of the concentration at a given point in space and time are also reported in the literature Shvidler and Karasaki (2003); Dentz and Tartakovsky (2010); de Barros and Fiori (2014); Boso and Tartakovsky (2016). Other works examined the uncertainty in the solute mass discharge at a control plane, i.e., the Breakthrough Curve (BTC) Cvetković et al. (1992); Fiori et al. (2002); de Barros (2018); Fiori (2001) and arrival times Cvetkovic and Shapiro (1990); Rubin and Dagan (1992); Sanchez-Vila and Guadagnini (2005), as well as the longitudinal mass distribution at a given time Harvey and Gorelick (2000); Berkowitz et al. (2006); Fiori et al. (2013).

An important measure of contamination lies in the maximum point concentration in groundwater at a given time. The maximum concentration of a given substance is at the basis of most environmental regulatory practices, e.g., US-EPA’s maximum contaminant level (MCL), where maximum tolerable levels of concentration are typically prescribed for a variety of known contaminants. Hence, the evaluation of the maximum concentration of contaminants in groundwater is of crucial importance in risk assessment for a variety of flow configurations, as recently shown, e.g., by Okkonen and Neupauer (2016) for capture zone delineation. The assessment of the maximum local concentration is a challenging task when dealing with heterogeneous porous media, given that this environmental performance metric is more prone to uncertainty than other aggregated and more “robust” transport metrics, such as the BTC or the longitudinal mass distribution (e.g., Jankovic et al. (2017)). Full-blown numerical Monte Carlo simulations were performed to evaluate the uncertainty of the maximum concentration at a control plane in permeability fields displaying long-range correlations, see e.g., Moslehi and de Barros (2017). Similar maximum concentration uncertainty quantification analysis by means of numerical Monte Carlo simulations is also reported in the literature Siirila and Maxwell (2012); Libera et al. (2019). The uncertainty of the maximum concentration was also used within the context of well vulnerability criteria Enzenhoefer et al. (2012); Libera et al. (2017). The temporal behavior of the maximum concentration of a contaminant in groundwater depends on several factors that include the complex interplay between large-scale advection, which is determined by the spatial distribution of the aquifer’s hydraulic properties, and local-scale dispersion. Other key elements include the size of the solute plume after its initial release in the subsurface and the mean aquifer velocity, amongst many other factors.

The disentanglement of the several factors ruling the maximum concentration is a formidable task, and unfortunately numerical models are of limited help as they typically struggle in representing both large- and local-scale features of transport in complex groundwater systems and the resulting local concentration field Boso et al. (2013). In turn, physically based “bottom-up” Hrachowitz et al. (2017) analytical models, characterized by lesser complexity and simpler parametrization, may considerably help in elucidating the significant components of transport and their impact on the maximum concentrations. This is in line with the many different past approaches based on simple analytical formulations that have helped to significantly advance the field of groundwater hydrology over the last five decades, see, e.g., Bear (1988, 2007).

The key contribution of this paper lies in the analytical investigation of the maximum concentration in spatially heterogeneous porous media under mean uniform flow. We shall make use of the stochastic Lagrangian concentration framework developed by Fiori (2001) that aims at predicting the spatiotemporal dynamics of a solute body. This same concept was employed to estimate the dilution index of a solute plume in a heterogeneous aquifer and was tested against the Cape Cod field data (de Barros et al. 2015), the Borden site data (Soltanian et al. 2020), and high-resolution SPH numerical simulations (Boso et al. 2013; de Barros et al. 2015). The framework adopted in our work will serve as a platform to sort out the factors ruling the spatiotemporal behavior of the maximum concentration, identifying the principal components, together with a sensitivity analysis of the main parameters and their physical meaning. Hence, the model will help in identifying the major components that determine the maximum concentration, which is important in order to better allocate resources toward site characterization. The ultimate scope is to provide a theoretical framework that is application-oriented to estimate the maximum concentration in natural aquifers and provide some guidance in applications; it provides a useful tool for preliminary screening analysis and testing scenarios. A key advantage of the proposed approach is that it relies solely on parameters that are physically based and can in principle be inferred from site characterization and monitoring campaigns. We test the performance of the method by application to the well-known MADE-1 experiment (Adams and Gelhar 1992); the test is particularly relevant and challenging as the MADE site is a highly heterogeneous aquifer, and to the best of our knowledge this is the first time that a theoretical model is applied specifically to the analysis of the maximum concentration at MADE.

2 Problem Formulation

One of the key environmental performance metrics used for risk assessment and aquifer remediation is the maximum concentration observed or estimated at an environmentally sensitive target such as an observation well or a control plane. Let \(C_{\mathrm {max}}\) denote the maximum concentration within a flow domain \(\mathcal {D}\) at a given time t. The Cartesian coordinate system is represented by \(\mathbf {x} = [x_1,\ldots ,x_d]\) where d is the space dimensionality of the flow domain. The maximum concentration can be defined as

$$\begin{aligned} C_{\mathrm {max}}(t) = \max \limits _{\mathbf {x} \in \mathcal {D}} c(\mathbf {x},t). \end{aligned}$$
(1)

Here, c denotes the solute concentration field, defined at the Darcy scale (Dagan 1989). Given multiple sources of uncertainty and the high costs associated with site characterization, the concentration field \(c(\mathbf {x},t)\), and therefore \(C_{\mathrm {max}}\), is conveniently modeled as a random function.

The goal of this paper is to estimate the maximum concentration by means of an analytical framework. In order to achieve this goal, we consider a three-dimensional (\(d = 3\)) steady-state flow through a porous medium characterized by a locally isotropic, spatially heterogeneous hydraulic conductivity field K under natural gradient in the absence of sinks, sources and boundary effects. Under these conditions, flow is uniform-in-the-mean along the \(x_1\) direction, which is a rather common condition in most parts of an aquifer. The mean velocity vector is \(\langle \mathbf {V} \rangle \) \(=\) (U, 0, 0) where the angled brackets denote the ensemble average operator. The governing equation for the flow field is

$$\begin{aligned} \nabla \cdot [ K(\mathbf {x}) \nabla h(\mathbf {x}) ] = 0, \end{aligned}$$
(2)

where h is the hydraulic head. The velocity field \(\mathbf {V}\) is obtained via Darcy’s law (Bear 1988):

$$\begin{aligned} \mathbf {V}(\mathbf {x}) = - \frac{K(\mathbf {x})}{\phi } \nabla h(\mathbf {x}) \end{aligned}$$
(3)

where \(\phi \) denotes the formation’s porosity, here assumed to be constant.

The logconductivity, i.e., \(Y = \ln K\), is modeled as a random space function based on two-point geostatistics. The spatial covariance model for Y adopted in this work is a statistically anisotropic exponential model (Rubin 2003)

$$\begin{aligned} \mathcal {C}_Y (r_1,r_2,r_3) = \sigma _Y^2 \exp \left[ -\sqrt{\frac{r_1^2}{I_Y^2}+\frac{r_2^2}{I_Y^2}+\frac{r_3^2}{I_{Y,v}^2}}\, \right] , \end{aligned}$$
(4)

with \(\sigma _Y^2\) denoting the logconductivity variance, \(\mathbf {r} = (r_1,r_2,r_3)\) the lag-distance vector, \(I_Y\) the integral scale along the \(x_1\) and \(x_2\) directions and \(I_{Y,v}\) the integral scale along the vertical \(x_3\) direction. We define the statistical anisotropy ratio as \(f \equiv I_{Y,v}/I_{Y}\).
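As a minimal illustration of the covariance model, the sketch below evaluates an anisotropic exponential covariance for a given lag vector. It assumes the standard exponential form, in which the covariance decays as the exponential of the scaled Euclidean lag; the function and argument names are hypothetical and chosen only for this example.

```python
import math


def log_conductivity_covariance(r1, r2, r3, sigma2_y, i_y, i_yv):
    """Anisotropic exponential covariance of Y = ln K (assumed standard form).

    r1, r2, r3 : lag components along x1, x2 (horizontal) and x3 (vertical)
    sigma2_y   : logconductivity variance
    i_y, i_yv  : horizontal and vertical integral scales
    """
    # Scale horizontal lags by I_Y and the vertical lag by I_{Y,v}.
    scaled_lag = math.sqrt((r1 / i_y) ** 2 + (r2 / i_y) ** 2 + (r3 / i_yv) ** 2)
    return sigma2_y * math.exp(-scaled_lag)


def anisotropy_ratio(i_y, i_yv):
    """Statistical anisotropy ratio f = I_{Y,v} / I_Y."""
    return i_yv / i_y
```

With \(f = 0.1\), a vertical lag of \(0.1 I_Y\) produces the same decorrelation as a horizontal lag of \(I_Y\), which is what the layered structure implies.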

An inert solute is instantaneously injected over a source zone of volume \(\mathcal {V}_o\) \(=\) \(\ell _1 \times \ell _2 \times \ell _3\). The inlet concentration is denoted by \(C_o\) and assumed to be constant. In our work, we will assume a point-like injection of cubic dimensions, i.e., \(\ell \equiv \ell _j\) (for \(j = 1, 2, 3\)), where \(\ell /I_Y \lesssim 1\). The concentration field of the injected solute is provided by the advection-dispersion equation (Bear 1988, 2007)

$$\begin{aligned} \frac{\partial c(\mathbf {x},t)}{\partial t} + \mathbf {V}(\mathbf {x}) \cdot \nabla c(\mathbf {x},t) = D_{\mathrm {d}} \nabla ^2 c(\mathbf {x},t), \end{aligned}$$
(5)

where \(D_{\mathrm {d}}\) represents the local-scale dispersion coefficient, assumed to be constant.

3 Methodology

The methodology adopted in the present work is based on the well-established Lagrangian framework (Dagan 1984; Fiori and Dagan 2000; Rubin 2003). To obtain a solution for the concentration field, we consider a collection of solute parcels that are initially located within the injection zone volume \(\mathcal {V}_o\). A solute parcel that originates from a location \(\mathbf {a} \in \mathcal {V}_o\) will have a random total trajectory \(\mathbf {X}_T\). Therefore, following Fiori and Dagan (2000), the solute concentration can be expressed as

$$\begin{aligned} c(\mathbf {x},t) = C_o \int _{\mathcal {V}_o} \delta [\mathbf {x} - \mathbf {X}_T(t; \mathbf {a})] d\mathbf {a}, \end{aligned}$$
(6)

where \(\delta \) represents Dirac’s delta function. Note that the total solute parcel trajectory can be decomposed into an advective component \(\mathbf {X}\) and a displacement associated with local-scale dispersive mechanisms \(\mathbf {X}_{\mathrm {d}}\), i.e., \(\mathbf {X}_T = \mathbf {X} + \mathbf {X}_{\mathrm {d}}\). Given the random spatial variability of the K field, the advective component can be rewritten as \(\mathbf {X} = \mathbf {a} + \mathbf {U}t + \mathbf {X}^{\prime }\) where \(\mathbf {X}^{\prime }\) corresponds to the random fluctuation of the trajectory due to the randomness of the flow field. As mentioned in Sect. 2, we consider a small injection zone \(\ell /I_Y \lesssim 1\). With the goal of achieving an analytical solution for the concentration field, we will further assume that the heterogeneity of the porous medium is low to mild, i.e., \(\sigma _Y^2 \lesssim 1\), in order to use Dagan’s first-order approximation in \(\sigma _Y^2\) (Dagan 1984).

Our next step consists in considering the trajectory of the solute plume’s center of mass. Following the work of Fiori (2001), we rewrite Equation (6) into a mobile coordinate system \(\varvec{\xi } = \mathbf {x} - \mathbf {P}(t; \bar{\mathbf {x}}_{o})\) centered along the trajectory \(\mathbf {P}\) of the centroid of a solute macro-parcel that originated from the source zone at coordinate \(\bar{\mathbf {x}}_{o} \in \mathcal {V}_o\). Thus, the solute concentration can be rewritten as

$$\begin{aligned} c(\varvec{\xi },t) = C_o \int _{\mathcal {V}_o} \delta [\varvec{\xi } - (\mathbf {X}_T(t; \mathbf {a})-\mathbf {P}(t; \bar{\mathbf {x}}_{o}))] d\mathbf {a}. \end{aligned}$$
(7)

The integration accounts for all possible initial solute parcel locations \(\mathbf {a} \in \mathcal {V}_o\). In agreement with this mobile coordinate system, we can define the relative particle trajectory \(\mathbf {W}(t;\mathbf {a},\bar{\mathbf {x}}_{o})\) \(=\) \(\mathbf {X}_T(t; \mathbf {a})-\mathbf {P}(t; \bar{\mathbf {x}}_{o})\). The relative trajectory \(\mathbf {W}\) filters out the uncertainty associated with the meandering of the centroid trajectory \(\mathbf {P}\). The first-order approximation in the logconductivity variance is employed to derive expressions for the first two moments of \(\mathbf {W}\) for a point-like source

$$\begin{aligned} \langle \mathbf {W}(t;\mathbf {a},\bar{\mathbf {x}}_{o}) \rangle\,=\, & {} \mathbf {a} - \bar{\mathbf {x}}_{o} \approx 0 \end{aligned}$$
(8)
$$\begin{aligned} W_{ii}(t; \mathbf {a},\bar{\mathbf {x}}_{o})\,=\, & {} X_{ii}(t) + 2 D_{\mathrm {d}} t - Z_{ii}(t; |\mathbf {a} - \bar{\mathbf {x}}_{o}|\approx 0), \end{aligned}$$
(9)

where \(X_{ii}\) and \(Z_{ii}\) correspond to the one- and two-particle trajectory covariances, which can be computed as follows (Fiori and Dagan 2000):

$$\begin{aligned} X_{ii} (t)\,=\, & {} \frac{1}{(2 \pi )^{3/2}} \int _0^t \int _0^t \int _{-\infty }^{+\infty } e^{-\mathrm {i} \mathbf {k} \cdot \langle \mathbf {V} \rangle (t^{\prime }-t^{\prime \prime })}\nonumber \\&e^{-k_p k_r D_{\mathrm {d}}|t^{\prime } - t^{\prime \prime }|} \hat{u}_{ii}(\mathbf {k}) d\mathbf {k} dt^{\prime }dt^{\prime \prime }; \end{aligned}$$
(10)
$$\begin{aligned} Z_{ii}(t; |\mathbf {a} - \bar{\mathbf {x}}_{o}|\approx 0)\,=\, & {} \frac{1}{(2 \pi )^{3/2}} \int _0^t \int _0^t \int _{-\infty }^{+\infty } e^{-\mathrm {i} \mathbf {k} \cdot \langle \mathbf {V} \rangle (t^{\prime }-t^{\prime \prime })}\nonumber \\&e^{-k_p k_r D_{\mathrm {d}}(t^{\prime } + t^{\prime \prime })} \hat{u}_{ii}(\mathbf {k}) d\mathbf {k} dt^{\prime }dt^{\prime \prime }. \end{aligned}$$
(11)

Here, \(\hat{u}_{ii}(\mathbf {k})\) corresponds to the Eulerian velocity covariance in Fourier space, \(\mathbf {k}\) is the wave number vector and \(\mathrm {i}\) is the imaginary unit. Note that the limit \(|\mathbf {a} - \bar{\mathbf {x}}_{o}|\approx 0\) present in both Eqs. (8) and (9) is consistent with the small source approximation previously adopted (i.e., \(\ell /I_Y \lesssim 1\)). The expression for \(\hat{u}_{ii}(\mathbf {k})\) is given by Dagan (1989):

$$\begin{aligned} \hat{u}_{ii}(\mathbf {k}) = U^2 \left( \delta _{1i} - \frac{k_1 k_i}{k^2} \right) \left( \delta _{1i} - \frac{k_1 k_i}{k^2} \right) \hat{\mathcal {C}}_Y(\mathbf {k}), \end{aligned}$$
(12)

with \(k^2 = \sum _i k_i^2\) for \(i =\) 1, 2 and 3 and the logconductivity covariance function in Fourier space is as follows (Rubin 2003):

$$\begin{aligned} \hat{\mathcal {C}}_Y(\mathbf {k}) = \sqrt{\frac{8}{\pi }} \sigma _Y^2 I_{Y}^2 I_{Y,v} (1 + k_1^2 I_Y^2 + k_2^2 I_Y^2 + k_3^2 I_{Y,v}^2)^{-2}. \end{aligned}$$
(13)

The following step consists of computing the statistical moments of the concentration field in this mobile coordinate system. We start by evaluating the expected value over all possible relative trajectories \(\mathbf {W}\) captured by its probability density function (PDF) \(f_w\). The first moment is given by

$$\begin{aligned} \langle c(\varvec{\xi },t) \rangle = C_o \int _{\mathcal {V}_o} f_w(\varvec{\xi };t, \mathbf {a}) d\mathbf {a} \end{aligned}$$
(14)

and the variance can be computed by

$$\begin{aligned} \sigma _c^2(\varvec{\xi },t) = C_o^2 \int _{\mathcal {V}_o} \int _{\mathcal {V}_o} f_{ww}(\varvec{\xi },\varvec{\xi };t, t, \mathbf {a},\mathbf {a}^{\prime })d\mathbf {a}d\mathbf {a}^{\prime } -\langle c(\varvec{\xi },t) \rangle ^2 \end{aligned}$$
(15)

where \(f_{ww}\) is the joint PDF of the relative displacement of two solute parcels initially released at locations \(\mathbf {a} \in \mathcal {V}_o\) and \(\mathbf {a}^{\prime } \in \mathcal {V}_o\). Under the first-order approximation in the logconductivity variance, the relative displacement PDF is multivariate Gaussian. Furthermore, it can be shown that \(\sigma _c^2\), Eq. (15), tends to zero for finite Péclet conditions and a point-like injection (see Fiori 2001 for details). The key point of this result is that the concentration can be predicted in a moving coordinate system in the absence of uncertainty. It is important to note that this uncertainty is not eliminated but transferred to the location of the solute parcel’s center of mass. This implies that for a finite Péclet number and a point-like injection, the value of the concentration (in the mobile coordinate system) is not subject to uncertainty; however, the location where that value occurs is. Therefore, Eq. (14) can be used to predict the concentration, i.e., \(\langle c (\varvec{\xi }, t) \rangle \equiv c (\varvec{\xi }, t)\). Carrying out the integration over the particle trajectory PDF \(f_w\), we obtain the following expression for Eq. (14):

$$\begin{aligned} c (\varvec{\xi }, t) = C_o \prod _{i=1}^3 \frac{1}{2} \left\{ \mathrm {erf} \left[ \frac{\xi _i + \ell _i/2}{\sqrt{2W_{ii}(t)}}\right] - \mathrm {erf} \left[ \frac{\xi _i - \ell _i/2}{\sqrt{2W_{ii}(t)}}\right] \right\} . \end{aligned}$$
(16)
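A minimal sketch of how Eq. (16) could be evaluated numerically is given below. It assumes that the relative displacement variances \(W_{ii}(t)\) have already been computed elsewhere (e.g., from Eq. (9)) and are passed in as plain numbers; the function name and argument layout are hypothetical.

```python
import math


def concentration_moving_frame(xi, ell, w_diag, c0=1.0):
    """Evaluate Eq. (16) at position xi in the centroid-following frame.

    xi     : (xi_1, xi_2, xi_3) position relative to the plume centroid
    ell    : (ell_1, ell_2, ell_3) initial source dimensions
    w_diag : (W_11, W_22, W_33) relative displacement variances at time t,
             assumed precomputed and strictly positive
    c0     : source concentration C_o
    """
    c = c0
    for xi_i, ell_i, w_ii in zip(xi, ell, w_diag):
        scale = math.sqrt(2.0 * w_ii)
        upper = math.erf((xi_i + 0.5 * ell_i) / scale)
        lower = math.erf((xi_i - 0.5 * ell_i) / scale)
        c *= 0.5 * (upper - lower)
    return c
```

Each directional factor is symmetric about \(\xi _i = 0\) and largest there, so the concentration peaks at the plume centroid.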

We emphasize that for plume sizes that are not point-like, the concentration (in the mobile coordinate system) is a random variable subject to uncertainty, quantified by its variance \(\sigma _c^2(\varvec{\xi },t)\), whose computation is rather involved. Nevertheless, as discussed above, such uncertainty is much lower than the one pertaining to the standard Eulerian approach (see Fiori and Dagan 2000), as most of the variability is filtered out by the Lagrangian formulation adopted here, which leads to the definition (7). In the following, we neglect such uncertainty for small to intermediate plume sizes and adopt a simple and straightforward description of local concentration by its expected value provided in Eq. (16).

The expected maximum concentration \(C_{\mathrm {max}}\) is calculated from (16) by setting \(\xi _i=0\) (\(\forall i=\) 1, 2 and 3), i.e., evaluated at the (random) centroid of the plume where the peak of local concentration is expected in average terms. This leads to the final, simple expression for the maximum concentration

$$\begin{aligned} C_{\mathrm {max}} (t) = C_o \prod _{i=1}^3 \mathrm {erf} \left[ \frac{\ell _i/2}{\sqrt{2W_{ii}(t)}}\right] . \end{aligned}$$
(17)
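Equations (9) and (17) can be combined in a few lines of code. The sketch below is illustrative only: it assumes the trajectory covariances \(X_{ii}\) and \(Z_{ii}\) are supplied by the user (their evaluation via Eqs. (10) and (11) is not reproduced here), and all function names are hypothetical.

```python
import math


def relative_displacement_variance(x_ii, z_ii, d_d, t):
    """Eq. (9): W_ii = X_ii + 2 D_d t - Z_ii, with X_ii and Z_ii given."""
    return x_ii + 2.0 * d_d * t - z_ii


def c_max(ell, w_diag, c0=1.0):
    """Eq. (17): maximum concentration at the plume centroid.

    ell    : (ell_1, ell_2, ell_3) initial source dimensions
    w_diag : (W_11, W_22, W_33) at the time of interest, strictly positive
    c0     : source concentration C_o
    """
    result = c0
    for ell_i, w_ii in zip(ell, w_diag):
        result *= math.erf(0.5 * ell_i / math.sqrt(2.0 * w_ii))
    return result
```

Because the error function increases monotonically with its argument, any growth in \(W_{ii}\) lowers \(C_{\mathrm {max}}\), while a larger source dimension \(\ell _i\) raises it.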

Equation (17) provides the maximum contaminant concentration in a spatially heterogeneous porous medium as a function of time. The temporal behavior of \(C_{\mathrm {max}}\) is ruled by the plume size \(\ell _i\) and by the parameters appearing in \(W_{ii}\), i.e., the mean velocity U, the logconductivity variance \(\sigma _Y^2\), which expresses the degree of aquifer heterogeneity, the directional correlation lengths of hydraulic conductivity \(I_Y\) and \(I_{Y,v}\), and the Péclet number, defined as \(\mathrm {Pe} \equiv U I_Y/ D_d\). According to the modeling framework adopted here, those are the fundamental quantities that determine the temporal evolution of \(C_{\mathrm {max}}\); their impact on the solution is explored and discussed in what follows.
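Two limiting forms of Eq. (17) may help in reading the role of these parameters; they are offered here as a short worked note rather than as part of the original derivation. First, for a homogeneous medium (\(\sigma _Y^2 = 0\)) the velocity covariance in Eq. (12) vanishes, so that \(X_{ii} = Z_{ii} = 0\) and Eq. (9) reduces to \(W_{ii} = 2 D_{\mathrm {d}} t\). Second, when the plume has spread well beyond its initial size (\(\ell _i^2 \ll W_{ii}\)), the small-argument expansion \(\mathrm {erf}(x) \approx 2x/\sqrt{\pi }\) applies:

$$\begin{aligned} \sigma _Y^2 = 0:\quad C_{\mathrm {max}}(t)&= C_o \prod _{i=1}^3 \mathrm {erf} \left[ \frac{\ell _i}{4\sqrt{D_{\mathrm {d}} t}}\right] , \\ \ell _i^2 \ll W_{ii}:\quad C_{\mathrm {max}}(t)&\approx C_o \prod _{i=1}^3 \frac{\ell _i}{\sqrt{2 \pi W_{ii}(t)}} = \frac{C_o \mathcal {V}_o}{(2\pi )^{3/2} \sqrt{W_{11}(t) W_{22}(t) W_{33}(t)}}. \end{aligned}$$

The second expression is the peak of a trivariate Gaussian plume with variances \(W_{ii}\), scaled by \(C_o \mathcal {V}_o\): at late times the maximum concentration depends on the source only through its volume.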

We remark that the present formulation allows estimating the maximum concentration of a given contaminant regardless of the particular location where it occurs; such location is not explicitly modeled here and it is typically subject to significant uncertainty.

4 Results and Discussion

We discuss here a few features of the maximum concentration and the principal factors and parameters influencing it. The maximum concentration is calculated by expression (17), with moments \(W_{ii}\) given by Eq. (9). In the following, all computational results are reported in dimensionless form. The maximum concentration \(C_{\mathrm {max}}\) is normalized by the inlet concentration in the source zone \(C_o\), and time is normalized by the advective time scale \(\tau _{\mathrm {adv}} = I_Y / U\). The Péclet number is defined as \(\mathrm {Pe} \equiv U I_Y/ D_d\). Unless otherwise specified, the source zone is a cube of side \(\ell = 0.1 I_Y\).
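For readers who wish to map site values onto the dimensionless quantities used here, the following helpers are a minimal sketch; the function names are hypothetical and any consistent set of units may be used.

```python
def dimensionless_time(t, u, i_y):
    """tau = t / tau_adv, with the advective time scale tau_adv = I_Y / U."""
    return t * u / i_y


def peclet_number(u, i_y, d_d):
    """Pe = U I_Y / D_d."""
    return u * i_y / d_d


def local_dispersion_from_peclet(u, i_y, pe):
    """Invert the Peclet definition: D_d = U I_Y / Pe."""
    return u * i_y / pe
```

For example, a mean velocity of 0.5 length units per day and \(I_Y = 2\) length units give \(\tau = 2.5\) after 10 days.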

Fig. 1

Temporal evolution of the maximum concentration for \(\sigma _Y^2 =1\) and \(\ell = 0.1 I_Y\). Results reported for different values of \(\mathrm {Pe}\) and f where \(\mathrm {Pe} \equiv U I_Y/ D_d\) and \(f \equiv I_{Y,v}/I_{Y}\)

Figure 1 depicts the temporal evolution of the maximum concentration for different values of \(\mathrm {Pe}\) and statistical anisotropy ratio \(f \equiv I_{Y,v}/I_{Y}\); values \(f<1\) indicate the presence of some preferential layering of the hydraulic conductivity in the horizontal plane. The effects of f on the maximum concentration are more significant for higher \(\mathrm {Pe}\) (see Fig. 1, continuous vs. dashed lines). For anisotropic geological formations, i.e., \(f = 0.1\), the heterogeneous structure of the porous formation becomes more effective in enhancing the dilution of the plume for \(\mathrm {Pe} = 10^3\). Similar results are reported in the literature in the context of concentration uncertainty analysis (see details in de Barros and Fiori 2014). The reason for such behavior is that local dilution and transfer of solutes between neighboring layers is facilitated by decreasing values of \(I_{Y,v}\), particularly when local diffusion becomes a limiting factor, i.e., for relatively high values of Péclet. This is well represented in Fig. 1. As expected, lower values of concentration are observed for \(\mathrm {Pe} = 10^2\), i.e., for increasing values of the local dispersion coefficient \(D_d\). Under this condition (low \(\mathrm {Pe}\)), the solute plume dilutes more quickly and the effects of f become negligible, as described above. The reduction of the maximum concentration with decreasing Péclet is quite significant, in terms of orders of magnitude, confirming the fundamental role played by the interplay between local-scale dispersion/diffusion and large-scale advection in ruling the temporal behavior of \(C_{\mathrm {max}}\).

Fig. 2

Time evolution of the maximum concentration for (a) \(\mathrm {Pe} = 10^2\) and (b) \(\mathrm {Pe} = 10^3\) for a few values of \(\sigma _Y^2\). Computational results evaluated for \(\ell = 0.1 I_Y\) and \(f =1\)

Fig. 3

Maximum concentration as a function of the logconductivity variance \(\sigma _Y^2\) for \(f =1\) and (a) \(\mathrm {Pe} = 10^2\) and (b) \(\mathrm {Pe} = 10^3\). Results computed for early (\(\tau = 2.5\)), intermediate (\(\tau = 10\)) and late (\(\tau = 50\)) times where \(\tau = t/ \tau _{\mathrm {adv}}\)

Fig. 4

Relative difference (see Eq. 18) between the maximum concentration computed for \(\sigma _Y^2=\)0.25 and 1. Results for \(\mathrm {Pe} = 10^2\) (continuous red curve) and \(\mathrm {Pe} = 10^3\) (dashed blue curve)

The degree of aquifer heterogeneity is also an important factor as it rules large-scale advection and the related dispersion (generally denoted as macrodispersion), and its combination with Péclet is indeed among the major mechanisms for dilution and the decrease of concentration with time. We evaluate the temporal evolution of the maximum concentration for three distinct degrees of heterogeneity, epitomized by \(\sigma _Y^2\). The anisotropy is set as \(f=1\), i.e., we consider a statistically isotropic aquifer, as its role was previously discussed. Results are depicted for \(\sigma _Y^2\) \(=\) [0.25, 0.5, 1] and for \(\mathrm {Pe} = 10^2\) (Fig. 2a) and \(\mathrm {Pe} = 10^3\) (Fig. 2b). Figure 2 shows that the maximum concentration is sensitive to the level of heterogeneity in the hydraulic conductivity field. Larger values of logconductivity variance lead to a reduction in the maximum concentration since dilution is enhanced for higher levels of heterogeneity, see e.g., (Le Borgne et al. 2013; de Barros et al. 2015; Valocchi et al. 2019). In fact, a higher level of heterogeneity increases large-scale advection, which in turn increases the interfacial area between the solute plume and the surrounding fluid thus facilitating local dilution and the decrease of concentration.

Fig. 5

Three-dimensional plots of the normalized maximum concentration as a function of dimensionless time (\(t/ \tau _{\mathrm {adv}}\)) and source size (\(\ell /I_Y\)) for (a) \(\mathrm {Pe} = 10^2\) and (b) \(\mathrm {Pe} = 10^3\). Results obtained for \(f = 1\)

Fig. 6

Source zone dimension as a function of the maximum concentration at three different dimensionless times

To better assess the impact of heterogeneity, we compute the maximum concentration as a function of \(\sigma _Y^2\) for three dimensionless times, namely \(\tau =\) 2.5, 10 and 50 where \(\tau = t/ \tau _{\mathrm {adv}}\). The values of \(\tau \) were selected to represent early, intermediate and late times. Bellin et al. (1992) showed that the first-order theory holds up to values of \(\sigma _Y^2 \lesssim 1.6\) as far as spreading is concerned; for this reason, Fig. 3 reports the maximum concentration for values of \(\sigma _Y^2\) ranging from 0.05 to 1.4. Figure 3a, b illustrates the results for \(\mathrm {Pe} = 10^2\) and \(10^3\), respectively, for \(f=1\). Both panels display similar decay behavior, with the magnitude of the maximum concentration being the only notable difference. Figure 3 confirms the important role played by \(\sigma _Y^2\), i.e., heterogeneity, in enhancing the decrease of \(C_{\mathrm {max}}\) with time. Heterogeneity is more effective in reducing \(C_{\mathrm {max}}\) at later times, which is expected as macrodispersion grows with time, and with it the effectiveness of local dispersion in diluting the contaminant.

Figure 4 displays the relative difference between the maximum concentrations obtained for two levels of heterogeneity, i.e., \(\sigma _Y^2 =\) 0.25 and 1, and for \(\mathrm {Pe} = 10^2\) and \(10^3\). The relative difference is computed according to:

$$\begin{aligned} \epsilon (t) = 100 \times \left| \frac{C_{\mathrm {max}}(t;\sigma _Y^2 = 0.25) - C_{\mathrm {max}}(t;\sigma _Y^2 = 1) }{ C_{\mathrm {max}}(t;\sigma _Y^2 = 0.25)} \right| \end{aligned}$$
(18)
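Equation (18) translates directly into code. The helper below is a minimal sketch with a hypothetical name; it takes the two maximum concentrations at a given time as inputs and uses the \(\sigma _Y^2 = 0.25\) value as the reference.

```python
def relative_difference_percent(c_max_ref, c_max_other):
    """Eq. (18): percentage difference relative to the reference value.

    c_max_ref   : C_max for the reference case (sigma_Y^2 = 0.25 in the text)
    c_max_other : C_max for the comparison case (sigma_Y^2 = 1 in the text)
    """
    if c_max_ref == 0.0:
        raise ValueError("reference maximum concentration must be non-zero")
    return 100.0 * abs((c_max_ref - c_max_other) / c_max_ref)
```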

The results depicted in Fig. 4 illustrate the temporal evolution of \(\epsilon \). Figure 4 reveals that \(\epsilon \) ranges approximately from 45% to 65%. In agreement with previous results (see Figs. 2 and 3), Fig. 4 shows that the impact of heterogeneity on the peak concentration is more pronounced for \(\mathrm {Pe} = 10^3\) when compared to \(\mathrm {Pe} = 10^2\).

The solute source dimension is also an important component guiding the decay of \(C_{\mathrm {max}}\), especially at the early stages of transport. In fact, large-scale advection and the related macrodispersion require some time to disperse the initial plume and catalyze the dilution processes that occur at the smaller scales. We examine the role of the solute source dimension \(\ell \) for both \(\mathrm {Pe} = 10^2\) and \(10^3\) (see Fig. 5a, b). We only report results within the range \(0 < \ell /I_Y \) \(\lesssim \) 1. It is interesting to observe that the memory effects of the source zone on the maximum concentration are more persistent when transport is advection-dominated (see Fig. 5). In any case, a larger initial plume leads to a more persistent maximum concentration, closer to the initial one \(C_o\), because of the aforementioned mechanism: it takes more time, as a function of Péclet and \(\sigma _Y^2\), to start diluting the center of the plume. This feature is particularly evident in the early stages of transport and its effects tend to disappear with time, as observed in Fig. 5. Thus, the results depicted in Fig. 5 highlight the relative importance of the source zone dimension on the maximum concentration.

The results in Fig. 5 can be recast within the context of engineering design and risk analysis (see Fig. 6). As opposed to accidental spills and other contamination events which cannot be controlled, there are cases where many features that characterize contamination sources are engineered, e.g., wastewater discharge into the ground, landfills and drainage ponds associated with mining activities. Figure 6 illustrates how the analytical framework is application-oriented and could be adopted to design a waste disposal facility based on a critical maximum concentration. Figure 6 shows that the dimensions of the source zone \(\ell \) can be estimated such that the maximum concentration of a plume is in compliance with a regulatory maximum allowed value at a given time.
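The design reading of Eq. (17) can be sketched as a one-dimensional root search. The example below assumes a cubic source, treats the variances \(W_{ii}\) at the compliance time as given inputs that do not depend on \(\ell \) (consistent with the small-source approximation behind Eq. (9)), and uses plain bisection; the function names and the choice of bisection are illustrative assumptions, not the procedure used to produce Fig. 6.

```python
import math


def normalized_c_max(ell, w_diag):
    """C_max / C_o from Eq. (17) for a cubic source of side ell."""
    value = 1.0
    for w_ii in w_diag:
        value *= math.erf(0.5 * ell / math.sqrt(2.0 * w_ii))
    return value


def largest_compliant_source_size(target_ratio, w_diag, ell_upper, tol=1e-10):
    """Largest cube side in [0, ell_upper] with C_max / C_o <= target_ratio.

    target_ratio : regulatory limit divided by C_o, strictly between 0 and 1
    w_diag       : (W_11, W_22, W_33) at the compliance time (assumed given)
    ell_upper    : largest source size considered (e.g., about I_Y)
    """
    if not 0.0 < target_ratio < 1.0:
        raise ValueError("target_ratio must lie strictly between 0 and 1")
    if normalized_c_max(ell_upper, w_diag) <= target_ratio:
        return ell_upper  # every size in the range already complies
    lo, hi = 0.0, ell_upper
    # C_max / C_o increases monotonically with ell, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if normalized_c_max(mid, w_diag) <= target_ratio:
            lo = mid
        else:
            hi = mid
    return lo
```

Since the formulation is intended for \(\ell /I_Y \lesssim 1\), a returned size outside that range should be interpreted with caution.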

Fig. 7

Comparison of the proposed model (Eq. 17) with the maximum concentration data collected at the MADE site reported in Adams and Gelhar (1992). The field data from Adams and Gelhar (1992) are represented by red circles. The cyan colored curves represent the model predictions with uncertain source zone dimensions

5 Application to the MADE-1 experiment

The first experiment conducted at the Columbus Air Force Base (MADE-1) represents a benchmark for analyzing groundwater transport; over the years it has motivated a large body of research work. Contributions range from the development of innovative measuring techniques to the development of novel theoretical frameworks. For such reasons, after more than 30 years, the MADE site is still providing insights and topics of discussion in the scientific community, as witnessed for instance by the 2015 AGU Chapman Conference held in Valencia (Spain) (Gómez-Hernández et al. 2017). The experiment took place in a highly heterogeneous sedimentary aquifer at Columbus, Mississippi (USA). The site was geostatistically characterized by the intense Direct Push campaign carried out by Bohling et al. (2016). The MADE-1 test consisted of the injection of a tracer in a relatively small area of the domain. The plume moved from its initial location along the natural hydraulic gradient and it was continuously monitored for a period of approximately two years by a dense network of multilevel samplers. Local concentrations measured by the samplers were collected in eight snapshots at times \(t=\) 9, 49, 126, 202, 279, 370, 503 and 594 days since injection (e.g., Adams and Gelhar 1992; Rehfeldt et al. 1992).

Adams and Gelhar (1992) reported the overall maximum concentrations measured by the multilevel samplers in the eight snapshots. The data indicate a decay of the maximum concentration from its initial value of 2500 mg/l (ppm) at \(t=0\) days down to \(C_{\mathrm {max}}=99\) mg/l at \(t=503\) days (the maximum concentration at \(t=594\) days is not reliable because the analysis was incomplete for that snapshot; see Adams and Gelhar 1992 for details). We emphasize that in strongly heterogeneous aquifers such as the MADE site (which has a logconductivity variance \(\sigma _Y^2=5.9\)), accurate monitoring and representation of the concentration field are challenging tasks. Despite the high density of multilevel samplers employed at the MADE site, a detailed image of the solute plume is not available. In fact, the complex flow field, characterized by strong preferential flows and disconnected quasi-stagnant zones, results in significant dispersion, which complicates monitoring campaigns and therefore introduces uncertainty. This uncertainty is also manifested by the incomplete mass recovery during the MADE-1 experiment (Adams and Gelhar 1992); the matter is discussed in depth in Fiori (2014). For these reasons, we expect that the maximum concentrations measured during the MADE-1 experiment might be somewhat underestimated. Such underestimation may not be severe, however, given the exceptionally high density of measurements employed during the test, which is much higher than is usual in standard hydrogeological practice.

Testing the performance of the model employed in this work, see Eq. (17), against the MADE-1 experiment is particularly challenging because our analytical formulation for \(C_{\mathrm {max}}\) is formally valid for weak heterogeneity (i.e., \(\sigma _Y^2 \lesssim 1\)), while the MADE site is highly heterogeneous. Still, previous studies have shown that perturbation approaches may provide reasonable results in terms of overall dispersion for aquifers characterized by moderate heterogeneity, see Bellin et al. (1994), as previously discussed in Sect. 4. Furthermore, recent work by Fiori et al. (2017) has shown that the first-order solution is a reasonably good predictor of the longitudinal mass distribution observed in the MADE-1 experiment. Therefore, the results reported in the literature (Bellin et al. 1992; Fiori et al. 2017) provide confidence in the application of first-order-based analytical solutions. We point out, however, that the mass distribution is an aggregated quantity and is presumably more robust than the local concentration, which in turn depends heavily on the interplay between large- and local-scale features, see the discussion in Sect. 4. Hence, the validity of first-order solutions for the point concentration needs to be further explored, and this is done in the following for the maximum concentration.

In order to apply our model to the MADE site, we make use of specific data that have been derived in recent years from the detailed characterization carried out at the site. A summary of the data needed for our model is provided in Table 1 of Fiori et al. (2019); the principal quantities were taken from Boggs et al. (1992) and Bohling et al. (2016). In particular, the quantities of interest for the model application are the estimated mean velocity \(U =\) 0.026 m/d, the logconductivity variance \(\sigma _Y^2=5.9\), the porosity \(\phi = 0.31\), and the directional logconductivity integral scales \(I_Y =\) 9.1 m and \(I_{Y,v} = 1.8\) m (i.e., the anisotropy ratio is \(f = 0.197\)). The local dispersivity \(D_d/U\) is not known, so a few assumptions must be made. First, the dominant component of local dispersion is the vertical one because it is more effective in enhancing dilution by transferring solute among layers in statistically anisotropic formations (Fiori and Dagan 2000). Second, the vertical dispersivity can be assumed to be of a similar order of magnitude to that inferred in other well monitored and characterized aquifers, such as Borden and Cape Cod, for which a vertical dispersivity of around 1 mm was estimated, while the transverse local dispersivity was ten times larger. Considering that the MADE site is much more heterogeneous than the aforementioned sites, we can safely assume a local dispersivity of the order of \(10^{-2}\) m. Therefore, as an approximation, we used Pe = \(10^3\) and \(f =\) 0.197 to generate the particle trajectory covariances \(X_{ii}\) and \(Z_{ii}\), see Eqs. (10) and (11).
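As a quick check of the adopted parameters, the arithmetic can be written out explicitly. The sketch below assumes that the Péclet number is defined with the horizontal integral scale, \(\mathrm {Pe} = U I_Y / D_d\); this definition is not restated in this section and is taken here as an assumption.

\[
f = \frac{I_{Y,v}}{I_Y} = \frac{1.8\ \mathrm{m}}{9.1\ \mathrm{m}} \approx 0.198, \qquad
\mathrm {Pe} = \frac{U I_Y}{D_d} = \frac{I_Y}{D_d/U} \approx \frac{9.1\ \mathrm{m}}{10^{-2}\ \mathrm{m}} \approx 9 \times 10^{2} \sim 10^{3}.
\]

Under that assumption, the ratio of the two integral scales agrees with the reported anisotropy ratio to within rounding, and a local dispersivity of the order of \(10^{-2}\) m corresponds to a Péclet number of the order of \(10^3\).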

The remaining parameters pertain to the initial plume size (i.e., the injection zone). The tracer (bromide, with an initial concentration of 2500 mg/l) was injected in five wells spaced 1 m apart in a linear array, each with a screen of length 0.6 m. The solution, of volume \(\mathcal {V}_o=10.07\) m\(^3\), was injected at a uniform rate over a period of 48.5 hours. Thus, the source zone dimensions required by our model depend on the water flow during the injection and are subject to uncertainty. Uncertainty estimates for \(\ell _j\) (with \(j=\) 1, 2 and 3) are based on (i) the analysis of the plume development at the early stages (see, e.g., Boggs et al. 1992; Adams and Gelhar 1992), and (ii) the source injection data collected from multiple references (Julian et al. 2001; Dogan et al. 2014; Fiori et al. 2019). For the sake of illustration, we assume that both the longitudinal and transverse dimensions, \(\ell _1\) and \(\ell _2\), are uniformly distributed. The uniform distribution is justified by the fact that we were only able to identify lower and upper bounds for \(\ell _j\). In the present contribution, we assume that the longitudinal and transverse dimensions of the source zone follow \(\mathcal {U}[0.25, 0.75] \) and \(\mathcal {U}[4, 14.5]\) distributions, respectively (both \(\ell _1\) and \(\ell _2\) are in meters). To obtain an estimate for \(\ell _3\), we make use of the injected volume \(\mathcal {V}_o\) and the random variables \(\ell _1\) and \(\ell _2\), i.e., \(\ell _3 = \mathcal {V}_o/ (\phi \ell _1 \ell _2)\). Based on the data provided in the literature (Dogan et al. 2014; Barlebo et al. 2004), we only consider \(\ell _3\) values that satisfy the constraint 0.6 m \(\le \, \ell _3 \, \le \) 8 m. We perform a Monte Carlo simulation by generating 500 realizations of \(\ell _j\) in order to compute the statistics of \(C_{\mathrm {max}}\), see Eq. (17).
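The sampling of the source dimensions described above can be sketched as follows. This is a minimal illustration and not the code used for the results in this work: it assumes that realizations violating the constraint on \(\ell _3\) are discarded and redrawn (rejection sampling), which the text does not state explicitly, and the function name `sample_source_dimensions` and the fixed random seed are hypothetical choices made for this example.

```python
import random

# Values taken from the text
V0 = 10.07                 # injected volume [m^3]
PHI = 0.31                 # porosity [-]
L1_BOUNDS = (0.25, 0.75)   # longitudinal source dimension bounds [m]
L2_BOUNDS = (4.0, 14.5)    # transverse source dimension bounds [m]
L3_BOUNDS = (0.6, 8.0)     # admissible range of the vertical dimension [m]


def sample_source_dimensions(n, seed=0):
    """Draw n admissible (l1, l2, l3) triplets.

    Assumption: triplets whose l3 falls outside L3_BOUNDS are
    rejected and redrawn until n admissible triplets are found.
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        l1 = rng.uniform(*L1_BOUNDS)
        l2 = rng.uniform(*L2_BOUNDS)
        # Vertical dimension from the injected volume: l3 = V0 / (phi * l1 * l2)
        l3 = V0 / (PHI * l1 * l2)
        if L3_BOUNDS[0] <= l3 <= L3_BOUNDS[1]:
            samples.append((l1, l2, l3))
    return samples


# 500 realizations, as in the text
realizations = sample_source_dimensions(500)
```

Each triplet would then be passed to an evaluation of Eq. (17), which is not reproduced here, to build the ensemble of \(C_{\mathrm {max}}\) curves.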

Figure 7 shows the comparison between the experimental maximum concentration measured in the MADE-1 experiment (red circles) and the same quantity predicted by the model discussed here, along with the Monte Carlo procedure described above for the source dimensions (solid cyan lines). The results in Fig. 7 show that the theoretical model captures quite accurately the temporal dynamics of \(C_{\mathrm {max}}\) at MADE, all the uncertainties notwithstanding. The favorable behavior is particularly surprising given the complex nature of the MADE aquifer and the associated transport phenomena (which are still a matter of debate), and the limitations of the theoretical model, which is formally valid for weak to moderate heterogeneity. A similarly positive behavior of first-order-based models was found in Fiori et al. (2017) for the longitudinal mass distribution at MADE. In that case, the good performance of such models was attributed to the robustness of the quantity under examination, the longitudinal mass distribution, which is a spatially aggregated measure of contamination. In the present case, the reason for the good agreement displayed in Fig. 7 probably lies in the particular transformation operated by the Lagrangian concentration approach, as discussed in Sect. 3. Due to the change of the coordinate system, the method filters out the spatial variability of advective particles, i.e., of the trajectory \(\mathbf {P}\), thus focusing on the relative, local-scale dispersion around those particles. Since the main source of uncertainty is related to \(\mathbf {P}\) (which is filtered out by the transformation of the coordinate system), the approximations on the statistical distribution of the relative dispersion around \(\mathbf {P}\) are probably less severe than those commonly adopted in methods that do not perform such filtering. For instance, the usual assumption of a Gaussian distribution for the longitudinal trajectories \(X_1\) adopted in first-order theories does not hold for MADE, for which the observed distribution of the trajectory \(X_1\) (which corresponds by definition to the longitudinal mass distribution) was far from Gaussian. A discussion of the distribution of longitudinal trajectories is found in Fiori et al. (2017). However, the model investigated here is not based on the total trajectory \(\mathbf {X}_T\) but on the relative displacement \(\mathbf {W}=\mathbf {X}_T-\mathbf {P}\), for which the Gaussian assumption is likely less stringent.

6 Summary

In this study, we have employed the Lagrangian concentration framework to investigate the behavior of the maximum concentration in natural aquifers. The maximum concentration of a given substance underlies most environmental regulatory practices, in which maximum tolerable concentration levels are typically prescribed for different contaminants. Through the use of the Lagrangian concentration framework originally developed by Fiori (2001), we obtained a semi-analytical expression for the maximum concentration. The solution is limited to uniform-in-the-mean flow conditions, small injection zones and low-to-moderate levels of heterogeneity. The main scope is to identify the factors that govern the spatiotemporal behavior of the maximum concentration and, through a sensitivity analysis, to clarify the role and physical meaning of the main parameters. Hence, the model helps in identifying the major components that determine the temporal evolution of the maximum concentration, which is important in order to better allocate resources toward site characterization. As demonstrated in the literature (Oladyshkin et al. 2012; Fiori 2001), the analytical solutions originating from the stochastic Lagrangian framework could be combined with existing global sensitivity analysis methods to systematically identify the relative role of each parameter in the maximum concentration.

We showed how key geostatistical parameters (e.g., the logconductivity variance), typically inferred from site characterization campaigns, and local-scale dispersion mechanisms control the decay rate of the maximum concentration of a solute body. In addition, we illustrated how engineered variables, such as the source zone’s dimensions, affect the maximum concentration. Finally, we successfully tested the performance of the semi-analytical model against the MADE site maximum concentration data reported in Adams and Gelhar (1992). Although our results are strictly valid for low-to-moderate levels of heterogeneity, the Lagrangian-based maximum concentration model performs well at MADE, probably because it filters out the spatial variability of advective particles.

In summary, we have discussed an application-oriented theoretical framework aimed at estimating the maximum concentration in natural aquifers. The framework can provide guidance in applications; it also provides a useful tool for preliminary screening analysis and for testing the impact of different scenarios on risk predictions (such as the occurrence of an undesired event, i.e., a concentration exceeding a regulatory limit).