Joint Clarification of Contaminant Plume and Hydraulic Transmissivity via a Geostatistical Approach Using Hydraulic Head and Contaminant Concentration Data

Takai, Shizuka; Shimada, Taro; Takeda, Seiji; Koike, Katsuaki

doi:10.1007/s11004-023-10084-8

Published: 31 July 2023

Volume 56, pages 333–360 (2024)

Mathematical Geosciences

Abstract

To enable proper remediation of accidental groundwater contamination, the contaminant plume evolution needs to be accurately estimated. In the estimation, uncertainties in both the contaminant source and hydrogeological structure should be considered, especially the temporal release history and hydraulic transmissivity. Although the release history can be estimated using geostatistical approaches, previous studies have used a deterministic hydraulic property field. Geostatistical approaches can also effectively estimate an unknown heterogeneous transmissivity field via the use of joint data, such as a combination of hydraulic head and tracer data. However, tracer tests implemented over a contaminated area necessarily disturb the in situ condition of the contamination. Conversely, measurements of the transient concentration data over an area are possible and can preserve the conditions. Accordingly, this study develops a geostatistical method for the joint clarification of contaminant plume and transmissivity distributions using both head and contaminant concentration data. The applicability and effectiveness of the proposed method are demonstrated through two numerical experiments assuming a two-dimensional heterogeneous confined aquifer. The use of contaminant concentration data is key to accurate estimation of the transmissivity. The accuracy of the proposed method using both head and concentration data was verified by a high linear correlation coefficient of 0.97 between the true and estimated concentrations in both experiments, which exceeded the coefficient obtained using only the head data by 0.67 or more. Furthermore, the uncertainty of the contaminant plume evolution was successfully evaluated by considering the uncertainties of both the initial plume and the transmissivity distributions, based on their conditional realizations.

1 Introduction

Groundwater contamination is a serious environmental issue that has been occurring in various areas worldwide for many years. Contamination is caused by various contaminants such as anthropogenic chemicals and radioactive and microbiological substances. Contamination by soluble and mobile contaminants tends to spread inconspicuously into extensive areas because its extension is not noticed unless the water quality is systematically monitored. To suppress the extension of contamination and form a proper remediation plan via pumping and chemical processing, correct estimations of the contaminant plume evolution and its uncertainty given the water quality data at wells are essential. To achieve this, the construction of an accurate hydrogeological model to simulate the groundwater flow and contaminant transport is indispensable.

A contaminant source (i.e., the source location or release history) and hydraulic properties are two predominant uncertain factors in the estimation of the contaminant plume distribution. Uncertainty in the contaminant source arises in accidental contamination events, because the source is not identified or recorded except in laboratory experiments. In real-world events, release history records are particularly rare, as in past cases of underground contamination at nuclear facilities (OECD 2014); although source locations have been detected by preliminary surveys or historical site assessments in drain lines, sumps, pipes, and tanks, the temporal release histories have not been traced in all cases. Among the relevant hydraulic properties, the permeability expressed by the hydraulic conductivity or transmissivity is the most essential parameter for simulating groundwater flow and contaminant transport. Despite the significance and heterogeneity of the spatial distribution, the amount and location of the measured permeability data at wells are usually limited by practical constraints such as time and cost. Therefore, there are large uncertainties in permeability distributions estimated by simply interpolating and extrapolating the measured permeability or indirectly using hydraulic head or other data. The correctness and uncertainty of the estimated permeability distribution necessarily control the estimation accuracy of the contaminant plume evolution.

The joint identification of the contaminant release history and hydraulic properties has been studied using several approaches. Examples of hydraulic property identification approaches include non-linear maximum likelihood estimation (Wagner 1992) and a trained artificial neural network (Singh and Datta 2004) for homogeneous fields, and the restart normal-score ensemble Kalman filter (EnKF) (Sanchez-Leon et al. 2016; Chen et al. 2018; Xu and Gomez-Hernandez 2018) and the ensemble smoother (ES) with multiple data assimilation (Xu et al. 2021) for heterogeneous fields. The high accuracy of EnKF methods has been confirmed in a sandbox experiment (Chen et al. 2021). However, there are two problems with the above studies:

1. The release patterns of contaminants are assumed to be known. Typically, the contaminant source parameters, such as the source location, initial release time, release duration, and mass-loading rate, are determined by assuming constant release (e.g., Xu et al. 2021). However, the release pattern in actual contamination events is usually unknown and uncertain. Therefore, a random function should be applied to the release pattern (Snodgrass and Kitanidis 1997).
2. Although EnKF methods have the advantage of enabling production of non-Gaussian distributions without considering the spatial correlation of the hydraulic conductivity, they require an impractically large amount of spatiotemporal measurement data of the head and concentration for usual contaminant cases.

To overcome the first issue, the quasi-linear geostatistical approach (GA; e.g., Snodgrass and Kitanidis 1997; Gyzl et al. 2004; Shlomi and Michalak 2007) is applicable by introducing the prior information of release history with geostatistical trend and covariance. The applicability of GA has been verified at real-world sites contaminated by water-soluble contaminants: 1,4-dioxane (Woodbury et al. 1998; Michalak and Kitanidis 2002), tetrachloroethene and trichloroethene (Michalak and Kitanidis 2003), and hexachlorocyclohexane (Gyzl et al. 2014). However, most of these studies only incorporated the uncertainty of the contaminant source using a deterministic hydraulic property model.

GA can also solve the second issue if extended to hydraulic tomography studies (e.g., Li et al. 2007, 2008; Cardiff et al. 2009; Cardiff and Barrash 2011; Pouladi et al. 2021), and its applicability using head data has been verified by field tests (e.g., Illuman et al. 2009; Wang et al. 2017; Zha et al. 2018; Luo et al. 2022). However, GA tends to generate a spatially smoother best estimate than the true distribution, which is its main drawback. This smoothing effect is caused by modeling the hydraulic conductivity as a multivariate Gaussian, which is usually inadequate for the estimation of heterogeneous fields such as aquifers in fluvial deposits, where several strata with highly different permeabilities coexist (Mo et al. 2020). However, the assumption of a Gaussian field is applicable to cases of groundwater contamination that occur in a single aquifer. The smoothing effect has been reduced via joint inversion of the head and temperature data (Jiang and Woodbury 2006) and the head and tracer data (e.g., Harvey and Gorelick 1995; Cirpka and Kitanidis 2000; Xu and Kitanidis 2014), as well as in combination with a convolutional neural network (Vu and Jardani 2022).

Although the tracer test data can indeed improve the performance and accuracy of GA, the implementation of many tests over a contaminated area necessarily disturbs the contamination situation, renders situation assessments difficult, and possibly further extends the contamination. In contrast to such impractical testing, measuring the transient concentration data in groundwater at wells over an area is possible and preserves the situation. Therefore, through the joint use of head and transient concentration data, the estimation accuracies of both the contaminant plume distribution and the hydraulic conductivity are expected to be effectively improved. To achieve this, an estimation of the unknown initial plume distribution is indispensable.

Given the above background, this study aims to accurately estimate the contaminant plume evolution by considering uncertainties in both the temporal release history and the heterogeneous transmissivity fields. Accordingly, the GA method is further developed for a joint clarification of the contaminant plume and transmissivity distributions using both the head and contaminant concentration data. The joint clarification is achieved by combining previous estimation methods for each component: a contaminant plume with an unknown release history is estimated using the method of Shlomi and Michalak (2007) and the hydraulic transmissivity is estimated using the method of Kitanidis and Lee (2014). This paper begins with a review of the previous estimation methods and then proposes a combined method. This method consists of the following three steps: separate initial estimations of the transmissivity and the initial plume distributions using the head and concentration data, respectively; an iterative update of their distributions via joint use of the data; and an estimation of the contaminant plume evolution and its uncertainties based on their conditional realizations. The proposed method is verified by two numerical experiments assuming groundwater contamination in a two-dimensional aquifer, and the results are then discussed.

2 Methods

2.1 Iterative Estimation of Contaminant Plume and Hydraulic Transmissivity

Previous geostatistical approaches for contaminant plume estimation with unknown release histories (e.g., Shlomi and Michalak 2007) cannot consider the uncertainty of the hydraulic transmissivity. To address this problem, this study developed a GA method to estimate the contaminant plume evolution z(x,t) (x: space and t: time) and its uncertainty by combining previous estimation methods for the contaminant plume and transmissivity, reviewed in Sects. 2.2 and 2.3, for the uncertainties of the release history s(t) and log-transmissivity r(x), respectively. Because both z and r are necessary for each estimation, an iterative approach using both the head and concentration data is proposed as shown in Fig. 1.

The first step is the initial estimation of r and the initial contaminant plume z₀ = z(t₀) (t₀: initial measurement time), using the head φ and initial concentration data z₀^*, separately. The next step is to update r based on the estimated z₀ using both φ and the transient concentration data z^*(t). The posterior pdfs of r and z₀ are iteratively calculated until the posterior pdf of r reaches its maximum. In this step, the mutual uncertainties of r and z₀ are not considered (i.e., the uncertainty of r is not considered in the z₀ estimation, and vice versa). Finally, the best estimate of z(t) is obtained using the best estimates of r and the corresponding z₀. To consider both the uncertainties of r and z₀, the estimation method for the uncertainty of z(t) based on N_r × N_z0 conditional realizations of r and z₀ (N: number of realizations) is developed as described in Sect. 2.4.
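The alternating update described above can be outlined as a short loop. This is only a structural sketch: `estimate_r` and `estimate_z0` are hypothetical callables standing in for the inversions of Sects. 2.3 and 2.2, and the stopping rule (no further decrease of the negative log posterior of r) is one plausible reading of "until the posterior pdf of r reaches its maximum".

```python
from typing import Callable, Tuple

import numpy as np


def joint_estimation(
    estimate_r: Callable[[np.ndarray], Tuple[np.ndarray, float]],
    estimate_z0: Callable[[np.ndarray], np.ndarray],
    r_init: np.ndarray,
    max_iter: int = 20,
    tol: float = 1e-6,
) -> Tuple[np.ndarray, np.ndarray, int]:
    """Alternate r and z0 updates until the objective for r stops decreasing.

    estimate_z0(r)  -> z0             (hypothetical: plume inversion for a fixed r)
    estimate_r(z0)  -> (r, objective) (hypothetical: joint head/concentration
                                       inversion for a fixed z0; objective is
                                       the negative log posterior of r)
    r_init is assumed to be the head-only estimate of the first step.
    """
    r = r_init
    z0 = estimate_z0(r)              # initial plume for the initial r
    best = np.inf
    for it in range(1, max_iter + 1):
        r_new, objective = estimate_r(z0)
        if best - objective < tol:   # no meaningful improvement: keep previous pair
            return r, z0, it - 1
        r, best = r_new, objective
        z0 = estimate_z0(r)          # plume consistent with the accepted r
    return r, z0, max_iter
```

The returned pair is always mutually consistent, in the sense that `z0` was computed from the returned `r`.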

2.2 Geostatistical Inversion for Initial Contaminant Plume Estimation

This section reviews preceding studies of the quasi-linear GA for the estimation of the contaminant plume distribution from a known source with an unknown release history (e.g., Kitanidis 1995; Snodgrass and Kitanidis 1997; Shlomi and Michalak 2007). Under a steady state flow, ${\varvec{z}}_{0}^{\user2{*}} \in {\mathbb{R}}^{{n_{z} \times 1}}$ (n_z-dimensional real space) is related linearly to the release history ${\varvec{s}} \in {\mathbb{R}}^{{m_{t} \times 1}}$ at each time $t_{j} \left( {j = 1, \ldots ,m_{t} } \right)$ such that

$$ \begin{array}{*{20}c} {{\varvec{z}}_{0}^{\user2{*}} = {\varvec{H}}_{s}^{*} {\varvec{s}} + {\varvec{v}}_{z} ,} \\ \end{array} $$

(1)

where ${\varvec{H}}_{s}^{*} \in {\mathbb{R}}^{{n_{z} \times m_{t} }}$ and ${\varvec{v}}_{z} \in {\mathbb{R}}^{{n_{z} \times 1}}$ stand for the Jacobian matrix and the model mismatch error at the measurement points, respectively. ${\varvec{H}}_{s}^{*}$ expresses the sensitivity of the concentrations at each measurement point and time and can be calculated in advance by a flow and transport simulation for the release of a unit concentration pulse. Therefore, the unknown s can be obtained by solving Eq. (1) inversely.
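As an illustration of how such a Jacobian can be assembled from unit-pulse responses, the sketch below fills one column per release time. It is not the simulator used in the study: `unit_pulse_response` is an assumed stand-in based on the one-dimensional advection-dispersion solution for an instantaneous unit release, and the velocity, dispersion coefficient, and well positions are hypothetical values.

```python
import numpy as np


def unit_pulse_response(x, elapsed, v=0.5, D=2.5):
    """Concentration at positions x, `elapsed` days after a unit pulse at x = 0.

    Assumed stand-in for a flow-and-transport run: 1-D advection-dispersion
    solution for an instantaneous unit release (v in m/d, D in m^2/d).
    """
    x = np.asarray(x, dtype=float)
    if elapsed <= 0.0:
        return np.zeros_like(x)
    spread = 4.0 * D * elapsed
    return np.exp(-((x - v * elapsed) ** 2) / spread) / np.sqrt(np.pi * spread)


def build_jacobian(x_obs, t_release, t_obs, dt):
    """Column j: response at x_obs and time t_obs to a unit release rate
    sustained over the interval starting at t_release[j] (length dt)."""
    H = np.empty((len(x_obs), len(t_release)))
    for j, tj in enumerate(t_release):
        H[:, j] = unit_pulse_response(x_obs, t_obs - tj) * dt
    return H


x_obs = np.array([10.0, 30.0, 60.0])        # hypothetical well positions [m]
t_release = np.arange(-300.0, 0.0, 10.0)    # release times before t_obs = 0 [d]
H = build_jacobian(x_obs, t_release, t_obs=0.0, dt=10.0)

s = np.linspace(0.0, 1.0, t_release.size)   # example: linearly increasing release
z0 = H @ s                                  # noise-free form of Eq. (1)
```

Because the transport problem is linear in the source strength, the same matrix can be reused for any candidate release history.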

The geostatistical inversion incorporates the temporal correlation of s and assumes that s and v_z are random vectors following the multivariate Gaussian distributions ${\varvec{s}}\sim N\user2{ }\left( {{\varvec{X}}_{s} {\varvec{\beta}}_{s} ,{\varvec{Q}}_{s} \left( {\theta_{s} } \right)} \right)$ and ${\varvec{v}}_{{\varvec{z}}} \sim N\left( {{\bf 0},{\varvec{R}}_{z} } \right)$, where ${\varvec{X}}_{s} \in {\mathbb{R}}^{{m_{t} \times p_{s} }}$ is a known matrix of basis functions; ${\varvec{\beta}}_{s} \in {\mathbb{R}}^{{p_{s} \times 1}}$ are p_s unknown drift coefficients; ${\varvec{Q}}_{s} \left( {\theta_{s} } \right) \in {\mathbb{R}}^{{m_{t} \times m_{t} }}$ is the generalized covariance matrix of s; $\theta_{s}$ is the structural parameter of Q_s; and R_z is the error covariance matrix of ${\varvec{z}}_{0}^{\user2{*}}$. This study assumes an uncorrelated error of ${\varvec{R}}_{z} = \sigma_{{R_{z} }}^{2} {\varvec{I}}$, where $\sigma_{{R_{z} }}^{2}$ is the variance of the error and ${\varvec{I}} \in {\mathbb{R}}^{{n_{z} \times n_{z} }}$ is the identity matrix. The unknown s can be estimated from ${\varvec{z}}_{0}^{\user2{*}}$ by maximizing the posterior pdf $p^{{{\prime \prime }}} \left( {{\varvec{s}},\user2{ \beta }_{s} } \right)$ obtained via Bayes’ rule as

$$ \begin{array}{*{20}c} { - {\text {ln}}p^{\prime\prime}\left( {{\varvec{s}},\user2{ \beta }_{s} } \right) = \frac{1}{2}\left( {{\varvec{z}}_{0}^{\user2{*}} - {\varvec{H}}_{s}^{*} {\varvec{s}}} \right)^{T} {\varvec{R}}_{z}^{ - 1} \left( {{\varvec{z}}_{0}^{\user2{*}} - {\varvec{H}}_{s}^{*} {\varvec{s}}} \right) + \frac{1}{2}\left( {{\varvec{s}} - {\varvec{X}}_{s} {\varvec{\beta}}_{s} } \right)^{T} {\varvec{Q}}_{s}^{ - 1} \left( {{\varvec{s}} - {\varvec{X}}_{s} {\varvec{\beta}}_{s} } \right).} \\ \end{array} $$

(2)

The structural parameters ${\varvec{\theta}} = \left( {\theta_{s} , \sigma_{{R_{z} }} } \right)^{T}$ can be iteratively estimated using a restricted maximum likelihood approach that minimizes the objective function $L\left( {\varvec{\theta}} \right)$ (Kitanidis 1995)

$$ \begin{array}{*{20}c} {L\left( {\varvec{\theta}} \right) = \frac{1}{2}\ln \left| {{\varvec{\varSigma}}_{s} } \right| + \frac{1}{2}\ln \left| {{\varvec{X}}_{s}^{T} {\varvec{H}}_{s}^{*T}{\varvec{\varSigma}}_{s}^{ - 1} {\varvec{H}}_{s}^{*} {\varvec{X}}_{s} } \right| + \frac{1}{2}{\varvec{z}}_{0}^{*T}{\varvec{\varXi}}_{s} {\varvec{z}}_{0}^{*} ,} \\ \end{array} $$

(3)

$$ \begin{array}{*{20}c} {{\varvec{\varSigma}}_{s} = {\varvec{H}}_{s}^{*} {\varvec{Q}}_{s} {\varvec{H}}_{s}^{*T} + {\varvec{R}}_{z} ,} \\ \end{array} $$

(4)

$$ \begin{array}{*{20}c} {{\varvec{\varXi}}_{s} ={\varvec{\varSigma}}_{s}^{ - 1} -{\varvec{\varSigma}}_{s}^{ - 1} {\varvec{H}}_{s}^{*} {\varvec{X}}_{s} \left( {{\varvec{X}}_{s}^{T} {\varvec{H}}_{s}^{*T}{\varvec{\varSigma}}_{s}^{ - 1} {\varvec{H}}_{s}^{*} {\varvec{X}}_{s} } \right)^{ - 1} {\varvec{X}}_{s}^{T} {\varvec{H}}_{s}^{*T}{\varvec{\varSigma}}_{s}^{ - 1} .} \\ \end{array} $$

(5)
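A minimal sketch of how this objective might be evaluated for one candidate parameter set is given below. It assumes the covariance matrix Q has already been built for the trial θ_s and evaluates the quadratic form of the data with the matrix on the right-hand side of Eq. (5) without forming that matrix explicitly. The function name and argument layout are illustrative only.

```python
import numpy as np


def reml_objective(z, H, X, Q, sigma_R):
    """Restricted-likelihood objective for one trial (theta_s, sigma_R).

    z : (n,) data, H : (n, m_t) Jacobian, X : (m_t, p) drift basis,
    Q : (m_t, m_t) prior covariance already evaluated at the trial theta_s.
    """
    n = z.size
    Sigma = H @ Q @ H.T + sigma_R**2 * np.eye(n)       # Eq. (4)
    Phi = H @ X                                        # H X
    Sinv_z = np.linalg.solve(Sigma, z)
    Sinv_Phi = np.linalg.solve(Sigma, Phi)
    A = Phi.T @ Sinv_Phi                               # X^T H^T Sigma^-1 H X
    b = Phi.T @ Sinv_z
    # Quadratic form of z with the matrix on the right-hand side of Eq. (5)
    quad = z @ Sinv_z - b @ np.linalg.solve(A, b)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_A = np.linalg.slogdet(A)
    return 0.5 * (logdet_Sigma + logdet_A + quad)
```

In practice the function would be called inside an optimizer or a simple grid scan over the structural parameters, keeping the pair that gives the smallest value.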

Then, the best estimate $\hat{\user2{s}}$ and its posterior covariance ${\varvec{V}}_{{\hat{\user2{s}}}}$ are derived by solving the following equation system

$$ \begin{array}{*{20}c} {\left( {\begin{array}{*{20}c} {{\varvec{\varSigma}}_{s} } & {{\varvec{H}}_{s}^{*} {\varvec{X}}_{s} } \\ {\left( {{\varvec{H}}_{s}^{*} {\varvec{X}}_{s} } \right)^{T} } & 0 \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {{\varvec{\varLambda}}_{s}^{T} } \\ {{\varvec{M}}_{s} } \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {{\varvec{H}}_{s}^{*} {\varvec{Q}}_{s} } \\ {{\varvec{X}}_{s}^{T} } \\ \end{array} } \right),} \\ \end{array} $$

(6)

$$ \begin{array}{*{20}c} {\hat{\user2{s}} ={\varvec{\varLambda}}_{s} {\varvec{z}}_{0}^{*} ,} \\ \end{array} $$

(7)

$$ \begin{array}{*{20}c} {{\varvec{V}}_{{\hat{s}}} = {\varvec{Q}}_{s} - {\varvec{Q}}_{s} {\varvec{H}}_{s}^{*T}{\varvec{\varLambda}}_{s}^{T} - {\varvec{X}}_{s} {\varvec{M}}_{s} ,} \\ \end{array} $$

(8)

where ${\varvec{\varLambda}}_{s} \in {\mathbb{R}}^{{m_{t} \times n_{z} }}$ and ${\varvec{M}}_{s} \in {\mathbb{R}}^{{p_{s} \times m_{t} }}$ are the weight matrix and the Lagrange multiplier, respectively. To enforce concentration non-negativity, a power transformation (Box and Cox 1964) is applied, such that

$$ \begin{array}{*{20}c} {\tilde{\user2{s}} = \alpha \left( {{\varvec{s}}^{1/\alpha } - 1} \right),} \\ \end{array} $$

(9)

where α is a positive number. Because Eq. (1) is not linear in the transformed space, $\tilde{\user2{s}}$ and θ are solved iteratively using the quasi-linear approach (Snodgrass and Kitanidis 1997) in which α is chosen to be as small as possible while ensuring that $\tilde{\user2{s}} > - \alpha$, so that the base of the power in the back-transformation remains positive. After obtaining the best estimate and its covariance, the solutions are back-transformed into the original space by

$$ \begin{array}{*{20}c} {{\varvec {s}} = \left( {\frac{{\tilde{\user2{s}} + \alpha }}{\alpha }} \right)^{\alpha } .} \\ \end{array} $$

(10)
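The transformation pair of Eqs. (9) and (10) can be checked with a few lines of code. The sketch below is illustrative; the guard in the back-transformation reflects that Eq. (10) is real-valued only while the quantity in parentheses is non-negative, and the value of α in the example is arbitrary.

```python
import numpy as np


def power_transform(s, alpha):
    """Eq. (9): map a non-negative release history s to s_tilde."""
    s = np.asarray(s, dtype=float)
    return alpha * (s ** (1.0 / alpha) - 1.0)


def inverse_power_transform(s_tilde, alpha):
    """Eq. (10): back-transform s_tilde to the original space."""
    s_tilde = np.asarray(s_tilde, dtype=float)
    if np.any(s_tilde < -alpha):
        raise ValueError("back-transformation needs s_tilde >= -alpha")
    return ((s_tilde + alpha) / alpha) ** alpha


s = np.array([0.0, 0.25, 1.0, 4.0])
s_tilde = power_transform(s, alpha=3.0)
s_back = inverse_power_transform(s_tilde, alpha=3.0)   # recovers s
```

Any s ≥ 0 maps to a value no smaller than −α, so estimates obtained in the transformed space return to non-negative values after back-transformation.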

Once $\hat{\user2{s}}$ and ${\varvec{V}}_{{\hat{s}}}$ are determined, the best estimate of $\widehat{{{\varvec{z}}_{0} }} \in {\mathbb{R}}^{m \times 1}$ and its posterior covariance ${\varvec{V}}_{{\hat{z}}}$ can be solved as

$$ \begin{array}{*{20}c} {\widehat{{{\varvec{z}}_{0} }} = {\varvec{H}}_{s} \hat{\user2{s}},} \\ \end{array} $$

(11)

$$ \begin{array}{*{20}c} {{\varvec{V}}_{{\hat{z}}} = {\varvec{H}}_{s} {\varvec{V}}_{{\hat{s}}} {\varvec{H}}_{s}^{T} ,} \\ \end{array} $$

(12)

where m is the number of estimation points and ${\varvec{H}}_{s} \in {\mathbb{R}}^{{m \times m_{t} }}$ is the Jacobian matrix at all estimation points.
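For the linear case (without the power transformation), Eqs. (4), (6) to (8), (11), and (12) reduce to a single dense solve. The following sketch shows one way to arrange it; the function names are illustrative, and a dense solver is used only for brevity, whereas Sect. 3.3 reports an iterative solver.

```python
import numpy as np


def linear_geostat_inversion(z_obs, H_obs, X, Q, R):
    """Best estimate and posterior covariance of s from Eqs. (4), (6)-(8).

    z_obs : (n,), H_obs : (n, m_t), X : (m_t, p), Q : (m_t, m_t), R : (n, n)
    """
    n, p = H_obs.shape[0], X.shape[1]
    Sigma = H_obs @ Q @ H_obs.T + R                        # Eq. (4)
    Phi = H_obs @ X
    lhs = np.block([[Sigma, Phi], [Phi.T, np.zeros((p, p))]])
    rhs = np.vstack([H_obs @ Q, X.T])
    sol = np.linalg.solve(lhs, rhs)                        # Eq. (6)
    Lambda_T, M = sol[:n], sol[n:]                         # (n, m_t) and (p, m_t)
    s_hat = Lambda_T.T @ z_obs                             # Eq. (7)
    V_s = Q - Q @ H_obs.T @ Lambda_T - X @ M               # Eq. (8)
    return s_hat, V_s


def plume_from_release(H_all, s_hat, V_s):
    """Eqs. (11)-(12): map the release history to all estimation points."""
    return H_all @ s_hat, H_all @ V_s @ H_all.T
```

The second block row of Eq. (6) enforces unbiasedness with respect to the drift: data generated by the drift alone are mapped back to that drift exactly.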

2.3 Principal Component Geostatistical Approach for Hydraulic Transmissivity Estimation

This study adopts the principal component geostatistical approach (PCGA: Kitanidis and Lee 2014; Lee and Kitanidis 2014), as reviewed below, to estimate the hydraulic transmissivity distribution. The observation ${\varvec{y}} \in {\mathbb{R}}^{n \times 1}$ can be expressed by the forward model h with ${\varvec{r}} \in {\mathbb{R}}^{m \times 1}$ and the observation error ${\varvec{v}} \in {\mathbb{R}}^{n \times 1}$ as

$$ \begin{array}{*{20}c} {{\varvec {y}} = h\left( {\varvec{r}} \right) + {\varvec {v}}.} \\ \end{array} $$

(13)

For the present case, y corresponds to only head data or to head and concentration data. r and v are assumed to follow the multivariate Gaussian distributions ${\varvec{r}}\sim N\user2{ }\left( {\user2{X\beta },{\varvec{Q}}\left( {\theta_{r} } \right)} \right)$ and ${\varvec{v}}\sim N\user2{ }\left( {0,{\varvec{R}}} \right)$, where ${\varvec{X}} \in {\mathbb{R}}^{m \times p}$ is a known matrix of basis functions; ${\varvec{\beta}} \in {\mathbb{R}}^{p \times 1}$ represents p unknown drift coefficients; ${\varvec{Q}}\left( {\theta_{r} } \right) \in {\mathbb{R}}^{m \times m}$ is a generalized covariance matrix of r; θ_r is the structural parameter of Q; and R is the error covariance matrix of y. As in the above release history, the best estimate $\hat{\user2{r}}$ is obtained by maximizing $p^{{{\prime \prime }}} \left( {{\varvec{r}},\user2{ \beta }} \right)$. Because Eq. (13) is not linear, the quasi-linear approach (Kitanidis 1995) is applied to approximate the true $\hat{\user2{r}}$ with the latest estimate $\overline{\user2{r}}$, such that

$$ \begin{array}{*{20}c} {h\left( {\hat{\user2{r}}} \right) = h\left( {\overline{\user2{r}}} \right) + {\varvec {H}}\left( {\hat{\user2{r}} - \overline{\user2{r}}} \right), {\varvec {H}} = \left. {\frac{\partial h}{{\partial {\varvec{r}}}}} \right|_{{{\varvec{r}} = \overline{\user2{r}}}} .} \\ \end{array} $$

(14)

The following equation system is solved to update $\overline{\user2{r}}$ until it converges

$$ \left( {\begin{array}{*{20}c}{\varvec{\varSigma}}& {{\varvec{HX}}} \\ {\left( {{\varvec{HX}}} \right)^{T} } & 0 \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {\overline{\user2{\xi }}} \\ {\overline{\user2{\beta }}} \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {{\varvec{y}} - h\left( {\overline{\user2{r}}} \right) + {\varvec {H}}\overline{\user2{r}}} \\ 0 \\ \end{array} } \right), $$

(15)

$$ \begin{array}{*{20}c} {{\varvec{\varSigma}} = {\varvec {HQ}}{\varvec{H}}^{T} + {\varvec {R}}}, \\ \end{array} $$

(16)

$$ \begin{array}{*{20}c} {\overline{\user2{r}} = {\varvec {X}}\overline{\user2{\beta }} + {\varvec {Q}}{\varvec{H}}^{T} \overline{\user2{\xi }}.} \\ \end{array} $$

(17)
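The update of Eqs. (15) to (17) can be written compactly when the Jacobian is available explicitly. The sketch below is a plain dense version for small problems, not the matrix-free PCGA variant; `h` and `jac` are hypothetical callables returning the forward model output and its Jacobian, and the convergence test on the change in the estimate is an assumed choice.

```python
import numpy as np


def quasi_linear_step(h, jac, r_bar, y, X, Q, R):
    """One update of the current estimate r_bar via Eqs. (15)-(17)."""
    n, p = y.size, X.shape[1]
    H = jac(r_bar)                                          # Eq. (14)
    Sigma = H @ Q @ H.T + R                                 # Eq. (16)
    HX = H @ X
    lhs = np.block([[Sigma, HX], [HX.T, np.zeros((p, p))]])
    rhs = np.concatenate([y - h(r_bar) + H @ r_bar, np.zeros(p)])
    sol = np.linalg.solve(lhs, rhs)                         # Eq. (15)
    xi, beta = sol[:n], sol[n:]
    return X @ beta + Q @ H.T @ xi                          # Eq. (17)


def quasi_linear_solve(h, jac, r0, y, X, Q, R, max_iter=50, tol=1e-8):
    """Repeat the update until the estimate stops changing (assumed criterion)."""
    r_bar = r0
    for _ in range(max_iter):
        r_new = quasi_linear_step(h, jac, r_bar, y, X, Q, R)
        if np.linalg.norm(r_new - r_bar) <= tol * (1.0 + np.linalg.norm(r_bar)):
            return r_new
        r_bar = r_new
    return r_bar
```

For a linear forward model the right-hand side no longer depends on the linearization point, so a single step already gives the final estimate.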

Once the optimal solution $\hat{\user2{r}}$ is obtained, its posterior covariance ${\varvec {V}}_{{\hat{r}}}$ can be calculated in the same way as in Eqs. (6) and (8).

PCGA was proposed to obtain $\hat{\user2{r}}$ efficiently at low computational cost by improving the conventional GA through two approaches. The first approach is the use of a Taylor expansion for the indirect expression of ${\varvec{H}}$

$$ \begin{array}{*{20}c} {{\varvec {Ha}} \approx \frac{1}{{\delta_{r} }}\left[ {h\left( {\overline{\user2{r}} + \delta_{r} {\varvec{a}}} \right) - h\left( {\overline{\user2{r}}} \right)} \right],} \\ \end{array} $$

(18)

where a is the target vector, such as $\overline{\user2{r}}$, and $\delta_{r}$ is a finite difference interval that can be optimized as (Lee et al. 2016)

$$ \begin{array}{*{20}c} {\widehat{{\delta_{r} }} = \frac{{\sqrt {\varepsilon_{r} } }}{{\left\| {\varvec{a}} \right\|_{2}^{2} }}\max \left( {\left| {\overline{\user2{r}}^{T} {\varvec{a}}} \right|,\left| {\overline{\user2{r}}} \right|^{T} \left| {\varvec{a}} \right|} \right){\text{sign}}\left( {\overline{\user2{r}}^{T} {\varvec{a}}} \right),} \\ \end{array} $$

(19)

where ε_r is the relative machine precision depending on the precision of the forward model; $\left| {\varvec{a}} \right| = \left( {\left| {a_{1} } \right|, \ldots , \left| {a_{m} } \right|} \right)^{T}$; and sign() indicates the sign of a value. The second approach is a low-rank approximation of Q as

$$ \begin{array}{*{20}c} {{\varvec {Q}} \approx {\varvec{Z}}_{Q} {\varvec{Z}}_{Q}^{T} = \mathop \sum \limits_{i = 1}^{K} {\varvec{\zeta}}_{i} {\varvec{\zeta}}_{i}^{T} , {\varvec{\zeta}}_{i} = \sqrt {\lambda_{i} } {\varvec{V}}_{i} ,} \\ \end{array} $$

(20)

where λ_i and ${\varvec{V}}_{i} \in {\mathbb{R}}^{m \times 1}$ are the ith eigenvalue and eigenvector of Q in the descending order. The order K can be defined such that the relative error of the low-rank approximation, $\lambda_{K + 1} /\lambda_{1}$, is sufficiently small. All of the above calculations are implemented by normalizing the drift and covariance of the prior model, as explained in the Appendix, following Kitanidis and Lee (2014).
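The two ingredients of PCGA reviewed above can be sketched as follows. The finite difference is taken about the linearization point r (the latest estimate, consistent with Eq. 14 and with the r appearing in Eq. 19), which is an interpretation made for this example. Treating a zero inner product as positive and falling back to a plain step when Eq. (19) returns zero are also assumptions. A full eigendecomposition is used for clarity, whereas large problems would call for a truncated or randomized solver.

```python
import numpy as np


def fd_step(r, a, eps=np.finfo(float).eps):
    """Finite-difference interval in the spirit of Eq. (19)."""
    ra = float(r @ a)
    sign = 1.0 if ra >= 0.0 else -1.0          # assumption: sign(0) taken as +1
    scale = max(abs(ra), float(np.abs(r) @ np.abs(a)))
    return np.sqrt(eps) / float(a @ a) * scale * sign


def jacobian_vector_product(h, r, a):
    """Approximate H a with one extra forward run (Eq. 18), H evaluated at r."""
    delta = fd_step(r, a)
    if delta == 0.0:                           # assumption: fallback step size
        delta = np.sqrt(np.finfo(float).eps)
    return (h(r + delta * a) - h(r)) / delta


def low_rank_factor(Q, rel_tol=1e-3):
    """Z_Q of Eq. (20) with the smallest K giving lambda_{K+1}/lambda_1 <= rel_tol."""
    lam, V = np.linalg.eigh(Q)                 # eigenvalues in ascending order
    lam, V = lam[::-1], V[:, ::-1]             # reorder to descending
    below = np.nonzero(lam / lam[0] <= rel_tol)[0]
    K = int(below[0]) if below.size else lam.size
    return V[:, :K] * np.sqrt(lam[:K]), K
```

With these two pieces, products such as H Z_Q need only K extra forward runs, one per column of Z_Q, instead of one run per unknown.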

2.4 Conditional Realizations of the Transmissivity and Initial Plume Distributions

The uncertainty of z(x,t) is assessed considering the uncertainties of both the initial contaminant plume and the transmissivity distributions by generating their conditional realizations. Conditional realizations can be drawn from the posterior pdf using either the Cholesky decomposition of the posterior covariance (Harvey and Gorelick 1995; Nowak 2009; Troldborg et al. 2012) or the parametric bootstrapping sampling method (Kitanidis 1995; Kitanidis and Lee 2014). Because of its simplicity and low computational cost, the Cholesky approach is adopted here, where the ith conditional realization of the transmissivity distribution $\widehat{{{\varvec{r}}_{c} }}_{i}$ is

$$ \begin{array}{*{20}c} {\widehat{{{\varvec{r}}_{c} }}_{i} = \widehat{{{\varvec{r}}_{u} }}_{i} + \left( {{\varvec{HQ}}} \right)^{T} {\varvec{\varSigma}}^{ - 1} \left( {{\varvec{y}} - h\left( {\widehat{{{\varvec{r}}_{u} }}_{i} } \right) + {\varvec{v}}_{i} } \right),} \\ \end{array} $$

(21)

where $\widehat{{{\varvec{r}}_{u} }}_{i} \sim N\left( {\hat{\user2{r}}, V_{{\hat{r}}} } \right)$ is the ith unconditional realization of the transmissivity randomly sampled from the posterior pdf and ${\varvec{v}}_{i} \sim N\user2{ }\left( {0,{\varvec{R}}} \right)$ is the ith random measurement error of the head and concentration. In the same way, the ith realization of the initial contaminant plume distribution ${\widehat{{\varvec{z}}_0}}_{c_i}$ can be written as

$$ \widehat{{{\varvec{z}}_{0}}}_{c_i} = {\varvec{H}}_{s} \widehat{{{\varvec{s}}_{c} }}_{i} , \widehat{{{\varvec{s}}_{c} }}_{i} = \widehat{{{\varvec{s}}_{u} }}_{i} + \left( {{\varvec{H}}_{s}^{*} {\varvec{Q}}_{s} } \right)^{T} {\varvec{\varSigma}}_{s}^{ - 1} \left( {{\varvec{z}}_{0}^{{*}} - {\varvec{H}}_{s}^{*} \widehat{{{\varvec{s}}_{u} }}_{i} + {\varvec{v}}_{z_i}} \right), $$

(22)

where $\widehat{{{\varvec{s}}_{u} }}_{i}$ is the ith unconditional realization of the release history and ${\varvec{v}}_{z_i} \sim N\user2{ }\left( {0,{\varvec{R}}_{z} } \right)$ is the ith random measurement error of the initial concentration. $\widehat{{{\varvec{s}}_{u} }}_{i}$ can be inversely calculated from the ith realization of the release history in the transformed space, $\widehat{{\tilde{\user2{s}}_{u} }}_{i} \sim N\left( {\widehat{{\tilde{\user2{s}}}}, V_{{\widehat{{\tilde{s}}}}} } \right)$.
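A minimal sketch of the Cholesky-based sampling in Eq. (21) is shown below for a linear forward model h(r) = H r, which is an assumption made to keep the example self-contained. The mean and covariance of the unconditional samples are passed in explicitly, and the small diagonal jitter added before factorization is a numerical convenience rather than part of the method.

```python
import numpy as np


def conditional_realizations(y, H, Q, R, mean_u, cov_u, n_real, seed=0):
    """Draw n_real realizations following the structure of Eq. (21).

    Assumes a linear forward model h(r) = H r. mean_u and cov_u define the
    Gaussian from which the unconditional samples r_u are drawn.
    """
    rng = np.random.default_rng(seed)
    m, n = mean_u.size, y.size
    L_u = np.linalg.cholesky(cov_u + 1e-10 * np.eye(m))   # jitter for stability
    L_R = np.linalg.cholesky(R)
    Sigma = H @ Q @ H.T + R
    gain = np.linalg.solve(Sigma, H @ Q).T                # (HQ)^T Sigma^-1
    out = np.empty((n_real, m))
    for i in range(n_real):
        r_u = mean_u + L_u @ rng.standard_normal(m)       # unconditional sample
        v = L_R @ rng.standard_normal(n)                  # random measurement error
        out[i] = r_u + gain @ (y - H @ r_u + v)           # Eq. (21)
    return out
```

The same routine applies to Eq. (22) by passing the release-history quantities in place of the transmissivity ones and multiplying each row by the transpose of H_s afterwards.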

3 Numerical Experiment

3.1 Physical Model

The proposed geostatistical approach was tested via numerical experiments of contaminant transport under two-dimensional steady-state groundwater flow. Let the transmissivity field T in a confined aquifer be spatially variable but locally isotropic. The governing equation for groundwater flow in a saturated porous medium is expressed as

$$ {\varvec{\nabla}} \cdot {\varvec{u}} = Q_{f} \delta \left( {{\varvec{x}} - {\varvec{x}}_{f} } \right),\quad {\varvec{u}} = - T{\varvec{\nabla}} \varphi , $$

(23)

where u is the groundwater flow velocity; T is the transmissivity; φ is the hydraulic head; δ(x) is the Dirac delta function; and Q_f is the pumping rate at a well location x_f. Under this state, the contaminant transport is expressed by the advection–dispersion equation as

$$ \nabla \cdot \left( {{\varvec{D}}\nabla c - {\varvec{V}}c} \right) = R_{f} \frac{\partial c}{{\partial t}} + \lambda_{f} c, $$

(24)

where c is the dimensionless concentration; ${\varvec{V}} = {\varvec{u}}/\varepsilon$ is the actual groundwater velocity; D is the dispersion tensor; ε is the porosity; R_f is the retardation factor; and λ_f is the radioactive or first-order biochemical decay constant. Each component of D is formulated as

$$ \begin{array}{*{20}c} {D_{ij} = \alpha_{T} \left| {\varvec{V}} \right|\delta_{ij} + \left( {\alpha_{L} - \alpha_{T} } \right)\frac{{V_{i} V_{j} }}{{\left| {\varvec{V}} \right|}} + D_{m} \tau \delta_{ij} \left( {i,j = 1,2} \right),} \\ \end{array} $$

(25)

where α_L and α_T are the longitudinal and transverse dispersivities, respectively; D_m is the molecular diffusion coefficient; and τ is the tortuosity.
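Equation (25) is straightforward to evaluate for a given velocity vector. The sketch below is illustrative; the zero-velocity branch, which leaves only the diffusion term, is an assumption added to avoid division by zero, and the numbers in the example are arbitrary.

```python
import numpy as np


def dispersion_tensor(V, alpha_L, alpha_T, D_m=0.0, tau=1.0):
    """2 x 2 dispersion tensor of Eq. (25) for a velocity vector V = (V_1, V_2)."""
    V = np.asarray(V, dtype=float)
    speed = np.linalg.norm(V)
    D = D_m * tau * np.eye(2)                  # molecular diffusion part
    if speed > 0.0:
        D = D + alpha_T * speed * np.eye(2)
        D = D + (alpha_L - alpha_T) * np.outer(V, V) / speed
    return D


# Flow along x at 2 m/d with dispersivities of 5.0 and 0.5:
D = dispersion_tensor([2.0, 0.0], alpha_L=5.0, alpha_T=0.5)
# D is diagonal here, with alpha_L |V| along the flow and alpha_T |V| across it.
```

For a velocity that is not aligned with the axes, the principal values stay the same and only the orientation of the tensor changes.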

3.2 Settings of Two Cases

On the basis of reviews of groundwater contamination events caused by water-soluble pollutants, the contamination extent generally ranges from scales of 100 m to 1 km (e.g., for over 2000 sites in California, the median plume length was 270 m for 1,4-dioxane, 115 m for 1,1,1-trichloroethane, 95 m for trichloroethene, and 123 m for 1,1-dichloroethene; Adamson et al. 2014). At such a scale, contaminated water can be pumped from wells set at several locations. To simplify the present experiment, only one pumping well was set at (x, y) = (25, 0) m in a model domain of 100 m along the x-axis (the flow direction) × 50 m along the y-axis (Fig. 2).

Two cases of transmissivity fields with different degrees of heterogeneity were prepared by referring to the experimental model of Lee and Kitanidis (2014): case 1 had a smooth spatial change in the transmissivity and case 2 had a highly heterogeneous field with local changes in the transmissivity. The mean log-transmissivity (m²/d) of 2.4 was the same for both cases; this corresponds to the product of the assigned aquifer thickness, 10 m, and a typical hydraulic conductivity for porous sand, 10⁻⁵ m/s (Zanini and Kitanidis 2009). The difference between the two cases is expressed by the spatial correlation range of the field, which is long for case 1 and short for case 2, as shown by the covariance functions in Table 1. The covariance functions were a generalized cubic covariance with a linear drift following Zanini and Kitanidis (2009), which is continuously differentiable and smooth (Kitanidis and Lee 2014), for case 1, and an isotropic exponential covariance with a constant drift for case 2. Constant-head boundaries were set at x = 0 (inflow) and 100 m (outflow) with a head difference of 0.2 m, and impermeable boundary conditions were set at both y edges (y = ± 25 m) (Fig. 3). The contaminant concentration at x = 0 m and the dispersive flux at x = 100 m were both zero. The longitudinal and transverse dispersivities were defined as 5.0 and 0.5, respectively, considering that $\alpha_{L} \sim 0.1L_{p}$ (Lallemand-Barres and Peaudecerf 1978; Pickens and Grisak 1981; Spitz and Moreno 1996) or $\alpha_{L} = 0.83\left[ {\log_{10} \left( {L_{p} } \right)} \right]^{2.414}$, where L_p is the plume length from the source [m] (Xu and Eckstein 1995), and that α_T is approximately 0.1α_L (Gelhar et al. 1992; Wiedemeier et al. 1999). R_f and λ_f were not considered, and τ was set to 1.

Table 1 Parameter settings for the two cases with smooth and heterogeneous transmissivity fields (cases 1 and 2, respectively)

The contaminant plume distribution originated from a known source at (x, y) = (25, 0) m. The release of the contaminant starts at t = −300 days (case 1) and −150 days (case 2) before the initial measurement time (t = 0) and ends at t = 0. The source intensity was assumed to increase linearly from 0 to 1. Hydraulic heads were measured at 35 monitoring wells under steady state for one pumping well at the source location (Fig. 4), and contaminant concentrations were measured monthly over 1 year (t = 0 to 1 year) at 18 monitoring wells located uniformly on the downstream side. In the calculation, the mean travel time was used instead of the transient concentration data, as suggested for tracer data (Harvey and Gorelick 1995; Ezzedine and Rubin 1996; Cirpka and Kitanidis 2000; Lee and Kitanidis 2014)

$$ \overline{t_{\boldsymbol{x}_{i}}} = \frac{\sum_{t = t_{0}}^{t_{end}} t\, z\left( \boldsymbol{x}_{i}, t \right) \Delta t}{\sum_{t = t_{0}}^{t_{end}} z\left( \boldsymbol{x}_{i}, t \right) \Delta t}, \tag{26} $$

where $\overline{t_{\boldsymbol{x}_{i}}}$ is the mean travel time at position x_i; z(x_i, t) is the concentration measured at x_i at time t; $t = (t_{0}, \ldots, t_{end})$ are the measurement times; and Δt is the measurement interval.
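Equation (26) is a concentration-weighted average of the measurement times. A minimal sketch for one well is given below; the function name and the breakthrough values are hypothetical and serve only to show the calculation.

```python
def mean_travel_time(times, concentrations, dt):
    """Concentration-weighted mean of the measurement times, as in Eq. (26).

    times          -- measurement times t_0, ..., t_end
    concentrations -- z(x_i, t) at one well, same length as times
    dt             -- measurement interval (cancels out when it is uniform)
    """
    numerator = sum(t * z * dt for t, z in zip(times, concentrations))
    denominator = sum(z * dt for z in concentrations)
    if denominator == 0.0:
        raise ValueError("no contaminant detected; mean travel time is undefined")
    return numerator / denominator


# Hypothetical monthly record over one year (days; arbitrary concentration units).
times = [30.0 * k for k in range(13)]
conc = [0.00, 0.01, 0.04, 0.10, 0.18, 0.25, 0.28, 0.25, 0.18, 0.10, 0.04, 0.01, 0.00]
print(f"mean travel time: {mean_travel_time(times, conc, dt=30.0):.1f} days")
```

Because the sketch condenses a whole breakthrough curve into a single number per well, 18 wells yield the 18 travel-time data used in the inversion.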

Using the 35 head and 18 travel-time data, the log-transmissivities at the 5,000 (100 × 50) cells at intervals of 1 m along the x- and y-axes were estimated. Assuming that the transmissivities at the 35 wells were known, the unknown structural parameters θ_r were estimated to be 2.0 × 10⁻⁵ for case 1 and 12.5 for case 2. In the contaminant plume estimation, the optimal values of both $\sigma_{R_{z}}$ and θ_s were determined simultaneously. The standard deviations of the measurement errors of the head and concentration (or the mean travel time, as mentioned above) were set to 0.05 m (approximately 5% of the maximum head change as a result of pumping) and 10%, respectively, following Lee and Kitanidis (2014). Gaussian random errors with zero mean and the corresponding standard deviations were added to all of the measurement data. The unknown release histories were recovered at ten-day intervals over the 1,350 days prior to the start of measurement, which is sufficiently long to express the Jacobian matrix for the contaminant plume evolution from the source to the model boundary.
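The perturbation of the synthetic data can be sketched as follows. This is an illustration under two assumptions that the text does not confirm: the 10% error is taken as a standard deviation relative to each travel-time value, and the random number generator and seed are arbitrary. All names and input values are hypothetical.

```python
import random


def add_measurement_noise(heads, travel_times, sigma_head=0.05,
                          rel_sigma_time=0.10, seed=0):
    """Add zero-mean Gaussian errors to synthetic head and travel-time data.

    sigma_head     -- absolute standard deviation of the head error (m)
    rel_sigma_time -- ASSUMED relative standard deviation of the travel time
    """
    rng = random.Random(seed)
    noisy_heads = [h + rng.gauss(0.0, sigma_head) for h in heads]
    noisy_times = [t + rng.gauss(0.0, rel_sigma_time * abs(t)) for t in travel_times]
    return noisy_heads, noisy_times


# Hypothetical error-free data: 35 heads (m) and 18 mean travel times (days).
true_heads = [0.2 - 0.002 * k for k in range(35)]
true_times = [100.0 + 10.0 * k for k in range(18)]
noisy_heads, noisy_times = add_measurement_noise(true_heads, true_times)
print(len(noisy_heads), len(noisy_times))
```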

3.3 Calculation Execution Conditions

The forward simulation of the groundwater flow and transport was executed using 3D-SEEP (Kimura and Muraoka 1986), which is based on the three-dimensional Galerkin finite element method. Singular value decompositions for the low-rank approximations were computed in parallel using the ScaLAPACK package (Blackford et al. 1997). The linear systems of Eqs. (6) and (15) were solved using the generalized minimal residual method with a criterion for the relative residual error of ≤ 1 × 10⁻⁸. A PC with an Intel Core i9-11900K (3.50 GHz) CPU and 64 GB of memory was used for the numerical experiments.

For both cases, the initial transmissivity field was set to be uniform, with a mean log-transmissivity of −9.0 (transmissivity in m²/s). Following Lee and Kitanidis (2014), the rank of the low-rank approximation of Q was set to K = 96, for which the relative error of the approximation, $\lambda_{K + 1} /\lambda_{1}$, was 3.1 × 10⁻⁴% for case 1 and 1.2% for case 2. Only for the joint inversion of case 2, which is a strongly nonlinear problem, was K changed to 350, giving $\lambda_{K + 1} /\lambda_{1} = 0.18\%$. To ensure the monotonic convergence of the nonlinear transmissivity estimation problem [Eq. (13)], the optimal solution was identified using a line search (Zanini and Kitanidis 2009):

$$ \hat{\boldsymbol{r}} = \boldsymbol{r}_{i}\, \delta_{ls} + \boldsymbol{r}_{i + 1} \left( 1 - \delta_{ls} \right), \tag{27} $$

where r_i is the previous estimate, r_{i+1} is the updated estimate found using the Gauss–Newton procedure [Eq. (17)], and δ_ls is a scalar. The range of δ_ls was set to −0.1 ≤ δ_ls ≤ 1.1 following Zanini and Kitanidis (2009). Finally, the transmissivity estimation converged within 18 iterations for all cases with ε_r = 5 × 10⁻⁶.
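The blend in Eq. (27) can be illustrated with a simple scan over δ_ls. This is a sketch under assumptions: the text does not say how δ_ls is searched, so a uniform grid is used here, and the objective function below is a hypothetical stand-in for the one minimized in the inversion.

```python
def line_search(r_prev, r_new, objective, n_steps=25, lo=-0.1, hi=1.1):
    """Scan delta_ls on a grid and return the best blend of Eq. (27).

    delta_ls = 1 returns the previous estimate, delta_ls = 0 the full
    Gauss-Newton update; values outside [0, 1] extrapolate slightly.
    """
    best = (None, None, float("inf"))
    for k in range(n_steps):
        delta = lo + (hi - lo) * k / (n_steps - 1)
        candidate = [a * delta + b * (1.0 - delta) for a, b in zip(r_prev, r_new)]
        value = objective(candidate)
        if value < best[2]:
            best = (delta, candidate, value)
    return best


# Hypothetical objective: squared distance to a target field (illustration only).
target = [1.0, 1.0]
misfit = lambda r: sum((x - t) ** 2 for x, t in zip(r, target))

delta, r_hat, value = line_search([0.0, 0.0], [2.0, 2.0], misfit)
print(f"delta_ls = {delta:.2f}, objective = {value:.3e}")
```

Accepting only the candidate with the lowest objective value is what keeps the iteration monotonic: a Gauss–Newton step that overshoots is shortened rather than taken in full.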

To obtain the final solution, the estimated transmissivity distributions were updated two and three times for cases 1 and 2, respectively. The uncertainty of the contaminant plume distributions was evaluated using 10,000 realizations (N_r = 100 transmissivity realizations × N_z0 = 100 initial-plume realizations). Because of rounding errors, the eigenvalues of the posterior covariance included small negative values (approximately −10⁻⁷); all negative eigenvalues were therefore replaced with 1 × 10⁻¹⁰.
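The eigenvalue correction described above can be sketched as follows. The sampling step is an assumption made for illustration: it draws a realization from an eigendecomposition of the posterior covariance, which the text does not spell out here. Function names and the toy inputs are hypothetical.

```python
import math
import random


def floor_eigenvalues(eigenvalues, floor=1e-10):
    """Replace negative eigenvalues (rounding noise) with a small positive value."""
    return [floor if lam < 0.0 else lam for lam in eigenvalues]


def sample_realization(mean, eigenvalues, eigenvectors, rng):
    """Draw mean + sum_k sqrt(lam_k) * xi_k * v_k with xi_k ~ N(0, 1).

    ASSUMED sampling scheme; without the floor, math.sqrt would fail
    on the slightly negative eigenvalues.
    """
    sample = list(mean)
    for lam, vec in zip(floor_eigenvalues(eigenvalues), eigenvectors):
        scale = math.sqrt(lam) * rng.gauss(0.0, 1.0)
        for j, component in enumerate(vec):
            sample[j] += scale * component
    return sample


# Toy two-cell example with one slightly negative eigenvalue.
rng = random.Random(0)
realization = sample_realization([0.0, 0.0], [2.0, -1e-7],
                                 [[1.0, 0.0], [0.0, 1.0]], rng)

# Pairing every transmissivity realization with every initial-plume realization.
n_r, n_z0 = 100, 100
n_total = n_r * n_z0
print(len(realization), n_total)
```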

4 Results

4.1 Hydraulic Transmissivity

The best estimates and estimation variances of the log-transmissivity distributions are shown in Figs. 5 and 6, respectively. Even for the results using only the head data, sufficient accuracy of the best estimates can be confirmed by the near agreement between the simulated and measured heads having small root mean square errors (RMSE) of 0.047 (case 1) and 0.054 (case 2) (Fig. 7a). However, because of the measurement error, the results are spatially much smoother than the true fields in both cases. In particular, large underestimates occurred at the relatively high transmissivity portions; these are continuously distributed from the upstream to the downstream (case 1) and are heterogeneously distributed on the downstream side (case 2). This smoothing effect was remarkable in case 2 with the highly heterogeneous field; furthermore, the estimation uncertainty significantly increased with distance from the pumping well.

Conversely, through the joint use of the head and concentration data, large improvements in the estimation accuracy were confirmed for both cases; for example, the relatively high transmissivity portions were well reproduced. Both the measurement data of the head and the mean travel time were adequately reproduced within their 95% confidence intervals (Fig. 7b). Even though the estimation accuracy was low in the portions outside the contaminant plume transport, such as in the vicinity of the domain boundary, the joint data use clearly decreased the estimation uncertainties along the contaminant plume evolutions for both cases compared with the results using only the head data (Fig. 6).

Table 2 Summary of statistics of the best estimates of the log-transmissivity distributions (m²/d) for cases 1 and 2


The estimation accuracies of the log-transmissivity distributions were assessed in the measurement area, x = [12, 87] m and y = [−15, 15] m, using the following descriptive statistics: the coefficient of determination (R²) between the estimated and true values (Fig. 8); the mean (μ₀) and variance (ν₀) of the log-transmissivity field; and the mean (μ₂ = mean $\left[ \left( \boldsymbol{r}_{\text{true}} - \hat{\boldsymbol{r}} \right)^{2} \right]$) and variance (ν₂ = var $\left[ \left( \boldsymbol{r}_{\text{true}} - \hat{\boldsymbol{r}} \right)^{2} \right]$) of the squared differences (Table 2). These statistics demonstrate the large improvement of the estimation accuracy via the joint use of the head and concentration data for both cases (e.g., μ₂ was decreased by 76% and 50% for cases 1 and 2, respectively).
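The statistics μ₀, ν₀, μ₂, and ν₂ defined above can be computed as in the sketch below. The use of the population variance is an assumption (the text does not state which variance estimator was used), and the function name and input values are hypothetical.

```python
import math
import statistics


def accuracy_stats(r_true, r_est):
    """Descriptive statistics of an estimated log-transmissivity field."""
    squared_diff = [(a - b) ** 2 for a, b in zip(r_true, r_est)]
    mu2 = statistics.fmean(squared_diff)
    return {
        "mu0": statistics.fmean(r_est),           # mean of the estimated field
        "nu0": statistics.pvariance(r_est),       # variance (population, ASSUMED)
        "mu2": mu2,                               # mean squared difference
        "nu2": statistics.pvariance(squared_diff),
        "rmse": math.sqrt(mu2),                   # RMSE = sqrt(mu2)
    }


# Hypothetical values at four cells inside the measurement area.
stats = accuracy_stats([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(stats)
```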

4.2 Contaminant Plume Distribution

Using the optimal values of θ_s and $\sigma_{{R_{z} }}$ (Table 3), the release histories for the best estimates of the log-transmissivity distributions were estimated for the two cases, as shown in Fig. 9. When using both the head and concentration data, the recovered release peaks were slightly closer to the true peaks for both cases compared with the results when using only the head data. This is due to the improvement in the estimated transmissivity distribution, in particular the decrease in the underestimation induced when using only the head data. The magnitude of the estimated variance of the release history depends on both the concentration measurement error and the heterogeneity of the transmissivity distribution. As reported by Butera and Tanda (2003), the model mismatch error of the hydraulic field can increase the estimation variance for two main reasons: errors in the flow direction and the magnitude of the dispersion. The large confidence intervals in Fig. 9 are caused by the former factor in case 1 using only the head data; and by the latter factor in case 2 because of the large heterogeneous transmissivity.

Table 3 Optimal values of the structure parameters for the best estimates of the release history in the two cases


The best estimates of the transmissivity distributions were applied to the prediction of the best estimate of the contaminant plume evolution; Figs. 10 and 11 show the resulting plume distributions at t = 0, 1, and 2 years for cases 1 and 2. When using only the head data, the contaminant transport velocity was underestimated in both cases. This underestimation resulted in the maximum concentrations at t = 1 and 2 years being 1.4 (case 1) and 1.7 (case 2) times as large as the true concentrations. In addition, the center of the plume distribution differs from the true position. At t = 2 years, the true maximum concentrations are located at 85 m (case 1) and outside the model domain (case 2). However, the estimated positions when using only the head data are 69 m (case 1) and 66 m (case 2). Conversely, given the joint use of the head and concentration data, the plume evolution is well reproduced in both cases. The advantage of using both forms of data was shown by the significant increases in the mean of the linear correlation coefficients, evaluated at one-month intervals from t = 0 to 2 years, between the true and best-estimate concentrations at the 18 measurement points (Fig. 12): the time-averaged correlation coefficient increased from 0.72 to 0.97 for case 1 and from 0.67 to 0.97 for case 2.
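The time-averaged correlation coefficient used above can be sketched as follows, assuming it is the plain average of the Pearson coefficients computed for each monthly snapshot. The function names and the toy concentrations are hypothetical.

```python
import math


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equal-length sequences.

    Undefined (division by zero) if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def time_averaged_correlation(true_by_time, est_by_time):
    """Average the per-snapshot correlation over all measurement times."""
    coefficients = [pearson(t, e) for t, e in zip(true_by_time, est_by_time)]
    return sum(coefficients) / len(coefficients)


# Toy data: two snapshots at three wells (the study uses 18 wells per snapshot).
true_conc = [[0.1, 0.4, 0.2], [0.05, 0.30, 0.35]]
est_conc = [[0.2, 0.5, 0.2], [0.10, 0.25, 0.40]]
print(f"time-averaged r = {time_averaged_correlation(true_conc, est_conc):.3f}")
```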

The uncertainties in the contaminant plume transport were quantified for the results obtained through the joint data use (Fig. 13). The resultant uncertainties were sufficiently small compared with the best estimate values in Figs. 9 and 10. At t = 0, the uncertainties were relatively high as a result of the estimated variance of the release histories (Fig. 9); this was more conspicuous in case 2. However, these fluctuations decreased with time.

5 Discussion

The effectiveness and high accuracy of the joint clarification of the initial contaminant plume and transmissivity distributions using both the head and concentration data were demonstrated for the prediction of the contaminant plume evolution. The proposed GA method is applicable to any water-soluble contaminant with or without retardation and/or radioactive or biochemical decay. However, the flow and transport simulation was implemented only under steady-state flow conditions. This assumption does not hold where unsteady flow, typically caused by periodic pumping, must be considered. The next step is to incorporate unsteady flow in both unconfined and confined aquifers.

As mentioned above, the two main novel points of this study, compared with previous studies, are that tracer data are not used and that the uncertainty of the initial contaminant plume distribution is considered. Previous studies demonstrated that the joint use of the head and tracer data, with a defined anisotropic exponential covariance of the transmissivity, improves the estimation accuracy of the log-transmissivity inside the tracer paths (Cirpka and Kitanidis 2000; Lee and Kitanidis 2014), although they targeted a smaller model domain (40 m × 20 m) and a smaller head measurement error (5.0 × 10⁻⁴ m) than those used in this study. For a similar exponential covariance model (case 2), the joint use of the head and tracer data was demonstrated to improve the estimation accuracy based on the RMSE ($\sqrt{\mu_{2}}$) of the log-transmissivity in the entire domain by 10% compared with that obtained using only the head data (Cirpka and Kitanidis 2000). Although this study did not use tracer data, the RMSE in the measurement area was improved by 29% via the joint use of the head and concentration data. This high accuracy contributes to accurately reproducing the contaminant plume evolution even if the release history is unknown. Another noteworthy advantage of the present method is its capability to evaluate the uncertainty of the contaminant plume evolution, considering the uncertainties of both the initial contaminant plume and the transmissivity distributions. As a result, the estimation uncertainty of the contaminant plume evolution was sufficiently small compared with the best estimate values, even for the highly heterogeneous transmissivity field (case 2).
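The 29% improvement in RMSE is consistent with the reduction in μ₂ reported in Sect. 4.1, because the RMSE is $\sqrt{\mu_{2}}$. As a quick check using the rounded reductions of 50% (case 2) and 76% (case 1) quoted there:

$$ 1 - \sqrt{1 - 0.50} \approx 0.29, \qquad 1 - \sqrt{1 - 0.76} \approx 0.51. $$

The case 1 value is derived here from the rounded percentage only and is not a figure quoted in the text.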

To extend the proposed method to practical applications, the following three points need to be considered. The first point is the need to estimate the three-dimensional permeability (i.e., the hydraulic conductivity). Although this is possibly straightforward given the proposed method, a problem is the large cost of acquiring a sufficient amount of multi-depth data of the head and contaminant concentration for the three-dimensional estimation. Therefore, a suitable amount and location of the measurement data should be specified depending on the hydrogeological features of the target area. The second point is the reduction of the smoothing effect that is inherent in spatial estimation by geostatistical methods. As shown in this study, even though the estimated transmissivity field is spatially smoother than the true field, the contaminant plume evolution can be reproduced well because the contaminant plume spreads over time primarily as a result of mechanical dispersion. However, reducing the smoothing effect is indispensable when a hotspot-shaped concentration anomaly much higher than the surroundings needs to be reproduced. This reduction may be possible by applying geostatistical simulations to conditional realizations, typically via sequential Gaussian simulations (e.g., Deutsch and Journel 1998), turning band simulations (e.g., de Sá et al. 2021a, b), and the incorporation of discontinuous geological structures such as lithological contacts and unconformities (Fienen et al. 2008, 2009; Koike et al. 2022) and fracture/fault distribution (Zha et al. 2017). The third point is to incorporate constraints on the release history other than non-negativity, for example, the upper limit of concentration as a result of the contaminant solubility. This can be achieved via Gibbs sampling (Michalak 2008), a representative Markov chain Monte Carlo method whose effectiveness for contaminant plume estimation was demonstrated by Takai et al. (2022) using a set of field data.

6 Conclusions

This study developed a geostatistical method to achieve accurate estimation of contaminant plume evolutions via a joint clarification of the contaminant plume and hydraulic transmissivity distributions. One of the novelties of this method is the use of the contaminant concentration data in the consideration of the uncertainty of a contaminant plume distribution originating from an unknown release history. To verify the effectiveness and accuracy of the proposed method, two transmissivity fields with different spatial patterns were prepared: a high-contrast smooth field (case 1) and a highly heterogeneous field (case 2). The main obtained results are summarized as follows.

(1) Even though the contaminant plume distribution was unknown, a higher estimation accuracy of the hydraulic transmissivity distribution was achieved through the joint use of the head and concentration data than when using only the head data. For case 2, the estimation accuracy was improved to a degree similar to that in the previous study using tracer data (Cirpka and Kitanidis 2000).
(2) Using both the head and concentration data, the release peaks were recovered more accurately than when using only the head data. This superiority was due to the improvement of the estimated transmissivity distribution by decreasing the smoothing effect.
(3) A large difference between the results with and without the concentration data, in addition to the head data, was highlighted in the predicted contaminant plume evolution. Using only the head data, the velocity of the estimated plume transport was almost half the true value. Conversely, using both the head and concentration data, the plume evolutions were sufficiently well predicted for both cases, with high time-averaged correlation coefficients of 0.97 between the true and predicted concentrations. Furthermore, the uncertainties of the predicted plume distributions based on the conditional realizations of the initial plume and transmissivity distributions were sufficiently smaller than the magnitudes of the best estimates.

Consequently, the effectiveness and accuracy of the proposed approach were demonstrated even if the initial contaminant plume distribution is uncertain. Any forward modeling method for groundwater flow and contaminant transport can be incorporated into the method. Accurate predictions of the contaminant plume transport are helpful for planning remediation effectively, in particular when choosing the number and location of pumping wells. Because the applicability was demonstrated only via numerical experiments in this study, our next step will be a practical application to actual contaminated fields using multi-depth head and concentration data, targeting three-dimensional space and considering additional uncertainties such as those in the boundary conditions of the flow and transport model.

References

Adamson DT, Mahendra S, Walker KL Jr, Rauch SR, Sengupta S, Newell CJ (2014) A multisite survey to identify the scale of the 1,4-dioxane problem at contaminated groundwater sites. Environ Sci Technol Lett 1:254–258. https://doi.org/10.1021/ez500092u
Blackford LS, Choi J, Cleary A, D’Azevedo EF, Demmel J, Dhillon IS, Dongarra J, Hammarling S, Henry G, Petitet A, Stanley K, Walker DW, Whaley RC (1997) ScaLAPACK: a linear algebra library for message-passing computers. In: Proceedings of the eighth SIAM conference on parallel processing for scientific computing (PPSC 1997), Minnesota, USA, March 1997. Society for Industrial and Applied Mathematics
Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc Ser B 26:353–360
Butera I, Tanda MG (2003) A geostatistical approach to recover the release history of groundwater pollutants. Water Resour Res 39(12):1372. https://doi.org/10.1029/2003WR002314
Cardiff M, Barrash W (2011) 3-D transient hydraulic tomography in unconfined aquifers with fast drainage response. Water Resour Res 47(12):W12518. https://doi.org/10.1029/2010WR010367
Cardiff M, Barrash W, Kitanidis PK, Malama B, Revil A, Straface S, Rizzo E (2009) A potential-based inversion of unconfined steady-state hydraulic tomography. Ground Water 47(2):259–270. https://doi.org/10.1111/j.1745-6584.2008.00541.x
Chen Z, Gomez-Hernandez JJ, Xu T, Zanini A (2018) Joint identification of contaminant source and aquifer geometry in a sandbox experiment with the restart ensemble Kalman filter. J Hydrol 564:1074–1084. https://doi.org/10.1016/j.jhydrol.2018.07.073
Chen Z, Xu T, Gomez-Hernandez JJ, Zanini A (2021) Contaminant spill in a sandbox with non-Gaussian conductivities: simultaneous identification by the restart normal-score ensemble Kalman filter. Math Geosci 53:1587–1615. https://doi.org/10.1007/s11004-021-09928-y
Cirpka OA, Kitanidis PK (2000) Sensitivity of temporal moments calculated by the adjoint-state method and joint inversing of head and tracer data. Adv Water Resour 24(1):89–103. https://doi.org/10.1016/S0309-1708(00)00007-5
De Sá VR, Koike K, Goto T, Nozaki T, Takaya Y, Yamasaki T (2021a) A combination of geostatistical methods and principal components analysis for detection of mineralized zones in seafloor hydrothermal systems. Nat Resour Res 30:2875–2887. https://doi.org/10.1007/s11053-020-09705-4
De Sá VR, Koike K, Goto T, Nozaki T, Takaya Y, Yamasaki T (2021b) 3D geostatistical modeling of metal contents and lithofacies for mineralization mechanism determination of a seafloor hydrothermal deposit in the middle Okinawa Trough, Izena Hole. Ore Geol Rev 135:104194. https://doi.org/10.1016/j.oregeorev.2021.104194
Deutsch CV, Journel AG (1998) GSLIB: geostatistical software library and user’s guide. Applied geostatistics series. Oxford University Press, New York
Ezzedine S, Rubin Y (1996) A geostatistical approach to the conditional estimation of spatially distributed solute concentration and notes on the use of tracer data in the inverse problem. Water Resour Res 32(4):853–861. https://doi.org/10.1029/95WR02285
Fienen MN, Clemo T, Kitanidis PK (2008) An interactive Bayesian geostatistical inverse protocol for hydraulic tomography. Water Resour Res 44(12):W00B01. https://doi.org/10.1029/2007WR006730
Fienen MN, Hunt R, Krabbenhoft D, Clemo T (2009) Obtaining parsimonious hydraulic conductivity fields using head and transport observations: a Bayesian geostatistical parameter estimation approach. Water Resour Res 45(8):W08405. https://doi.org/10.1029/2008WR007431
Gelhar LW, Welty C, Rehfeldt KR (1992) A critical review of data on field-scale dispersion in aquifers. Water Resour Res 28(7):1955–1974. https://doi.org/10.1029/92WR00607
Gzyl G, Zanini A, Fraczek R, Kura K (2014) Contaminant source and release history identification in groundwater: a multi-step approach. J Contam Hydrol 157:59–72. https://doi.org/10.1016/j.jconhyd.2013.11.006
Harvey CF, Gorelick SM (1995) Mapping hydraulic conductivity—sequential conditioning with measurements of solute arrival time, hydraulic-head, and local conductivity. Water Resour Res 31(7):1615–1626. https://doi.org/10.1029/95WR00547
Illman WA, Liu X, Shinji T, Yeh TJ, Ando K, Saegusa H (2009) Hydraulic tomography in fractured granite: Mizunami Underground Research site, Japan. Water Resour Res 45(1):W01406. https://doi.org/10.1029/2007WR006715
Jiang Y, Woodbury AD (2006) A full-Bayesian approach to the inverse problem for steady-state groundwater flow and heat transport. Geophys J Int 167:1501–1512. https://doi.org/10.1111/j.1365-246X.2006.03145.x
Kimura H, Muraoka S (1986) The 3D-SEEP computer code user’s manual. Japan Atomic Energy Research Institute, JAERI-M 86-091. https://doi.org/10.11484/jaeri-m-86-091
Kitanidis PK (1995) Quasi-linear geostatistical theory for inversing. Water Resour Res 31(10):2411–2419. https://doi.org/10.1029/95WR01945
Kitanidis PK, Lee J (2014) Principal component geostatistical approach for large-dimensional inverse problems. Water Resour Res 50(7):5428–5443. https://doi.org/10.1002/2013WR014630
Koike K, Kiriyama T, Lu L, Kubo T, Heriawan MN, Yamada R (2022) Incorporation of geological constraints and semivariogram scaling law into geostatistical modeling of metal contents in hydrothermal deposits for improved accuracy. J Geochem Explor 233:106901. https://doi.org/10.1016/j.gexplo.2021.106901
Lallemand-Barres P, Peaudecerf P (1978) Recherche des relations entre les valeurs de la dispersivite macroscopique d’un milieu aquifere, ses autres caracteristiques et les conditions de mesures. Etude bibliographique. Bull BRGM III(4):277–284
Lee J, Kitanidis PK (2014) Large-scale hydraulic tomography and joint inversion of head and tracer data using the Principal Component Geostatistical Approach (PCGA). Water Resour Res 50(7):5410–5427. https://doi.org/10.1002/2014WR015483
Lee J, Yoon H, Kitanidis PK, Werth CJ, Valocchi AJ (2016) Scalable subsurface inverse modeling of huge data sets with an application to tracer concentration breakthrough data from magnetic resonance imaging. Water Resour Res 52(7):5213–5231. https://doi.org/10.1002/2015WR018483
Li W, Englert A, Cirpka OA, Vanderborght J, Vereecken H (2007) Two-dimensional characterization of hydraulic heterogeneity by multiple pumping tests. Water Resour Res 43(4):W04433. https://doi.org/10.1029/2006WR005333
Li W, Englert A, Cirpka OA, Vereecken H (2008) Three-dimensional geostatistical inversion of flowmeter and pumping test data. Ground Water 46(2):193–201. https://doi.org/10.1111/j.1745-6584.2007.00419.x
Luo N, Illman WA, Zha Y (2022) Large-scale three-dimensional hydraulic tomography analyses of long-term municipal wellfield operations. J Hydrol 610:127911. https://doi.org/10.1016/j.jhydrol.2022.127911
Michalak AM (2008) A Gibbs sampler for inequality-constrained geostatistical interpolation and inverse modeling. Water Resour Res 44(9):1–14. https://doi.org/10.1029/2007WR006645
Michalak AM, Kitanidis PK (2002) Application of Bayesian inference methods to inverse modeling for contaminant source identification at Gloucester Landfill, Canada. Comput Methods Water Resour XIV(2):1259–1266
Michalak AM, Kitanidis PK (2003) A method for enforcing parameter nonnegativity in Bayesian inverse problems with an application to contaminant source identification. Water Resour Res 39(2):1033. https://doi.org/10.1029/2002WR001480
Nowak W (2009) Best unbiased ensemble linearization and the quasi-linear Kalman ensemble generator. Water Resour Res 45(4):W04431. https://doi.org/10.1029/2008WR007328
OECD (2014) Nuclear site remediation and restoration during decommissioning of nuclear installations: a report by the NEA co-operative programme on decommissioning. Radioactive Waste Management, NEA No. 7192. https://doi.org/10.1787/9789264222182-en
Pickens JF, Grisak GE (1981) Scale-dependent dispersion in a stratified granular aquifer. Water Resour Res 17(4):1191–1211. https://doi.org/10.1029/WR017i004p01191
Pouladi B, Linde N, Longuevergne L, Bour O (2021) Individual and joint inversion of head and flux data by geostatistical hydraulic tomography. Adv Water Resour 154:103960. https://doi.org/10.1016/j.advwatres.2021.103960
Sanchez-Leon E, Leven C, Haslauer CP, Cirpka OA (2016) Combining 3D hydraulic tomography with tracer tests for improved transport characterization. Ground Water 54(4):498–507. https://doi.org/10.1111/gwat.12381
Shlomi S, Michalak AM (2007) A geostatistical framework for incorporating transport information in estimating the distribution of a groundwater contaminant plume. Water Resour Res 43(3):1–12. https://doi.org/10.1029/2006WR005121
Singh RM, Datta B (2004) Groundwater pollution source identification and simultaneous parameter estimation using pattern matching by artificial neural network. Environ Forensics 5(3):143–153. https://doi.org/10.1080/15275920490495873
Snodgrass MF, Kitanidis PK (1997) A geostatistical approach to contaminant source identification. Water Resour Res 33(4):537–546. https://doi.org/10.1029/96WR03753
Spitz K, Moreno J (1996) A practical guide to groundwater and solute transport modeling, 1st edn. Wiley, New York
Takai S, Shimada T, Takeda S, Koike K (2022) Evaluating the effectiveness of a geostatistical approach with groundwater flow modeling for three-dimensional estimation of a contaminant plume. J Contam Hydrol 251:104097. https://doi.org/10.1016/j.jconhyd.2022.104097
Troldborg M, Nowak W, Lange IV, Santos MC, Binning PJ, Bjerg PL (2012) Application of Bayesian geostatistics for evaluation of mass discharge uncertainty at contaminated sites. Water Resour Res 48(9):W09535. https://doi.org/10.1029/2011WR011785
Vu MT, Jardani A (2022) Mapping of hydraulic transmissivity field from inversion of tracer test data using convolutional neural networks. CNN-2T. J Hydrol 606:127443. https://doi.org/10.1016/j.jhydrol.2022.127443
Wagner BJ (1992) Simultaneous parameter estimation and contaminant source characterization for coupled groundwater flow and contaminant transport modelling. J Hydrol 135:275–303. https://doi.org/10.1016/0022-1694(92)90092-A
Wang X, Jardani A, Jourde H (2017) A hybrid inverse method for hydraulic tomography in fractured and karstic media. J Hydrol 551:29–46. https://doi.org/10.1016/j.jhydrol.2017.05.051
Wiedemeier TH, Rifai HS, Newell CJ, Wilson JT (1999) Natural attenuation of fuels and chlorinated solvents in the subsurface. Wiley, New York
Woodbury A, Sudicky E, Ulrych TJ, Ludwig R (1998) Three-dimensional plume source reconstruction using minimum relative entropy inversion. J Contam Hydrol 32:131–158. https://doi.org/10.1016/S0169-7722(97)00088-0
Xu M, Eckstein Y (1995) Use of weighted least-squares method in evaluation of the relationship between dispersivity and field scale. Ground Water 33(6):905–908. https://doi.org/10.1111/j.1745-6584.1995.tb00035.x
Xu T, Gomez-Hernandez JJ (2018) Simultaneous identification of a contaminant source and hydraulic conductivity via the restart normal-score ensemble Kalman filter. Adv Water Resour 112:106–123. https://doi.org/10.1016/j.advwatres.2017.12.011
Xu T, Gomez-Hernandez JJ, Chen Z, Lu C (2021) A comparison between ES-MDA and restart EnKF for the purpose of the simultaneous identification of a contaminant source and hydraulic conductivity. J Hydrol 595:125681. https://doi.org/10.1016/j.jhydrol.2020.125681
Zanini A, Kitanidis PK (2009) Geostatistical inversing for large-contrast transmissivity fields. Stochas Environ Res Risk Assess 23(5):565–577. https://doi.org/10.1007/s00477-008-0241-7
Zha Y, Yeh TCJ, Illman WA, Onoe H, Mok CMW, Wen JC, Huang SY, Wang W (2017) Incorporating geologic information into hydraulic tomography: a general framework based on geostatistical approach. Water Resour Res 53(4):2850–2876. https://doi.org/10.1002/2016WR019185
Zha Y, Yeh TCJ, Illman WA, Zeng W, Zhang Y, Sun F, Shi L (2018) A reduced-order successive linear estimator for geostatistical inversion and its application in hydraulic tomography. Water Resour Res 54(3):1616–1632. https://doi.org/10.1002/2017WR021884


Acknowledgements

This study was funded by the Secretariat of Nuclear Regulation Authority, Nuclear Regulation Authority, Japan. Sincere thanks are extended to the two anonymous reviewers for their essential and constructive comments and suggestions that improved the clarity of this manuscript.

Author information

Authors and Affiliations

Nuclear Safety Research Center, Japan Atomic Energy Agency, 2-4 Shirakata, Tokai, Ibaraki, 319-1195, Japan
Shizuka Takai, Taro Shimada & Seiji Takeda
Department of Urban Management, Graduate School of Engineering, Kyoto University, Katsura, Kyoto, 615-8540, Japan
Shizuka Takai & Katsuaki Koike

Authors

Shizuka Takai
Taro Shimada
Seiji Takeda
Katsuaki Koike

Contributions

All authors contributed to the study conception and design. Data analysis and modeling were planned and performed by ST and KK. The first draft of the manuscript was written by ST, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Shizuka Takai.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix: Normalization of the Prior Model

The geostatistical estimation of the transmissivity was implemented via the normalization of the prior model using the following procedure. More details are given in Kitanidis and Lee (2014).

List of symbols
Parameter	Description	Dimension
All
t	Time
x	Space	m × 1
m_t	Number of unknowns (source intensity)
m	Number of unknowns (contaminant plume/transmissivity)
n_z	Number of observations (concentration)
n_φ	Number of observations (head)
δ(x)	Dirac delta function
Estimation of initial contaminant plume distribution
s	Discretized unknown source intensity	m_t × 1
z₀	Discretized unknown initial contaminant plume	m × 1
${\varvec{z}}_{0}^{*}$	Observation (initial concentration)	n_z × 1
H_s,${\varvec{H}}_{s}^{*}$	Jacobian matrix (for whole domain/observations)	$m \times m_{t} ,n_{z} \times m_{t}$
v_z	Observation error	n_z × 1
$\sigma_{{R_{z} }}$	Standard deviation of error
R_z	Error covariance matrix	n_z × n_z
X_s	Known drift matrix	m_t × p_s
β_s	Unknown drift coefficients	p_s × 1
Q_s	Generalized prior covariance matrix of unknown function	m_t × m_t
θ_s	Structure parameter of covariance Q_s
${\varvec{V}}_{{\hat{s}}}$	Covariance matrix of estimated source intensity	m_t × m_t
${\varvec{V}}_{{\hat{z}}}$	Covariance matrix of estimated initial contaminant plume	m × m
Estimation of hydraulic transmissivity distribution
n	Number of observations
r	Discretized unknown log-transmissivity	m × 1
z(t)	Discretized unknown contaminant plume at time t	m × 1
y	Observation	n × 1
$\user2{\varphi }$	Observation (head)	$n_{\varphi } \times 1$
${\varvec{z}}^{\user2{*}}$	Observation (transient concentration)	$n_{z} \times \left( {t_{0} , \ldots ,t_{end} } \right)$
$\overline{\user2{t}}$	Observation (mean travel time)	$n_{z} \times 1$
H	Jacobian matrix	n × m
v	Observation error	n × 1
$\sigma_{R}$	Standard deviation of error
R	Error covariance matrix	n × n
X	Known drift matrix	m × p
β	Unknown drift coefficients	p × 1
Q	Generalized prior covariance matrix of unknown function	m × m
θ_r	Structure parameter of covariance Q
K	Rank of approximation of Q
${\varvec{V}}_{{\hat{\user2{r}}}}$	Covariance matrix of estimated transmissivity	m × m
δ_r	Finite difference interval
ε_r	Relative machine precision
$\lambda_{i}$	ith eigenvalue of Q
V_i	ith eigenvector of Q	m × 1
δ_ls	Finite difference interval for line search
Conditional realization
$N_{{z_{0} }}$	Number of realizations (initial contaminant plume)
N_r	Number of realizations (transmissivity)
${\varvec{s}}_{{\varvec{u}}} , {\varvec{s}}_{{\varvec{c}}}$	Unconditional/conditional realization of s	m_t × 1
${\varvec{z}}_{0_{{\varvec{c}}}}$	Conditional realization of z₀	m × 1
${\varvec{r}}_{{\varvec{u}}} , {\varvec{r}}_{{\varvec{c}}}$	Unconditional/conditional realization of r	m × 1
Physical model
u	Groundwater velocity	2 × 1
T	Transmissivity	m × 1
Q_f	Pumping rate
V	Actual groundwater velocity	2 × 1
R_f	Retardation factor
λ_f	Decay constant
D	Dispersion tensor	2 × 2
ε	Porosity
$\alpha_{L} , \alpha_{T}$	Dispersivity (longitudinal and transverse)
D_m	Molecular diffusion coefficient
τ	Tortuosity
L_p	Plume length

First, the drift matrix X is replaced with its normalized and isomorphic matrix U such that

$$ \begin{array}{*{20}c} {{\varvec {U}} = \left\{ {\begin{array}{*{20}c} {{\varvec{X}}/\sqrt m } & {\left( {p = 1} \right)} \\ {{\varvec{US}}_{X} {\varvec{V}}_{X}^{T} } & {\left( {p > 1} \right)} \\ \end{array} ,} \right.} \\ \end{array} $$

(28)

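where ${\varvec{S}}_{X} \in {\mathbb{R}}^{p \times p}$ is a diagonal matrix of the singular values of X and the columns of ${\varvec{V}}_{X} \in {\mathbb{R}}^{p \times p}$ are the orthonormal eigenvectors of ${\varvec{X}}^{T} {\varvec{X}}$. For p > 1, the second case of Eq. (28) is understood as the singular value decomposition ${\varvec{X}} = {\varvec{US}}_{X} {\varvec{V}}_{X}^{T}$, with U taken as the m × p matrix of left singular vectors. Then, U is used to compute the detrending matrix ${\varvec{P}} \in {\mathbb{R}}^{m \times m}$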

$$ \begin{array}{*{20}c} {{\varvec {P}} = {\varvec {I}} - {\varvec {U}}{\varvec{U}}^{T} .} \\ \end{array} $$

(29)
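The normalization and detrending steps of Eqs. (28) and (29) can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names `normalize_drift` and `detrending_matrix` and the small linear-drift matrix are hypothetical, and the p > 1 branch assumes that U is the matrix of left singular vectors of X.

```python
import numpy as np

def normalize_drift(X):
    """Return U with orthonormal columns spanning the columns of X (Eq. 28)."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]              # treat a vector as an m x 1 drift matrix
    m, p = X.shape
    if p == 1:
        # p = 1 case: X / sqrt(m) has unit norm when X is a column of ones
        return X / np.sqrt(m)
    # p > 1 case (assumption): left singular vectors of the thin SVD of X
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U

def detrending_matrix(U):
    """Eq. (29): P = I - U U^T projects onto the complement of the drift."""
    return np.eye(U.shape[0]) - U @ U.T

m = 6
# Hypothetical drift matrix: constant plus linear trend (p = 2)
X = np.column_stack([np.ones(m), np.arange(m, dtype=float)])
U = normalize_drift(X)
P = detrending_matrix(U)
```

Because the columns of U are orthonormal, P is an orthogonal projector, so applying it to any vector in the column space of X returns zero.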

The next step is to detrend the low-rank covariance Q [Eq. (20)]. First, Z_Q is replaced with ${\varvec{PZ}}_{Q}$. Then, the singular value decomposition of the detrended Z_Q is calculated such that

$$ \begin{array}{*{20}c} {{\varvec{Z}}_{Q} = {\varvec{U}}_{Z} {\varvec{S}}_{Z} {\varvec{V}}_{Z}^{T} ,} \\ \end{array} $$

(30)

where ${\varvec{U}}_{Z} \in {\mathbb{R}}^{m \times K}$ is a matrix with orthonormal columns (the left singular vectors); ${\varvec{S}}_{Z} \in {\mathbb{R}}^{K \times K}$ is a diagonal matrix of the singular values; and the columns of ${\varvec{V}}_{Z} \in {\mathbb{R}}^{K \times K}$ are the orthonormal eigenvectors of ${\varvec{Z}}_{Q}^{T} {\varvec{Z}}_{Q}$. Using ${\varvec{C}} = {\varvec{U}}_{Z}^{T} {\varvec{QU}}_{Z}$, Q is replaced with its detrended and isomorphic matrix PQP

$$ \begin{array}{*{20}c} {{\varvec {PQP}} \approx {\varvec{U}}_{Z} {\varvec {C}}{\varvec{U}}_{Z}^{T} .} \\ \end{array} $$

(31)
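The detrending of the low-rank covariance in Eqs. (30) and (31) can be sketched as below. Several elements are assumptions made for illustration only: the exponential covariance on a one-dimensional grid, the constant-mean drift, and a rank-K factor built from the K largest eigenpairs so that Q is approximated by $Z_{Q} Z_{Q}^{T}$ (Eq. (20) is not reproduced here). C is evaluated with this detrended rank-K covariance.

```python
import numpy as np

m, K = 30, 5
x = np.linspace(0.0, 1.0, m)
# Hypothetical prior covariance: exponential kernel with correlation length 0.3
Q = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.3)

# Rank-K factor Z_Q from the K largest eigenpairs (eigh returns ascending order)
lam, V = np.linalg.eigh(Q)
Z_Q = V[:, -K:] * np.sqrt(lam[-K:])

# Constant-mean drift normalized as in Eq. (28), and detrending matrix of Eq. (29)
U = np.ones((m, 1)) / np.sqrt(m)
P = np.eye(m) - U @ U.T

# Replace Z_Q with P Z_Q, then take its thin SVD as in Eq. (30)
Z_Q = P @ Z_Q
U_Z, s, Vt_Z = np.linalg.svd(Z_Q, full_matrices=False)

# C = U_Z^T Q U_Z, with Q taken as the detrended rank-K covariance Z_Q Z_Q^T
C = U_Z.T @ (Z_Q @ Z_Q.T) @ U_Z
PQP_lowrank = U_Z @ C @ U_Z.T      # right-hand side of Eq. (31)
```

Under these assumptions C reduces to the diagonal matrix of squared singular values, so only a K × K matrix and the m × K basis U_Z need to be stored instead of the full m × m covariance.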

Then, HQ and ${\varvec{HQH}}^{T}$ can be approximated as

$$ \begin{array}{*{20}c} {{\varvec {HQ}} \approx \left( {{\varvec{HU}}_{Z} } \right){\varvec {C}}{\varvec{U}}_{Z}^{T} \equiv {\varvec {BC}}{\varvec{U}}_{Z}^{T} ,} \\ \end{array} $$

(32)

$$ \begin{array}{*{20}c} {{\varvec {HQ}}{\varvec{H}}^{T} \approx \left( {{\varvec{HU}}_{Z} } \right){\varvec {C}}\left( {{\varvec{HU}}_{Z} } \right)^{T} = {\varvec {BC}}{\varvec{B}}^{T} .} \\ \end{array} $$

(33)
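The cost saving behind Eqs. (32) and (33) can be illustrated with a short NumPy sketch. The sizes, the random Jacobian H, and the stand-in matrices U_Z and C are hypothetical; H is formed explicitly here only for simplicity, whereas in a matrix-free setting the columns of B = HU_Z could instead be approximated by finite differences of the forward model.

```python
import numpy as np

rng = np.random.default_rng(0)
m, K, n = 200, 8, 15          # hypothetical sizes: unknowns, rank, observations

# Stand-ins for the quantities of Eqs. (30) and (31)
U_Z, _ = np.linalg.qr(rng.standard_normal((m, K)))  # m x K, orthonormal columns
C = np.diag(np.linspace(2.0, 0.5, K))               # K x K, symmetric positive definite
H = rng.standard_normal((n, m))                     # hypothetical Jacobian

B = H @ U_Z                   # n x K product, the only step that involves H
HQ = B @ C @ U_Z.T            # Eq. (32): n x m
HQHT = B @ C @ B.T            # Eq. (33): n x n
```

Both products are built from B, C, and U_Z alone, so the m × m matrix Q is never formed.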

Finally, $\overline{\user2{r}}$ is updated iteratively by the following linear equation system corresponding to Eq. (6)

$$ \begin{array}{*{20}c} {\left( {\begin{array}{*{20}c}{\varvec{\varSigma}}& {{\varvec{HX}}} \\ {\left( {{\varvec{HX}}} \right)^{T} } & 0 \\ \end{array} } \right)\left( {\begin{array}{*{20}c} {{\varvec{\varLambda}}^{T} {\varvec{A}}_{p} } \\ {{\varvec{MA}}_{p} } \\ \end{array} } \right) = \left( {\begin{array}{*{20}c} {{\varvec{HQA}}_{p} } \\ {{\varvec{X}}^{T} {\varvec{A}}_{p} } \\ \end{array} } \right), {\varvec{A}}_{p} = \left( {{\varvec{U}}_{Z} , {\varvec{U}}} \right),} \\ \end{array} $$

(34)

$$ \begin{array}{*{20}c} {\overline{\user2{r}} = {\varvec{A}}_{p} \left( {{\varvec{\varLambda}}^{T} {\varvec{U}}_{Z} } \right)^{T} \left( {{\varvec{y}} - h\left( {\overline{\user2{r}}} \right) + \user2{H\overline{r}}} \right).} \\ \end{array} $$

(35)

Once the optimal solution $\hat{\user2{r}}$ is obtained, the posterior covariance ${\varvec {V}}_{{\hat{r}}}$ can be approximately calculated as

$$ \begin{array}{*{20}c} {{\varvec{V}}_{{\hat{r}}} = {\varvec{U}}_{Z} {\varvec {C}}{\varvec{U}}_{Z}^{T} - {\varvec {X}}\left( {{\varvec{MA}}_{p} } \right){\varvec{A}}_{p}^{T} - {\varvec {Q}}{\varvec{H}}^{T} \left( {{\varvec{\varLambda}}^{T} {\varvec{A}}_{p} } \right){\varvec{A}}_{p}^{T} .} \\ \end{array} $$

(36)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Takai, S., Shimada, T., Takeda, S. et al. Joint Clarification of Contaminant Plume and Hydraulic Transmissivity via a Geostatistical Approach Using Hydraulic Head and Contaminant Concentration Data. Math Geosci 56, 333–360 (2024). https://doi.org/10.1007/s11004-023-10084-8

Received: 10 November 2022
Accepted: 25 June 2023
Published: 31 July 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11004-023-10084-8
