Introduction

Inverse problems are very important in groundwater modeling since the quality of the groundwater model largely depends on the quality of the model parameters (Gómez-Hernández et al. 2003; Karahan and Ayvaz 2008; Franssen et al. 2009; Zhou et al. 2014). Many studies have focused on parameter estimation in the last few decades (e.g. Carrera et al. 2005; Dagan 1985; Doherty 2004; Gómez-Hernández et al. 2003; Franssen et al. 2009; Neuman 1973; Oliver et al. 1997; Zhou et al. 2014). Due to the intrinsic heterogeneities of natural porous media and the scarcity of observation data, accurate characterization of the spatial distribution of hydraulic properties and corresponding uncertainty is always a key issue in groundwater management and protection (De et al. 1999; Carrera et al. 2005; Zhou et al. 2014).

Inverse methods are often used by conditioning on observation data (e.g. flow data, concentration data, and hydrogeophysical data) to characterize the spatial variation of parameters (e.g. hydraulic conductivity). Data assimilation methods have been popular in recent decades as they can assimilate different sources of information to estimate parameters and predict states (Oliver and Chen 2008, 2011; Chen and Oliver 2011; Chen et al. 2009; Chen and Zhang 2006; Nan and Wu 2011; Li et al. 2012a; Zhou et al. 2014; Xue and Zhang 2014; Man et al. 2016; Lan et al. 2018; Evensen 2018). Well-known data assimilation methods include the Kalman filter (KF, Kalman 1960), ensemble Kalman filter (EnKF, Evensen 2009), ensemble smoother (Van Leeuwen and Evensen 1996), and iterative ensemble smoothers (e.g. ensemble smoother with multiple data assimilation, ES-MDA, Emerick and Reynolds 2013a). They have gained popularity due to their simplicity and flexibility in implementation. For subsurface flow problems, the ES-MDA method, proposed by Emerick and Reynolds (2013a), can typically obtain better data reproduction and better estimation of parameters, compared to the EnKF (Emerick 2016; Emerick and Reynolds 2012, 2013a, b). However, the data assimilation methods mentioned previously cannot obtain optimal solutions when they are applied to groundwater inversion problems where the estimated parameters usually have a non-Gaussian distribution. Accounting for non-Gaussian distributions of hydraulic conductivity is important because flow and transport predictions are dramatically different between Gaussian and non-Gaussian conductivity fields (Gómez-Hernández and Wen 1998; Zinn and Harvey 2003; Feyen and Caers 2005; Lee et al. 2007).

There have been many published papers on alleviating this non-Gaussian challenge. On the one hand, much research focused on parameterization methods that can represent the non-Gaussian parameters using latent Gaussian variables (Dorn and Villegas 2008; Sarma et al. 2008; Chen et al. 2009; Chang et al. 2010; Li et al. 2012b; Zhou et al. 2012a; Xu and Gómez-Hernández 2016; Xu and Gómez-Hernández 2017; Li et al. 2018). Liu and Oliver (2005) applied the truncated pluri-Gaussian model to match geologic facies using dynamic flow data via the EnKF. Jafarpour and McLaughlin (2008) applied the discrete cosine transformation to decrease the non-Gaussianity of parameters, and then used a traditional method to accomplish the model inversion. Chang et al. (2010) applied level-set parameterization for the non-Gaussian parameter estimation problem. Zhou et al. (2011) combined normal score transformation and the EnKF to solve the non-Gaussian inversion problem. Moreover, as machine learning has become more and more popular, some researchers have begun to use machine learning algorithms as parameterization methods (Canchumuni et al. 2017; Mo et al. 2020). However, most of these studies are illustrated based on two-facies cases, and their applicability to multiple-facies settings, such as three-facies cases, is worth further research and demonstration. Sometimes, the change from two facies to three facies could lead to new challenges for some parameterization methods, as the existence of the third facies greatly increases the complexity of facies delineation (Chen et al. 2015, 2016).

On the other hand, many researchers focused on geostatistical methods, for which the core purpose is to condition facies simulation to the non-linear flow data (Strebelle 2002; Caers and Hoffman 2006; Cao et al. 2018; Hansen et al. 2018; Laloy et al. 2018; Khaninezhad et al. 2019). Jafarpour and Khodabakhshi (2011) proposed a probability conditioning method (PCM) based on the tau-model proposed by Journel (2002), and combined it with the EnKF. Zhou et al. (2012b) developed a pattern-search-based method following the idea of direct sampling (Mariethoz et al. 2010). Li et al. (2013) proposed an ensemble PATtern (EnPAT) search method that simultaneously updated both hydraulic conductivity and hydraulic head. Recently, Ma and Jafarpour (2018a) proposed a new pilot-points method for conditioning discrete MPS facies simulation on dynamic flow data and coupled it with ES to test several numerical experiments.

Among these methods mentioned already, the PCM gained popularity as it can condition both facies and hydraulic properties on flow data via the probability map (Jafarpour and Khodabakhshi 2011). Lots of research has been focused on understanding its mathematical principles and improving its performance (Khodabakhshi and Jafarpour 2014; Ma and Jafarpour 2019). Khodabakhshi and Jafarpour (2013) proposed an adaptive sampling strategy based on the PCM when multiple training images are used to acknowledge the uncertainty. Ma and Jafarpour (2018b) improved the PCM by constructing a probability map based on first- and second-order moments and introducing pixel-based tau values. To the best of the authors’ knowledge, these further studies on the PCM are based on the assumption that conductivities within each facies are homogeneous. However, conductivity heterogeneities within facies play an important role in the groundwater flow and transport model (Zhang et al. 2013).

To illustrate the effect of heterogeneous conductivities within facies on the flow and transport, two cases were constructed for comparison (case homo and case heter; more details of the flow and transport model can be found in the ‘Appendix’). In these two cases, everything is the same except that the conductivities in each facies in case homo are set as a constant (equal to the mean value in case heter, i.e. 3 ln(m/day) for the channel and –2 ln(m/day) for nonchannel, see Fig. 1b,c). Figure 1d shows that the breakthrough curves are rather different in case homo and case heter for both observation points. In addition, Fig. 1e,f illustrates that the flow fields are strongly different even though the two cases are identical except for the conductivity heterogeneities within the facies. These observations indicate that accurate characterization of the conductivity heterogeneities within each facies is also significant in the non-Gaussian inversion problems.

Fig. 1 Comparison of case homo and case heter. a Facies field and the two observation points for both cases; b lnK field for case homo; c lnK field for case heter; d breakthrough curves at the two observation points for the two cases; e flow field of case homo; f flow field of case heter

This work first proposes a modified PCM to characterize geological facies as well as hydraulic conductivities within each facies by combining PCM with the ES-MDA. Note that the ES-MDA was chosen instead of the EnKF as the data assimilation method, due to the better performance from ES-MDA for subsurface inverse problems (Emerick 2016; Emerick and Reynolds 2012, 2013a, b). Meanwhile, to the best of the authors’ knowledge, this work is the first time that the PCM is used to simultaneously estimate facies and conductivity fields in groundwater models, especially for models with three facies.

The rest of the paper is organized as follows. The relevant methods and the proposed method are introduced. Afterwards, a two-facies synthetic case and a three-facies synthetic case are constructed to illustrate the performance of the proposed method. Then, a sensitivity analysis to evaluate the impact of two important parameters of the proposed algorithm is carried out, followed by summarization.

System model and methodologies

Groundwater flow model

The flow model is assumed to be transient, and its governing equation is as follows (Bear 1972),

$$ \nabla \cdotp \left(K\nabla H\right)+W={\mu}_s\frac{\partial H}{\partial t} $$
(1)

where ∇· is the divergence operator; ∇ is the gradient operator; K is the hydraulic conductivity [L T⁻¹]; H is the hydraulic head [L]; W is the volumetric injection (pumping) flow rate per unit volume of the aquifer [T⁻¹]; μs is the specific storage of the aquifer [L⁻¹]; and t is the time [T].
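To make the structure of Eq. (1) concrete, the following is a minimal one-dimensional sketch using an implicit (backward Euler) finite-difference scheme with fixed heads at both ends. It is an illustration only and not the solver used in this work (MODFLOW-2000 is used for the synthetic cases); the function name `step_heads`, the harmonic averaging of K between nodes, and all numerical values are assumptions made for the example.

```python
import numpy as np

def step_heads(H, K, W, mu_s, dx, dt, h_left, h_right):
    """Advance heads by one implicit time step of the 1D form of Eq. (1).

    H, K, W are nodal arrays of equal length; h_left and h_right are
    fixed-head (Dirichlet) boundary values. Hypothetical helper.
    """
    n = H.size
    # Conductivity at the faces between neighbouring nodes (harmonic mean)
    K_face = 2.0 * K[:-1] * K[1:] / (K[:-1] + K[1:])
    A = np.zeros((n, n))
    b = mu_s / dt * H + W
    for i in range(1, n - 1):
        kw = K_face[i - 1] / dx**2  # coupling to the left neighbour
        ke = K_face[i] / dx**2      # coupling to the right neighbour
        A[i, i - 1] = -kw
        A[i, i + 1] = -ke
        A[i, i] = mu_s / dt + kw + ke
    # Fixed-head boundary rows
    A[0, 0] = A[-1, -1] = 1.0
    b[0], b[-1] = h_left, h_right
    return np.linalg.solve(A, b)

n = 11
H = np.zeros(n)        # starting head of 0 m
K = np.full(n, 1.0)    # homogeneous K [m/day], illustrative value
W = np.zeros(n)        # no sources or sinks
for _ in range(200):   # 200 daily time steps
    H = step_heads(H, K, W, mu_s=1e-4, dx=10.0, dt=1.0,
                   h_left=1.0, h_right=0.0)
```

With homogeneous K and no sources, the heads relax to a linear profile between the two boundary values, which is a convenient sanity check for the discretization.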

Ensemble smoother with multiple data assimilation (ES-MDA)

The ES-MDA is one of the most popular data assimilation methods with better performance and higher efficiency compared to the EnKF. In the ES-MDA, all the parameters of interest p are augmented with the state variable h into a joint state vector x = [p, h]ᵀ, and an ensemble of Ne realizations of parameters is generated. The principle of the ES-MDA is very similar to that of the EnKF, with two differences. First, the ES-MDA uses global updates with all available data, while the classical EnKF carries out updates sequentially using data from different times. Second, the ES-MDA assimilates the same data multiple times with inflation coefficients, while the EnKF assimilates each set of data only once. The main procedures of the ES-MDA are listed here:

  1. Step 1.

    Decide the number of data assimilations (Na) and choose the coefficient (αi, i = 1, …, Na) for each data assimilation step satisfying the constraint in Eq. (2). Considering the computational cost and its performance, the number of assimilation times (iteration times) in the ES-MDA algorithm is chosen to be 4 in this work (Emerick and Reynolds 2013a), and the coefficient of each data assimilation in the ES-MDA is chosen to be α1 = 9.333, α2 = 7.0, α3 = 4.0 and α4 = 2.0 (Emerick and Reynolds 2013a).

$$ \sum \limits_{i=1}^{N_{\mathrm{a}}}\frac{1}{\alpha_i}=1 $$
(2)
  2. Step 2.

    For each realization, run the forward model G(.) from time zero

$$ {\mathbf{d}}_i=G\left({\mathbf{x}}_i^{\mathrm{f}}\right),i=1,2,\dots, {N}_{\mathrm{e}} $$
(3)

In the aforementioned equation, i is the ensemble member index, and superscript f denotes forecast.

  3. Step 3.

    Update the ensemble of realizations using Eq. (4).

$$ {\mathbf{x}}_i^{\mathrm{a}}={\mathbf{x}}_i^{\mathrm{f}}+{\mathbf{C}}_{\mathrm{YD}}{\left({\mathbf{C}}_{\mathrm{D}\mathrm{D}}+{\alpha}_l{\mathbf{C}}_{\mathrm{D}}\right)}^{-1}\left({\mathbf{d}}_{{\mathrm{obs}}_i}-{\mathbf{d}}_i\right),i=1,2,\dots, {N}_{\mathrm{e}} $$
(4)

In the preceding equation, CYD is the cross-covariance matrix between the forecast state and the predicted data, CDD is the covariance matrix of the predicted data, l is the iteration index of the ES-MDA (l = 1, 2, …, Na), CD is the covariance matrix of the measurement errors, dobs,i is the observation vector perturbed for realization i with noise of covariance αlCD, di is the corresponding predicted data, and superscript a denotes analysis.

After step 3, the updated ensemble Xa is obtained. Then go back to step 2, using the updated ensemble as the forecast ensemble for the next data assimilation. Repeat steps 2–3 until the total number of data assimilations Na is reached.
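The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the forward model is replaced by a hypothetical linear operator `A`, the state vector contains parameters only, and the function names and toy values are introduced for the example. The inflation coefficients are those given in step 1.

```python
import numpy as np

def check_inflation(alphas, tol=1e-3):
    """Eq. (2): the reciprocals of the inflation coefficients sum to 1."""
    return abs(sum(1.0 / a for a in alphas) - 1.0) < tol

def es_mda_update(X_f, D_f, d_obs, C_D, alpha, rng):
    """One ES-MDA analysis step, Eq. (4).

    X_f   : (n_x, N_e) forecast ensemble
    D_f   : (n_d, N_e) predicted data, one column per realization
    d_obs : (n_d,)     observed data
    C_D   : (n_d, n_d) measurement-error covariance
    alpha : inflation coefficient of the current iteration
    """
    n_d, N_e = D_f.shape
    # Ensemble anomalies (deviations from the ensemble mean)
    X_anom = X_f - X_f.mean(axis=1, keepdims=True)
    D_anom = D_f - D_f.mean(axis=1, keepdims=True)
    C_YD = X_anom @ D_anom.T / (N_e - 1)
    C_DD = D_anom @ D_anom.T / (N_e - 1)
    # Perturb the observations with noise of covariance alpha * C_D
    noise = rng.multivariate_normal(np.zeros(n_d), alpha * C_D, size=N_e).T
    D_obs = d_obs[:, None] + noise
    # Solve the linear system instead of forming the inverse explicitly
    return X_f + C_YD @ np.linalg.solve(C_DD + alpha * C_D, D_obs - D_f)

rng = np.random.default_rng(0)
alphas = [9.333, 7.0, 4.0, 2.0]           # step 1
assert check_inflation(alphas)

A = np.array([[1.0, 0.5], [0.2, 1.0], [1.0, -1.0]])  # toy forward model (assumption)
x_true = np.array([1.0, -2.0])
d_obs = A @ x_true
C_D = 0.01**2 * np.eye(3)
X = rng.normal(0.0, 1.0, size=(2, 200))   # prior ensemble, N_e = 200
for alpha in alphas:
    D = A @ X                              # step 2: run the forward model
    X = es_mda_update(X, D, d_obs, C_D, alpha, rng)  # step 3: update
x_est = X.mean(axis=1)
```

In this toy setting the ensemble mean `x_est` ends close to `x_true`. For a groundwater application, the line `D = A @ X` would be replaced by a run of the flow simulator for each realization.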

Probability conditioning method (PCM)

Original PCM

The probability conditioning method (PCM) was proposed to constrain single normal equation SIMulation (SNESIM)-based (Strebelle 2002) facies simulations on flow data (Jafarpour and Khodabakhshi 2011). The implementation of the PCM consists of two main steps. In the first step, flow data are used to update the lnK field through ES-MDA, and then the updated lnK field is used to infer a facies probability map. In the second step, the probability map is used as soft data (through the tau-model, Journel 2002) in the SNESIM algorithm to generate new (updated) realizations of facies indicators.

In order to infer a facies probability map from the nonlinear flow data, the EnKF data assimilation method was used in the original PCM to update conductivities (denoted as lnK) based on the flow data. Then the updated hydraulic properties were used to infer the facies probability map. For a model with two facies types, channel and nonchannel, with homogeneous lnK values, i.e. lnK1 and lnK0 respectively, one can use the following equation to calculate the facies probability map based on the updated lnK,

$$ {\displaystyle \begin{array}{c}P\left(f(x)=\mathrm{channel}|\mathrm{flow}\ \mathrm{data}\right)\\ {}=\Big\{\begin{array}{c}\kern-9.7em {p}_{\mathrm{max}},\overline{lnK}(x)>\mathit{\ln}{K}_1\\ {}\kern-9.7em {p}_{\mathrm{min}},\overline{lnK}(x)<\mathit{\ln}{K}_0\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\overline{lnK}(x)-\mathit{\ln}{K}_0}{\mathit{\ln}{K}_1-\mathit{\ln}{K}_0},\kern0.5em \mathrm{otherwise}\end{array}\end{array}} $$
(5)

where \( \overline{\ln K}(x) \) denotes the ensemble mean of lnK at grid cell x; lnK1 and lnK0 denote the homogeneous lnK values in the channel and non-channel respectively; and [pmin,pmax] are the boundaries of probability values. In this work, [pmin,pmax] are set as [0.01, 0.99].
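Equation (5) is a clipped linear interpolation between pmin and pmax, which can be written compactly as in the sketch below. The function name `channel_probability` is hypothetical, and the facies values lnK0 = −2 and lnK1 = 3 are borrowed from the illustrative cases in the Introduction purely as example inputs.

```python
import numpy as np

def channel_probability(lnk_mean, lnk0, lnk1, p_min=0.01, p_max=0.99):
    """Eq. (5): map the ensemble-mean lnK to P(channel | flow data)."""
    # Relative position between the two facies values, clipped to [0, 1]
    t = (np.asarray(lnk_mean, dtype=float) - lnk0) / (lnk1 - lnk0)
    return p_min + (p_max - p_min) * np.clip(t, 0.0, 1.0)

# Ensemble-mean lnK at five grid cells (illustrative values)
p_channel = channel_probability([-3.0, -2.0, 0.5, 3.0, 4.0], lnk0=-2.0, lnk1=3.0)
p_nonchannel = 1.0 - p_channel
```

Cells whose mean lnK lies below lnK0 or above lnK1 receive pmin or pmax, respectively, and a cell halfway between the two facies values receives a probability of 0.5.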

For the two-facies case, one first uses Eq. (5) to calculate the probability of the channel in each grid cell based on the lnK ensemble conditioned to flow data; the probability of the nonchannel is then equal to 1 – P(f(x) = channel). For the three-facies case, however, it is necessary to calculate the probability of each facies type separately. The equations for the three-facies case, shown in the following, are similar to those used in the two-facies case and calculate probabilities based on the updated lnK,

$$ {\displaystyle \begin{array}{c}P\left(f(x)=\mathrm{high}\;\ln K\;\mathrm{facies}|\mathrm{flow}\ \mathrm{data}\right)\\ {}=\Big\{\begin{array}{c}\kern-9.7em {p}_{\mathrm{max}},\overline{\ln K}(x)>\ln {K}_2\\ {}\kern-9.7em {p}_{\mathrm{min}},\overline{\ln K}(x)<\ln {K}_1\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\overline{\ln K}(x)-\ln {K}_1}{\ln {K}_2-\ln {K}_1},\kern0.5em \mathrm{otherwise}\end{array}\end{array}} $$
(6)
$$ {\displaystyle \begin{array}{c}P\left(f(x)=\mathrm{mid}\;\ln K\;\mathrm{facies}|\mathrm{flow}\ \mathrm{data}\right)\\ {}=\Big\{\begin{array}{c}\kern-7.7em {p}_{\mathrm{min}},\overline{\ln K}(x)>\ln {K}_2\; or\;\overline{\ln K}(x)<\ln {K}_0\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\overline{\ln K}(x)-\ln {K}_0}{\ln {K}_1-\ln {K}_0},\ln {K}_0<\overline{\ln K}(x)<\ln {K}_1\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\ln {K}_2-\overline{\ln K}(x)}{\ln {K}_2-\ln {K}_1},\ln {K}_1<\overline{\ln K}(x)<\ln {K}_2\end{array}\end{array}} $$
(7)
$$ {\displaystyle \begin{array}{c}P\left(f(x)=\mathrm{low}\;\ln K\;\mathrm{facies}|\mathrm{flow}\ \mathrm{data}\right)\\ {}=\Big\{\begin{array}{c}\kern-9.7em {p}_{\mathrm{min}},\overline{\ln K}(x)>\ln {K}_1\\ {}\kern-9.7em {p}_{\mathrm{max}},\overline{\ln K}(x)<\ln {K}_0\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\ln {K}_1-\overline{\ln K}(x)}{\ln {K}_1-\ln {K}_0},\kern0.5em \mathrm{otherwise}\end{array}\end{array}} $$
(8)

where \( \overline{\ln K}(x) \) denotes the ensemble mean of lnK at grid cell x; lnK0, lnK1 and lnK2 denote homogeneous lnK values in the three facies (from low to high), respectively; and [pmin,pmax] are the boundaries of probability values. In this work, [pmin,pmax] are set as [0.01, 0.99].

Modified PCM

It is evident that the preceding equations assume homogeneity within each facies. To apply the method to non-Gaussian and heterogeneous cases, the preceding equations are modified as follows. It should be noted that non-Gaussian and heterogeneous lnK fields in this work are constructed by combining the SNESIM algorithm (used to generate facies distribution) and GCOSIM3D algorithm (used to generate heterogeneity within each facies type).

For the two-facies case,

$$ {\displaystyle \begin{array}{c}P\left(f(x)=\mathrm{channel}|\mathrm{flow}\ \mathrm{data}\right)\\ {}=\Big\{\begin{array}{c}\kern-9.7em {p}_{\mathrm{max}},\overline{\ln K}(x)>\overline{\ln {K}_1}\\ {}\kern-9.7em {p}_{\mathrm{min}},\overline{\ln K}(x)<\overline{\ln {K}_0}\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\overline{\ln K}(x)-\overline{\ln {K}_0}}{\overline{\ln {K}_1}-\overline{\ln {K}_0}},\kern0.5em \mathrm{otherwise}\end{array}\end{array}} $$
(9)

where \( \overline{\ln K}(x) \) denotes the ensemble mean of lnK at grid cell x; \( \overline{\ln {K}_1} \) and \( \overline{\ln {K}_0} \) are the means of lnK in channel and nonchannel type facies; and [pmin,pmax] are the boundaries of probability values. In this work, [pmin,pmax] are set as [0.01, 0.99].

For the three-facies case,

$$ {\displaystyle \begin{array}{c}P\left(f(x)=\mathrm{high}\;\ln K\;\mathrm{facies}|\mathrm{flow}\ \mathrm{data}\right)\\ {}=\Big\{\begin{array}{c}\kern-9.7em {p}_{\mathrm{max}},\overline{\ln K}(x)>\overline{\ln {K}_2}\\ {}\kern-9.7em {p}_{\mathrm{min}},\overline{\ln K}(x)<\overline{\ln {K}_1}\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\overline{\ln K}(x)-\overline{\ln {K}_1}}{\overline{\ln {K}_2}-\overline{\ln {K}_1}},\kern0.5em \mathrm{otherwise}\end{array}\end{array}} $$
(10)
$$ {\displaystyle \begin{array}{c}P\left(f(x)=\mathrm{mid}\;\ln K\;\mathrm{facies}|\mathrm{flow}\ \mathrm{data}\right)\\ {}=\Big\{\begin{array}{c}\kern-7.7em {p}_{\mathrm{min}},\overline{\ln K}(x)>\overline{\ln {K}_2}\ or\ \overline{\ln K}(x)<\overline{\ln {K}_0}\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\overline{\ln K}(x)-\overline{\ln {K}_0}}{\overline{\ln {K}_1}-\overline{\ln {K}_0}},\overline{\ln {K}_0}<\overline{\ln K}(x)<\overline{\ln {K}_1}\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\overline{\ln {K}_2}-\overline{\ln K}(x)}{\overline{\ln {K}_2}-\overline{\ln {K}_1}},\overline{\ln {K}_1}<\overline{\ln K}(x)<\overline{\ln {K}_2}\end{array}\end{array}} $$
(11)
$$ {\displaystyle \begin{array}{c}P\left(f(x)=\mathrm{low}\;\ln K\;\mathrm{facies}|\mathrm{flow}\ \mathrm{data}\right)\\ {}=\Big\{\begin{array}{c}\kern-9.7em {p}_{\mathrm{min}},\overline{\ln K}(x)>\overline{\ln {K}_1}\\ {}\kern-9.7em {p}_{\mathrm{max}},\overline{\ln K}(x)<\overline{\ln {K}_0}\\ {}{p}_{\mathrm{min}}+\left({p}_{\mathrm{max}}-{p}_{\mathrm{min}}\right)\frac{\overline{\ln {K}_1}-\overline{\ln K}(x)}{\overline{\ln {K}_1}-\overline{\ln {K}_0}},\kern0.5em \mathrm{otherwise}\end{array}\end{array}} $$
(12)

where \( \overline{\ln K}(x) \) denotes the ensemble mean of lnK at grid cell x; \( \overline{\ln {K}_0} \), \( \overline{\ln {K}_1} \) and \( \overline{\ln {K}_2} \) denote the means of lnK values in the three facies (from low to high), respectively; and [pmin,pmax] are the boundaries of probability values. In this work, [pmin,pmax] are set as [0.01, 0.99].

Note that these equations could be extended to more facies types, provided that the differences between the \( \overline{\ln {K}_i} \) values are significant.
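As an illustration of Eqs. (10)–(12), the sketch below evaluates the three probabilities from the ensemble-mean lnK and the three facies means. The function name and the facies means (−4, −1 and 3) are hypothetical values chosen for the example and are not taken from the case studies.

```python
import numpy as np

def three_facies_probabilities(lnk_mean, m0, m1, m2, p_min=0.01, p_max=0.99):
    """Eqs. (10)-(12) with facies means m0 < m1 < m2 (low, mid, high lnK)."""
    y = np.asarray(lnk_mean, dtype=float)
    span = p_max - p_min
    up01 = np.clip((y - m0) / (m1 - m0), 0.0, 1.0)  # 0 at m0, 1 at m1
    up12 = np.clip((y - m1) / (m2 - m1), 0.0, 1.0)  # 0 at m1, 1 at m2
    p_high = p_min + span * up12             # Eq. (10)
    p_mid = p_min + span * (up01 - up12)     # Eq. (11)
    p_low = p_min + span * (1.0 - up01)      # Eq. (12)
    return p_low, p_mid, p_high

p_low, p_mid, p_high = three_facies_probabilities(
    [-5.0, -2.5, -1.0, 1.0, 4.0], m0=-4.0, m1=-1.0, m2=3.0
)
```

As written, the three probabilities at a cell sum to pmax + 2pmin rather than exactly 1. The sketch does not renormalize them, since the equations above do not include such a step.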

Parameter estimation scheme

In this paper, to apply the PCM to non-Gaussian and heterogeneous parameter estimation cases, the original PCM is first modified to remove its homogeneity limitation (see details in section ‘Probability conditioning method (PCM)’), and then the modified PCM is combined with the ES-MDA instead of the EnKF because of the superior performance of the ES-MDA for subsurface parameter estimation problems. Figure 2 shows the framework of the original PCM in Jafarpour and Khodabakhshi (2011) together with the framework resulting from this study. As shown in Fig. 2, in order to estimate non-Gaussian parameters in heterogeneous aquifers, the proposed scheme has an additional step of updating lnKi for facies type i compared to the original PCM.

Fig. 2 Framework of a the original PCM and b the proposed modified PCM (see red frame)

Figure 2 shows that the proposed parameter estimation framework includes seven steps overall:

  1. Step 1.

    Generate initial realizations of the facies indicator using the SNESIM algorithm, initial ensembles of heterogeneous lnK of different facies types using GCOSIM3D algorithm (Gómez-Hernández and Journel 1993), and an initial probability map according to the number of facies (if there are n types of facies in the study domain, then the initial probability in each grid is set to 1/n).

  2. Step 2.

    Generate the non-Gaussian lnK ensemble by mapping the lnK ensembles of the different facies onto the facies ensemble. For example, for the jth facies realization, if the facies type indicator in grid cell i is 0, then the lnK value in this grid cell is set to the corresponding value in grid cell i of the jth lnK0 realization.

  3. Step 3.

    Run the forward model with each non-Gaussian lnK realization.

  4. Step 4.

    Update lnKi ensembles and the non-Gaussian lnK ensemble based on the ES-MDA equations introduced in section ‘Ensemble smoother with multiple data assimilation (ES-MDA)’.

  5. Step 5.

    Calculate the probability map based on the updated lnK according to equations stated in section ‘Modified PCM’.

  6. Step 6.

    Generate new facies realizations using the updated probability map in SNESIM.

  7. Step 7.

    Generate the updated non-Gaussian lnK ensemble by mapping the updated lnKi ensembles in step 4 to the updated facies ensemble in step 6.

    Since the ES-MDA is an iterative data assimilation method, steps 3 to 7 are repeated Na times.
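The mapping used in steps 2 and 7 amounts to selecting, at each grid cell of each realization, the lnK value of the facies indicated there. A minimal sketch is given below; the array layout and the function name `assemble_lnk` are assumptions made for the example.

```python
import numpy as np

def assemble_lnk(facies, lnk_by_facies):
    """Build the non-Gaussian lnK ensemble from facies and per-facies lnK.

    facies        : (N_e, ny, nx) integer facies indicators in {0, ..., n-1}
    lnk_by_facies : (n, N_e, ny, nx) Gaussian lnK ensembles, one per facies
    """
    facies = np.asarray(facies)
    lnk_by_facies = np.asarray(lnk_by_facies, dtype=float)
    # Pick, cell by cell, the lnK value of the facies present in that cell
    return np.take_along_axis(lnk_by_facies, facies[None, ...], axis=0)[0]

# One realization on a 2 x 2 grid (illustrative values)
facies = [[[0, 1], [1, 0]]]
lnk0 = [[[-2.1, -1.9], [-2.0, -2.2]]]   # nonchannel lnK realization
lnk1 = [[[3.1, 2.9], [3.0, 3.2]]]       # channel lnK realization
lnk = assemble_lnk(facies, [lnk0, lnk1])
```

Because the per-facies lnK ensembles are stored on the full grid, the same function can be reused in step 7 after the facies realizations have been regenerated.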

Case 1: two-facies case

Case setup

In this case, the flow is assumed to be transient in a two-dimensional (2D) confined aquifer with a starting head of 0 m. As shown in Fig. 3, the dimension of the aquifer is 600 m × 600 m and the grid size is 10 m in both horizontal x and y directions. The upper and lower boundaries are assumed to be impermeable, and the head of the left boundary is fixed at 0 m. The flux at the right boundary is shown in Fig. 3c. More details can be found in Table 1.

Fig. 3 Reference fields and observation wells layout. a Facies distribution; b reference hydraulic conductivity distribution; c the locations of observation wells in case 1

Table 1 Flow model parameters in synthetic cases

It is assumed that there are two facies types in the study domain. The channelized facies field and the lnK field (Fig. 3) are constructed in the following three steps:

  1. Step 1.

    Generate the facies field using the SNESIM algorithm with the training image (Fig. 4) in Strebelle (2002).

  2. Step 2.

    Generate two Gaussian random fields, lnK0 and lnK1, of the same size as the study domain using GCOSIM3D (Gómez-Hernández and Journel 1993) with parameters shown in Table 2 for sand (channel) and shale (nonchannel). In GCOSIM3D, the log-conductivity fields are characterized by their mean, standard deviation and directional correlation lengths in the two spatial dimensions (λx and λy).

  3. Step 3.

    Assemble the non-Gaussian lnK field by populating the region of each facies type (from step 1) with log-conductivity values from the corresponding Gaussian random field from step 2. That is, the lnK value of a grid cell is determined by its facies indicator: if the indicator is equal to 1, the lnK of this grid cell is set to the lnK1 value at this grid cell; otherwise, it is set to the lnK0 value.

It should be noted here that both the reference fields and the initial realizations are generated using the procedures already mentioned.

Fig. 4 The training image used in case 1

Table 2 Geostatistical parameters to characterize the spatial distribution of lnK within each two facies in case 1

In order to estimate the facies map and heterogeneous lnK map in this synthetic case, nine observation wells are randomly chosen (Fig. 3c) to provide observation data for data assimilation. The measurement errors of the head are assumed to follow a normal distribution with a mean of zero and a standard deviation of 0.01 m. The numerical code MODFLOW-2000 (Harbaugh et al. 2000) is used to solve the flow model in this case.

Case 1: results and analysis

Estimation results

Figures 5 and 6 show the evolution of three individual realizations, and the ensemble mean and ensemble standard deviation over the four iterations (data assimilations) of the ES-MDA. The ensemble mean of the initial realizations does not show any channelized feature, but the spatial structures start to appear and become evident during the data assimilation. For instance, at the first assimilation step, the upper channel is well identified; and at the final step, the ensemble mean has good connectivity and clear channel boundaries, recovering most channel locations in the reference model. In addition, Fig. 6 shows that the lnK heterogeneities in the two facies are also well characterized in this case. The standard deviation decreases dramatically as a result of conditioning to head data, with the highest uncertainty remaining only near the estimated channel boundary at the final iteration.

Fig. 5 The estimation results of facies indicators in case 1

Fig. 6 The estimation results of lnK in case 1

Similarly, Fig. 7 shows the evolution of the probability map of the channel facies with the iterations of the ES-MDA. The high probability region at the last iteration clearly identifies the channel location in the reference model, demonstrating the effectiveness of the proposed method. The gradually refined probability map constrains the facies models simulated by SNESIM for the next data assimilation, hence increasing the estimation accuracy with each iteration of the ES-MDA.

Fig. 7 The probability map in different assimilation steps for case 1

In order to evaluate the estimation results further and quantitatively, two quantitative indicators are analyzed: root mean square error (RMSE) and the fraction of the correct facies indicator.

Since the true distribution of the estimated parameters in the synthetic case is known, it is possible to calculate the deviation of the estimation from the truth (reference field). The RMSE is a commonly used indicator in parameter estimation, measuring the accuracy of estimation results. In this work, RMSEi at grid i is computed as follows

$$ {\mathrm{RMSE}}_i=\sqrt{\frac{1}{N_{\mathrm{r}}}\sum \limits_{j=1}^{N_{\mathrm{r}}}{\left({Y}_{j,i}-{Y}_{\mathrm{r}\mathrm{ef},i}\right)}^2} $$
(13)

where Yref,i and Yj,i are the reference value and jth realization value at grid i respectively, and Nr denotes the total number of realizations used in ES-MDA.
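Equation (13) can be evaluated for all grid cells at once, as in the following sketch. The array shapes and the function name `rmse_map` are assumptions for illustration.

```python
import numpy as np

def rmse_map(ensemble, reference):
    """Eq. (13): cell-wise RMSE of an ensemble against the reference field.

    ensemble  : (N_r, ny, nx) realizations of lnK
    reference : (ny, nx)      reference lnK field
    """
    diff = np.asarray(ensemble, dtype=float) - np.asarray(reference, dtype=float)
    # Average the squared deviation over the realizations (axis 0)
    return np.sqrt(np.mean(diff**2, axis=0))

# Two realizations on a 1 x 2 grid (illustrative values)
ensemble = [[[1.0, 0.0]], [[3.0, 0.0]]]
reference = [[2.0, 0.0]]
rmse = rmse_map(ensemble, reference)
```

In this toy example the first cell deviates by ±1 in the two realizations and the second cell matches the reference, so the map contains the values 1 and 0.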

Figure 8 shows the map of RMSE from the initial ensemble and from the final ensemble. The initial RMSE values in the study domain are relatively large, indicating the low accuracy of the heterogeneity characterization. However, after the data assimilation, the RMSE values decrease dramatically, illustrating the ability of this proposed method to capture spatial heterogeneity.

Fig. 8 a RMSE of initial ensemble; b RMSE of final ensemble

In this paper, the ‘fraction of correct facies’ is defined as the number of grid cells for which the facies indicator is correctly estimated divided by the total number of grid cells in the study domain. The average fraction of all realizations, Ef, is used to quantitatively evaluate the quality of the reconstructed facies model.
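The definition of Ef translates directly into a few lines of code. The sketch below assumes facies indicators stored as integer arrays; the function name `fraction_correct` is hypothetical.

```python
import numpy as np

def fraction_correct(facies_ensemble, facies_ref):
    """Return E_f and the per-realization fractions of correct facies.

    facies_ensemble : (N_r, ny, nx) estimated facies indicators
    facies_ref      : (ny, nx)      reference facies indicators
    """
    match = np.asarray(facies_ensemble) == np.asarray(facies_ref)
    per_realization = match.mean(axis=(1, 2))  # fraction of matching cells
    return per_realization.mean(), per_realization

# Two realizations on a 2 x 2 grid (illustrative values)
facies_ref = [[0, 1], [1, 0]]
facies_ensemble = [[[0, 1], [1, 0]],   # all four cells correct
                   [[0, 1], [0, 0]]]   # three of four cells correct
E_f, fractions = fraction_correct(facies_ensemble, facies_ref)
```

Here the per-realization fractions are 1.0 and 0.75, giving Ef = 0.875.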

Figure 9 shows the evolution of the Ef during data assimilation, and one finds that the Ef increases as the assimilation step advances. After only two steps, facies indicators in around 80% of grid cells are correctly estimated. For the final ensemble, facies indicators are estimated correctly in around 85% of grid cells, which shows the efficiency and effectiveness of this proposed method.

Fig. 9 The evolution of Ef during the data assimilation in case 1

Data reproduction and prediction performance

The foregoing analysis is focused on demonstrating and illustrating the performance of parameter estimation. To further evaluate the estimation results, the performance of data reproduction and model prediction is illustrated here.

To quantitatively assess the performance of data reproduction and model prediction, the mean absolute error (MAE) is used in this work. It is calculated as follows

$$ {\mathrm{MAE}}_j=\frac{1}{n_{\mathrm{obs}}}\sum \limits_{i=1}^{n_{\mathrm{obs}}}\mid {h}_{ij}^s-{h}_i\mid $$
(14)

where nobs is the total number of head data used for data assimilation, hi is the ith observation head data, and \( {h}_{ij}^s \) is the corresponding simulated head data of jth realization.
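A minimal sketch of Eq. (14), together with ensemble summary statistics of the kind reported for the MAE (minimum, maximum, mean and standard deviation), is given below. The array layout, the function name and the head values are illustrative assumptions.

```python
import numpy as np

def mae_per_realization(h_sim, h_obs):
    """Eq. (14): MAE of each realization.

    h_sim : (n_obs, N_r) simulated heads, one column per realization
    h_obs : (n_obs,)     observed heads
    """
    h_sim = np.asarray(h_sim, dtype=float)
    h_obs = np.asarray(h_obs, dtype=float)
    return np.mean(np.abs(h_sim - h_obs[:, None]), axis=0)

h_obs = [1.0, 2.0]
h_sim = [[1.1, 1.0],
         [1.8, 2.0]]
mae = mae_per_realization(h_sim, h_obs)

# Summary statistics over the ensemble
summary = {"min": mae.min(), "max": mae.max(),
           "mean": mae.mean(), "sd": mae.std()}
```

The first realization misses the two observations by 0.1 m and 0.2 m (MAE of 0.15 m), while the second matches them exactly.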

The scatterplots of the observed data and the ensemble mean of the simulated head data are shown in Fig. 10. Linear fit results and MAE are included to evaluate the overall model calibration performance. A perfect result would show the simulated head data on the 45° line. Figure 10 shows that the cloud from the final ensemble (blue) is much closer to the 45° line compared to the cloud from the initial ensemble. Based on this, one can argue that this proposed method has a rather good performance in terms of data reproduction. The MAE values of the initial and the final ensemble, shown in Table 3, also suggest that the final ensemble obtained a good match to data with small uncertainty, since the min, max, mean, and standard deviation of the MAE are all largely reduced compared to those from the initial ensemble.

Fig. 10 The data reproduction of head data in case 1

Table 3 The MAE values of initial and final ensembles for data reproduction. SD standard deviation

In order to evaluate the prediction ability, the updated lnK is used to forecast head data in the next 500 days. All model parameters remain the same. The scatterplots of true values and the ensemble mean of the simulated data are shown in Fig. 11. Linear fit results and MAE are also included to evaluate the overall model prediction performance. Figure 11 shows that the average simulated prediction data of the final ensemble are very close to the 45° line, and they are dramatically better than those of the initial ensemble, showing the good prediction ability of this proposed method. In addition, Table 4 shows MAE values of the initial and final ensembles. It shows that the final ensemble has much lower values in terms of min, max, mean, and standard deviation relative to the initial ensemble, which further suggests that the prediction data of the final ensemble have high accuracy and low uncertainty.

Fig. 11 The prediction of head data in case 1

Table 4 The MAE values of initial and final ensembles for data prediction. SD standard deviation

Case 2: three-facies case

Case setup

To further investigate the applicability of the proposed scheme to the estimation of facies and heterogeneous lnK in the multiple facies case, this proposed method was applied to an example with three facies.

In this case, the flow is assumed to be transient in a 2D confined aquifer with a starting head of 0 m. As shown in Fig. 12, the dimension of the aquifer is 600 m × 600 m and the grid size is 10 m in both horizontal x and y directions. The head of all boundaries is fixed at 0 m. There are three pumping wells in the study domain (blue crosses in Fig. 12c), and the flux at each pumping well is set as 60 m³/day. The locations of the three pumping wells and another 10 observation wells are shown in Fig. 12c. More details can be found in Table 1.

Fig. 12

a Reference facies distribution; b reference hydraulic conductivity distribution; c the locations of wells in case 2. The blue crosses in c are pumping wells, and black circles are another 10 observation wells

It is assumed that there are three facies types in the study domain, and the facies field and lnK field (Fig. 12) are constructed in the same way as in case 1. A low-permeability facies is added to the training image used in case 1 to generate a new training image with three facies, namely high conductivity (sandstone channels), medium conductivity (shale background), and low conductivity (lens-shaped clay). The new training image is shown in Fig. 13. More details of the parameter settings can be found in Table 5.

Fig. 13

The training image used in case 2

Table 5 Geostatistical parameters to characterize the spatial distribution of lnK within each facies in case 2

In order to estimate the facies map and the heterogeneous lnK map in this synthetic case, 13 observation wells (3 pumping wells and another 10 observation wells) are used (Fig. 12c) to obtain observation data for data assimilation. The measurement errors of the head are assumed to follow a normal distribution with a mean of zero and a standard deviation of 0.01 m. The numerical code MODFLOW-2000 (Harbaugh et al. 2000) is used to solve the flow model in this case.
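As a minimal sketch of the measurement-error assumption above, synthetic observations can be produced by adding zero-mean Gaussian noise with a standard deviation of 0.01 m to simulated heads. The function name `perturb_observations`, the seed, and the head values are hypothetical and are not taken from the study.

```python
import numpy as np

def perturb_observations(true_heads, sd=0.01, seed=0):
    """Add zero-mean Gaussian measurement noise (sd in metres) to simulated heads."""
    rng = np.random.default_rng(seed)
    true_heads = np.asarray(true_heads, dtype=float)
    return true_heads + rng.normal(loc=0.0, scale=sd, size=true_heads.shape)

# Hypothetical simulated heads (m) at the 13 wells for one observation time.
true_heads = np.linspace(-2.0, -0.1, 13)
observed_heads = perturb_observations(true_heads)
```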

Case 2: results and analysis

Estimation results

Figures 14 and 15 show the evolution of three individual realizations, the ensemble mean, and the ensemble standard deviation over the four iterations (data assimilations) of the ES-MDA. The ensemble mean of the initial realizations does not show any evident non-Gaussian or heterogeneous features, even though individual initial realizations do, but spatial structures start to appear and become evident during the data assimilation. For instance, at the first assimilation step, the bottom channel is identified; at the second step, the upper channel and the lens-shaped low-lnK distribution become evident. Finally, at the fourth step, the ensemble mean has good connectivity within certain facies and clear boundaries between different facies types, recovering most of the non-Gaussianity and heterogeneity in the reference fields. The standard deviation is significantly reduced by conditioning to head data, indicating a decrease in uncertainty. Figure 15 shows that the lnK heterogeneities in the three facies types are also well characterized in this case. One can therefore argue that this proposed method has a strong ability to recover non-Gaussian characteristics and estimate parameter heterogeneities via conditioning to head data.

Fig. 14

The estimation results of facies indicators in case 2

Fig. 15

The estimation results of lnK in case 2

In addition, Fig. 16 shows that the probability maps, which are based on the updated lnK ensemble, gradually recover the spatial distribution of the three facies in the study domain, showing the effectiveness of this proposed method. In this way, it is possible to convert flow data into soft data on which SNESIM conditions facies realizations, and hence to increase the accuracy of the geostatistical simulation. Furthermore, after only two assimilation steps, the probability map characterizes most spatial features of the three facies, illustrating the efficiency of this proposed method.

Fig. 16

The probability map in different assimilation steps for case 2
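The PCM rule that derives facies probabilities from the updated lnK ensemble is not reproduced in this section. As a simplified stand-in only, the sketch below estimates a per-cell probability map as the frequency of each facies over an ensemble of facies indicator realizations. The function name, the array layout, and the integer facies codes are assumptions made for illustration.

```python
import numpy as np

def facies_probability(facies_ensemble, n_facies):
    """Per-cell facies frequencies over an ensemble.

    facies_ensemble: integer array of shape (Ne, ny, nx) with codes 0..n_facies-1.
    Returns an array of shape (n_facies, ny, nx) that sums to 1 over the first axis.
    Note: this frequency estimate is a stand-in, not the PCM probability rule.
    """
    facies_ensemble = np.asarray(facies_ensemble)
    return np.stack([(facies_ensemble == k).mean(axis=0) for k in range(n_facies)])

# Hypothetical ensemble: four realizations on a 1 x 2 grid with three facies.
facies_ensemble = np.array([[[0, 1]], [[0, 2]], [[1, 2]], [[0, 2]]])
prob = facies_probability(facies_ensemble, n_facies=3)
```

A map of this form could serve as soft data for a multiple-point simulator, but the coupling to SNESIM is outside the scope of the sketch.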

In order to quantitatively evaluate the estimation results further, two quantitative indicators are again analyzed: root mean square error (RMSE) and the fraction of correct facies indicators.

Figure 17 shows the map of RMSE from the initial ensemble and from the final ensemble. The initial RMSE values in the study domain are relatively large, indicating low accuracy of the heterogeneity characterization. However, the RMSE values of the final ensemble (estimation result) decrease dramatically compared to the initial ones, illustrating the ability of this proposed method to capture spatial heterogeneity.

Fig. 17

a RMSE of initial ensemble; b RMSE of final ensemble
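A map of this kind can be sketched as follows, assuming that the RMSE at each cell is taken over the ensemble members with respect to the reference lnK value at that cell. The exact definition is given earlier in the paper and may differ in detail; the function name and array shapes here are hypothetical.

```python
import numpy as np

def rmse_map(ensemble, reference):
    """Per-cell RMSE over realizations.

    ensemble: array of shape (Ne, ny, nx) holding lnK realizations.
    reference: array of shape (ny, nx) holding the reference lnK field.
    """
    diff = np.asarray(ensemble, dtype=float) - np.asarray(reference, dtype=float)[None, :, :]
    return np.sqrt(np.mean(diff**2, axis=0))

# Hypothetical 2 x 2 example: two realizations offset by +1 and -1 from the reference.
reference = np.zeros((2, 2))
ensemble = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
rmse = rmse_map(ensemble, reference)
```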

Figure 18 shows the evolution of the Ef during the data assimilations, and it is evident that the Ef increases as the assimilation step advances. After only one assimilation step, facies indicators in around 75% of grid cells are correctly estimated. For the final ensemble, facies indicators are estimated correctly in around 81% of grid cells, which shows the efficiency and effectiveness of this proposed method in characterizing non-Gaussian features in the multifacies case.

Fig. 18

The evolution of Ef during the data assimilation in case 2
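For illustration, the fraction of correct facies indicators can be sketched as the share of grid cells whose estimated facies code equals the reference code. How the single estimated facies map is obtained from the ensemble follows the definition given earlier in the paper and is not assumed here; the function name and the example maps are hypothetical.

```python
import numpy as np

def fraction_correct(estimated, reference):
    """Fraction of grid cells whose estimated facies code matches the reference."""
    estimated = np.asarray(estimated)
    reference = np.asarray(reference)
    return float(np.mean(estimated == reference))

# Hypothetical 2 x 3 facies maps with codes 0, 1, 2; one cell is misclassified.
reference_facies = np.array([[0, 1, 2], [0, 1, 2]])
estimated_facies = np.array([[0, 1, 2], [0, 2, 2]])
ef = fraction_correct(estimated_facies, reference_facies)
```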

Data reproduction and prediction performance

The above analysis is focused on demonstrating and illustrating the performance of parameter estimation. To further evaluate the estimation results, the performance of data reproduction and model prediction is illustrated here.

The scatterplots of the observed data and the ensemble mean of the simulated head data are shown in Fig. 19. Linear fit results and MAE are included to evaluate the overall data reproduction performance in this case as well. Figure 19 shows that the cloud from the final ensemble (blue) is much closer to the 45° line than the cloud from the initial ensemble, illustrating the accuracy of this proposed method in terms of data reproduction. The MAE values of the initial and final ensembles in Table 6 indicate that the final ensemble obtained a good match to the data with small uncertainty, since the min, max, mean, and standard deviation of MAE are all significantly reduced compared to the initial ensemble.

Fig. 19

The data reproduction of head data in case 2

Table 6 The MAE values of initial and final ensembles for data reproduction in case 2. SD standard deviation
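The summary statistics reported in tables of this kind can be sketched as follows, assuming that one MAE value is computed per realization and that the min, max, mean, and SD are then taken across realizations. The sample standard deviation (`ddof=1`) and the fit of the ensemble-mean simulated heads against the observed heads are assumptions; all names and values are hypothetical.

```python
import numpy as np

def mae_summary(simulated, observed):
    """Min, max, mean, and SD of per-realization MAE.

    simulated: array of shape (Ne, Nd); observed: array of shape (Nd,).
    """
    mae = np.mean(np.abs(simulated - observed[None, :]), axis=1)
    return {"min": float(mae.min()), "max": float(mae.max()),
            "mean": float(mae.mean()), "sd": float(mae.std(ddof=1))}

def linear_fit(simulated, observed):
    """Slope and intercept of ensemble-mean simulated data against observed data."""
    slope, intercept = np.polyfit(observed, simulated.mean(axis=0), deg=1)
    return float(slope), float(intercept)

# Hypothetical data: two realizations offset by +0.1 m and -0.3 m from the observations.
observed = np.array([0.0, 1.0, 2.0, 3.0])
simulated = np.vstack([observed + 0.1, observed - 0.3])
stats = mae_summary(simulated, observed)
slope, intercept = linear_fit(simulated, observed)
```

A slope near 1 and an intercept near 0 correspond to a scatter cloud lying along the 45° line.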

In order to evaluate the prediction ability, the updated lnK was used to forecast head data for the next 500 days. All model parameters remain the same. The scatterplots of true values and the ensemble mean of the simulated data are shown in Fig. 20. Linear fit results and MAE are also included to evaluate the overall model prediction performance. Figure 20 shows that the average simulated prediction data of the final ensemble are very close to the 45° line and are dramatically better than those of the initial ensemble, showing the good prediction ability of this proposed method. In addition, Table 7 shows the MAE values of the initial and final ensembles. The final ensemble has much lower values in terms of min, max, mean, and standard deviation relative to the initial ensemble, which further suggests that the prediction data of the final ensemble have high accuracy and low uncertainty.

Fig. 20

The prediction of head data in case 2

Table 7 The MAE values of initial and final ensembles for data prediction in case 2. SD standard deviation

Discussion

Effect of ensemble size

The results of case 2 shown in the previous section are based on an ensemble size of 300. To evaluate the impact of the ensemble size on parameter estimation, an analysis with ensemble sizes of 150, 300, 500, 800, and 1,000 (Table 8) is performed here. Note that, for simplicity, the scalar RMSE and the ensemble spread are used in this section to evaluate the performance of different parameter settings. These two indicators are defined as follows:

$$ {\mathrm{RMSE}}_{\mathrm{scalar}}=\sqrt{\frac{1}{N_{\mathrm{m}}}\sum \limits_{i=1}^{N_{\mathrm{m}}}{\left(\overline{Y_{\mathrm{e},i}}-{Y}_{\mathrm{r},i}\right)}^2} $$
(15)
$$ \mathrm{Ensemble}\kern0.17em \mathrm{Spread}=\sqrt{\frac{1}{N_{\mathrm{m}}}\sum \limits_{i=1}^{N_{\mathrm{m}}}\operatorname{var}\left({Y}_{\mathrm{e},i}\right)} $$
(16)

where \( \overline{Y_{\mathrm{e},i}} \) and \( {Y}_{\mathrm{r},i} \) are the ensemble mean value and the reference value at location i, respectively; \( \operatorname{var}\left({Y}_{\mathrm{e},i}\right) \) is the ensemble variance at location i; and \( N_{\mathrm{m}} \) is the total number of nodes in the study domain.
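Equations 15 and 16 can be sketched in a few lines, assuming the ensemble is stored as an array with one row per realization and one column per node. Whether the ensemble variance uses the sample form (`ddof=1`) or the population form is not stated here, so it is exposed as a parameter; the function names and the example values are hypothetical.

```python
import numpy as np

def rmse_scalar(ensemble, reference):
    """Eq. 15: RMSE between the ensemble mean and the reference over all nodes.

    ensemble: array of shape (Ne, Nm); reference: array of shape (Nm,).
    """
    ensemble_mean = ensemble.mean(axis=0)
    return float(np.sqrt(np.mean((ensemble_mean - reference) ** 2)))

def ensemble_spread(ensemble, ddof=1):
    """Eq. 16: square root of the node-averaged ensemble variance."""
    return float(np.sqrt(np.mean(ensemble.var(axis=0, ddof=ddof))))

# Hypothetical example: two realizations on two nodes.
ensemble = np.array([[1.0, 2.0], [3.0, 4.0]])
reference = np.array([2.0, 2.0])
rmse_value = rmse_scalar(ensemble, reference)
spread_value = ensemble_spread(ensemble)
```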

Table 8 Data-assimilation-related parameters used in different cases

The evolution of the RMSEscalar and the ensemble spread with data assimilations for the cases with different ensemble sizes is shown in Fig. 21. For the RMSEscalar, all cases with an ensemble size of 300 and above show similar performance. When the ensemble size is only 150, the RMSEscalar increases slightly after the second data assimilation. In terms of ensemble spread, all cases show broadly similar behavior. The small differences in ensemble spread among the cases might not be statistically significant, since ES-MDA is an ensemble-based method and its results typically vary to a certain degree when it is run with different ensembles of the same size. Based on both the RMSEscalar and the ensemble spread, an ensemble size of 300 appears to be an appropriate choice, and further increasing the ensemble size does not provide much improvement to the results. Note that in a typical use of ensemble-based data assimilation methods for parameter estimation of subsurface models, increasing the ensemble size would result in a large improvement to the ensemble spread (maintaining ensemble variability). The improvement in this case is not obvious because, in the PCM workflow, facies realizations are always regenerated using the updated probability map, and this resimulation of facies realizations introduces new variability to the ensemble for the next data assimilation (acting almost like covariance inflation in a sense).

Fig. 21

a RMSEscalar for different Ne settings; b Ensemble spread for different Ne settings

Effect of assimilation steps

The number of assimilation steps (Na) is another important parameter in this proposed method. The results of case 2 (section ‘Case 2: three facies case’) are obtained using four data assimilations. To evaluate the impact of the number of assimilation steps on the results of parameter estimation, an analysis with two, four, and eight assimilation steps (Table 8) is performed here.

Figure 22 shows the RMSEscalar and the ensemble spread for the cases with different numbers of data assimilations. It shows that the quality of parameter estimation is acceptable when there are only two data assimilations, but it is slightly less accurate compared to case 2, where Na is 4. However, the performance of parameter estimation does not improve significantly as Na increases further. Again, in this example, Na equal to 4 appears to be a good choice, balancing the performance and computational cost.

Fig. 22

a RMSEscalar for different Na settings; b Ensemble spread for different Na settings
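To make the role of Na concrete, the following is a minimal sketch of the standard ES-MDA update (Emerick and Reynolds 2013a) with constant inflation coefficients α = Na, which satisfy the requirement that the reciprocals of the coefficients sum to 1. It is not the implementation used in this study: the toy linear forward operator `G`, the dimensions, and the noise level are hypothetical, and the facies regeneration step of the PCM workflow between assimilations is omitted.

```python
import numpy as np

def esmda_update(M, D, d_obs, C_D, alpha, rng):
    """One ES-MDA update.

    M: parameters (Nm, Ne); D: simulated data (Nd, Ne);
    d_obs: observations (Nd,); C_D: measurement-error covariance (Nd, Nd).
    """
    n_e = M.shape[1]
    dM = M - M.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    C_MD = dM @ dD.T / (n_e - 1)          # parameter-data cross-covariance
    C_DD = dD @ dD.T / (n_e - 1)          # covariance of simulated data
    # Perturb the observations with measurement noise inflated by alpha.
    E = rng.multivariate_normal(np.zeros(len(d_obs)), C_D, size=n_e).T
    D_obs = d_obs[:, None] + np.sqrt(alpha) * E
    return M + C_MD @ np.linalg.solve(C_DD + alpha * C_D, D_obs - D)

# Toy linear problem (hypothetical; stands in for the groundwater flow model).
rng = np.random.default_rng(42)
n_m, n_d, n_e, n_a = 5, 3, 200, 4
G = rng.normal(size=(n_d, n_m))           # hypothetical linear forward operator
m_true = rng.normal(size=n_m)
C_D = 0.1**2 * np.eye(n_d)
d_obs = G @ m_true + rng.normal(scale=0.1, size=n_d)

alphas = [float(n_a)] * n_a               # constant inflation; sum(1/alpha) = 1
M = rng.normal(size=(n_m, n_e))           # prior ensemble
initial_mismatch = np.mean(np.abs(G @ M - d_obs[:, None]))
for alpha in alphas:
    D = G @ M                             # rerun the forward model for every member
    M = esmda_update(M, D, d_obs, C_D, alpha, rng)
final_mismatch = np.mean(np.abs(G @ M - d_obs[:, None]))
```

Each additional assimilation requires one more forward run per ensemble member, which is the computational cost that the choice of Na trades against accuracy.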

Conclusions

In order to fully characterize both the facies boundaries and the heterogeneity of hydraulic conductivity within each facies, a modified probability conditioning method (PCM) is proposed, in which a data assimilation method that is more suitable for subsurface parameter estimation (ES-MDA) is used with the PCM and the estimation is extended to include heterogeneity within facies. To the best of the authors’ knowledge, this is the first time that the PCM has been used to estimate both facies and conductivity fields in groundwater modeling, especially for models with three facies types. To illustrate and demonstrate the effectiveness and efficiency of this proposed method, a two-facies case and a three-facies case were constructed, both with heterogeneities within each facies type, and both quantitative and qualitative measures were used to evaluate the results.

For both test cases, the proposed method was able to identify nearly correct facies boundary locations in a few data assimilations. The calibrated models were able to reproduce the head data that were used for conditioning during data assimilation, and the predictive ability of the calibrated models is also highly improved compared to the initial models that were not conditioned to head data.

A sensitivity analysis was also carried out to evaluate the impact of two important parameters of the proposed algorithm, ensemble size (Ne) and number of assimilation steps (Na), using the case with three facies types. The analysis showed that, for this particular case, an ensemble size of 300 and ES-MDA with four data assimilations are good choices, balancing the performance and computational cost. As noted in the discussion, increasing the ensemble size did not have as significant an impact as it typically would for a standard application of an ensemble-based method to parameter estimation in subsurface models, because the facies regeneration step in PCM injects additional variability after each data assimilation.

An important issue in conditioning facies simulation to flow data indirectly is the usage of facies probability maps for soft conditioning in the SNESIM algorithm. However, having a distinctive hydraulic conductivity for each of the facies types is a critical condition for the effectiveness of PCM. Therefore, it is of interest to note that PCM may not be suitable for nonclassical non-Gaussian problems where there are no evident multiple peaks in the probability density function of the conductivity field, meaning that the distance between the peaks of the hydraulic conductivity distributions of the individual facies is small relative to the standard deviation of the distributions. The impact of the distinctiveness of the distribution of the hydraulic properties among different facies on PCM is worth further research and discussion.