Abstract
Data integration is challenging where there are different levels of support between primary and secondary data that need to be correlated in various ways. A geostatistical method is described, which integrates the hydraulic conductivity (K) measurements and electrical resistivity data to better estimate the K distribution in the Upper Chicot Aquifer of southwestern Louisiana, USA. The K measurements were obtained from pumping tests and represent the primary (hard) data. Borehole electrical resistivity data from electrical logs were regarded as the secondary (soft) data, and were used to infer K values through Archie’s law and the Kozeny-Carman equation. A pseudo cross-semivariogram was developed to cope with the resistivity data non-collocation. Uncertainties in the auto-semivariograms and pseudo cross-semivariogram were quantified. The groundwater flow model responses by the regionalized and coregionalized models of K were compared using analysis of variance (ANOVA). The results indicate that non-collocated secondary data may improve estimates of K and affect groundwater flow responses of practical interest, including specific capacity and drawdown.
Résumé
Data integration is challenging when the primary and secondary data that must be correlated in various ways have different levels of support. This article describes a geostatistical method that integrates hydraulic conductivity (K) measurements and electrical resistivity data to better estimate the distribution of K in the Upper Chicot Aquifer of southwestern Louisiana (USA). The K measurements come from pumping tests and represent the primary ("hard") data. Electrical resistivity log data were treated as secondary ("soft") data, from which K values were inferred through Archie’s law and the Kozeny-Carman equation. A pseudo cross-semivariogram was developed to compensate for the non-collocation of the resistivity data. Uncertainties in the auto-semivariograms and the pseudo cross-semivariograms were quantified. The responses of the groundwater flow model to the regionalized and coregionalized models of K were compared using analysis of variance (ANOVA). The results show that non-collocated secondary data can improve the estimates of K and affect groundwater flow responses, including specific capacity and drawdown.
Resumen
Data integration is a major challenge when there are different levels of support between primary and secondary data that need to be correlated in various ways. A geostatistical method is described that integrates hydraulic conductivity (K) measurements and electrical resistivity data to obtain a better estimate of the K distribution in the Upper Chicot Aquifer of southwestern Louisiana, United States of America. The K measurements were obtained from pumping tests and represent the primary (hard) data. Borehole electrical resistivity data were treated as secondary (soft) data and were used to infer K values through Archie’s law and the Kozeny-Carman equation. A pseudo cross-semivariogram was developed to deal with the non-collocation of the resistivity data. Uncertainties in the auto-semivariograms and the cross-semivariograms were quantified. The groundwater flow model responses for the coregionalized and regionalized models of K were compared using analysis of variance (ANOVA). The results indicate that non-collocated secondary data can improve the estimates of K and affect groundwater flow responses of practical interest, including specific capacity and drawdown.
Introduction
Groundwater flow models have been used in many regulatory applications such as well pumping, aquifer sustainability, and capture zone analysis. Simulated drawdown from a groundwater flow model has also been used to determine well specific capacity. Specific capacities reflect the hydraulic conductivity or potential of the aquifer, which affects groundwater resource management. For example, management guidelines may stipulate that the average water level for a region should not drop below specified levels. Uncertainty in model responses therefore leads to uncertainty in management decisions.
Groundwater modeling requires hydrogeological parameters as well as other information, e.g., initial/boundary conditions and discharge/recharge fluxes, to constitute a model structure. Although mathematical problems associated with model structure account for all components in the groundwater model, hydraulic conductivity is one of the most essential aquifer parameters in the inverse problem of groundwater hydrology (McLaughlin and Townley 1996). The importance of studying hydraulic conductivity arises from its wide spatial variability (Koltermann and Gorelick 1996) and its significant influence on dispersion and mass transport (Rehfeldt et al. 1992; Neuman 1990; Poeter and Gaylord 1990).
A number of studies have attempted to evaluate and compare various techniques for estimating hydraulic conductivity fields. Ritzi et al. (1994) compared three indicator-based geostatistical methods to predict zones of higher hydraulic conductivity. Eggleston et al. (1996) compared the hydraulic conductivity field and its sensitivity by using two estimation methods (i.e., kriging and conditional mean) and two simulation methods (i.e., sequential Gaussian and simulated annealing) and found that simulation methods are better at reproducing local contrasts and large-scale features. Boman et al. (1995) studied the response of a transport model to four schemes (three deterministic and one fractal-based stochastic) to interpolate field-measured hydraulic conductivity data, concluding that kriging and fractal interpolation were not significantly better than simpler methods; this study was based on densely sampled measurements which may not be typical of groundwater studies.
In groundwater studies, measurements of hydraulic conductivity are typically sparse. Conductivity fields estimated using regionalized (univariate) models such as kriging or simulation are therefore highly uncertain, as these methods depend on data density. Eggleston et al. (1996) studied a heavily sampled aquifer and reported that the number of data points significantly affects permeability field models. Therefore, data integration from various sources is essential. Hydraulic conductivity estimation with hydrologic data (e.g., hydraulic heads) has been extensively studied through the Bayesian maximum likelihood estimation (Carrera and Neuman 1986) and geostatistics of cokriging (Kitanidis and Vomvoris 1983). Many types of geophysical data have been incorporated in hydraulic conductivity estimation due to their relatively low cost, abundant field measurements, and high correlation with hydraulic conductivity. Geophysical data integration through geostatistics improves the estimation of conductivity (Gloaguen et al. 2001).
In a hydrogeological framework, the spatial structures are constructed from field-measured data. However, field measurements are always limited. Uncertainties associated with the field measurements lead to uncertainties in the correlation structure. Kitanidis (1986) examined the effect of semivariogram model uncertainty in a Bayesian framework. Feyen et al. (2003) examined semivariogram model uncertainty for capture zones in a Bayesian framework, concluding that predictions based on fitted models do not reflect field variance.
This research presents a real-world case study: the estimation of the hydraulic conductivity of the Upper Chicot Aquifer in Acadia Parish, southwestern Louisiana, using hydrogeological and geophysical data. The case study presents many challenges that are commonly encountered in regional groundwater modeling. First, the measured hydraulic conductivity data (primary data) from pumping tests and the borehole electrical resistivity data (secondary data) are integrated through a geostatistical approach to better estimate the hydraulic conductivity field. Borehole resistivity data correlate well with the effective porosity of a saturated formation (Archie 1942), from which the hydraulic conductivity can be inferred. Second, a pseudo cross-semivariogram is introduced in the geostatistical framework to cope with the estimation challenge stemming from non-collocation of the resistivity logs and pumping test sites. This study adopts the approach suggested in Clark et al. (1989) to calculate the pseudo cross-semivariogram of these two data sets. Details of the pseudo cross-semivariogram technique are given in Myers (1991). Third, the semivariogram uncertainty due to data scarcity is quantified in terms of the mean and variance of the semivariogram at each lag distance using an approach suggested by Ortiz and Deutsch (2002). Assuming a lognormal distribution for the auto-semivariograms and pseudo cross-semivariograms, the upper and lower limits of the semivariograms for a given confidence interval are obtained subject to the requirement of positive definiteness (Goovaerts 1997). The study uses the mean, upper-limit and lower-limit semivariograms to examine the impact of semivariogram uncertainty on the aquifer hydraulic responses. Fourth, this study develops a groundwater flow model to examine the impact of regionalized and coregionalized hydraulic conductivity fields, semivariogram uncertainty, and simulation processes in a groundwater system.
Two regionalized models (kriging and sequential Gaussian simulation) and two coregionalized models (cokriging and cosimulation) are compared. Analysis of variance (ANOVA) assesses the significance of secondary data, interpolation method, and semivariogram uncertainty on the groundwater flow behavior. The relevant methods are discussed first. Data and modeling results for the parish-scale study (25 km2) are then described. Finally, the results are interpreted statistically and implications for model choice and interpretation are presented.
Hydrogeologic interpretation of borehole electrical resistivity data
Borehole electrical resistivity readings reflect the combination of geologic formation and water ionic content with depth. In general, when fresh groundwater is present, electrical resistivity identifies the interfaces between the clay layers (low resistivity value) and sand aquifers (high resistivity value) due to high clay porosity and conductive minerals on the clay surface. Another strong influence on electrical resistivity is the presence of fluid salinity; erratic electrical resistivity readings are often found in sand aquifers when salt water is present.
Typical resistivity data for the Chicot Aquifer are shown in Fig. 1. The resistivity readings distinguish the Upper Chicot Aquifer and Lower Chicot Aquifer, separated by a thin clay layer with low resistivity around a depth of 122 m (400 ft). The erratic resistivity readings in Fig. 1b also show the salt-water intrusion in deep aquifers due to heavy pumping activity. The areas affected by salt water will not be considered in this study.
Borehole resistivity in the saturated sand aquifer is highly correlated with the formation porosity and water resistivity, a relation often expressed as Archie’s law (Archie 1942):
\[ R_0 = a R_w \Phi^{-m} \tag{1} \]
where \(R_0\) is the resistivity of the saturated formation, \(R_w\) is the formation water resistivity, and Φ is the sand porosity. The parameters a and m in Eq. (1) are the pore geometry coefficient and the cementation factor, respectively. Typically, a varies between 0.62 and 2.45 and m between 1.08 and 2.15, depending on the formation type (Asquith and Gibson 1982). The formation water resistivity varies with local geochemical and ionic content. For fresh groundwater, however, the variation is not significant, so assuming a constant formation water resistivity is reasonable.
The formulas for determining hydraulic conductivity from particle-size distribution reduce to the following generalized formula (Vukovic and Soro 1992):
\[ K = \frac{\gamma_w}{\mu_w} \, C \, G(\Phi) \, d_e^2 \tag{2} \]
where K is the hydraulic conductivity, \(\gamma_w\) is the specific weight of water, \(\mu_w\) is the dynamic viscosity of water, C is a dimensionless parameter, G(Φ) is a function of porosity, and \(d_e\) is the effective grain diameter. The term \(C G(\Phi) d_e^2\) in Eq. (2) is the soil permeability. Because Archie’s law gives the porosity as a function of resistivity, the hydraulic conductivity in Eq. (2) can be evaluated from the resistivity data. In this study, the Kozeny-Carman equation (Carman 1956) is adopted, where C=1/180 and \( G(\Phi) = \Phi^{3} (1 - \Phi)^{-2} \). Substituting Eq. (1) into Eq. (2) gives the hydraulic conductivity in terms of the formation and water resistivities, denoted as \(K_F\):
\[ K_F = \frac{\gamma_w}{\mu_w} \, \frac{d_e^2}{180} \, \frac{\left( a R_w / R_0 \right)^{3/m}}{\left[ 1 - \left( a R_w / R_0 \right)^{1/m} \right]^{2}} \tag{3} \]
where the ratio \(R_0/R_w\) is defined as the formation factor (Archie 1942). The formation factor obtained from borehole resistivity logs (Asquith and Gibson 1982) is a relatively common and low-cost measurement, particularly compared to pumping tests. Kelly (1977) and Kosinski and Kelly (1981) studied site-specific electrical resistivity and hydraulic conductivity and established empirical relationships between the formation factor and hydraulic conductivity. However, their studies did not investigate the theoretical basis for the correlation, and the empirical relationships are not necessarily applicable to other sites. The study reported here interprets the hydraulic conductivity of sand units from the geophysical resistivity data through Archie’s law (Archie 1942) and the Kozeny-Carman equation (Carman 1956).
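The chain from resistivity to hydraulic conductivity described above can be sketched in a few lines of Python. This is a minimal illustration, assuming Archie’s law in the form R0 = a·Rw·Φ^(−m) and the Kozeny-Carman choices C = 1/180 and G(Φ) = Φ³/(1 − Φ)²; the function names are hypothetical, and the default parameter values are simply those quoted elsewhere in this article. The output is illustrative only and is not claimed to reproduce the K F statistics reported for the study.

```python
def porosity_from_resistivity(r0, rw, a, m):
    """Invert Archie's law R0 = a * Rw * phi**(-m) for the porosity phi."""
    formation_factor = r0 / rw
    return (a / formation_factor) ** (1.0 / m)


def kozeny_carman_k(phi, d_e, gamma_w=9.8e3, mu_w=1.0e-3):
    """Hydraulic conductivity in m/s from porosity and effective grain diameter.

    gamma_w is in N/m^3, mu_w in N.s/m^2 and d_e in m; C is fixed at 1/180.
    """
    permeability = (1.0 / 180.0) * phi**3 / (1.0 - phi) ** 2 * d_e**2  # m^2
    return gamma_w / mu_w * permeability


def k_from_resistivity(r0, rw=12.7, a=1.76, m=1.64, d_e=0.42e-3):
    """Resistivity-derived hydraulic conductivity in m/day (illustrative defaults)."""
    phi = porosity_from_resistivity(r0, rw, a, m)
    if not 0.0 < phi < 1.0:
        raise ValueError("Archie's law returned a porosity outside (0, 1)")
    return kozeny_carman_k(phi, d_e) * 86400.0  # convert m/s to m/day
```

Under these assumptions a higher formation resistivity implies a lower porosity and therefore a lower conductivity, so `k_from_resistivity(43.0)` exceeds `k_from_resistivity(100.0)`.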
Data integration and simulation via geostatistics
Data non-collocation and pseudo cross-semivariogram
The resistivity-derived hydraulic conductivity and the hydraulic conductivity measured by pumping tests can be integrated through a geostatistical approach. The measured hydraulic conductivity, denoted as \(K_Y\), is considered as the primary data because it is measured directly through the groundwater responses. The measured and resistivity-derived hydraulic conductivity data are treated as random fields and are transformed to a normal distribution (hydraulic conductivity is usually considered to be lognormally distributed) so that both can be treated as second-order stationary. Let Y and F be the data transforms of \(K_Y\) and \(K_F\), respectively. If both Y and F have sample values at the same locations, the cross-semivariogram of Y and F can be obtained for the collocated data:
\[ \widehat{\gamma}_{YF}(h) = \frac{1}{2 n_c} \sum_{i=1}^{n_c} \left( Y_i - Y_{i+h} \right) \left( F_i - F_{i+h} \right) \tag{4} \]
where \(Y_i = Y(\mathbf{x}_i)\); \(Y_{i+h} = Y(\mathbf{x}_i + h)\); h is the lag distance between a pair of sample data; and \(n_c\) is the number of pairs of Y or F at a distance h apart. In practice, however, the electrical resistivity log sites rarely coincide with the pumping test sites. Equation (4) fails for non-collocated data, and a cross-semivariogram between Y and F cannot be obtained. To cope with this problem, a pseudo cross-semivariogram has to be introduced to replace Eq. (4). For this study, the pseudo cross-semivariogram introduced by Clark et al. (1989) was used, which requires the two random functions to have the same unconditional mean and unconditional variance. The easiest way to meet this requirement is to transform the primary and secondary data into the standard normal distribution (normal scores) with zero mean and unit variance (Deutsch and Journel 1998). In this study, \(K_Y\) and \(K_F\) are transformed to the normal scores y and f, respectively, with zero means and unit variances. The pseudo cross-semivariogram is given by the expected squared difference between y and f measured at different locations,
\[ \widetilde{\gamma}_{yf}(h) = \frac{1}{2 n_{yf}} \sum_{i=1}^{n_{yf}} \left( y_i - f_{i+h} \right)^2 \tag{5} \]
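where \(\widetilde{\gamma}_{yf}\) is the pseudo cross-semivariogram, and \(n_{yf}\) is the number of pairs of y and f at a distance h apart. It can be shown that \(\widetilde{\gamma}_{yf}(h) = \widetilde{\gamma}_{fy}(-h)\). The auto-semivariograms for y and f are
\[ \widehat{\gamma}_{yy}(h) = \frac{1}{2 n_{yy}} \sum_{i=1}^{n_{yy}} \left( y_i - y_{i+h} \right)^2 \tag{6} \]
and
\[ \widehat{\gamma}_{ff}(h) = \frac{1}{2 n_{ff}} \sum_{i=1}^{n_{ff}} \left( f_i - f_{i+h} \right)^2 \tag{7} \]
where \(n_{yy}\) and \(n_{ff}\) are the numbers of pairs of y and of f, respectively, at a distance h apart.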
Details of the pseudo cross-semivariogram technique can be found in Myers (1991). Due to the normalization, the sills for \(\widehat\gamma _{{\text{yy}}} \), \(\widehat\gamma _{{\text{ff}}} \), and \(\widetilde\gamma _{{\text{yf}}} \) are unity, equal to the unconditional variance. Moreover, the Cauchy-Schwarz inequality \( \widetilde{\gamma }_{{yf}} \leqslant {\sqrt {\widehat{\gamma }_{{yy}} \widehat{\gamma }_{{ff}} } } \) is required to ensure positive definiteness in the cokriging method that follows. The Cauchy-Schwarz inequality will be used to determine the lower and upper limits of the auto- and pseudo cross-semivariograms when semivariogram uncertainty is considered in the section Significance analysis using ANOVA.
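As an illustration of the pseudo cross-semivariogram for non-collocated data, the following Python sketch pairs every primary value with every secondary value, bins the pairs by separation distance, and averages half the squared differences in each bin. It is a minimal omnidirectional version under stated assumptions: the function name and the lag-bin convention are hypothetical, directional effects are ignored, and both inputs are assumed to be normal scores already.

```python
import math


def pseudo_cross_semivariogram(xy_y, y, xy_f, f, lag_edges):
    """Omnidirectional experimental pseudo cross-semivariogram.

    xy_y, xy_f: lists of (easting, northing) for the primary and secondary data.
    y, f: normal-score values at those locations.
    lag_edges: increasing bin edges; bin k covers [lag_edges[k], lag_edges[k+1]).
    Returns one (n_pairs, gamma) tuple per bin; gamma is None for an empty bin.
    """
    n_bins = len(lag_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (ea, na), y_val in zip(xy_y, y):
        for (eb, nb), f_val in zip(xy_f, f):
            h = math.hypot(eb - ea, nb - na)
            for k in range(n_bins):
                if lag_edges[k] <= h < lag_edges[k + 1]:
                    sums[k] += (y_val - f_val) ** 2
                    counts[k] += 1
                    break
    return [(n, s / (2.0 * n) if n else None) for n, s in zip(counts, sums)]
```

Because the pairs are formed across the two data sets, no location needs to carry both a primary and a secondary value, which is the property that makes the estimator usable when logs and pumping tests are at different sites.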
Conditional estimation using hydrogeological data and resistivity data
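The conditional estimation of the normalized hydraulic conductivity is made through a linear-weighted interpolation form, which honors the normalized primary and secondary data:
\[ \widehat{y}(\mathbf{x}_0) = \sum_{j=1}^{n_y} \lambda_j y_j + \sum_{\ell=1}^{n_f} \nu_{\ell} f_{\ell} \tag{8} \]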
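where \( \widehat{y} \) is the estimate of y; \(\lambda_j\) and \( \nu_{\ell} \) are the weighting coefficients to be determined; \(n_y\) is the number of primary data; and \(n_f\) is the number of secondary data. The conditional estimate of hydraulic conductivity is obtained via the back transformation of the normal score. By minimizing the variance of the estimation error \( \left[ \widehat{y}(\mathbf{x}_0) - y(\mathbf{x}_0) \right] \) at an unsampled site \(\mathbf{x}_0\), the optimal weighting coefficients \(\lambda_j\) and \( \nu_{\ell} \) are obtained from the following set of linear equations,
\[
\begin{aligned}
\sum_{j=1}^{n_y} \lambda_j \widehat{\gamma}_{yy}(\mathbf{x}_i - \mathbf{x}_j) + \sum_{\ell=1}^{n_f} \nu_{\ell} \widetilde{\gamma}_{yf}(\mathbf{x}_{\ell} - \mathbf{x}_i) + \mu &= \widehat{\gamma}_{yy}(\mathbf{x}_i - \mathbf{x}_0), \quad i = 1, \ldots, n_y \\
\sum_{j=1}^{n_y} \lambda_j \widetilde{\gamma}_{yf}(\mathbf{x}_k - \mathbf{x}_j) + \sum_{\ell=1}^{n_f} \nu_{\ell} \widehat{\gamma}_{ff}(\mathbf{x}_k - \mathbf{x}_{\ell}) + \mu &= \widetilde{\gamma}_{yf}(\mathbf{x}_k - \mathbf{x}_0), \quad k = 1, \ldots, n_f \\
\sum_{j=1}^{n_y} \lambda_j + \sum_{\ell=1}^{n_f} \nu_{\ell} &= 1
\end{aligned}
\tag{9}
\]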
where μ is the Lagrange multiplier. Equation (9) involves auto-semivariograms of y and f as well as the pseudo cross-semivariogram for non-collocated data fusion.
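A small Python sketch may help to show how such a system is assembled and solved. It assumes a single unbiasedness constraint (all weights sum to one, which is possible because y and f share the same mean), isotropic semivariogram models passed in by the caller, and a plain Gaussian elimination in place of a library solver. All function names are hypothetical, and the admissibility of the models supplied is the caller's responsibility.

```python
import math


def spherical(h, a_range):
    """Unit-sill spherical semivariogram, a common placeholder model."""
    if h >= a_range:
        return 1.0
    r = h / a_range
    return 1.5 * r - 0.5 * r**3


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            fac = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= fac * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def cokriging_weights(data, x0, gamma_yy, gamma_ff, gamma_yf):
    """Weights for estimating y at x0 from primary ('y') and secondary ('f') data.

    data: list of (kind, (easting, northing)) with kind 'y' or 'f'.
    gamma_*: isotropic model functions of distance h; gamma_yf is the
    pseudo cross-semivariogram. Returns (weights, lagrange_multiplier).
    """
    def model(kind_a, kind_b):
        if kind_a != kind_b:
            return gamma_yf
        return gamma_yy if kind_a == "y" else gamma_ff

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    n = len(data)
    a = [[model(ka, kb)(dist(pa, pb)) for kb, pb in data] + [1.0] for ka, pa in data]
    a.append([1.0] * n + [0.0])  # unbiasedness: all weights sum to one
    b = [model(ka, "y")(dist(pa, x0)) for ka, pa in data] + [1.0]
    sol = solve(a, b)
    return sol[:n], sol[n]
```

Dropping the secondary data from `data` reduces the same routine to ordinary kriging of the primary variable, which is a convenient way to compare the two estimators on identical inputs.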
If primary data are sparse and more densely sampled secondary data are strongly correlated with the primary variable, cokriging can help improve estimates of the primary variable. In this work, the secondary data (the borehole resistivity logs) are non-collocated with the pumping test sites. Therefore, a pseudo cross-semivariogram, \( \widetilde{\gamma }_{{yf}} \), has to be used in place of the cross-semivariogram. For comparison, the kriging conditional estimation, which uses only the measured hydraulic conductivity data, is also considered:
\[ \widehat{y}(\mathbf{x}_0) = \sum_{j=1}^{n_y} \lambda_j y_j \tag{10} \]
Therefore, Eq. (9) reduces to the following for y,
\[
\begin{aligned}
\sum_{j=1}^{n_y} \lambda_j \widehat{\gamma}_{yy}(\mathbf{x}_i - \mathbf{x}_j) + \mu &= \widehat{\gamma}_{yy}(\mathbf{x}_i - \mathbf{x}_0), \quad i = 1, \ldots, n_y \\
\sum_{j=1}^{n_y} \lambda_j &= 1
\end{aligned}
\tag{11}
\]
Conditional simulation and cosimulation of hydraulic conductivity
The conditional estimate of hydraulic conductivity obtained by kriging or cokriging is a smoothed distribution over the region, which does not reveal the actual variability of the hydraulic conductivity. However, the K variability has a significant influence on the groundwater responses when a groundwater flow model is adopted (Fogg 1986). In this study, conditional simulation and conditional cosimulation are conducted to study the K variability in relation to groundwater model responses in the real-world case study. The significance of conditional cosimulation relative to conditional simulation for the groundwater responses is assessed using analysis of variance (ANOVA), as discussed later. Conditional simulation and cosimulation techniques simulate the estimation error (kriging error) as a correlated random process using the semivariogram and cross-semivariogram. For non-collocated data, a pseudo cross-semivariogram is adopted in the cosimulation process. This section describes the procedure for cosimulating the measured hydraulic conductivity and the resistivity-derived hydraulic conductivity. The K simulation using only the measured hydraulic conductivity follows by reducing the cosimulation procedure to a univariate model.
A cosimulated hydraulic conductivity field is generated by using a full linear model of cross covariance. The steps followed in the cosimulation algorithm include (1) modeling \( \widehat{\gamma }_{{yy}} \), \( \widehat{\gamma }_{{ff}} \), and \( \widetilde{\gamma }_{{yf}} \) so as to ensure positive definiteness; (2) obtaining a cokriged estimate of the secondary data \( \widehat{f}_{{CoK}} \) on the computation grid; and (3) calculating hydraulic conductivity estimates \( \widehat{y}_{{CoK}} \) (using the SGSIM_FC subroutine; C.V. Deutsch, University of Alberta, Canada, personal communication, 2003). In the study area, the number of borehole resistivity data points is approximately the same as the number of measured hydraulic conductivity data points. Therefore, the information from the primary data (\(y_j\)) is important for the secondary data estimation (\( \widehat{f}_{{CoK}} \)). This was verified by comparing the kriged estimate of f against the cokriged estimate \( \widehat{f}_{{CoK}} \). A hybrid method was used to impose joint correlation and the correct variability in the cosimulation procedure:
Step 1. Obtain a cokriged estimation of secondary data \( \widehat{f}_{{CoK}} \) over the field.
Step 2. Obtain a single-variable simulation of the correlated residual of the secondary variable only, using the semivariogram for the secondary data and zero at the secondary data locations.
Step 3. Sum the simulated result in step 2 and the cokriged results in step 1.
Step 4. Use SGSIM_FC with the normal score transform of the conductivity data and results from step 3 in place of the secondary variable simulation.
Step 5. Back-transform from normal score values of the primary data to the hydraulic conductivity values.
The aim here is to get the primary variable structure into the secondary variable via the cokriging step, with the assumption that the mean carries most of the information; the secondary residual is approximated as uncorrelated with the primary data.
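Due to the scarcity of measured hydraulic conductivity data and electrical-log data, one needs to investigate the uncertainty embedded in the semivariograms \( \widehat{\gamma }_{{yy}} \), \( \widehat{\gamma }_{{ff}} \), and \( \widetilde{\gamma }_{{yf}} \). From Eq. (5), the mean or expected pseudo cross-semivariogram is given by
\[ E\left[ \widetilde{\gamma}_{yf}(h) \right] = \frac{1}{2 n_{yf}} \sum_{i=1}^{n_{yf}} E\left[ \left( y_i - f_{i+h} \right)^2 \right] \tag{12} \]
The variance of the pseudo cross-semivariogram is given by
\[ \mathrm{Var}\left[ \widetilde{\gamma}_{yf}(h) \right] = E\left[ \widetilde{\gamma}_{yf}(h)^2 \right] - \left( E\left[ \widetilde{\gamma}_{yf}(h) \right] \right)^2 \tag{13} \]
The covariance of \( {\left( {y_{i} - f_{{i + h}} } \right)}^{2} \) and \( {\left( {y_{j} - f_{{j + h}} } \right)}^{2} \) is
\[ \widetilde{C}_{yf} = E\left[ \left( y_i - f_{i+h} \right)^2 \left( y_j - f_{j+h} \right)^2 \right] - E\left[ \left( y_i - f_{i+h} \right)^2 \right] E\left[ \left( y_j - f_{j+h} \right)^2 \right] \tag{14} \]
Substituting Eq. (14) into Eq. (13), the pseudo cross-semivariogram variance is
\[ \mathrm{Var}\left[ \widetilde{\gamma}_{yf}(h) \right] = \frac{1}{4 n_{yf}^2} \sum_{i=1}^{n_{yf}} \sum_{j=1}^{n_{yf}} \widetilde{C}_{yf} \tag{15} \]
Similarly, the auto-semivariogram variances for y and f, computed from \(n_{yy}\) and \(n_{ff}\) pairs respectively, are
\[ \mathrm{Var}\left[ \widehat{\gamma}_{yy}(h) \right] = \frac{1}{4 n_{yy}^2} \sum_{i=1}^{n_{yy}} \sum_{j=1}^{n_{yy}} \widehat{C}_{yy} \tag{16} \]
and
\[ \mathrm{Var}\left[ \widehat{\gamma}_{ff}(h) \right] = \frac{1}{4 n_{ff}^2} \sum_{i=1}^{n_{ff}} \sum_{j=1}^{n_{ff}} \widehat{C}_{ff} \tag{17} \]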
where \( \widehat{C}_{{yy}} \) is the covariance of \( {\left( {y_{i} - y_{{i + h}} } \right)}^{2} \) and \( {\left( {y_{j} - y_{{j + h}} } \right)}^{2} \); and \( \widehat{C}_{{ff}} \) is the covariance of \( {\left( {f_{i} - f_{{i + h}} } \right)}^{2} \) and \( {\left( {f_{j} - f_{{j + h}} } \right)}^{2} \). Equations (16) and (17) represent the variances of the semivariograms for a given lag distance, which are the average covariances between the pairs of pairs used to calculate the semivariograms. If \(y_i\) and \(f_j\) are Gaussian, \( \widetilde{C}_{{yf}} \), \( \widehat{C}_{{yy}} \) and \( \widehat{C}_{{ff}} \) can be evaluated by the methods described in Ortiz and Deutsch (2002).
As for the semivariogram uncertainty analysis, the specific lower-limit and upper-limit semivariograms using the semivariogram variances for the auto-semivariogram models and pseudo cross-semivariogram models will be determined according to the real data. The lower and upper limit semivariograms along with the mean semivariograms will be used for the significance analysis using ANOVA.
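The step from a semivariogram mean and variance to lower and upper limits can be sketched as follows. This is a hedged illustration only: it assumes that the semivariogram value at a given lag is lognormally distributed and fits the lognormal parameters by moment matching; the function name is hypothetical, and the adjustment needed to keep the limit models positive definite is not reproduced here.

```python
import math
from statistics import NormalDist


def lognormal_limits(mean, variance, confidence=0.95):
    """Two-sided limits of a lognormal variable with the given mean and variance."""
    if mean <= 0.0 or variance < 0.0:
        raise ValueError("mean must be positive and variance non-negative")
    # Moment matching: parameters of the underlying normal distribution.
    sigma2 = math.log(1.0 + variance / mean**2)
    mu = math.log(mean) - 0.5 * sigma2
    # Standard normal quantile for the requested two-sided interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * math.sqrt(sigma2)
    return math.exp(mu - half_width), math.exp(mu + half_width)
```

Applied lag by lag, for example as `lognormal_limits(gamma_mean, gamma_variance)`, this yields the lower-limit and upper-limit experimental values to which semivariogram models could then be fitted.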
Significance analysis using ANOVA
A single estimate of hydraulic conductivity cannot reproduce the actual heterogeneity and complexity. In order to better understand the influence of the coregionalized model, ANOVA (analysis of variance) was adopted to test for statistically significant differences among the approaches used to generate K distributions. ANOVA makes inferences about mean differences by comparing the variance among different groups (between-group variance) to the variance within the groups (within-group variance) using the F-test (Fisher 1935). Donnelly-Makowecki and Moore (1999) used a combination of ANOVA and jackknife to test the statistical significance of differences in hydrologic model performance using different factors. Rong (2002) used ANOVA to compare MTBE (methyl tertiary-butyl ether) concentrations in sand/gravel and fine-grained soils. Ikem et al. (2002) used ANOVA to compare upgradient and downgradient contaminant concentrations near a waste site to find the impact of the waste site on the groundwater quality.
In this study, four groups of simulations, each using an alternative hydraulic conductivity field created by kriging, cokriging, simulation, or cosimulation, were run with the variogram means and the upper and lower bounds. Statistics from various flow responses are calculated, and the significance of the (1) process (kriging versus simulation and cokriging versus cosimulation); (2) variables (kriging versus cokriging and simulation versus cosimulation); and (3) variogram variance (mean, 95% upper and lower bounds) is tested using a multifactor ANOVA test.
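The between-group versus within-group comparison that underlies ANOVA can be illustrated with a one-way F statistic. The sketch below is a deliberate simplification of the multifactor test used in the study: it handles a single factor, the function name is hypothetical, and converting the statistic to a p-value (for example with an F-distribution table) is left out.

```python
def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df) for one factor.

    groups: list of lists, one list of model responses per factor level.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual responses around their own group mean.
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

For instance, the groups could hold simulated drawdowns obtained with kriged and cokriged fields; a large F relative to the critical value would indicate that the choice of variables matters for that response.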
Case study
Semivariogram estimation and uncertainty in the upper Chicot Aquifer
The Chicot aquifer system, which underlies 15 parishes, is the principal groundwater source for southwestern Louisiana. It is also the most heavily pumped aquifer in Louisiana. Rice irrigation accounts for about 85% of the groundwater pumped from the aquifer in Acadia Parish (Sargent 2002). The Chicot system in Acadia Parish is divided into Upper and Lower Chicot units (sandy aquifers) separated by clay lenses referred to as the Upper/Lower confining zone (Fig. 1). This study focuses on the hydraulic conductivity estimation in the Upper Chicot Aquifer in Acadia Parish.
The local characterization of the Chicot Aquifer in Acadia Parish was developed using geophysical resistivity logs obtained from 74 oil and gas exploration wells. Note that not all the geophysical logs could be used; only 53 of them were used in the cokriging method as some of them did not cover the Upper Chicot Aquifer. The geology is divided into: confining clay layer, Upper Chicot Aquifer, dividing layer, Lower Chicot Aquifer. Figure 1a shows a west-east cross section of the local model.
Field K measurements
There are 42 hydraulic conductivity values (Fig. 2c) determined from specific capacity tests according to Eq. (18) (Bradbury and Rothschild 1985),
\[ K_Y = \frac{Q}{4 \pi b \left( s - C Q^2 \right)} \left[ \ln \left( \frac{2.25 K_Y t}{r_w^2 S_s} \right) + 2 \, \frac{1 - \ell_s / b}{\ell_s / b} \left( \ln \frac{b}{r_w} - G \right) \right] \tag{18} \]
where Q is the pumping rate; s is the drawdown at time t; b is the Upper Chicot Aquifer thickness; \( {\ell }_{s} \) is the length of the screen interval; \(r_w\) is the well radius; and \(S_s\) is the specific storage. The term \(CQ^2\) represents the well loss, where C is the well loss constant. The second term in the bracket corrects the \(K_Y\) calculation for partial penetration of the pumping well. The term G is a polynomial function of the ratio of the screen length to the aquifer thickness,
\[ G = 2.948 - 7.363 \left( \frac{\ell_s}{b} \right) + 11.447 \left( \frac{\ell_s}{b} \right)^2 - 4.675 \left( \frac{\ell_s}{b} \right)^3 \tag{19} \]
Because \(K_Y\) appears on both sides of Eq. (18), the technique solves Eq. (18) iteratively, taking the other parameters as inputs. The Upper Chicot Aquifer thicknesses (b) were determined by analyzing well log data (Carlson et al. 2003).
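The iterative solution can be sketched as a fixed-point iteration. The code below is an illustration under stated assumptions rather than the routine used in the study: it assumes the Bradbury and Rothschild (1985) relation in the form K = Q / (4πb(s − CQ²)) · [ln(2.25Kt / (r_w² S_s)) + 2 s_p], with s_p the partial-penetration term, and it uses the polynomial coefficients commonly quoted for that method, which should be checked against the original reference. The function names, starting value, and tolerance are hypothetical.

```python
import math


def partial_penetration_g(ratio):
    """Polynomial in the screen-length to aquifer-thickness ratio (assumed coefficients)."""
    return 2.948 - 7.363 * ratio + 11.447 * ratio**2 - 4.675 * ratio**3


def k_from_specific_capacity(q, s, t, b, screen, r_w, s_s, c_loss,
                             k0=10.0, tol=1e-8, max_iter=200):
    """Hydraulic conductivity from a specific capacity test by fixed-point iteration.

    q in m^3/day, s, b, screen and r_w in m, t in days, s_s in 1/m,
    c_loss in day^2/m^5. Returns K in m/day.
    """
    ratio = screen / b
    s_p = (1.0 - ratio) / ratio * (math.log(b / r_w) - partial_penetration_g(ratio))
    net_drawdown = s - c_loss * q**2  # drawdown with the well loss removed
    if net_drawdown <= 0.0:
        raise ValueError("well loss exceeds the measured drawdown")
    k = k0
    for _ in range(max_iter):
        k_new = q / (4.0 * math.pi * b * net_drawdown) * (
            math.log(2.25 * k * t / (r_w**2 * s_s)) + 2.0 * s_p
        )
        if abs(k_new - k) <= tol * k_new:
            return k_new
        k = k_new
    raise RuntimeError("iteration did not converge")
```

Because K enters the right-hand side only through a logarithm, the iteration is only weakly sensitive to the starting value and typically settles within a few passes.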
The hydraulic conductivity values of this study were determined from specific capacity tests associated with the development of high-capacity wells usually for irrigation, public supply or industrial use. Because of the intended use of these wells, the average test discharge Q was 7,850±6,210 m3/day. The dimensions of the wells are large in response to the intended large demand for water: average radius is 9.8±3.1 cm, and average screen length is 17.4±7.1 m. The specific capacity tests lasted on average 9.8±9 h. These wells are clearly partially penetrating wells in that the average sand thickness surrounding a well screen is 45.1±15.9 m. The average specific capacity test drawdown is 7.9±5.0 m.
From previous studies, the storativity values for the Chicot Aquifer range from 1.1×10−4 to 3×10−3 with an average of 6.7×10−4. These values are a clear indicator that the Chicot is a confined aquifer (Weight and Sonderegger 2001). The confining clay layer above the Upper Chicot in Acadia Parish has an average thickness of 28.9±10.0 m as determined from examination of 2,648 well log reports.
The well loss constant value used for this study is 2.66×10−8 day2/m5, which is significantly smaller than the one listed in Bradbury and Rothschild (1985), which is 5.46×10−6 day2/m5 after unit conversion for Q. A smaller value of the well loss constant is used to avoid the well loss head exceeding the drawdown for this study’s high-capacity tests, where the average Q is 7,850±6,210 m3/day compared to the smaller tests of Q=10 gallons per minute (54.5 m3/day) noted in Bradbury and Rothschild (1985).
The capacity test data set (42 values) shows that the measured hydraulic conductivity values vary from 5 to 700 m/day with a geometric mean \( \overline{K} _{Y} = 88\,{\text{m}}/{\text{day}} \) and a standard deviation \( \sigma _{{K_{Y} }} = 115\,{\text{m}}/{\text{day}} \).
Electrical-log data
A total of 53 oil and gas geophysical resistivity logs (Fig. 2c) were obtained from the Louisiana Department of Natural Resources, Office of Conservation (OC), Well Log Library. To obtain the average \(R_0\) for each log, the resistivity readings are divided into 3-m (10-ft) sections between the top and bottom elevations of the Upper Chicot Aquifer. Again, it is reasonable to assume constant formation water resistivity because the study area is small and only freshwater is present in the Upper Chicot unit. The groundwater temperature is reported as 25°C. The formation water resistivity \(R_w = 12.7\) ohm-m, water specific weight \(\gamma_w = 9.8\) kN/m3 and dynamic viscosity \( \mu _{w} = 1 \times 10^{{ - 3}} \,{\text{N}}{\text{.s}}/{\text{m}}^{2} \) are used in this study. The variance in the sand formation resistivity factor (\(R_0/R_w\)) is therefore the result of differences in sand/clay ratio and porosity. The effective saturated formation resistivity over the depth of the Upper Chicot is the average of the formation resistivity (\(R_0\)) values calculated at the 3-m (10-ft) intervals. The 53 borehole resistivity logs in the Upper Chicot Aquifer show that the formation resistivity value varies from 43 to 100 ohm-m with a mean of 75 ohm-m and a standard deviation of 16 ohm-m. The average effective particle diameter \(d_e\) is 0.42 mm (1.22 in phi units), calculated from US Geological Survey unpublished sieve test data files (US Geological Survey, unpublished data, 2003).
As for the pore geometry coefficient and cementation factor in Archie’s equation, the correlation over a short lag distance between the measured hydraulic conductivity (\(y_i\)) and the resistivity-derived hydraulic conductivity \(f_j(a, m)\) in the normal-score scale was maximized to obtain the optimal values of a and m. For normal scores with zero mean and unit variance, this is equivalent to finding the a and m values that minimize the deviation between \(y_i\) and \(f_j(a, m)\):
\[ \min_{a,\, m} \sum_{(i,j)} \left[ y_i - f_j(a, m) \right]^2 \tag{20} \]
Eight pairs of \(y_i\) and \(f_j\) were selected with a lag distance of less than 300 m. The bounds of a and m were 0.62≤a≤2.45 and 1.08≤m≤2.15, respectively, according to Asquith and Gibson (1982). The zero-lag distance should be used for collocated data; for non-collocated data, however, a subjective selection of a short lag distance (300 m) is necessary. Equation (20) is solved by a gradient-based nonlinear optimization method, and the optimal values of a and m are found to be 1.76 and 1.64, respectively. Figure 3 shows the cross plot for the eight pairs of \(y_i\) and \(f_j\) with the optimal a and m values, which indicates a nugget between the measured hydraulic conductivity and resistivity-derived hydraulic conductivity data.
The electrical-log data set (53 values) shows that the resistivity-driven hydraulic conductivity values vary from 4 to 66 m/day with a geometric mean \( \overline{K} _{F} = 14\,{\text{m}}/{\text{day}} \) and a standard deviation \( \overline{\sigma } _{F} = 13.7\,{\text{m}}/{\text{day}} \).
Trend analysis and normal score transformation
The measured hydraulic conductivity data and resistivity-driven hydraulic conductivity data are tested for trends using the t-statistic from a bilinear model. The bilinear model is fit in the X and Y directions (Fig. 2c) to detect whether a trend exists. At the 10% level of significance, Table 1 shows that the conductivity data have no significant trend in the X or Y direction. Moreover, the measured hydraulic conductivity and resistivity-driven hydraulic conductivity show positive skewness, which suggests that hydraulic conductivity can be assumed to be log-normally distributed (Domenico and Schwartz 1990). In this study, the data are standardized to zero mean and unit variance and transformed to normal scores (Deutsch and Journel 1998). The normal-score transform facilitates computation of the Archie coefficients (a and m) and the pseudo cross-semivariogram, which requires both random functions to have the same mean and variance.
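A rank-based normal-score transform can be sketched as below. This is a simplified illustration, not the GSLIB implementation: ties are not handled, no declustering weights are used, and the plotting position (rank + 0.5)/n is an assumption. The conductivity values are hypothetical.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map each value to the standard-normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps every probability strictly inside (0, 1).
        p = (rank + 0.5) / n
        scores[idx] = NormalDist().inv_cdf(p)
    return scores


# Hypothetical positively skewed conductivity values, m/day.
k_values = [4.0, 9.5, 14.0, 30.0, 66.0]
scores = normal_scores(k_values)
```

Because the transform depends only on ranks, the skewed input maps to scores that are symmetric about zero, which is what lets two differently distributed variables share the same mean and variance.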
Experimental semivariograms and pseudo cross-semivariogram
The 42 measured hydraulic conductivity values and 53 resistivity-driven hydraulic conductivity values are used in conditional estimation of the hydraulic conductivity field for both the regionalized model (i.e., \( \widehat{y} \) in Eq. (10)) and the coregionalized model (i.e., \( \widehat{\gamma } \) in Eq. (8)). Again, as none of the data in this study area are collocated, a pseudo cross-semivariogram is used instead of the cross-semivariogram. The experimental auto-semivariograms (\( \widehat{\gamma }_{{yy}} \) and \( \widehat{\gamma }_{{ff}} \)) and pseudo cross-semivariogram (\( \widetilde{\gamma }_{{yf}} \)) are computed using Eqs. (5)–(7) and shown in Fig. 4.
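For readers who want to see how non-collocated data can still be paired, the following sketch bins half the mean squared difference between a primary value at one location and a secondary value at another, in the sense of the pseudo cross-variogram of Myers (1991). It is an assumption that this matches Eqs. (5)–(7) term by term; the omnidirectional lag binning, function name, and sample points are illustrative.

```python
from math import hypot


def pseudo_cross_semivariogram(pts_y, pts_f, lag_width, n_lags):
    """Return [(mean distance, gamma, pair count)] for each non-empty lag bin.

    pts_y and pts_f are lists of (x, y, value); the two sets need not share
    locations. Passing the same list twice gives an auto-semivariogram.
    """
    sums = [0.0] * n_lags
    dists = [0.0] * n_lags
    counts = [0] * n_lags
    same = pts_y is pts_f
    for i, (xa, ya, va) in enumerate(pts_y):
        for j, (xb, yb, vb) in enumerate(pts_f):
            if same and j <= i:
                continue  # count each unordered pair once, skip self-pairs
            d = hypot(xa - xb, ya - yb)
            k = int(d // lag_width)
            if k < n_lags:
                sums[k] += (va - vb) ** 2
                dists[k] += d
                counts[k] += 1
    return [
        (dists[k] / counts[k], sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_lags)
        if counts[k] > 0
    ]


# Hypothetical normal-score data: two primary points, one secondary point.
primary = [(0.0, 0.0, 1.0), (1000.0, 0.0, -1.0)]
secondary = [(500.0, 0.0, 0.0)]
result = pseudo_cross_semivariogram(primary, secondary, 1000.0, 1)
```

Squaring the difference between the two variables is only meaningful when both have the same mean and variance, which is the reason for the normal-score transform.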
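This study adopts a spherical (Sph) semivariogram model with a nugget:
\( \gamma _{{Sph}}(h) = c_{0} + c_{1} \left[ \frac{3}{2} \frac{h}{c_{2}} - \frac{1}{2} \left( \frac{h}{c_{2}} \right)^{3} \right] \) for 0<h≤c 2,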
where h is the lag distance; c 0 is the nugget; c 1 is the relative sill; (c 0 + c 1) is the total sill; and c 2 is the semivariogram range. Moreover, γ Sph(0)=0 and \( \gamma _{{Sph}} = c_{0} + c_{1} \) for h>c 2. With the spherical model, the integral scale is
The parameters (nugget, sill, and range) embedded in the semivariograms are estimated through a weighted normalized least-squares estimation (Cressie 1985):
where h k is the kth lag distance; γ Sph is the spherical semivariogram model; and n(h) is the number of points at each lag, which serves as the weight for that lag distance. The minimization problem in Eq. (23) is subject to the Cauchy-Schwarz inequality conditions to ensure that the auto-semivariogram models and pseudo cross-semivariogram model are positive definite (Hohn 1998). The experimental semivariograms in Fig. 4 were fitted by the weighted least-squares estimation method to obtain the optimal values for c 0, c 1, and c 2. The total sill for the auto-semivariogram is equal to 1, which represents the unconditional variance of the normal-score-transformed data. However, the total sill for the pseudo cross-semivariogram is 0.9. Table 2 lists the optimized semivariogram parameter values and shows moderate correlation at short range (as expected when investigating hydraulic conductivity). The resulting semivariogram models are shown in Fig. 4. In particular, the pseudo cross-semivariogram in Fig. 4c shows correlation over a length of approximately 15,000 m, about one-half of the length of the study region, which indicates the appropriateness of using cokriging for the correlated primary and secondary data. The semivariogram models determine the kriging weights and cokriging weights (see section Conditional estimation using hydrogeological data and resistivity data), which are used to obtain the conditional estimates and conditional variances. Moreover, these semivariogram models serve as the mean semivariograms in the following semivariogram uncertainty analysis. The kriged and cokriged hydraulic conductivity distributions using the mean semivariograms are shown in Fig. 5.
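The fitting step can be illustrated with a small sketch. It assumes the standard spherical form and the weighted criterion of Cressie (1985), in which each lag contributes n(h) times the squared relative misfit. A grid search with the total sill fixed at 1 stands in for whatever optimizer was actually used, and the Cauchy-Schwarz constraint is not enforced. The experimental values are synthetic.

```python
def spherical(h, c0, c1, c2):
    """Spherical semivariogram with nugget c0, relative sill c1, range c2."""
    if h <= 0:
        return 0.0
    if h >= c2:
        return c0 + c1
    r = h / c2
    return c0 + c1 * (1.5 * r - 0.5 * r ** 3)


def wls_objective(c0, c1, c2, lags, gamma_exp, n_pairs):
    """Sum of n(h) * (gamma_exp / gamma_model - 1)^2; lags must be > 0."""
    total = 0.0
    for h, g, n in zip(lags, gamma_exp, n_pairs):
        total += n * (g / spherical(h, c0, c1, c2) - 1.0) ** 2
    return total


def fit_unit_sill(lags, gamma_exp, n_pairs, ranges, nugget_steps=20):
    """Grid search over nugget and range with c1 = 1 - c0 (total sill of 1)."""
    best = None
    for i in range(nugget_steps):  # c0 runs from 0 to just below 1
        c0 = i / nugget_steps
        c1 = 1.0 - c0
        for c2 in ranges:
            obj = wls_objective(c0, c1, c2, lags, gamma_exp, n_pairs)
            if best is None or obj < best[0]:
                best = (obj, c0, c1, c2)
    return best[1:]


# Synthetic experimental semivariogram generated from known parameters.
lags = [2000.0 * k for k in range(1, 11)]
gamma_exp = [spherical(h, 0.3, 0.7, 12000.0) for h in lags]
n_pairs = [30] * len(lags)
candidate_ranges = [float(r) for r in range(4000, 30001, 2000)]
c0_fit, c1_fit, c2_fit = fit_unit_sill(lags, gamma_exp, n_pairs,
                                       candidate_ranges)
```

Dividing by the model value makes the criterion relative, so short lags with small semivariogram values are not swamped by lags near the sill.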
Semivariogram uncertainty
The experimental auto-semivariograms of y and f are the mean semivariograms according to the method described in section Significance analysis using ANOVA. A FORTRAN program (C.V. Deutsch, University of Alberta, Canada, personal communication, 2003) is used to determine the semivariogram variance that allows determination of the lower and upper limits under the assumption that the semivariograms are log-normally distributed at each lag distance. The procedures to determine these limits are as follows.
First, the 90% confidence interval was used to determine the semivariogram limits for \( \widehat{\gamma }_{{yy}} \). Figure 6a shows the 5th percentile, 10th percentile, and so forth for each lag distance. The upper-limit semivariogram model is determined by fitting the spherical model to the 95% limits. The lower-limit semivariogram model is a spherical model that starts at the 5% limit at the first lag distance and gradually moves toward the sill, following Ortiz and Deutsch (2002). Figure 6a shows the lower- and upper-limit semivariograms, which are positive definite. The upper limit for \( \widehat{\gamma }_{{ff}} \) is also determined by the best fit of the spherical model to the 95% limit at each lag distance, as shown in Fig. 6b. However, the lower limit of \( \widehat{\gamma }_{{ff}} \) and the lower and upper limits of \( \widetilde{\gamma }_{{yf}} \), as shown in Fig. 6c, must be adjusted such that the Cauchy-Schwarz inequality condition \( \widetilde{\gamma }_{{yf}} \leqslant {\sqrt {\widehat{\gamma }_{{yy}} \widehat{\gamma }_{{ff}} } } \) is satisfied for positive definiteness. Again, these lower and upper limits are spherical models with the model parameters listed in Table 2.
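The percentile step can be sketched under the stated log-normal assumption. Given a mean and variance of the semivariogram at one lag, the function below returns the limits of a central interval; the variance itself would come from the Ortiz and Deutsch (2002) calculation, which is not reproduced here. The function names and input values are illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def lognormal_limits(mean, variance, level=0.90):
    """Lower and upper limits of a central interval for a log-normal value."""
    sigma2 = log(1.0 + variance / mean ** 2)  # variance of ln(gamma)
    mu = log(mean) - 0.5 * sigma2             # mean of ln(gamma)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    s = sqrt(sigma2)
    return exp(mu - z * s), exp(mu + z * s)


def satisfies_cauchy_schwarz(g_yf, g_yy, g_ff):
    """Check gamma_yf <= sqrt(gamma_yy * gamma_ff) at a single lag."""
    return g_yf <= sqrt(g_yy * g_ff)


# Hypothetical mean and variance of the semivariogram at one lag.
lower, upper = lognormal_limits(0.8, 0.04)
ok = satisfies_cauchy_schwarz(0.9, 1.0, 1.0)
```

Checking the inequality lag by lag is a convenient screen when adjusting limit models, although it is the fitted models as a whole that must remain positive definite.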
The cokriged hydraulic conductivity fields using the lower and upper limits of the semivariograms in Fig. 7 show that incorporating the electrical log data introduces more heterogeneity, which cannot be observed in the kriged hydraulic conductivity fields. However, because the cokriged field has less conditional variance, the simulated field using kriging has higher variability in hydraulic conductivity than that using cosimulation, as shown in Fig. 8. The overall structures of the cokriged and cosimulated fields are similar, but the cosimulation has a higher variance. Moreover, the lower-limit semivariogram introduces a longer correlation length than the mean semivariogram. The upper-limit semivariogram has a higher kriging variance than that of the lower-limit semivariogram.
Significance analysis on groundwater responses
With the obtained semivariogram models and their associated semivariogram variances, this section investigates the significance of hydraulic conductivity heterogeneity to the groundwater system responses for different hydraulic conductivity distributions using the kriging method, cokriging method, simulation, and cosimulation with the mean, lower limit and upper limit semivariograms. In this section, a groundwater flow model in the study area is developed to conduct ANOVA on groundwater responses that include hydraulic head variability and mean groundwater level.
Development of groundwater flow model
A regional-scale model of the Chicot Aquifer system in southwest Louisiana was developed to better understand the groundwater flow system in the region (Hanson et al. 2001). The Chicot groundwater model underlying Acadia Parish is based on MODFLOW (Harbaugh and McDonald 1996) and has five layers, 50 rows and 50 columns, resulting in a grid size of approximately 0.83 km². The grid is oriented 20° counter-clockwise from the geographic coordinate in order to closely align with the flow direction (Fig. 2). A local model in Acadia Parish has been developed to better understand the groundwater flow dynamics in the study area (Rahman 2005). A telescopic mesh refinement technique, MODTMR (Leake and Claar 1999), was used to extract the modeled heads from the regional model (Chicot model) and assign them to the hydraulic head boundary conditions of the local model. Therefore, time-varied head boundary conditions are specified for all sides of the study domain. The top layer has a constant-head boundary condition, and the bottom layer is a no-flow boundary. The initial condition of the hydraulic head is created from the 1961 water level estimates. Pumping well types and locations were obtained from the Water Well GIS database of the Louisiana Department of Transportation and Development (DOTD). A total of 411 water wells (26 industrial, 27 public supply and 358 irrigation wells) are included in the model. Only industrial, irrigation and public supply water wells are used in this study, as these three types are responsible for more than 90% of the groundwater withdrawn from the Chicot Aquifer (Sargent 2002). Yearly pumping rates were estimated by linear interpolation from the 5-year water-use reports of the US Geological Survey (Sargent 2002). The total rate was evenly distributed among the registered wells by dividing the yearly pumping rate by the total number of registered wells within each sector.
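The pumping-rate assignment described above amounts to two small calculations, sketched here. The report years, rates, and well count are hypothetical, and the function names are not taken from the study.

```python
def interpolate_yearly_rate(reports, year):
    """Linearly interpolate a sector's pumping rate between report years.

    reports maps a report year to the sector's total rate for that year.
    """
    years = sorted(reports)
    if year < years[0] or year > years[-1]:
        raise ValueError("year is outside the reported period")
    for y0, y1 in zip(years, years[1:]):
        if y0 <= year <= y1:
            t = (year - y0) / (y1 - y0)
            return reports[y0] + t * (reports[y1] - reports[y0])
    return reports[years[0]]  # only one report year was supplied


def rate_per_well(sector_rate, n_wells):
    """Distribute a sector's rate evenly among its registered wells."""
    return sector_rate / n_wells


# Hypothetical sector totals (m3/day) from two consecutive 5-year reports.
reports = {1995: 1000.0, 2000: 1500.0}
rate_1997 = interpolate_yearly_rate(reports, 1997)
well_rate = rate_per_well(rate_1997, 4)
```

Even distribution ignores differences between individual wells, so it is best read as a sector-level approximation.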
The groundwater flow was simulated from the year of 1961 to the year of 2001 and the results at the final year are used for analyzing the head responses.
Groundwater responses
Three responses are derived from the groundwater flow model and investigated for the ANOVA test. The responses are (1) hydraulic head variability (R 1), (2) specific capacity of a pumping well (R 2), and (3) mean groundwater level (R 3). The hydraulic head variability (first response, R 1) is taken as the mean squared error of the difference between the initial hydraulic head distribution in 1961 and the calculated hydraulic head distribution in the year 2000 obtained using different hydraulic conductivity distributions over 2,500 computational nodes in the Upper Chicot Aquifer:
where φ ini,i is the initial groundwater head; φ r,i is the simulated groundwater head for a particular realization r; and n r is the number of realizations.
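One plausible reading of this response is sketched below: the squared head change is averaged over the nodes of each realization and then over the realizations. The order of averaging and the function name are assumptions, and the heads are hypothetical.

```python
def head_variability(initial_heads, realizations):
    """Mean squared head change, averaged over nodes and then realizations."""
    n_nodes = len(initial_heads)
    total = 0.0
    for heads in realizations:
        if len(heads) != n_nodes:
            raise ValueError("each realization needs one head per node")
        sq = sum((h0 - h) ** 2 for h0, h in zip(initial_heads, heads))
        total += sq / n_nodes
    return total / len(realizations)


# Hypothetical two-node example with two realizations (heads in m).
r1 = head_variability([10.0, 10.0], [[9.0, 9.0], [8.0, 10.0]])
```

For a kriged or cokriged field the list of realizations holds a single entry, so the outer average has no effect.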
The number of realizations needed depends on the uncertainty of the K field being addressed (Deutsch and Journel 1998). In this study, the number of realizations (n r) is selected based on a simple test. Model responses (R 1) for different numbers of hydraulic conductivity realizations are calculated while keeping everything else constant. The response for 20 realizations is taken as the finest case. The error terms, \( \varepsilon _{R} = {\left| {R_{1} - R_{{finest}} } \right|} \), are plotted against the square root of the number of realizations, \( {\sqrt {N_{{realizations}} } } \). Increasing the number of hydraulic conductivity realizations from 8 to 20 (finest case) accounts for only 6% of the response uncertainty. Therefore, an average over 20 realizations was used in the stochastic cases. When the kriged field and cokriged field are considered, n r is equal to 1.
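The test can be reproduced in outline as follows. The sketch assumes that the response for N realizations is the running mean of the per-realization responses, which the text does not state explicitly; the response values are hypothetical.

```python
from math import sqrt


def convergence_errors(responses):
    """Return (sqrt(N), |R(N) - R_finest|) for N = 1 .. len(responses).

    responses holds one response value per realization, in generation order;
    R(N) is the mean of the first N values and R_finest uses all of them.
    """
    finest = sum(responses) / len(responses)
    errors = []
    for n in range(1, len(responses) + 1):
        running = sum(responses[:n]) / n
        errors.append((sqrt(n), abs(running - finest)))
    return errors


# Hypothetical per-realization responses.
errors = convergence_errors([2.0, 4.0, 3.0])
```

Plotting the second element against the first shows how quickly the averaged response settles as realizations are added.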
The specific capacity is also a key factor in assessing the groundwater response. A decrease in the specific capacity indicates a decline in the productivity of the well due to the lower effective hydraulic conductivity field in the near-well area, and it reduces the ability of the well to produce water economically. The specific capacity thus assesses the influence of the hydraulic conductivity field on the ability of a well to produce water at a prescribed flow rate. In this study, the second response (R 2) was chosen to be the specific capacity of an irrigation well located at row 31 and column 25, at 561235 E (easting), 3347383 N (northing) in the UTM coordinate system, with the real pumpage Q real=757 m³/day (0.2 million gallons per day). This irrigation well, located near the center of the study area, is chosen because the boundary conditions have the least influence on it. Simulated drawdown (s) at the end of the last stress period is used to evaluate the specific capacity:
where s r is the simulated drawdown for a particular realization. An average over 20 realizations was considered for the simulation process.
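A minimal sketch of this response is given below. It assumes that the specific capacity of each realization is the pumping rate divided by the simulated drawdown and that the ratios are then averaged; averaging the drawdowns first would give a different number. The drawdown values are hypothetical.

```python
Q_REAL = 757.0  # pumping rate of the selected irrigation well, m3/day


def specific_capacity(drawdowns, q=Q_REAL):
    """Average of q / s_r over realizations, in (m3/day) per m of drawdown."""
    if not drawdowns or any(s <= 0 for s in drawdowns):
        raise ValueError("drawdowns must be positive and non-empty")
    return sum(q / s for s in drawdowns) / len(drawdowns)


# Hypothetical drawdowns (m) from two realizations, with a round test rate.
sc = specific_capacity([1.0, 2.0], q=100.0)
```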
This study considers the average groundwater head from a 30×30-inner grid (from row 10 to row 40 and from column 10 to column 40) at the last stress period as the third response, R 3:
The three responses are computed from the groundwater model using the alternative hydraulic conductivity fields (Table 3).
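The third response is a plain average over the inner block of the grid, sketched below. The text gives the block as row 10 to row 40 and column 10 to column 40 and calls it 30×30; an inclusive reading would give 31 cells per side, so this sketch assumes zero-based, half-open index ranges. That choice, the function name, and the head values are all assumptions.

```python
def mean_inner_head(heads, rows=range(10, 40), cols=range(10, 40)):
    """Average head over the inner block of a grid stored as a list of rows."""
    values = [heads[r][c] for r in rows for c in cols]
    return sum(values) / len(values)


# Hypothetical 50 x 50 head field in which the head equals the row index.
grid = [[float(r)] * 50 for r in range(50)]
r3 = mean_inner_head(grid)
```

Restricting the average to the inner block keeps the response away from the specified-head boundaries.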
Statistical analysis using ANOVA
All three responses are found to be insensitive to semivariogram uncertainty for the kriged and cokriged models. For the simulation and cosimulation results, increased variance in the hydraulic conductivity fields (i.e., the upper-limit semivariogram) causes more variation in water-head variability and specific capacity. At all levels of semivariogram uncertainty, the coregionalized method yields higher water levels than the regionalized method.
Results from the cosimulated scenario (Table 3) show that there is a 10% chance that specific capacity is less than the cokriging estimate by at least 11%, and specific capacity from the simulated scenario is less than the kriging estimate by at least 8%. However, this difference may be too small to be of practical importance.
Table 3 shows that the variance of the average water-level response using simulation is greater than the variance using cosimulation. To verify that these two variances differ significantly, an F-test was conducted at all three levels of semivariogram uncertainty. The F-test results in Table 4 show that the variances are different at the 10% significance level. This is because cosimulation reduces uncertainty in aquifer heterogeneity and aquifer response by introducing the secondary variable in the simulation process.
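The statistic behind such a comparison is the ratio of two sample variances. The sketch below returns the ratio and its degrees of freedom; the resulting value would then be compared with a tabulated F critical value at the chosen significance level, a step not coded here. The samples are hypothetical.

```python
from statistics import variance


def variance_ratio(sample_a, sample_b):
    """Return (F, df_a, df_b) with F = var(sample_a) / var(sample_b)."""
    f_stat = variance(sample_a) / variance(sample_b)
    return f_stat, len(sample_a) - 1, len(sample_b) - 1


# Hypothetical water-level responses from two sets of realizations.
f_stat, df_a, df_b = variance_ratio([1.0, 2.0, 3.0, 4.0, 5.0],
                                    [2.0, 3.0, 4.0])
```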
The t-tests in Table 5 show that the mean specific capacity using simulation and cosimulation is significantly different from the conditional estimates using the kriging and cokriging methods at the 10% level of significance. Table 6 shows the ANOVA significance analysis results on the flow model responses in terms of the processes (i.e., kriging versus simulation; cokriging versus cosimulation), variables (i.e., kriging versus cokriging; simulation versus cosimulation), and semivariograms (i.e., lower limit, mean, and upper limit). At the 10% level of significance, the ANOVA results show that semivariogram uncertainty does not have a significant effect on any of the flow model responses.
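One way to frame the comparison in Table 5 is a one-sample t-statistic, in which the responses from the realizations are tested against the single kriged or cokriged estimate. Whether the study used exactly this form is an assumption; the sketch returns the statistic and degrees of freedom only, and the values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, reference):
    """Return (t, df) for the sample mean tested against a fixed reference."""
    n = len(sample)
    t_stat = (mean(sample) - reference) / (stdev(sample) / sqrt(n))
    return t_stat, n - 1


# Hypothetical specific capacities from realizations and a kriged estimate.
t_stat, df = one_sample_t([1.0, 2.0, 3.0], 1.0)
```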
However, the results in the ‘process’ class (Table 6) show that the use of simulation and cosimulation methods has a more significant impact on all of the flow model responses than the use of kriging and cokriging methods. The results in the ‘variable’ class (Table 6) also show that the use of the resistivity data does have a significant effect on the flow model responses.
Conclusions
Geophysical data such as borehole electrical resistivity have been integrated with the measured hydraulic conductivity under the cokriging framework to improve spatially distributed hydraulic conductivity estimation. A pseudo cross-semivariogram has been evaluated to cope with the non-collocated resistivity data. The improvements in the groundwater model have been investigated by comparing the head levels from different hydraulic conductivity models and the significance of (1) processes, (2) variables, and (3) semivariograms as determined by ANOVA. The results show that the use of the coregionalized field has a statistically significant effect on the flow-model response compared to the regionalized model. The study also shows that the simulation process can reproduce the aquifer heterogeneity and has a statistically significant effect on the flow-model response; therefore, cosimulation should be used instead of cokriging. The semivariogram uncertainty has been studied in all models, both regionalized and coregionalized, by using the upper- and lower-limit semivariograms, assuming that the sills are log-normally distributed. However, the results show that the effect of semivariogram uncertainty on the groundwater-flow model response is not significant.
References
Archie GE (1942) The electrical resistivity logs as an aid in determining some reservoir characteristics, AIME Trans 146:54–62
Asquith GB, Gibson CR (1982) Basic well log analysis for geologists. American Association of Petroleum Geologists Methods in Exploration Series 3, The American Association of Petroleum Geologists, Tulsa, OK, USA, 216 pp
Boman GK, Molz FJ, Guven O (1995) An evaluation of interpolation methodologies for generating three dimensional hydraulic property distribution from measured data. Ground Water 33(2):247–258
Bradbury KR, Rothschild ER (1985) A computerized technique for estimating the hydraulic conductivity of aquifers from specific capacity data. Ground Water 23(2):240–246
Carlson D, Milner R, Hanson B (2003) Evaluation of the aquifer capacity to sustain short–long term groundwater withdrawal from point sources in the Chicot Aquifer for southwest Louisiana, part 1. Report of Investigation Series 03–01, 90, Louisiana Geological Survey, Baton Rouge, LA
Carman PC (1956) Flow of gases through porous media. Butterworths, London
Carrera J, Neuman SP (1986) Estimation of aquifer parameters under transient and steady-state conditions. 1. Maximum-likelihood method incorporating prior information. Water Resour Res 22:199–210
Clark I, Basinger K, Harper W (1989) MUCK: a novel approach to co-kriging. In: Buxton BE (ed) Proceedings of the Conference on Geostatistical, Sensitivity and Uncertainty: methods for groundwater flow and radionuclide transport modeling. Battelle Press, Columbus, OH, USA, pp 473–494
Cressie N (1985) Fitting variogram models by weighted least squares. J Int Assoc Math Geol 17:563–586
Deutsch CV, Journel AG (1998) GSLIB: geostatistical software library and users guide, 2nd edn. Oxford University Press, New York, 369 pp
Domenico PA, Schwartz FW (1990) Physical and chemical hydrogeology. Wiley, New York
Donnelly-Makowecki LM, Moore RD (1999) Hierarchical testing of three rainfall-runoff models in small forested catchments. J Hydrol 219:136–152
Eggleston JR, Rojstaczer SA, Pierce JJ (1996) Identification of hydraulic conductivity structure in sand and gravel aquifers: Cape Cod data set. Water Resour Res 32(5):1209–1222
Feyen L, Ribeiro PJ, De Smedt F, Diggle PJ et al (2003) Stochastic delineation of capture zones: classical versus Bayesian approach. J Hydrol 281(4):313–324
Fisher RA (1935) The design of experiments. Oliver and Boyd, Edinburgh, UK
Fogg GE (1986) Groundwater-flow and sand body interconnectedness in a thick, multiple-aquifer system. Water Resour Res 22(5):679–694
Gloaguen E, Chouteau M, Marcotte D, Chapuis R (2001) Estimation of hydraulic conductivity of an unconfined aquifer using cokriging of GPR and hydrostratigraphic data. J Appl Geophys 47(2):135–152
Goovaerts P (1997) Geostatistics for natural resources evaluation. Oxford University Press, New York, 483 pp
Hanson B, Milner R, Willson C, Rahman A, Paulsell R (2001) Evaluation of aquifer capacity to sustain long term ground water withdrawal from point sources: a pilot study. Report to the Louisiana Department of Natural Resources, Baton Rouge, 171 pp
Harbaugh AW, McDonald MG (1996) User’s documentation for MODFLOW-96, an update to the U.S. Geological Survey modular finite-difference ground-water flow model. US Geol Surv Open-File Rep 96–485, 56 pp
Hohn ME (1998) Geostatistics and petroleum geology, 2nd edn. Kluwer, Dordrecht, The Netherlands, 248 pp
Ikem A, Osibanjo O, Sridhar MKC, Sobande A (2002) Evaluation of groundwater quality characteristics near two waste sites in Ibadan and Lagos, Nigeria. Water Air Soil Poll 140:307–333
Kelly WE (1977) Geoelectric sounding estimating aquifer hydraulic conductivity. Ground Water 15(6):420–425
Kitanidis PK (1986) Parameter uncertainty in estimation of spatial functions: Bayesian-analysis. Water Resour Res 22(4):499–507
Kitanidis PK, Vomvoris EG (1983) A geostatistical approach to the inverse problem in groundwater modeling (steady state) and one-dimensional simulations. Water Resour Res 19(3):677–690
Koltermann C, Gorelick SM (1996) Creating images of heterogeneity in sedimentary deposits: a review of descriptive, structure-imitating, and process-imitating approaches. Water Resour Res 32(9):2617–2658
Kosinski WK, Kelly WE (1981) Geoelectric soundings for predicting aquifer properties. Ground Water 19(2):163–171
Leake SA, Claar DV (1999) Procedures and computer programs for telescopic mesh refinement using MODFLOW. US Geol Surv Open-File Rep 99–238, 53 pp
McLaughlin D, Townley LR (1996) A reassessment of the groundwater inverse problem. Water Resour Res 32(5):1131–1162
Myers DE (1991) Pseudo-cross variograms, positive definiteness and cokriging. Math Geol 23:805–816
Neuman SP (1990) Universal scaling of hydraulic conductivities and dispersivities in geologic media. Water Resour Res 26(8):1749–1758
Ortiz CJ, Deutsch CV (2002) Calculation of uncertainty in the variogram. Math Geol 34(2):169–183
Poeter EP, Gaylord DR (1990) Influence of aquifer heterogeneity on contaminant transport at the Hanford site. Ground Water 28(6):900–909
Rahman A (2005) Improvements in groundwater flow modeling through the integration of resistivity logs and hydraulic conductivity and the use of variogram uncertainty, PhD Thesis, Louisiana State University, USA, 104 pp
Rehfeldt KT, Boggs JM, Gelhar LW (1992) Field-study of dispersion in a heterogeneous aquifer. 3. Geostatistical analysis of hydraulic conductivity. Water Resour Res 28(12):3309–3324
Ritzi RW, Jayne DF, Zahradnik AJ, Field AA, Fogg GE (1994) Geostatistical modeling of the heterogeneity in glaciofluvial buried-valley aquifers. Ground Water 32(4):666–674
Rong Y (2002) Groundwater data analysis for MTBE relative to other oxygenates at gasoline-impacted sites 9(4):184–190
Sargent P (2002) Water use in Louisiana, 2000, Special Report no. 15, Department of Transportation and Development Water Resources, Baton Rouge, LA, USA
Vukovic M, Soro A (1992) Determination of hydraulic conductivity of porous media from grain-size composition. Water Resources Publications, Highlands Ranch, CO, USA, pp 1–83
Weight WD, Sonderegger JL (2001) Manual of applied field hydrogeology, McGraw-Hill, New York, 608 pp
Acknowledgements
The research was partially supported by Louisiana Department of Transportation and Development; Louisiana Geological Survey; and Louisiana State University Faculty Research Grant Program. Special thanks go to C.V. Deutsch, Dept. of Civil and Environmental Engineering, University of Alberta, Canada for his invaluable discussion. We also acknowledge the help from R. Milner of the Louisiana Geological Survey for resistivity-log data interpretation.
Rahman, A., Tsai, F.TC., White, C.D. et al. Geophysical data integration, stochastic simulation and significance analysis of groundwater responses using ANOVA in the Chicot Aquifer system, Louisiana, USA. Hydrogeol J 16, 749–764 (2008). https://doi.org/10.1007/s10040-007-0258-x