1 Introduction and motivation

Hydraulic fracturing is the process of injecting highly pressurized slurry to crack tight formations and unlock trapped oil and gas pockets (King 2012). Although the use of this technology dates back several decades, recent advances in horizontal drilling and the capability of reaching deep target formations at lower cost have resulted in extensive use of hydraulic fracturing within the oil and gas industry. Energy independence and an improved economy are supporting the widespread use of this technology (Arthur et al. 2009; USEPA 2012).

Despite the benefits, hydraulic fracturing has raised concerns about its impact on the environment (Kargbo et al. 2010; USEPA 2012, 2015). To ensure safe operation and address these concerns, unique safety and environmental considerations must be taken into account (Heinecke et al. 2014; Jabbari et al. 2015b). Given the relatively short history of horizontal drilling, the long-term environmental consequences of hydraulic fracturing require further investigation (Engelder et al. 2011).

Three potential environmental issues are earthquake hazard, air pollution, and water pollution. Artificial earthquake tremors (i.e. induced seismicity) may be generated during the fracturing operation (Pater and Baisch 2011) or during deep well injection of waste fluid (Aminzadeh et al. 2014). Air pollution originates from volatile organic compounds and particulate matter (McKenzie et al. 2012; Moore et al. 2014). Surface water (Hammer and VanBriesen 2012; VanBriesen et al. 2014) and underground water contamination are caused by either fracturing fluid chemicals or volatile compounds from deep formations (Osborn et al. 2011; Vengosh et al. 2013; Jabbari et al. 2015a; Llewellyn et al. 2015). The handling and treatment of returned fluid is another source of possible environmental complications (Gregory et al. 2011; USEPA 2012; Gordalla et al. 2013; Glazer et al. 2014; Jiang et al. 2014; VanBriesen et al. 2014).

This paper does not address induced seismicity or air pollution. Instead, we focus on shallow groundwater contamination resulting from failure in the injection system. Our aim is to evaluate the risks of such incidents to human health, and to define how the geological and well settings control the magnitude of these risks. The conceptual model of this study is described as a coupled geological system consisting of the groundwater table and the formation underneath.

Assessing the risks to groundwater reserves is a challenging task for several reasons. First, we lack a detailed characterization of the geological system (Rubin 2003). Second, there is variability and uncertainty in the hydrogeological, operational, and health parameters that significantly affect the risk assessment (USEPA 2001). As a consequence, groundwater-driven health risk analysis should be treated within a probabilistic framework (Ciriello et al. 2012). Probabilistic human health risk assessment due to aquifer contamination has been the topic of intense research studies (Andričević and Cvetković 1996; Ma 2002; Benekos et al. 2007; López et al. 2008). Many of the studies showed how the uncertainty related to human health risk, such as increased lifetime cancer risk, is largely impacted by the uncertainty in spatial patterns of the flow field, as well as human exposure and physiological parameters (e.g. Maxwell and Kastenberg 1999; Maxwell et al. 1999; de Barros and Rubin 2008; de Barros et al. 2009; Siirila and Maxwell 2012). In a review paper, different methods to probabilistically quantify groundwater risk analysis are discussed in detail (Tartakovsky 2013). Efficient computation of probabilistic risk analysis in groundwater systems can be found in Ciriello et al. (2012). Although several studies on groundwater-driven risk analysis are available in the literature, there is a need to further develop, and subsequently employ these methodologies to hydraulic fracturing operations in order to improve our capacity to predict and control the associated risks.

Groundwater can potentially be contaminated with chemical additives used in hydraulic fracturing operations. Failures in onsite storage and poor management can lead to spills from the surface (Vengosh et al. 2014). Gas migration from deep formations and contamination from additives of the fracturing slurry during and after the high-pressure injection process are among the other potential risks to groundwater (Osborn et al. 2011; USEPA 2012; Rozell and Reaven 2012; Jackson et al. 2013; Llewellyn et al. 2015). Turbidity, color change, and odor have been reported in groundwater resources of cities near gas production fields (DiGiulio et al. 2011; Holloway and Rudd 2013). The problem of groundwater contamination is heightened in regions with water scarcity and a high dependence on underground supplies (Kargbo et al. 2010; Rahm 2011; USEPA 2012). The risk pathways to shallow aquifers are divided into two categories: (1) accidental spills at the surface and (2) well integrity failure and upward fluid migration (USEPA 2012; Kissinger et al. 2013). Well integrity issues are further divided into two groups (Holloway and Rudd 2013): (1) upward movement of fluid behind the casing (annular flow, in which fluid flows in the space between the casing outer wall and the cement around it) and (2) leakage from the well in a radial pattern (leak flow). Although active pressure monitoring increases the chance of discovering underground leakage (Gordalla et al. 2013), this failure scenario has severe consequences, which makes it worthwhile to investigate.

Upward migration of fluid and shallow groundwater contamination caused by hydraulic fracturing have received considerable attention within the hydrogeological community. Brine migration in the Marcellus shale has been reported as a possible risk pathway to the shallow groundwater using a 2D single-phase flow model (Myers 2012; Saiers and Barth 2012; Cohen et al. 2013). Hypothetical work on geological settings in Germany using a 3D multicomponent multiphase model shows gas migration potential from shale formations over long time periods (Kissinger et al. 2013). However, Kissinger et al. (2013) addressed a deterministic problem and did not take into account the stochastic nature of parameters, nor did they include an evaluation of the risk. In a more recent study, the possibility of contaminant transport through a faulting system has been reported using a 2D single-phase multicomponent model (Gassiat et al. 2013). That study, however, did not investigate the risk characterization within a 3D modeling scope, and did not consider multiphase flow and transport or the associated capillary effects (Reagan et al. 2015).

In spite of the existing body of work in the literature, there is a need to develop risk frameworks for water contamination that are tailored to the hydraulic fracturing process, while capturing the specific features of the subsurface (e.g. coupled saturated–unsaturated flow) on the risk response and corresponding uncertainties. The final goal of this study is to ensure that, while maintaining the efficiency of the hydraulic fracturing operation, water resources and human health are protected.

In this paper, we investigate the impact of both geological features and hydraulic fracturing operational parameters on the risk magnitude and its uncertainty. Prior to uncertainty quantification, we perform a set of deterministic analyses to investigate the sensitivity of risk to different parameters within the hydraulic fracturing scenario. Next, we analyze the effect of hydrogeological (anisotropic permeability field, water saturation and porosity) and operational (point source location and leakage rate) parameters on the chemical concentration, the source-to-receptor arrival time, and the human health risk assessment. Well casing rupture and contamination leakage into the surrounding formation during injection is introduced as the failure event with low probability of occurrence. This rare (extreme) event can yet be followed by a major consequence, such as contamination of nearby drinking water resources. Our work will address the following questions: (1) What are the most important parameters when characterizing the risk in a casing failure scenario? (2) What are the probabilities of exceeding a threshold concentration value for a specific chemical? (3) What is the most useful focal point for the regulators to reduce the human health risk in such complex systems? The findings reported in this study are valid only for the specific geological setting and parameter range discussed.

2 Review of a hydraulic fracturing operation

A hydraulic fracturing operation comprises complementary steps in drilling, casing, cementing, and injection. Fluid injection, at the core of the entire process, is performed in two phases (Spellman 2012). First, the mixture of chemicals and water (also known as the pad) is injected. Second, propping agents are added to the injectant to prop the created fissures open (Economides and Martin 2007). Due to the high pressure needed to fracture the target formation, the interval to be fractured (on the order of kilometers) is divided into smaller segments. Each segment is then fractured separately in a stage which may last from 45 min up to a few hours. After the fracturing job is complete, the fluid is pumped out of the well (Economides and Martin 2007).

Hydraulic fracturing is very effective in extracting natural gas from low-permeability formations such as shale basins. Shale is a tight formation with permeability values as low as 0.01 nano-Darcies (\(10^{-20}\) m\(^{2}\)) (Arthur et al. 2009) and not larger than 10 milli-Darcies (\(10^{-14}\) m\(^{2}\)) (Freeze and Cherry 1979). In the United States shale formations exist in various depths, depending upon geological features found in different regions of the country. Barnett, Marcellus, Fayetteville, Haynesville, Woodford, Antrim, New Albany, and Lewis are among famous U.S. gas shales. These formations are 1220–2255 m deep on average. New Albany is the shallowest shale formation in this set with depth values of 150–610 m followed by Antrim (183–671 m) as the second shallowest (Arthur et al. 2009). As suggested by the United States Energy Information Administration (USEIA), shale plays exist in five regions with porosity values ranging from 1 to 12 % (USEIA 2011). Formation overlaying the shale is normally more pervious. The geological units forming the overburden differ from one shale play to another, but they usually consist of a set of sandstone, inter-bedded shale, siltstone, and mudstone, similar to the case of Marcellus formation (Saiers and Barth 2012). In general, the reported permeability values for sandstone formations vary from \(10^{-17}\) to \(10^{-13}\) m\(^{2}\) (Gleeson et al. 2011). In another classification by Bear (1988), fresh sandstone shows a permeability variation between \(10^{-15}\) and \(10^{-14}\) m\(^{2}\); however, that range increases for oil and gas bearing sandstones with values ranging from \(10^{-13}\) up to \(10^{-11}\) m\(^{2}\) for naturally fractured reservoirs. Porosity of sandstone falls within the range of 5–30 % (Freeze and Cherry 1979).

3 Problem statement

A groundwater contamination problem resulting from a failure in the injection system is considered. The contamination is assumed to be released from a point source on the well vertical casing during the injection phase via a breach on the well casing. In high pressure injection operations, leakage from the existing or new breaches is common, and in some instances will result in well casing burst and blow out (USEPA 2012). Potential health problems from the groundwater exposure pathway are assessed through a human health risk framework.

In this study, human health risk is defined as the hazard of being exposed to a non-carcinogen for a long time (30 years) through the drinking water pathway. The adverse health effects are quantified by adapting USEPA’s chronic hazard index (HI) for tetraethylenepentamine (TEPA), one of the various chemicals of concern. TEPA is used in fracturing fluid as a stabilizer (Gordalla et al. 2013). The reason for selecting TEPA is twofold: (1) availability of operational and toxicological data (although limited), and (2) for illustration purposes. This compound is known to be non-reactive, non-biodegradable, fully soluble in water and stable in the environment (International Programme on Chemical Safety 2001). The density of TEPA is reported as 0.993 g/cm\(^{3}\) at 20 degrees Celsius (International Programme on Chemical Safety 2001). TEPA is an example of a chemical compound that has not been investigated in drinking water resources, as it has been deemed an agent with a low likelihood of occurrence in drinking water sources. The main reason for choosing HI over other risk metrics (such as cancer risk) is the limited number of toxicology studies available for TEPA and many other contaminants of the fracturing fluid.

The formulated risk for this study is a function of two different sets of parameters—namely, operational and hydrogeological. Physical and hydraulic properties of the formation outside the well and near the leakage point define the hydrogeological parameters. The operational parameters, on the other hand, are selected as the leakage rate and the location of a leakage point with respect to the bottom of fresh water reserve.

The effects of different parameters on risk are first investigated through deterministic scenarios, which will help us improve our understanding of the influence of model input parameters on the prediction of interest. Then, uncertainties in the parameters are acknowledged by conducting stochastic analyses through the Monte Carlo simulations. In all cases, we will assume a homogeneous and anisotropic permeability field. Effects of spatial heterogeneities in the permeability field are out of the scope of this work and will be addressed in a future study. One should notice that geological formations are very complex in terms of hydrogeological parameters and settings (i.e. porosity, permeability, water saturation, etc.). Effects of spatial uncertainty in the permeability field and the presence of geological features such as faults, inclined stratification, fissures, and channels have an undeniable impact on the results of studies. In this study, uncertainty in the parameters domain is quantified by the CDF in order to determine the probability of exceedance of three distinct environmental performance metrics (EPMs) that are relevant to risk analysis (de Barros et al. 2012). They are: (1) the chemical concentration at an environmentally sensitive target location, (2) the arrival time (from the source to the target location) and (3) the hazard index on the point of exposure (which is directly linked to the risk metric used in this study). The hazard index is evaluated as a function of the hazard quotient (USEPA 2001).
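The probability of exceedance mentioned above can be read directly off the empirical CDF of the Monte Carlo output. The following is a minimal sketch of that step, not the procedure used in this study; the function name `exceedance_probability` and the sample values are hypothetical and serve only as an illustration.

```python
from bisect import bisect_left


def exceedance_probability(samples, threshold):
    """Empirical Pr(X >= threshold) from a list of Monte Carlo samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # bisect_left counts the samples strictly below the threshold,
    # i.e. n * F(threshold^-) for the empirical CDF F
    n_below = bisect_left(ordered, threshold)
    return 1.0 - n_below / len(ordered)


# Hypothetical hazard-index values from eight realizations (illustration only)
hi_samples = [0.2, 0.4, 0.7, 0.9, 1.0, 1.3, 2.1, 0.5]
p_exceed = exceedance_probability(hi_samples, 1.0)
```

The same function applies unchanged to the other two EPMs, for example the probability that the concentration at the target location exceeds a regulatory threshold.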

For a non-carcinogenic chemical, hazard quotient (HQ) is defined as the exposure dose to a chemical over a period of time, divided by the daily exposure reference dose. The reference dose is the dose with zero likelihood of occurrence for adverse health effects (USEPA 2001). The chronic exposure hazard quotient is given by:

$$HQ = \frac{CDI}{RfD}$$
(1)

whereby CDI is the chronic daily intake [mg/(kg-day)] and RfD is the chronic reference dose [mg/(kg-day)] (i.e. calculated for the lifetime period). The reference dose can be determined from its Drinking Water Equivalent Level (DWEL) (USEPA 2009). DWEL is calculated by assuming an average body weight of 70 kg and water consumption of 2 L/day (USEPA 2009). For the purpose of this study, drinking water is presumed to be the only source of exposure. The CDI is given by:

$$CDI = \bar{C} \times \beta$$
(2)

whereby \(\bar{C}\) is the chemical concentration (at the environmentally sensitive location) averaged over the exposure duration and β is a coefficient which incorporates the health parameters:

$$\beta = \frac{CR}{BW} \times \frac{ED \times EF}{AT}$$
(3)

whereby CR is the contact rate of the medium (water ingestion rate) (L/day), BW is the body weight (kg), ED is the exposure duration (years), EF is the exposure frequency (days/year), and AT is the averaging time (days), which is equivalent to 365 × ED for non-carcinogens (USEPA 2001). It is assumed that \(\bar{C}\) is calculated at a monitoring well location according to the moving average expressed below (Maxwell and Kastenberg 1999; de Barros and Rubin 2008; Siirila and Maxwell 2012):

$$\bar{C}\left( {\mathbf{x}} \right) = \hbox{max} \left( {\frac{1}{ED}\int_{{t_{0} }}^{{t_{0} + ED}} {C\left( {{\mathbf{x}},t} \right)} dt} \right)$$
(4)

with \(t_{0}\) being the starting time of exposure and \(C\left( {{\mathbf{x}},t} \right)\) being the chemical concentration at any point in space \({\mathbf{x}}\) and time t. \(C\left( {{\mathbf{x}},t} \right)\) is determined by the advection–dispersion equation described in the upcoming section. It should be noted that defining the concentration as a maximum averaged value is aligned with the USEPA guidelines, which define the reasonable maximum exposure as the highest exposure that is reasonably expected to occur at a site (USEPA 1989). After calculating the hazard quotient for every single chemical, the Hazard Index (HI) can be formulated as follows:

$$HI = \sum\limits_{i = 1}^{n} {HQ_{i} }$$
(5)

with n denoting the number of different chemicals in the mixture. For this work, as we deal with only one chemical, HI is equal to HQ. According to USEPA (2001), HI = 1 is referred to as the risk level of concern. It should be noted that risk in this study is regarded as the probability of harmful effects and adverse response induced in a human body as a result of exposure to an environmental stressor (USEPA 2001). In other words, we quantify the probability that HI equals or exceeds 1 (i.e. Pr (HI ≥ 1)) (Siirila et al. 2012), which implies the consequence of adverse health effects.
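To make the chain from Eq. 4 back to Eq. 1 concrete, the sketch below evaluates the maximum moving average of a concentration time series and the resulting hazard quotient. It is an illustration under stated assumptions, not the code used in this study: the function names, the uniform one-year time grid, the trapezoidal integration, the plateau-shaped breakthrough curve, and the RfD value are all hypothetical, and CR = 2 L/day is assumed from the DWEL convention.

```python
def max_moving_average(conc, dt, ed):
    """Eq. 4 on a uniform time grid: largest ED-long time average of C(t).

    conc : concentrations [mg/L] sampled every dt years
    ed   : exposure duration [years]
    """
    n = round(ed / dt)  # number of time steps spanned by one exposure window
    if n < 1 or n >= len(conc):
        raise ValueError("series must contain at least n + 1 samples")
    best = 0.0
    for i in range(len(conc) - n):
        window = conc[i:i + n + 1]
        # Trapezoidal rule for the integral of C over [t0, t0 + ED]
        integral = dt * (sum(window) - 0.5 * (window[0] + window[-1]))
        best = max(best, integral / (n * dt))
    return best


def hazard_quotient(c_bar, rfd, cr=2.0, bw=70.0, ed=30.0, ef=365.0):
    """Eqs. 1-3: HQ = CDI / RfD with CDI = c_bar * beta."""
    at = 365.0 * ed                    # averaging time for non-carcinogens [days]
    beta = (cr / bw) * (ed * ef) / at  # [L/(kg-day)]
    return c_bar * beta / rfd


# Hypothetical breakthrough curve: a 1e-4 mg/L plateau lasting 50 years
series = [0.0] * 10 + [1.0e-4] * 50 + [0.0] * 10
c_bar = max_moving_average(series, dt=1.0, ed=30.0)
hq = hazard_quotient(c_bar, rfd=1.0e-5)  # the RfD value is a placeholder
```

Because a single chemical is considered, the value of `hq` is also the hazard index of Eq. 5.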

4 Conceptual hydrogeological model

The conceptual model consists of a sequence of different geological layers: vadose zone, aquifer, impervious layer, overburden formation (sand), and the shale layer. We conceptualize the model as a 3D composite geological formation. Figure 1 schematically displays a cross section of the 3D model with the approximate location of the failure point (contamination source) on the well casing, and the geological settings of the hypothetical model. The shallowest layer, the vadose zone (Layer I), is placed on top with a thickness of 20 m. Then a 20 m thick aquifer (Layer II) lies underneath the vadose zone. A thin shale layer (Layer III—2 m thickness) separates the aquifer from the underneath sand medium (Layer IV) with a thickness of 800 m. This thin layer is denoted here as the impervious layer. The shale reservoir (Layer V) is assumed to be a shallow formation starting from a depth of 840 m and extending to 900 m (see Fig. 1). It should be noted that the thickness and depth values are based on the properties of shallow shale plays reported by the USEIA (2011).

Fig. 1

Schematic representation of the failure event and the hydrogeological conceptual model: Point source located on the vertical section of the well within the sand formation. The layers include Vadose zone (I), aquifer (II), impervious layer (III), overburden formation (sand layer-IV), and the shale layer (V)

A monitoring well is placed at a 100 m distance from the hydraulic fracturing well. The sand formation over the shale is deemed to be a water under-saturated porous medium. As it is evidenced in multiple field studies for Marcellus Shale and other shale formations (e.g. Saiers and Barth 2012; Cohen et al. 2013; Reagan et al. 2015), it is not a valid assumption to model the formation underneath the aquifer as a fully-saturated medium. This medium is, in general, partially filled with water. Therefore, the multiphase flow characteristics and the effects of relative permeability (e.g. capillary effects) are incorporated to be able to account for interactions between water and air phases.

Simulation time is set to be 500 years, so that the model can show the mid- and long-run responses of the system. Also, it is presumed that leakage from the injector to the surrounding formation takes place only during each fracturing stage which lasts 2.5 h. Assuming 6 fracturing stages there will be a total of 15 h of leakage.

4.1 Flow and transport model formulation

The solute transport mechanisms in this work include advection, mechanical dispersion, and diffusion. For the case analyzed in this paper, the effects of advection and dispersion are more pronounced during the injection period because of the ongoing fluid movement from the point source into the medium. However, the transport mechanism is mainly diffusive, starting from the end of injection to the time at which the contamination plume diffuses across the impervious layer (Layer III-Fig. 1) and enters the aquifer (Layer II-Fig. 1).

Here, we consider a 3D variably saturated flow field (within the sand medium), with an anisotropic permeability field and constant porosity. The domain of interest is denoted by Ω with boundary Γ. The governing equation for solute transport in the porous media is as follows (Chen et al. 2006):

$$\frac{{\varphi \,\partial \left( {\,c\rho } \right)}}{\partial t} = - \nabla \cdot \left( {c\rho {\mathbf{u}} - \rho {\mathbf{D}}\nabla c} \right) + q$$
(6)

in which c is the concentration of the chemical in the fluid phase, φ is the medium’s effective porosity, ρ is the water density, \({\mathbf{u}}\) is the specific discharge, q is sink or source, and \({\mathbf{D}}\) is the diffusion and dispersion tensor (Eq. 12). The concentration varies in space \({\mathbf{x}} = \left( {x,y,z} \right)\) and time t. Darcy’s law for a two-phase flow system (Helmig 1997) is given by:

$${\mathbf{u}}_{\alpha } = - \frac{1}{{\mu_{\alpha } }}k_{r\alpha } {\mathbf{k}}\left( {\nabla p_{\alpha } - \rho_{\alpha } g\nabla z} \right),\quad \alpha = w, a$$
(7)

with \({\mathbf{k}}\) denoting the absolute permeability tensor, and \(k_{r\alpha}\), \(p_{\alpha}\), and \(\mu_{\alpha}\) denoting the relative permeability, pressure, and viscosity of phase α. The water and gas phases are denoted by w and a, respectively. The relative permeability (\(k_{r}\)) of each phase is a scalar that relates the effective permeability tensor of that phase to the absolute permeability tensor. The Brooks–Corey relationship is a commonly used empirical expression for the relative permeability (Brooks and Corey 1964). For the wetting phase, we have:

$$k_{rw} = \left( {S_{e} } \right)^{{\frac{2 + 3\lambda }{\lambda }}}$$
(8)

whereas for the non-wetting phase, the following expression is used:

$$k_{rnw} = \left( {1 - S_{e} } \right)^{2} \left( {1 - S_{e}^{{\frac{2 + \lambda }{\lambda }}} } \right)$$
(9)

where the wetting phase is water, the non-wetting phase is air, \(S_{e}\) is the effective saturation and λ is the pore-size distribution index. The existence of surface tension between two fluids causes the pressure in the wetting fluid to be less than the pressure of the non-wetting fluid. The pressure difference (capillary pressure) is then calculated as (Chen et al. 2006):

$$p_{c} = p_{a} - p_{w}$$
(10)

Capillary pressure can also be expressed as a function of \(S_{w}\). One of the most well-known empirical equations for the capillary function is the Brooks–Corey equation (Brooks and Corey 1964):

$$p_{c} = p_{e} \left( {S_{ew} } \right)^{ - 1/\lambda }$$
(11)

with \(p_{c}\) denoting the capillary pressure, \(p_{e}\) the air-entry pressure, and \(S_{ew}\) the effective water saturation (written \(S_{e}\) in Eqs. 8 and 9).
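The Brooks–Corey closures of Eqs. 8, 9 and 11 are simple power laws and can be sketched in a few lines. The snippet below is an illustration only; the function names and the sample values of λ and the air-entry pressure are hypothetical and are not the parameters used in the simulations.

```python
def krw(s_e, lam):
    """Eq. 8: wetting-phase (water) relative permeability."""
    return s_e ** ((2.0 + 3.0 * lam) / lam)


def krnw(s_e, lam):
    """Eq. 9: non-wetting-phase (air) relative permeability."""
    return (1.0 - s_e) ** 2 * (1.0 - s_e ** ((2.0 + lam) / lam))


def capillary_pressure(s_e, p_e, lam):
    """Eq. 11: capillary pressure, in the same units as p_e."""
    if s_e <= 0.0:
        raise ValueError("capillary pressure is unbounded at zero effective saturation")
    return p_e * s_e ** (-1.0 / lam)


# Hypothetical inputs: lambda = 2 and an air-entry pressure of 1000 Pa
lam = 2.0
water_kr = krw(0.5, lam)                      # 0.5 ** 4
air_kr = krnw(0.5, lam)                       # 0.25 * (1 - 0.5 ** 2)
pc = capillary_pressure(0.25, 1000.0, lam)    # 1000 * 0.25 ** -0.5
```

At full effective saturation the water relative permeability is 1 and the air relative permeability is 0, while the capillary pressure reduces to the air-entry pressure, which is a quick sanity check on any implementation.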

The 3D diffusion and mechanical dispersion tensor \({\mathbf{D}}\), is defined as follows:

$${\mathbf{D}} = \varphi \left\{ {d_{m} {\mathbf{I}} + \left| {\mathbf{u}} \right|\,\left( {d_{l} {\mathbf{E}}\left( {\mathbf{u}} \right) + d_{t} {\mathbf{E}}^{ \bot } \left( {\mathbf{u}} \right)} \right)} \right\}$$
(12)

whereby \(d_{m}\) is the molecular diffusion coefficient, and \(d_{l}\) and \(d_{t}\) are the longitudinal (parallel to the flow) and transverse (perpendicular to the flow in two directions: horizontal and vertical) mechanical dispersion coefficients. The Euclidean norm of the specific discharge is given by:

$$\left| {\mathbf{u}} \right| = \sqrt {u_{x}^{2} + u_{y}^{2} + u_{z}^{2} }$$
(13)

The orthogonal projections in Eq. 12 are defined as:

$${\mathbf{E}}\left( {\mathbf{u}} \right) = \frac{1}{{\left| {\mathbf{u}} \right|^{2} }}\left( {\begin{array}{*{20}c} {u_{x}^{2} } & {u_{x} u_{y} } & {u_{x} u_{z} } \\ {u_{y} u_{x} } & {u_{y}^{2} } & {u_{y} u_{z} } \\ {u_{z} u_{x} } & {u_{z} u_{y} } & {u_{z}^{2} } \\ \end{array} } \right)\;{\text{and}}\quad {\mathbf{E}}^{ \bot } \left( {\mathbf{u}} \right) = {\mathbf{I}} - {\mathbf{E}}\left( {\mathbf{u}} \right)$$
(14)
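Equations 12–14 can be assembled directly from the components of the specific discharge. The sketch below does so with plain Python lists; it is an illustration rather than the simulator's implementation, and the function name and numerical inputs are hypothetical.

```python
import math


def dispersion_tensor(u, phi, d_m, d_l, d_t):
    """Eq. 12: 3x3 diffusion-dispersion tensor for specific discharge u = (ux, uy, uz)."""
    norm = math.sqrt(sum(c * c for c in u))  # Eq. 13
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if norm == 0.0:
        # No flow: only molecular diffusion remains
        return [[phi * d_m * eye[i][j] for j in range(3)] for i in range(3)]
    # Eq. 14: projection onto the flow direction; its complement is I - E
    e_par = [[u[i] * u[j] / norm ** 2 for j in range(3)] for i in range(3)]
    return [
        [
            phi * (d_m * eye[i][j]
                   + norm * (d_l * e_par[i][j] + d_t * (eye[i][j] - e_par[i][j])))
            for j in range(3)
        ]
        for i in range(3)
    ]


# Hypothetical inputs: flow along x only
D = dispersion_tensor(u=(2.0, 0.0, 0.0), phi=0.2, d_m=1.0e-9, d_l=1.0, d_t=0.1)
```

For flow aligned with the x axis the tensor is diagonal, with the longitudinal term on the first diagonal entry and the transverse term on the other two.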

As for the flow boundary conditions of the system of equations, we consider the Neumann type, in which the mass flux is prescribed on the boundary (Γ):

$$\rho \,{\mathbf{u}} \cdot {\varvec{\upupsilon}} = g\;{\text{on}}\;\varGamma$$
(15)

whereby \({\varvec{\upupsilon}}\) is the unit normal outward to Γ and \(\rho \,{\mathbf{u}} \cdot {\varvec{\upupsilon}}\) gives the projection of the flux vector on unit normal vector of the domain’s boundary. In our model, g will be 0 for all the boundaries of the overburden formation (i.e. no flow condition) and also for boundaries of the aquifer layer perpendicular to y-axis. Aquifer boundaries perpendicular to x-axis, however, are ascribed a constant mass flux (Table 1). The initial condition of the flow system is given as:

$$p\left( {{\mathbf{x}},0} \right) = p_{0} \left( {\mathbf{x}} \right),\quad {\mathbf{x}} \in \varOmega$$
(16)

with \(p_{0} \left( {\mathbf{x}} \right)\) being the hydrostatic pressure and Ω denoting the entire domain of interest.

Table 1 Input data used in the main case scenario

4.2 Numerical implementation

In this work, the 3D simulation model consists of the aquifer and 60 m of the sand layer underneath. The horizontal extents of the model are 210 and 205 m in the x and y directions, respectively. It is assumed that the shale is hydraulically fractured at a depth of 850 m. The shale layer, however, is not part of the numerical model. Fluid leakage is considered as a point source contamination on the vertical section of the well.

Figure 2 illustrates the distribution of the cells in the numerical mesh. Three different sets of numerical block dimensions (cells) are used in this model: Primary cells (5 m × 5 m × 1 m), refined cells (1 m × 1 m × 1 m), and coarse cells (10 m × 10 m × 10 m).

Fig. 2

Numerical model layers consisting of the groundwater and the geological settings underneath (a) and numerical mesh with injector and the monitoring well (b). A locally refined mesh was employed to accurately capture velocity and concentration gradients

In areas 1 and 3 (Fig. 2b) a lower resolution suffices because these locations are far away from the source and receptor. Therefore, the coarsened cell blocks are used in areas 1 and 3. Area 2, however, is the area of focus and its mesh must be fine enough to give high resolution. This area is meshed using a combination of global and local blocks. Local grid refinement is used around the leakage point to improve the accuracy of the results. Lateral sides of the model are ascribed no-flow boundary conditions, with the exception of the sides of the aquifer perpendicular to the x axis, which are assigned open-flow conditions to simulate the actual horizontal movement of groundwater. The open-flow boundary condition is modelled using ECLIPSE’s analytical aquifer (constant flux) module with a flow velocity of 1.1 cm/day, which results in a 2 % pressure gradient in the (positive) x-direction within the aquifer layer. It is important to state that our study deals with a deterministic boundary condition setup. Uncertainty in boundary conditions can have remarkable effects on the results of simulation work.

The locally refined mesh of this work is selected after performing a grid refinement sensitivity analysis in which results for numerical grids with block dimensions of 5 m × 5 m, 2.5 m × 2.5 m, and 1.5 m × 1.5 m (all with vertical block dimension of 1 m) were evaluated. This analysis revealed that results for locally refined numerical mesh are not significantly different than those of the case with 1.5 m × 1.5 m × 1 m grid blocks. Also, it is verified that the dispersivity values are in accordance with the scale of the grid block to avoid further numerical errors (i.e. dispersivities are smaller than the finest grid block dimension of 1 m).

As discussed in the conceptual model (Sect. 4), the aquifer is connected to the sand formation via an impervious layer and in the simulation the contamination plume diffuses across this layer and arrives up at the aquifer.

Parameter values used in the main case are reported in Table 1. As previously stated, it is assumed that the aquifer has a constant pressure gradient of 2 % in the positive x direction, maintained by defining constant inflow and outflow at the boundaries of the aquifer. The hydraulic fracturing operation is performed in 6 stages, each 2.5 h long, with 10 h of relaxation between consecutive stages. The injection rate is selected to be 11,500 m\(^{3}\)/day with chemical additives contributing 2 weight percent of the fracturing slurry. For the main case scenario investigated, the leakage depth is assumed to be 10 m below the aquifer bottom and the leakage rate is calculated for a 2.5 cm hole at this specified depth and for the specific geological settings discussed here. The pressure difference between the inner well casing and the surrounding formation (i.e. hydrostatic pressure) is the main driver of the leakage. In the sensitivity analysis section, the focus is on values below and above the ones selected for the main case scenario to investigate the sensitivity of the simulation results to these values. The upper value of 20 m for the leakage depth is selected after performing a preliminary set of simulations in which, for leakage points deeper than 20 m, very low concentration values for TEPA were observed in the monitoring well. As will be discussed in Sect. 5.2, calculation of HI for this depth interval reveals the critical depth as an important piece of information for risk managers.

As previously stated, a groundwater monitoring well is located 100 m away from the injection well to assess the effect of advection and dispersion on the contamination plume when travelling within the aquifer. It is further assumed that TEPA forms 2 % of the total chemical concentration (Gordalla et al. 2013). Health related parameters used in this study (see Eqs. 1–5) are shown in Table 2.
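The operational figures quoted above (6 stages of 2.5 h, 10 h of relaxation, an injection rate of 11,500 m³/day, 2 wt% additives, and TEPA at 2 % of the additives) imply a simple timeline and a few derived quantities. The sketch below only performs that arithmetic; the names are hypothetical, and treating both 2 % figures as weight fractions that can be multiplied is an assumption made here for illustration.

```python
N_STAGES = 6
STAGE_H = 2.5              # duration of one fracturing stage [h]
REST_H = 10.0              # relaxation between consecutive stages [h]
RATE_M3_PER_DAY = 11_500.0
ADDITIVE_FRACTION = 0.02   # additives in the slurry (weight fraction)
TEPA_FRACTION = 0.02       # TEPA share of the additives (assumed weight fraction)


def stage_windows(n_stages, stage_h, rest_h):
    """Return (start, end) times in hours for each injection stage."""
    windows = []
    start = 0.0
    for _ in range(n_stages):
        windows.append((start, start + stage_h))
        start += stage_h + rest_h
    return windows


windows = stage_windows(N_STAGES, STAGE_H, REST_H)
leak_hours = sum(end - start for start, end in windows)   # time during which leakage occurs
span_hours = windows[-1][1]                               # first start to last end
volume_per_stage = RATE_M3_PER_DAY * STAGE_H / 24.0       # injected slurry per stage [m^3]
tepa_in_slurry = ADDITIVE_FRACTION * TEPA_FRACTION        # TEPA fraction of the slurry
```

The leakage is therefore active for 15 h out of an operation spanning 65 h, which is negligible compared with the 500-year simulation time and is consistent with treating the release as a short pulse.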

Table 2 Health-related parameters used in the model

Numerical simulations are carried out by ECLIPSE (Schlumberger 2011). ECLIPSE is a simulation software for subsurface multi-phase and multi-component flow and transport equations (see Eqs. 6–14) (Schlumberger 2011). Aside from modeling oil and gas reservoirs, this simulator has applications in groundwater modeling (Mohammed et al. 2009) and environmental assessment and remediation (Zhou and Arthur 1994).

In the current study, transport is a function of different hydrogeological and operational parameters. A main case scenario is assumed to assess the effect of initial water saturation (\(S_{w}^{\circ}\)) and vertical to horizontal permeability ratio (κ) defined as:

$$\kappa = k_{vertical} /k_{horizontal}$$
(17)

whereby \(k_{vertical}\) and \(k_{horizontal}\) are the vertical and horizontal permeability values, respectively. In addition, four deterministic scenario sets, each with one varied parameter, are simulated for sensitivity analysis. It should be noted that TEPA and many other additives used in fracturing fluid are unlikely to occur in water, so no complete human toxicological studies exist. As a result, and in accordance with the German Federal Environment Agency, the allowable contamination level is set to be \(3 \times 10^{-4}\) mg/L (Gordalla et al. 2013).

Among the different parameters of the model, the focus is on two geological parameters of the sand layer (Fig. 1), namely horizontal permeability k horizontal and porosity φ, and two operational parameters: leakage point distance from the aquifer bottom H and the rate of leakage Q. The permeability field is chosen to be homogeneous and anisotropic. Deterministic scenario sets and associated parameter values are listed in Table 3. Water saturation and vertical to horizontal permeability ratios are altered for the main case scenario. Two extreme values of 0.1 and 0.3 are selected for S ° w, and a range of 0.1–1 is assigned to the vertical to horizontal permeability ratio. The water saturation values are calculated using empirical relationships among permeability, porosity and residual water of sand formations (Timur 1968).

Table 3 Parameters used for deterministic scenarios: Leakage rate (Q), Leakage depth below the aquifer bottom (H), Sand formation horizontal permeability (k horizontal ), and Sand formation porosity (φ)

In this study, we assume long term exposure to the chemical. Therefore, the chronic HI is used for risk quantification (Eq. 5). As stated by the USEPA, when working with the reasonable maximum exposure, one could use an exposure duration of 30 years as the upper-bound value, but a lifetime exposure assumption (i.e. 70 years by convention) is also appropriate in some cases (USEPA 1989). We selected 30 years for ED and EF = 365 days/year as advised by USEPA (2001). The body weight is fixed at 70 kg, as the only pathway is drinking water ingestion and the contact rate to body weight ratios remain approximately constant over a lifetime (USEPA 1989). The reference dose is calculated through the conversion method described previously in the risk formulation section (USEPA 2009).
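To make the exposure assumptions concrete, the sketch below evaluates a chronic hazard index for drinking water ingestion. Eq. 5 is not reproduced in this section, so the commonly used USEPA form (average daily dose divided by the reference dose, with the averaging time equal to the exposure duration for non-carcinogens) is assumed here. ED, EF and body weight follow the text; the ingestion rate of 2 l/day and the reference dose in the example call are placeholders, since the Table 2 values are not shown.

```python
def chronic_hazard_index(conc_mg_per_l, rfd_mg_per_kg_day,
                         ir_l_per_day=2.0,        # placeholder ingestion rate
                         ef_days_per_year=365.0,  # from the text
                         ed_years=30.0,           # from the text
                         bw_kg=70.0):             # from the text
    """Chronic HI for ingestion: average daily dose divided by reference dose.

    Assumed form (not taken from the paper's Eq. 5):
        ADD = C * IR * EF * ED / (BW * AT),  HI = ADD / RfD
    with AT = ED * 365 days for non-carcinogenic effects.
    """
    at_days = ed_years * 365.0
    add = (conc_mg_per_l * ir_l_per_day * ef_days_per_year * ed_years
           / (bw_kg * at_days))
    return add / rfd_mg_per_kg_day


# Hypothetical inputs: the concentration equals the threshold quoted in the
# text, while the reference dose is an arbitrary placeholder.
hi_example = chronic_hazard_index(conc_mg_per_l=3e-4, rfd_mg_per_kg_day=1e-5)
print(f"HI (illustrative inputs only): {hi_example:.3f}")
```

With EF = 365 days/year and AT = ED × 365 days, the exposure duration cancels, so in this assumed form HI depends only on the concentration, the ingestion rate, the body weight and the reference dose.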

5 Results and discussion

5.1 Impact of anisotropy and initial water saturation on concentration and risk

Figure 3 shows the concentration isolines for different combinations of S ° w and κ in the main case scenario. Concentrations are tracked over time at two environmentally sensitive locations in the computational domain: (1) the bottom of the aquifer (termed the aquifer boundary), where the plume first enters the aquifer; and (2) the monitoring well. The concentration at the aquifer boundary remains zero for almost 25 years after the end of hydraulic fracturing fluid injection. Thereafter, the concentration rises to a peak and then declines. For the physical set-up used in this work, it takes nearly 90 years for the contaminant to be observed in the monitoring well (see Fig. 3b). The key conclusion of this analysis is that the most critical situation occurs for the larger water saturation (S ° w  = 0.3) and the smaller anisotropy ratio (κ = 0.1).

Fig. 3

Contour plots of normalized concentration of TEPA at the sensitive locations as a function of time and κ. The sensitive locations considered are: aquifer boundary (a) and monitoring well (b). Concentrations shown for different initial water saturations (S ° w ) and anisotropy ratios (κ) over the simulation time. The concentration values are normalized by the initial injected concentration C 0

Changes in S ° w have a direct effect on the concentration at both environmentally sensitive locations. For both the aquifer boundary and the monitoring well, the trend of the time-concentration curves is the same: concentration values remain zero for a long time, then rise, reach a peak and eventually decrease. However, Fig. 3 clearly shows a time difference between the peak concentration at the aquifer boundary and that at the monitoring well. The magnitude of the peak value also differs between the two locations. Much lower concentration values are observed in the monitoring well, as the well is 100 m away from the injector and the plume also undergoes dilution in the fully-saturated groundwater. According to Fig. 3, for a fixed S ° w and horizontal permeability, increasing k vertical in Eq. 17 (i.e. decreasing anisotropy in the permeability field) results in lower concentration values at later times for both locations.

This is mainly attributed to the increased stratification of the geological formation in the anisotropic case (i.e. κ = 0.1). This phenomenon can be observed in Fig. 4, which shows the cross-section of the numerical model for the two extreme permeability ratios. Compared to the anisotropic case (κ = 0.1), the isotropic homogeneous case (κ = 1) has a k vertical that is 10 times larger, which facilitates the downward movement of the plume under the vertical pressure gradient due to gravity. The reason is the greater effective permeability in the case with larger absolute vertical permeability (i.e. Fig. 4b). In multiphase flow systems, the effective permeability for each phase is a fraction of the absolute permeability of the geological formation and is defined as follows:

$$k_{eff - \alpha } = k_{r\alpha } \times k_{abs} ,$$
(18)

where k eff-α denotes the effective permeability of phase α, k rα is the relative permeability of phase α, and k abs stands for the absolute permeability of the formation. For both cases reported in Fig. 4, the initial water saturation (irreducible, residual, or connate water) is the same and so is the relative permeability. At the beginning of the simulation (t = 0), no fluid movement occurs within the sand medium, as the water content is initially at the irreducible level and relative permeability values for both phases are zero. As leakage starts to occur, the water content in the sand formation increases, the water relative permeability (and hence the effective permeability, see Eq. 18) becomes positive, and fluid therefore starts moving in the vertical direction. It should be noted that the only effective pressure gradient in the sand formation is due to the density (gravity) force in the negative z-direction. Darcy’s law, Eq. 7, can be re-written for the water phase in this specific situation as follows:

$${\mathbf{u}}_{w} = - \frac{1}{{\mu_{w} }}k_{rw} {\mathbf{k}}\left( {0 - \rho_{w} g\nabla z} \right) = \frac{1}{{\mu_{w} }}k_{rw} {\mathbf{k}}\rho_{w} g\nabla z$$
(19)

With constant relative permeabilities and pressure gradient in the vertical direction, larger absolute permeability (as shown in Fig. 4b) facilitates plume displacement. Therefore, for the κ = 1 simulation (i.e. absolute vertical permeability of 150 mD), the contaminant plume vertical spreading is higher when compared to the κ = 0.1 case (with absolute vertical permeability of 15 mD). For the latter, the solute plume remains in the vicinity of the leakage point (Fig. 4 time-steps 65 h and 27.4 years).
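The effect of the absolute vertical permeability in Eq. 19 can be illustrated numerically. The sketch below evaluates the magnitude of the gravity-driven water flux for the two vertical permeabilities quoted above (15 and 150 mD). The water density, viscosity and relative permeability are placeholder values chosen for illustration, not the values used in the simulations.

```python
MILLIDARCY_TO_M2 = 9.869233e-16  # 1 mD expressed in m^2


def gravity_driven_flux(k_vertical_md, k_rw,
                        rho_w=1000.0,  # water density (kg/m3), placeholder
                        mu_w=1.0e-3,   # water viscosity (Pa s), placeholder
                        g=9.81):       # gravitational acceleration (m/s2)
    """Magnitude of the vertical Darcy flux in Eq. 19 (m/s).

    |u_w| = k_rw * k_vertical * rho_w * g / mu_w, i.e. flow driven by
    gravity alone, with no imposed pressure gradient.
    """
    k_m2 = k_vertical_md * MILLIDARCY_TO_M2
    return k_rw * k_m2 * rho_w * g / mu_w


K_RW = 0.1  # hypothetical water relative permeability, identical in both cases

flux_anisotropic = gravity_driven_flux(15.0, K_RW)   # kappa = 0.1
flux_isotropic = gravity_driven_flux(150.0, K_RW)    # kappa = 1
print(f"flux, kappa = 0.1: {flux_anisotropic:.3e} m/s")
print(f"flux, kappa = 1:   {flux_isotropic:.3e} m/s")
print(f"ratio: {flux_isotropic / flux_anisotropic:.1f}")
```

Because the flux is linear in the absolute permeability, the ratio between the two cases equals the ratio of the vertical permeabilities regardless of the placeholder fluid properties.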

Fig. 4

Cross-section of the model illustrated for cases with constant water saturation (S ° w  = 0.3) and varied permeability ratios: a κ = 0.1 and b κ = 1. The gravity effect on the shape of contamination plume is more pronounced in early stages of the simulations (time-steps 65 h and 2.74 years)

The simulated concentrations can now be translated into the risk measure (HI—see Eq. 5). The changes of HI with the permeability are illustrated in Fig. 5. For all of the κ values, HI shows numbers larger than 1, indicating a potential hazard of TEPA. Similar to the case of concentration, increasing water saturation and decreasing the anisotropy ratio result in higher values of HI.

Fig. 5

Impact of anisotropy ratio (κ) and initial water saturation (S ° w ) on the Hazard Index (HI) of TEPA. Values of HI remain above 1 for all the scenarios, indicating a potential hazard for consumption of the drinking water

As shown in Fig. 5, irrespective of the solute travel time (from source to receptor) or the concentration values observed, the hazard index values are high (i.e. greater than one) for the range of parameters explored, indicating a potential risk for humans exposed to the pollutant.

5.2 Sensitivity analysis of the parameters of interest

For all deterministic scenarios discussed in this section, the permeability ratio (Eq. 17) is fixed at 0.1 and the water saturation is varied systematically. The focus is on two operational parameters (leakage depth and rate) and two hydrogeological parameters (horizontal permeability and porosity of the sand layer). Results are reported at the aquifer boundary and the monitoring well at the end of the simulation (Fig. 6). The goal is to investigate the change in concentration values from the time the plume enters the aquifer to the moment at which the contamination reaches the monitoring well.

Fig. 6

Concentration values reported at the end of simulation (year 500) at the locations of interest. Effect of changing leakage rate (a), leakage depth (b), sand horizontal permeability (c), and sand porosity (d). The concentration values are more sensitive to the operational parameters (i.e. Q and H) than to the hydrogeological parameters (i.e. k and φ)

Among scenarios with different leakage rates, only the case with a leakage rate of 0.1 m³/day shows concentration values below the threshold for all saturations and at both locations (Fig. 6a). When comparing the results for different saturations, it is observed that the graph for S ° w  = 0.3 shows slightly higher concentration values. As the leakage rate increases, the concentration at both locations increases, as one would expect.

Next we investigate the leakage depth (H). An increase in H results in lower concentrations at both the aquifer boundary and the monitoring well (Fig. 6b). For leakage depths between 5 and 15 m below the aquifer a decreasing trend is observed, but the rate of decrease declines significantly for depths below 18 m. Concentration values reaching the monitoring well from a leakage point 20 m below the aquifer are low. When comparing the concentration values to the threshold concentration, we observe that even for leakage depths around 20 m below the aquifer, the concentration reaching the aquifer boundary is approximately equal to the specified threshold; in the monitoring well, however, the concentration is on the safe side. The effect of water saturation in this case is also noticeable (Fig. 6b).

As shown in Fig. 6c, when the permeability ratio is fixed (κ = 0.1), the concentration at both the aquifer boundary and the monitoring well is inversely related to the sand horizontal permeability. Also, keeping the total volume constant and increasing the sand porosity results in increased pore volume. A larger pore volume in the sand medium provides more space for the plume to spread. For a fixed initial water saturation, a larger pore volume means a larger volume of water in the medium and a decreased concentration (Fig. 6d).

For the scenario investigated, results indicate that the geological parameters are less important than the operational parameters in controlling both the concentration values and the risk measure (Fig. 6c, d). We therefore elaborate on the effect of leakage depth on the results to determine the required well integrity precautions when performing such an operation in the field.

As inferred from Fig. 6b, for leakage depths of 10 m and greater, the concentration increases with the initial water content, with S ° w  = 0.3 showing the largest concentration values both for the aquifer boundary and the monitoring well. At depths of less than 10 m, however, the behavior changes, so that at a depth of 5 m, S ° w  = 0.1 shows the highest concentrations followed by S ° w  = 0.3 and S ° w  = 0.2 (Fig. 6b). The main reason for this discrepancy is attributed to the increased mobility of the plume in S ° w  = 0.3.

The well breakthrough curves for scenarios with changing H values are plotted in Fig. 7; the threshold concentration is highlighted by a solid red line. Breakthrough curves shown in Fig. 7 can be used to recommend an appropriate extension for the surface casing (i.e. the casing which starts from the top of the well bore and is meant to isolate fresh water formations from the well) by defining the critical depth associated with the worst case scenario. Here, when H = 20 m, concentration values are likely to be in the safe zone, so the critical depth can be set somewhere between 15 and 20 m below the aquifer.

Fig. 7

Temporal evolution of the concentration for different leakage depths (H = 5, 10, 15 and 20 m). Results illustrated for fixed water saturation (S ° w  = 0.3) and fixed anisotropy ratio (κ = 0.1). The concentration threshold is highlighted by the solid red line

Figure 8 shows the impact of the leakage depth on the Hazard Index. The largest HI values are observed for a leakage depth of 5 m, whereas at a depth of 10 m the values drop nearly five-fold. HI equals 1 at a leakage depth of around 18 m, so for the particular case of this analysis the critical depth is set at 18 m below the aquifer. As illustrated in Fig. 8, for a leakage depth of 5 m, the case of S ° w  = 0.3 shows the largest value, indicating that it is still the most critical water saturation, irrespective of the fact that the concentration value at the end of the simulation is higher for S ° w  = 0.2.
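One way to locate a critical depth from a small set of simulated leakage depths is to interpolate the HI values and find where they cross HI = 1. The sketch below does this with log-linear interpolation, a choice made here because HI is described as dropping by a multiplicative factor between depths; it is not necessarily how the critical depth was obtained in this study. The HI values in the example are invented placeholders and are not read from Fig. 8.

```python
import math


def critical_depth(depths_m, hi_values, hi_limit=1.0):
    """Depth at which HI first falls to hi_limit, or None if it never does.

    Interpolates log(HI) linearly between consecutive depths. Depths must
    be sorted in increasing order and all HI values must be positive.
    """
    pairs = list(zip(depths_m, hi_values))
    for (d0, h0), (d1, h1) in zip(pairs, pairs[1:]):
        if h0 >= hi_limit >= h1 and h0 != h1:
            frac = ((math.log(h0) - math.log(hi_limit))
                    / (math.log(h0) - math.log(h1)))
            return d0 + frac * (d1 - d0)
    return None


# Placeholder HI values for the four simulated depths (illustration only).
depths = [5.0, 10.0, 15.0, 20.0]
hi_placeholder = [30.0, 6.0, 2.0, 0.5]
print("interpolated critical depth (m):", critical_depth(depths, hi_placeholder))
```

A finer set of simulated depths between 15 and 20 m would reduce the dependence of the estimate on the interpolation rule.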

Fig. 8

Changes of hazard index (HI) of TEPA with leakage depth (H), shown for various water saturations. The solid gray line shows HI value of 1

5.3 Parametric uncertainty analysis and the stochastic characterization of risk

In this section, we quantify the uncertainty in the risk prediction. Uncertainty is attributed to the two sources previously discussed: hydrogeological and operational parameters. In order to choose appropriate PDFs for the porosity and permeability of the sandstone, one should consider how these parameters are distributed in the field. Data from well-log samples show that for sandstone and carbonate formations, the permeability distribution follows a log-normal pattern whereas the porosity tends to be normally distributed (Nelson 1994; Hohn 1999). Therefore, the permeability is presumed to be log-normally distributed. Porosity, however, is sampled from a truncated normal distribution to ensure that negative values are assigned zero probability of selection. Leakage depth and rate are assumed to follow a uniform distribution (Rish 2005).

To capture the effect of the aforementioned parameters and to make comparisons among different scenarios, the hydrogeological features of the aquifer are held constant. The focus is on the sand layer surrounding the leakage point. The parameters are varied within a Monte Carlo based framework, and the CDFs of three commonly used EPMs (de Barros et al. 2012) are evaluated at the environmentally sensitive location (i.e. the monitoring well). The EPMs of interest in this study are the chemical concentration, the source-to-target arrival time, and the HI (see Eq. 5). The analysis is performed with constant permeability ratio (κ = 0.1) and water saturation (S ° w  = 0.3) assumptions. Statistical distributions of the uncertain parameters are shown in Table 4. The Monte Carlo simulation is carried out for 1000 realizations.
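The sampling step of such a Monte Carlo framework can be sketched as follows, using only the Python standard library. The distribution families follow the text (log-normal permeability, truncated normal porosity, uniform leakage depth and rate), but every numerical parameter below is a placeholder because the Table 4 values are not reproduced here, and the dictionary keys are illustrative names. Each realization would then be passed to the flow and transport simulator, which is not shown.

```python
import math
import random

N_REALIZATIONS = 1000  # number of realizations, from the text


def sample_truncated_normal(rng, mean, std, low, high):
    """Rejection sampling: redraw until the value falls inside [low, high]."""
    while True:
        x = rng.gauss(mean, std)
        if low <= x <= high:
            return x


def draw_realization(rng):
    """Draw one set of uncertain inputs (all numeric parameters are placeholders)."""
    return {
        # Log-normal permeability with a placeholder median of 150 mD.
        "k_horizontal_md": rng.lognormvariate(math.log(150.0), 0.5),
        # Normal porosity truncated to [0, 1] so negative values cannot occur.
        "porosity": sample_truncated_normal(rng, 0.25, 0.05, 0.0, 1.0),
        # Uniform operational parameters over placeholder ranges.
        "leak_depth_m": rng.uniform(5.0, 20.0),
        "leak_rate_m3_per_day": rng.uniform(0.1, 1.0),
    }


rng = random.Random(42)  # fixed seed so the sketch is reproducible
realizations = [draw_realization(rng) for _ in range(N_REALIZATIONS)]
print(len(realizations), "realizations drawn; first one:", realizations[0])
```

Rejection sampling is used for the truncated normal because it needs no additional library; for truncation bounds far from the mean, as assumed here, very few draws are rejected.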

Table 4 Statistical distributions for uncertain parameters

Figure 9 shows the concentration CDF, denoted by F, for four distinct times at the monitoring well location. The threshold concentration for TEPA is depicted by the solid red line and is chosen to be 3 × 10⁻⁴ mg/l (Gordalla et al. 2013). The probability of exceedance (i.e. the probability of events in which the concentration is above the threshold) is defined as follows:

$$\xi = \Pr \left[ {C/C_{0} > C_{th} /C_{0} } \right] = 1 - \Pr \left[ {C/C_{0} \le C_{th} /C_{0} } \right] = 1 - F\left( {C_{th} /C_{0} } \right)$$
(20)

with C th denoting the threshold concentration, C 0 the initial concentration injected, and F the CDF of the normalized concentration. It takes more than 100 years for the concentration to reach the threshold value, so that for a period of less than 100 years, ξ equals zero. For longer time periods, however, C/C 0 takes on larger values, resulting in higher ξ.
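Eq. 20 can be evaluated directly from Monte Carlo output by replacing F with the empirical CDF of the simulated concentrations at a given time. The following is a minimal sketch under that assumption; the function name and the sample values are hypothetical and serve only to show the calculation.

```python
def exceedance_probability(normalized_concs, normalized_threshold):
    """Empirical version of Eq. 20: xi = 1 - F(C_th / C_0).

    normalized_concs holds one C/C_0 value per realization at a fixed time;
    F is estimated as the fraction of realizations at or below the threshold.
    """
    n = len(normalized_concs)
    if n == 0:
        raise ValueError("at least one realization is required")
    n_at_or_below = sum(1 for c in normalized_concs if c <= normalized_threshold)
    return 1.0 - n_at_or_below / n


# Hypothetical normalized concentrations from four realizations.
samples = [0.0, 1.0e-6, 2.0e-6, 5.0e-6]
xi = exceedance_probability(samples, normalized_threshold=1.5e-6)
print("exceedance probability:", xi)  # two of four samples exceed -> 0.5
```

Applying the same function to the simulated concentrations at several output times gives the time evolution of ξ.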

Fig. 9

Concentration CDF (denoted by F) at the monitoring well at different times. The concentration threshold is depicted by the solid red line

The exceedance probabilities for 100, 200, and 300 years are 0, 0.15, and 0.27, respectively. The highest exceedance probability (i.e. ξ = 0.46) is recorded for the end of simulation (year 500).

Next, we investigate the CDF of the source-to-receptor arrival time (Fig. 10). In this research, the arrival time is defined as the earliest time at which the simulated concentration in the monitoring well takes on a value equal to or greater than the chemical threshold. For the scenarios investigated, it takes 276 years on average for the concentration at the monitoring well to reach the threshold, which is a considerable delay.
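The arrival-time definition above translates into a simple scan of each simulated breakthrough curve. The sketch below assumes the curve is available as paired lists of output times and concentrations; the function name and the example curves are hypothetical. No interpolation between output times is attempted, so the result is limited by the output time resolution.

```python
def arrival_time(times_years, concentrations, threshold):
    """Earliest output time at which the concentration is >= threshold.

    Returns None when the threshold is never reached within the simulated
    period, so that such realizations can be handled explicitly.
    """
    for t, c in zip(times_years, concentrations):
        if c >= threshold:
            return t
    return None


# Hypothetical breakthrough curves (mg/l) at the monitoring well.
times = [0, 100, 200, 300, 400, 500]
curve_a = [0.0, 1.0e-4, 3.0e-4, 5.0e-4, 4.0e-4, 2.0e-4]
curve_b = [0.0, 0.0, 5.0e-5, 1.0e-4, 1.5e-4, 2.0e-4]

print(arrival_time(times, curve_a, threshold=3.0e-4))  # 200
print(arrival_time(times, curve_b, threshold=3.0e-4))  # None
```

How realizations that never reach the threshold enter an average arrival time is a modeling choice; this sketch only flags them with None.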

Fig. 10

CDF of the arrival time for the threshold concentration. Results illustrated for fixed water saturation and anisotropy ratio (S ° w  = 0.3, κ = 0.1). The CDF starts from 100 years, the first time at which the concentration threshold (C th ) is observed in the monitoring well

The CDF of HI (Eq. 5), the last EPM of this study, is plotted in Fig. 11. HI values in the figure are highlighted for the percentiles of interest in risk assessment (USEPA 2001). The probability of remaining in the safe zone, i.e. Pr (HI ≤ 1), is 0.55. By setting the critical HI at 1, one could use the CDF to answer questions such as: What are the safe margins and intervals for uncertain parameters for the specific setup discussed? How do operational parameters in hydraulic fracturing affect the risk?

Fig. 11

Hazard index (HI) CDF, denoted by F, plotted for fixed water saturation (S ° w  = 0.3) and fixed anisotropy ratio (κ = 0.1). The risk level of concern (dashed red line) and percentiles of interest in risk assessment (solid and dashed gray line) are highlighted

Some limitations of the numerical simulations and models of this study should be noted. First, the physical model is isothermal and does not take into account the property changes associated with temperature (e.g. density). Second, the permeability field is anisotropic and homogeneous, but for more holistic studies, spatial uncertainty of the geological medium’s properties must be taken into consideration, because permeability heterogeneity affects health risk metrics and corresponding uncertainties (Maxwell and Kastenberg 1999; de Barros and Rubin 2008). Third, the transport model considered in our study is non-reactive. Models need to account for chemical adsorption on the rock surface of the porous media and other reactions (Atchley et al. 2013), since these can affect the risk and corresponding EPMs. Fourth, the plume is propagated within a uniform stratification throughout the simulation, whereas in reality significant geological processes might occur over the course of 500 years. Finally, more realistic results can be obtained by incorporating more uncertain parameters (e.g. health related ones, which are assumed to be constant for this purpose).

6 Conclusion

The novelty of this work includes addressing the risk of contamination due to a specific failure scenario (i.e. for a specific geological setup) in a hydraulic fracturing process. Different environmental performance metrics such as the source-to-receptor arrival times, chemical concentrations, and the hazards to human health are used within a 3D anisotropic variably saturated model. The 3D multiphase simulation honors both the saturated groundwater medium and the under-saturated condition commonly observed in geological settings beneath the saturated layer. For the assumed failure scenario of this study (i.e. leakage from the casing during injection) and for the specific geological conditions and configurations discussed, the results indicate that initial water saturation (retained water) and the anisotropy of the permeability field in the sand formation have a clear effect on the spread of a contamination plume. The results of the sensitivity analysis further illustrate that, when dealing with a point source leakage on the well casing, the operational parameters (leakage depth and rate) have a stronger influence on the risk magnitude than the geological features, with 18 m below the aquifer identified as the critical depth. It is worth mentioning that several states in the U.S. have mandatory surface casing standards with which oil and gas operators must comply. As previously mentioned, the permeability field used in this study is anisotropic and homogeneous; however, in order to account for more realistic conditions, the effects of heterogeneity and spatial uncertainty in geological formations, as well as the inhomogeneities of reservoir rocks, should be incorporated. Such factors include faults, natural fractures, fissures, channeling, and anisotropy. The presence of fast flow conduits can lead to earlier solute breakthrough and higher concentration values, thus augmenting the corresponding risks. In this case, the geological and reservoir parameters will likely play a more significant role when compared to the operational ones. The effects of the heterogeneity of the geological parameters on the risk (and corresponding uncertainties) are the subject of future work.

Additionally, CDF plots constructed for different EPMs are useful tools for decision makers when conducting risk assessment studies for hydraulic fracturing failure scenarios similar to those investigated herein. Working with real data from hydraulic fracturing operations is imperative in conducting similar studies. A promising strategy to better mitigate the risk and improve our understanding is to record and collect data on incidents and accidents from sites all around the world. Root-cause analyses should be performed, and existing guidelines and regulations should be improved, in order to protect human health and the environment.

It should also be noted that the outlined concluding remarks are specifically related to the failure scenario, conceptual model, and geological settings discussed in this study. The results presented here should not be generalized to other failure scenarios. As previously mentioned, different geological settings, injection modes and boundary conditions can lead to different results. However, we emphasize that the risk framework adopted in this work can be modified for subsurface injection operations for regions with more complex geological settings and operational parameters.