Introduction

Selenium (Se) is a naturally occurring element in soils and bedrock that often leaches into surrounding water bodies as a result of agricultural and mining development, as occurred at the Kesterson Reservoir in California, USA (Schuler et al. 1990), the Stillwater Wildlife Refuge in Nevada, USA (Tuttle et al. 2000), and the Blackfoot River of southeast Idaho, USA (Tetra Tech 2002; USFS 2009; IDEQ 2009). In general, wildlife and livestock are more sensitive to Se in their water supply than are humans, as evidenced by the difference between the aquatic and drinking-water standards (5 and 50 μg/l, respectively; IDEQ 2009).

Livestock deaths near historic phosphate mines (Newfields 2005) raised concern about Se contamination resulting from mining of the Permian Phosphoria Formation in the Blackfoot watershed. Twelve major mines have operated within the watershed since 1906 (Fig. 1); investigations under the Comprehensive Environmental Response, Compensation and Liability Act of 1980 (CERCLA) have commenced at five of those mines: the Enoch Valley, Ballard, Conda, North Maybe, and South Maybe Mines (BLM 2011).

Fig. 1

Blackfoot watershed showing the subwatersheds, rivers, Wells formation outcrops, alluvium, springs, 303(d) water-quality monitoring sites on rivers and streams, flow measuring points along the river, mines, and conceptual flow direction

Se concentrations have exceeded aquatic standards in the Blackfoot River during both snowmelt and baseflow for years (IDEQ 2009). Exceedances during snowmelt are primarily due to runoff leaching Se from the soils. High Se concentrations during baseflow represent groundwater quality because baseflow is primarily groundwater discharge from the regional aquifer, the Pennsylvanian Wells Formation, which underlies the entire watershed (Ralston and Williams 1979; Ralston et al. 1977 and 1979, 1983; Mayo et al. 1985), and from the alluvium near the river. Mining creates pathways for both surface (Newfields 2005; Knudsen and Gunter 2004) and groundwater Se contamination to reach the river.

Remediating abandoned mines should decrease the Se-laden runoff (Mars and Crowley 2003) if it decreases the contact time with seleniferous waste. Even if the runoff ceased immediately, however, Se in the groundwater would continue to discharge to the river because of continuing seepage through the mine wastes, which is much more difficult to eliminate, and the long groundwater flow path to the river (Winter 1980). In a watershed with many contaminant sources, it is necessary to prioritize the expenditure of remediation funds on the mines whose remediation would lower Se concentrations in the river most quickly.

The goal of this study is to provide a tool that can simulate Se transport to the river during baseflow and to demonstrate its use in considering remediation of different mines. The study includes development of conceptual flow and transport models for the Blackfoot watershed (Fig. 1) and a numerical model that implements them. It demonstrates the use of the model to prioritize among three potential remediation scenarios.

Method of analysis

Study area

The Blackfoot watershed lies in southeast Idaho. It heads in the Webster Range which forms a divide between the Blackfoot watershed and the Salt River watershed. The Blackfoot watershed has four main subwatersheds—the upper and lower Blackfoot, Slug and Dry Creek Valleys, and Diamond Creek (Fig. 1), which may more appropriately be referred to as the Upper Valley (Ralston and Williams 1979; Ralston et al. 1977 and 1979). The Blackfoot River effectively begins at the exit from Upper Valley where Diamond Creek and Lanes Creek converge (Fig. 1).

The climate of the watershed is marked by cold winters and moderately warm summers (Ralston et al. 1983). Precipitation varies from less than 25 cm in the low elevations just west of the watershed to more than 90 cm on the ridge tops with the majority falling from October through March as snow (Ralston et al. 1983). Snowmelt occurs from April through June, depending on elevation.

The underlying geologic structure of the Upper Blackfoot River basin results from thrusting associated with the Bannock Thrust Zone, which produced synclinal-anticlinal folds and faulting of primarily sedimentary rock (Mayo et al. 1985; Cannon 1980; Ralston and Williams 1979). Bedrock outcrops range from Pennsylvanian to Triassic in age (Hein 2004; Johnson and Raines 1996; Bond and Wood 1978). The oldest, and deepest, formation considered herein is the up-to-800-m-thick Wells Formation (Fig. 2). The Wells Formation consists of the Grandeur Tongue and the Upper and Lower Members of the Wells Formation and is the primary regional aquifer in the Blackfoot watershed (Fig. 1). The aquifer system within the thrust block is conceptually a bowl, bounded beneath by a low-angle thrust fault, with discharge to interior springs and streams and to major springs along its edge. The lithology of the Wells Formation is sandstone, limestone, and dolomite, with more dolomite and limestone at both the top and bottom of the aquifer (Cannon 1980; Winter 1980).

Fig. 2

Cross section A–A′ from Fig. 1 showing stratigraphy and conceptual flow (blue arrows). Qal is alluvium, Trd is Thaynes formation, PPwu is Wells Formation, Mb is Brazer Limestone, and Mn is Lodgepole Limestone. Adapted and simplified from Winter (1980)

Overlying the Wells Formation is the Phosphoria Formation. The Phosphoria Formation includes the Meade Peak and Rex chert members, which are primarily phosphate-bearing mudstone, and the Center Waste Shale (Hein 2004). The Center Waste Shale contains an average of 65 ppm Se, with samples containing up to 1,040 ppm, which is described as exceptional compared to a worldwide average for shale (Herring and Grauch 2004). The Meade Peak member is also an effective aquitard that separates the overlying local surface aquifers from the underlying Wells Formation (Ralston and Williams 1979) and prevents Se from leaching to the Wells Formation until mining breaches it. Throughout the watershed, Quaternary alluvial aquifers bound the river and streams, extending up to 2 km from the streams (Figs. 1 and 2). They are generally less than 50 m thick (Winter 1980).

Conceptual flow model

Groundwater discharge from two hydrogeologic formations, alluvium and the Wells Formation, controls the baseflow in the Blackfoot River. The Wells Formation receives distributed recharge and streamflow seepage on outcrops throughout the watershed (Figs. 1 and 2). Groundwater flows from recharge on the outcrops to discharge into basin fill and directly to the river and its tributaries. The alluvium receives recharge as stream seepage, mountain-front recharge, and inflow from the Wells Formation, and discharges to the Blackfoot River and its tributaries. Overall flow is from the Wells Formation outcrops and stream percolation toward the northwest, where it discharges into springs and the Blackfoot River or flows out of the study area toward the Blackfoot Reservoir (Figs. 1 and 2).

Most data regarding hydraulic conductivity (K) of the Wells Formation reside in mine proposal environmental documents. Whetstone (2009) assimilated most of the available Wells Formation aquifer test data and found a range from 0.1 to 3.0 m/day with a significant horizontal anisotropy. The Wells Formation has very low primary porosity and permeability but is fractured by folding and faulting throughout, so that fracture flow dominates over the domain (Ralston et al. 1983). Winter (1980) indicated that the Wells Formation should have a higher K near the top of anticlines, where the formation is in tension and fractures are therefore wider, and where outcrops have been weathered, than at depth, where it is compressed and unweathered. Because fractures are prevalent throughout, the system may be treated as an equivalent porous medium.

The primary sources of flow data in the watershed include a flow gaging station located on the Blackfoot River above the Reservoir near Henry, Idaho (No. 13063000) and spot flow measurements collected for annual water-quality assessments, as required by section 303(d) of the Clean Water Act (IDEQ 2009; Fig. 1). During most years since 1914, the gage has operated only during the irrigation season from April through September. The contributing area is 862 km2 including irrigation diversions for 18.2 km2.

The maximum monthly flow, which usually occurs in May, exceeds the April value by 70 % and the June value by 80 % (Fig. 3a). Average flows are less than 2.4 m3/s from August through March, indicating that once spring runoff ends, baseflow ensues as the flow becomes relatively constant. This is also evidenced by the standard deviation of monthly flows (Fig. 3a). Monthly flows during the low-flow months have also decreased with time, with the 2001–2010 period being lower than previous periods (Fig. 3b). The year 1977 was one of the lowest flow years of the century (Winter 1980), but the drought was short-lived. During August and September 1977, daily flows varied from about 0.57 to 0.93 m3/s, less than half of the monthly mean (Fig. 3). However, some daily flows in August and September during 2001–2003 and again in 2007 dropped to less than half of the range observed during August and September of 1977 (Fig. 3b), which could reflect increased irrigation and/or the effects of a longer drought. The return flow through the alluvial groundwater from irrigation recharge has a short lag time, based on low flow in the months following the irrigation periods, and does not significantly support the baseflow during the autumn, in contrast to the findings of Kendy and Bredehoeft (2006) in the much larger Gallatin drainage in Montana.

Fig. 3

a Monthly flow average and standard deviation and b average May and September flows, for gage 13063000, Blackfoot River, above the Blackfoot Reservoir near Henry

During September 1999, the only year with sufficient 303(d) measurements during baseflow, the flow exiting Upper Valley was about 41 % of that exiting the entire watershed. The increase below Upper Blackfoot River and Diamond Creek valleys was inflow, both groundwater and baseflow from small perennial streams, including Rasmussen, Dry, Wooley, and Slug creeks in the lower valleys (Fig. 1). The Upper Valley generates about 80 % of the spring runoff, more during dry years, as evidenced by flow during 2007. Within the Upper Valley, both Diamond Creek and a short tributary named Spring Creek (not shown) provide approximately equal proportions.

Discharge from the Wells Formation also includes springs on the edge of the Meade thrust block allochthon (Mayo et al. 1985) and internal to the watershed (Ralston et al. 1983; Fig. 4). Seven springs along the thrust fault that delineates the west edge of the allochthon, just west of the Aspen Range (Fig. 4) discharge approximately 0.99 m3/s of warm water (Ralston et al. 1983). Four other springs discharge approximately 1.02 m3/s of cool water near a cluster of faults just south of the Aspen Range, near Georgetown Canyon (Fig. 4); this flow likely emanates primarily from recharge in the 210.0 km2 Georgetown Creek watershed (IDEQ 2007). Thirteen springs discharge warm water north of the Blackfoot watershed (Fig. 4), suggesting a long flow path from within the Blackfoot watershed. Six other springs discharge southeast from the watershed into Crow Creek, located outside the watershed east of the Webster Range (Fig. 4) from a Wells Formation outcrop that slopes under the topographic divide (Ralston 1979; Ralston et al. 1977).

Fig. 4

Springflow measurements (Ralston et al. 1983) and faults

The total discharge from the Wells Formation, including the Blackfoot River, Georgetown Canyon, and springs west of the Aspen Range, is 4.27 m3/s. The average recharge is 0.13 m/year over a 1,002 km2 area, assuming that baseflow represents average annual groundwater recharge, a valid assumption in a regional-scale system (Cherkauer 2004) dominated by spring snowmelt and baseflow. This exceeds the 0.05 to 0.10 m/year range found by Ralston et al. (1977) for several valleys internal to the watershed, but that study ignored flow from those basins to formation-bounding springs. The recharge exceeds 25 % of the average annual precipitation at nearby stations located at Soda Springs, Conda, and Henry (40, 48, and 52 cm/year, respectively) (Desert Research Institute Climate Center 2010).
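The recharge estimate above is a unit conversion from discharge to depth over the contributing area. The following is a minimal sketch of that arithmetic, assuming a 365.25-day year; the function and variable names are illustrative and do not come from the study.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length


def recharge_depth_m_per_year(discharge_m3_s, area_km2):
    """Convert a steady discharge into an equivalent recharge depth over an area."""
    area_m2 = area_km2 * 1.0e6
    return discharge_m3_s * SECONDS_PER_YEAR / area_m2


# Values quoted in the text: 4.27 m3/s over 1,002 km2
recharge = recharge_depth_m_per_year(4.27, 1002.0)
print(f"recharge = {recharge:.3f} m/year")

# Recharge as a fraction of mean annual precipitation (cm/year) at each station
precip_cm = {"Soda Springs": 40, "Conda": 48, "Henry": 52}
fractions = {name: recharge / (p / 100.0) for name, p in precip_cm.items()}
for name, frac in fractions.items():
    print(f"{name}: {100 * frac:.0f} % of precipitation")
```

Under these assumptions the result is close to the 0.13 m/year quoted in the text, and the ratio to precipitation stays above 25 % at all three stations.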

Conceptual transport model

Two processes control the Se loading to the Blackfoot River. First, runoff erodes Se from the ground surface and rapidly transports it to the streams. The most severe Se contamination is associated with watersheds that have large mine dumps that tend to obstruct stream flow and also have larger gradients and little streambank storage (Mars and Crowley 2003). Second, groundwater discharge includes Se derived both from seepage through the Phosphoria Formation and Se recharged from runoff. Phosphate mining increases these sources by disturbing the ground, thereby making it more erosive, and by increasing direct recharge through the waste to the Wells Formation (Newfields 2005; Knudsen and Gunter 2004).

With seasonal variation, both low and high Se concentrations in the Blackfoot River have trended upward since 2001, the first year in which occasional Se samples were collected (Fig. 5a). The higher concentrations, which occur during May, are double the aquatic Se standard. The lower concentrations, which occur during baseflow both in March and July–October, have increased through the period, almost doubling between 2006 and 2009, and most recently have ranged from 1 to 2 μg/l. This trend suggests an increase in Se sources affecting groundwater, which would take much longer to manifest because of long flow paths, although fractures and faults, mapped (Fig. 4) and unmapped, could reduce the transport time or change the concentration as compared to transport in unfractured formations in portions of the watershed (Nordqvist et al. 1996). River concentrations could also fluctuate as different parts of the heterogeneous alluvium contribute load to the river (Osiensky et al. 1984).
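Because the discussion moves between concentration and load, it may help to see how the two relate. The sketch below converts a flow and a concentration into a daily Se load. The 2.0 m3/s flow is an illustrative baseflow value chosen for the example, not a measurement from the study; the 1 and 2 μg/l concentrations are the recent baseflow range quoted above.

```python
def se_load_kg_per_day(flow_m3_s, conc_ug_l):
    """Daily mass load from a flow (m3/s) and a concentration (ug/l)."""
    litres_per_day = flow_m3_s * 1000.0 * 86400.0
    return litres_per_day * conc_ug_l * 1.0e-9  # micrograms to kilograms


ILLUSTRATIVE_BASEFLOW = 2.0  # m3/s, assumed for this example only

low_load = se_load_kg_per_day(ILLUSTRATIVE_BASEFLOW, 1.0)
high_load = se_load_kg_per_day(ILLUSTRATIVE_BASEFLOW, 2.0)
print(f"{low_load:.3f} to {high_load:.3f} kg/day")
```

At a fixed flow the load scales linearly with concentration, so a doubling of baseflow concentration implies a doubling of the groundwater-derived load.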

Fig. 5

a Se concentration with time for gage 13063000 and b for select mine dump seeps. See Fig. 1 for the location of mines

Dozens of phosphate mines have been constructed through time in the Blackfoot watershed since 1906, with the Georgetown Canyon Mine being the first (Lee 2000), thereby increasing the potential Se sources (Mayo et al. 1985; Winter 1980; Ralston and Williams 1979; Herring and Grauch 2004; Tetra Tech 2002). Contaminants move through groundwater by advection, dispersion, and diffusion toward the sinks, primarily the rivers and springs of the watershed.

Se can exist in multiple oxidation states (−2, 0, +4, and +6) with different geochemical characteristics affecting fate and transport (Herring and Grauch 2004; Newfields Inc 2005). Selenate is the most mobile state, but if reduced, precipitation or adsorption to small charged soil particles can attenuate transport (Drever 1997). In the Wells Formation, the upper several hundred meters have oxidizing conditions, but at depth Se species may be reduced causing attenuation of any Se transport to those levels (Newfields Inc 2005).

Tributaries below many mines have elevated Se concentration (Fig. 6), with the highest levels occurring during the runoff period (IDEQ 2009). Most high concentrations occur downstream of Diamond Valley and in all tributary valleys that contain phosphate mines. Other measurements are of springs and seeps directly below the mines (Fig. 5b).

Fig. 6

Distribution of Se concentration in seeps and springs from around the Blackfoot watershed, with data collected generally in the spring as part of Idaho water quality monitoring (IDEQ 2009). The pink shaded areas are Wells Formation outcrops. See Fig. 1 legends for mine names and descriptions for other features

Mine seepage concentrations are best represented by the seeps and groundwater within the backfill (BLM 2011; Formation Environmental 2010), as shown in Table 1, which lists the average and standard deviation of mine seep concentrations observed throughout the study area. The range (not shown) includes some very low values, which together with the high standard deviations demonstrate the high variability of conditions within mine waste backfill. Mine seep concentrations may be trending upward with time, based on data from the Ballard Mine Seep and Enoch Valley (Fig. 5b).

Table 1 Average of Se concentration from seeps underlying dumps and overburden piles compiled from various sources (JBR 2006; Tetra Tech 2002; MWH 2010; IDEQ 2009; Formation Environmental 2010). SD standard deviation

Numerical flow and transport model

Long-term forecasts and planning require a flow and transport model to estimate discharge and Se load during baseflow to the Blackfoot River and various tributaries. MODFLOW-2000 (McDonald and Harbaugh 1988; Harbaugh et al. 2000) was used to simulate flow and MT3D (Zheng and Wang 1999), which uses an existing MODFLOW flow solution, was used to simulate Se transport.

The model domain (Fig. 7) is the Wells Formation and connected alluvium within the Upper Blackfoot watershed, extending over the watershed boundary to coincide with the Meade thrust block, or allochthon, west of the Aspen Range. The boundary is also west of the Webster Ridge topographic divide to coincide with the westernmost Wells Formation outcrop on the Webster Ridge (MWH 2010; Cannon 1980).

Fig. 7

Groundwater model domain showing the Blackfoot watershed and subwatersheds (black line), Wells Formation outcrop (pink area) and flux boundary conditions. GHB general head boundary. R reach number

The grid consists exclusively of 152-m square cells to implement mine seepage as recharge through one or more cells. The grid is rotated 22.8° to the northwest to parallel the main axis of the tributary valleys. The model has four layers that represent the cross-section as illustrated in Fig. 2. The top layer includes Wells Formation outcrops, alluvial fill around the river and tributaries, and undifferentiated bedrock. Layer 1 has a variable thickness equal to the difference between the ground-surface elevation, determined with 30-m digital-elevation models, and the top of the Wells Formation. If the Wells Formation outcrops at the location, the layer 1 thickness is 30.5 m. The depth to the top of the Wells Formation was estimated from water well logs and from the various published geologic cross-sections (Winter 1980). Layers 2, 3, and 4 simulating the Wells Formation, or undifferentiated bedrock, were set 30.5, 152 and 460 m thick, respectively, with adjustments to avoid significant offsets in steep areas, to facilitate the simulation of vertical flow.
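The layering rules above can be expressed compactly. The following sketch computes layer bottom elevations for a single cell from the stated thicknesses. It is an illustration only: the elevations are hypothetical, the function names are not from the study, and the adjustments made in steep areas are not represented.

```python
OUTCROP_LAYER1_THICKNESS_M = 30.5
WELLS_LAYER_THICKNESSES_M = [30.5, 152.0, 460.0]  # layers 2, 3, and 4


def layer1_thickness(ground_elev_m, wells_top_elev_m, wells_outcrops):
    """Layer 1 spans ground surface to top of Wells, or 30.5 m on outcrops."""
    if wells_outcrops:
        return OUTCROP_LAYER1_THICKNESS_M
    return ground_elev_m - wells_top_elev_m


def layer_bottoms(ground_elev_m, wells_top_elev_m, wells_outcrops):
    """Return bottom elevations of layers 1 to 4 for one cell."""
    first = layer1_thickness(ground_elev_m, wells_top_elev_m, wells_outcrops)
    bottoms = []
    elev = ground_elev_m
    for thickness in [first] + WELLS_LAYER_THICKNESSES_M:
        elev -= thickness
        bottoms.append(elev)
    return bottoms


# Hypothetical cell: ground at 2,000 m, top of Wells Formation at 1,900 m
covered = layer_bottoms(2000.0, 1900.0, wells_outcrops=False)
# Hypothetical outcrop cell at the same ground elevation
outcrop = layer_bottoms(2000.0, 2000.0, wells_outcrops=True)
print(covered)
print(outcrop)
```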

This model used head-controlled flux boundaries (Anderson and Woessner 1992), including river boundaries to simulate an interchange of water between the river and aquifer, drain boundaries for springs and river tributaries, and general-head boundaries (GHBs) for flow to the northwest (Fig. 7). Boundary reaches were chosen to be the discharge zones from the aquifer so that the flux and Se discharge rates would represent flow or transport from a tributary basin (Lemly 1999). The total discharge into the river is 2.27 m3/s, partitioned so that 41 % of the discharge is to the river in the Upper Valley, divided equally between Diamond Creek and the river, based on the flow measurements discussed above. Below the Upper Valley, the remaining 59 % of the discharge was divided among river reaches based on Wells outcrops, the narrow canyon section 1 km downstream from the confluence of the Blackfoot River and Diamond Creek, and the tributaries, so that three river reaches and tributaries each received 0.06 m3/s and 1.0 m3/s discharged into the longer reach 11 (Fig. 7). Simulated discharge should be slightly higher than the targeted values to account for groundwater supporting riparian evapotranspiration (ET). Springs discharging from the Wells Formation have targeted flows as specified in Ralston et al. (1983; Fig. 4). The GHB that simulates groundwater flowing north toward the Blackfoot Reservoir does not have a targeted flux.
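The first level of the flux partition described above is simple arithmetic, sketched here for reference. Only the 41 %/59 % split and the equal division within the Upper Valley are reproduced; the allocation among individual lower reaches is not, and the variable names are illustrative.

```python
TOTAL_RIVER_DISCHARGE = 2.27   # m3/s, from the text
UPPER_VALLEY_FRACTION = 0.41   # share exiting the Upper Valley
SECONDS_PER_DAY = 86400.0

upper_valley = TOTAL_RIVER_DISCHARGE * UPPER_VALLEY_FRACTION
diamond_creek_target = upper_valley / 2.0    # equal split within Upper Valley
upper_river_target = upper_valley / 2.0
lower_reaches_total = TOTAL_RIVER_DISCHARGE - upper_valley

# Model fluxes are reported in m3/day, so targets are also converted
reach11_target_m3_day = 1.0 * SECONDS_PER_DAY

print(f"Upper Valley: {upper_valley:.2f} m3/s")
print(f"Below Upper Valley: {lower_reaches_total:.2f} m3/s")
print(f"Reach 11 target: {reach11_target_m3_day:.0f} m3/day")
```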

Recharge averaging 0.135 m/year provides the entire model flux as specified flux to the domain. The recharge distribution depends on the precipitation, soils, and geology (Dribbs et al. 2006; Flint et al. 2004; Stone et al. 2001). Wells Formation outcrops and alluvium are receptive to seepage and recharge, while precipitation mostly runs off the other rock outcrops and flows across and recharges Wells outcrops or alluvium.

Recharge can be estimated by assuming the recharge to the contributing area above a point equals the baseflow discharge to that point. The strategy used to distribute total recharge around the model domain is to set recharge zones based on the three outcrop types, broken into zones by mountain range or valley (Fig. 8). The rates were set by calibrating so that boundary fluxes best match their target fluxes; this method has been shown to substantially reduce the error in estimating flux in similar-sized basins (Juckem et al. 2006). Recharge through undifferentiated rock was set equal to 0.005 m/year based on the low rate expected through the Meade Peak aquitard.

Fig. 8

Groundwater model recharge zones and rates. White areas are the mines; see Fig. 1

The model simulates flow through three primary units (the Wells Formation, alluvium, and undifferentiated rock) modeled as equivalent porous media using parameter zones (Anderson and Woessner 1992). The Wells Formation outcrops (Figs. 7 and 9a) and all of layers 2–4 consist of Wells Formation K zones (Fig. 9b and c); layer 4 consists of just one K zone with a conductivity of 0.09 m/day. Some K zones have values higher than found by Whetstone (2009), discussed above, due to scale effects; the K of larger areas is often higher than the values determined using pump tests (Schulze-Makuch et al. 1999). Undifferentiated rock primarily represents the Phosphoria or younger formations (Ralston 1979), which are above the Wells Formation in layer 1 (Fig. 9a). Alluvial zones were subdivided by subwatershed (Figs. 1 and 7). Wells Formation K zones correspond with anticlines (where K should be higher), synclines (where K should be lower), outcrops (where K should be higher due to weathering), and intermediate zones (Fig. 2). Wells Formation K also reflects an expected depth decay of the fractures, with fractures becoming smaller with depth (Belcher 2004), mostly between layers 1 and 2 and between layers 3 and 4. Final Kh and Kv values were determined as part of steady-state calibration.

Fig. 9

Calibrated hydraulic conductivity values by zone for a layer 1, b layer 2, and c layer 3

Faults can be a flow barrier or conduit (Caine et al. 1996); fault conductance (Harbaugh et al. 2000) across the various mapped faults (Fig. 4) was set as part of calibration, with lower values set at thrust faults between bedrock and fill due to a likely significant fining of the fault core (Caine et al. 1996). Faults were simulated as flow conduits where indicated by spring flow (Georgetown Canyon) by increasing the K of the cells adjacent to the fault, a method which could help to direct the flow to the springs where necessary (Dettinger et al. 1995; Ralston and Williams 1979).

Model calibration

Steady-state calibration involved adjusting the recharge, K, and boundary flux conductance so that the simulated heads matched the observed heads and the simulated fluxes at various boundaries matched the observed or estimated fluxes. Target fluxes were measured and estimated flows, and target heads were the static water levels observed during well completion for wells in the Wells Formation and alluvium. This assumes that static water levels approximate steady-state conditions, which is accurate in basins without substantial groundwater development (Myers 2009). Depth to water at two mine pits and piezometer observations near Diamond Creek (Ralston et al. 1977 and 1979) were also used for calibration. Head-dependent flux boundaries also control the head at those points. Because calibration data are limited, qualitative techniques described by ASTM (1998) were also used, including checking that the simulated potentiometric surface agrees with the conceptual model and that the simulated vertical exchange of groundwater among layers is reasonable.

Calibration was completed in three stages. First, recharge rates among the various recharge zones (Fig. 8) were adjusted to match target fluxes. Second, using the calibrated recharge rates, K and conductance were calibrated using both trial and error and automated routines, including sensitivity analysis within MODFLOW-2000 (Harbaugh et al. 2000), to match steady-state head while maintaining the flux values found while calibrating recharge. MODFLOW-2000 calculates a composite scaled sensitivity for each parameter, which reflects the total information about the parameter available in the observations; see Hill and Tiedeman (2007) for a description of how the value is calculated. Third, storage coefficients were estimated by simulating well drillers' pump tests, assuming that pumping and recovery lasted for 0.1 and 0.9 days, respectively, so that average simulated drawdown was about half the measured drawdown in the well. This adjustment accounts for the effect of averaging head over the cell area: the pump tests are very short term, and the drawdown at a well would be substantially greater than the average over the cell. All annual recharge was assumed to occur within a 90-day seasonal period, an assumption which implicitly accounts for the soil-water balance by season, in the spirit of Jyrkama et al. (2002). The calibration target was for seasonal fluctuation in the Wells outcrop areas to be less than 9 m and for river and drain fluxes to fluctuate by no more than plus or minus 10 %, as observed at the gaging station (Fig. 3).
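For readers unfamiliar with the composite scaled sensitivity mentioned above, the sketch below follows the general form given by Hill and Tiedeman (2007): each sensitivity is scaled by the parameter value and the square root of the observation weight, and the scaled values are combined as a root mean square over the observations. The input numbers are invented for illustration and are not from the Blackfoot model.

```python
import math


def composite_scaled_sensitivity(derivatives, param_value, weights):
    """Root mean square of dimensionless scaled sensitivities for one parameter.

    derivatives: d(simulated value i)/d(parameter), one per observation
    param_value: current value of the parameter
    weights: observation weights, one per observation
    """
    n_obs = len(derivatives)
    total = 0.0
    for deriv, weight in zip(derivatives, weights):
        scaled = deriv * param_value * math.sqrt(weight)
        total += scaled ** 2
    return math.sqrt(total / n_obs)


# Invented example: two observations, one of which is insensitive
css = composite_scaled_sensitivity([2.0, 0.0], 3.0, [1.0, 4.0])
print(round(css, 4))
```

A parameter with a larger value of this statistic is better constrained by the available observations, which is how the recharge zones are ranked in the calibration discussion.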

Contaminant transport modeling

Mining introduces contaminants to the aquifer as seepage, which may be simulated as a new recharge boundary with a contaminant concentration, with recharge zones at and below the mines (Figs. 1 and 8). The contaminant, Se, is treated as conservative, an assumption justified by the fact that selenate (measured as Se6+) is by far the most common Se species in groundwater samples from the Wells Formation (Newfields Inc 2005). Simulating the seepage starts with an accurate transient flow simulation of the overall flow in the system and the new mine seepage (Bredehoeft and Pinder 1973). Therefore, simulating new Se sources involves two steps: a flow simulation of the new seepage and a transport simulation with the contaminants introduced with the new seepage. For the comparison of remediation scenarios, the mine seeps are the only Se source in the watershed, an assumption justified by the naturally low Se concentrations at both springs and wells (Newfields 2005).

Dispersion coefficients can be difficult to estimate because they depend on the scale of the modeling (Anderson and Woessner 1992). Two models prepared for minesites in the area had set longitudinal, transverse, and vertical dispersivity in the Wells Formation as 20, 6 and 2 m, respectively (Myers 2007) and 30, 10, and 3 m, respectively (JBR 2007). Fetter (2001) recommends an equation derived by Xu and Eckstein (1995) for relating the apparent longitudinal dynamic dispersivity to the length of the flow path. For a flow path equal to the cell length used in this model, the Xu and Eckstein equation yielded a longitudinal dispersivity equal to 2.7 m. The transverse and vertical dispersivity values equal 0.2 and 0.1 times the longitudinal values (Schulze-Makuch et al. 1999). Dispersivity was verified by comparing the simulated river concentration to the 2010 303(d) concentrations and by completing a sensitivity analysis using the higher values.
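The dispersivity set used in the model follows directly from the stated ratios. The sketch below derives it and lists the two earlier site models alongside for comparison. Treating those earlier sets as the "higher values" of the sensitivity analysis is an assumption of this example, and the names are illustrative.

```python
ALPHA_L = 2.7  # longitudinal dispersivity (m), from the text


def dispersivity_set(alpha_l, transverse_ratio=0.2, vertical_ratio=0.1):
    """Return (longitudinal, transverse, vertical) dispersivity in metres."""
    return (alpha_l, alpha_l * transverse_ratio, alpha_l * vertical_ratio)


base_set = dispersivity_set(ALPHA_L)

# Sets from earlier site models (Myers 2007; JBR 2007), assumed here to be
# the higher values used in the sensitivity analysis
earlier_sets = [(20.0, 6.0, 2.0), (30.0, 10.0, 3.0)]

for long_d, trans_d, vert_d in [base_set] + earlier_sets:
    print(f"L={long_d:.1f} m, T={trans_d:.2f} m, V={vert_d:.2f} m")
```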

Scenario modeling

Transient simulation included mine development in the watershed from 1960 through 2010, which establishes initial conditions for simulation of the future, 416 years beyond 2010, and allows verification of the dispersivity coefficients. Historic mine development required five stress periods, with seepage from historic mines simulated as starting in 1960, 1969, 1977, 1987, or 1991 (Lee 2000; Table 2). Simulation of the future required four additional periods: two of 6 and 10 years to simulate projected development of the Blackfoot Bridge Mine (BLM 2011; Arcadis 2009), followed by two 200-year periods for a longer-term consideration of mine remediation. The 200-year simulation periods are a simplification of the time necessary for the Se concentration in the seepage to decrease by an order of magnitude, as determined from leaching tests (BLM 2011; Whetstone Associates 2009). Simulation of several hundred years into the future is imprecise because of uncertainties in the conceptual model of the flow, but the purpose of this modeling is to prioritize remediation scenarios. The model establishes a baseline against which remediation is compared; therefore, the imprecision should not bias the results.
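The stress-period layout described above can be checked with a few lines of arithmetic. This is a bookkeeping sketch only, with illustrative variable names; it derives period lengths in years from the start years and future durations given in the text.

```python
historic_starts = [1960, 1969, 1977, 1987, 1991]  # seepage start years
HISTORIC_END = 2010
future_lengths = [6, 10, 200, 200]                # years

# Each historic period runs from one start year to the next (or to 2010)
bounds = historic_starts + [HISTORIC_END]
historic_lengths = [end - start for start, end in zip(bounds[:-1], bounds[1:])]

stress_period_years = historic_lengths + future_lengths
future_total = sum(future_lengths)
final_year = HISTORIC_END + future_total

print(stress_period_years)
print(f"{len(stress_period_years)} periods, {future_total} future years, "
      f"ending in {final_year}")
```

The four future periods sum to the 416 years stated in the text.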

Table 2 Blackfoot watershed mines, and relevant mining and modeling year. For modeling, the seepage continues until it is remediated

Three mine remediation scenarios, in which specified mines stop seeping Se-laden recharge at the beginning of the first 200-year simulation period, were compared with a baseline. Scenario 1 is the baseline, without any remediation; scenario 2 remediates the Ballard, Enoch Valley, and Henry Mines; scenario 3 remediates the North and South Maybe Canyon Mines; and scenario 4 adds the Mountain Fuel and Champ Mines to scenario 3.

The number of time steps per stress period was established so that the first time step was near a critical value of about 3 days (Anderson and Woessner 1992). For flow modeling, there are 20 time steps for the first six periods and 40 time steps for the 200-year period with a time step multiplier of 1.2. For MT3D, the time step size is 0.5 days and the maximum time step is 80 days, to keep it less than the limit implied by the Courant number (Anderson and Woessner 1992) for a maximum flow velocity of 0.6 m/day (observed near some springs).
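The Courant constraint mentioned above limits how far a solute may advect in one transport step relative to the cell size. The sketch below is a minimal check, assuming a Courant number of 1 and using the 152-m cell size and 0.6 m/day maximum velocity from the text; the function name is illustrative.

```python
def courant_time_step_limit(cell_size_m, velocity_m_day, courant_number=1.0):
    """Largest transport step (days) for which advection stays within
    courant_number cell lengths per step."""
    return courant_number * cell_size_m / velocity_m_day


CELL_SIZE_M = 152.0
MAX_VELOCITY_M_DAY = 0.6
MAX_TRANSPORT_STEP_DAYS = 80.0

limit_days = courant_time_step_limit(CELL_SIZE_M, MAX_VELOCITY_M_DAY)
print(f"Courant limit: {limit_days:.0f} days")
print("80-day maximum step satisfies limit:",
      MAX_TRANSPORT_STEP_DAYS < limit_days)
```

Under these assumptions the limit is roughly 250 days, so the 80-day maximum step leaves a comfortable margin.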

Results and discussion

Calibration

Recharge rates around the domain (Fig. 8) were established with automated calibration. Recharge in the fill aquifers generally yielded the highest composite sensitivity (Fig. 10a), which reflects the connection between alluvial recharge and discharge to the river; for example, recharge in the alluvium of the Upper Valley was set equal to 0.556 m/year to reflect runoff from the mountains. Four of the five least sensitive recharge zones were in the Wells Formation, which indicates that recharge does not vary much among the Wells outcrops. This is probably due to the long flow paths from recharge in the outcrops to discharge. Long flow paths tend to diminish the effect of seasonal and annual recharge amounts (Cherkauer 2004; Myers 2009).

Fig. 10

Steady-state calibration figures of a recharge zone composite sensitivity (Hill and Tiedeman 2007), b comparison of target flux with simulated flux for various river and drain boundaries, and c vertical flux among layers. Bottom in and bottom out means flow into and out of a layer through its bottom, respectively. River flux is from the model domain to the river; recharge from the river is negligible. GHB is general head boundary flow to the northwest (Fig. 7). Total recharge is 373,079 m3/day. See Fig. 8 for recharge zones

With the recharge rates set, adjusting K and conductance resulted in a good fit of simulated and targeted fluxes and an adequate fit of water level observations. Because the goal of the model is to simulate discharge to the river, the most important fit was with the targeted fluxes, which was good along the river and tributaries for reaches 10–19, with one exception (Fig. 10b). Reach 18 flux, Trail Creek, exceeded the target flux by about three times. The model also poorly simulated Woodall and Formation springs, reaches 22 and 23, which lie on the far west and northwest of the domain (Fig. 4; Fig. 10b). Their flux may be underestimated due to the failure to account for sufficient connections between the recharge zones and the springs. Georgetown Canyon Tailings Spring (reach 28, Fig. 10b), near the similarly named mine (Fig. 1) and near the accurately simulated Georgetown Canyon springs, was poorly simulated, which could be due to conflicting measurements in Ralston et al. (1983).

The average head residual for the 26 observations was −3.2 m, and the standard error and standard deviation were 6 and 7 %, respectively, of the total 311-m head range. The average was negative because near-surface water levels tended to be simulated too high, a result of not modeling ET near the rivers. Large positive and negative residuals occurred adjacent to one another, suggesting significant faulting or other low-K units between the wells.
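The head-fit statistics can be reproduced in form with a few lines of code. The sketch below assumes that the residual is observed minus simulated head (consistent with a negative average when heads are simulated high) and that the standard deviation is the sample standard deviation. The exact definition of the standard error used here is not stated, so it is not computed. The head values are hypothetical.

```python
import statistics


def residual_summary(observed, simulated):
    """Return the mean residual (m) and the residual standard deviation as % of head range."""
    residuals = [obs - sim for obs, sim in zip(observed, simulated)]
    head_range = max(observed) - min(observed)
    mean_residual = statistics.fmean(residuals)
    sd_percent = 100.0 * statistics.stdev(residuals) / head_range
    return mean_residual, sd_percent


# Hypothetical heads (m); simulated values are slightly high, so the mean is negative
observed = [100.0, 200.0, 300.0]
simulated = [102.0, 203.0, 301.0]
mean_residual, sd_percent = residual_summary(observed, simulated)
```

Normalizing by the observed head range makes the fit comparable among models that span different elevation ranges.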

Calibrated K ranged over seven orders of magnitude, with the highest K in the alluvium, where it exceeded 3 m/day in all zones and approached 50 m/day in some areas (Fig. 10a). The Wells Formation is less conductive: Kh ranged from about 0.0001 to 16 m/day, as expected, with one area draining to a spring being much higher (Fig. 10b). These values correspond well with models simulating smaller portions of the area (Arcadis 2009; Myers 2007; JBR 2007). Higher K values in localized areas reflect weathered rock in anticlines, and the highest represents a zone that channels groundwater to various springs, including the Georgetown Canyon and Georgetown Canyon Tailings springs (IDEQ 2007).

Vertical flux among layers demonstrates that the vertical circulation of groundwater, a primary qualitative calibration target, is reasonable (Fig. 10c). The vertical flux into layer 1 is a little less than half of the combined flux to the drain and river boundaries in that layer and about a third of the total recharge to the model; the total flux into and out of layer 2 is substantially more than the discharge to drains from that layer (Fig. 10c). The model accurately simulates deep circulation at reasonable rates, as conceptualized. About 5 % of the recharge leaves the model domain through GHBs to the north.

Flux to Blackfoot River reach 11 (Fig. 10b) is most sensitive to variability in conductance, with flux varying from 61,500 to 114,000 m³/day for a conductance multiplier varying from 0.5 to 2.0. Reaches 15 and 16 vary over a range of up to 40 %, and the remaining flux boundaries vary by 20 % or less for similar conductance variations (not shown).

The calibrated specific yield is 0.2, 0.1, and 0.05 for alluvium, Wells outcrops, and undifferentiated rock, respectively. The specific storage is 3.1 × 10⁻⁶, 2.7 × 10⁻⁶, and 2.5 × 10⁻⁶ m⁻¹ for bedrock in layers 2, 3, and 4, respectively. The sensitivity of the model to specific storage was tested by increasing and decreasing the calibrated value by 50 %; this altered the water level in the unconfined-aquifer monitoring wells by less than 0.4 m, with water levels in most of the wells changing by less than 0.1 m. Water levels in monitoring wells in the highest confined layers changed by less than 11 m, although all but two changed by less than 3 m. Deeper confined layers experienced less fluctuation, except for one on Dry Ridge for which the change was almost 30 m; at this point, layer 1 was dry, so the recharge flux was added to confined layer 2. Because layer 1 was dry on many ridges, this was not unusual and is representative of actual recharge reaching the aquifer through fractures. Layer 4 hardly varied. Similar sensitivity analyses completed for the pump test results changed the pumping drawdown by generally less than 0.5 m. However, two wells, both completed in a model zone with Kh = 0.2 m/day, deviated from the baseline by up to 9 m due to pumping. Changing the specific yield by 15 %, positive or negative, affected water levels in both types of aquifer by less than 0.2 m.

The simulated 2010 Se concentrations in the river during baseflow ranged from 2.2 to 3.1 μg/l (Fig. 11), which are close to the 303(d) observations and verify the model as accurate with regard to seepage from past mining. The increased discharge to the river due to mine seepage is negligible.

Fig. 11
figure 11

Simulated 2010 Se concentration along the Blackfoot River reaches from upstream to downstream. Combination terms, such as R1+R2, denote the combination of two or more reaches

Larger dispersivity coefficients (JBR 2007) caused the 1 μg/l Se contour to spread farther than for the baseline in certain areas (Fig. 12), but overall the difference is slight. The spread is greatest in the southeast portion of the domain, where the plume moved 3 km farther in 270 years due to the higher dispersivity. Transport to deeper layers may explain the lack of apparent differences in some areas, and some river and drain boundaries limit the plume by providing a sink for the Se. The relatively small sensitivity to the dispersivity coefficient indicates that advection is more important in controlling the plume shape (Konikow 2011).
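The dominance of advection can be illustrated with a one-dimensional analogue. The sketch below uses only the leading term of the classical one-dimensional advection-dispersion solution for a continuous source, with the dispersion coefficient taken as dispersivity times velocity. It is a simplification and not the three-dimensional transport model used in this study. The seepage velocity and elapsed time are hypothetical; only the two dispersivity values (2.7 and 30 m) come from the comparison above.

```python
import math


def relative_concentration(x, t, velocity, dispersivity):
    """Approximate C/C0 at distance x (m) and time t (days) for a continuous source.

    Uses only the leading term of the 1D advection-dispersion solution, with
    the dispersion coefficient taken as dispersivity * velocity.
    """
    dispersion = dispersivity * velocity  # m2/day
    return 0.5 * math.erfc((x - velocity * t) / (2.0 * math.sqrt(dispersion * t)))


# Hypothetical seepage velocity and elapsed time
v = 1.0        # m/day, assumed
t = 1000.0     # days, assumed
front = v * t  # position of the advective front (m)

at_front_low = relative_concentration(front, t, v, 2.7)
at_front_high = relative_concentration(front, t, v, 30.0)
ahead_low = relative_concentration(front + 100.0, t, v, 2.7)
ahead_high = relative_concentration(front + 100.0, t, v, 30.0)
```

In this analogue both dispersivities give the same relative concentration at the advective front; the larger value only smears the leading edge farther ahead of it, which is the behavior expected when advection controls the plume position.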

Fig. 12
figure 12

Comparison of 1 μg/l selenium-concentration contours for the baseline scenario in 2230 with the longitudinal dispersion coefficient D = 2.7 m and D = 30 m. The ratio among longitudinal, transverse, and vertical dispersion coefficients is 1:0.2:0.1

Comparison of baseline with remediation scenarios

In 2010, the simulated Se contour near some of the mines is 100 μg/l although Se concentration may be much higher directly under the mine footprint (Fig. 13). These simulated Se concentrations are similar to seep concentrations as observed in the 303(d) data (IDEQ 2009) and to monitored groundwater quality at nearby mines (Newfields 2005). Much of the simulated Se plume is north of the Blackfoot River and a smaller plume emanates from the North and South Maybe Canyon Mines (Fig. 13). Se plumes appear less extensive in layer 1 because the layer is dry in places and are more extensive in layer 3 because of the vertical dispersion along the long flowpaths.

Fig. 13
figure 13

Se concentration contours in layer 2 in 2010. Contours are 1, 10, and 100 μg/l. See Fig. 1 for mine identification. The map also shows the simulated monitoring points used in Fig. 14

The Se concentration often varies among model layers reflecting the vertical circulation, as demonstrated by the selected monitoring wells in Fig. 14a. Along the long river reach 11 (Fig. 13), Se concentration is slow to increase but does so first at depth, in layer 3. Further upstream at the confluence of Diamond Creek and the river, layer 3 also has higher concentration. However, in the Aspen Range, layer 2 has higher concentration because of the nearby recharge and downward flow in the area. Deep circulation initially decreases the concentration reaching the river because the aquifer lengthens the transport time of the Se, thereby releasing it more slowly.

Fig. 14
figure 14

Se concentration for a monitoring wells for the baseline scenario, b river reach (R) endpoints and inflows for the baseline scenario, c scenario 2, and d scenario 3. See Fig. 13 for monitoring well sites and Fig. 7 for reach numbers. DS is downstream end, referring to a river reach. Scenario 4 is not shown

The concentration in flux to the river requires up to 80 years to approach steady state, measured from break-through to the time the hydrograph becomes horizontal (Fig. 14b). Monitoring points close to Se sources may have small shifts in the Se concentration as the simulated load from those sources changes. Monitoring points farther from Se sources have longer times to break-through and to level off, and multiple sources complicate the hydrograph. Sites under high-concentration sources, such as the Aspen Range, experienced break-through quickly but still took up to 50 years to reach equilibrium. Other monitoring sites (not shown) required from about 30 to nearly 100 years to reach equilibrium, which also demonstrates that long periods could be required to remediate the sources.
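The break-through and equilibrium times read from the hydrographs can also be extracted programmatically. The sketch below is one possible approach, not the procedure used in this study: break-through is taken as the first time the concentration reaches a detection threshold, and equilibrium as the first time after which the concentration stays within a tolerance of its final value. The threshold, the tolerance, and the time series are hypothetical.

```python
def breakthrough_and_equilibrium(times, conc, detect=0.1, tol=0.05):
    """Return (break-through time, equilibrium time) for a concentration series.

    detect: concentration treated as break-through (assumed threshold)
    tol:    fraction of the final value within which the series counts as level
    """
    final = conc[-1]
    # First time the concentration reaches the detection threshold
    t_break = next((t for t, c in zip(times, conc) if c >= detect), None)
    # First time after which every later value stays close to the final value
    t_equil = None
    for i in range(len(conc)):
        if all(abs(c - final) <= tol * abs(final) for c in conc[i:]):
            t_equil = times[i]
            break
    return t_break, t_equil


# Hypothetical hydrograph: years since the start of the simulation, Se in ug/l
years = [0, 10, 20, 30, 40, 50]
se = [0.0, 0.5, 2.0, 3.9, 4.0, 4.0]
t_break, t_equil = breakthrough_and_equilibrium(years, se)
```

The difference between the two returned times corresponds to the interval from break-through to a horizontal hydrograph.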

Without any remediation, the Se plume will expand through the watershed and along the Blackfoot River west of Dry Valley (Fig. 15), which increases the Se loading to the river (Fig. 14). With time, Se concentrations will exceed the aquatic standard over much of the Blackfoot River and in the flux to most river reaches (Fig. 14b). Groundwater from undeveloped tributary valleys may dilute the river near the outlet, but not simulating ET may cause the model to underestimate the concentration. At the downstream end of reach 12, Se concentrations could exceed 5 μg/l for about 200 years beginning about 80 years after the beginning of the simulation (Fig. 14b). About 75 years were required to reattain equilibrium after source concentrations were reduced at the beginning of period 9, at year 260 (Fig. 14b), which reflects the time for the contributory area to fully affect the river concentrations. Natural remediation will also take longer because of the large amount of Se in storage.
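The dilution effect of clean tributary groundwater follows from a simple flow-weighted mass balance. The sketch below is a minimal illustration assuming complete mixing; the flows and concentrations are hypothetical and are not model results.

```python
def mixed_concentration(inflows):
    """Flow-weighted Se concentration (ug/l) of fully mixed inflows.

    inflows: list of (flow in m3/day, concentration in ug/l) pairs
    """
    total_flow = sum(q for q, _ in inflows)
    total_load = sum(q * c for q, c in inflows)
    return total_load / total_flow


def load_kg_per_day(flow_m3_day, conc_ug_l):
    """Convert flow (m3/day) and concentration (ug/l) to load (kg/day)."""
    # 1 m3 = 1000 l and 1 kg = 1e9 ug, so the combined factor is 1e-6
    return flow_m3_day * conc_ug_l * 1e-6


# Hypothetical reach: Se-laden discharge plus clean tributary groundwater
inflows = [(100000.0, 6.0), (50000.0, 0.0)]
outlet_concentration = mixed_concentration(inflows)
```

In this hypothetical case, discharge at 6 μg/l mixed with half as much clean inflow falls to 4 μg/l, below the 5 μg/l aquatic standard, even though the Se load is unchanged.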

Fig. 15
figure 15

Se plume, 5 μg/l, for baseline and scenario 3. Scenario 2 is essentially the same as the baseline and scenario 4 is essentially the same as scenario 3. See Fig. 1 for mine names

Scenario 2 remediates mines located in the northwest portion of the watershed, just north of Blackfoot River reaches 10 and 1, where it removes Se as indicated by Se contours (Fig. 15). The Se concentration at the downstream end (DS End in Fig. 14c) is slightly less than for the baseline scenario, only reaching about 4 μg/l. Remediation does not decrease Se concentration at the DS end of reach 12 or in the inflow to reaches 12, 13, 14, or 19 (Fig. 14c).

Scenario 3 substantially changes the Se concentration hydrograph (Fig. 14d). Se concentrations do not exceed 5 μg/l at either station DS End or DS End R12 at any time (Fig. 14d). Inflow concentrations on reaches 12, 14, and 19 still peak above 5 μg/l (Fig. 14d), but they do not achieve the apparent equilibrium observed in the baseline scenario (Fig. 14b). The Se plume at year 266 is mostly nonexistent around the remediated mines (Fig. 15). Scenario 4, however, results in little difference in Se concentrations reaching the Blackfoot River because the additional mines, and the resulting decrease in the Se plumes (Fig. 15), are a substantial distance from that river.

Conclusion

The model presented herein reasonably simulates flow and transport through the Blackfoot watershed at a reconnaissance level. It is a good tool for comparing plans to remediate mines in the watershed. The study and comparisons presented herein demonstrate how to use conceptual and numerical modeling to prioritize remediation.

Se plumes extend, or will extend, over a large portion of the Blackfoot River watershed due to existing and abandoned phosphate mines. If potential future mines contribute additional Se load, the plumes could expand. Capping the waste rock to decrease seepage could even decrease the inflow to the river that dilutes the Se-laden flow from other parts of the watershed. The peak concentration exceeds, or will exceed, 5 μg/l for a long time period over a substantial portion of the Blackfoot River, but cleaner inflow dilutes it to below that level by the time it reaches the downstream end of the river.

All of the remediation scenarios could help to remediate Se contamination in the watershed and downstream of the watershed. Scenario 2 would reduce the Se reaching the downstream portions of the river, thereby preventing Se concentrations from exceeding the aquatic standard at the outlet. Scenario 3 would reduce the Se concentration in the river and inflow through several reaches substantially compared to baseline and more than scenario 2. It would prevent Se concentrations from exceeding 5 μg/l throughout. However, remediating these mines had almost no effect on the load leaving the domain through the GHB due to the long distance, and transport time, from the boundary. Scenario 4, which added two mines to scenario 3, did not discernibly change river Se loading, but it reduced the Se concentration around the additional mines.

The analysis presented herein is reconnaissance level due to the lack of data for calibrating the model. The lack of data and the uncertainties should not be used as an excuse to forgo remediation; rather, data collected as remediation proceeds could be used to improve the model. The improved model could then be used to consider additional permutations of remediation and to better focus the remediation dollar.