Introduction

Endocrine-disrupting compounds (EDCs) are of increasing concern in waterways of the USA and worldwide. Endocrine disruptors have the potential to interfere with the hormone (or endocrine) systems in animals and humans (reviewed in Burkhardt-Holm 2010). Analytical methods that allow detection of chemical constituents in streams at very low concentrations, as well as detection of numerous constituents that were previously unassessed, are now available (Snow et al. 2009; Richardson and Ternes 2011; Metcalfe et al. 2013; Rotroff et al. 2013). What remains poorly understood, however, is how prevalent these emerging contaminants are in the environment and what methods are best able to determine landscape sources of EDCs.

Sources of EDCs vary but can include exogenous sources, such as pharmaceuticals (human as well as veterinary) or breakdown products of herbicides and pesticides, and endogenous sources such as naturally excreted hormones from humans as well as farm animals, especially where those animals are held in concentrated feeding operations (Burkhardt-Holm 2010; Khanal et al. 2006; Alvarez et al. 2013). Endocrine-disrupting chemicals identified in the environment include those that interact with estrogen, androgen, glucocorticoid, thyroid, and progesterone receptors; however, those described as being estrogenic have received the most research attention (Burkhardt-Holm 2010; Stavreva et al. 2012; Kerdivel et al. 2013). Estrogens can be delivered to streams through wastewater treatment systems or through land application of animal waste (Khanal et al. 2006; Hanselman et al. 2003; Focazio et al. 2008; Dutta et al. 2012; Leet et al. 2012). While treatment systems remove some estrogens, the remaining concentrations in wastewater effluent may be high enough to cause endocrine disruption and feminization of aquatic organisms (Jobling et al. 2002; Vajda et al. 2008; Tetreault et al. 2011; Tanna et al. 2013).

In contrast, land application of dairy, swine, and poultry waste is not restricted in most areas of the USA as long as it is not directly discharged into streams (Khanal et al. 2006). While some hormonally active compounds break down quickly in aerobic conditions, the anaerobic conditions of subsurface soils and groundwater can deliver biologically active estrogens to streams (Barnes et al. 2008). The volume of estrogens entering streams from land-applied animal waste and the potential potency of the endogenous hormones have led to concerns regarding effects on aquatic organisms. Endogenous estrogens, such as estradiol and estrone, generally have a much higher estrogenic potency (10,000–100,000 times higher) than certain exogenous estrogenic compounds, such as nonylphenols or bisphenol A (Li et al. 2004; Dussault et al. 2005; Quinn-Hosey et al. 2012). Thus, the source of estrogenic compounds entering receiving waters can have important implications for potential ecosystem impacts.

A variety of in vitro bioassays to measure total estrogenicity of environmental samples have been developed (Zacharewski 1997; Li et al. 2004; Leusch et al. 2010). Environmental surveys of estrogenic EDCs are possible using cost-effective bioassays or the more expensive but analyte-specific chemical analyses (Kolpin et al. 2002, 2013; Alvarez et al. 2013). EDCs can be detected using grab samples for quick assessments (Ciparis et al. 2012; Burkhardt-Holm 2010) or with technologies such as semi-permeable membrane devices or polar organic chemical integrative samplers (POCIS) for integration of low-level concentrations over time (Kolpin et al. 2013). Recently developed yeast reporter assays use genetically modified strains of yeast (Saccharomyces cerevisiae) that luminesce in the presence of estrogenic compounds (Sanseverino et al. 2009). These bioreporter assays are suitable for screening environmental water samples for estrogenic potential, particularly when the identity of the bioactive ligands is unknown.

A number of studies have focused on estrogenic contaminants within the Potomac River drainage in the mid-Atlantic region of the eastern USA. In the Potomac River drainage, a high prevalence of male smallmouth bass (Micropterus dolomieu) with intersex (testicular oocytes) and detectable plasma vitellogenin has been noted (Blazer et al. 2007; Iwanowicz et al. 2009; Blazer et al. 2012). The presence of natural estrogens, other steroid hormones, and chemicals with estrogenic activity has been confirmed in these waters via chemical analysis (Kolpin et al. 2013). Since estrogens also modulate the immune system in fishes (Iwanowicz and Ottinger 2009; Robertson et al. 2009), estrogenic compounds entering the aquatic environment may pose a threat to fish health, both reproductively and through reduced disease resistance. Biological and chemical evidence of estrogen exposure at study sites within the Potomac River drainage, along with coincident fish mortalities, has become increasingly frequent (Blazer et al. 2010).

An association between intersex prevalence and landscape factors such as percent agriculture and number of confined animals upstream of sites within the Potomac River system has been documented by Blazer et al. (2012). Additional factors such as number of animal feeding operations and wastewater treatment plant flow were correlated with intersex severity. A companion chemical study at the same sites identified numerous constituents, including biogenic hormones, in both the water and sediments at these sites that were associated with intersex prevalence and severity (Kolpin et al. 2013). An extensive study of the presence of estrogenic activity was conducted by Ciparis et al. (2012); however, it focused on sub-watersheds of Shenandoah River in the heavily agricultural Shenandoah Valley, VA, USA. Again, levels of estrogenic (equivalent) compounds in streams were correlated with densities of confined animal feeding operations. The prevalence of estrogenic EDCs in stream systems within other land use types and in other drainages requires further assessment in order to make definitive statements on the extent of the problem and relationships between agricultural practices, wastewater treatment systems, and estrogenic activity occurrence and distribution.

Spatial epidemiology is an emerging field that exploits the power of geographic information systems to examine spatial patterns of disease occurrence, distribution, and etiology (Ostfeld et al. 2005; Elliott and Wartenberg 2004). Mapping disease clusters and associating cluster locations with environmental influences have been a particular methodological advance (Elliott and Wartenberg 2004; Moore and Carpenter 1999) and have led to the development of studies attempting to determine links between landscape stressors and disease patterns (Meentemeyer et al. 2012; Holdenrieder et al. 2004). However, for unbiased estimates of occurrence and distribution, special attention has to be paid to study design issues to ensure adequate spatial coverage while maintaining representativeness through randomization (Beale et al. 2008; Elliott and Wartenberg 2004). Spatially balanced sampling designs have been developed to determine the extent and severity of environmental perturbations (Paulson et al. 2008; Stevens and Olsen 2004; Herlihy and Larsen 2000) while minimizing logistical constraints. While primarily used in studies of human health, these techniques are equally applicable to the study of diseases of domestic animals or wildlife (Meentemeyer et al. 2012; Sheridan et al. 2005; Foley et al. 2005; Miller and Conner 2005).

Here we describe the development of a spatially structured and randomized sampling design to allow a reconnaissance survey of estrogenic activity along gradients of land use within sub-watersheds. In contrast to some reconnaissance studies that intentionally bias sampling to locations believed to be the source of contaminants, our goal was to produce an unbiased design to enhance information discovery through statistical modeling. We present this example as a useful reconnaissance approach for state and federal agencies with an interest in identifying locations potentially impacted by EDCs that warrant more intensive, focused EDC research. We selected the model to include waters likely to support smallmouth bass (e.g., mid-sized, third-order streams) in the mid-Atlantic USA given our specific regional interests. The goals of this analysis were threefold: (1) summarize and assess the major gradients of land uses within equivalently sized sub-watersheds in the regional study area, (2) develop a randomized sampling design to allow a representative assessment of estrogenic activity in receiving waters across the major land use gradients in the study area, and (3) develop statistical models that relate presence and magnitude of observed EDC occurrence to sampled land use gradients. Other survey designs have been developed to assess general environmental conditions across mid-Atlantic US river basins (Tran et al. 2002) or for the express purpose of assessing estrogenic activity in relation to agricultural activity in a specific subset of the region (Ciparis et al. 2012). Our intent is to provide an example of how spatial analysis approaches can provide a statistically reliable and spatially extensive assessment of potential landscape factors important for influencing the delivery of estrogenic compounds to the region’s streams and can enhance information discovery and alternative hypothesis formulation for more detailed field investigations.

Methods

We developed a landscape-based sampling design to assess estrogenic EDC occurrence in relation to land use gradients in sub-basins of the upper Potomac, Shenandoah, and James Rivers in the mid-Atlantic eastern USA (Fig. 1). We used geographic information systems (GIS) and landscape ecology tools to examine land use gradients using consistent geospatial datasets. Data were collected and organized using ArcGIS version 9.3 (ESRI, Redlands, CA, USA). We summarized numerous landscape attributes, including proportional land cover configuration by land cover type, land cover diversity, average patch size, patch density, and area in natural land cover types versus human-influenced types, using EPA’s Analytical Tools Interface for Landscape Assessments (ATtILA; Ebert and Wade 2004). This spatial analysis tool summarizes landscape data within polygon reporting areas, which is useful for a comparative assessment of sub-watersheds. We summarized landscape data within sixth-level (12-digit) sub-watershed areas from the National Hydrography Dataset (http://nhd.usgs.gov/) and we used the derived landscape attributes of these areas as a basis for establishing a landscape gradient across which to sample for estrogenic activity in surface waters. Initially, there were 421 sixth-level sub-watersheds within the study area to consider for sampling.

Fig. 1 Study area including river sub-basins, sub-watersheds, randomly selected sub-watersheds, and point locations of water sampling for E2Eq concentrations

We assessed land cover within sub-watersheds using the USGS Chesapeake Bay 2006 land cover dataset (Irani and Claggett 2010). This dataset is derived from satellite imagery in a manner similar to and at the same resolution as the National Land Cover Dataset (NLCD; http://www.mrlc.gov/). This map was used rather than the 2006 NLCD because at the time we initiated this study, it was the most current land use map with regionally consistent coverage. Since the land cover map includes only features that have spatial extents greater than the satellite image resolution of 900 m², it does not map some smaller land use practices that may have significant influence on surface water estrogenicity. This is especially true of large animal feeding operation (AFO) structures that are used on commercial poultry, dairy, and beef farms and that are ubiquitous in the region.

In order to capture the location and density of these important features, we used Google Earth (Google, Inc.) imagery to locate and map AFOs in the study area by mapping a point at the center of the roof line of each of these easily observable structures using 2009 and 2010 aerial imagery. Some states have available data on the location of AFOs, but only one state in our four-state study area (Virginia) had these data readily available at the time of our study. We therefore did our own mapping of AFO structures in order to be regionally consistent. A cursory comparison of our data with Virginia’s data revealed that our mapping captured the individual structures while the Virginia dataset captured whole farms (multiple structures) but was otherwise consistent (Young, unpublished data). While mapping the location of AFO structures cannot provide information on the number of animals in the buildings, Ciparis et al. (2012) found that animal numbers within AFOs (as reported by permit data collected by the State of Virginia) were highly correlated with density of AFOs (r = 0.88–0.99) within sub-watersheds. Based on this regionally relevant correlation, AFO density was used as a surrogate for the influence of commercially produced animals on potential EDC delivery to streams. We computed density as the number of individual AFO structures divided by the area of the sixth-level sub-watershed in which they occurred.
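The density calculation can be sketched in a few lines of Python. This is a minimal illustration, not the GIS workflow used here: the function name, sub-watershed IDs, and areas are hypothetical, and the sketch assumes each roof-line point has already been assigned to a sub-watershed (for example, by a point-in-polygon overlay). The ×100 scaling follows the Table 1 caption.

```python
from collections import Counter


def afo_density(afo_points, area_ha, scale=100.0):
    """Return scaled AFO structures per hectare for each sub-watershed.

    afo_points: iterable of sub-watershed IDs, one entry per mapped AFO
                structure (assumes points were already assigned to polygons).
    area_ha:    dict mapping sub-watershed ID -> area in hectares.
    scale:      multiplier; 100 mirrors the x100 scaling in the Table 1 caption.
    """
    counts = Counter(afo_points)
    # Sub-watersheds with no mapped structures get a density of zero.
    return {ws: scale * counts.get(ws, 0) / area for ws, area in area_ha.items()}


# Hypothetical IDs and areas, for illustration only.
points = ["A", "A", "B", "A"]
areas = {"A": 6000.0, "B": 4000.0, "C": 5000.0}
density = afo_density(points, areas)
```

Sub-watershed "C" has no mapped structures and so receives a density of zero rather than being dropped, which keeps every reporting unit available for stratification.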

We also attempted to account for the number of wastewater treatment plant (WWTP) outfalls draining into receiving streams based on data within the EPA’s Permit Compliance System (PCS) database, which records the location, type, and maximum permitted discharge of all facilities with National Pollutant Discharge Elimination System (NPDES) permits. While some states provide NPDES data in mapping applications, we preferred to use a regionally consistent dataset across the four states of our study area. We therefore used the PCS database included with the EPA Better Assessment Science Integrating Point and Non-point Sources (BASINS) mapping tool, Version 4 (http://water.epa.gov/scitech/datait/models/basins/index.cfm) as an initial data source for summarizing locations of potential point source discharge of EDCs into receiving streams. We first selected from the PCS dataset only sewerage system (SIC code 4952) permit points to remove from consideration other permitted dischargers that were not germane to our analysis. We tabulated both the density and reported design flow of WWTPs in the NPDES data. The density of WWTPs within each of the sixth-level hydrologic unit sub-watersheds was computed as the number of permit points falling within the sub-watershed divided by the watershed area.
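The permit filtering and tabulation step might look like the following sketch. The record layout (keys `ws`, `sic`, `design_flow_mgd`) is an assumption made for illustration and is not the PCS schema; areas are taken to be in hectares, which the text does not specify.

```python
def summarize_wwtp(permits, area_ha, sic_code="4952"):
    """Tabulate sewerage-system permits per sub-watershed.

    permits: list of dicts with hypothetical keys 'ws' (sub-watershed ID),
             'sic' (SIC code as a string), and 'design_flow_mgd'.
    area_ha: dict mapping sub-watershed ID -> area in hectares (assumed unit).
    """
    summary = {ws: {"count": 0, "flow_mgd": 0.0} for ws in area_ha}
    for permit in permits:
        # Keep only sewerage systems (SIC 4952) inside a known sub-watershed.
        if permit["sic"] != sic_code or permit["ws"] not in summary:
            continue
        summary[permit["ws"]]["count"] += 1
        summary[permit["ws"]]["flow_mgd"] += permit["design_flow_mgd"]
    for ws, row in summary.items():
        row["density"] = row["count"] / area_ha[ws]
    return summary


# Hypothetical permit records, for illustration only.
permits = [
    {"ws": "A", "sic": "4952", "design_flow_mgd": 0.5},
    {"ws": "A", "sic": "4952", "design_flow_mgd": 1.5},
    {"ws": "A", "sic": "2011", "design_flow_mgd": 9.0},  # not a sewerage system
    {"ws": "B", "sic": "4952", "design_flow_mgd": 0.25},
]
wwtp = summarize_wwtp(permits, {"A": 5000.0, "B": 2500.0, "C": 4000.0})
```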

We used the National Hydrography Dataset (http://nhd.usgs.gov/) 1:100,000 stream lines and the US Census TIGER road data layers to represent streams, roads, and road–stream crossings. We tabulated maximum Strahler stream order using the 1:100,000 scale stream lines, as well as stream density, and the number of road–stream crossings within each sixth-level watershed. We focused our analysis on sixth-level hydrologic units with a maximum stream order of three, four, or five to control for drainage area and discharge to the extent possible. We therefore eliminated from further consideration those sixth-level basins that had a maximum Strahler stream order smaller than third order or larger than fifth order. This reduced the number of sixth-level watersheds considered for the sampling design from 421 to 333.
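For readers unfamiliar with Strahler ordering, the rule and the subsequent filter can be sketched as follows. The stream network and sub-watershed IDs are hypothetical, and the recursive function is only one way to compute the order; it is not necessarily how the order was tabulated from the stream lines in this study.

```python
def strahler_order(upstream, segment):
    """Strahler order of `segment`.

    upstream: dict mapping a segment ID to the list of segments flowing
              directly into it (headwater segments may be omitted).
    """
    tributaries = upstream.get(segment, [])
    if not tributaries:
        return 1  # headwater segment
    orders = sorted((strahler_order(upstream, t) for t in tributaries), reverse=True)
    # Order increases only where two or more tributaries share the highest order.
    if len(orders) > 1 and orders[0] == orders[1]:
        return orders[0] + 1
    return orders[0]


# Hypothetical network: two second-order streams join to form a third-order
# stream, which then receives a first-order tributary without changing order.
network = {
    "main": ["outlet", "c"],
    "outlet": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
order_main = strahler_order(network, "main")

# Keep only sub-watersheds whose maximum order is three, four, or five.
max_order = {"ws1": 2, "ws2": 3, "ws3": 5, "ws4": 6}
eligible = [ws for ws, order in max_order.items() if 3 <= order <= 5]
```

The recursion is adequate for small illustrative networks; a large stream network would call for an iterative traversal.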

We used correlation analysis and principal component analysis in the R statistical program (http://www.r-project.org/) to examine land use gradients summarized by sixth-level watershed. Correlation analysis indicated substantial multicollinearity in candidate landscape pattern variables. Therefore, to create the sampling stratification, we first selected five uncorrelated variables that captured potential gradients in land use that were suspected of influencing E2Eq presence and concentration based on literature reviews of previous studies. These selected land use factors included density of confined animal feeding operations, sewer outfall density, human use index (percent of reporting unit in human influenced land uses), number of stream–road crossings, and percent of reporting unit in impervious land use types (Ebert and Wade 2004).
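One simple way to screen candidate variables for pairwise correlation is sketched below. The variables in this study were chosen with reference to the literature, so the greedy rule, the 0.7 threshold, and the data values here are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_uncorrelated(variables, threshold=0.7):
    """Greedy screen: keep a variable only if |r| with every variable
    already kept is below `threshold` (an assumed cut-off)."""
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values for five sub-watersheds.
candidates = {
    "afo_density": [0, 1, 2, 3, 4],
    "pct_agriculture": [1, 2, 3, 4, 6],   # strongly correlated with afo_density
    "sewer_density": [2, 0, 3, 0, 2],     # uncorrelated with afo_density
}
kept = screen_uncorrelated(candidates)
```

Because the screen is order dependent, listing the variables of greatest prior interest first determines which member of a correlated pair is retained.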

As a check on the land use gradients represented by these variables, we conducted principal component analysis (PCA) on the five selected variables. PCA revealed two components that captured the majority of the variation (62 %) in land use gradients represented by these variables and two variables that largely explained the two components. Closer inspection of these components revealed that human use index and density of confined animal feeding operations had the dominant loadings. Component 1 represented a gradient in human use index, stream–road crossings, and percent imperviousness based on land use type, while component 2 was largely a function of the density of confined animal feeding operations. We therefore defined watershed sampling strata based on three levels of human use index and three levels of confined animal feeding operation density, defined by Jenks natural breaks (Table 1).
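Jenks natural breaks choose class boundaries that minimize the within-class sum of squared deviations. For three classes the optimum can be found by exhaustive search, as in the sketch below. This is an illustration of the criterion rather than the GIS implementation used here, and the data values and function names are hypothetical.

```python
from itertools import accumulate


def jenks_three_classes(values):
    """Return (max of low class, max of middle class) for the three-class
    split that minimises the within-class sum of squared deviations."""
    v = sorted(values)
    n = len(v)
    if n < 3:
        raise ValueError("need at least three values")
    s1 = [0.0] + list(accumulate(v))                 # prefix sums
    s2 = [0.0] + list(accumulate(x * x for x in v))  # prefix sums of squares

    def ssd(i, j):
        # Sum of squared deviations from the class mean for v[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = None
    for i in range(1, n - 1):
        for j in range(i + 1, n):
            cost = ssd(0, i) + ssd(i, j) + ssd(j, n)
            if best is None or cost < best[0]:
                best = (cost, i, j)
    _, i, j = best
    return v[i - 1], v[j - 1]


def assign_stratum(value, breaks):
    """Label a value as low, medium, or high given the two class maxima."""
    low_max, mid_max = breaks
    if value <= low_max:
        return "low"
    if value <= mid_max:
        return "medium"
    return "high"


# Hypothetical index values with three obvious clusters.
breaks = jenks_three_classes([1, 2, 3, 10, 11, 12, 30, 31, 32])
```

Applying the same function to each of the two stratification variables and crossing the resulting labels yields nine strata.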

Table 1 Random sampling strata constructed on two variables: (1) confined animal feeding operation (CAFO) density computed as number of CAFOs per hectare within the sub-watershed unit (×100) and (2) human use intensity defined as percent of the sub-watershed unit in human dominated land use categories. Strata value ranges shown were defined by Jenks natural breaks of each variable distribution. Values shown are number of possible sub-watersheds of third to fifth Strahler stream order within the study area available for sampling (n = 333) and the number randomly selected for sampling in parentheses (n = 54)

We used the NOAA Sampling Design Tool (http://ccma.nos.noaa.gov/products/biogeography/sampling/) to select stratified random samples of the sixth-level hydrologic units for field investigation. We selected six randomly drawn sampling units for each of the nine strata, resulting in 54 sixth-level hydrologic units selected for field sampling of EDC levels. Within each of these sub-watersheds, we located the road–stream crossing that was nearest to the mouth of the sixth-level sub-watershed for collection of water samples in the field. One of the 54 sites originally selected for sampling was inaccessible due to private property access restrictions, leaving a total of 53 sites sampled.
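The selection logic (a fixed number of random draws within each stratum) can be sketched with the Python standard library. This is not the NOAA Sampling Design Tool; the unit IDs, stratum labels, and seed are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, per_stratum=6, seed=2010):
    """Draw `per_stratum` units at random, without replacement, per stratum.

    units: dict mapping unit ID -> stratum label, here a tuple of
           (human use level, AFO density level).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_stratum = defaultdict(list)
    for unit_id, stratum in sorted(units.items()):
        by_stratum[stratum].append(unit_id)
    selected = {}
    for stratum, ids in by_stratum.items():
        if len(ids) < per_stratum:
            raise ValueError(f"stratum {stratum} has too few units")
        selected[stratum] = sorted(rng.sample(ids, per_stratum))
    return selected


# Hypothetical frame: 10 candidate sub-watersheds in each of the 9 strata.
levels = ["low", "medium", "high"]
frame = {
    f"{h}-{a}-{i}": (h, a)
    for h in levels for a in levels for i in range(10)
}
selection = stratified_sample(frame)
n_selected = sum(len(ids) for ids in selection.values())
```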

Grab water samples in 1-L bottles were collected at each site over a 5-day period during the week of June 14–18, 2010. This sampling coincided with retrieval of long-term monitoring equipment (polar organic chemical integrative samplers) in the same geographical area for a separate study, minimizing logistical constraints. Inspection of hydrographs of nearby USGS gauging stations during this time period reveals discharges on the descending limb from previous rainfall and elevated discharges the week prior. At the time of sampling, stream discharges were at or below the long-term median daily discharges for all sites, with no intervening rainfall events during sampling. Samples were stored on wet ice during transport and adjusted to pH 3 with 6 N HCl within 8 h of collection. Samples were stored at 4 °C prior to extraction. A total of 400 mL of each sample was filtered and extracted using solid-phase extraction. The methods used were identical to those described by Ciparis et al. (2012). Values were reported as ng/L E2 equivalent (E2Eq), and the quantitation limit of the assay was 0.31 ng/L.

We assessed differences in E2Eq within and among sampling strata using two-factor ANOVA with replication. The null hypothesis in this analysis was that there is no difference in mean E2Eq between or among land use strata. In order to further examine potential land use predictors of E2Eq, we also used ordinary least squares (OLS) regression analysis to develop models predicting levels of E2Eq found at the 53 randomly selected sites that were sampled from all summarized landscape data. We used Akaike’s Information Criterion (AIC) to evaluate candidate models, we assessed OLS model significance using the joint F statistic, and we assessed residual normality using the Jarque–Bera statistic (Jarque and Bera 1987). We also checked for multicollinearity in explanatory variables using a variance inflation factor (VIF) metric and removed redundant variables. We assessed stationarity of residuals using Koenker’s studentized Breusch–Pagan (BP) statistic and we assessed spatial clustering of regression residuals from the OLS models using the Moran’s I statistic. Where models had non-normal residuals, we constructed alternative, unbiased models by removing outliers where the residual was greater than 2.5 standard deviations from the mean. We evaluated local structure in the regression model using geographically weighted regression (Fotheringham et al. 2002) with AIC-based automatic bandwidth distance selection.
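The residual checks can be illustrated with a short sketch. The Jarque–Bera statistic combines sample skewness S and kurtosis K as JB = (n/6)(S² + (K − 3)²/4) and is compared with a chi-square distribution with two degrees of freedom, for which the upper-tail probability is exp(−JB/2). The residual values below are hypothetical, and the function names are ours.

```python
import math


def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p value) for a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Chi-square survival function with 2 degrees of freedom.
    return jb, math.exp(-jb / 2.0)


def flag_outliers(residuals, k=2.5):
    """Indices of residuals more than k sample standard deviations from the mean."""
    n = len(residuals)
    mean = sum(residuals) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))
    return [i for i, r in enumerate(residuals) if abs(r - mean) > k * sd]


# Hypothetical residuals: twenty small values and one large positive residual.
residuals = [-0.2, -0.1, 0.0, 0.1, 0.2] * 4 + [3.0]
jb, p_value = jarque_bera(residuals)
outliers = flag_outliers(residuals)
```

In this example the single large residual both drives the rejection of normality and is the only observation flagged by the 2.5 standard deviation rule.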

Results

Geographically, the highest levels of the confined animal feeding operation density were located in the central Shenandoah Valley (VA, USA) and the South Branch Potomac sub-basins, while the highest values of the human use index were concentrated in the areas closest to the Washington, DC metro area, as well as distributed along the Shenandoah and Great Valley areas of Virginia, Maryland, and southeastern Pennsylvania. Randomly selected sub-watersheds from each of the nine strata combinations of human use index and confined animal feeding operation density were spread throughout the study area. However, since the highest levels of the confined animal feeding operation density and human use index strata overlap in the central Shenandoah Valley, more randomly selected samples were drawn from this area (Fig. 1).

Results from field collection of water samples and laboratory analysis of the E2Eq yeast bioreporter assay ranged from below quantitation to a high of 1.2 ng/L. A total of 31 of 53 sites sampled (58 %) had levels of E2Eq below quantitation limits (0.31 ng/L). Sites where E2Eq was measured below quantitation limits were assigned a fixed value of 0.01 in further statistical analysis. While fixed value substitution is not ideal, maximum likelihood or randomization methods for below quantitation limit substitution generally do not work well for small datasets (Hensel 2006). Geographically, the below quantitation limit sites tended to occur in the sampled sub-watersheds representative of the low human use index and low confined animal feeding operation density strata. Conversely, the higher values of E2Eq detected tended to occur in the high animal feeding operation density and high human use index strata. This is evidenced by the ANOVA results, which demonstrated a statistically significant difference (α = 0.05) in mean E2Eq between levels of the human use index (p = 0.021796, F = 4.170553, F crit = 3.204317) and between levels of the confined animal feeding operation density (p = 0.003479, F = 6.436779, F crit = 3.204317), as well as a significant interaction between the two strata types (p = 0.021927, F = 3.182709, F crit = 2.578739). The difference in mean E2Eq between strata and the interaction between strata are illustrated in Fig. 2.
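The fixed-value substitution is simple to express in code. In the sketch below the measurement list is hypothetical, and representing a below-quantitation result as `None` is an assumption about data layout; the quantitation limit and substitute value are those stated in the text.

```python
QUANTITATION_LIMIT = 0.31  # ng/L E2Eq, from the text
FILL_VALUE = 0.01          # fixed substitute for below-quantitation sites


def substitute_nondetects(values, limit=QUANTITATION_LIMIT, fill=FILL_VALUE):
    """Replace results below the quantitation limit (or recorded as None)
    with a fixed value; also return the fraction of results replaced."""
    below = [v is None or v < limit for v in values]
    cleaned = [fill if is_below else v for v, is_below in zip(values, below)]
    return cleaned, sum(below) / len(values)


# Hypothetical measurements; None marks a result reported as below quantitation.
measured = [None, 0.45, 1.2, None, 0.12, 0.62]
cleaned, fraction_below = substitute_nondetects(measured)

# Arithmetic check of the reported proportion: 31 of 53 sites.
reported_percent = round(100 * 31 / 53)
```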

Fig. 2 Results from two-way ANOVA with replication. The graph depicts mean response in E2Eq concentrations in streams across three levels of two factors from the design strata, human use index, and density of confined animal feeding operations (AFO). Differences among strata and interactions between strata are significant
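For reference, the sums of squares behind a two-factor ANOVA with replication can be sketched as follows. The sketch assumes a fully balanced design (equal replicates in every cell), uses hypothetical data, and returns F ratios and degrees of freedom only; p values would require an F distribution routine from a statistics library.

```python
def two_way_anova(cells):
    """F ratios for a balanced two-factor design with replication.

    cells: dict mapping (level_a, level_b) -> list of replicate observations.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    na, nb = len(levels_a), len(levels_b)
    r = len(next(iter(cells.values())))
    if r < 2 or len(cells) != na * nb or any(len(v) != r for v in cells.values()):
        raise ValueError("requires a balanced design with at least 2 replicates")

    grand = sum(sum(v) for v in cells.values()) / (na * nb * r)
    mean_cell = {k: sum(v) / r for k, v in cells.items()}
    mean_a = {a: sum(mean_cell[(a, b)] for b in levels_b) / nb for a in levels_a}
    mean_b = {b: sum(mean_cell[(a, b)] for a in levels_a) / na for b in levels_b}

    ss_a = nb * r * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = na * r * sum((m - grand) ** 2 for m in mean_b.values())
    ss_cells = r * sum((m - grand) ** 2 for m in mean_cell.values())
    ss_ab = ss_cells - ss_a - ss_b  # interaction
    ss_err = sum((y - mean_cell[k]) ** 2 for k, v in cells.items() for y in v)

    df = (na - 1, nb - 1, (na - 1) * (nb - 1), na * nb * (r - 1))
    ms_err = ss_err / df[3]
    return {
        "F_a": ss_a / df[0] / ms_err,
        "F_b": ss_b / df[1] / ms_err,
        "F_ab": ss_ab / df[2] / ms_err,
        "df": df,
    }


# Hypothetical 2 x 2 example with two replicates per cell.
example = {
    ("low", "low"): [1.0, 3.0],
    ("low", "high"): [3.0, 5.0],
    ("high", "low"): [5.0, 7.0],
    ("high", "high"): [7.0, 9.0],
}
result = two_way_anova(example)
```

A fully balanced 3 × 3 layout with six replicates per cell would give 2, 2, 4, and 45 degrees of freedom for the two main effects, the interaction, and the error term, respectively.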

OLS regression model selection using AIC on 53 sites sampled (one site dropped due to access restrictions) resulted in a parsimonious and significant model explaining ~ 31 % of the variation (R² = 0.307, adjusted R² = 0.279) in E2Eq with just two significant predictors: percent of the catchment land use in agriculture (as crops) and the (square root-transformed) density of confined animal feeding operations. However, the model residuals were non-normally distributed as measured by the Jarque–Bera statistic. By examining the magnitude of residuals and removing two outliers with residuals greater than 2.5 standard deviations from the mean, we were able to create an alternative model using 51 observations with normally distributed residuals that improved overall model fit (R² = 0.396, adjusted R² = 0.371; Table 2). An assessment of spatial autocorrelation in the residuals using the Moran’s I statistic showed no significant spatial clustering. However, a significant Koenker’s BP statistic (p = 0.002) suggested that there was non-stationarity in the response, i.e., the explanatory variables did not have a consistent spatial relationship with the dependent variable.
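The relationship between the reported R² and adjusted R² values can be checked with the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. The short sketch below applies it to the two models described above; that the original software used exactly this formula is an assumption.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# (label, R-squared, observations, predictors) as reported in the text.
models = [
    ("all sampled sites", 0.307, 53, 2),
    ("two outliers removed", 0.396, 51, 2),
]
checked = {label: round(adjusted_r2(r2, n, p), 3) for label, r2, n, p in models}
for label, value in checked.items():
    print(f"{label}: adjusted R2 = {value}")
```

Both values agree with those reported (0.279 and 0.371), which is consistent with two predictors in each model.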

Table 2 Ordinary least squares regression models of landscape variables summarized within sub-watershed reporting units predicting levels of E2Eq (ng/L). Models shown include the most parsimonious models (on the basis of AIC), using all sites sampled (n = 53) with two outlier observations removed (n = 51) after examination of OLS residuals and subdivided into southern sites and northern sites based on geographically weighted regression results. Predictor variables include percent of the sub-watershed in agriculture as crops (%AG as crop), square root-transformed confined animal feeding operation density (AFO density), Shannon’s diversity of land cover types (LC diversity), and design flow (mgd) of wastewater treatment facility point discharges (WWTP flow)

Fitting of local regression models with geographically weighted regression (GWR) resulted in improvement of model fit (R² = 0.48, adjusted R² = 0.41); however, examination of the local model fit revealed that the southern portion of the study area had high local coefficient of determination (R²) values when predicting levels of E2Eq from the two variables (square root-transformed density of confined animal feeding operations and percent of the catchment in agriculture land use as crops), while the northern sample sites had low local model R² values (Fig. 3). Division of the 51 sites used in the overall OLS model into zones of northern sub-watersheds (n = 18) and southern sub-watersheds (n = 33) and fitting individual OLS models to each zone resulted in a significant model with better approximation (R² = 0.594, adjusted R² = 0.568) of E2Eq in the southern zone, where (square root-transformed) confined animal feeding operation density and percent of the catchment in agriculture (as crops) were still the only significant predictors in the model with the smallest AIC values (Table 2). Conversely, for the northern zone, the best OLS model (as measured by AIC) consisted of two significant predictors: Shannon’s diversity of land cover types and the design flow (million gallons per day) from permitted wastewater treatment plants. Although the overall model was significant as measured by the joint Wald statistic, the model fit was relatively poor (R² = 0.28, adjusted R² = 0.18), suggesting that the set of landscape predictor variables we assembled for this analysis is unable to adequately estimate the variation in levels of E2Eq in areas that are not dominated by agricultural land use (Table 2). Additionally, the parameter estimate for design flow of wastewater treatment was negative, contrary to expectation. Close examination of the data values revealed that the negative coefficient was most likely influenced by one site where the design flow was an order of magnitude greater than other sites and yet the measured E2Eq was below quantitation limit. Although the variable was significant in the regression, this relationship may be spurious and warrants further investigation.
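The idea behind geographically weighted regression, fitting a separate distance-weighted regression around each location, can be shown with a deliberately simplified sketch. It uses a single predictor, a fixed Gaussian kernel bandwidth, and hypothetical coordinates and values; the analysis described above used two predictors and AIC-based bandwidth selection, neither of which is reproduced here.

```python
import math


def gwr_local_fit(coords, x, y, focal, bandwidth):
    """Weighted least squares fit of y on x centred on `focal`.

    Observations are weighted by a Gaussian kernel of their distance to
    the focal point. Returns (slope, intercept, local R-squared).
    """
    weights = [
        math.exp(-0.5 * (math.hypot(cx - focal[0], cy - focal[1]) / bandwidth) ** 2)
        for cx, cy in coords
    ]
    sw = sum(weights)
    xw = sum(w * a for w, a in zip(weights, x)) / sw
    yw = sum(w * b for w, b in zip(weights, y)) / sw
    sxx = sum(w * (a - xw) ** 2 for w, a in zip(weights, x))
    sxy = sum(w * (a - xw) * (b - yw) for w, a, b in zip(weights, x, y))
    slope = sxy / sxx
    intercept = yw - slope * xw
    ss_res = sum(w * (b - (intercept + slope * a)) ** 2 for w, a, b in zip(weights, x, y))
    ss_tot = sum(w * (b - yw) ** 2 for w, b in zip(weights, y))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical sites (coordinates in km): five in the south, five in the north.
coords = [(0, s) for s in range(5)] + [(0, 100 + s) for s in range(5)]
predictor = [1, 2, 3, 4, 5] * 2
response = [2, 4, 6, 8, 10] + [3, 1, 3, 1, 3]  # linear in the south only

slope_s, _, r2_south = gwr_local_fit(coords, predictor, response, (0, 2), 10.0)
slope_n, _, r2_north = gwr_local_fit(coords, predictor, response, (0, 102), 10.0)
```

In this constructed example the local fit is nearly perfect around the southern focal point and explains almost none of the variation around the northern one, mirroring the kind of regional contrast in local R² that motivates fitting separate models by zone.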

Fig. 3 Local regression results (geographically weighted regression) predicting concentrations of E2Eq at sampled sites from percent of the subwatershed reporting unit in agriculture as crops and square root-transformed density of confined animal feeding operations variables. Note the regional shift in model predictive capability

Discussion

The presence of estrogenic compounds in the nation’s waterways is of increasing concern due to observed and potential effects on aquatic wildlife and human health. Discovering the landscape sources and potential delivery pathways of these compounds is therefore of crucial importance. There are many potential sources of estrogenic compounds on the landscape, from both endogenous sources such as human and animal excretion and exogenous sources such as chemical breakdown products. Ciparis et al. (2012) demonstrated that in highly agricultural areas of the Shenandoah Valley (VA, USA), the density of confined animal feeding operations and the percent of the catchment in agriculture were reliable and important predictors of the presence of estrogenic equivalent compounds (as measured by the BLYES yeast reporter assay, i.e., E2Eq).

Although our study area was much larger, our findings confirm those of Ciparis et al. (2012) as we found that the density of confined animal feeding operations and the percent of the catchment in agricultural land uses were indeed the most reliable predictors of levels of E2Eq overall. However, where agricultural land uses were less intensive (e.g., in the areas in the north of our study area closer to Washington, DC, USA, and the Maryland/Pennsylvania border), this relationship was less reliable, and instead the levels of E2Eq may be better predicted by other land use attributes. For example, a greater diversity of land uses may introduce other sources of estrogenic compounds to streams. Since the coefficient of determination for the subset of models built using only data from the northern portion of the study area was weak (adjusted R² = 0.19), it is probable that the regression models for these northern sub-watersheds are incomplete as we did not account for some landscape factor important for structuring the levels of E2Eq present in these areas.

While we used consistent mapping methods for assessing land cover throughout the study area, it is possible that we may have under-counted animal production activities that changed in characteristic form as we moved from the southern portion of the study area to the north. For instance, poultry production activities have a very distinctive form and the elongated metal-roofed poultry houses are readily observable on aerial imagery. Other than a possible confusion with mini-storage units that typically only occur in urban, suburban, or exurban settings, there really is no other rural land use type that is easily confused with these activities. In contrast, dairy production activities take on numerous spatial forms due to a wide variety of barn structures, loafing lot configurations, and farm ages. As a result, we may not have adequately accounted for the density of these concentrated animal rearing and production activities in areas where we did not have ancillary data sources (such as permit databases) to augment our aerial mapping efforts. However, a cursory remapping of potential dairy farms in the northern portion of our study area using aerial imagery available in the Google Earth program did not appreciably change our estimates of confined animal feeding operation density. Additionally, we did not specifically account for the influence of grazing beef cattle, which may be a significant source of estrogens to streams from direct runoff of waste. Percent of land area mapped as pasture was included in the suite of predictor variables in our models, but it was not significant.

Likewise, our estimates of WWTP flow were somewhat crude as they were from publicly available data sources and provide only a relative approximation of potential inputs to streams rather than summaries of actual daily or weekly flow values. Additionally, the location of permitted flows from WWTP along stream networks does not always coincide with easily observable features such as settling ponds or WWTP infrastructure, calling into question the accuracy of this map layer. However, the dataset we used has been vetted by other researchers and is the same dataset used to calibrate point source loads for SPARROW models of nutrient delivery to streams (Ator et al. 2011).

The levels of E2Eq that we found were generally low, and more than half of the observations (58 %) were below quantitation limits. This contrasts with only 23 % of samples below quantitation limits in Ciparis et al. (2012). In this study, we may have sampled more areas without significant watershed land use sources of E2Eq than Ciparis et al. (2012) did. Since our sampling occurred in early June 2010, we may also have missed peak spring flows that might have washed more material containing E2Eq off the land surface or delivered more E2Eq through wastewater point source outfalls. Ciparis et al. (2012), in repeated sampling, found the highest levels of E2Eq during high flows, although they found no effect of sampling period on statistical models relating watershed land use to levels of E2Eq. Comparison of grab samples from other time periods as well as data from long-term monitoring equipment such as polar organic chemical integrative samplers (POCIS) is warranted.

Given our methodology, it is reasonable to question the mechanisms of delivery of estrogenic compounds, whether endogenous or exogenous, to regional waterways. We assume in this study that the primary mechanism of E2Eq delivery to streams is through either non-point source runoff or point source wastewater treatment. Depending on treatment technologies, wastewater treatment facilities are capable of removing estrogenic compounds (e.g., through activated sludge treatment; Barber et al. 2012), although low concentrations may still be present in treated wastewater (Khanal et al. 2006). Land application of manure as fertilizer, however, has been shown to result in runoff of E2 into streams and groundwater (Shore and Shemesh 2003; Dyer et al. 2001; Peterson et al. 2000; Finlay-Moore et al. 2000; Nichols et al. 1998), although this can be reduced by buffer strips and other “best management practices”. While we measured density of confined animal feeding operations, we did not directly measure manure spreading or nutrient management activities. We implicitly assume that waste generated from intensive animal rearing activities correlates with the density of such activities on the landscape. While most manure is managed locally, there is some inter-basin transfer of animal manure for use as fertilizer, and these activities may lead to some spatial generalization and inaccuracies in our regression models.

It is apparent from laboratory and field studies that the spreading of manure from cows and poultry can be a significant source of environmental estrogens, although poultry waste is the main source of 17-β estradiol (Andaluri et al. 2012; Khanal et al. 2006; Shore and Shemesh 2003), the equivalent of which is measured by the BLYES assay used in this study (Sanseverino et al. 2009). The spreading of biosolids from human sewage sludge may also be a significant source of land-applied endogenous estrogenic compounds (US-EPA 2009), but biosolids were not accounted for in this study due to lack of available data. However, on a national level, biosolids are reported to be a less significant source of solid waste (by tonnage) than cattle, swine, or poultry manure and are thought to contribute less than 1 % of estrogen load to the environment (Andaluri et al. 2012). Neither Andaluri et al. (2012) nor this study accounted for potential sources of estrogenic compounds from leaky septic systems.

The presence of 17-β estradiol and other estrogenic compounds in the environment can be detrimental at high levels, and even at low levels E2 (or E2Eq) is reported to induce intersex conditions in aquatic organisms (Jobling et al. 2006; Kidd et al. 2007; Vajda et al. 2008; Tetreault et al. 2011; Hirakawa et al. 2012). However, it is unclear how widespread these conditions are in US waters. Using a GIS-based water quality model, Anderson et al. (2012) estimated that only 1.1 % of river kilometers in the 12 US river basins studied were at risk of long-term E2Eq exposures above a predicted no effect concentration (PNEC) of 2.0 ng/L and that only 0.8 % of river kilometers studied were above a PNEC of 5.0 ng/L for short-term exposures. The stream segments at risk were mostly influenced by upstream wastewater treatment plants. However, that study modeled concentrations at mean and low flow conditions only and considered only human-derived estrogens, explicitly ignoring non-point sources of estrogenic compounds from animal husbandry. Since Ciparis et al. (2012) found concentrations of E2Eq above the PNEC proposed by Anderson et al. (2012) in streams draining intensive animal rearing areas during time periods with high stream flow and increased surface runoff, a focus on non-point sources of estrogenic compounds is warranted. However, Ciparis et al. (2012) and the current study also found quantifiable levels of E2Eq during low flow periods, suggesting that background levels of E2Eq are present in streams in some areas with intensive land use. Further determining the sources and relative contributions of estrogenic compounds in the environment, whether from surface runoff, from compounds persisting in agricultural soils and groundwater, from leaking septic systems, or from point sources such as wastewater treatment, should be a research priority.
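
The threshold comparison described above can be illustrated with a minimal Python sketch. The two PNEC values are those attributed to Anderson et al. (2012) in the text; the function names and the sample concentrations are hypothetical and are not data from this study or from Anderson et al. (2012), whose estimates refer to modeled river kilometers rather than individual grab samples.

```python
# PNEC values (ng/L E2Eq) as cited in the text from Anderson et al. (2012).
PNEC_LONG_TERM = 2.0   # long-term exposure threshold
PNEC_SHORT_TERM = 5.0  # short-term exposure threshold


def classify_e2eq(conc_ng_per_l):
    """Label one E2Eq concentration (ng/L) relative to the two PNECs."""
    if conc_ng_per_l > PNEC_SHORT_TERM:
        return "above short-term PNEC"
    if conc_ng_per_l > PNEC_LONG_TERM:
        return "above long-term PNEC"
    return "at or below both PNECs"


def percent_exceeding(concs_ng_per_l, threshold):
    """Percent of values strictly above a threshold."""
    values = list(concs_ng_per_l)
    if not values:
        raise ValueError("no concentrations supplied")
    return 100.0 * sum(c > threshold for c in values) / len(values)


# Hypothetical concentrations for illustration only (not measured values).
example_concs = [0.3, 1.2, 2.5, 6.1]
labels = [classify_e2eq(c) for c in example_concs]
share_long_term = percent_exceeding(example_concs, PNEC_LONG_TERM)
```

A screening of this kind says nothing about source; it only flags where measured or modeled concentrations exceed a chosen effect threshold.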

Conclusions

Our study set out to demonstrate the use of a randomized, stratified sampling design for a systematic examination of the distribution of estrogenic compounds in streams over a regional area with varied land uses and known problems with intersex. Although we did not set out to evaluate our design against other potential sampling designs (e.g., purely random or systematic), we sought to demonstrate that the use of systematic methods provides an appropriate basis for information discovery through statistical modeling and spatial data analysis. Our study confirms the importance of agricultural activities on levels of E2Eq and also highlights the importance of other potential landscape sources of E2Eq in areas where intensive agriculture is not the dominant land use. More research is needed to determine the specific land use practices and runoff pathways that deliver biologically disruptive estrogenic compounds to surface waters so that effective mitigation methods can be designed and implemented. Better data on manure and biosolid spreading and on areas served by septic systems versus wastewater treatment would also help to pinpoint problem areas in specific land use configurations.

We propose that future studies use similar spatial analysis approaches to develop a systematic geographic framework for reconnaissance of estrogenic EDCs in the environment rather than conducting ad hoc or purposive sampling. Such systematic designs, coupled with detailed field studies, are useful for assessing the correlation of specific land use types to levels of estrogenic activity across a large area in a consistent and unbiased manner. Furthermore, through application of spatial statistical approaches such as geographically weighted regression, we demonstrated the ability to detect regional variability in the response of estrogenic activity to land use influences, variability that is undetectable in non-spatial regression models. Use of such techniques can be a powerful source of insight for development of alternative hypotheses for future studies.
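
To make the idea of geographically weighted regression concrete, the following is a minimal NumPy sketch rather than the software or settings used in this study. It assumes a Gaussian distance kernel with a fixed bandwidth, which is only one of several common choices, and all names and the synthetic data are hypothetical. At each site it fits a weighted least-squares regression in which nearby sites count more, so the fitted coefficients are allowed to vary across the region.

```python
import numpy as np


def gwr_coefficients(coords, X, y, bandwidth):
    """Local weighted least-squares coefficients at every site.

    coords: (n, 2) site coordinates; X: (n,) or (n, p) predictors;
    y: (n,) response; bandwidth: Gaussian kernel bandwidth in the same
    units as coords (an assumed, fixed value, not a calibrated one).
    Returns an (n, p + 1) array: intercept first, then one slope per predictor.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)  # Gaussian kernel
        xtw = design.T * weights  # equals X^T W with W = diag(weights)
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas


# Synthetic illustration only: the effect of a land use predictor on the
# response grows from west to east, which one global slope cannot express.
rng = np.random.default_rng(0)
site_xy = rng.uniform(0.0, 10.0, size=(300, 2))
land_use = rng.uniform(0.0, 1.0, size=300)
true_slope = 0.5 * site_xy[:, 0]
response = true_slope * land_use

local_slopes = gwr_coefficients(site_xy, land_use, response, bandwidth=1.5)[:, 1]
global_slope = np.polyfit(land_use, response, 1)[0]
```

In this synthetic case the non-spatial fit returns one slope for the whole region, while the local slopes track the west-to-east gradient. In practice the bandwidth is usually chosen by cross-validation or an information criterion rather than fixed by hand.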