Introduction

The World Health Organization estimated that 600 million people worldwide fall ill every year after eating contaminated food (WHO 2015). Consumption of fresh produce has been increasingly viewed as a matter influencing food safety and health (Kearney 2010). The microbial quality of irrigation water is recognized as a substantial factor affecting contamination of fresh fruits and vegetables (Kearney 2010; Pachepsky et al. 2012; Oliveira et al. 2012; Akinde et al. 2016). Therefore, irrigation water that will contact fresh produce must be monitored to assess threats from bacteria that cause foodborne disease.

Generic Escherichia coli (E. coli) and Enterococci are commonly used as fecal indicator bacteria (FIB) for the microbial quality of irrigation water (Steele et al. 2005; Boehm and Sassoubre 2014; Chandrasekaran et al. 2015). Specific metrics and thresholds to evaluate microbial water quality are based on concentrations of these two bacteria. Geometric mean E. coli concentrations and E. coli concentrations at the 90% probability level are used by the US EPA for recreational waters (US EPA 2003) and have been proposed by the US FDA for irrigation waters (US FDA 2018).

FIB concentrations in surface waters undergo rapid change in time and space (Cha et al. 2010; Wu et al. 2011; Perkins et al. 2014; Kim et al. 2017). The spatial organization of FIB concentrations in freshwater sources was demonstrated in several studies. Rao et al. (2015) observed that E. coli concentrations decreased with increasing distance from riverside along three different rivers. Hyland et al. (2003) found fecal coliform and E. coli counts increased at the confluence of drainages from agricultural lands along the river. Pachepsky et al. (2018) demonstrated that spatial variations of E. coli in ponds can follow a pattern, according to which E. coli concentrations in some parts of the pond tend to be lower than the geometric mean across the pond, and the concentrations in other parts tend to be higher than the geometric mean over several sampling times. Piorkowski et al. (2014) showed that the spatial patterns of E. coli concentrations in bottom sediments existed along the 2-km creek reach and could be explained by water velocity and effective particle size. Stocker et al. (2016) observed a spatial pattern in E. coli concentrations in two creeks over 0.7-km reaches during baseflow periods when no runoff or sediment resuspension occurred.

The preservation of a spatial pattern over time is termed temporal stability (Vachaud et al. 1985). Analysis of temporally stable patterns has facilitated upscaling of observations to obtain average values across the observation area and has been suggested to improve monitoring strategies for environmental variables such as soil water content (Huang et al. 2018) and crop yields (Miao et al. 2018). To our knowledge, temporal stability of FIB concentrations in creeks has not been studied.

The objectives of this work were (1) to investigate the spatiotemporal variation of E. coli and Enterococci concentrations in a large creek with diverse land use along it and (2) to evaluate the temporal stability of FIB concentrations along the creek.

Study site and methods

Site description

The study area was the headwaters of Conococheague Creek in Franklin County, south-central Pennsylvania, USA. It is near the center of the Chesapeake Bay Watershed and east of the Appalachian Mountains (Fig. 1). Study sites are upstream and east of Chambersburg, Pennsylvania. The uppermost tributaries flow from a mountainous state conservation area in Adams County, where the main headwater drains a water reservoir for the town of Chambersburg. Land use in the area draining to the studied reach falls into three classes: forest (57.2%), farmland (25.1%), and urban (15.2%). For 1981 to 2010, the watershed received an average annual precipitation of 1058 mm and had an average temperature of 11.3 °C. Five water sampling locations (TP, I81, SS, SG, SD) and one weather monitoring station were established (Fig. 1). The total drainage area upstream from the last sampling station (SD) covers over 243 km2. TP is the most upstream station and drains 99 km2 of forested land. From TP, it is 10.3 km to the I81 station, 13.6 km to SS, 17 km to SG, and 22.8 km to SD. The I81 station drains 180 km2, including the TP forest drainage and an agricultural area that begins 1 km downstream from the TP station. The drainage areas of the SS, SG, and SD locations are 220, 225, and 243 km2, respectively.

Fig. 1 Land use map of the upper part of the Conococheague Creek watershed and monitoring locations

Monitoring data acquisition

Water samples were collected weekly from October 2015 to October 2018, and daily on alternating weeks during July and August of 2017 and 2018, and were analyzed for E. coli and Enterococci within hours of sampling. The total number of observation days was 179. The 1 L samples were delivered to the laboratory on ice within 3 h after collection. Once at the lab, samples were serially diluted with peptone buffer solution containing 0.05% Tween-20 (EMD Millipore, MA, USA) so that filtration of diluted volumes yielded between 1 and 99 colony-forming units (CFU). After vacuum filtration using 0.45-μm filters, E. coli filters were plated on modified membrane Thermotolerant E. coli Agar (mTEC) and incubated at 35 °C for 2 h and then at 45 °C for 22–24 h (EPA method 1603). Concentrations were corrected for dilutions and reported in CFU per 100 mL. Processing of samples for Enterococci concentrations was similar to that for E. coli, except that mEI agar was used and plates were incubated at 41 °C for 24 h.

Temporal variation modeling

The temporal changes of the logarithms of concentrations followed the sine wave of the temperature change (Fig. 2). The sine dependence on time

$$ {x}_i(t)={A}_i+{B}_i\sin \left(2\pi \frac{t-C}{365}\right) $$
(1)

was fitted to data on logarithms of E. coli and Enterococci concentrations at all locations. In Eq. (1), Ai is the average annual logarithm of concentration and Bi is the amplitude for location i, C is the phase shift, and t is the number of days from the beginning of observations. The value of the parameter C was fixed at 240 for E. coli and 231 for Enterococci.
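Because C is fixed, Eq. (1) is linear in Ai and Bi, so the fit reduces to ordinary least squares. The following is a minimal sketch of that idea, not the fitting procedure actually used in the study (which the text does not specify). The function name `fit_sine` and the synthetic series are hypothetical.

```python
import numpy as np

def fit_sine(t_days, log_conc, phase_shift):
    """Least-squares A and B in x(t) = A + B*sin(2*pi*(t - C)/365), with C fixed."""
    t = np.asarray(t_days, dtype=float)
    x = np.asarray(log_conc, dtype=float)
    s = np.sin(2.0 * np.pi * (t - phase_shift) / 365.0)
    # Design matrix: a column of ones (for A) and the sine term (for B).
    design = np.column_stack([np.ones_like(s), s])
    (a, b), *_ = np.linalg.lstsq(design, x, rcond=None)
    fitted = a + b * s
    ss_res = np.sum((x - fitted) ** 2)
    ss_tot = np.sum((x - x.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot  # coefficient of determination
    return a, b, r2

# Synthetic weekly series (hypothetical values, not study data).
t = np.arange(0, 1095, 7)
x = 1.8 + 0.9 * np.sin(2.0 * np.pi * (t - 240) / 365.0)
a, b, r2 = fit_sine(t, x, phase_shift=240)
```

With noise-free synthetic input the function recovers the generating values of A and B. Real data would give R2 below 1.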

Fig. 2 Time series of daily precipitation and temperature at a weather monitoring station and logarithms of E. coli and Enterococci concentrations (both in CFU 100 mL−1) and sine curve fitting at monitoring stations TP, I81, SS, SG, and SD

Temporal stability assessment

Temporal stability was defined by Vachaud et al. (1985) as the time-invariant association between spatial location and classical statistical parametric values of observations. To diagnose and quantify the temporal stability, these authors proposed two statistics: the mean relative difference (MRD) and the Spearman rank correlation coefficient.

The MRD values are derived from relative differences observed at each location over the observation period. The relative differences (RDij) between an individual measurement of xij at location i and time j and spatial average of variable \( {\overline{x}}_j \) at the same time from all locations are calculated as

$$ {RD}_{ij}=\frac{x_{ij}-{\overline{x}}_j}{{\overline{x}}_j} $$
(2)

The mean relative difference for the location i (MRDi) becomes

$$ {MRD}_i=\frac{1}{N_t}\sum \limits_{j=1}^{j={N}_t}{RD}_{ij} $$
(3)

where Nt is the number of observation times. Values of MRDi < 0 and MRDi > 0 mean that measurements at location i tend to be smaller and larger than the spatial average, respectively. Locations where MRDi is close to zero provide a good representation of the spatial average \( {\overline{x}}_j \) across all observation locations.
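Eqs. (2) and (3) translate directly into array operations. The sketch below assumes the observations are arranged as a matrix with one row per location and one column per observation time. The function name and the small example matrix are hypothetical.

```python
import numpy as np

def mean_relative_difference(x):
    """Return (RD, MRD) for x of shape (n_locations, n_times)."""
    x = np.asarray(x, dtype=float)
    spatial_mean = x.mean(axis=0)            # average over locations at each time j
    rd = (x - spatial_mean) / spatial_mean   # Eq. (2), one value per location and time
    mrd = rd.mean(axis=1)                    # Eq. (3), one value per location
    return rd, mrd

# Three locations, two observation times (hypothetical values).
x = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
rd, mrd = mean_relative_difference(x)
```

In this example the first location is always below the spatial average, the second always equals it, and the third is always above it. Note that the relative difference becomes unstable when the spatial average is close to zero, which can happen with log-transformed values near 1 CFU per 100 mL.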

The set of relative differences (RDij) for the location i can be also characterized by the standard deviation (SDRDi) defined as

$$ {\mathrm{SDRD}}_{\mathrm{i}}=\sqrt{\frac{1}{N_t-1}\sum \limits_{j=1}^{N_t}{\left({\mathrm{RD}}_{ij}-{\mathrm{MRD}}_i\right)}^2} $$
(4)

This statistic shows the uncertainty associated with MRD values. A small SDRDi indicates that most RDij values have the same sign as MRDi.

Another aspect of temporal stability is the preservation of ranks of observation locations from one observation time to another (Vachaud et al. 1985). If observation locations are ranked by the values xij at times j and j + 1, then a high correlation between the ranks at time j and time j + 1 means that the spatial order of observation locations in terms of observed concentrations is preserved in time. The Spearman correlation coefficient rs,j is used to quantify such stability of ranks. It is computed as

$$ {r}_{s,j}=1-\frac{6\sum \limits_{i=1}^{i={N}_L}{\left[\operatorname{rank}\left({x}_{ij}\right)-\operatorname{rank}\left({x}_{i,j+1}\right)\right]}^2}{\left({N}_L-1\right){N}_L\left({N}_L+1\right)} $$
(5)

where NL is the total number of the observation locations.

Baseflow conditions

The temporal stability was examined for baseflow conditions. Baseflow is the portion of streamflow that is sustained between precipitation events and fed to stream channels by delayed pathways (Huang et al. 2016). To analyze the temporal stability during baseflow, we separated the measurements based on the duration of surface runoff and a rainfall threshold. The duration of surface runoff was calculated from the empirical relation defined by Linsley et al. (1982):

$$ N={A}^{0.2} $$
(6)

where N is the number of days after which surface runoff ceases, and A is the drainage area in square miles. We used this equation to calculate the number of days over which surface runoff affected the stream after rain. The duration of surface runoff was 2.1, 2.3, 2.4, 2.4, and 2.5 days at the TP, I81, SS, SG, and SD locations, respectively.
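As an arithmetic check, the durations follow from the drainage areas given in the site description once they are converted from square kilometers to square miles. The sketch below is illustrative. The function name is hypothetical and the conversion factor is the standard 2.589988 km2 per square mile.

```python
KM2_PER_MI2 = 2.589988  # square kilometers in one square mile

def runoff_duration_days(area_km2):
    """Days until surface runoff ceases, N = A**0.2 with A in square miles."""
    area_mi2 = area_km2 / KM2_PER_MI2
    return area_mi2 ** 0.2

# Drainage areas (km2) from the site description.
areas_km2 = {"TP": 99, "I81": 180, "SS": 220, "SG": 225, "SD": 243}
durations = {site: round(runoff_duration_days(a), 1) for site, a in areas_km2.items()}
```

Rounded to one decimal place, the results match the durations reported above.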

The rainfall threshold was defined to exclude concentration measurements made during periods affected by surface runoff. Han et al. (2015) reported a rainfall threshold of 10 mm for a forest-dominated area in a northern subtropical climate with annual precipitation ranging from 680 to 1700 mm. We treated observations as baseflow measurements when the cumulative precipitation during the three days before the observation date did not exceed 10 mm. The total number of observation days with baseflow conditions was 139.
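The selection rule can be sketched as a simple filter on a daily precipitation record. This is an illustration under stated assumptions, not the study's processing code. It assumes the three-day window covers the three days preceding the observation date and excludes the observation date itself, and it treats days missing from the record as dry. The function name and the precipitation values are hypothetical.

```python
from datetime import date, timedelta

def is_baseflow(obs_date, daily_precip_mm, window_days=3, threshold_mm=10.0):
    """True if cumulative precipitation over the preceding window is at most the threshold."""
    total = sum(
        daily_precip_mm.get(obs_date - timedelta(days=k), 0.0)  # missing day assumed dry
        for k in range(1, window_days + 1)
    )
    return total <= threshold_mm

# Hypothetical daily precipitation record (mm).
precip = {
    date(2017, 7, 1): 4.0,
    date(2017, 7, 2): 8.0,
    date(2017, 7, 3): 0.0,
    date(2017, 7, 6): 2.0,
}
after_rain = is_baseflow(date(2017, 7, 4), precip)  # preceding 3 days total 12 mm
dry_spell = is_baseflow(date(2017, 7, 7), precip)   # preceding 3 days total 2 mm
```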

Statistics

Statistical testing was done with the software PAST (Hammer et al. 2001). Pearson’s correlation analysis was used to quantify the correlation between the logarithms of FIB concentrations and temperature, as well as the correlation among the logarithms of FIB concentrations at all monitoring locations. The Shapiro-Wilk test was used to assess the normality of the distribution of MRDs for logarithms of FIB concentrations, which was needed to justify the use of one-way ANOVA with post hoc comparisons by the Tukey test. Levene’s test for equality of variances was applied to the SDRDs. The t test was used to determine whether the MRDs of FIB concentrations differed significantly between the warm season and the cold season. The level of significance was set at 0.05.

Results

Weather conditions

Supplemental Fig. S1 shows the variation of monthly precipitation and temperature from October 2015 to October 2018 in the study area. The average annual precipitation and temperature were about 715 ± 270 mm and 12.5 ± 0.2 °C. Monthly precipitation was relatively high in the warm season from May to October. During this time, 100 mm of rainfall occurred in July alone. Low monthly precipitation of about 40 mm occurred during the cold season from November to April except for February (60 mm). Average monthly temperature was also high in the warm season (20.2 °C) and low in the cold season (4.9 °C). The highest and lowest monthly temperatures were observed in July (24 °C) and January (− 0.4 °C).

Spatiotemporal variability of FIB concentrations

Observed E. coli and Enterococci concentrations are shown in Fig. 2 along with daily rainfall and average temperature. FIB concentrations were high in the warm season and low in the cold season. Significant positive relationships between log FIB concentrations and daily temperature were found at all monitoring locations (Table 1). Pearson correlation coefficients ranged from 0.62 to 0.73 and from 0.58 to 0.76 for E. coli and Enterococci, respectively.

Table 1 Pearson correlation coefficients between bacterial concentrations and air temperature

Sine curve fitting results for the temporal changes of logarithms of FIB concentrations are shown in Fig. 2 and in Table 2. The coefficient of determination (R2) of the sine regression ranged from 0.40 to 0.62 for E. coli concentrations and from 0.48 to 0.75 for Enterococci concentrations. The values of parameters Ai and Bi for Enterococci were higher than the values for E. coli at all monitoring locations.

Table 2 Parameters of sine curves fitted to time series of the logarithms of bacterial concentrations in CFU 100 mL−1

Concentrations of measured FIB tended to increase with distance downstream. The geometric mean values of E. coli and Enterococci concentrations over the study period at the SG location were 5.8 and 5.1 times higher than at TP, respectively. Significant positive correlations were found among log E. coli concentrations at all locations (Table 3). A similar correlation matrix was found for Enterococci. The correlation coefficients ranged from 0.56 to 0.83 among log E. coli concentrations and from 0.69 to 0.89 among log Enterococci concentrations. The correlation weakened as the distance between locations increased. Logarithms of Enterococci concentrations showed higher correlation coefficients among locations than log E. coli concentrations. According to the paired t test, the differences between correlation coefficients for logarithms of E. coli and Enterococci concentrations were significant for the location pairs TP-I81, TP-SS, TP-SD, I81-SS, and SS-SG.

Table 3 Correlation between logarithms of bacteria concentrations at different locations. Values for Enterococci and E. coli are above and below the diagonal, respectively. Values for Enterococci in italics are significantly (P < 0.05) larger than values for E. coli

Monthly geometric means of E. coli concentrations at monitoring locations are shown in the supplemental Fig. S2. Geometric mean concentrations and statistical threshold values (STV) were estimated for each observation date using four observations spanning the month that included this date. The geometric mean values were generally lower than the threshold E. coli concentration of 126 CFU/100 mL set by US EPA (US EPA 2003) and US FDA (US FDA 2018), but the threshold was exceeded during the warm season. The location TP was the exception, as concentrations there stayed below the threshold almost all year round; the geometric mean of E. coli at TP exceeded the threshold only once in each of 2016 and 2017. The geometric means at location SG were higher than at the other locations during August and October.
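The text does not give the formulas used for these two metrics, so the sketch below is an assumption: it takes the geometric mean as the antilog of the mean log10 concentration and approximates the STV as the 90th percentile of a log-normal distribution, that is, the antilog of the mean plus 1.282 standard deviations of the log10 values. The function name and the four example concentrations are hypothetical.

```python
import math
import statistics

def geometric_mean_and_stv(concentrations):
    """Geometric mean and an STV estimate from concentrations in CFU/100 mL (all > 0)."""
    logs = [math.log10(c) for c in concentrations]
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs)           # sample standard deviation
    gm = 10 ** mean_log
    stv = 10 ** (mean_log + 1.282 * sd_log)   # assumed 90th-percentile estimate
    return gm, stv

# Four hypothetical observations within one month.
gm, stv = geometric_mean_and_stv([10, 100, 1000, 100])
exceeds_gm_threshold = gm > 126  # compare with the 126 CFU/100 mL threshold
```

Zero counts would need special handling before the log transform, which this sketch does not attempt.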

Temporal stability of FIB concentrations

The MRD values of the logarithms of E. coli and Enterococci concentrations computed over the three-year period of observation and for each year of observations are shown in Fig. 3a and b, respectively. The MRD values mostly increased downstream, but the MRDs of E. coli and Enterococci at SG were slightly higher than at the SD location. The differences between the highest and the lowest MRD for E. coli and Enterococci were 0.41 and 0.39, respectively. The MRD values of E. coli at all locations were significantly different according to the t test. One-way ANOVA showed that there were no significant differences in the MRD values among the three years except for location SG. The MRD values of Enterococci at all locations were significantly different except for the location pairs I81-SS and SG-SD. There were no significant differences in the Enterococci MRD values among the years.

Fig. 3 Mean relative differences (MRD) of logarithms of E. coli and Enterococci concentrations (CFU 100 mL−1) across monitoring locations TP, I81, SS, SG, and SD at the Conococheague Creek. Error bars show the standard deviations of relative differences (SDRD). (a) MRDs computed over the three-year period of observation, (b) MRDs for each of the three years of observations, and (c) seasonal MRDs for two groups: warm season (May to October) and cold season (November to April)

The SDRDs of logarithms of E. coli concentrations at I81, SS, SG, and SD had similar values from 0.12 to 0.13, whereas the SDRD at TP was 0.16. The Levene test showed that the SDRD at TP was significantly higher than the others. The SDRDs of logarithms of Enterococci concentrations at I81 and SS were 0.15. These values were around 0.19 at TP, SG, and SD, higher than at I81 and SS. The SDRD at SS was significantly lower than that at TP.
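For reference, each SDRD is the sample standard deviation (divisor Nt − 1) of the relative differences at one location. A minimal sketch, assuming the relative differences are held in a matrix with one row per location; the function name and the example values are hypothetical.

```python
import numpy as np

def sdrd(rd):
    """Sample standard deviation of relative differences for each location (row)."""
    rd = np.asarray(rd, dtype=float)
    return rd.std(axis=1, ddof=1)  # ddof=1 gives the N_t - 1 divisor

# Two locations, two observation times (hypothetical relative differences).
rd_example = np.array([[0.1, 0.3],
                       [-0.2, -0.2]])
sdrd_values = sdrd(rd_example)
```

The second location has identical relative differences at both times, so its SDRD is zero.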

Separating data into warm and cold season subsets showed that the temporal stability exists for each of the seasons (Fig. 3c). Changes of seasonal MRD along the observation reach were similar to changes in MRDs that were calculated regardless of the season. The differences of MRDs for E. coli among locations were larger in the cold season than in the warm season. There were significant differences of the MRDs for E. coli between warm and cold seasons at all locations except I81. The SDRDs for E. coli in the cold season were also larger than in the warm season except for location SD. The MRDs and SDRDs for Enterococci between warm and cold seasons showed starker differences compared with E. coli. There were significant differences for Enterococci between the warm season MRD and the cold season MRD at all locations. The MRDs in the cold season were 3.0 and 5.8 times higher than in the warm season at SG and SD locations. The SDRDs for Enterococci in the cold season were more than 1.7 times higher than in the warm season at all locations. Seasonal differences between SDRDs for Enterococci at locations TP, SS, SG, and SD were statistically significant.

Since the MRD at location SS was close to zero, observations at this location should represent the average over all observation locations. The quality of this representation is shown in Fig. 4. The coefficient of determination of the regressions was 0.91 for Enterococci and 0.81 for E. coli. The slopes of the regressions did not differ significantly from 1, and the root-mean-squared errors of predicted average logarithms of the FIB concentrations were 0.26 for E. coli and 0.22 for Enterococci.
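A check of this kind can be sketched as a linear regression of the spatial average on the values at the candidate location. The sketch assumes the root-mean-squared error is computed from the regression predictions, which the text does not state explicitly. The function name and the example values are hypothetical.

```python
import numpy as np

def representativeness(site_log, mean_log):
    """Regress the spatial average on one site's values; return slope, intercept, R2, RMSE."""
    site = np.asarray(site_log, dtype=float)
    avg = np.asarray(mean_log, dtype=float)
    slope, intercept = np.polyfit(site, avg, 1)  # first-degree least-squares fit
    pred = slope * site + intercept
    ss_res = np.sum((avg - pred) ** 2)
    ss_tot = np.sum((avg - avg.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = float(np.sqrt(np.mean((avg - pred) ** 2)))
    return slope, intercept, r2, rmse

# Hypothetical log concentrations at one site and the matching spatial averages.
site_vals = [1.0, 2.0, 3.0, 4.0]
avg_vals = [1.1, 1.9, 3.2, 3.8]
slope, intercept, r2, rmse = representativeness(site_vals, avg_vals)
```

A slope near 1 with a high R2 and a small RMSE would indicate that the site tracks the spatial average closely.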

Fig. 4 The relationship between logarithms of bacterial concentrations at location SS and average values of logarithms of bacterial concentrations across all locations

The time series of Spearman rank correlation coefficients between two consecutive observation dates are shown in Fig. 5. These coefficients for E. coli increased during the warm season and decreased during the cold season. Enterococci showed the opposite trend. For E. coli, very strong correlation (rs ≥ 0.9) was observed in 30% of cases, strong or very strong correlation (rs ≥ 0.7) in 65% of cases, and moderate to very strong correlation (rs ≥ 0.4) in 80% of all cases. For Enterococci, the percentages of very strong, strong or very strong, and moderate to very strong correlation were 25%, 51%, and 71%, respectively. The number of occasions when Spearman rank correlation coefficients were less than 0.5 was larger in the warm season than in the cold season.
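The rank correlation between consecutive dates and the shares of cases above a given strength can be sketched as follows, using the rank-difference formula from the methods. The sketch assumes no tied values within a date, since that formula does not handle ties. Function names and the example matrix are hypothetical.

```python
import numpy as np

def ranks(values):
    """Ranks 1..n of a 1-D array, assuming no ties."""
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r

def consecutive_spearman(x):
    """Spearman coefficients between each pair of consecutive columns of x (locations x times)."""
    x = np.asarray(x, dtype=float)
    n_loc, n_times = x.shape
    denom = (n_loc - 1) * n_loc * (n_loc + 1)
    out = []
    for j in range(n_times - 1):
        d = ranks(x[:, j]) - ranks(x[:, j + 1])
        out.append(1.0 - 6.0 * np.sum(d ** 2) / denom)
    return np.array(out)

def share_at_least(rs, threshold):
    """Fraction of coefficients at or above the threshold."""
    return float(np.mean(np.asarray(rs) >= threshold))

# Five locations, three observation dates (hypothetical values).
x = np.array([[1.0, 1.0, 5.0],
              [2.0, 2.0, 4.0],
              [3.0, 3.0, 3.0],
              [4.0, 5.0, 2.0],
              [5.0, 4.0, 1.0]])
rs = consecutive_spearman(x)
strong_share = share_at_least(rs, 0.7)
```

In this example the ranks are nearly preserved between the first two dates and reversed between the last two, so one of the two coefficients counts as strong.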

Fig. 5 Time series of Spearman rank correlation coefficients for E. coli and Enterococci concentrations during warm (May–October) and cold (November–April) seasons. The solid lines and dashed lines are average values of the Spearman rank correlation coefficient for E. coli and Enterococci concentrations

Discussion

FIB concentrations were higher in the warm season than in the cold season. Similar seasonality was reported in other studies (Choi and Seo 2018; Kostyla et al. 2015), including the earlier study for the Little Cove Creek in southern Pennsylvania (Kim et al. 2010). The seasonal differences may be related to differences in FIB input from land, in the availability of nutrients, in the activity of predators, or in growth rates (Youlton et al. 2016; Nguyen et al. 2016). Paul (2014) confirmed that temperature was a major controlling factor for growth of FIB in soil. Runoff from pastures brings less FIB and fewer nutrients in the cold period than in the warm one. Optimal temperatures for protozoan predator activity are much lower than those for E. coli (McCambridge and McMeekin 1980). Low temperatures may reduce pathogen accumulation (Walters et al. 2014). Seasonal differences in dilution can also be a factor, and the differences could be attributed, at least partly, to the seasonal hydrology of small streams, which can experience low and even stagnant flows that affect microbial ecology in the water courses (Wilkes et al. 2009).

FIB concentrations at the uppermost station, where the creek passes primarily through the forested drainage basin, were consistently lower than at the other stations. Kang et al. (2010) monitored FIB concentrations at 50 monitoring sites distributed along the Yeongsan River (126 km) in Korea and reported that forest land use generated little E. coli and Enterococci. The average annual logarithm of concentrations increased as the proportion of agricultural and urban land use increased (Fig. 1 and Table 2). In the urbanized area, FIB concentrations at SD were lower than at SG. There is a tributary stream between SD and SG that could be responsible for the dilution. Also, the proportion of urban relative to agricultural land use increased in an upstream direction along the creek from SD to SG.

The amplitude of the logarithms of E. coli and Enterococci concentrations followed a decreasing trend from location I81 to SD (see values of B in Table 2). A possible reason is the relatively smaller seasonal variation of E. coli influx in an urbanized area, as the latter is related to the presence of leaking septic tanks. Seasonal differences in animal activity and access to streams can also be a factor affecting the amplitudes of E. coli and Enterococci concentrations in stream water.

The average annual logarithms of Enterococci concentrations were higher than those of E. coli concentrations at all locations. This agrees with previous studies in southern California, USA (Tiefenthaler et al. 2009) and in the Benta river basin, Bangladesh (Islam et al. 2017). The amplitudes of Enterococci concentrations were also higher than those of E. coli concentrations at all locations (values of B in Table 2). A possible reason is that the survival rate of Enterococci was higher than that of E. coli in fresh water as well as in sediments. Liu et al. (2006) indicated that Enterococci survived longer than E. coli in Lake Michigan. Haller et al. (2009) observed longer survival of Enterococci compared with E. coli in sediments of a freshwater lake in Switzerland.

FIB concentrations had temporally stable spatial patterns. Water after passing through agricultural and urban areas had FIB concentrations that tended to be higher than average across all locations. There evidently exist substantial sources of FIB along the creek that support these temporally stable patterns in baseflow conditions in the absence of runoff. Populations of FIB in sediment may be one such source. Recent modeling and direct measurements showed that the flux of E. coli from the bottom sediment to the water column in streams in the region of study can be substantial enough to increase concentrations in water during baseflow periods despite dilution along the streams caused by groundwater inflow (Park et al. 2017; Pachepsky et al. 2017; Stocker et al. 2016). Direct measurements of the FIB influx to the water column in the first-order creek in Maryland resulted in values of 40–60 CFU m−2 s−1 for E. coli and 40–80 CFU m−2 s−1 for Enterococci. In the absence of dilution, such fluxes could change concentrations at the I81 location by 86 CFU/100 mL, which is comparable with the maximum concentration increase of 123 CFU/100 mL in the warm period (supplemental material 1). FIB population growth in the water column could be another reason for the FIB concentration increase along the observation stream reach in baseflow periods. The likelihood of such growth is debatable. The growth of E. coli and Enterococci in temperate climate streams was not documented in existing literature (Blaustein et al. 2013). On the other hand, Ishii and Sadowsky (2008) noted the ability of E. coli to grow in soil, sand, and sediment in temperate climates. The SWAT modeling of E. coli concentrations in four creeks in different climatic conditions was substantially improved when the growth in water was permitted (Cho et al. 2016). Additional factors of the concentration increase during baseflow can be wildlife contribution (Parajuli 2007; Guber et al. 2016) and urban wastewater release (Stallard et al. 2019). More research needs to be done to separate and quantify the effect of individual factors on the FIB population changes in creeks under baseflow.

The MRDs were significantly different in warm and cold seasons at all locations for Enterococci and at all locations except I81 for E. coli (Fig. 3c). Absolute values of MRD were larger in the cold than in the warm period. Urbanized and wildlife-only sites differed from each other more in the cold than in the warm period. That may be attributed to differences in sources of FIB at these two types of sites or to the presence of season-independent sources in the urbanized area.

No statistical difference was found between MRD values for each of the three consecutive observation years (Fig. 3b). One year appeared to be a plausible duration for the discovery and quantification of the temporal stability pattern. Literature on temporal stability discusses the value of short intensive campaigns to develop the temporal stability pattern (e.g., Huang et al. 2018). The dataset generated in the present study can also be used in further studies of the effect of sampling frequency on the temporal stability patterns.

Site SS was close to the average logarithm concentration of FIB along the observation reach. Since the average logarithm of concentration is equal to the logarithm of the geometric mean concentration, this site could be used to estimate the geometric mean of concentrations across the observation reach of the creek (Fig. 4). The I81 site had similar properties. These two sites can be used to represent the geometric mean across the observation reach and to characterize the microbial water quality of the entire observation reach. Measures to improve microbial water quality across the drainage area of the observation reach should manifest themselves in fecal indicator concentrations found at I81 and SS locations.

Spearman rank correlation coefficients demonstrated mostly moderate to very strong relationships between consecutive distributions of the baseflow concentrations along the creek. Weak or negative rank correlations were observed in 20% of cases for E. coli and in 28% of cases for Enterococci. No single obvious reason for the small or negative correlations was found. Spatial variability in rainfall or differences in snowmelt rates across the watershed could cause microbial loads of different intensity. Natural variability might affect ranks in cases when the concentrations were not very different between locations.

The MRD values provide information about a single dominant spatial pattern in concentration. Studies of other environmental variables showed that more than one temporally stable spatial pattern may manifest itself in the spatiotemporal variations of those variables (Vereecken et al. 2016; Pachepsky and Hill 2017). Uncovering the existence of several spatial patterns in the spatiotemporal dynamics of microbial indicators can be an interesting avenue to explore.

Conclusions

Three years of intensive monitoring of the microbial water quality in five locations along the 25-km reach of the Conococheague Creek in Southern Pennsylvania provided a large dataset on spatiotemporal dynamics of FIB concentrations. Logarithms of FIB concentrations had the sine wave-like dependences on time and correlated well with temperature at all observation locations. The spatial pattern of variation of FIB concentrations was preserved over time. Two locations in the urbanized area had logarithms of FIB concentrations mostly larger than average across the observation reach, and one location in a forested area had these logarithms mostly smaller than average. Two locations in agricultural and sparsely urbanized area had logarithms of FIB concentrations close to the average. The temporal stability of FIB concentrations was more pronounced in cold periods of the observation years as compared with warm periods. No significant difference was found among separately derived temporal stability patterns for each of three years’ observations. Spearman rank correlations between observations in consecutive dates showed mostly moderate to very strong relationships. Two sampling locations in the study could inform about the geometric means of FIB concentrations across the whole observation reach. Identification of stable spatial patterns can be a useful component of the microbial water quality monitoring design and implementation.