1 Introduction

Preventing excessive phosphorus (P) loss from soils to aquatic ecosystems benefits society in many ways. High P concentrations in stream and lake water are a major driver of cultural eutrophication (Haygarth and Jarvis 1999; Correll 1998) and loss of soil P fertility disrupts agricultural ecosystems (Quinton et al. 2010). Identifying anthropogenic increases in streamwater P concentrations requires clear documentation of natural background P concentrations where human activity is minimal. In many regions, forested catchments represent the best available background conditions, even though they contribute some P to downstream ecosystems (e.g., Mattsson et al. 2003). Understanding P movement from forested ecosystems is thus important for watershed management of P in many regions and for predicting the effects of land use change on watersheds.

In temperate, forested ecosystems, P exports from soils to streams increase when rainfall creates hydrologic connections between soils and streams (Sharpley et al. 2003; Rodríguez-Blanco et al. 2013). During rainstorms, P is mobilized across multiple hydrologic pathways—overland flow, interflow, and groundwater flow—with the flux through each pathway determined by the water flux and the P concentration from the original source. Generally, P sources to streams during storm events are near the soil surface, and that P is transported via erosion driven by some form of overland flow (Sharpley 1985). Thus, the generation of erosive hydrologic events is a focus for predicting P transport.

Rainfall intensity and duration affect the amount of event-related water in nearby streams, with “event water” from high-intensity storms capable of producing about 50% or more of peak flow (Brown et al. 1999; Sklash et al. 1976). The amount of event water, measurable with conservative hydrologic tracers, can be a strong indicator of overland flows (Holko et al. 2011). Moderate intensity rainfall, on the other hand, even in nearly saturated soils, leads to streamflows dominated by groundwater (Sklash and Farvolden 1979), which tend to have lower P fluxes associated with them due to the P sorption onto soil surfaces (Holman et al. 2008).

Thus, characteristics of rainfall events affect the flux of different sources and chemical species of P to local streams, and these fluxes, in turn, affect loading to downstream wetlands, rivers, and lakes. Precipitation intensity and duration, interacting with different antecedent soil moistures, activate different hydrologic pathways, which alters the mix of groundwater, soil water, and overland flow contributions to water and P transport to streams (Haygarth et al. 1999). Over decadal and longer timescales, climatic factors also influence P fluxes from watersheds especially as they affect the length of the growing seasons and erosion rates (Jeppesen et al. 2009; Foster et al. 2008). P concentration-discharge relationships can help to identify sources of P in streamwater during storm events and provide information about P transport (McDiffett et al. 1989; Lloyd et al. 2016). In particular, the hysteresis of the concentration-discharge relationship can indicate probable sources of P species in streams (Bowes et al. 2005), with the size and magnitude of the total P and discharge hysteresis loop increasing as more sources of P enter the system (Mellander et al. 2015).

This study explores the hydrologic transport of P from a forested catchment during three summer storms. We tested two hypotheses:

  1. Stormflow P concentrations were higher than non-stormflow conditions in this forested catchment.

  2. Streamwater tracing with stable water isotopes improved understanding of streamwater P transport during storms.

To test these, we characterized the temporal variability of streamwater P concentrations and hydrologic conditions during the storms in an attempt to understand how hydrologic flow pathways impact the transport of different P species from a forested catchment.

2 Methods

2.1 Site Description

We studied Livermore Cove Brook (LCB), a tributary to Squam Lake (Fig. 1a), New Hampshire’s second largest lake, with an outflow to Little Squam Lake and ultimately the Pemigewasset River. Squam Lake has 34 inflowing tributaries, of which LCB is one. The LCB sampling site (43.75° N, 71.55° W) was 25 m upstream from a road crossing (Fig. 1b). The LCB watershed represents approximately 1% (1.77 km2) of the Squam Lakes watershed, and is 91.7% forest (60.9% deciduous, 29.6% evergreen, 1.2% mixed tree types), 3.4% wetland, 2.1% developed open space, 2.5% pasture land, and 0.4% shrub land. The elevation of the LCB watershed ranges from 183 to 456 m, with a mean slope of 17.4%. The watershed is underlain by granitic bedrock overlain by glacial drift of varying thickness, with spodic soils that are dominantly fine sandy loam of the Marlow and Peru series. The land use is dominated by second growth, mixed hardwood forest. Climatology at our site is well represented by the Hubbard Brook Experimental Forest, about 24 km northwest of Squam Lake. The region experiences 1326 mm mean annual precipitation that is evenly distributed over the year, with 493 mm estimated annual evapotranspiration and 6 °C mean annual air temperature (Bailey et al. 2003).

Fig. 1 Squam Lake showing location of Livermore Cove Brook watershed (a) and Livermore Cove Brook watershed with its land cover types (b)

2.2 Event Sampling

The stream response to three rain events was sampled in the summer of 2016. Precipitation data came from Plymouth, NH municipal airport (NOAA site K1P1), approximately 16 km west of the study area. For each event, we collected streamwater samples across 48 h at 1-h intervals, initiated a few hours before the expected onset of precipitation and extending 12 to 24 h after cessation. Samples were collected using an ISCO sampler fitted with 24 1-L pre-cleaned LDPE bottles acid-washed with 10% HCl and rinsed three times with deionized (DI) water.

Sampling of event one began on 6/5/2016 at 8:30 AM (local time) and ended on 6/7/2016 at 10:42 AM. Event two was sampled from 7/9/2016 at 10:00 AM to 7/11/2016 at 9:10 AM. Event three was sampled from 8/12/2016 at 4:00 PM to 8/14/2016 at 7:20 PM. There was continuous rain for 12 h during event one and discontinuous rain for 39 and 34 h during events two and three, respectively.

2.3 Laboratory Analyses

2.3.1 Physical Properties

Immediately after each event, the water samples were split, with 250 mL set aside for later P analysis, and the remainder analyzed for the physical properties: specific electrical conductivity (SC, μS cm−1), turbidity (NTU), and total suspended solids (TSS, mg L−1). Total suspended solids were measured at a lower frequency than turbidity. SC was measured using an Accumet Basic AB30 conductivity meter calibrated with a conductivity standard solution of 1000 μS cm−1 and deionized (DI) water (< 1 μS cm−1). Turbidity was measured using a HACH 2100 turbidimeter calibrated using a standard curve from 0 to 1000 NTU. For TSS analysis, Whatman GF/F filters (0.7 μm nominal pore size) were rinsed by vacuum filtering 250 mL DI water, dried for 2 h at 110 °C, cooled for 5 min in a desiccator, and weighed on an analytical balance. The untreated streamwater sample was shaken and 250 mL of it was filtered. These filters were dried for 2 h at 110 °C, cooled in a desiccator, and weighed. TSS was calculated as (weightfinal − weightinitial) / volume of sample.
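As a minimal sketch of the TSS calculation described above, the following Python function converts the two filter weights and the filtered volume into mg L−1. The function name, argument names, and the example weights are hypothetical and chosen only for illustration; the unit choices (grams for weights, millilitres for volume) are assumptions, since the text does not state the weighing units.

```python
def tss_mg_per_l(filter_initial_g, filter_final_g, volume_ml):
    """Total suspended solids (mg/L) from filter weights (g) and filtered volume (mL)."""
    if volume_ml <= 0:
        raise ValueError("filtered volume must be positive")
    retained_mg = (filter_final_g - filter_initial_g) * 1000.0  # g -> mg
    volume_l = volume_ml / 1000.0                                # mL -> L
    return retained_mg / volume_l


# Hypothetical example: 5 mg of sediment retained from a 250 mL sample
example_tss = tss_mg_per_l(0.1000, 0.1050, 250.0)
print(round(example_tss, 1))
```

With these illustrative inputs the retained mass is 5 mg in 0.25 L, which corresponds to 20 mg L−1.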

Analysis of the stable isotopic composition of water, deuterium (δ2H, ‰), utilized a Los Gatos cavity ring-down spectrometer (Green et al. 2015). The term “new water” distinguishes event-related water in the stream (Sklash and Farvolden 1979; Hooper and Shoemaker 1986). This study uses δ2H values of each sample to calculate the percent of new water (%NW) that occurs in the stream as the result of the rain event. Since δ2H is a conservative tracer, we follow Pinder and Jones (1969) in separating storm hydrographs using different chemical tracers:

$$ X = 100\left(\frac{C_t - C_0}{C_n - C_0}\right) $$
(1)

where X = % new water, Ct = isotopic composition of a stream sample, C0 = isotopic composition of a stream sample collected during pre-rain flows, and Cn = isotopic composition of rain water. Rain water was collected throughout each event in Plymouth, NH, and combined in a uniform mixture before δ2H analysis. The δ2H of the rain in each event was used as Cn. The value of C0 was taken as the δ2H of the first sample collected for each event, during the low-flow period before rain.
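The two-component separation in Eq. 1 can be sketched in a few lines of Python. This is an illustration rather than the authors' processing script: the function and argument names are hypothetical, and the input values are the event one δ2H values reported in Section 3.1.

```python
def percent_new_water(c_t, c_0, c_n):
    """Percent of streamflow that is event ("new") water, following Eq. 1.

    c_t: tracer value of the stream sample
    c_0: tracer value of pre-event stream water
    c_n: tracer value of rain water
    """
    if c_n == c_0:
        raise ValueError("rain and pre-event tracer values must differ")
    return 100.0 * (c_t - c_0) / (c_n - c_0)


# Event one: peak stream delta-2H, pre-event stream delta-2H, rain delta-2H (per mil)
peak_nw = percent_new_water(c_t=-41.8, c_0=-57.1, c_n=-40.6)
print(round(peak_nw, 1))
```

With the rounded values from the text this gives roughly 92.7%, which agrees with the reported peak of 92.6% to within the rounding of the inputs. The separation is only meaningful when the rain and pre-event signatures differ clearly, which is why the function rejects equal end members.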

2.3.2 Phosphorus Analyses

Within 8 h of collection, 250 mL subsamples were acidified with 1.5 mL of 50% H2SO4 and stored at 4 °C. To prevent P contamination, all containers coming into contact with the samples were rinsed three times with DI water, soaked in 10% HCl for 24 h, then rinsed three more times with DI water and filled with DI water between uses. Within 48 h of collection, all the water samples from event one and 25 samples each from events two and three were analyzed for soluble reactive phosphorus (SRP), primarily dissolved inorganic phosphorus. SRP was measured spectrophotometrically, reacting first with molybdate in the presence of antimony to form an antimony phosphomolybdate complex, followed by reduction to molybdenum blue with ascorbic acid (APHA 1998). Sample aliquots of 50 mL each were filtered through Whatman GF/F filters, combined with 8 mL of the reagent complex and mixed. After at least 10 min but no later than 30 min, absorbance was measured at 880 nm using a Spectronic 20 spectrophotometer (Bausch and Lomb). SRP concentration (μg L−1) was calculated using a standard curve (r2 ranging from 0.999 to 0.996).

All the samples from events one and three, and 25 samples from event two, were analyzed for total phosphorus (TP) and total dissolved phosphorus (TDP). To measure TP, all the P in unfiltered water samples was converted into orthophosphate as follows (Menzel and Corwin 1965). To each 50 mL of sample, 1 mL of potassium persulfate (K2S2O8) and 1 mL of 11 N H2SO4 were added and mixed, and the sample was then autoclaved for 30 min. After cooling, samples were neutralized with 11 N NaOH using phenolphthalein indicator and mixed with 8 mL of combined reagent, and TP concentration (μg L−1) was measured on the spectrophotometer (as with the SRP method).

For TDP, we filtered water samples using Whatman GF/F filters, then digested and analyzed them as for TP. Total particulate phosphorus (TPP) was calculated as TP − TDP, and dissolved organic phosphorus (DOP) as TDP − SRP. Our operational definition of dissolved for TDP, SRP, and DOP was any P passing through a 0.7-μm nominal pore size filter. The method detection limits of the SRP analysis for events one, two, and three were 4.3 μg L−1, 1.8 μg L−1, and 3.7 μg L−1, respectively.
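The two difference-based fractions can be computed as in the short Python sketch below. The function name and the example concentrations are hypothetical. Clamping negative differences to zero is an assumption made here for illustration (small negative values can arise from analytical noise); the text does not say how such cases were handled.

```python
def p_fractions(tp, tdp, srp):
    """Return (TPP, DOP) from TP, TDP, and SRP, all in the same units (e.g. ug/L).

    TPP = TP - TDP and DOP = TDP - SRP.
    Negative differences are clamped to zero (an assumption, see lead-in).
    """
    tpp = max(tp - tdp, 0.0)
    dop = max(tdp - srp, 0.0)
    return tpp, dop


# Hypothetical sample: TP = 100, TDP = 60, SRP = 10 ug/L
tpp, dop = p_fractions(100.0, 60.0, 10.0)
print(tpp, dop)
```

By construction TPP + DOP + SRP equals TP whenever no clamping occurs, so the three fractions can be reported as shares of the TP pool.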

2.4 Data Analysis

We estimated unit discharge (UD) at Livermore Cove Brook using measurements taken in 1999, 2000, and 2016 (n = 31) and comparing them to those of Watershed 3 at the Hubbard Brook Experimental Forest (Bailey et al. 2003). We leveraged the discharge monitoring at Hubbard Brook because our study was too short a duration to justify establishing a full rating curve for Livermore Cove Brook. Watershed 3 is similar in size (0.4 km2) to Livermore Cove Brook, reasonably close (20 km), and has similar physiography. The main hydrologic event differences between the two sites were from spatially variable precipitation caused by local convection. The three storms we studied were stratiform, and so we assumed the precipitation amount and timing were regionally similar. The resulting relationship was log10 (Livermore UD) = 0.72 × log10 (Hubbard Brook UD) − 0.34 (r2 = 0.61). The total runoff and the ratio of runoff to total precipitation (runoff ratio) for each storm were calculated.
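A minimal Python sketch of the log-log transfer relationship and the runoff ratio follows. The coefficients are those reported above; the function names are hypothetical, and the code assumes that both unit discharges are expressed in the same units (mm day−1 is used elsewhere in the paper, although the units of the regression are not stated explicitly).

```python
import math


def livermore_ud(hubbard_ud):
    """Estimate Livermore Cove Brook unit discharge from Hubbard Brook Watershed 3.

    Implements log10(Livermore UD) = 0.72 * log10(Hubbard Brook UD) - 0.34.
    Units are assumed to be mm/day for both sites.
    """
    if hubbard_ud <= 0:
        raise ValueError("unit discharge must be positive for a log-log model")
    return 10.0 ** (0.72 * math.log10(hubbard_ud) - 0.34)


def runoff_ratio(total_runoff_mm, total_precip_mm):
    """Ratio of event runoff depth to event precipitation depth."""
    if total_precip_mm <= 0:
        raise ValueError("precipitation must be positive")
    return total_runoff_mm / total_precip_mm


# Hypothetical input: a Hubbard Brook unit discharge of 1.0 mm/day
print(round(livermore_ud(1.0), 3))
```

Because r2 was 0.61, estimates from this relationship carry considerable uncertainty, and any flux derived from them inherits it.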

We quantified the relationship between P concentrations and independent stream variables using the Spearman rank correlation. We used this non-parametric approach to account for non-linear relationships. The P concentrations included TP, SRP, DOP, and TPP. The independent variables included UD, %NW, SC, turbidity, and TSS. These independent variables were chosen either because they provide information on hydrologic transport processes (UD and %NW) or they are commonly included in stream monitoring protocols (SC and turbidity).
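To illustrate why a rank-based statistic suits non-linear but monotonic relationships, the sketch below computes Spearman's ρ in plain Python as the Pearson correlation of average ranks. It is an illustrative implementation with hypothetical function names and made-up data, not the software used in the study, which the text does not name.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Made-up example: a quadratic (non-linear, monotonic) relationship
print(round(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]), 3))
```

A perfectly monotonic relationship yields ρ = 1 regardless of curvature, which is the property that motivates the non-parametric choice here.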

Hourly TP flux (kg P h−1) was calculated by converting UD to stream discharge (m3 h−1) and then multiplying by the corresponding TP concentration. The two-day event TP flux was calculated by summing the hourly flux and reported as kg P event−1. The TP yield of each event (kg P ha−1 event−1) was calculated as event TP flux divided by watershed area.
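The unit conversions behind the flux and yield calculation can be sketched as follows. This assumes UD in mm day−1, TP in μg L−1, and the 1.77 km2 (177 ha) watershed area given in Section 2.1; the function names and the two-sample series are hypothetical.

```python
AREA_M2 = 1.77e6   # watershed area, 1.77 km2
AREA_HA = 177.0    # the same area in hectares


def hourly_tp_flux_kg(ud_mm_day, tp_ug_l):
    """TP flux for one hour (kg P) from unit discharge (mm/day) and TP (ug/L)."""
    discharge_m3_h = ud_mm_day / 1000.0 * AREA_M2 / 24.0  # depth -> volume per hour
    # 1 ug/L equals 1 mg/m3, and 1 mg equals 1e-6 kg
    return discharge_m3_h * tp_ug_l * 1e-6


def event_flux_and_yield(ud_series, tp_series):
    """Sum hourly fluxes (kg P per event) and divide by area (kg P/ha per event)."""
    flux = sum(hourly_tp_flux_kg(u, c) for u, c in zip(ud_series, tp_series))
    return flux, flux / AREA_HA


# Hypothetical two-hour series, chosen only to make the arithmetic easy to follow
flux, yield_per_ha = event_flux_and_yield([24.0, 24.0], [1000.0, 500.0])
print(round(flux, 3), round(yield_per_ha, 4))
```

In this toy case a UD of 24 mm day−1 corresponds to 1770 m3 h−1, so the two hours contribute 1.77 and 0.885 kg P, respectively.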

3 Results

3.1 Event One

During event one, there were 12 h of continuous rain with total precipitation of 41.9 mm (Table 1). Precipitation increased slowly, peaked during the 10th hour (12.2 mm h−1), and then quickly abated (Fig. 2a). Mean precipitation rate for the event was 3.5 mm h−1. The short but relatively high rates of precipitation resulted in sharp deviations from pre-event baselines for all measured streamwater variables.

Table 1 Summary of water chemistry and hydrology of events 1, 2, and 3 for Livermore Cove Brook, where n is the number of samples used for analysis
Fig. 2 Time series of unit discharge (UD) and percent of new water (%NW) for a event one, b event two, and c event three. Hanging bars are hourly precipitation, solid vertical lines show interval of maximum %NW, and dotted vertical lines indicate interval of maximum precipitation

UD increased rapidly from 0.3 to 17.8 mm day−1, peaked concurrently with precipitation, and then slowly declined (Fig. 2a). At the end of the sampling period, UD had not returned to the pre-event baseline. The runoff ratio was 0.19 for event one (Table 2). The δ2H ranged from − 57.1‰ prior to event to a peak of − 41.8‰. The δ2H of rain water in event one was − 40.6‰. From this, we calculated that the %NW in the stream increased from 0.5 to 92.6% at peak flow and then declined to pre-event levels within 12 h (Fig. 2a). Peak %NW lagged both peak precipitation and UD by 2 h. The %NW values returned to the pre-event baseline during the sampling period but UD did not.

Table 2 General characteristics of rainfall, unit discharge, ratio of streamflow to rain, TP flux, and TP yield during events one, two, and three

SC decreased from a pre-event value of 46.1 μS cm−1 to 28.4 μS cm−1 during the event and rebounded to 43.5 within 13 h after the event (Fig. 3a). SC decreased as %NW in the stream increased. Turbidity across the event ranged from 0.55 to 127 NTU. It increased greatly during the event to its maximum value and quickly returned to baseline within 7 h (Fig. 3a). Turbidity increased as %NW in the stream increased. TSS values ranged from 0 to 474.4 mg L−1 (n = 9).

Fig. 3 Time series of specific conductivity (SC) and turbidity (Turb) for a event one, b event two, and c event three. Hanging bars are hourly precipitation, solid vertical lines show interval of maximum %NW, and dotted vertical lines indicate interval of maximum precipitation

Concentration for all forms of P increased greatly with UD and %NW (Fig. 4a). TP, DOP, and TPP peaked 1 h prior to maximum precipitation and 3 h before peak of %NW while peak SRP concentration lagged others by 2 h (Fig. 4a). TP ranged from baseline of 14.6 to 453.7 μg L−1. The DOP ranged from 1.1 to 249 μg L−1, and SRP ranged from 0.1 μg L−1 at baseflow to 18.9 μg L−1 (Fig. 4a). The concentration of TPP increased with UD, ranging from 0.0 to 190.3 μg L−1 at the peak.

Fig. 4 Time series of total phosphorus (TP), dissolved organic phosphorus (DOP), total particulate phosphorus (TPP), and soluble reactive phosphorus (SRP) for a event one, b event two, and c event three. Hanging bars are hourly precipitation, solid vertical lines show interval of maximum %NW, and dotted vertical lines indicate interval of maximum precipitation

During baseflow, almost all the TP was in the form of DOP (mean = 78%). As the event progressed, the contribution of TPP to the TP pool increased in the stream to 46%. This change is indicated by the TPP:TP ratio (Fig. 5a). The contribution of SRP to TP pool increased from 5% during baseflow to 24.6% during peak stormflow.

Fig. 5 Time series of the ratio of TPP:TP, DOP:TP, and SRP:TP for a event one, b event two, and c event three. Hanging bars are hourly precipitation, solid vertical lines show interval of maximum %NW, and dotted vertical lines indicate interval of maximum precipitation

3.2 Event Two

During event two, there were 39 h of low-intensity rain with a total precipitation of 19.6 mm and mean precipitation rate of 0.50 mm h−1 (Fig. 2b). There were trace amounts of rain 8 h earlier than the sampling period. The intensity of rain was highest during the 11th hour of sampling (7.1 mm h−1). This long and discontinuous precipitation resulted in more variability in all the measured variables.

During this event, the UD in the stream reached 1.1 mm day−1 from the 14th to 22nd hours of sampling; baseflow was 0.4 mm day−1 (Fig. 2b). Peak UD lagged peak rain by 3 h and peak %NW by 2 h; UD then declined moderately. The runoff ratio was 0.08 for event two (Table 2). The δ2H in streamwater ranged from − 62.4 to − 50.4‰. The streamwater δ2H prior to the event was − 59.0‰ while the δ2H of rain water of the event was − 39.9‰. The %NW in the stream peaked at 45% at the 12th hour, which was 2 h before UD peaked and an hour after maximum precipitation (Fig. 2b). During this event, %NW returned to its baseline value at the end of sampling but UD did not.

SC decreased from a pre-event value of 46.9 μS cm−1 to 33.9 μS cm−1 during the event and did not rebound to its pre-event condition even after 34 h (Fig. 3b). Turbidity ranged from 2.6 to 25.2 NTU, peaked 2 h after max %NW, and returned to its baseline value within 9 h (Fig. 3b). Across this event, TSS ranged from 2.4 to 13.3 mg L−1 (n = 6).

All P species responded to precipitation in a similar manner as event one but with less intensity. TP ranged from a baseline of 15.5 to 45 μg L−1 at peak. TP returned to its baseline value within 10 h after the event (Fig. 4b). DOP ranged from 3.0 to 24.0 μg L−1 at peak, and SRP ranged from 0.9 μg L−1 at baseflow to 12.9 μg L−1 at the peak (Fig. 4b). TPP ranged from baseline of 1.0 to 14.2 μg L−1 at peak. TP, SRP, and TPP peaked together at the 13th hour of the sampling period, an hour after the maximum %NW (Fig. 4b). DOP peaked 2 h after max %NW (Fig. 4b). SRP returned to baseline within 7 h (Fig. 4b).

As in event one, during baseflow, most of the TP was in the form of DOP (mean = 73%). As the storm event developed, the contribution of TPP to the TP pool in the stream increased to 42%, shown by the TPP:TP ratio (Fig. 5b). The contribution of SRP to the TP pool also increased from 5% during baseflow to 36% during peak stormflow.

3.3 Event Three

Event three was composed of 34 h of discontinuous rain with total precipitation of 39.9 mm and an average precipitation rate of 1.2 mm h−1 (Fig. 2c). The precipitation peaked twice, during the 5th and 33rd hours of sampling. The discontinuous precipitation affected all the measured variables.

During event three, the UD in the stream ranged from 1.1 to 3.2 mm day−1 (Fig. 2c), peaked in the 5th hour, declined continuously for 15 h, then peaked again for the 33rd and 34th hours. After the event, UD declined slowly to its base value. The runoff ratio for event three was 0.1 (Table 2). The δ2H in streamwater ranged from − 56.5 to − 38.6‰. The δ2H prior to the event was − 56.5‰. The δ2H of rain water for the event was − 27.6‰. Therefore, the stream had 61.7 %NW during peak flow at the 23rd hour of the event (Fig. 2c). In this event, %NW neither peaked with UD nor with precipitation, but rather peaked between them.

SC decreased from a pre-event value of 68.3 μS cm−1 to a trough of 31.7 μS cm−1 during the event and rebounded three times during the 9th, 23rd, and 33rd hours. Contrary to other events, SC did not return to its baseline value (Fig. 3c). Turbidity ranged from 0.5 to 11.1 NTU (Fig. 3c), peaking three times during the 9th, 23rd (with %NW), and 32nd hours of the event with decreasing SC in streamwater (Fig. 3c). TSS ranged from 4.3 to 28.6 mg L−1 (n = 7).

All P species showed variability in event three. TP ranged from baseline of 13.2 to 83.7 μg L−1 at peak. The DOP ranged from baseline of 1.9 μg L−1 to 66.6 μg L−1 at peak, and SRP ranged from 2.2 μg L−1 at baseflow to 16.5 μg L−1 at the peak (Fig. 4c). TPP increased with streamflow, ranging from 0.0 to 26.5 μg L−1 at peak (Fig. 4c). TP and DOP peaked at the 9th, 23rd, and 33rd hours of the event (Fig. 4c). The mean contribution of DOP to TP (22%) was less than other events but the contribution of TPP to TP pool during the storm event was comparable at 45% (Fig. 5c). The contribution of SRP to TP pool increased from 0.02% during baseflow to 76% during peak stormflow.

3.4 Correlations of P with Other Streamwater Variables

Across the three storms, TP and SRP concentration had the highest Spearman correlation with %NW, DOP had the highest correlation with UD, and TPP had the highest correlation with turbidity (Tables 3 and 4). SC had the lowest correlation with TP and DOP; however, it had the second highest correlation with SRP. The correlation between TSS and TPP had a Spearman ρ of 0.75, and a ρ of 0.76 for turbidity. For individual events, the correlations were generally highest for the first event. Turbidity emerged as highly correlated with TP for events one and three (Spearman ρ of 0.91 and 0.83, respectively).

Table 3 Spearman rank correlations (ρ) between the streamwater variables measured in this study and the phosphorus species. The correlations are shown for all three events combined and individual events
Table 4 TP yield in different watersheds of northern New England region

3.5 TP Flux and Yield

Total TP fluxes for events one, two, and three were 1.348, 0.054, and 0.215 kg P event−1, respectively. Most of the TP flux for event one occurred during the first 24 h (95%), likely due to the short but intense rainfall. TP flux was more spread out over the following, longer-duration events, with 41.5 and 41.4% occurring in the first 24 h of events two and three, respectively. Event one yielded 0.0075 kg TP ha−1 event−1, event two 0.0003 kg TP ha−1 event−1, and event three 0.0012 kg TP ha−1 event−1. The combined events resulted in a TP yield of 0.009 kg TP ha−1 to Livermore Cove Brook.

4 Discussion

Our first hypothesis, that storm events would produce high streamwater P concentrations in this forested catchment, was supported by our data. Streamwater P concentrations responded quickly to increasing UD and %NW during storm events in Livermore Cove Brook. The peak TP concentration increased with the runoff ratio, suggesting that the precipitation amount, intensity, and antecedent moisture affected the P transport. The first event occurred as the watershed was drying from seasonal snowmelt, had the most punctuated rainfall, and produced the highest peak TP concentration. Event two was a small episode occurring on dry soils and produced little hydrologic response and a low peak TP concentration. Event three was a multi-day, low-intensity precipitation event producing an intermediate peak TP concentration (Fig. 4c). The hydrological response of Livermore Cove Brook to storms acted to mobilize P, in agreement with many other studies (Sharpley and Syers 1979; McDowell et al. 2001; Rodríguez-Blanco et al. 2009; Rodríguez-Blanco et al. 2013). Like Gentry et al. (2007), we found that all P species increased with stream discharge.

Although Livermore Cove Brook is a relatively undeveloped, forested watershed, it still produced a very high TP concentration of 453.7 μg L−1 during peak flow in event one. Although events two and three had lower peak concentrations, they were still high enough to classify the stream as mesotrophic (Dodds et al. 1998). The high concentrations were unexpected, given the high forest cover in this watershed. Most of the TP was introduced in the form of DOP in all three events, fluctuating from 95% in baseflow to 50% in event flow (Fig. 5), suggesting a non-erosion source of TP as further discussed below. These unexpectedly high values raise new questions about background concentrations, their sources, and transport in forested areas of New England.

Our data from event one provide an opportunity to explore concentration-discharge relationships and what they indicate about P transport dynamics. During precipitation events, UD and P concentrations were not linearly related. P concentrations at a given UD were much higher during the rising limb of discharge than during the falling limb (Fig. 6). For this reason, TP, TPP, and DOP showed clockwise hysteresis loops with both UD (Fig. 6) and %NW (Fig. 7). TPP and DOP responded quickly to the streamflow and peaked before the peak of UD and %NW, suggesting that they were mobilized quickly and their sources were near the stream. Woody wetlands adjacent to the stream and the stream banks are the most probable sources of organic P to the stream, which supports similar findings of Dupas et al. (2015) and Rodríguez-Blanco et al. (2009). Riparian forests are important for reducing P loading in mixed land use watersheds (Lowrance et al. 1997), but in our case, it seems that these riparian areas may have been a net source of P during the storms we monitored. The hysteresis loops formed by TP, TPP, and DOP were larger in magnitude than those of SRP, showing that sediment and organic matter were more mobile during event one than inorganic P. House and Warwick (1998) also found clockwise hysteresis between TP, TDP, and SRP with discharge during storm events. The hysteresis loop of SRP was smaller in magnitude, indicating that SRP was not mobilized like other P species during the event (Bowes et al. 2005).

Fig. 6 Hysteresis loops and trajectories between unit discharge (UD) and a TP, b TDP, c TPP, d SRP, and e DOP of event one

Fig. 7 Hysteresis loop and trajectories between % of new water (%NW) and a TP, b TDP, c TPP, d SRP, and e DOP of event one

Like other species of P, SRP also demonstrated clockwise hysteresis with both UD and %NW but those hysteresis shapes looked different from others (Figs. 6d and 7d). We believe this indicates that the SRP came from different sources. Unlike other species of P, SRP peaked 2 h later than TPP and DOP with both UD and %NW (Figs. 6 and 7) suggesting that SRP was slowly mobilized and its sources were farther away from the stream. Other studies suggest that the subsurface flow is the major source of SRP (Rodríguez-Blanco et al. 2013; Dupas et al. 2015). The hysteresis loop of SRP had a smaller magnitude indicating that SRP was not mobilized like other P species during the event (Bowes et al. 2005). A previous study by Dupas et al. (2015) showed that SRP concentrations increased when the relative contribution of deep groundwater from the upland area was low compared with wetland groundwater. SRP lagged discharge, suggesting that it could be mobilized by old water displacement. This is supported by the higher correlation of SRP with %NW than with UD, suggesting that the source water for discharge is a more important control on concentration than the total discharge.
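One simple way to classify the rotation of a concentration-discharge loop is the sign of the shoelace (signed polygon) area of the time-ordered trajectory. The Python sketch below is offered as an illustration of that idea and is not the method used in this study, where loops were interpreted from Figs. 6 and 7. The function name and the example series are hypothetical, and the approach assumes a simple loop with discharge on the horizontal axis and concentration on the vertical axis; figure-eight loops would partly cancel.

```python
def loop_direction(discharge, concentration):
    """Classify a time-ordered C-Q trajectory as clockwise or counterclockwise.

    Uses the shoelace formula: with discharge on x and concentration on y,
    a negative signed area means the loop is traversed clockwise, that is,
    concentrations are higher on the rising limb than on the falling limb.
    """
    points = list(zip(discharge, concentration))
    if len(points) < 3:
        raise ValueError("need at least three samples to form a loop")
    twice_area = 0.0
    # Pair each point with the next one, closing the loop back to the start
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        twice_area += x0 * y1 - x1 * y0
    if twice_area < 0:
        return "clockwise"
    if twice_area > 0:
        return "counterclockwise"
    return "none"


# Made-up storm: concentration is higher while discharge rises than while it falls
print(loop_direction([1, 2, 3, 2], [1, 5, 6, 2]))
```

The absolute value of the signed area could also serve as a rough measure of loop size when comparing P species, provided the axes are normalized first.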

In our study, DOP was the dominant fraction of TP during baseflow while TPP was dominant during stormflow (Fig. 5). During high event flows, TPP became a significant component, contributing up to 46% of TP (Fig. 5). Similarly, Rodríguez-Blanco et al. (2009) found that TPP was the dominant fraction of TP, which increased significantly during storms. Gentry et al. (2007) found that all P species increase with stream discharge but that TPP was dominant during overland flow runoff events. The variability of TPP export is affected by the erodibility of stream sediments and the export capacity of the stream; Rodríguez-Blanco et al. (2013) highlighted that the erosion process is significant in determining the concentration of sediment and generating TPP in the stream.

Our second hypothesis was supported by our data, showing that stable water isotopes were more correlated with P species than other independent variables we tested. Isotopic source water tracing combined with P monitoring is rare. Stable water isotopes have been used to trace P transported through macropores in agricultural fields (Williams et al. 2016) and identify sources of P during baseflow in urban settings (Janke et al. 2014). We are not aware of previous isotopic tracing of water sources combined with P monitoring in a forested setting. Our results suggest that broader use of isotopic tracing with P monitoring may improve predictions of P transport and sources for watershed management.

Data analysis highlighted other strong correlates with P concentrations, and suggested that the type of storm being monitored can influence the strength of P proxies. Turbidity emerged as another strong correlate with P concentration, especially TP and TPP. The correlation between turbidity and TPP was similar to that between TSS and TPP, which is encouraging because TSS is much more labor intensive to measure than turbidity. The first event had much stronger correlations with P concentrations than the second and third events. This is probably because the precipitation event was larger and more punctuated than the other two events, creating a clearer hydrologic response. While the correlations were strongest with the first event, the correlations across all three events suggested that further monitoring would be able to produce robust proxies for streamwater P concentrations, as has been demonstrated by previous studies (e.g., Lannergård et al. 2019).

The correlation and hysteresis analysis, and difference in P transport dynamics across storms suggest new ideas about how P is transported in New England forested watersheds. We hypothesize that the sources of TPP and DOP are from sediment in ephemeral stream beds and near-stream soils. When a streamflow generation threshold is met, these sediments are mobilized, releasing particulate, interstitial, and adsorbed P. Bed sediment can store a large amount of P which can be released under certain conditions (Wang and Pant 2010). We hypothesize that the SRP contribution to the stream is from groundwater and lags peak UD due to transport via displacement of groundwater. The lack of a clear hysteresis loop between SRP and %NW (Fig. 7d) provides evidence for this because %NW is a more direct measurement of groundwater (old) contribution to streamflow.

The different storm events had dissimilar TP fluxes and yields. Event one generated the highest TP flux and yield while event two generated the lowest (Table 2). The modeled annual TP load in Livermore Cove Brook for 1999 was 39.14 kg year−1 with an estimated annual yield of 0.221 kg P ha−1 year−1 (Squam Lakes Association 2002). Assuming that the annual load is relatively consistent over time, the three measured events could contribute ~ 4.1% of the estimated annual TP load in just 6 days. The TP yield calculated for three rainfall events in Livermore Cove Brook helps inform a simple TP yield exercise for testing the contributions of storms to annual yields. Assuming a constant baseflow of 0.02 m3 s−1 and TP concentration of 0.02 mg L−1, the annual TP baseflow load to Squam Lake is 12.6 kg P year−1. Assuming 10 days in the year with precipitation events > 10 mm day−1 having event flows of 0.20 m3 s−1 and TP concentration of 0.10 mg L−1 results in annual event loading of 17.3 kg P year−1. Therefore, it is very possible that a large component of the annual TP load to Squam Lake may be the result of a few episodic storm events. It is also important to note that the bulk of the total rainfall event P load occurred within only a few hours of the event duration. Without the hourly sampling design, a large component of the actual event P load would have gone unrecorded. Clearly, nutrient loading models might be much improved with more empirical event data.
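The back-of-envelope load exercise above can be reproduced with a short Python sketch. The flows, concentrations, and day counts are those stated in the text; the function name is hypothetical, and, following the figures in the text, the baseflow load is computed over all 365 days rather than over the 355 non-event days.

```python
SECONDS_PER_DAY = 86_400


def load_kg(flow_m3_s, tp_mg_l, days):
    """TP load (kg) for a constant flow (m3/s) and concentration (mg/L) over some days."""
    tp_mg_m3 = tp_mg_l * 1000.0  # 1000 L per m3
    return flow_m3_s * tp_mg_m3 * SECONDS_PER_DAY * days * 1e-6  # mg -> kg


baseflow_load = load_kg(0.02, 0.02, 365)  # constant baseflow all year
event_load = load_kg(0.20, 0.10, 10)      # ten storm days
event_share = event_load / (baseflow_load + event_load)

print(round(baseflow_load, 1), round(event_load, 1), round(100 * event_share))
```

This reproduces the 12.6 and 17.3 kg P year−1 figures and shows that, under these assumptions, the ten storm days would account for a little over half of the combined load.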

Studies in other forested catchments in the New England region (Table 4) reported background P yields from forested basins ranging from 0.03 to 0.10 kg P ha−1 year−1 (Ahl 1988). Clark et al. (2000) determined nutrient concentrations and yield in 85 streams with undeveloped basins across the USA. They found the median basin TP concentration was 0.022 mg L−1 and median annual basin TP yield was 0.085 kg ha−1. Moore et al. (2004) estimated a TP catchment yield of less than 0.21 kg ha−1 year−1 for the Livermore Cove Brook subwatershed region using the SPARROW model based on source load from 1992 to 1993. Their value compares favorably to our estimate of 0.226 kg ha−1 year−1. Moore et al. (2004) also predicted that the forested land in the Merrimack River basin in New Hampshire contributed 26% of the total P load.

5 Conclusions

This study examined natural P yields from the forested Livermore Cove Brook subwatershed of Squam Lake, central NH, USA, and found that terrestrial runoff during larger storm events generates high P concentrations and loading to the lake. This suggests that changes in the frequency of high-precipitation events may directly influence P concentrations in lake waters, with consequences for lake clarity, algal growth, and other water quality measures. In addition, the form of P entering the lake varies because of these precipitation events. While DOP is the dominant form of P in the streamwater baseflows, during erosion and resuspension in storm events, TPP becomes increasingly important. Similarly, SRP is a minor component during baseflows, but becomes a more significant component during storm events. The inclusion of stable water isotopes in our monitoring aided the interpretation of sources and transport of P in the watershed. Correlations between %NW and the different P species suggest that source water tracing can aid watershed P management. Turbidity similarly emerged as an important correlate with streamwater P concentration, indicating that broader adoption of turbidity monitoring with P may improve P monitoring programs. Overall, our study suggests that watershed P models should do more to capture the disproportionate effects of episodic rain events on annual loading estimates, even in forested environments.