Introduction

Nutrient loading models are useful for quantifying how changes in land use, management practices, and nutrient sources will likely impact in-stream and coastal water quality. While water quality trends can be observed through periodic sampling, observations are limited in spatial and temporal extent by expense and accessibility (Smith et al. 1997; Valiela et al. 1997; Alexander et al. 2002; Borah and Bera 2004). Cutbacks to state and federal monitoring programs have led to decreases in the frequency and geographic coverage of samples (Robertson and Saad 2011). Models can be used to estimate nutrient loads at un-monitored locations, increasing the spatial and temporal resolution of nutrient loading estimates (Smith et al. 1997). They can also forecast trends in water quality under scenarios of changing land use and climate, and have been used at various scales to assess the sources of nutrients, target source areas for management, and predict the effectiveness of management strategies. Models are particularly useful to better understand the contributions of non-point sources to nutrient loads (Nikolaidis et al. 1998; Borah and Bera 2004) and can be used to understand the role of pathway and process in the delivery of nutrients which cannot be directly observed or measured.

Several nutrient load modeling approaches exist in the literature (as reviewed in Alexander et al. 2002; Borah and Bera 2004; Valiela 2002). These range from semi-distributed hydrologic models such as the Soil and Water Assessment Tool (SWAT) to simple mass balance accounting (Jaworski et al. 1992; Alexander et al. 2002; Boyer et al. 2002). A common approach used to quantify the nutrient sources for a watershed involves assuming nutrient loading rates are directly related to factors such as land use or management practices rather than directly accounting for independent source contributions. For instance, it is common to consider an “urban” source rather than the individual contributions to the urban environment, which can include several potential sources that have different spatial distributions and delivery mechanisms. Such methods generally assume that there is a linear relationship between changes in land use or management and source contribution to loading, which limits the transferability of these models (Zhang 2011; Destouni et al. 2006). The lack of direct source accounting makes it difficult to apply these models consistently in highly variable watersheds (Nikolaidis et al. 1998; Destouni et al. 2006).

With the exception of the process-based models, existing approaches are generally applied to yearly loading averages and do not consider seasonal variability in processes and pathways. Detailed descriptions of each source, independent of transport mechanisms, are necessary to describe nutrient loads across highly variable watersheds and to predict how nutrient loading will be impacted by changes in climate and land use.

The modeling approach described here separates nutrient load modeling into two distinct steps: (1) spatially- and source-explicit nutrient source modeling, developed and described in Luscz et al. (2015), and (2) spatially-explicit, temporally-variable statistical transport and fate, or “pathway and process”, modeling. This manuscript focuses on the structure, calibration, and validation of the statistical fate and transport model in Michigan’s Lower Peninsula. Seasonal models were developed to describe loading of total nitrogen (TN) and total phosphorus (TP) during baseflow and snow melt conditions. In addition to predicting nutrient loads at points along streams in the Lower Peninsula, the model can predict the amount and source of nutrients delivered to surface water, the pathways (surface or ground) along which nutrients traveled to surface water, the attenuation of nutrients, and the contribution of individual sources to stream loads.

The models were calibrated and validated using data collected between 2010 and 2012. An annual model was also developed using average annual load estimates developed for the USGS SPAtially Referenced Regressions On Watershed attributes (SPARROW) Great Lakes model (Robertson and Saad 2011), so that a direct comparison can be made between the approach described here and other regional scale nutrient loading models.

Fig. 1

Comparison of a sub-basin based approach (such as the USGS SPARROW model) and the spatially explicit approach described here

This work was motivated by the need for a nutrient model that can accurately and consistently predict nutrient loads at a high level of spatial detail across diverse watersheds. This model improves on existing nutrient loading models by enhancing the spatial detail of individual sources at a regional scale while retaining the ability to predict source contributions to the observed load. In addition, unlike models that consider a sub-watershed approach (Fig. 1), this model defines explicit and seasonally variable pathways that can be used to identify specific source areas within individual sub-watersheds.

Methods

Model domain

Models were constructed for watersheds in the Lower Peninsula (LP) of Michigan. The northern portion of the LP is mostly unmanaged land while the dominant land uses in the south are rangeland and row crop agriculture. There are also pockets of developed land that are primarily located in the south. The largest developed area is around the city of Detroit, which is in the southeast corner of the state. Michigan’s LP is covered primarily by unconsolidated glacial drift, which may be greater than 300 meters thick in the central northern section and only a few meters thick in a few regions where bedrock outcrops along coastal bluffs or incised stream valleys (Olcott 1992). These sediments are composed of relatively coarse-textured, high hydraulic conductivity glacial tills, glacial outwash sands and gravels, with some fluvially reworked sands and gravels, and locally extensive lacustrine clay deposits. Annual streamflow across much of the LP is dominated by baseflow from this generally productive unconsolidated aquifer system. For the model domain, the average proportion of flow contributed by groundwater (estimated by Wolock 2003) is 54% and ranges from 19 to 90% across the study domain.

The model represents the period from 2005 to 2012. Inputs to the model are provided at a 90 meter grid cell spacing. For this study, we developed models representing three different nutrient loadings: early spring snow melt, late summer baseflow, and the annual average. The seasonal conditions were chosen to represent two presumably different loading scenarios: very high flows during snow melt, which flush nutrients from a dormant winter landscape through both surface runoff and groundwater inputs to stream discharge, and low flows, which primarily represent groundwater transport pathways. Annual average loads provide a time-weighted average of these and other stream hydrologic conditions.

Model description

The form of the model equation considers a simplified conceptual structure in which sources applied to the landscape are subject to losses at the point of application, and attenuation along two pathways: travel within the upland portion of the basin and travel within the stream. This attenuation is described by statistically derived reduction factors that are functions of a basin’s physical features.

The model structure is similar in many ways to that of SPARROW, but it differs in several important ways. First, this model has a spatially explicit description of sources and basin factors, which includes within-basin variability and the addition of travel distance and travel time as factors. This allows for an expanded process-based description of pathways. Second, this model does not include any calibratable “source terms” that allow for additional nutrient source applications not accounted for in the input layers. Because this model assumes that all nutrient sources have been explicitly included, there is no need for any coefficients that would allow total applied nutrients to exceed inputs. While the source terms used in the model could underestimate the true source values, this would be reflected in artificially small loss and attenuation terms. Third, this model splits overland travel into two pathways, surface and subsurface, with a statistically-derived partitioning parameter. The differences between the spatially explicit model structure described here and a sub-basin based model such as SPARROW are illustrated in Fig. 1.

Fig. 2

Conceptual schematic of modeling approach

The basic input for this modeling approach is a spatially detailed description of nutrient applications to the landscape from six distinct sources: (1) atmospheric deposition, (2) septic tanks, (3) point source stream loads, (4) manure fertilizer applications, (5) chemical agricultural fertilizer, and (6) non-agricultural chemical fertilizers. Nutrient applications are defined for each cell of the model using an estimation process described in Luscz et al. (2015). Maps of these sources for both N and P are included in the supplementary material. The load at an observation point (for instance, a sampled stream location) is modeled by summing the contribution of nutrients from all cells that are upstream of the observation point, and applying the statistically-derived reduction factors, which are functions of that cell’s position in the watershed. The conceptual model underlying the model equation is shown in Fig. 2, and the general functional form of the model is given in Eq. (1).

$$\begin{aligned} L=\sum _{i}^{sources}\sum _{j}^{cells}Ext_{i}S_{ij} \left[ Fgrd_{j}Bgrd_{ij}+\left( 1-Fgrd_{j}\right) Bsurf_{j}\right] R_{j} \end{aligned}$$
(1)

where L is the load (kg/year) at a modeled in-stream location, \(S_{ij}\) is the application of source i to catchment cell j, Ext describes the in-place removal of nutrients prior to transport (due to, for instance, denitrification by soil microbes, or removal of N and P through plant uptake and harvest), Fgrd is the fraction of nutrients entering the subsurface pathway, and Bgrd, Bsurf, and R are reduction factors that account for physical and biological attenuation in the watershed. Bsurf, Bgrd, Btile, R, Ext, and Fgrd are unitless; the units of S are kg/year.

Bsurf and Bgrd describe the proportion of input nutrients that remain after attenuation occurs during travel through the landscape to surface water; Bsurf represents the surface pathway and Bgrd represents the groundwater pathway. Bsurf and Bgrd are exponential functions of travel distance from cell j to surface water. These factors are calculated as

$$\begin{aligned} Bsurf_{j}=e^{(\alpha _{1}*D_{j})} \end{aligned}$$
(2)
$$\begin{aligned} Bgrd_{j}=e^{(\alpha _{2}*D_{j})} \end{aligned}$$
(3)

where \(\alpha _{1}\) and \(\alpha _{2}\) are empirically derived negatively-valued coefficients, and D is flow distance from cell j to the nearest stream. While in reality, depending on the time of year and the biologic and physical conditions, the basin could become a source of stored nutrients, it was assumed that the net effect of these processes would be small over the period described by the model. The reduction factors thus describe permanent removal of nutrients relative to the time scale of the model. For point sources, which are assumed to be applied directly to stream channels, D is set equal to 0, thus Bsurf and Bgrd are both equal to 1. This ensures that point sources only experience attenuation along the river pathway.
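Eqs. (2) and (3) can be illustrated with a minimal sketch. The coefficient values below are hypothetical placeholders chosen only for illustration (they are not the calibrated values), and the function name `pathway_factor` is an assumption of this sketch.

```python
import math


def pathway_factor(alpha: float, distance_m: float) -> float:
    """Fraction of a nutrient remaining after upland travel: exp(alpha * D)."""
    if alpha > 0:
        raise ValueError("alpha must be non-positive so the factor describes attenuation")
    return math.exp(alpha * distance_m)


# Hypothetical coefficients for illustration only (not calibrated values).
ALPHA_SURF = -0.001   # alpha_1, surface pathway
ALPHA_GRD = -0.0005   # alpha_2, groundwater pathway

b_surf = pathway_factor(ALPHA_SURF, 500.0)  # cell located 500 m from the nearest stream
b_grd = pathway_factor(ALPHA_GRD, 500.0)
b_point = pathway_factor(ALPHA_SURF, 0.0)   # point source: D = 0, so no upland attenuation
```

With D set to 0 the factor is exactly 1, which is how point sources bypass upland attenuation in this formulation.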

Ext is an extraction factor that accounts for the in-place removal of nutrients (such as harvest), which is only active for surface-applied sources (i.e. atmospheric deposition, chemical agricultural fertilizers, and manure) on land that is harvested. For all other cells and sources, Ext is equal to 1. The model applies a single value for Ext across all surface applied sources within harvested lands. While each of these sources may have different extraction rates, due to the complex nature of N or P cycling within the root zone of harvestable areas, it is difficult to separate these terms using the statistical approach applied here while still maintaining a parsimonious model.

R describes the proportion of nutrients that remain after attenuation occurs during the in-stream portion of the pathway and is an exponential function of both in-stream travel time and normalized basin yield (Eq. 4).

$$\begin{aligned} R_{j}=e^{(\alpha _{3}*T_{j})}*BY_{j}/BY_{max} \end{aligned}$$
(4)

where T represents the in-stream travel time from cell j to the downstream observation point, \(\alpha _{3}\) is an empirically derived coefficient, \(BY_{j}\) is the basin yield defined for each sub-watershed (defined as basin discharge divided by basin area), and \(BY_{max}\) is the maximum basin yield for the dataset. Rivers with slow moving water and/or small, shallow channels may experience more biologic and physical processing due to increased interaction between the channel bed, the hyporheic zone, and the biota in the stream (Alexander et al. 2000; Mainston and Parr 2002). These watersheds tend to have low basin yields.

One alternative groundwater pathway (septic plumes) and one alternative surface pathway (tile drains) were also considered in the model. Since attenuation in septic plumes occurs differently than attenuation in other areas (Valiela et al. 1997; Robertson and Cherry 1992; Reneau and Pettry 1976; Gilliom and Patmont 1983), Bsep, which describes septic plume attenuation, is substituted for Bgrd for the septic source. Bsep has the same form as Bsurf and Bgrd (\(Bsep=e^{(\alpha _{7}*D_{j})}\)). In the Lower Peninsula model, the coefficient for Bsep was fixed as \(\alpha _{7}=-0.002\) based on analysis by Valiela et al. (1997), who compiled data from several studies sampling septic plumes and concluded that roughly 35% of loss occurred in the septic plume over a 200 meter distance with an exponential loss rate. Btile (\(Btile=e^{(\alpha _{5}*D_{j})}\)) was substituted for Bsurf in cells where tile drains exist, since tiles alter the overland flow pathway.
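As a quick arithmetic check on the fixed septic coefficient, the short sketch below evaluates the exponential form of Bsep at a 200 meter travel distance. Only the value of \(\alpha _{7}\) and the 200 meter distance come from the text; the variable names are illustrative.

```python
import math

ALPHA_SEP = -0.002          # alpha_7, fixed value given in the text
DISTANCE_M = 200.0          # plume travel distance used in the comparison

remaining = math.exp(ALPHA_SEP * DISTANCE_M)  # Bsep at 200 m
loss = 1.0 - remaining                        # fraction attenuated in the plume
print(f"remaining = {remaining:.2f}, loss = {loss:.2f}")
```

The computed loss is about one third, which is consistent with the "roughly 35%" figure attributed to Valiela et al. (1997).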

Fgrd is a function of normalized groundwater recharge, and is calculated as

$$\begin{aligned} Fgrd_{j}=\alpha _{6}*recharge_{j}/recharge_{max} \end{aligned}$$
(5)

where \(recharge_{j}\) is the estimated recharge in each cell, and \(recharge_{max}\) is the maximum value of recharge across the study domain. Fgrd is multiplied by the source loading in each cell to determine the proportion of the load that travels via the subsurface pathway. For surface water cells, this model assumes that there is no subsurface pathway, and all nutrients applied directly to those cells (including atmospheric deposition that falls on surface water) are routed only via surface water. Similarly, Fgrd is set to 0 for point sources, which are only applied to river cells. Fgrd is set to 1 for the septic source, since septic tank loading occurs only in the subsurface.
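The pieces of Eq. (1) and the source-specific substitutions described above can be combined in a compact sketch. This is an illustration under stated assumptions rather than the authors' implementation: the class and function names, the source labels, and all parameter values except \(\alpha _{7}=-0.002\) are hypothetical, and the special handling of surface water cells is omitted for brevity.

```python
import math
from dataclasses import dataclass

# Sources for which Ext is active on harvested land, per the model description.
SURFACE_APPLIED = {"atmospheric", "ag_fertilizer", "manure"}


@dataclass
class Cell:
    distance_m: float       # D_j, flow distance to the nearest stream
    travel_time_h: float    # T_j, in-stream travel time to the observation point
    recharge_norm: float    # recharge_j / recharge_max
    yield_norm: float       # BY_j / BY_max
    harvested: bool = False
    tiled: bool = False


@dataclass
class Params:
    a_surf: float           # alpha_1
    a_grd: float            # alpha_2
    a_river: float          # alpha_3
    a_tile: float           # alpha_5
    a_fgrd: float           # alpha_6
    ext: float              # extraction factor on harvested land
    a_sep: float = -0.002   # alpha_7, fixed in the text


def delivered_load(source: str, applied_kg_yr: float, cell: Cell, p: Params) -> float:
    """Contribution of one source in one cell to the load at the observation point."""
    d = 0.0 if source == "point" else cell.distance_m
    ext = p.ext if (source in SURFACE_APPLIED and cell.harvested) else 1.0
    if source == "point":
        f_grd = 0.0                      # point sources enter the stream directly
    elif source == "septic":
        f_grd = 1.0                      # septic loading is entirely subsurface
    else:
        f_grd = p.a_fgrd * cell.recharge_norm
    a_sub = p.a_sep if source == "septic" else p.a_grd    # Bsep replaces Bgrd
    a_over = p.a_tile if cell.tiled else p.a_surf         # Btile replaces Bsurf
    b_grd = math.exp(a_sub * d)
    b_surf = math.exp(a_over * d)
    r = math.exp(p.a_river * cell.travel_time_h) * cell.yield_norm
    return ext * applied_kg_yr * (f_grd * b_grd + (1.0 - f_grd) * b_surf) * r


def total_load(applications, p: Params) -> float:
    """Sum over (source, applied_kg_yr, cell) triples upstream of the observation point."""
    return sum(delivered_load(s, kg, c, p) for s, kg, c in applications)


# Hypothetical parameter set, for illustration only.
p = Params(a_surf=-0.001, a_grd=-0.0005, a_river=-0.01, a_tile=-0.0002, a_fgrd=0.8, ext=0.4)
outlet_cell = Cell(distance_m=300.0, travel_time_h=0.0, recharge_norm=0.5, yield_norm=1.0)
septic_cell = Cell(distance_m=200.0, travel_time_h=0.0, recharge_norm=0.5, yield_norm=1.0)
example = total_load([("point", 100.0, outlet_cell), ("septic", 100.0, septic_cell)], p)
```

In this example the point source is delivered without upland loss, while the septic load is reduced only by Bsep, because both cells have zero in-stream travel time and a normalized basin yield of 1.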

Several alternative parameters were considered in the modeling structure including travel through lacustrine and palustrine wetlands, in-stream travel distance, and depth to bedrock. These parameters were not included in the final model because model estimates were relatively insensitive to these factors, and they did not improve model performance.

Model inputs

Nutrient source inputs for the model were estimated using a 90 meter grid resolution for the LP based on methods described in Luscz et al. (2015). Six independent sources of both N and P were mapped using readily-available GIS and remote sensing datasets, in addition to manually mapped urban-related features. Five non-point sources (atmospheric deposition, chemical agricultural fertilizers, non-agricultural chemical fertilizers, manure, and septic system loads) were described in addition to point source loads. Source-specific nutrient maps reproduced from Luscz et al. (2015) are included in the supplementary material (S3 and S4). Annual rates (kg/year) were used for all source inputs except for atmospheric deposition, for which seasonal rates were determined. The processes that lead to nutrients being available for transport are complex, and for modeling purposes we assumed that annual loads for upland terms are sufficient. Since atmospheric deposition is loaded to wetlands, seasonal inputs were estimated for atmospheric deposition based on seasonally averaged precipitation and deposition rates. Although point sources are also loaded to streams and wetlands, the point source contributions represent annual loads, as we assumed that point sources are largely stable since they are mostly outflows linked to populations served.

The following sections describe the methods used to develop the landscape and travel pathway factors that were used as inputs to the model. Travel pathway and landscape terms were calibrated seasonally and are based on seasonal inputs. However, the model showed insensitivity to seasonal variations in some of these terms, as discussed in the "Model optimization" section.

Watersheds and travel distance

Travel distance was calculated using the national elevation dataset (NED) (Gesch et al. 2002; Gesch 2007) and the ArcGIS Hydrology Toolbox, which calculates flow travel distance based on an elevation dataset. Sub-watersheds were generated using sampling locations as pour points in the ArcGIS Hydrology Toolbox.

Recharge

Recharge was estimated using a meta-model derived from a process-based hydrologic model of the Muskegon River watershed (Hyndman et al. 2007; Wiley et al. 2010), located in the central portion of the LP. The meta-model estimates the percentage of precipitation that becomes recharge from soil hydraulic conductivity values and land use class. The hydrologic model, built using the landscape hydrologic model (LHM, Hyndman et al. 2007; Kendall 2009), simulated the period of 1980–2007 with hourly timesteps at ~425 meter resolution. Average annual recharge and precipitation were calculated for each cell in the LHM simulation, then linear regressions were fit for each land use class to the proportion of annual precipitation that became recharge as a function of saturated soil hydraulic conductivity. These equations are included in the supplementary material. The recharge proportion meta-model was applied to Michigan’s LP using soil hydraulic conductivity derived from the soil characteristics of the soil survey geographic database (SSURGO) (Natural Resource Conservation Service) and land class data from the national land cover database (NLCD) (Fry et al. 2011). Estimates of annual precipitation were downloaded from the PRISM climate group (PRISM Climate Group 2011).
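The structure of the recharge meta-model, and its use in Eq. (5), can be sketched as follows. The regression coefficients, land use labels, conductivity units, and the clamping of the recharge proportion to the range 0 to 1 are all assumptions made for this illustration; the fitted equations themselves are in the supplementary material.

```python
# Hypothetical (slope, intercept) pairs per land use class, for illustration only.
COEFFS = {
    "forest": (0.01, 0.20),
    "row_crop": (0.008, 0.15),
}


def recharge_fraction(land_use: str, k_sat: float) -> float:
    """Proportion of precipitation that becomes recharge (linear in conductivity)."""
    slope, intercept = COEFFS[land_use]
    # Clamp to a physically meaningful proportion (an assumption of this sketch).
    return min(1.0, max(0.0, slope * k_sat + intercept))


def annual_recharge(land_use: str, k_sat: float, precip: float) -> float:
    """Recharge in the same units as precipitation."""
    return recharge_fraction(land_use, k_sat) * precip


def f_grd(alpha_6: float, recharge: float, recharge_max: float) -> float:
    """Eq. (5): fraction of nutrients entering the subsurface pathway."""
    return alpha_6 * recharge / recharge_max


cell_recharge = annual_recharge("forest", k_sat=10.0, precip=800.0)
cell_f_grd = f_grd(alpha_6=0.5, recharge=cell_recharge, recharge_max=480.0)
```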

Tile drainage

The tile drained area of Michigan was surveyed on a county level by the NRCS during the 1992 National Resource Inventory (NRCS 1995). However, the model developed here required updated information and a fully spatially-explicit tile drainage layer. To derive this layer, first, a 750 meter buffer was added to the Canal/Ditch feature of the National Hydrography Dataset (Roth and Dewald 1999). Second, cells classified in the 2006 NLCD (Fry et al. 2011) as “Cultivated Crops” were selected within each buffer. Since the Canal/Ditch feature includes stream features that do not drain tiles, further classification was required. As a result, the third step was selecting areas of low slope (cells where the average slope is less than 2% over 1 km) and/or low soil hydraulic conductivity (≤5.0 × 10\(^{-6}\) cm/s) within the Canal/Ditch buffers as potential tile drained areas. This three-step classification process ensured that areas that are extremely flat but have moderate drainage, such as those that surround Saginaw Bay, were classified as tile drained areas. The resulting tile drainage map is shown in Fig. 3.
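The three-step classification can be expressed as a per-cell rule. The sketch below assumes the buffer membership, land cover class, averaged slope, and conductivity have already been computed for each cell; the function and argument names are hypothetical.

```python
SLOPE_THRESHOLD_PCT = 2.0      # average slope over 1 km, from the text
K_THRESHOLD_CM_S = 5.0e-6      # soil hydraulic conductivity threshold, from the text


def is_tile_drained(in_ditch_buffer: bool, cultivated: bool,
                    slope_pct: float, k_cm_s: float) -> bool:
    """Apply the buffer, land cover, and slope/conductivity criteria to one cell."""
    low_slope = slope_pct < SLOPE_THRESHOLD_PCT
    low_conductivity = k_cm_s <= K_THRESHOLD_CM_S
    return in_ditch_buffer and cultivated and (low_slope or low_conductivity)


# A flat, moderately drained cropland cell inside a Canal/Ditch buffer qualifies.
flat_cell = is_tile_drained(True, True, slope_pct=0.5, k_cm_s=1.0e-4)
# A steep, well drained cropland cell inside a buffer does not.
steep_cell = is_tile_drained(True, True, slope_pct=4.0, k_cm_s=1.0e-4)
```

Using "or" between the slope and conductivity tests is what lets flat areas with moderate drainage be classified as tile drained.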

Fig. 3

Estimated tile drained areas for the LP of Michigan (indicated in black) estimated from soil type, slope, and NHD features

Stream velocity and travel time

Stream morphology plays a role in nutrient cycling by controlling the residence time of surface water in a watershed and the contact between water and the stream bed (Mainston and Parr 2002; Alexander et al. 2000). The model accounts for stream morphology using estimates of stream velocity, in-stream travel time, and basin yield.

Flow velocity was estimated using an empirical relationship derived by Leopold and Maddock (1953) that relates channel area and depth to discharge:

$$\begin{aligned} A=aQ^{b} \end{aligned}$$
(6)

where A is channel area (m2), Q is discharge (m3) and a and b are estimated coefficients with appropriate units. A similar approach has been used to assess stream morphology and discharge in several studies (Alexander et al. 2000; Schulze and Hunger 2005; Bjerklie et al. 2003). The relationship for channel area shown in Eq. (6) was divided by Q and manipulated to derive a relationship between velocity and discharge:

$$\begin{aligned} v=\frac{1}{a}Q^{1-b} \end{aligned}$$
(7)

The coefficients for these relationships were derived empirically from channel area and discharge data available from USGS gauges for Michigan’s Lower Peninsula (U.S. Geological Survey 2012). The USGS site visit observations for channel area and discharge were averaged for each gauge.
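One common way to estimate the coefficients of a power law such as Eq. (6) is ordinary least squares on log-transformed values; the text does not state the fitting method used, so the approach below is an assumption. The data are synthetic, generated from known coefficients so the fit can be checked, and the function names are illustrative.

```python
import math


def fit_power_law(discharge, area):
    """Fit A = a * Q**b by least squares on ln(A) versus ln(Q). Returns (a, b)."""
    x = [math.log(q) for q in discharge]
    y = [math.log(a) for a in area]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def velocity(q: float, a: float, b: float) -> float:
    """Eq. (7): v = Q / A = (1 / a) * Q**(1 - b)."""
    return q ** (1.0 - b) / a


# Synthetic gauge averages generated from a = 4.0 and b = 0.6 (illustrative values).
q_obs = [1.0, 5.0, 20.0, 80.0]
a_obs = [4.0 * q ** 0.6 for q in q_obs]
a_hat, b_hat = fit_power_law(q_obs, a_obs)
v_example = velocity(10.0, a_hat, b_hat)
```

Because velocity is discharge divided by channel area, Eq. (7) follows directly from Eq. (6) once a and b are known.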

Site visit data were filtered for discharge observations collected during high flow conditions (discharge observations between the 75th and 90th percentiles) and low flow (baseflow) conditions (discharge observations less than the 20th percentile). The 75th and 90th percentiles were chosen for high flow so that the dataset did not include observations collected under flood conditions. Values within 10% of the median value were selected to represent average (annual) flow conditions. The datasets were further subset based on the geology underlying the gauge location; three geologic models (till, lacustrine, and outwash) were fit for each flow condition (Farrand and Bell 1982). The results of the analysis are reported in the supplementary material.

In order to apply these relationships to all stream cells, the discharge at every location in the Lower Peninsula was estimated from the flow accumulation derived using NED and the ArcGIS Hydrology Toolbox. USGS gauge data were used to derive empirical relationships between calculated cell flow accumulation and discharge. The 80th percentile, median, and 20th percentile of discharge were used to create high flow, average flow, and low flow (baseflow) linear models, respectively. For the baseflow relationship, flow accumulation was multiplied by the baseflow index, which is the proportion of flow attributed to groundwater discharge as estimated for gauges by the USGS (Wolock 2003). One discharge model was created for each flow condition.

Fig. 4

Estimated in-stream travel time (hours) to sample point for the Saginaw Bay watershed under high flow (e.g., snow melt) conditions

In-stream flow length was calculated for each cell by subtracting the flow length between each cell and the stream network from the flow length between each cell and the downstream observation point. The in-stream velocity was estimated from the predicted discharge in each cell using Eq. (7), and was then used to calculate travel time by weighting the cell-by-cell flow length function within the Hydrology Toolbox. Calculated travel time to the Saginaw Bay watershed sample location is shown in Fig. 4.
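The travel time calculation amounts to summing, along the in-stream flow path, each cell's flow length divided by its estimated velocity. The sketch below shows that accumulation for a single path; the units (meters and meters per second) and the function name are assumptions, and the actual computation was carried out with the weighted flow length function of the Hydrology Toolbox.

```python
def travel_time_hours(path_cells) -> float:
    """Sum length / velocity over (length_m, velocity_m_per_s) pairs along a flow path."""
    total_seconds = sum(length_m / velocity for length_m, velocity in path_cells)
    return total_seconds / 3600.0


# Forty 90 m stream cells, each with an assumed velocity of 0.5 m/s.
path = [(90.0, 0.5)] * 40
hours = travel_time_hours(path)
```

Weighting the flow length by the inverse of velocity in this way is what converts a distance raster into a travel time raster.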

In-stream nutrient loading data

For the seasonal pathway and process models, in-stream nutrient loading estimates were calibrated to and validated using observations collected during synoptic sampling rounds. Flow measurements and nutrient samples were collected from the fall of 2010 to the spring of 2012 in five field campaigns intended to capture synoptic conditions during baseflow and snow melt in the Lower Peninsula of Michigan. Based on precipitation totals, the fall (2010 and 2011) and winter (2011 and 2012) seasons during the sampling period represented average conditions in the Lower Peninsula. The watersheds sampled represent a large range in watershed size, land use, and nutrient inputs.

The snow melt period was selected to capture the period following the first significant snow melt during the early spring which flushes accumulated nutrients from the landscape. The baseflow period, which occurs primarily in the late summer and early fall, was selected to capture the period when average stream flows are at a minimum due to limited inputs from surface runoff. The timing of the sampling events was determined based on data from the USGS stream gauges and weather data for the sampled watersheds.

Fig. 5

Map of Michigan’s Lower Peninsula showing the locations of nutrient loading observations. Nutrient loading observations that were used to calibrate the seasonal models were collected within four focus watersheds: Boardman-Charlevoix, Saginaw Bay, Muskegon River, and Grand River. The limits of these major watersheds are shown in grey. Points with an “asterisk” indicate observations that were not included in the annual model calibration. These observations represent watersheds with loads that are 1–2 orders of magnitude lower than the rest of the dataset

The two seasonal calibration datasets were collected from 2011 to 2012. For baseflow, samples were collected between August and October and for melt, samples were collected in early March. The 2011–2012 sampling focused on four major watersheds in the LP: Grand River (south central), Saginaw Bay (east), Muskegon River (north central), and Boardman-Charlevoix (northwest) (Fig. 5). Between 15 and 30 sites were sampled within each watershed for a total of 90 sites that were chosen to evenly distribute samples throughout the four watersheds.

The two seasonal validation datasets were collected in early October 2010 (baseflow) and mid-late March 2011 (melt). Sample locations for the 2010–2011 campaign were chosen to capture the vast majority of the surface water outflow to the Great Lakes from the LP, and to include smaller sites with greater variability in hydrogeologic and land use conditions. Sites were located at the farthest accessible downstream sample point on 33 major and 34 minor streams and rivers discharging to the Great Lakes; a total of 67 locations were sampled. This sampling, along with the suite of in-field and lab analytes collected, is discussed in greater detail in Verhougstraete et al. (2015). The watershed area for the combined sampling covers over 70% of the land area of the LP.

Stream discharge was either measured directly or obtained from the real-time USGS gauge station rated values (U.S. Geological Survey 2012) if a site was located at a USGS gauge. Discharge was measured using a Sontek RiverSurveyor S5 acoustic Doppler current profiler (ADCP) in all but lower flow streams, which were measured using a Marsh-McBirney Flo-Mate 2000. Samples to be analyzed for total nitrogen (TN) and total phosphorus (TP) were collected at the time of flow measurement. Samples were also collected near USGS stream gauges, and data from gauges were downloaded to coincide with the time of sample collection. Care was taken during sampling to ensure that the hydrologic events represented (baseflow or snow melt) were truly occurring, requiring rapid sampling following meteorological events.

Grab samples were collected in-stream or from bridge crossings and then immediately frozen on dry ice. The samples were analyzed by the Michigan State University Algae Lab using second derivative spectroscopy (TN) (Crumpton et al. 1992) and ascorbic acid methods (TP) following persulfate digestion (Standard Methods 4500-P.E. and 4500-N.C). The full sampling results are provided in the supplementary material.

Estimated annualized loads for each event were calculated as \(L_{obs}=Q*C\) where Q is the observed stream discharge (m3/year) measured at a site, and C is the concentration of a particular nutrient (kg/m3 of TN or TP) from the sample collected at a site.
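The annualization is mainly a matter of unit conversion. The sketch below assumes field discharge is recorded in m3/s and laboratory concentration in mg/L, which the text does not state, and uses a 365 day year; the function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumes a 365 day year
KG_PER_M3_PER_MG_PER_L = 1.0e-3      # 1 mg/L equals 0.001 kg/m3


def annualized_load_kg_yr(discharge_m3_s: float, conc_mg_l: float) -> float:
    """L_obs = Q * C, with Q converted to m3/year and C converted to kg/m3."""
    q_m3_yr = discharge_m3_s * SECONDS_PER_YEAR
    c_kg_m3 = conc_mg_l * KG_PER_M3_PER_MG_PER_L
    return q_m3_yr * c_kg_m3


load = annualized_load_kg_yr(discharge_m3_s=2.0, conc_mg_l=1.5)
```

For a discharge of 2 m3/s and a concentration of 1.5 mg/L, this gives an annualized load of about 94,600 kg/year.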

Annual loads reported by Robertson and Saad (2011) and used to calibrate the SPARROW Great Lakes model were also used to calibrate separate runs of the annual TN and TP models. These runs were performed so that the results and model performance presented in this work could be directly compared to SPARROW and the results reported in Robertson and Saad (2011). Loads from watersheds located entirely within the Lower Peninsula were selected from the larger Great Lakes dataset.

Three observations within the annual data set represent watersheds with loads that were 1–2 orders of magnitude smaller than the rest of that dataset. These observations (indicated by “*” in Fig. 5) include the River Raisin and two locations on the Huron River. Since the dataset for the annual model was relatively limited, these observations were excluded from the annual model calibration to prevent these points from biasing the calibration. However, loads calculated using the optimized model for these three locations have been included in the calculated R\(^{2}\) values and the presentation of residuals and results.

It is important to note that the loads reported by Robertson and Saad (2011) and used to calibrate the annual models presented here represent the long-term mean annual loads and were computed using Fluxmaster [as described in Robertson and Saad (2011) and Saad et al. (2011)] while the loads used to calibrate the seasonal models are annualized instantaneous loads, collected during a relatively stable flow period representative of each condition (snow melt and baseflow). The results from the annual models are used both for comparison to SPARROW, as well as to investigate seasonal effects of uptake along transport pathways.

Model optimization

The values of the coefficients \(\alpha _{1}\) through \(\alpha _{6}\) (\(\alpha _{7}\) was fixed) for each model were fit using the Matlab function fminsearch, which uses the Nelder–Mead simplex algorithm (Nelder and Mead 1965), a direct search algorithm. The function was executed using the Global Start function which runs the optimization function a large number of times using various start points generated with a scatter search algorithm (MathWorks 2016; Glover 1997). The global search algorithm identifies numerous local objective function minima generated from varying initial parameter values in an attempt to identify the global minimum. The aim of this approach was to ensure that a globally optimal parameter set is obtained, and to provide uncertainty estimates for the parameters.

The objective function (\(\phi\)) minimized in the optimization was the mean absolute difference between the natural log of the observed concentration of a particular nutrient and the natural log of the simulated concentration:

$$\begin{aligned} \phi =mean(abs[ln(C_{model})-ln(C_{observed})]) \end{aligned}$$
(8)

where \(C_{model}=L_{model}/Q\), Q is the observed stream discharge (m3/year), \(L_{model}\) is the modeled in-stream load (kg/year) calculated by the model equation (Eq. 1), and \(C_{observed}\) is the observed concentration of a particular nutrient (kg/m3 of TN or TP). The final coefficient values were determined by averaging the 10% of optimized coefficient values identified by the global search algorithm that had the lowest objective function value. During model development, objective functions were built using area normalized loads and total annual loads as the target of the objective function. Concentrations were ultimately selected for the objective function because they improved the model’s sensitivity to the pathway terms; when loads were used, the model was dominated by the flow term. However, following model calibration, the annualized load was calculated and used to present model results and residuals. This was done so that model results would be comparable to those presented for other regional scale models.
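The objective function in Eq. (8) and the averaging of the best-performing parameter sets can be sketched as follows. The representation of optimizer results as (objective value, parameter list) pairs and the function names are assumptions of this sketch, and the example runs are synthetic.

```python
import math


def objective(c_model, c_observed) -> float:
    """Eq. (8): mean absolute difference of log-transformed concentrations."""
    diffs = [abs(math.log(m) - math.log(o)) for m, o in zip(c_model, c_observed)]
    return sum(diffs) / len(diffs)


def average_best_tenth(runs):
    """Average the parameter sets in the best 10% of runs, ranked by objective value.

    runs: list of (phi, params) pairs, one per local optimum found.
    """
    ranked = sorted(runs, key=lambda run: run[0])
    k = max(1, len(ranked) // 10)
    best = [params for _, params in ranked[:k]]
    n_params = len(best[0])
    return [sum(params[i] for params in best) / k for i in range(n_params)]


# Twenty synthetic runs: objective value i with parameters [i, 2 * i].
synthetic_runs = [(float(i), [float(i), 2.0 * i]) for i in range(20)]
final_params = average_best_tenth(synthetic_runs)   # averages the runs with phi = 0 and 1
phi_perfect = objective([0.5, 2.0], [0.5, 2.0])
```

Working in log space means the objective penalizes relative rather than absolute concentration errors, so large watersheds do not dominate the fit.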

Six parameters were optimized, corresponding to the coefficients for Bsurf, Btile, Bgrd, Fgrd, Ext and R. Bounds were applied to the coefficients to reduce optimization time and ensure that the coefficients applied were in agreement with the process-based form of the statistical model. Coefficients that modify distance (Bsurf, Btile, Bgrd, and R) were limited between 0 and −1 since those coefficients describe the attenuation of nutrients that occurs along transport pathways. The proportional coefficients (Fgrd and Ext) were constrained between 0 and 1. Furthermore, the parameters for the exponential coefficients were log-transformed for optimization, allowing the optimization routine to operate in a relatively linear parameter space.
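One way to combine the bounds with a log-transformed search space is sketched below in Python. The exact transform used for the exponential coefficients is not specified above, so the mapping shown here (optimizing the natural log of the coefficient magnitude and capping it at zero) is an assumption, as are the function names.

```python
import math

def to_search_space(coefficient):
    """Map an attenuation coefficient in [-1, 0) to an unbounded log-space value."""
    if not -1.0 <= coefficient < 0.0:
        raise ValueError("attenuation coefficients are expected in [-1, 0)")
    return math.log(-coefficient)

def from_search_space(u):
    """Map a log-space value back to a coefficient, enforcing the lower bound of -1."""
    return -math.exp(min(u, 0.0))

def clamp_fraction(value):
    """Keep a proportional coefficient (Fgrd or Ext) within [0, 1]."""
    return min(1.0, max(0.0, value))
```

With this assumed mapping, a coefficient of −0.01 and one of −0.1 are one log-unit step apart in base 10 terms, which spreads small attenuation rates over a wider range of the search space.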

Some of the basin parameters included in the model describe processes that are not expected to exhibit significant seasonal variation. These parameters include the tile pathway (Btile) and those pertaining to the groundwater pathway (Fgrd and Bgrd). Initially, these parameters were optimized, allowing for seasonal variability; however, the model was insensitive to the seasonality of these parameters. To increase sensitivity, and to maintain a parsimonious model, these coefficients were linked during the seasonal model optimization. This was achieved by creating a combined optimization run in which the values for Btile, Fgrd, and Bgrd were shared between the baseflow and melt models. The combined objective function value was then used as the target of the optimization.
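The linking of the seasonal models can be sketched as follows in Python. The ordering of the nine-element parameter vector, the function names, and the use of a simple sum to combine the two seasonal objective values are all assumptions; the text above does not state how the combined value was formed.

```python
def split_linked_parameters(theta):
    """Split one parameter vector into baseflow and melt parameter sets.

    Assumed layout: [Btile, Fgrd, Bgrd, Bsurf_base, Ext_base, R_base,
                     Bsurf_melt, Ext_melt, R_melt].
    """
    shared = dict(zip(("Btile", "Fgrd", "Bgrd"), theta[0:3]))
    baseflow = {**shared, **dict(zip(("Bsurf", "Ext", "R"), theta[3:6]))}
    melt = {**shared, **dict(zip(("Bsurf", "Ext", "R"), theta[6:9]))}
    return baseflow, melt

def combined_objective(theta, phi_baseflow, phi_melt):
    """Evaluate both seasonal objectives with shared Btile, Fgrd, and Bgrd values.

    phi_baseflow and phi_melt are caller-supplied functions that take a
    parameter dictionary and return an objective value (summed here by assumption).
    """
    baseflow, melt = split_linked_parameters(theta)
    return phi_baseflow(baseflow) + phi_melt(melt)
```

Passing `combined_objective` to a single optimization run reduces the number of free parameters from twelve (six per season) to nine.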

Sensitivity analysis

Sensitivity was calculated for each parameter value by individually perturbing values from the globally-optimized parameter set, and computed according to

$$\begin{aligned} Sensitivity=\frac{\phi |_{optim}-\phi |_{optim*(1+\Delta )}}{\Delta *\phi |_{optim}} \end{aligned}$$
(9)

where Δ is the small change in the parameter value (0.5%) and \(\phi |_{optim}\) is the optimization function evaluated at the optimized parameter value. The calculated sensitivity for each parameter was then normalized to the most sensitive parameter for each linked model.
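The perturbation procedure of Eq. 9 can be sketched in Python as follows. The function name is hypothetical, and normalizing by the largest absolute sensitivity is an assumption about how the values were scaled to the most sensitive parameter.

```python
def relative_sensitivities(phi, optimum, delta=0.005):
    """Perturb each parameter by a fraction delta and normalize the sensitivities.

    phi: objective function taking a list of parameter values.
    optimum: optimized parameter values (phi(optimum) must be non-zero).
    """
    phi_opt = phi(optimum)
    raw = []
    for i in range(len(optimum)):
        perturbed = list(optimum)
        perturbed[i] = optimum[i] * (1.0 + delta)  # change one parameter at a time
        raw.append((phi_opt - phi(perturbed)) / (delta * phi_opt))
    largest = max(abs(s) for s in raw)
    return [abs(s) / largest for s in raw]
```

For a hypothetical objective `1 + 10*|p0 - 2| + |p1 - 4|` evaluated at `[2, 4]`, the first parameter receives a relative sensitivity of 1 and the second a value of about 0.2.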

Residual analysis

For residual analyses, error was calculated as

$$\begin{aligned} Error=log(L_{model})-log(L_{observed}) \end{aligned}$$
(10)

A residual of 1 indicates an order of magnitude over-prediction by the model. Residuals were analyzed to determine the potential for model bias. A linear regression analysis [following Alexander et al. (2002)] was performed to relate residuals to watershed characteristics such as land use, runoff, and watershed area.
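A minimal Python sketch of the residual calculation and of a regression of residuals on a watershed characteristic is given below. It uses base-10 logarithms, consistent with a residual of 1 representing an order of magnitude, and a single predictor for brevity, whereas the analysis described here relates residuals to several characteristics. The function names are hypothetical.

```python
import math

def log_residuals(modeled_loads, observed_loads):
    """Eq. 10: log10(L_model) - log10(L_observed) for each watershed."""
    return [math.log10(m) - math.log10(o) for m, o in zip(modeled_loads, observed_loads)]

def simple_linear_regression(x, y):
    """Ordinary least squares for one predictor; returns slope, intercept, and R2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared
```

A slope near zero and a low R2 for a given characteristic, such as the cultivated fraction of a watershed, would indicate little systematic bias with respect to that characteristic.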

Results and discussion

Model calibration

Table 1 Summary of model estimated parameters. The average and range of the top 10% of estimated model parameter results are reported for each model. For the seasonal models Bgrd, Btile, and Fgrd are linked, so the baseflow and melt models share the same values for these coefficients
Table 2 Summary of estimated model fit parameters

The optimized model coefficients are reported in Table 1, along with the observed range in coefficient values amongst the top 10% of optimized values. Most coefficients are relatively insensitive to differences in initial search values, varying by less than an order of magnitude.

The results for the six optimized models (TN Baseflow, TN Melt, TN Annual, TP Baseflow, TP Melt, and TP Annual) are shown in Tables 1 and 2. All models had adjusted R2 values greater than 0.72, and all but two models (TP Melt and TP Annual) had adjusted R2 values greater than 0.84. Overall, the nitrogen models performed better than their phosphorus counterparts except for the baseflow model. Seasonally, the baseflow models performed better than the melt models.

The performance of the annual models is comparable to the SPARROW models for the Lower Peninsula watersheds presented by Robertson and Saad (2011). Based on R2 and root mean squared error (RMSE), the TN annual model performs as well as the SPARROW model for watersheds in the LP; the R2 value for both models is 0.95 and the RMSE value is 0.03. The TP SPARROW model has a slightly higher R2 value and a slightly lower RMSE than the TP annual model presented here: this work yields an R2 of 0.81 and an RMSE of 0.11, whereas the SPARROW model for the LP watersheds yields an R2 of 0.86 and an RMSE of 0.05.
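For reference, the fit statistics discussed above can be computed as in the following Python sketch. The function name is hypothetical, and the choice of input values (for example, log-transformed loads) is left to the caller because the transformation used for the reported R2 and RMSE is not restated here.

```python
import math

def fit_statistics(observed, modeled, n_params):
    """Return R2, adjusted R2, and RMSE for paired observed and modeled values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r_squared = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalizes the number of fitted parameters.
    adjusted = 1.0 - (1.0 - r_squared) * (n - 1) / (n - n_params - 1)
    rmse = math.sqrt(ss_res / n)
    return r_squared, adjusted, rmse
```

The adjusted R2 requires more observations than fitted parameters plus one; otherwise the denominator is zero or negative.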

Fig. 6 Simulated and observed log daily loads for calibration and validation datasets

Figure 6 shows how log modeled loads compare to log observed calibration loads for the six optimized models, and validation loads for the four seasonal models. The plots show that most models have good correspondence to the 1:1 line. The TN Melt model shows a tendency to under-predict the highest and lowest loads. The annual models over-predict the lowest loads. Plots of observed yields and modeled loads are provided in the supplementary material.

Nutrient source delivery

Sources delivered to surface water

The results from the annual model show that atmospheric deposition and chemical agricultural fertilizer are the largest contributors to the in-stream nitrogen load (Fig. 7). However, the relative contribution of each source to the total nutrients delivered to surface water is highly variable seasonally and across land use types. During baseflow, atmospheric deposition is the primary source of nitrogen across all land use types except for agricultural land. During melt, atmospheric deposition is the majority source of nitrogen in unmanaged and urban land, while chemical agricultural fertilizer and manure are the largest sources of nitrogen in agricultural and rangeland, respectively. Point sources, septic systems, and direct atmospheric deposition become relatively larger contributors to total nutrients delivered when there is less runoff and more basin attenuation (such as during baseflow). The model predicts that there is very little basin processing during the melt event, so surface-derived sources are relatively more important during melt. The model predicts that significant sources of nutrients from urban areas can include non-agricultural fertilizer, septic tanks, and atmospheric deposition, depending on the season.

Fig. 7 Estimated percent of total nutrients delivered to surface water for each source

The largest contributors of phosphorus to surface water are chemical agricultural fertilizer, manure, and point sources. Atmospheric deposition contributes a relatively small amount of phosphorus and is treated as effectively zero in many nutrient loading models. However, during baseflow, the model predicts that nearly 15% of total phosphorus delivered to streams is derived from atmospheric deposition, so ignoring atmospheric deposition as a source of phosphorus may lead to under-prediction of total phosphorus in such models. This effect can be even greater in watersheds that contain large amounts of unmanaged land (forest, range, grasslands). The TP Baseflow model predicts that in unmanaged land, atmospheric deposition is the primary source of phosphorus during baseflow. Recent work has indicated that over the past decade, concentrations of TP in lakes and streams have increased across the United States in watersheds covering all land uses (Stoddard et al. 2016). Stoddard et al. (2016) suggest that since the increases appear to be ubiquitous, a widespread mechanism, such as atmospheric deposition, may be the cause.

Table 3 Predicted annualized nutrient export (kg/ha/year) by landuse for Lower Peninsula of Michigan

As shown in Table 3, the model predicts that there is a high degree of seasonal variability in the amount of nutrients exported from each land use. The export rate of phosphorus for urban land is approximately the same as cropland during baseflow, which may be related to the relative increase in point source delivery during baseflow (Fig. 7). The export rates for both nitrogen and phosphorus are significantly higher during melt than baseflow, particularly for agricultural land.

Fig. 8 Predicted annualized nutrient export to surface water (kg/ha/year)

The spatially explicit export of nutrients to surface water is shown in Fig. 8. The results show how the relative importance of different sources and land uses varies seasonally. During melt, urban and agricultural areas become significant source areas for nitrogen and phosphorus. Within this region, there are areas that export over 100 kg/ha/year of nitrogen and over 10 kg/ha/year of phosphorus on an annualized basis. These highly concentrated source areas correspond to confined animal feeding operations (in cultivated areas) and non-agricultural fertilizer application to golf courses, which were explicitly included in the nutrient source layers developed by Luscz et al. (2015) (provided in the supplementary material). The TP Melt model results show that locally, urban areas can seasonally export as much phosphorus as cultivated areas.

Sources delivered downstream

Table 4 shows the model-predicted share of the total annual nutrient load that is contributed by each source to the downstream load for basins used for the annual model calibration (model predicted source contributions to observed loads are provided in the supplementary material). The ranges in source contributions are shown in parentheses. These basins coincide with the Michigan Lower Peninsula basins used in the broader SPARROW Great Lakes model (Robertson and Saad 2011). The source contribution to total nutrient load predicted by this model agrees well with predictions from the SPARROW model. Both models predict that the predominant sources of nitrogen to the sub-basin loads are atmospheric deposition and fertilizer. This model predicts that for the watersheds used to calibrate this model, roughly 44% of the downstream load is derived from atmospheric nitrogen. Based on results provided by Robertson and Saad (2011), for the tributaries to Lakes Huron and Michigan that have drainage areas greater than 150 km2, the average contribution of atmospheric deposition is 51%. A few sources identified by the SPARROW model are surrogates for other sources not explicitly incorporated in that model.

Table 4 Average source contributions to total basin nutrient load predicted by annual model

For the Lakes Huron and Michigan tributaries that have drainage areas greater than 150 km2, SPARROW attributes an average of nearly 20% of the TP load to an “urban” source and 30% of the load to forested land. This model predicts that approximately 29% of total phosphorus for the sampled watersheds is contributed by non-agricultural fertilizer and septic tanks and 7% of phosphorus is contributed by atmospheric deposition. Depending on the nature of the development, increases in urban area may not lead to proportional increases in each type of urban source. For instance, low to medium density urban areas like the city of Lansing likely have a higher incidence of septic system and lawn fertilizer use than a high density urban area such as downtown Detroit. The explicit inclusion of these sources contributes to the model’s low bias (see "Residuals" section).

The SPARROW Great Lakes model for TN does not directly include contributions from urban land uses. These sources are either attributed to other sources or contribute to the model error (Robertson and Saad 2011). This model predicts that around 10% of the total monitored load for the Lower Peninsula watersheds is contributed by non-agricultural fertilizer and septic tanks, which are likely attributed to point sources in the SPARROW model.

Pathways and processes

The coefficients that represent process and pathway mechanisms quantify how the landscape attenuates nutrients. Each of the coefficients modifies a spatially explicit attribute that results in spatially explicit reduction factors. In general, significantly more attenuation and extraction of nutrients was predicted by the baseflow event model than the melt model for both N and P for seasonally-varying parameters (Table 1; Bsurf, R, and Ext). Parameter values optimized to the annual loads dataset fell between those seasonally-optimized for either the melt or baseflow event datasets. For those parameters simultaneously optimized across the seasonal models (Bgrd, Btile, and Fgrd), optimized values differed significantly between the annual and seasonal models.

In the case of Fgrd, the coefficient \(\alpha _{6}\) represents the portion of nutrients that enters the groundwater pathway for the maximum recharge value of about 1.1 meters per year. For cells that have recharge less than 1.1 meters per year, the model applies a reduced value of Fgrd according to Eq. (5). The optimized model predicts that seasonally, a maximum of 66% of nutrients enter the groundwater pathway while the annual model predicts that a maximum of 35% of nitrogen enters the groundwater pathway (Table 1). This variable is not expected to exhibit much seasonal variation, and therefore, the value of Fgrd should be similar for both the seasonal and annual models. Since the annual calibration dataset for the Lower Peninsula was limited, it is likely that a larger dataset could reduce the variability in the optimized values for this coefficient. For phosphorus, the annual and seasonal model predictions are similar; the seasonal models predict that a maximum of about 41% of nutrients enter the groundwater pathway while the annual model predicts 34% of nutrients enter this pathway. Based on the estimated recharge values for the LP, the average Fgrd for the model domain (considering seasonal results) is approximately 14% for TP and 22% for TN.

The groundwater basin reduction factors (\(\alpha _{2}\)) show that attenuation in the groundwater pathway occurs over much longer distances than along the surface pathway. The optimized coefficient for Bgrd is small across all models, suggesting that most nutrients entering the groundwater pathway persist across long distances. The models predict that more nitrogen than phosphorus is delivered via the groundwater pathway.

The results for the basin reduction factors (\(\alpha _{1}\)) indicate differences in processes and pathways both seasonally and between nitrogen and phosphorus. For example, the model indicates that more phosphorus and nitrogen are delivered to streams (less attenuated) during melt than during baseflow. The seasonal model predicts less variation in attenuation in the surface pathway for phosphorus than for nitrogen.

Tile drains affect flow pathways and processes in several ways. Tile drains change the natural drainage pathways of water, decreasing travel times to surface water and decreasing the time available for physical and chemical processes that remove nutrients from agricultural runoff (Robertson and Saad 2011). Several physical and biological processes that affect nutrients differently may affect the delivery of nutrients in tile drains. The results for the annual models (\(\alpha _{5}\)) indicate that annually, less attenuation occurs in the tile pathway relative to the surface (overland) pathway. The results for the seasonal models indicate that more phosphorus than nitrogen is retained by tile drained areas. A similar result was found for the SPARROW Great Lakes model (Robertson and Saad 2011), which showed increased delivery of nitrogen but decreased delivery of phosphorus in areas with tile drains.

The results for in-stream processing (\(\alpha {}_{3}\)) show that in basins with the largest basin yield, very little in-stream attenuation occurs. This is consistent with other studies that have shown that higher attenuation rates occur in small and/or slow moving streams where there is more time for water to interact with the stream bed (Robertson and Saad 2011; Mainston and Parr 2002; Alexander et al. 2000). The model also predicts that slightly more attenuation occurs during baseflow when discharge is lower.

As expected, optimized values for extraction (Ext), given by \(\alpha _{4}\), were significantly higher during baseflow than during the melt event for both N (94 vs. 0%) and P (94 vs. 2%). Annual values fell between the two event models, and had a greater uncertainty (range in top 10% of optimized model parameter values). Values of \(\alpha _{4}\) fell between 39 and 57% for the TN annual model, and between 13 and 41% for the TP annual model. As mentioned above for Fgrd, the greater uncertainty range in the annual models could be due to the more limited optimization dataset.

Sensitivity

The localized sensitivity near the optimum value of each parameter is shown in Table 5. The sensitivities reported in Table 5 were normalized to the most sensitive parameter in each linked model to indicate the relative sensitivity of each parameter. The results show that Ext was one of the most sensitive parameters, likely due to the significant variability of nutrient loads across agricultural land due to both manure and chemical agricultural fertilizer sources.

Table 5 Sensitivity of model objective function to 0.5% change in parameter value normalized to the most sensitive parameter in each event

Bgrd, which describes the attenuation occurring in the groundwater pathway, is one of the least sensitive parameters. The model used overland travel distance to describe spatial variations in groundwater attenuation; however, this may not accurately reflect the groundwater residence time, particularly in watersheds where the groundwatershed is significantly different from the surface watershed, or where hydraulic conductivity values vary significantly. A better description of the groundwater pathway may improve the sensitivity of the model to this parameter.

Residuals

The medians of the model residuals \((log(L_{model})-log(L{}_{observed}))\) for the calibration and validation watersheds are close to 0, and with few exceptions, the residuals are less than an order of magnitude (Fig. 9). Residuals by watershed are shown in Fig. 10. The spatial distribution of residuals is nearly random with the exception of the TP Melt model which tends to under-predict loads in watersheds in the northern part of the state. These watersheds are dominated by forested and undeveloped land while the southern watersheds are predominantly agricultural or urban land uses. This suggests that there may be a slight bias in the TP Melt model that leads to over-predictions of agricultural or urban sources and under-prediction of natural sources.

Fig. 9 Boxplots of model residuals for calibration and validation watersheds

Fig. 10 Log model residuals (kg/year) by watershed for calibration dataset

To explore this potential bias further, a regression of prediction errors and watershed characteristics was developed. This approach was used by Alexander et al. (2002) who compared several large basin statistical nitrogen loading models for the New England region. The same watershed characteristics that were used in their analysis were used here to assess model bias, including basin area, runoff (basin yield), cultivated area, and urban area. This approach provides an indication of model bias and a comparison of the performance of this model with other regional statistical nutrient loading models.

Table 6 Results of residual linear regression analysis comparing model residuals and basin parameters

The results of the analysis, which are shown in Table 6, indicate that most of the models have very little bias. The values shown in bold indicate parameters with p < 0.05. As suggested by the residual map, the R2 of the TP Melt model shows a potential relationship between the residuals and the regression parameters, specifically cultivated land, which is confirmed by a significant coefficient.

The results also suggest that the TN Annual and Baseflow models may have small biases related to cultivated land. However, this bias appears to be relatively small; plots of model residuals relative to watershed characteristics with significant coefficients (p < 0.05) are provided in the supplementary material. Based on this analysis, the modeling approach described here has similar or less bias (low R2, slopes, and least amount of significance attributed to regression parameters) than the regional TN models described by Alexander et al. (2002) with the exception of the TP Melt model.

The bias in the TP models may arise because the delivery of sediment, an important transport mechanism for phosphorus (Mainston and Parr 2002), is not directly described by the model. Relatively large amounts of sediment are mobilized during melt, and therefore, sediment delivery may be a sensitive parameter that is not included in the model. Additionally, the model does not explicitly include impoundments, which are common in the northern watersheds and affect the delivery of sediment downstream. Further analysis is needed to determine if these factors are the source of the model bias.

Conclusions

The model results indicate that chemical agricultural fertilizer is the largest annual source of nitrogen and phosphorus to surface water in Michigan’s Lower Peninsula. Atmospheric deposition contributes, on average, 4% of phosphorus delivered to streams and accounted for as much as 33% of the observed stream load in the modeled watersheds. Most models do not account for atmospheric deposition of phosphorus and thus may be attributing partial loads to other sources, particularly in forested watersheds where this is a more significant source of phosphorus. This work supports the hypothesis that in minimally disturbed watersheds, atmospheric deposition can be an important source of phosphorus.

The results also indicate that nitrogen and phosphorus export rates can be significantly greater during melt than baseflow. There is a large amount of variation in export rates between seasons and land uses, which has implications for management strategies and predicting how climate change will impact the delivery of nutrients (LaBeau et al. 2015). Resources for watershed management should be allocated based on the anticipated improvement in water quality (Destouni et al. 2006). Depending on the time of year, the dominant nutrient source in a watershed, and the locations of sources with respect to surface water, the effectiveness of different management strategies may vary. Spatially explicit nutrient models provide a method to estimate the efficiency of different strategies and the potential areas and sources that should be targeted.

With the exception of the annual models, the calibration dataset consists only of data collected from the Grand River, Saginaw Bay, Boardman–Charlevoix, and Muskegon River watersheds while the validation dataset contains samples covering nearly the entire Lower Peninsula. The validation dataset has high R2 values and closely follows the 1:1 line, demonstrating the geographic and temporal transportability of the model.

The model could be improved with a more detailed description of the groundwater pathway. Currently the model uses a surface travel distance to describe groundwater attenuation and uses a surface watershed boundary to determine nutrient inputs to the groundwater pathway. The groundwater pathway, though often ignored, may be a significant pathway for nutrient delivery in the Great Lakes (Robinson 2015). A number of factors affect the transport of nutrients in this pathway and a better description of groundwater travel pathway would especially improve model predictions in watersheds that have groundwatersheds that are significantly different from surface watersheds; the addition of travel time in the groundwater pathway would also improve performance since groundwater residence time can be highly variable.

In addition, the presented model does not incorporate the effect of wetlands and impoundments, which have been shown to impact the ability of watersheds to retain nutrients (Robertson and Saad 2011). A spatially explicit description of these features should improve the model performance and provide insight into their impact on nutrient loading. Finally, the bias of the phosphorus models needs to be explored further; perhaps a better description of sediment delivery is needed.

This work demonstrates the value of a spatially explicit description of nutrient sources and delivery mechanisms. Not only does the model perform well in watersheds with diverse conditions and scales, it also provides information beyond prediction of nutrient loads, including seasonal dynamics and source apportionment of delivered nutrients. It performs as well as other regional scale models (such as SPARROW), and the annual nitrogen model has less land use bias related to cultivated land area compared to other regional nitrogen models (modeling loads in New England) (Alexander et al. 2002; Robertson and Saad 2011). The model performed consistently across diverse watersheds with little to no land use bias (particularly the TN models) and across two different seasons. This suggests that this model has the potential to perform well in other regions and for other time periods. In addition, the inclusion of spatially explicit pathways makes it possible to identify the contribution of concentrated sources such as confined animal feeding operations, septic systems, and golf courses. While the relative contribution of these sources may be small at the regional scale, the model results suggest that locally, they can be a significant source of nutrients to surface water, especially in watersheds dominated by urban and suburban land uses. Most significantly, unlike most regional scale models that exist in the literature, this model is able to provide information on the differences in seasonal sources, pathways, and processes and defines within-basin variability without a significant increase in model complexity.