Introduction

Ecosystem services support and regulate the natural processes that humans depend on. With many services declining (MEA 2005), the need for ecological restoration is growing (Rey Benayas and others 2009). It would be efficient to restore multiple services in a single site (Kusler 2004; Banerjee and others 2013). However, not all services co-occur as “bundles,” which Raudsepp-Hearne and others (2010) defined as sets of ecosystem services that repeatedly co-occur across space or time. Some services, such as agricultural production and water quality improvement, might instead form “tradeoffs” (Rodriguez and others 2006). Most landscape analyses are based on indirect estimates of services and generally show that (1) services occur in characteristic bundles, and (2) no area maximizes all services, although the number that co-occur at high levels varies (Raudsepp-Hearne and others 2010; Eigenbrod and others 2010; Haase and others 2012; Miller and others 2012; Qiu and Turner 2013). The extent to which these findings apply to direct measures of services or to individual wetland sites or restoration projects remains unclear.

Wetlands are noted for improving water quality, abating flood waters, supporting biodiversity, and storing carbon (MEA—Wetlands 2005; Zedler and Kercher 2005; Jordan and others 2011; Moreno-Mateos and others 2012), and recent guidance calls for watershed approaches to sustain wetland area and services (NRC 2001; US Army Corps of Engineers 2008; US EPA 2012). A recent watershed-approach study entailed mapping seven potentially restorable services within each of 17 subwatersheds (Miller and others 2012). There is further need to determine which services are co-restorable at individual sites, but testing the compatibility of particular services requires intensive measurement (for example, Acreman and others 2011). There is some evidence that ecosystem functions and services depend on levels of plant diversity or net primary productivity (NPP; McNaughton and others 1989; Zavaleta and others 2010; Cardinale and others 2012; Hooper and others 2012). Positive correlations between NPP and plant diversity and other services are common in grassland experiments. The same may not hold in wetlands, where correlations between NPP and diversity are often lacking or negative (for example, Moore and Keddy 1989; Gough and others 1994; Schultz and others 2011). Assessments of multiple ecosystem services can help clarify bundles and tradeoffs and suggest co-restorable wetland services.

Concerted efforts to measure many wetland processes are relatively rare because they require integration of hydrodynamics, ecology, and biogeochemistry (but see Zedler and others 1986; Odum and others 1995; Mitsch and others 2012). Here, we applied the “intensive small-n” approach geomorphologists use to investigate complex small-scale processes (Richards 1996; Spencer and Harvey 2012) to assess provision of multiple services within three parallel constructed wetlands in the Yahara Watershed, in southern Wisconsin, USA.

Our three parallel wetlands had the same size, shape, elevation, topography, and soils, were planted with the same species, and received similar surface inflows; however, they drained differentially due to variation in subsurface clay thickness. Hydrologic regime is known to be a major determinant of wetland structure, function, and services (Brinson 1993; Brauman and others 2007). We capitalized on differences in drainage and hydrologic regime by assessing the differential development of structure and services in our wetlands over 3 years. Drainage rates differed visibly from the first rainfall following wetland excavation, and different vegetation established in year 1 (Boehm 2011). From 2010 to 2012, we monitored the development of two hydrologic services [flow attenuation (FA) and stormwater retention (SR), which could reduce erosion and flooding downstream], two vegetation-based services (NPP and plant diversity support, which could provide wildlife habitat and cultural services), and two water-quality based services (erosion resistance and water quality improvement, which could retain soils and reduce eutrophication downstream).

We addressed two questions underlying fundamental relationships among wetland hydrologic regimes and ecosystem services: (1) How did hydrologic regime affect six wetland services? (2) Which services formed bundles versus tradeoffs? We also indicate insights derived from our intensive small-n approach. To the best of our knowledge, ours are the first integrated measures of these services with the aim of identifying bundles and tradeoffs.

Methods

Site Description and Set-up

Our wetlands (3 parallel swales; Figure 1) were excavated in 2008–2009 within the University of Wisconsin—Madison Arboretum, Madison, Wisconsin (43.04°N, 89.42°W) to treat stormwater from a 45.7-ha urban watershed. Stormwater flowed through a 0.10-ha forebay, a 0.17-ha retention pond, then through identical weirs into four swales (96-m long, 8.7-m wide at inlet and 14.7-m wide at outlet; slope: 0.06 cm/m) separated by 0.3-m high earthen berms. Outflows moved through identical weirs to a 0.13-ha collection swale, then to a 0.30-ha retention pond. Inlet weir inverts differed by no more than 0.01 mm in elevation and outlet weirs varied by no more than 0.02 mm in elevation according to as-built surveys of the facility (Montgomery Associates, personal communication). Prior to planting, swales were capped with 15 cm of topsoil excavated from the site (six samples had means of: 28% sand, 59% silt, 14% clay, 40 ppm total phosphorus (P), 2033 ppm total nitrogen (N), and pH of 6.3; Montgomery Associates, personal communication). In November 2009, seed mixtures with 27 native wet prairie herbs were sown into Swales I, II, and III at a rate of 590 seeds/m2 with the same 16 assemblages planted in 16 equal-area “sections” running the length of each swale. Swale 0 (Figure 1) was seeded differently and excluded from this study.

Figure 1

Map of our research site, with: black and gray inverted triangles representing inlet and outlet weirs of each swale (where water level and contaminant loads were measured), white squares representing vegetation plots (where plant abundance and diversity were sampled), and black dots overlaid on vegetation plots representing cohesive soil strength test sites; background image from WisconsinView (www.wisconsinview.org).

Regulators required that stormwater be diverted around all swales in 2010 while vegetation established and further required that inflows be increased in 2013; hence, we sampled hydrologic and water quality services primarily in 2011–2012 when we controlled inflows.

Hydrologic Sampling

We monitored surface water flows using six pressure transducers (HOBO water level loggers, Onset Computer Corporation, Pocasset, Massachusetts, USA), which recorded water level every 60 s. Stilling wells constructed from polyvinyl chloride (PVC) pipe shielded the pressure transducers from sunlight and debris. At least biweekly we measured water level manually, reset the loggers to avoid sensor drift, and ensured that their sensor modules were clean.

We assessed the integrated effect of subsurface processes (loss by infiltration) on the surface water regime by measuring how quickly each swale’s surface water elevation decreased after storms. We measured this water level recession rate as the slope of the approximately linear portion of the water level time series, beginning once the storm passed and outflow over the weir ceased and ending either when the next storm arrived or when the water level receded below the sensor depth. This water level recession rate predominantly measures infiltration, but also includes a small evapotranspiration component that was likely similar in each swale.
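As an illustrative sketch only (not the authors' code), the recession rate described above can be estimated as the least-squares slope of water level against time over the post-storm window. The function name, the sample readings, and the choice of an ordinary least-squares fit are assumptions; the text specifies only that the slope of the approximately linear portion of the time series was used.

```python
def recession_rate_cm_per_day(times_days, levels_cm):
    """Least-squares slope of water level vs. time, returned as a positive rate."""
    n = len(times_days)
    if n < 2 or n != len(levels_cm):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_days) / n
    mean_h = sum(levels_cm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_days)
    sxy = sum((t - mean_t) * (h - mean_h)
              for t, h in zip(times_days, levels_cm))
    slope = sxy / sxx  # cm per day; negative while the water level falls
    return -slope


# Hypothetical readings taken after outflow over the weir has ceased
times = [0.0, 0.5, 1.0, 1.5, 2.0]        # days since outflow stopped
levels = [20.0, 17.0, 14.0, 11.0, 8.0]   # water level in cm
rate = recession_rate_cm_per_day(times, levels)
print(rate)
```

In practice the window would have to be trimmed to the approximately linear portion before fitting, which this sketch leaves to the caller.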

Sharp-crested, aluminum 30° V-notch weirs at the inlet and outlet of each swale regulated flow through the system and allowed for flow measurement. A pressure transducer installed upstream of Swale III’s inlet measured water level in the forebay and provided the data needed to calculate flow into each swale. Pressure transducers installed near the outlet of each swale provided the data to calculate flow out of each swale. In addition, one pressure transducer just downstream of each Swale III weir verified that weirs were unsubmerged, with no backflow, and weir equations were valid for computing flow from water level. Flow through the sharp-crested, V-notch weirs can be expressed as

$$ Q_{\text{V}} = C_{1} H^{5/2} , $$
(1)

where \( C_{1} \) is an experimentally determined constant and H is the depth of water above the invert of the weir (Ricketts and others 2004). From 27 May 2011 to 16 June 2011 and 12 September 2011 to 20 November 2011 (or, 8 of our 29 monitored storms), we attempted to control flows by sealing triangular PVC weir plates into weir V-notches, creating trapezoidal cross-sections with higher inverts. Flow through sharp-crested trapezoidal weirs can be expressed as

$$ Q_{\text{T}} = C_{2} H^{5/2} + C_{3} H^{3/2} , $$
(2)

where \( C_{2} \) and \( C_{3} \) are also experimentally determined constants (Ricketts and others 2004). We experimentally determined weir coefficients in a flume using a replica of the weirs installed in the stormwater management facility. When it became clear that drainage was affecting hydrologic regime more than weir plates (see “Results” section), we removed the plates. Thus, swales received equal inputs for all but 60 days during 2010–2012.
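A minimal sketch of equations (1) and (2) follows, to show how a logged water level translates into flow. The coefficient values are placeholders, not the flume-calibrated constants, and returning zero flow when the water level is at or below the weir invert is an assumption of this example.

```python
def v_notch_flow(head, c1):
    """Equation (1): Q_V = C1 * H^(5/2)."""
    if head <= 0:
        return 0.0  # assumed: no flow when water is at or below the invert
    return c1 * head ** 2.5


def trapezoidal_flow(head, c2, c3):
    """Equation (2): Q_T = C2 * H^(5/2) + C3 * H^(3/2)."""
    if head <= 0:
        return 0.0  # assumed: no flow when water is at or below the invert
    return c2 * head ** 2.5 + c3 * head ** 1.5


# Placeholder coefficients (hypothetical; the real values came from the flume)
C1, C2, C3 = 0.37, 0.37, 0.20

heads = [0.00, 0.04, 0.09, 0.16]  # hypothetical depths above the weir invert
q_v = [v_notch_flow(h, C1) for h in heads]
q_t = [trapezoidal_flow(h, C2, C3) for h in heads]
print(q_v)
print(q_t)
```

Because both expressions are power laws in H, small errors in measured head are amplified in the computed flow, which is one reason the loggers were checked against manual readings.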

Using data from the water level loggers and weirs, we quantified two hydrologic ecosystem services: FA (a swale’s capacity to reduce peak stormflow rates) and SR (a swale’s capacity to reduce the volume of stormwater conveyed downstream). We calculated \( {\text{FA}}_{\text{S}} \) for each swale as

$$ {\text{FA}}_{\text{S}} = 1 - \frac{1}{n}\mathop \sum \limits_{j = 1}^{n} \frac{{\hat{Q}_{{{\text{out}}_{{{\text{S}},j}} }} }}{{\hat{Q}_{{{\text{in}}_{{{\text{S}},j}} }} }}, $$
(3)

where \( \hat{Q}_{{{\text{in}}_{{{\text{S}},j}} }} \) and \( \hat{Q}_{{{\text{out}}_{{{\text{S}},j}} }} \) are the peak flows measured through the inlet and outlet weirs of swale S, respectively, during the jth storm. \( {\text{FA}}_{\text{S}} \) is equal to the average, over all storms, of peak FA expressed as a fraction of peak inflow. We calculated \( {\text{SR}}_{\text{S}} \) for each swale as

$$ {\text{SR}}_{\text{S}} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{m} Q_{{{\text{out}}_{{{\text{S}},i}} }} }}{{\mathop \sum \nolimits_{i = 1}^{m} Q_{{{\text{in}}_{{{\text{S}},i}} }} }}, $$
(4)

where \( Q_{{{\text{in}}_{{{\text{S}},i}} }} \) and \( Q_{{{\text{out}}_{{{\text{S}},i}} }} \) are the ith stormflow volumes measured through the inlet and outlet weirs of swale S, respectively, of the m measurements taken when all pressure transducers were sampling water depths. Thus, \( {\text{SR}}_{\text{S}} \) was a measure of the cumulative volume of surface water stored or removed during our monitoring period.

We considered higher scores on our FA and SR metrics to represent greater reduction in peak stormflows and stormwater discharge, respectively.
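The two metrics can be sketched as follows, using made-up numbers rather than measured data. The function names and sample values are hypothetical; the arithmetic follows equations (3) and (4).

```python
def flow_attenuation(peak_in, peak_out):
    """Equation (3): one minus the mean ratio of peak outflow to peak inflow."""
    ratios = [q_out / q_in for q_in, q_out in zip(peak_in, peak_out)]
    return 1.0 - sum(ratios) / len(ratios)


def stormwater_retention(vol_in, vol_out):
    """Equation (4): one minus total outflow volume over total inflow volume."""
    return 1.0 - sum(vol_out) / sum(vol_in)


# Hypothetical peak flows for three storms (any consistent flow unit)
peak_in = [10.0, 20.0, 40.0]
peak_out = [5.0, 8.0, 30.0]
fa = flow_attenuation(peak_in, peak_out)

# Hypothetical stormflow volumes (any consistent volume unit)
vol_in = [100.0, 200.0, 300.0]
vol_out = [80.0, 150.0, 250.0]
sr = stormwater_retention(vol_in, vol_out)
print(fa, sr)
```

Note the difference in aggregation: FA averages per-storm ratios, so every storm carries equal weight, whereas SR is a ratio of sums, so large events dominate.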

Vegetation Sampling

In 2010, Boehm (2011) sampled plant species richness and shoot biomass in 32 0.25-m2 plots spaced uniformly throughout each of Swales I, II, and III (96 plots in total). In 2011 and 2012, we sampled 1 m away from the previous year’s plot in a random direction to avoid previously disturbed vegetation (Figure 1). In May, July, and August of 2011 and 2012, we sampled: composition (presence of all vascular plant species rooted in each plot); maximum standing leaf height; leaf area index (LAI; using an Accupar-LP 80 ceptometer; Decagon Devices, Pullman, Washington, USA); moss cover (as percentage of centimeters with moss present along a 40-cm transect in the plot center); and presence of standing water at the plot center. In the last 2 weeks of August and first week of September, we harvested the year’s shoot biomass, litter mass (standing dead and loose litter from previous years), and root and rhizome biomass (hereafter, root biomass). We clipped shoots of plants that were rooted within each plot at the soil surface, sorted them by species, and dried them at 70°C for 48 h; we collected and dried litter mass the same way. Root samples were collected as one 10-cm-deep by 10-cm-diameter core taken from the center of each plot. Cores were washed immediately or stored at −5°C (to prevent decomposition) for up to 4 weeks; we spray-washed cores over a 1-mm mesh, removing adherent debris and soil by hand, and then dried and weighed root samples as with shoots and litter.

We used shoot biomass as an indicator of NPP because it represented annually produced material (not accumulated). We used species richness (derived from composition data) as an indicator of diversity support because Swales I–III received the same seed mixes and were open to colonization by the same naturally dispersed propagules.

Soil Stability and Water Quality Sampling

Soil Stability

We used a Cohesive Strength Meter (CSM; Model MKIV 60 psi, Partrac, Glasgow, UK) to measure critical shear stress as an indicator of surface soil stability. The CSM utilizes infrared optical sensors within a test chamber to measure water transparency after the soil surface is subjected to water pulses at increasing pressures to induce sediment detachment.

We sampled critical shear stress in a randomly chosen subset of vegetation plots (Figure 1), in September and November of 2010 and 2011, after collecting biomass. At each test site we categorized substrate surface as algal mat, moss mat, matted organic matter, bare soil, or muck soil; we defined bare soil as consolidated substrates that lack surface mats and muck soils as substrates with extremely low cohesion. We avoided obvious soil cracks and edges of algal mats. We discarded measurements with an initial beam transmission reading less than 70%, since this indicated prior surface particle disturbance. Multiple beam transmission measurements were averaged for each incremental pressure value. We converted vertically applied jet pressures to an equivalent horizontal bed shear stress (\( \tau_{o} \)), defined by Tolhurst and others (1999) as

$$ \tau_{o} = \left( {67 - 67e^{{ - \frac{x}{310}}} } \right) - \left( {195 - 195e^{{ - \frac{x}{1623}}} } \right), $$
(5)

where \( \tau_{o} \) (N m−2) is bed shear stress and x (kPa) is eroding pressure. Critical shear stress (\( \tau_{c} \)) was determined according to Black (2007). In some instances of highly resistant surfaces, sediment did not detach under the maximum producible pressure. In these instances, we recorded a critical shear stress of 9.12 Pa, the stress equivalent of our device’s maximum producible pressure. We considered swales with higher critical shear stress to have higher erosion resistance.
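A hedged sketch of the pressure-to-stress conversion follows. It writes the conversion in the form 67(1 − e^(−x/310)) − 195(1 − e^(−x/1623)), which returns zero stress at zero pressure; that form, the function name, and the sample pressures are assumptions of this example. Because the coefficients are rounded as printed, the output will not exactly reproduce the reported 9.12 Pa ceiling.

```python
import math


def bed_shear_stress(pressure_kpa):
    """Equivalent horizontal bed shear stress (N m-2) for a jet pressure (kPa)."""
    first = 67.0 * (1.0 - math.exp(-pressure_kpa / 310.0))
    second = 195.0 * (1.0 - math.exp(-pressure_kpa / 1623.0))
    return first - second


# Hypothetical sequence of increasing jet pressures (kPa)
pressures = [0.0, 50.0, 100.0, 150.0, 200.0]
stresses = [bed_shear_stress(p) for p in pressures]
for p, tau in zip(pressures, stresses):
    print(p, round(tau, 2))
```

The curve flattens at higher pressures, which is consistent with the device having a maximum producible shear stress that was assigned to surfaces that never eroded.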

Water Quality

We collected stormwater samples over the course of the hydrograph for 13 selected storms from September 2011 to October 2012 using solar-powered Teledyne ISCO Portable Samplers (Model 6712 and 6715FR, Teledyne ISCO, Lincoln, Nebraska, USA); with those ISCO samplers we used a combination of Bubble Flow and Area Velocity Flow modules to measure water head above the weir inverts. Prior to each storm, modules were calibrated with manual measurement of the water level above the weir invert to avoid sensor drift. We used equations (1) and (2) to calculate flow rates, which we integrated over time to estimate the inflow and outflow volumes of each swale for each storm.

Each ISCO unit was programmed to collect samples in up to 24 1-l bottles per event on a volume pacing basis. Sample pacing ranged from 7.6 to 15.1 m3 of water flow between samples, depending on the anticipated storm event size. Sampling continued until the water level receded below the weir invert or change in flow rate plateaued. All samples were iced during collection and transport. Bottles from each swale position were composited by equal volumes into 3–6 composites per ISCO, based on the assessed hydrographs. Composites were analyzed by the Wisconsin State Laboratory of Hygiene, Madison, Wisconsin, for concentrations of total suspended solids (TSS), total nitrogen (TN), total phosphorus (TP), and total dissolved phosphorus (TDP). Samples for TP and TDP analysis were preserved with H2SO4 immediately after compositing.

We calculated total inflow and outflow stormwater volumes per rain event per swale using weir level data from the ISCO modules and HOBO pressure transducers. To calculate event loads (masses) of TSS, TN, TP, and TDP, we first multiplied sample concentrations by corresponding flow volumes to determine incremental loads, then we summed incremental loads into event loads. We calculated removal efficiencies (%) per event per swale as: removal efficiency = (((inlet load − outlet load)/inlet load) × 100), with inlet and outlet loads in g.
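The load and removal-efficiency arithmetic can be illustrated with a short sketch. The concentrations and volumes below are invented for the example, and the function names are hypothetical; the unit identity used is that 1 mg/l multiplied by 1 m3 equals 1 g.

```python
def event_load_g(concentrations_mg_per_l, volumes_m3):
    """Sum incremental loads; mg/l times m3 gives grams."""
    return sum(c * v for c, v in zip(concentrations_mg_per_l, volumes_m3))


def removal_efficiency_pct(inlet_load_g, outlet_load_g):
    """((inlet load - outlet load) / inlet load) * 100."""
    return (inlet_load_g - outlet_load_g) / inlet_load_g * 100.0


# Hypothetical composites for one storm in one swale
inlet_load = event_load_g([8.0, 6.0, 4.0], [10.0, 15.0, 10.0])
outlet_load = event_load_g([5.0, 5.0, 3.0], [8.0, 12.0, 8.0])
efficiency = removal_efficiency_pct(inlet_load, outlet_load)
print(inlet_load, outlet_load, round(efficiency, 1))

# An outlet load larger than the inlet load yields a negative efficiency
exported = removal_efficiency_pct(100.0, 120.0)
print(exported)
```

Because efficiency is computed from loads rather than concentrations, a swale that infiltrates much of its inflow can show positive removal even if outlet concentrations are no lower than inlet concentrations.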

We considered higher contaminant removal efficiencies indicative of greater water quality improvement.

Comparison of Multiple Services

To facilitate comparisons of multiple services we normalized all swale means to the highest individual swale mean for each service. In the case of water quality improvement, which includes distinct component measurements, we first averaged the four contaminant removal efficiencies to create a more general water quality improvement score; to allow graphical comparison alongside other services we applied a linear transformation to water quality improvement scores such that all scores were positive prior to normalization.
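The normalization step might look like the sketch below. The FA and SR values are the swale scores reported in the Results; the water quality scores are hypothetical, and the particular shift used to make them positive is an assumption, since the text says only that a linear transformation was applied.

```python
def normalize_to_max(swale_means):
    """Divide each swale mean by the highest swale mean for that service."""
    top = max(swale_means.values())
    return {swale: value / top for swale, value in swale_means.items()}


def shift_positive(scores, margin=1.0):
    """Assumed linear shift: move scores so the lowest equals `margin`."""
    offset = margin - min(scores.values())
    return {swale: value + offset for swale, value in scores.items()}


fa = {"I": 0.50, "II": 0.53, "III": 0.34}  # reported FA scores
sr = {"I": 0.30, "II": 0.36, "III": 0.02}  # reported SR scores
fa_norm = normalize_to_max(fa)
sr_norm = normalize_to_max(sr)

# Hypothetical averaged removal efficiencies (%), some negative
wq_raw = {"I": -5.0, "II": 10.0, "III": -40.0}
wq_norm = normalize_to_max(shift_positive(wq_raw))
print(fa_norm, sr_norm, wq_norm)
```

Any shift of this kind preserves the rank order of swales but changes the ratios between them, so normalized water quality scores are best read as ranks rather than magnitudes.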

Statistical Analysis

For plot-scale vegetation data in each year and for storm-derived measurements (recession rates, FA, and removal efficiencies) made discretely for each swale over the course of our monitoring period, we compared means with one-way ANOVAs followed by Tukey pairwise contrasts at α = 0.05. For CSM measurements, we used the same procedure, but with ANOVA for unbalanced data. Vegetation parameters were measured in nearly the same location in each of the three growing seasons (2010–2012), so we also tested for effects among years and interactions between swale and year for those parameters, using repeated-measures ANOVA; to test for additional spatial variation we regressed vegetation parameters on plot position along the length of the swales. For all tests, we used the R statistical package (R Core Team 2012). All error terms presented in the form mean ± error are standard errors.
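The analyses were run in R; purely to illustrate what the one-way ANOVA compares, the following Python sketch computes the F ratio (between-swale mean square over within-swale mean square) for made-up data. It does not compute P values or Tukey contrasts, and the function name and data are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three swales
swale_1 = [1.0, 2.0, 3.0]
swale_2 = [2.0, 3.0, 4.0]
swale_3 = [5.0, 6.0, 7.0]
f_stat, df_b, df_w = one_way_anova_f([swale_1, swale_2, swale_3])
print(f_stat, df_b, df_w)
```

The same function accepts groups of unequal size, although a full treatment of unbalanced data, as used for the CSM measurements, involves choices this sketch does not address.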

Results

We accumulated 2.3 million measurements of water level, 912 of stormwater contaminant loads at swale inlets and outlets, 576 of plant abundance and diversity, and 141 of critical shear stress. Although constructed to be replicates, the wetlands developed different structures and services in relation to hydrologic regime.

Hydrologic Regimes

All swales received nearly the same inflows, but hydrologic regimes differed in association with subsurface heterogeneity (clay thickness; Montgomery Associates 2007). After storms, swale drainage rates differed consistently (Figure 2): Swale II drained significantly faster (averaging 5.9 ± 0.6 cm of recession in water level per day) than Swales I and III, which did not differ from one another in drainage rate (averaging 1.7 ± 0.2 and 1.2 ± 0.1 cm of recession in water level per day, respectively; n = 23 storms for each mean swale drainage rate). However, Swale III differed markedly from Swales I and II in extent and duration of inundation: averaging results from six surveys of our 96 vegetation monitoring plots in 2011–2012, Swale III was highest in percentage of plots with standing water (76 ± 10%), Swale I was intermediate (27 ± 12%), and Swale II was lowest (19 ± 9%). Though Swale III generally ponded water, one drawdown occurred during a severe drought in the summer of 2012. Over the monitored period (2011 and 2012), cumulative inflow was no greater in Swale III than in Swales I and II. In fact, our 60-day weir manipulation led Swale I to receive the greatest water volume, and Swales II and III to receive 91 and 84% as much, respectively. During our 29 monitored storms, Swale III always had an outflow, whereas Swales I and II each lacked outflow during three storms.

Figure 2

Representative time series of water surface elevation measured in each swale during four storms, showing consistently different water level recession rates between storms. Note that the gaps in data, in Swales I and II only, occurred when water levels dropped below the height of our data loggers.

Hydrologic Services

On average, during monitoring, Swale I (\( {\text{FA}}_{\text{I}} = 0.50 \)) and Swale II (\( {\text{FA}}_{\text{II}} = 0.53 \)) attenuated flows significantly more than Swale III (\( {\text{FA}}_{\text{III}} = 0.34 \); Figure 3A); also, cumulatively, Swale II retained the largest fraction of incoming stormwater (\( {\text{SR}}_{\text{II}} = 0.36 \)), Swale I was intermediate (\( {\text{SR}}_{\text{I}} = 0.30 \)), and Swale III retained the least (\( {\text{SR}}_{\text{III}} = 0.02 \); Figure 3B).

Figure 3

A Mean FA provided by the swales measured as a percentage of peak inflow. Bars indicate the value of \( {\text{FA}}_{\text{S}} \), the FA metric. Error bars are standard errors and Tukey letters are derived from post-ANOVA pairwise contrasts with 95% confidence intervals, n = 29. B SR measured as the percent reduction in cumulative outflow volume relative to cumulative inflow volume during 2011–2012.

Vegetation Structure

Vegetation composition, abundance, and diversity quickly differentiated, even though the three swales were seeded with the same 27 native species in the same proportions. Of those species, 18 occurred at least once in our 96 monitoring plots in 2010, 2011, or 2012 and 23 other species self-recruited (Online Appendix 1). The most frequent and abundant colonizers were cattails, Typha latifolia, Typha angustifolia, and their hybrid, Typha × glauca (hereafter “Typha”), especially in standing water. Typha occurred in over 80% of all plots and produced the majority of shoot biomass collected from all plots (55% in 2010, 87% in 2011, and 82% in 2012). In 2011, we found 29 plant species in Swale II, 19 in Swale I, and 9 in Swale III. In the same year, Typha was by far the tallest taxon and the most frequent, occurring in 100, 84, and 56% of plots in Swales III, I, and II, respectively. Typha was also the most abundant, comprising 99, 82, and 68% of shoot biomass collected from Swales III, I, and II, respectively (Online Appendix 1).

Additional variation in vegetation was visible along the lengths of Swales I and II in 2010. In both swales, areas near inlets were wetter and more invaded by Typha than areas farther down the swale. Accordingly, shoot biomass was higher near inlets and species richness was lower (P < 0.05 for linear regressions on section number (that is, position along the swale)). No such trends occurred in Swale III, where inundation and Typha dominance were more uniform. In 2011 and 2012, swale explained at least 4× the variation in shoot biomass and species richness, compared to section number (based on ANOVA sums of squares). At the site scale (pooling Swales I–III, n = 96 plots), year had no effect on mean species richness, but shoot biomass was significantly lower during the 2012 drought than in 2010 and 2011 (P < 0.05, one-way ANOVA). Swales ranked the same in shoot biomass in all years with no interaction between swale and year (P > 0.05, repeated measures ANOVA); likewise, ranks for species richness were the same in all years and showed no swale–year interaction (Figure 4). Our indicators of plant abundance and diversity, shoot biomass and species richness, followed the same trends as alternative indicators (Table 1; Online Appendix 2).

Figure 4

Mean levels of A shoot biomass and B species richness by swale and year. Error bars are standard errors and Tukey letters are derived from post-ANOVA pairwise contrasts with 95% confidence intervals, n = 32 for all bars.

Table 1 Indicators of Plant Abundance

Vegetation Services

During our sampling (2010–2012), Swale III consistently provided the highest level of NPP and Swale II the lowest, whereas Swale II provided the highest level of diversity support and Swale III the lowest; Swale I was intermediate in all cases and generally more similar to Swale II (Figure 4A, B).

Soil and Water

Soil Stability

Although all swales were capped with the same topsoil, they differed in cover of moss and algal mats and in soil stability. Moss and algal mats developed extensively in Swales I and II, where soils were not continuously inundated (Figure 2) and where more light penetrated the canopy (that is, lower LAI; Table 1). Mats were most frequent in Swale II, with 16 occurrences of moss mats and 18 of algal mats among the 50 plots sampled there for soil stability in 2010–2011. In contrast, there were zero occurrences of moss mats and three of algal mats among 37 plots sampled in Swale III in the same period. Matted organic matter, found in 21 of 37 plots, was the most frequent substrate in Swale III (where litter mass was high; Table 1); muck soils were second-most frequent, found in 7 of 37 plots.

Over all plots sampled for critical shear stress, muck soil was the most erodible substrate with an average \( \tau_{c} \) of 1.8 Pa, followed by bare soil at 3.0 Pa, matted organic matter at 5.6 Pa, algal mat at no less than 7.4 Pa, and moss mat at no less than 8.6 Pa (Prellwitz and Thompson 2014); in several instances, the latter two exceeded the maximum producible shear stress of the CSM (9.12 Pa). Independent observations of moss cover in all vegetation plots during plant sampling affirmed patterns observed in the subset of plots sampled for soil stability: swale means of moss cover were 7, 11, and 2 cm (of 40-cm transects) for Swales I, II, and III, respectively, in 2011; and mosses were far less frequent throughout Swale III over six samples in 2011–2012 (Online Appendix 1).

Erosion Resistance

Swale II was most erosion resistant, Swale I was intermediate, and Swale III was least resistant (Figure 5A), based on critical shear stress measurements at the soil surface in 2010–2011 (n = 53, 50, and 37 for Swales I, II, and III, respectively). The artificial maximum shear stress (that is, maximum measurable) was reached 8 of 53 times in Swale I, 21 of 50 times in Swale II, and 4 of 37 times in Swale III. Thus, our critical shear stress data are conservative, especially for Swale II.

Figure 5

A Mean critical shear stress (erosion thresholds) for each swale. Error bars are standard errors and Tukey letters are derived from post-ANOVA pairwise contrasts with 95% confidence intervals, n = 53, 50, and 37 for Swales I, II, and III, respectively. B Mean removal efficiencies of TN, TSS, TP, and TDP by swale. Error bars and Tukey letters are as above, n = 13 for Swale I and III and n = 12 for Swale II.

Water Quality

Contaminant loads were sampled at all swale inlets and outlets in 13 storms ranging from 5.6 to 64.9 mm of rainfall, excepting one storm in Swale II when equipment malfunctioned (n = 12 for all removal efficiencies in Swale II). The subset of storms sampled included 23 of the 111 dates with at least 1 mm of rainfall between 15 April and 8 November in 2011 and 2012 (some storms occurred over multiple dates; Online Appendix 3). During sampled storms, we recorded the following flow-weighted concentrations of contaminants averaged across all swale inflows: 7.3 mg/l of TSS, 0.82 mg/l of TN, 0.10 mg/l of TP, and 0.05 mg/l of TDP (Online Appendix 3). Cumulatively, the loads (masses) of contaminants entering all swales during storms were 60 kg of TSS, 6.6 kg of TN, 1.0 kg of TP, and 0.6 kg of TDP. In swales where downstream loads of some contaminants exceeded upstream loads, the “removal efficiency” was negative, that is, a net export of contaminants occurred.

The three swales differed significantly in removal of TSS, TN, TP, and TDP (TDP = the dissolved portion of TP). Although the sign and magnitude of removal differed by contaminant, one trend was consistent for all contaminants: removal efficiency was highest in Swale II, intermediate in Swale I, and lowest in Swale III (Figure 5B). Based on cumulative inflow and outflow loads during the storms we sampled in 2011 and 2012, the three swales combined to remove 17.7 kg of TSS and 0.44 kg of TN, but also combined to export 0.35 kg of TP and 0.36 kg of TDP. Thus, removal efficiencies differed by swale and contaminant.

Water Quality Improvement

None of the three swales provided substantial water quality improvement. Although negative removal efficiencies could be called an ecosystem disservice or lack of water quality improvement, swales did vary in their effect on water quality; Swale II had the most positive effect (some removal of TSS and TN and least discharge of TP and TDP), Swale I was intermediate, and Swale III had the most negative effect (some discharge of TSS and TN and most discharge of TP and TDP).

Overall Service Provision

Comparing all three wetlands for all six services measured over 2010–2012, Swale II ranked lowest in NPP and highest in FA, SR, diversity support, erosion resistance, and water quality improvement, whereas Swale III ranked highest in NPP and lowest in the other five services (Figure 6). Those patterns held in all years and for individual or averaged contaminant removal efficiencies (Online Appendix 4).

Figure 6

Relative provision of six ecosystem services for each swale. Data were normalized to the maximum swale value (1) for each parameter. For clarity, removal efficiencies of TSS, TN, TP, and TDP were averaged into a single water quality improvement parameter (Online Appendix 4).

Discussion

All Services Responded to Hydrologic Regime

Differences in levels of six wetland services were substantial, as indicated by ranges of means among swales: 18.0× in SR, 2.2× in NPP, 2.2× in diversity support, 1.8× in erosion resistance, 1.6× in FA, and from positive to negative water quality improvement. We attribute these differences principally to hydrologic regime, because the wetlands were initially replicates in size, shape, species planted, quality of inflowing water, and quantity of inflowing water (except 60 days with adjusted weir plates in 2011). It is recognized that hydrologic regime is a fundamental determinant of wetland characteristics (for example, Brinson 1993), and our study affirms that even small variations can have overarching effects on the development of wetland structure and services (Brauman and others 2007; Webb and others 2012). Differences in water recession rates (ranging ~5×; from 1.2 cm/day in Swale III to 5.9 cm/day in Swale II) caused differences in swale water levels between every storm. Much greater variation in infiltration capacity (orders of magnitude) occurs among soils that differ in texture alone (Rawls and others 1983). Yet in swales with similar topsoil and inflows, differential thickness of a subsoil clay layer was enough to produce distinct hydrologic regimes.

Swales I and II drained between storms and provided similar levels of most services (Figures 2, 6). We hypothesize a sequence of cause–effect mechanisms as follows: greater infiltration and periodically dry soils slowed flows and removed large volumes of surface water from the system, increasing FA and SR. Fluctuating water levels facilitated more species of wet prairie and sedge meadow plants and restricted Typha dominance (Frieswyk and Zedler 2007). Occasional drying aided erosion resistance by ensuring consolidated surface soils (not erodible muck soils; Grabowski and others 2011) and by allowing moss rhizoids and mucilaginous algae to stabilize surface soil particles. In addition, TSS, N, and P were “removed” (that is, pollutants were settled or infiltrated rather than discharged into downstream surface waters). These hypotheses are consistent with the idea that hydrologic services, plant diversity, and surface water quality improvement could co-occur in draining wetlands. We recommend further tests of those associations.

In contrast, we hypothesize that Swale III developed as follows: minimal infiltration led to prolonged inundation, which reduced the swale’s effective storage volume and allowed P to become soluble and exportable, as in other created or restored wetlands with similar hydrologic regimes (Aldous and others 2005; Boers and Zedler 2008; Montgomery and Eames 2008; Ardón and others 2010). Ponded water and available nutrients favored dominance by Typha, which is known to increase NPP and restrict plant diversity in natural and restored wetlands (Craft and others 2007; Frieswyk and others 2007; Boers and others 2007). Dense shade from Typha and its thick litter layer reduced light available to moss and algal mats on the soil surface, increasing the erodibility of surface soils. Again, we recommend direct tests of those proposed cause–effect mechanisms. Created sites offer the opportunity to perform such tests by manipulating vegetation and hydrologic regimes.

In experimental wetlands and mesocosms, pulsed or fluctuating water levels tend to favor greater N-removal than static water levels (Busnardo and others 1992; Phipps and Crumpton 1994; Tanner and others 1999; Mitsch and others 2012). Patterns of N-removal in our study match those results. It is possible that alternating oxic and anoxic conditions in Swale II favored coupled nitrification–denitrification there (in addition to N-removal via infiltration), or that anoxic conditions and high nutrient availability allowed denitrification in Swale III that was outweighed by export of particulate N along with TSS; direct measurement of denitrification could clarify the mechanisms underlying differential N-removal. Also, because removal rates of N and other contaminants are largely dictated by loading rates (Jordan and others 2011), it may have been unusually difficult for our system to substantially reduce loads of through-flowing contaminants due to relatively high-quality inflows; inflowing concentrations of all the contaminants we measured were around the 25th percentile of those reported for similar treatment wetlands in the International Stormwater BMP Database (2012; Online Appendix 3).

However, relatively low inflows of P allowed us to recognize an important consequence of hydrologic regimes in our system: P-export. Specifically, near-continual inundation of Swale III was associated with much higher P-export than in the periodically dry Swales I and II (Figure 5). Given that TDP loads increased to an even greater extent than TP loads between swale inlets and outlets, we conclude that exported P was almost surely mobilized from soils and detritus when swales were inundated and anoxic, as in previous studies (for example, Aldous and others 2005).
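The comparison of inlet and outlet loads can be illustrated with a short sketch. It assumes loads are computed as the sum of concentration × volume over sampled events, which is a common convention and not necessarily the exact procedure used here. The function names and all concentrations and volumes below are hypothetical, chosen only so that the outflow load is 2.5× the inflow load; they are not measurements from the swales.

```python
def load_g(conc_mg_per_l, volume_l):
    """Total load in grams from paired concentrations (mg/L) and volumes (L)."""
    if len(conc_mg_per_l) != len(volume_l):
        raise ValueError("each concentration needs a matching volume")
    return sum(c * v for c, v in zip(conc_mg_per_l, volume_l)) / 1000.0


def percent_removal(load_in_g, load_out_g):
    """Percent of the inflowing load retained; negative values mean net export."""
    if load_in_g <= 0:
        raise ValueError("inflow load must be positive")
    return 100.0 * (load_in_g - load_out_g) / load_in_g


# HYPOTHETICAL event data for a dissolved nutrient (not from the study).
inflow_conc = [0.05, 0.04]          # mg/L
inflow_vol = [1_000_000, 500_000]   # L
outflow_conc = [0.25, 0.10]         # mg/L
outflow_vol = [500_000, 500_000]    # L

tdp_in = load_g(inflow_conc, inflow_vol)     # about 70 g
tdp_out = load_g(outflow_conc, outflow_vol)  # about 175 g

print(f"outflow/inflow load: {tdp_out / tdp_in:.1f}x")
print(f"removal: {percent_removal(tdp_in, tdp_out):.0f}%")
```

Under this convention, an outflow load 2.5× the inflow load corresponds to a removal of −150%, that is, net export. Because loads depend on volume as well as concentration, a swale that infiltrates much of its inflow can reduce loads even when outflow concentrations are unchanged.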

Prolonged inundation commonly leads to productive monocultures of Typha or other wetland invaders (Kercher and others 2007; Boers and others 2007; Frieswyk and Zedler 2007; Boers and Zedler 2008; Hunt and others 2011). In studies of natural wetlands in the same watershed as our study site (the Yahara), Owen (1995) and Kurtz and others (2007) observed links between hydrologic regime and vegetation similar to those in our swales: intermittent drainage leading to sedge meadow and ponded water leading to cattail dominance. Tight linkages between hydrologic regime and wetland attributes and services highlight the importance of establishing target regimes; natural variation in subsoils resulted in different levels of each service in our created wetlands. Further, if hydrologic regime is a consistent and comprehensive driver of wetland services, bundling of services should be widespread.

Bundled Services and Tradeoffs were Evident

In a landscape in Quebec, Canada, Raudsepp-Hearne and others (2010) estimated 12 services and found six types of bundles, but no site with all services at high levels. In our Yahara Watershed, Qiu and Turner (2013) estimated ten ecosystem services and found that high levels of multiple services often did not coincide. Our site-based results from intensive sampling are similar and suggest a subset of co-restorable services in our site.

Bundling of five of the six services we assessed was indicated by their co-occurrence at higher levels in two draining swales; these were FA, SR, diversity support, erosion resistance, and water quality improvement. In contrast, a tradeoff was suggested by the co-occurrence of the lowest levels of these services with prolonged inundation and very high NPP. The relationships among services were complex and were also affected by hydrologic regimes, but our detailed data helped us hypothesize mechanisms underlying bundles and tradeoffs.

We observed a positive association between erosion resistance and water quality improvement, but neither of those was positively correlated with NPP. In Swales I and II, rapid drainage and short, sparse vegetation coincided with erosion-resistant surface soils and TSS removal. We propose that low vascular plant biomass allowed light to penetrate the canopy and soil-stabilizing moss and algal mats to expand (as in Bergamini and others 2001), and that mats helped prevent Swales I and II from contributing sediments downstream. In contrast, Swale III had the least moss and algal mat cover and was the only exporter of TSS. We sampled the cohesive strength of surface soils at small scales (<0.01 m²), so it is unclear how the local condition of mat presence compared to whole-swale conditions (for example, periodic dryness of soils) as a determinant of sediment detachment and export. However, the very high level of NPP in Swale III did not ensure sediment retention and nutrient removal, as is sometimes expected (Toet and others 2005; Quijas and others 2010; Mitsch and others 2012). From 2011 to 2012, NPP in Swale II was less than half that in Swale III, yet all plots in Swale II were vegetated and each plot produced at least 98 g of shoot biomass/m²/year. We speculate that the presence of moderately productive vegetation in Swales I and II was sufficient to limit erosion, and that very productive vegetation in Swale III indirectly contributed to sediment detachment and TSS export by shading out soil-stabilizing mats.

Similarly, very high NPP appeared to restrict plant diversity, as in other restored wetlands (Doherty and others 2011; Doherty and Zedler 2014) and in the experimental test of Kercher and others (2007) involving an invasive grass, Phalaris arundinacea. In the latter experiment, invasion and loss of resident species coincided with flooding, indicating an interaction between hydrologic regime and a highly productive plant. In natural wetlands as well, low NPP and high diversity co-occur where drainage is relatively fast, whereas high NPP and low diversity co-occur where drainage is slow (Moore and Keddy 1989; Amon and others 2002; Kurtz and others 2007; Acreman and others 2011; Webb and others 2012). In addition to NPP, leaf height and litter accumulation often confer competitive advantages in crowded plant communities (Grime 1979; Givnish 1982); the Typha-dominated vegetation in Swale III was taller and accumulated more litter than vegetation in Swales I and II (Table 1). In nutrient-rich wetlands, Typha, P. arundinacea, and other productive dominants are known to lower plant diversity by excluding other species (Green and Galatowitsch 2002; Craft and others 2007; Boers and others 2007; Jelinski and others 2011), though the same does not appear to be true in uplands (Adler and others 2011; Schultz and others 2011; Cardinale and others 2012). In wetlands, efforts to moderate NPP, for example, by not refilling treatment wetlands with nutrient-rich topsoil, could promote plant diversity.

Overall, our results fail to support the idea that levels of most ecosystem services track NPP (McNaughton and others 1989; MEA 2005; Zavaleta and others 2010; Cardinale and others 2012; Hooper and others 2012). Because plant cover was correlated with NPP, cover would also have been a misleading indicator of overall wetland service; we join others in cautioning against the use of plant cover as an indicator of overall wetland function or service (Cole 2002; Fennessy and others 2007; Matthews and Endress 2008; Moreno-Mateos and others 2012). For particular services, NPP and cover can serve as indicators; for example, tall, dense vegetation can indicate salt marsh bird nesting (Zedler 1993), aboveground biomass can indicate potential biofuel yield or nutrient removal via harvest (Meerburg and others 2010), and large root systems can indicate potential to limit methane emissions by aerating anoxic sediments (Bouchard and others 2007; Fausser and others 2012). Some level of NPP is necessary to support other services (MEA 2005), but our data suggest that very high NPP can be detrimental. We hypothesize a hump-shaped curve for wetland ecosystem services versus NPP, just as intermediate levels of NPP support maximal diversity in some biotic communities (Mittelbach and others 2001).
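One simple way to express a hump-shaped relationship, offered only as an illustration, is a concave quadratic. The functional form and the symbols S (level of a service), x (NPP), S_0, b, and c are assumptions introduced here; they are not fitted to our data, and other unimodal forms would serve equally well.

```latex
% Illustrative (assumed) form: service level S as a concave quadratic in NPP x
S(x) = S_0 + b\,x - c\,x^2, \qquad b > 0,\; c > 0

% The peak occurs where the slope is zero
\frac{dS}{dx} = b - 2c\,x = 0 \quad\Longrightarrow\quad x^{*} = \frac{b}{2c}

% Service level at the peak
S(x^{*}) = S_0 + \frac{b^2}{4c}
```

Under this form, services rise with NPP below x* and decline above it, which matches the hypothesis that some NPP is necessary but very high NPP is detrimental.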

Small-Scale, Integrated Assessments Provided New Insights

Large-scale maps drawn from available environmental or land cover data allow for spatially extensive characterization of ecosystem services (for example, Raudsepp-Hearne and others 2010; Miller and others 2012) but are more reliable if derived from primary data (Eigenbrod and others 2010) and ground-truthed (Qiu and Turner 2013). We were able to attribute overall differences in services to differential hydrologic regimes because our three swales (area ~0.12 ha each) were designed to hold wetland size, grade, geometry, soils, light levels, and inflows constant. That design, plus intensive direct measurements, allowed us to recognize patterns in service provision and to hypothesize specific cause–effect mechanisms.

Integration of hydrologists, ecologists, and engineers enhanced understanding of interrelationships among ecosystem services (Rice and others 2010; Spencer and Harvey 2012). Direct measurement of water level, LAI, and abundance of moss and algal mats helped us explain variation in surface soil stability (Prellwitz and Thompson 2014), which would have been overlooked by measuring only whole-swale TSS removal. Estimating erosion resistance by directly stressing surface soils to induce sediment detachment (Tolhurst and others 1999) indicated the importance of moss and algal mats. Had we relied instead on root biomass (Quijas and others 2010) or soil loss equations (for example, MUSLE; Williams 1975) to assay erosion resistance, we would have concluded that Swale III was most resistant. Instead, direct measurements of contaminant loads showed that Swale III exported TSS, and monitoring of surface soil stability, vegetation, and water levels helped explain the export of TSS and TN in Swale III and export of TP and TDP from all three swales. Sampling TDP (not just TP; Toet and others 2005) showed that Swale III outflows had 2.5× the TDP load (in g) of its inflows, and earlier mechanistic studies on P-release (for example, Boers and Zedler 2008) led us to conclude that topsoil P (and its co-precipitates) had become soluble during prolonged inundation. Further, TP and TDP export in Swales I and II would not have been evident from measurements of the particulate contaminant TSS alone (both swales removed TSS). We recommend more direct measures of erosion resistance, and measurements of dissolved and particulate nutrients and TSS to assess the efficacy of TSS as a general indicator of water quality (Collins and others 2010).

Conclusions

We found substantial variation in six wetland ecosystem services due to differences in the hydrologic regimes of three wetlands that were otherwise replicates. Ponding facilitated invasion and dominance by highly productive Typha, as well as mobilization and export of P. In comparison, drainage supported greater flood regulation, more diverse vegetation, a greater abundance of soil-stabilizing moss and algal mats, and cleaner water (Swales I and II discharged fewer contaminants than Swale III, though they exported some P). Associations between wetland hydrologic regime and services were strong, but services also appeared to be linked. The interactions among hydrologic regime and plant-based services gave rise to patterns like those seen in nearby natural wetlands (Kurtz and others 2007). If bundles of services occur reliably in wetlands, and certain bundles of services are co-restorable, restorationists will be able to set more achievable targets for individual sites. Identification and verification of such bundles require interdisciplinary research with direct measurement of multiple services. Clearer understanding of which services a given site (and its hydrologic regime) can support will enable restoration planners to locate projects in sites that offer complementary services or particular services that are needed based on large-scale assessments and watershed approaches (NRC 2009; Miller and others 2012; Wilkinson and others 2013).