Coal tar-based sealants (CT-sealants) are used to beautify and protect off-street asphalt pavements (i.e., driveways, parking lots, and some playgrounds) in many urban areas of North America. A thin layer of this product is either sprayed or brushed on asphalt surfaces, and it must be reapplied every few years due to abrasion and loss of the sealant. The composition of this black emulsion has changed little since it was first introduced before World War II (Walters 2013), and it generally contains 20–35 % refined coal tar pitch (Mahler et al. 2012). This pitch results from the distillation of coal tar, a by-product of coking coal to produce the metallurgical coke used in steelmaking (Baron et al. 1998). Coal tar pitch is a known human carcinogen (National Toxicology Program 2011), is composed of numerous polycyclic aromatic hydrocarbons (PAHs) and their methyl and polymethyl derivatives (ATSDR 2002), and is classified in the United States as a hazardous waste product if disposed on land (USEPA 2013a). A 1992 exclusion of the United States Resource Conservation and Recovery Act (RCRA) allows it to be recycled (USEPA 1992), most notably for use as the binder for aluminum smelting electrodes (95 % of use; Valle et al. 2007). Less than 5 % is recycled for the production of CT-sealants (Valle et al. 2007).

During the past decade, CT-sealants have emerged as an important source of PAHs in urban environments in North America, particularly in parking lot dust, stormwater runoff, lake and stream sediment, soil, settled house dust, and the air over sealcoated parking lots (Rowe and O’Connor 2011; Mahler et al. 2012; Pavlowsky 2012). Dust from CT-sealants, followed by vehicle emissions and coal combustion, were the primary sources of PAHs to a subset of urban lakes in the central and eastern United States (Van Metre and Mahler 2010). This contamination is of concern because it may pose unacceptable risks to ecological receptors and human health. PAHs associated with CT-sealant particles have been shown to be bioavailable to aquatic benthic invertebrates and to cause acute and chronic effects to these organisms in an outdoor laboratory test system (Bryer et al. 2010) and downstream of parking lots where CT-sealants were applied, respectively (Scoggins et al. 2007). Amphibians, such as frogs and salamanders, are sensitive indicator species to aquatic contaminants due to their thin skin, which is permeable to water, and their use of wetlands and ponds for mating and early life-stage development. Sediment spiked with CT-sealant at a range of total PAH concentrations found in urban water bodies (i.e., ≤300 mg/kg) adversely affected the growth and survival of early stage frogs (Xenopus laevis; Bryer et al. 2006) as well as the ability of larval spotted salamanders (Ambystoma maculatum) to swim when exposed to CT-sealant under ultraviolet (UV) light (Bommarito et al. 2010). These effects may make amphibians more susceptible to predators. For people residing near CT-sealed parking lots, the estimated lifetime cancer risk is increased at 6 × 10⁻⁵ for a scenario where the incidental ingestion of PAH-contaminated house dust (tracked in from parking lots) occurs during the first 6 years of life (Williams et al. 2013).

Stormwater ponds have not previously been studied to determine if CT-sealants contribute to the source apportionment of PAHs in bed sediments. Stormwater ponds are commonly used in North America and elsewhere in the world (e.g., Europe, New Zealand) as a popular best-management practice (BMP) for the treatment of urban stormwater runoff (Crane et al. 2010; Minnesota Stormwater Steering Committee 2013). This stormwater runoff frequently contains a mixture of PAHs, mostly from pyrogenic (i.e., combustion) and/or petrogenic (i.e., oil-based) sources, and these contaminants are ubiquitous in urban environments (Battelle Memorial Institute et al. 2003). PAH compounds with higher molecular mass (i.e., four to six rings) are less water soluble, more hydrophobic, and more lipophilic than those compounds with two or three rings, resulting in preferential partitioning to organic carbon, fat in tissues, and particles in air, land, and water (Burgess et al. 2003). The presence of black carbon (e.g., soot) increases the sorption of PAHs, too (Cornelissen et al. 2005). Upon reaching a stormwater pond designed to slow the flow of runoff, PAHs associated with organic and inorganic particles in stormwater will mostly settle out to the bed sediments. Constructed ponds have a typical design life of approximately 20–30 years with large-scale sediment removal suggested every 8–12 years (Crane et al. 2010). When stormwater ponds were first implemented approximately 30 years ago, the accumulation of metals was of most concern (USEPA 1983). However, potentially toxic concentrations of PAHs are now being found in pond sediments (Crane et al. 2010; Weinstein et al. 2010a, b; Mahler et al. 2012).

In the United States, stormwater runoff is usually transported through publicly owned Municipal Separate Storm Sewer Systems (MS4s) and then discharged into local lakes and streams (USEPA 2013b); Canada does not have a counterpart to this program. Publicly owned stormwater ponds comprise one part of the MS4 conveyance system. Use of the term “stormwater pond” in this paper includes both constructed ponds and natural water bodies (e.g., ponds, wetlands) that receive stormwater. Stormwater discharges from MS4s are regulated through National Pollutant Discharge Elimination System permits that are administered by state agencies through the federal Clean Water Act (USEPA 2013c). The Minnesota Pollution Control Agency (MPCA) is a state agency that regulates the stormwater discharges of 236 entities (e.g., municipalities, universities; MPCA 2013a). One municipality can have oversight of potentially hundreds of stormwater ponds through a single MS4 permit, and the pond sediments must be analyzed for a suite of PAHs, copper, and arsenic before any sediment removal activities are conducted in Minnesota.

Several cities in Minnesota have determined that concentrations of PAHs are high enough in their stormwater ponds to warrant expensive disposal of the dredged sediment in specially lined landfills. Disposal costs could reach $1 billion if just 10 % of the estimated 20,000 municipal stormwater ponds in the Minneapolis–St. Paul, MN (Twin Cities) metropolitan area exceed the state PAH guideline value for restricted disposal (Mahler et al. 2012). This is an important emerging issue in Minnesota and likely other states and countries that use stormwater ponds as a primary BMP for flood control and water-quality improvement.

Determining the sources of PAHs to Minnesota’s stormwater ponds is needed to further source control and pollution prevention efforts to decrease inputs of PAHs to stormwater. The distribution of PAHs in sediment provides a unique “fingerprint” that can be compared with known source “fingerprints” using environmental forensic techniques to allocate sources of PAHs. These techniques range from simple diagnostic PAH source ratios to statistical ordination techniques, such as principal components analysis, to quantitative mixing models (Boehm 2006).

In this study, environmental forensic techniques were used to determine sources of PAHs, such as those from CT-sealants, in the bed sediments of a representative subset of stormwater ponds from the Twin Cities, MN. The effects of land use on the distribution of PAHs in pond sediments were also assessed within the context of ancillary measurements of black carbon, total organic carbon (TOC), and particle size, which can affect the fate and distribution of PAHs. In addition, nationally based ecological and human health indicators were calculated and compared with benchmark values to determine potential risks to benthic invertebrates as well as to determine local disposal options for dredged material based on human health risks. To accomplish these objectives, different subsets of PAH compounds were used to provide the most relevant comparisons. Finally, the management implications of these findings were considered, including their relevance to other local, regional, national, and international jurisdictions.

Materials and Methods

Sample Collection

Before sample collection, stakeholder input and field assessments were used to select appropriate study ponds. Input was solicited from 39 government units in the seven-county Twin Cities, MN, metropolitan area to request assistance with identifying candidate stormwater ponds in the MS4 program. A broader request for assistance was also sent to the Minnesota Cities Stormwater Coalition, which helps members implement federal and state stormwater requirements. Further consideration was given to ponds for which the municipalities granted the MPCA permission to sample and use the data. A suite of candidate stormwater ponds was evaluated to assess whether they met the following four study criteria: (1) ≥5 years of age, (2) ≥15 cm soft-sediment deposits, (3) accessible by field kayak for sampling, and (4) watershed dominated by residential, commercial, or industrial land uses. The municipalities’ designations of major land uses were supported by their geographic information system (GIS) land-use data, aerial maps, or the professional knowledge of the municipal stormwater manager or engineer. For pond watersheds that included multiple land-use categories (Table S1), the primary land use was used to categorize the pond. The municipalities had varying GIS capabilities, and it was beyond the scope of this project to ground-truth their land-use assessments and drainage areas. Google Maps was used for reconnaissance viewing of candidate stormwater ponds before site visits. Candidate stormwater ponds were evaluated in the field to select five ponds in each major land-use classification. Two ponds were rejected during field sampling due to (1) inadequate depth of soft sediments and (2) too much cattail detritus in the sediments, which did not allow cohesive sediment cores to be collected. These ponds were replaced with other acceptable ponds in the same major land-use category.
The final suite of 15 stormwater ponds in the Twin Cities, MN, metropolitan area provided a representative subset of statewide MS4 ponds for this study (Fig. 1; Table S1). The ponds included both natural and constructed ponds and were between 0.12 and 5.3 ha in size.

Fig. 1 Location of stormwater pond stations (SWP09-) in the Twin Cities, MN, metropolitan area

Kayaks were used to sample three stations within each pond during October 2009, generally in an equidistant transect across the center of each pond. Field replicate samples were collected from two ponds to assess the overall variability or precision of both the field and analytical procedures. GPS coordinates were taken at each station with a Garmin global positioning system unit (Garmin International, Inc., Olathe, KS). A modified drop corer was used to collect multiple samples of the upper 15 cm of sediment. The sediment samples were extruded into a 1.9 L-precleaned Pyrex mixing cup until approximately 1.3 L had been collected. The cup was covered with aluminum foil and stored in a bucket until transported to shore for sample processing. Excess water was decanted, and the sample was then homogenized with a large, precleaned metal spoon.

Subsamples of sediment for carcinogenic PAHs (cPAHs; for calculation of a human health benchmark) were transferred to pesticide-grade, precleaned, amber-colored glass jars with Teflon-lined lids. Other subsamples were distributed in precleaned jars for determination of TOC, black carbon, particle size, and the other chemical parameters listed on p. S2 of the electronic supplementary material (ESM). The sample jars were labeled and stored on ice in coolers. Equal aliquots of sediment from each of the three discrete stations were composited, homogenized, and divided into precleaned, amber-colored glass jars for analysis of a suite of 34 PAHs (∑PAH34) to be used for environmental forensic work, land-use comparisons, and calculation of ecological benchmarks. The remaining composite sample was distributed to other precleaned sample jars for determination of cPAHs, TOC, black carbon, particle size, and the parameters listed on p. S2 of the ESM. Samples were identified with a “SWP09-” prefix followed by the pond number (01 through 15) and then either the discrete sample site number (1, 2, or 3) or “C” for composite. For ease of reporting, the “SWP09-” prefix will not be used further in this article.

Samples were transported to the MPCA, and the discrete and composite samples for cPAHs and TOC were refrigerated (3 °C) before MPCA transport to Pace Analytical Services, Inc. in Minneapolis, MN. Similarly, the discrete and composite samples for particle size were refrigerated (3 °C) before MPCA transport to the Chemical Engineering Department at the University of Minnesota, Duluth, MN. The discrete and composite samples for black carbon were frozen (−18 °C) before shipment by overnight courier to TestAmerica in South Burlington, VT. The composite samples for ∑PAH34 analyses were frozen (−18 °C) before shipment by overnight courier to Battelle: Duxbury Operations in Duxbury, MA; samples at Battelle were stored in an access-limited walk-in-freezer at −20 °C until sample preparation could begin. Sample chain-of-custody was recorded.

Analytical Extraction and Instrumental Analysis

Sediment samples for ∑PAH34 were prepared for analysis according to a modification of USEPA SW-846 Method 3550. Approximately 20 g of well-mixed sediment was spiked with the appropriate amount of the following PAH surrogate internal standards: d8-naphthalene, d10-acenaphthene, d10-phenanthrene, and d12-benzo[a]pyrene. Each sample was serially extracted three times with 100 mL of dichloromethane using orbital shaker table techniques. The combined extracts were dried over anhydrous sodium sulfate and concentrated by Kuderna–Danish and nitrogen evaporation techniques. The sample extracts were processed through an alumina gravity column to isolate the PAH compounds. The weights of the resulting extracts were determined gravimetrically. Activated copper was added to the sample extracts to remove any sulfur that may have been present in the samples. The preinjection volume and/or extract split were adjusted to 3 mg/mL. The extracts were spiked with internal standard and submitted for PAH analysis. The samples were analyzed within 40 days of extraction in 1 analytical batch. A set of 18 parent (unsubstituted) and 16 alkylated PAHs (Table 1) were measured using an Agilent 6890 gas chromatograph (GC) with an Agilent 5973 mass selective (MS) detector in the selected ion mode (SIM) using a modification of USEPA SW-846 Method 8270. An initial calibration of target analytes was performed before analysis to demonstrate the linear range of analysis. Calibration verification was performed at the beginning and end of each 24-h period in which samples were analyzed with a minimum five level curve for all compounds. Concentrations of PAHs were calculated versus internal standards. Target PAHs were quantified using the average response factors generated from the initial calibration. The alkyl homologue PAH series were assigned the response factor of the parent PAH. Percent moisture was determined so the results could be reported in dry weight.

Table 1 PAH compounds and source categories included in different groups of PAHs

Sediment samples for cPAHs were processed and extracted according to USEPA SW-846 Method 3550. Three surrogate compounds were added before extraction: nitrobenzene-d5, 2-fluorobiphenyl, and terphenyl-d14. A set of 24 cPAHs (Table 1) was determined by USEPA SW-846 Method 8270 by GC/MS-SIM. Interferences from the sample extracts were removed by dilution instead of using column chromatography. Percent moisture was determined using ASTM D2216 to report the results in dry weight.

TOC was analyzed using USEPA SW846 9060A. Before analysis, the samples were dried and pulverized with a mortar and pestle. Sample weights were determined that would achieve a result between the reporting limit and the linear range of the Tekmar/Dohrmann Apollo 9000 TOC analyzer with a non-dispersive infrared (NDIR) detector; the maximum sample weight was 0.04 g. Samples were run in duplicate.

The preparation procedures for black carbon were based on Gustafsson et al. (1996, 2001) where samples were dried, ground, and passed through a sieve. The analytical approach used the Lloyd Kahn method (Kahn 1988). Black carbon was determined using a Carlo Erba Elemental Analyzer, either model NA 1500 or model EA 1108. The samples were run in duplicate, and the results were averaged.

Eleven particle size classes of clay, silt, and sand (Jackson 1979) were determined by a laser diffraction method. The sample preparation involved equilibrating the samples at room temperature for at least 0.5 h followed by rotation on a roller mixer for at least 0.5 h to homogenize the sediments. The samples were not pretreated to remove organic matter. The startup sequence for the Cilas 1180L laser diffraction analyzer in the Fraunhofer mode included a 1-h warm-up period followed by rinsing and flushing three times with 1-μm filtered tap water. This sequence was used before each new standard [standard reference material (SRM) 1003a] and sample. Sediment samples were homogenized again with a metal spatula before analysis, and the subsamples (usually 2 × 100 μL) were taken with a micropipettor (Laboratory Industries). Blanks, standards, and samples were run with a dispersant of 0.2 % (w/w) sodium polyphosphate prepared in 1-μm filtered tap water. Blanks were run before each standard and sample. Each standard and one sample from each batch of samples were analyzed twice after introduction into the analyzer to provide an instrumental duplicate. The median size and specific surface area of “apparent” (i.e., organic and inorganic) particles in each sample were also determined. After instrumental analysis, another subsample was taken to determine moisture content.

Analytical Quality Assurance/Quality Control

The ∑PAH34 analyses followed the USEPA 8270M data quality objectives: blank less than five times the method detection limit (MDL), surrogate recoveries between 40 and 120 %, laboratory control sample (LCS) recovery between 40 and 120 %, and matrix spike/matrix spike duplicate (MS/MSD) precision with a relative percent difference (RPD) <30 %. In particular, the mean (SD) surrogate recoveries in the samples and field replicates were as follows: 64.8 % (9 %) for naphthalene-d8, 75.4 % (8.2 %) for acenaphthene-d10, 78.1 % (5.9 %) for phenanthrene-d10, and 85.5 % (6.0 %) for benzo[a]pyrene-d12. A procedural blank was analyzed to ensure the sample extraction and analysis methods were free of contamination, and no exceedances were noted; the surrogate recovery of d8-naphthalene was low (30 %) in the procedural blank, but all other surrogate recoveries were within acceptable quality-control (QC) limits. A control oil sample (i.e., North Slope Crude), included to assess accuracy, had an acceptable percent difference <30 %. Although no primary exceedances were noted for the MS/MSD, the recoveries of 32 spiked analytes were outside the measurement quality objective (MQO) because the concentrations of these target compounds in the sample (05-C) were so high that the spike level was less than five times the ambient concentration of sample 05-C. The accuracy for all compounds was shown in the LCS, and the RPD for nine compounds outside the MQO met the contingency criteria. The percent relative SD (% RSD) between the response factors for the individual target analytes was ≤30 %. The mean RSD of all target analytes was <15 %, indicating that the GC/MS-SIM calibrations were acceptable. The results for a SRM 1944 sediment sample were acceptable within the certified ranges, which also included the laboratory’s internal precision criterion of 30 %.
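The acceptance checks described above can be sketched in a few lines. This is a minimal illustration only, not the laboratory's procedure: the function names are hypothetical, the surrogate recoveries are the means reported in the text, and the MS/MSD pair is an invented example.

```python
def relative_percent_difference(a, b):
    """RPD (%) between two measurements: |a - b| / mean(a, b) * 100."""
    mean = (a + b) / 2.0
    if mean == 0:
        return 0.0
    return abs(a - b) / mean * 100.0


def recovery_in_limits(recovery_pct, low=40.0, high=120.0):
    """True if a surrogate or LCS recovery (%) lies within the stated limits."""
    return low <= recovery_pct <= high


# Mean surrogate recoveries reported in the text (percent)
surrogates = {
    "naphthalene-d8": 64.8,
    "acenaphthene-d10": 75.4,
    "phenanthrene-d10": 78.1,
    "benzo[a]pyrene-d12": 85.5,
}
surrogate_ok = {name: recovery_in_limits(r) for name, r in surrogates.items()}

# The procedural-blank naphthalene-d8 recovery (30 %) fails the same check
blank_ok = recovery_in_limits(30.0)

# Hypothetical MS/MSD results, for illustration only
rpd_example = relative_percent_difference(110.0, 90.0)
rpd_ok = rpd_example < 30.0
```

The same RPD function applies wherever paired results are compared, such as analytical duplicates or field replicates.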

The cPAH results conformed to current National Environmental Laboratory Accreditation Conference (NELAC) standards (NELAC 2003). All method blank and laboratory control spikes passed surrogate criteria in all batches. Increased reporting limits resulted for three sample extracts (09-2, 09-3, and 09-C) that could not be concentrated to the routine final volume; in turn, the surrogate recovery could not be evaluated against control limits for these samples due to sample dilution. Analyte recovery in the LCS was lower than QC limits in the last half of the samples (06-1 through 15-C), so the results may be biased low. All laboratory control spikes were within QC limits except for 3-methyl-cholanthrene and dibenzo[a,l]pyrene. All internal standards were within QC limits. All analytical duplicate samples were within the method acceptance criteria. All analytes were lower than the reporting limits for the method blank. MS/MSD samples were run on three samples, which were diluted 1:10 like the source samples. A number of analytes in the MS/MSDs were outside the laboratory control limits due to hydrocarbon interferences. The sample results were deemed acceptable with additional caveats noted in the ESM (pp. S2–S3).

The mean RPD between the average of the three discrete samples and corresponding composite sample for detected cPAHs was <15 % (p. S3, Tables S2 and S3). An assessment of the precision between pond samples and matching field replicates for cPAHs and ∑PAH34 (Table S4) is described further in the ESM (pp. S3–S4).

Seven sets of QC samples were run with the TOC analyses, and TOC was not detected in any of the method blanks. The LCS samples ranged between 99 and 110 %, which were within the QC limits of 80–120 %. Five of the seven sets of MS/MSD samples had acceptable percent recovery limits between 50 and 150 % with the maximum RPD <30 %. For two sets of MS/MSD samples, the MSD recovery was outside the laboratory control limit for sample 07-3, and the maximum RPD exceeded 30 % for samples 03-3 and 07-3. Analytical duplicates were run, and the results were averaged to give the final TOC results. For the composite sample results presented in this article, 10 samples had RPD values ≤30 % (the preferred QC result) and five samples (03-C, 05-C, 11-C, 13-C, and 14-C) had RPD values >30 %. The high RPD values may be partly a result of the small sample volume used in the TOC analysis, especially if the samples were not mixed well.

Four sets of QC samples were run with the black carbon analyses, and black carbon was not detected in any of the method blanks. A MS/MSD was run with each set of sample runs. SRM 1944 was used as the LCS, whereby the laboratory established a true value of 9900 mg/kg black carbon based on their method development for this procedure. The LCS samples ranged between 107 and 117 %, which were within the control limits of 50 and 150 %. For the three sets of MS/MSD samples run, MS recovery ranged from 97 to 100 % (i.e., within 50–150 % of control limits). The MS/MSD RPD ranged from 1 to 6 %. For the analytical duplicates run on the composite samples reported in this article, the RPD values ranged between 0 and 18 % and were lower than the QC limit of 40 %. The analytical duplicates were averaged to give the final results.

The particle size samples were run in six analytical batches. When the precision for the median diameter of instrumental duplicates exceeded 14 %, this indicated difficulties with homogenization and dispersion of the samples. For the composite sample results provided here, this situation arose for samples 02-C and 03-C. For these samples, dispersant was added to each jar, and the samples were homogenized again with the roller mixer. These samples were reanalyzed in the last set of samples, and the precision of the instrumental duplicates was acceptable. The results of six runs of standard SRM 1003a, which contained spheres in the range of 8–58 μm, were acceptable.

To determine if a quantitative source apportionment model could be used on the PAH data set of another stormwater pond study performed in the Twin Cities, MN, metropolitan area (Polta et al. 2006), a quality assurance (QA)/QC review of the data was performed. Thirty-two sediment samples from eight of ten stormwater ponds (i.e., the Met Council data set) were acceptable as detailed in the ESM (p. S4).

Environmental Forensics

PAH source categories and PAH source ratios were used as a first step to evaluate the ∑PAH34 data. PAHs were grouped by parent and alkylated PAH categories (Boehm 2006; Table 1). Summations of low molecular-weight (LMW) PAHs consisting of 2- to 3-ring PAHs, and high molecular-weight (HMW) PAHs consisting of 4- to 6-ring PAHs were calculated (Table 1). The following diagnostic source ratios were calculated to aid in determining the preponderance of petrogenic (oil-based) or pyrogenic (combustion-based) sources of PAHs: fluoranthene/pyrene (Flu:Py), phenanthrene/anthracene (Phe:An), and Flu/(Py + Flu) (Budzinski et al. 1997; Yunker et al. 2002). The Flu:Py isomer ratios were also used to aid in identifying sources of CT-sealant in addition to the following: benzo[a]pyrene/benzo[e]pyrene (BaP:BeP) and indeno[1,2,3-cd]pyrene/benzo[g,h,i]perylene (IP:BgP) (Mahler et al. 2005). Double-ratio plots were generated in SigmaPlot 12.3 (Systat Software Inc., San Jose, CA) for IP:BgP versus Flu:Py, BaP:BeP versus Flu:Py, and IP:BgP versus BaP:BeP.
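As a minimal sketch of how the ratios named above are formed, the following function takes one sample's concentrations keyed by the abbreviations used in this article. The function names and the example concentrations are hypothetical and are not data from this study.

```python
def ratio(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def diagnostic_ratios(conc):
    """Compute the diagnostic source ratios for one sample.

    conc maps PAH abbreviations (Flu, Py, Phe, An, BaP, BeP, IP, BgP)
    to concentrations in any consistent unit.
    """
    return {
        "Flu:Py": ratio(conc["Flu"], conc["Py"]),
        "Phe:An": ratio(conc["Phe"], conc["An"]),
        "Flu/(Py+Flu)": ratio(conc["Flu"], conc["Py"] + conc["Flu"]),
        "BaP:BeP": ratio(conc["BaP"], conc["BeP"]),
        "IP:BgP": ratio(conc["IP"], conc["BgP"]),
    }


# Hypothetical concentrations (mg/kg dw), for illustration only
example = {
    "Flu": 6.0, "Py": 4.0, "Phe": 3.0, "An": 0.5,
    "BaP": 2.0, "BeP": 2.5, "IP": 1.8, "BgP": 2.0,
}
ratios = diagnostic_ratios(example)
```

Pairs of these values (for example IP:BgP against Flu:Py) are the coordinates plotted in a double-ratio plot.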

The USEPA Contaminant Mass Balance (CMB) 8.2 receptor model (Coulter 2004) was used to provide a rigorous quantitative approach to determine sources of a subset of PAHs from the ∑PAH34 and Met Council (Polta et al. 2006) data sets to a specified group of sources. The model uses an effective variance weighted least square solution to the CMB equations provided that the following model assumptions are met: (1) the composition of each source profile is consistent over the period represented by the receptor (i.e., sample) data; (2) chemical species do not react with each other or with the environment; (3) all sources that contribute significantly to the receptor have been identified, and their chemical profile is known; (4) the number of sources is less than or equal to the number of species (i.e., PAH compounds); (5) the source profiles are linearly independent of each other; and (6) measurement uncertainties are random, uncorrelated, and normally distributed (Li et al. 2003; Coulter 2004).
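To make the fitting step concrete, the sketch below implements an effective-variance weighted least-squares fit as it is commonly described for mass balance receptor models. It is not the CMB8.2 code and omits source elimination and all diagnostics. The assumed form is that each measured concentration is the sum of profile fraction times source contribution, and that the weight for each species combines the measurement variance with the profile variances scaled by the current contribution estimates. All names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def cmb_effective_variance(c, sigma_c, f, sigma_f, max_iter=20, tol=1e-6):
    """Estimate source contributions s so that c[i] ~ sum_j f[i][j] * s[j].

    c, sigma_c: measured species concentrations and their uncertainties
    f, sigma_f: source profile fractions f[i][j] and their uncertainties
    """
    n_species, n_sources = len(c), len(f[0])
    s = [0.0] * n_sources
    for _ in range(max_iter):
        # Effective variance: measurement variance plus profile variance
        # weighted by the current contribution estimates.
        v = [sigma_c[i] ** 2
             + sum((s[j] * sigma_f[i][j]) ** 2 for j in range(n_sources))
             for i in range(n_species)]
        # Weighted normal equations (F^T V^-1 F) s = F^T V^-1 c
        a = [[sum(f[i][j] * f[i][k] / v[i] for i in range(n_species))
              for k in range(n_sources)] for j in range(n_sources)]
        b = [sum(f[i][j] * c[i] / v[i] for i in range(n_species))
             for j in range(n_sources)]
        s_new = solve_linear(a, b)
        change = max(abs(s_new[j] - s[j]) for j in range(n_sources))
        s = s_new
        if change < tol:
            break
    return s


# Hypothetical two-source, four-species example built from known contributions
profiles = [[0.4, 0.1], [0.3, 0.2], [0.2, 0.3], [0.1, 0.4]]
measured = [4.5, 4.0, 3.5, 3.0]            # equals profiles applied to (10, 5)
sigma_measured = [0.1 * x for x in measured]
sigma_profiles = [[0.4 * x for x in row] for row in profiles]  # 40 % uncertainty
estimates = cmb_effective_variance(measured, sigma_measured, profiles, sigma_profiles)
```

Because the example data were generated without noise, the fit recovers the contributions used to build them; real sediment data would leave residuals that the performance measures are designed to evaluate.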

Although the CMB8.2 model was developed by the USEPA for air data, it has been successfully adapted to the source apportionment of a subset of PAHs in sediments. In particular, procedures developed by two research groups (Li et al. 2003; Van Metre and Mahler 2010) were followed for this study, including the adaptation of several air source profiles for sediments. The model included ≤12 parent PAHs (∑PAHCMB): Phe, An, Flu, Py, benzo[a]anthracene (BaA), chrysene (Chy), benzo[b]fluoranthene (BbF), benzo[k]fluoranthene (BkF), BeP, BaP, IP, and BgP (Table 1). A group of PAH source profiles (Table 2) were considered for inclusion in the model, including the general categories of CT-sealant pavement dust and particulate runoff, asphalt sealant pavement dust, vehicle-related sources, coal combustion, fuel oil combustion, fireplace combustion of wood products, and used motor oil.

Table 2 PAH source categories and source profiles considered in the CMB8.2 model

Each run of the CMB8.2 model attempted to fit the source data to the ambient sediment data for ≤20 iterations using source elimination. The source profile uncertainty was set at 40 % (Li et al. 2003; Van Metre and Mahler 2010), and the minimum source projection was set at the default value of 0.95. The uncertainty values of the ambient PAH data from this stormwater pond study were based on the RPDs of the sample and field replicate for 12-C and 12-C-R (which ranged from 7.7 % for Phe to 20.1 % for BeP; a more generic uncertainty value of 20 % was used for the Met Council data set). Output files were generated as comma-separated value files that were imported into Microsoft Excel 2010 (Microsoft Corporation, Redmond, WA) for further analysis.

Statistical Analyses

The analytical data were evaluated with a variety of statistical techniques. Summary statistics were run in either Microsoft Excel 2010 or SigmaPlot 12.3 and 12.5. Shapiro–Wilk normality tests, Pearson product moment correlations, Spearman rank order correlations, equal variance tests, pairwise Student's t-tests, Mann–Whitney rank-sum tests, Holm-Sidak method tests, one-way analysis of variance (ANOVA), and the Kruskal–Wallis ANOVA on ranks were performed in either SigmaPlot 12.3 or 12.5. Kaplan–Meier statistics were used for censored (i.e., nondetect) cPAH data (Helsel 2010).

Several statistical procedures were used to prepare the ambient ∑PAHCMB data and for evaluating source profile data for inclusion in the CMB8.2 model. PAH proportional values (i.e., individual PAH concentrations divided by ∑PAHCMB) were calculated and assessed for normality using the Shapiro–Wilk test. A Spearman rank order correlation was run when nonnormal data were present. The linear dependence between PAH proportional values of normally distributed sources and ambient sediment data were assessed using Pearson product moment correlations (r) and the statistical significance (p) of the correlations. Sources with significant correlations to the ambient sediment data were considered further by giving highest priority to the sources that had the greatest number of PAHs within ±1 SD of the mean PAH proportional values for the separate stormwater pond and Met Council data sets.
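The screening of candidate source profiles can be illustrated with a short sketch: convert concentrations to proportional values, correlate a source profile with the ambient sediment profile, and count how many PAHs fall within ±1 SD of the ambient mean. The function names and all numbers are hypothetical, normality testing and significance values are omitted, and the Pearson coefficient is computed directly rather than with the statistical software used in the study.

```python
import math


def proportions(conc):
    """Divide each PAH concentration by the sum over all listed PAHs."""
    total = sum(conc.values())
    return {name: value / total for name, value in conc.items()}


def pearson_r(x, y):
    """Pearson product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_within_one_sd(source_prop, ambient_mean, ambient_sd):
    """Number of PAHs whose source proportion lies within +/-1 SD of the ambient mean."""
    return sum(
        1 for name in source_prop
        if abs(source_prop[name] - ambient_mean[name]) <= ambient_sd[name]
    )


# Hypothetical three-PAH example (a real run would use up to 12 PAHs)
sediment = proportions({"Phe": 2.0, "Flu": 5.0, "Py": 3.0})
source = proportions({"Phe": 1.0, "Flu": 2.5, "Py": 1.5})
names = sorted(sediment)
r = pearson_r([source[n] for n in names], [sediment[n] for n in names])
ambient_sd = {n: 0.05 for n in names}   # assumed SD of ambient proportions
n_within = count_within_one_sd(source, sediment, ambient_sd)
```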

The performance of each CMB8.2 model run was assessed by chi-square (χ²) and coefficient of determination (R²) values as well as by the percent mass estimated by the model (Coulter 2004). χ² is the weighted sum of squares of the differences between the calculated and measured fitting species concentrations; values less than one are preferred (indicating a very good fit to the data), whereas values between one and two are also acceptable. R² values near 1.0 indicate that the source contribution estimates explain the measured concentrations very well, but other R² values >0.8 are acceptable. Percent mass represents the percent ratio of the sum of the model-calculated source contribution estimates to the measured mass concentration, and ratios approaching 100 % are ideal; ratios between 80 and 120 % are also acceptable. In addition, the T-statistic that appears with the source contribution estimates provides additional information to assess model performance because it can be indicative of collinearity among the source profiles when it is <2.0 (Coulter 2004).
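The acceptance ranges stated above can be collected into a simple check. This is a sketch with hypothetical function names; it only applies the thresholds given in this section and does not reproduce how CMB8.2 computes the statistics themselves.

```python
def percent_mass(source_contributions, measured_total):
    """Percent ratio of summed source contribution estimates to measured mass."""
    return 100.0 * sum(source_contributions) / measured_total


def assess_fit(chi2, r2, pct_mass, t_stats):
    """Label each performance measure using the ranges stated in the text.

    t_stats maps source names to the T-statistic of their contribution estimate.
    """
    if chi2 < 1.0:
        chi2_label = "preferred"
    elif chi2 <= 2.0:
        chi2_label = "acceptable"
    else:
        chi2_label = "outside range"
    return {
        "chi2": chi2_label,
        "r2": "acceptable" if r2 > 0.8 else "outside range",
        "pct_mass": "acceptable" if 80.0 <= pct_mass <= 120.0 else "outside range",
        # T-statistics below 2.0 can indicate collinearity among profiles
        "possible_collinearity": sorted(n for n, t in t_stats.items() if t < 2.0),
    }


# Hypothetical model output, for illustration only
pm = percent_mass([10.0, 5.0], 16.0)
result = assess_fit(0.7, 0.93, pm, {"source A": 4.1, "source B": 1.6})
```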

Calculated Benchmarks

The USEPA’s PAH Equilibrium Partitioning Sediment Benchmark (ESB) Toxic Units model, for evaluating ecological risk to aquatic invertebrates, was calculated for ∑PAH34 using final chronic values (USEPA 2003; Burgess 2009) and TOC data (p. S4 of the ESM). Concentrations reported between the MDL and the reporting limit for a small number of LMW PAHs were considered estimated data. These estimated data comprised a small percentage of ∑PAH34 (mean = 0.79 %, n = 15 ponds) and were used as reported in the calculations.
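The toxic units calculation can be sketched as follows, assuming the usual form of the approach: each dry-weight concentration is normalized to organic carbon, divided by that compound's final chronic value (expressed per unit organic carbon), and the quotients are summed. The function name, the concentrations, and the final chronic values below are placeholders for illustration; actual values must be taken from USEPA (2003).

```python
def esb_toxic_units(conc_dw, f_oc, fcv_oc):
    """Sum of equilibrium partitioning sediment benchmark toxic units.

    conc_dw: PAH concentrations in ug/g dry weight
    f_oc:    fraction of organic carbon in the sediment (e.g., 0.05 for 5 % TOC)
    fcv_oc:  final chronic values in ug/g organic carbon (placeholders here)
    """
    total = 0.0
    for name, conc in conc_dw.items():
        conc_oc = conc / f_oc            # organic carbon-normalized concentration
        total += conc_oc / fcv_oc[name]  # toxic unit for this compound
    return total


# Placeholder inputs, not study data and not the published final chronic values
example_conc = {"Phe": 3.0, "Flu": 7.0, "Py": 3.5}
example_fcv = {"Phe": 600.0, "Flu": 700.0, "Py": 700.0}
sum_tu = esb_toxic_units(example_conc, 0.05, example_fcv)
```

In a full calculation the loop would run over all 34 PAHs, and the resulting sum would be compared with the benchmark value.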

Benzo[a]pyrene (B[a]P) equivalents are widely used to assess human health risks of cPAHs classified as probable or possible human carcinogens. B[a]P equivalents were calculated for 24 cPAHs (Table 1) by multiplying the concentration of each cPAH by its potency equivalency factor (Table S2) and summing the products (Minnesota Department of Health 2013; MPCA 2012).
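A minimal sketch of the B[a]P equivalents calculation is shown below. The function name is hypothetical, and the potency equivalency factors other than the reference value of 1 for B[a]P are illustrative placeholders rather than the Table S2 values. The handling of nondetects is not shown.

```python
def bap_equivalents(conc, pef):
    """Sum of concentration x potency equivalency factor over all cPAHs."""
    return sum(conc[name] * pef[name] for name in conc)


# Illustrative inputs only: concentrations in mg/kg dw and placeholder factors
example_conc = {"BaP": 2.0, "BaA": 1.5, "Chy": 3.0}
example_pef = {"BaP": 1.0, "BaA": 0.1, "Chy": 0.01}
bap_eq = bap_equivalents(example_conc, example_pef)  # mg/kg dw as B[a]P equivalents
```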

Results and Discussion

PAH Forensics

Generalized Source Categories

The group of 34 PAHs measured in the stormwater pond sediments yielded useful information on generalized source categories based on the number of fused benzene rings in each compound and whether the individual PAH compounds contained an alkyl group. The distribution of ∑PAH34 spanned two orders of magnitude and was dominated by HMW and parent PAHs (Fig. 2). The mean (SD) percentage of alkylated PAHs in the 15 stormwater ponds was 21.5 % (7.1 %), whereas the mean (SD) percentage of parent PAHs was 78.5 % (7.1 %; Table S5). The high percentage of parent PAHs observed in these ponds was consistent with a typical urban background of PAHs in sediments dominated with fluoranthene and pyrene isomers (Battelle Memorial Institute et al. 2003). There were no significant differences between either the median percentages of alkylated (p = 0.99) or parent (p = 0.99) PAHs between land-use classes using Kruskal–Wallis one-way ANOVA on ranks.

Fig. 2

Box plots of PAH concentration groups in composite stormwater pond sediment samples. The boxes represent the 25th percentile, median, and 75th percentile values. Whiskers (error bars) above and below the box indicate the 90th and 10th percentiles, respectively. Data points outside the error bars should not be considered as outliers because they represent real environmental data

Sample 02-C [2.5 mg/kg dry weight (dw) ∑PAH34] was unique because the highest percentages of perylene (12.0 %) and alkylated PAH fractions (43.9 %, Table S5) were observed in this pond. This native pond retained a relatively greater percentage of perylene (a mostly naturally produced diagenetic PAH; Stillman et al. 1998) after stormwater was diverted to it. This pond was also located by a major highway that contributed some petroleum-related (i.e., petrogenic) runoff as evidenced by the bell-shaped distribution of C0, C1, C2, and C3-fluorenes; this was the only pond to display this classic signature of a petrogenic distribution for this subgroup of PAHs (Boehm 2006). Perylene comprised 1.1 % to 3.6 % of ∑PAH34 in the other 14 ponds. If perylene only arose from natural processes, its distribution would be independent of other parent PAHs in the pond sediments. However, the correlation between perylene and total parent PAHs (excluding perylene) for all 15 ponds was strong and highly significant (Pearson’s r = 0.990, p = 2.4 × 10−12). Products such as coal tar pitch (used to make CT-sealants) contain perylene (ATSDR 2002), and it has also been measured in other combustion products as summarized by Lima et al. (2005). Thus, there was an anthropogenic source of perylene in most of the study ponds.

As part of a weight-of-evidence approach, simple PAH ratios can provide useful screening information for identifying petrogenic and pyrogenic sources of PAHs. The LMW/HMW PAH ratios for all 15 ponds were <0.60, and values <1.0 are indicative of combustion-derived (pyrogenic) sources dominating the PAH profile in each pond (Wang et al. 2006). In addition, the Flu/Py ratios were all >1.0, and the Flu/(Py + Flu) ratios were all >0.4, which were also indicative of pyrogenic sources (Budzinski et al. 1997; Yunker et al. 2002). The Phe:An ratios were <10 for 12 ponds, which were consistent with pyrogenic sources (Budzinski et al. 1997). Three ponds had Phe:An ratios between 10 and 11.8, and values between 10 and 15 are considered indeterminate relative to source (Gao et al. 2007). Although the simple PAH ratios provided a broad assessment of the importance of pyrogenic sources of PAHs to metropolitan area stormwater ponds, these ratios do not identify which types of pyrogenic sources were important. PAH double-ratio plots were used to further assess whether CT-sealants comprised one of these pyrogenic sources.
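The screening thresholds quoted above can be collected into one helper. This is a minimal sketch: the function name and the output labels are hypothetical, the argument names follow the abbreviations used in the text, and ratios outside the ranges the text discusses are simply reported as not classified.

```python
def classify_pah_ratios(lmw, hmw, flu, py, phe, an):
    """Apply the screening thresholds from the text to one sample.

    All inputs are concentrations in the same unit. Returns a dict of
    ratio name -> label. Only the ranges discussed in the text are labeled.
    """
    def label(is_pyrogenic):
        return "pyrogenic" if is_pyrogenic else "not classified"

    phe_an = phe / an
    if phe_an < 10.0:
        phe_an_label = "pyrogenic"
    elif phe_an <= 15.0:
        phe_an_label = "indeterminate"
    else:
        phe_an_label = "not classified"

    return {
        "LMW/HMW": label(lmw / hmw < 1.0),
        "Flu/Py": label(flu / py > 1.0),
        "Flu/(Py+Flu)": label(flu / (py + flu) > 0.4),
        "Phe:An": phe_an_label,
    }
```

Such ratios are only a screening step; as the text notes, they cannot distinguish among different pyrogenic sources.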

PAH Double-Ratio Plots

The PAH double-ratio plots for IP:BgP versus Flu:Py, BaP:BeP versus Flu:Py, and IP:BgP versus BaP:BeP were compared with ranges of these isomer ratios for particles washed from coal tar-sealed parking lots in Austin, TX (Fig. 3; Mahler et al. 2005). Although many of the ponds corresponded to a source profile for CT-sealants, scatter in the double-ratio plots was indicative of multiple sources of PAHs. Only the residential pond sample (02-C) and industrial pond sample (03-C) were outside the range of CT-sealants in all three graphs. Some ponds that were within the range of CT-sealants in one or two plots were outside the range in another double ratio plot. Although these PAH double-ratio plots lend support to CT-sealants as a source of PAHs to stormwater pond sediments in the Twin Cities, MN, these plots do not provide a complete understanding of all sources. These types of environmental forensic analyses are semiquantitative, do not account for varying rates of weathering of PAH compounds, and work best at sites dominated by a single source (Battelle Memorial Institute et al. 2003). Application of PAH ratios for air samples has also shown that these ratios can produce contradictory source results (Katsoyiannis et al. 2011). Therefore, quantitative modeling is needed to provide a rigorous assessment of PAH sources.

Fig. 3

Comparison of double-ratio plots of PAH compounds from composite samples of this study (n = 15) with ellipsoid ranges of particles washed from parking lots with CT-based sealants in Austin, TX (Mahler et al. 2005)

CMB8.2 Model

The mean PAH proportional values from this study were significantly correlated (r = 0.976, p = 5.5 × 10−8) with those from a nationwide study of 40 lakes (Van Metre and Mahler 2010; Fig. S1). Thus, the use of several source profiles evaluated in the nationwide study was appropriate for this study. The PAH proportional values of the mean sediment data and 14 source profiles passed the Shapiro–Wilk normality test, whereas eight sources failed (Table S6). A Spearman rank order correlation table was generated for the PAH proportional values of the mean sediment samples and all PAH source profiles (Table S7). Because none of the nonnormally distributed PAH sources were significantly correlated with this data set, a Pearson product moment correlation table was generated for only the normally distributed sediment and source data (Table S8). Seven sources (including all sealant sources) were highly significantly correlated (p < 0.001), and four other sources encompassing vehicle emissions and wood combustion were significantly correlated (0.001 < p < 0.05) to the 15 stormwater pond mean data set (Figs. 4 and S2, Table S8). Plots of the PAH proportional values for the other sources that were not significantly correlated with this stormwater pond data set are provided in the ESM (Fig. S2).
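The two correlation coefficients used in this screening can be sketched with the Python standard library alone. This is an illustration of the statistics, not the software used in the study; the function names are hypothetical, the profiles are placeholder numbers, and p values (which require a t distribution) are omitted.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank across the tied block
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman_rho(x, y):
    """Spearman rank order correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Placeholder proportional profiles for illustration only.
pond_profile = [0.05, 0.10, 0.20, 0.40]
source_profile = [0.06, 0.09, 0.22, 0.35]
```

Spearman's rho depends only on rank order, which is why it is the safer choice for profiles that fail a normality test.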

Fig. 4

Comparison of PAH proportional values (i.e., mean individual PAH concentration normalized to ∑PAHCMB concentration) between sources used in the EPA’s CMB8.2 model (open circles) and the mean profile for 15 stormwater pond sediments (closed circles; uncertainty bars indicate one SD). PAHs range from LMW to HMW compounds along the x-axis. Pearson’s r and p values are given in parentheses; p values <0.05 are significant

To prioritize the source profiles further, the PAH proportional values from this short-list of significant sources were compared with ±1 SD of the range of mean PAH proportional values from the stormwater pond data set (Table S9). The number of PAHs that overlapped within these ranges was tallied for the 12 PAH compounds used in the CMB8.2 model. The CT-sealant pavement particulate washoff (Austin, TX) source had the highest number of overlapping PAHs (n = 7) followed by the CT-sealant dust profile from Minneapolis, MN (n = 6 PAHs). The other CT-sealant dust profiles had three to five overlapping PAHs. Because CT-sealant weathers over time in the environment, variability in the PAH source signature is to be expected because LMW PAHs volatilize or degrade and HMW PAHs are enriched (Van Metre and Mahler 2010). Asphalt-based sealant dust had two overlapping PAH compounds (Table S9) and was also highly significantly correlated (p < 0.001) with all of the CT-sealant sources (Table S8). The asphalt-based sealant dust could have been cross-contaminated with CT-sealants through the application process (commercial sealcoat operators use the same equipment for both types of sealant), by tracking of CT-sealant particles on tires of vehicles also driven on CT-sealed surfaces, and/or by deposition of CT-sealant dust or particles on asphalt-based sealant surfaces from atmospheric deposition or stormwater runoff, respectively. Therefore, asphalt-based sealant dust was excluded from the model because it appeared to be contaminated with CT-sealant dust. The synthetic log combustion source had three overlapping PAHs, but these logs contain petroleum waxes and wood particles and were highly significantly correlated (p < 0.001) with gasoline vehicle particulate emissions and pine wood soot particles #2 (Table S8). 
Although synthetic logs were included in some model runs, this source was excluded from the final model run because the mix of sources in this product was already represented by other significant sources. Traffic tunnel air contained three overlapping PAHs that were distinct from the two PAHs that overlapped for gasoline vehicle particulate emissions. Comparing this short-list of sources to a tighter range of ±0.5 SD of the stormwater pond PAH proportional values still yielded a number of overlapping PAHs, especially for CT-sealant pavement particulate washoff from Austin, TX (n = 6 overlapping PAHs; Table S10).
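The overlap tally used to prioritize source profiles can be sketched as a count of PAHs whose source proportional value falls within ±k SD of the pond mean. The function name and all numbers below are placeholders for illustration; the actual values are in Tables S9 and S10.

```python
def count_overlaps(source, pond_mean, pond_sd, k=1.0):
    """Count PAHs whose source value lies within pond mean +/- k * SD."""
    return sum(
        1
        for pah, value in source.items()
        if abs(value - pond_mean[pah]) <= k * pond_sd[pah]
    )


# Placeholder proportional values for three PAHs (illustration only).
pond_mean = {"Phe": 0.10, "Flu": 0.20, "Py": 0.15}
pond_sd = {"Phe": 0.02, "Flu": 0.04, "Py": 0.03}
source = {"Phe": 0.105, "Flu": 0.25, "Py": 0.17}

overlaps_1sd = count_overlaps(source, pond_mean, pond_sd, k=1.0)
overlaps_half_sd = count_overlaps(source, pond_mean, pond_sd, k=0.5)
```

In this placeholder case two PAHs overlap at ±1 SD and one at ±0.5 SD, which mirrors how tightening the range reduces the tally.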

Eighty-eight different runs of the CMB8.2 model were made using different combinations of source profiles (including ones without significant correlations) and PAH compounds in the subset of ∑PAHCMB. The best model results (i.e., no samples were excluded from the CMB8.2 model, and model performance was acceptable for χ2, R2, and % mass) were obtained using all 12 model PAHs and a short-list of source profiles that were either significantly correlated (0.001 < p < 0.05) or highly significantly correlated (p < 0.001) with the sediment data (run A, Table 3). The mean RPD between the measured and modeled ∑PAHCMB concentrations was 0.99 %, indicating that the model worked very well in predicting measured concentrations of PAHs (Table 3).
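The relative percent difference (RPD) between a measured and a modeled concentration can be sketched as below. This assumes the conventional definition (absolute difference divided by the mean of the two values, times 100); the formula is not restated in this section, so treat it as an assumption, and the function name is hypothetical.

```python
def relative_percent_difference(measured, modeled):
    """Conventional RPD: |measured - modeled| / mean(measured, modeled) * 100."""
    mean = (measured + modeled) / 2.0
    if mean == 0:
        return 0.0  # both values are zero, so there is no difference to report
    return abs(measured - modeled) / mean * 100.0
```

For example, a measured value of 100 and a modeled value of 98 (placeholder numbers) give an RPD of about 2.0 %.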

Table 3 Summary of two CMB8.2 model runs from the ∑PAH34 data set (n = 15 composite samples, 12 fitting PAHs)

Two CT sources were included in run A because CT-sealant pavement particulate washoff (Austin, TX) provided a better fit to the mean PAH proportional sample data for Flu, BaA, BbF, BeP, BaP, IP, and BgP, whereas CT-sealant dust (Minneapolis, MN) provided a better fit for Phe, An, and Chy as well as an overlapping fit for BaA, BbF, and BaP (Fig. 4; Tables S9 and S10). The model used a mixture of these CT-sealant sources for seven ponds in run A (i.e., 01-C, 05-C, 07-C, 09-C, 12-C, 14-C, and 15-C; Table S11). In addition, pavement particulate washoff (Austin, TX) was the only CT-sealant source used by the model for the other eight ponds (Table S11). If CT-sealant dust (Minneapolis, MN) was excluded from the model, then the apportionment to this source would shift to other sources that had highly significant Pearson’s r values with it [i.e., CT-sealant pavement particulate washoff (Austin, TX) and pine wood soot particles #2; Table S8]. Rather than use these CT-sealant dust sources separately in model runs and average the results as performed by Van Metre and Mahler (2010), it was preferable to include both sources in the model runs because these CT-sealant sources were not weighted equally (Table S11). Using these two CT-sealant sources also bracketed some of the range of environmental weathering of CT-sealant and took into consideration processes relevant to the transport of CT-sealant particles to stormwater ponds (i.e., dust that could potentially be washed into stormwater ponds and measured runoff of particles from CT-sealed parking lots).

The CMB8.2 model calculated total ∑PAHCMB concentrations based on the sum of the modeled ∑PAHCMB concentrations apportioned to each source. To compare the allocation of sources between ponds, the percentage of each source comprising the total calculated ∑PAHCMB was determined. Model run A resulted in mean percentage values dominated by 55.2 % CT-sealant particulate washoff (Austin, TX) with an added contribution of 11.9 % CT-sealant dust (Minneapolis, MN) for a combined mean CT-sealant source of 67.1 % followed by traffic tunnel air (29.5 %) and pine wood soot particles #2 (3.4 %; Table 3). The T-statistics from run A were >2 (i.e., no collinearity) for 11 ponds, including eight ponds for CT-sealant pavement particulate washoff (Austin, TX) and five ponds for traffic tunnel air, two of which overlapped with this dominant CT-sealant source. The model did not allocate any ∑PAHCMB concentrations to gasoline vehicle particulate emissions in run A, possibly because this gasoline source had a significant Pearson’s r with traffic tunnel air (r = 0.669, p = 0.017; Table S8), and traffic tunnel air had more PAHs within ±0.5 SD of the stormwater pond mean PAH proportional values than the gasoline source (Table S10).

Before the emergence of CT-sealants as an important source of PAHs to urban sediments east of the Rocky Mountains (Van Metre and Mahler 2010), previous environmental forensic work indicated that atmospheric combustion sources (e.g., vehicle-related emissions, coal combustion, coke oven emissions) were a dominant source of PAHs to urban waterways in the Great Lakes area and east coast of the United States (Dickhut et al. 2000; Li et al. 2003; Bzdusek et al. 2004). The CMB8.2 model was also run excluding the CT-sealant sources (model run B) to test if the inclusion of other source parameters used in run A would yield better or worse model performance results. Model run B had acceptable model performance for all samples, and the mean source apportionment of PAHs comprised traffic tunnel air (53.9 %), gasoline vehicle particulate emissions (24.4 %), and pine wood soot particles #2 (21.8 %; Table 3). The mean RPD between the measured and modeled ∑PAHCMB concentrations was acceptable at 4.1 %. The model results had T-statistics >2 for traffic tunnel air in all 15 stormwater ponds, pine wood soot particles #2 for pond sample 07-C, and gasoline vehicle particulate emissions in pond sample 11-C. Of particular note is the high percentage of pine wood soot particles #2 that comprised some of the composite sample results (Table S12). For example, 79.2 mg/kg of the 181.7 mg/kg dw ∑PAHCMB calculated by the model (i.e., 43.6 %) for pond sample 05-C was attributed to a pine wood source. This is a highly unlikely result because major boreal forest fires that occurred in northern Alberta, Canada, during 1998 resulted in total parent PAH concentrations in the low μg/kg range in nearby aquatic sediments (Gabos et al. 2001). In addition, perylene accounted for 50–60 % of parent PAHs in the northern Alberta fire study (Gabos et al. 2001), whereas it is a minor component of parent PAHs in 14 of these stormwater pond sediments.
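The percentage attributed to a source, as in the 05-C example above, is that source's modeled concentration divided by the total modeled ∑PAHCMB. A minimal sketch follows; the function name is hypothetical, and the remaining run B sources for 05-C are lumped into a single entry purely for illustration because their individual values are not given in the text.

```python
def source_percentages(contributions):
    """Convert modeled source contributions (mg/kg dw) to percentages of the total."""
    total = sum(contributions.values())
    return {name: 100.0 * value / total for name, value in contributions.items()}


# Pond 05-C, run B: 79.2 mg/kg of the 181.7 mg/kg dw modeled total (from the text).
pond_05c_run_b = {
    "pine wood soot particles #2": 79.2,
    "all other sources (lumped for illustration)": 181.7 - 79.2,
}
shares = source_percentages(pond_05c_run_b)
pine_share = round(shares["pine wood soot particles #2"], 1)  # 43.6
```

Expressing each pond's results as percentages is what allows source allocations to be compared across ponds whose total concentrations differ by two orders of magnitude.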

Pairwise statistical tests for model performance parameters and sources were performed between model runs A and B using either Student’s t-test or the Mann–Whitney rank-sum test (Table 4). The model performance of run B was significantly less robust than that of run A because the median values for R2 and percent mass were significantly lower and the median χ2 and RPD values were significantly greater in run B (Table 4). The mean percentages of PAHs attributed to gasoline vehicle particulate emissions, traffic tunnel air, and pine wood soot particles #2 were all significantly greater in run B (Table 4). When the CT-sealant dust/runoff particulate sources were excluded (run B), model performance was significantly poorer, and the source allocation to transportation-related sources and pine wood soot particles #2 increased. Thus, it is important to include CT-sealant sources in source apportionment models of urban sediments where these products are used in the watershed. These dust and runoff particulate sources represent a more weathered, environmentally relevant source of PAHs than commercial CT-sealant products or CT-sealcoat scrapings from freshly applied product (Van Metre and Mahler 2010).

Table 4 Pairwise statistical test results between CMB8.2 model runs A (including CT-sealant sources) and B (excluding CT-sealant sources)

Early application of the CMB8.2 model to sediments from Lake Calumet, Chicago, IL, showed major PAH contributions from coke oven emissions and traffic sources (Li et al. 2003); a re-examination of their model would be interesting to now include CT-sealant sources because these products are widely used in the Chicago, IL, area (Van Metre and Mahler 2010). Van Metre and Mahler (2010) determined that CT-sealant dust comprised 45.8 and 70.5 %, respectively, of the modeled ∑PAHCMB concentration from lake sediments collected from two Chicago suburbs. Updating early applications of the CMB8.2 model that were performed before CT-sealants emerged as an important source of PAHs would also satisfy a model requirement that all sources that contribute significantly to the receptor have been identified and their chemical profile is known.

Although the CMB8.2 model provided a rigorous source apportionment of ∑PAHCMB in stormwater ponds from the Twin Cities, MN, metropolitan area, this model has several important limitations. These limitations include the following: (1) the model results are sensitive to uncertainty in the input data, which is often not well defined (Van Metre and Mahler 2010); this study used an uncertainty of 40 % for all source profiles as recommended by Li et al. (2003) and followed by Van Metre and Mahler (2010); (2) the source profiles are limited by the number of PAHs measured, and this study was restricted to a common suite of 12 parent PAHs; a broader suite of PAHs would allow more detailed and unique source profiles to be developed that could decrease collinearity among some source profiles; (3) source profiles were obtained from multiple literature sources, which may have varying data quality objectives and adherence to QA/QC protocols; the CT-sealant dust and particulate washoff source profiles from the United States Geological Survey (Mahler et al. 2004, 2010; Van Metre et al. 2008, 2012) were of high quality and had the best documented QA/QC procedures of the source profiles used in this study; (4) no source profiles were available for the combustion of biofuels, which may be more important in the future as these fuels increase in use in Minnesota and elsewhere in the United States; Minnesota was the first state in the United States to require the use of biodiesel as of 2009 (http://www.mda.state.mn.us/renewable/biodiesel.aspx); and (5) source results can be misinterpreted if important sources (such as CT-sealant dust for urban areas where these products are used) are excluded from the model. 
Although models can always be refined and improved, and new source profiles can be assessed for their applicability as they become available, the excellent statistical results of model run A provided credence for the strong performance of the CMB8.2 model in estimating sources of PAHs to the study pond sediments.

CMB8.2 Model Concordance

The results of model run A were consistent with the use of the CMB8.2 model on other local, regional, and national sediment PAH data sets that included weathered CT-sealant source profiles in the model (Table 5). The model results for the other data sets listed in Table 5 were all within ±1 SD of the study data set, and all attributed the greatest source apportionment of PAHs to CT-sealant sources. The Met Council data set of Twin Cities, MN, stormwater pond sediments (Polta et al. 2006) had a lower weighted mean percent of CT-sealant than this study (Table 5; pp. S4–S5 of the ESM and Tables S13 through S15). This difference may be due to the broader range of sediment depths (i.e., 15.2 to 45.7 cm) included in the Met Council data set, which may represent a more varied PAH source history than that of this study, which was limited to the upper 15-cm depth interval. The other model results in Table 5 were from Van Metre and Mahler (2010), who found that loading of ∑PAHCMB from CT-sealcoat dust surpassed urbanization or population density for explaining increased concentrations of PAHs in urban lake sediments across the United States. Thus, even small areas of sealcoated parking lots can contribute the greatest loads of PAHs to some water bodies. In Wisconsin, Selbig et al. (2013) observed strong correlations in the PAH fingerprint of stormwater pond and streambed sediment to that of CT-sealant, although the CMB8.2 model was not used to determine the source apportionment of PAHs in these samples.

Table 5 Fractional contribution of CT-sealant sources to stormwater pond and lake sediments determined by the CMB8.2 model

Distribution of Analytes and Comparisons with Land Use

Due to the dominance of parent and HMW PAHs from primarily CT-sealant sources, other ancillary measurements may be helpful for describing the fate and transport of PAHs in these ponds. In turn, this information could be used by stormwater managers for the implementation of additional BMPs, potentially for different watershed land uses, for decreasing PAH loads in stormwater runoff before reaching stormwater ponds. Box plots of the distribution of ∑PAH34 (Fig. 2), TOC and black carbon (Fig. 5), and particle size parameters (Figs. 6, 7) are provided to summarize the composite site data. The results for individual composite samples are listed in Table S16. ∑PAH34 spanned a wide range of concentrations from 2.5 to 234.9 mg/kg dw. TOC also displayed a wide range (1.8–22.0 %) due to contributions from plant detritus; cattails and other shoreline vegetation are planted around many stormwater ponds to discourage human access. Black carbon displayed a narrower range of 0.13–2.01 %. Medium silt (5.0–20 μm) and coarse silt (20–50 μm) comprised the greatest mean percentages of the particle size classes at 30.9 and 26.6 %, respectively (Table S16).

Fig. 5

Distribution of TOC and black carbon in composite stormwater pond sediment samples. See Fig. 2 for a description of the box-plot components

Fig. 6

Distribution of particle size classes in composite stormwater pond sediment samples. See Fig. 2 for a description of the box-plot components

Fig. 7

Distribution of particle size parameters in composite stormwater pond sediments. See Fig. 2 for a description of the box-plot components

The distribution of ∑PAH34 in the study ponds was bimodal, and further description of the most contaminated sites will aid in the interpretation of the subsequent “Risk Indicators” section. The highest ∑PAH34 concentrations were observed at: (1) an industrial pond located adjacent to an upland site used to store and process seasonal quantities of deicing sands (during spring), chip seal rock from residential streets (mostly summer), and street sweepings (during fall) from the City of Bloomington, MN (05-C; 234.9 mg/kg dw ∑PAH34); (2) another industrial pond that was located in a former wood treating Superfund site in New Brighton, MN that had been remediated (12-C; 102.3 mg/kg dw ∑PAH34); and (3) a commercial pond that received drainage from primarily commercial and also residential areas in North St. Paul, MN (09-C; 93.7 mg/kg dw ∑PAH34; Fig. 1, Table S16). These three stormwater ponds (median ∑PAH34 = 102.3 mg/kg dw) were significantly more contaminated than the other 12 ponds (median ∑PAH34 = 5.9 mg/kg dw; Mann–Whitney rank-sum test, p = 0.012). For pond sample 05-C, runoff from nearby street sweeping piles flowed into the pond, and these street sweepings likely contained abraded CT-sealant particles transported by way of stormwater runoff, vehicle tire tracking, and wind dispersion to nearby streets (Mahler et al. 2012). CMB8.2 model run A results showed that CT-sealant sources comprised 87.2 % of the source apportionment of PAHs in these pond sediments (Table S11). For pond sample 12-C, PAH concentrations were approximately five times greater than those in other nearby pond, ditch, and wetland sediments from the former Superfund site (CH2M Hill et al. 2010). Although creosote was used in some of the former wood treating operations at this site, the distribution of PAHs in this sample was not consistent with the two- to four-ring PAH compounds that dominate creosote (Murphy and Brown 2005). 
Pentachlorophenol and chromated copper arsenate were the primary contaminants of concern in site soil near pond #12 before remediation (CH2M Hill et al. 2010). CT-sealant sources accounted for 77.1 % of the source apportionment of PAHs in pond #12 (Table S11), which had several parking lots draining into it. Pond #09 differed from these other two ponds in that vehicle-related sources (i.e., traffic tunnel air) comprised the major source apportionment of PAHs (51.1 %) followed by CT-sealant sources (34.4 %; Table S11).

Further statistical analysis of the PAH and ancillary data was performed in preparation for assessing correlations within the data. The normality of the ∑PAH34 and ancillary data was assessed using the Shapiro–Wilk normality test; black carbon, clay and silt fractions, specific surface area of particles, and total clay, silt, and sand passed normality (p > 0.05; Table S17). ∑PAH34 (and other PAH fractions described in the previous section), TOC, sand fractions, and median particle size all failed normality (p < 0.05; Table S17). Data failing normality were log-transformed, and the Shapiro–Wilk normality test was run again, which achieved normality for most parameters (Table S18). The fine sand fraction and LMW PAHs still failed normality, and these parameters were transformed with square-root and inverse transformations, respectively, to pass the normality test (Table S18).

Pearson product moment correlation analysis was performed on these normalized data (Table S19). Previous Spearman rank order correlation analysis of the raw data showed no significant Spearman rho values (p < 0.05) between ∑PAH34 and black carbon, TOC, or any particle size fraction (Table S20). However, log ∑PAH34 was significantly correlated to black carbon (r = 0.564, p = 0.029) but not to log TOC or particle size parameters (Table S19). Similarly, there was a lack of correlation between ∑PAH12 and TOC in randomly selected surficial sediments collected from the lower St. Louis River, MN, estuary that also exhibited a broad range of TOC values (0.02–19.2 %; Crane et al. 2005). In comparison, Yang et al. (2010a) did not observe any correlation between ∑PAH13 and TOC or soot carbon in soil, sediment, and parking lot and street dust from the Fort Worth, TX, watershed, although values of log ∑PAH13 were positively correlated to the sum of organic carbon (OC) in coal tar pitch, asphalt, and soot particles (R2 = 0.66, p = 0.008), and this correlation improved further (R2 = 0.96 and p < 0.001) when residential area soil and residential street dust were excluded.

Yang et al. (2010b) showed that coal tar from sealant particles in environmental samples is resistant to the same thermal oxidation method used in this study to analyze black carbon and can contribute to black carbon measurements. Although the highest black carbon measurements of this study also corresponded to two of the most contaminated stormwater pond sediments for ∑PAH34 (i.e., pond samples 05-C and 09-C), organic petrography methods must be used to determine the types of incomplete combustion-derived carbonaceous particles (e.g., coal, char, soot, asphalt, and coal tar pitch) comprising black carbon in these samples. The presence of carbonaceous particles, such as black carbon, in sediments can increase the sorption of PAHs by up to two orders of magnitude over that expected on the basis of sorption to amorphous organic matter only, resulting in lower bioavailability than expected and limited potential for microbial degradation (Cornelissen et al. 2005). Black carbon also had a significant Pearson’s r value (0.605, p = 0.017) with log TOC (Table S19), which is consistent with it being a component of TOC.

The lack of a significant Pearson’s correlation between log ∑PAH34 and any of the particle size classes (Table S19), particularly silts and clays, was unexpected because transport of PAH-associated particles in stormwater runoff appears to be the largest transport pathway for deposition of PAHs in most urban sediments (Battelle Memorial Institute et al. 2003). However, the bed sediments of stormwater ponds are composed of loose particles derived primarily from the erosion and weathering of upland soils and rocks, from stormwater runoff of upland areas and impervious surfaces (e.g., parking lots, driveways, streets, roads, and roofs) including carbonaceous particles, and from organic particles from decomposing plants and animals. Carbonaceous particles (primarily coal, coke, charcoal, coal tar pitch, cenospheres, and wood) separated from the mineral fraction (i.e., clay, silt, and sand) of urban harbor sediments collected across the United States were found to contribute 5–7 % of the total mass of particles and 60–90 % of polychlorinated biphenyls and PAHs (Ghosh et al. 2003). Thus, coal tar pitch and other carbonaceous particles in stormwater pond sediments, which may contain the greatest concentrations of PAHs, may be diluted by other mineral and biological organic particle sources, which would preclude significant correlations between log ∑PAH34 and particle size classes in bed sediments.

The inclusion of inorganic and organic fractions in the particle size analyses provided a realistic assessment of the apparent particle size distribution in these ponds. The organic fraction probably comprised a small percentage of the apparent particle size distribution in most fractions because log TOC was not significantly correlated (p < 0.05) with the majority of particle size classes (Table S19). However, both coarse silt (20–50 μm) and the log of very fine sand (50–100 μm) had significant (p < 0.05) positive Pearson correlations with log TOC in this study (Table S19); plant detritus likely contributed to these particle size classes. In particular, pond samples 04-C and 06-C had visible detritus in the sediment, which resulted in (1) the highest percentages of TOC (22.0 and 9.5 %, respectively) observed in this study, (2) increased percentages of coarse silt (33.0 and 30.3 %, respectively), and (3) the highest percentages of very fine sand (31.5 and 27.4 %, respectively; Table S16). Additional results for the Pearson correlations between individual particle size classes are provided in the ESM (p. S6).

Land use was not a useful variable for this small study. No significant differences (p < 0.05) were observed between land-use categories for ∑PAH34, black carbon, TOC, or particle size fractions when using either one-way ANOVA for normally distributed data or Kruskal–Wallis one-way ANOVA on ranks when the normality test failed (Table S21). The power of each one-way ANOVA test was lower than the desired power of 0.800, indicating that one is less likely to detect a difference when one actually exists. The Holm–Sidak method was used as additional confirmation to assess all pairwise multiple comparisons between land uses for normally distributed data, and none of the results were significant (p > 0.05; Table S21). Due to the wide range of analyte values in some land-use categories (e.g., ∑PAH34 in industrial ponds) and the small sample size of five ponds in each major land-use classification, a greater number of ponds would increase the statistical power to test the null hypothesis of no significant differences between land-use categories. In addition, the watersheds of some ponds included mixed land uses (Table S1), which increased the complexity of employing land use as a variable. Because the major source of PAHs to most of the study pond sediments was consistent with CT-sealants, it would be useful in the future to determine the percentage of impervious surfaces due to asphalt driveways and parking lots in each watershed based on aerial photography. Follow-up field surveys could be performed using a colorimetric screening test developed by the City of Austin, TX (Bashara et al. 2012) to assess the percentage of CT-sealants and asphalt-based sealants on sealed asphalt surfaces.

Risk Indicators

Ecological Indicators of Risk

For each pond, a subset of 12 parent PAHs was summed (∑PAH12, Table 1) and compared with widely used consensus-based sediment-quality guidelines (SQGs; MacDonald et al. 2000) adopted for use in Minnesota (Crane et al. 2002; Crane and MacDonald 2003; Crane and Hennes 2007). Two industrial pond samples (05-C and 12-C) and one commercial pond sample (09-C) exceeded the consensus-based probable effect concentration (PEC) of 22.8 mg/kg dw (MacDonald et al. 2000), above which harmful effects on benthic (sediment-dwelling) organisms are likely to be observed (Fig. S3, Table 6). The median ∑PAH12 concentration for this subgroup of ponds was 49.2 mg/kg dw. Three residential ponds and one industrial pond (15-C) had ∑PAH12 concentrations less than the consensus-based threshold effect concentration (TEC) value of 1.61 mg/kg dw (MacDonald et al. 2000), below which harmful effects on benthic organisms are unlikely to be observed (Fig. S3). The median ∑PAH12 concentration for this subgroup of ponds was 1.26 mg/kg dw. Eight ponds with a mix of land uses had ∑PAH12 concentrations between the TEC and PEC values, which is typical of urban sediments (Crane et al. 2002). The median ∑PAH12 concentration for this subgroup of ponds was 4.1 mg/kg dw. The group of ponds exceeding the PEC value had significantly greater (p < 0.05) median PAH concentrations than the group of ponds below the TEC value [based on Dunn’s Method after running a Kruskal–Wallis one-way ANOVA on ranks (p = 0.003)]. No significant differences (p > 0.05) were observed for the other two pairs of pond comparisons (i.e., <TEC and between the TEC and PEC values; >PEC and between the TEC and PEC values).
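The comparison against the consensus-based guidelines amounts to sorting each ∑PAH12 value into one of three bins. A minimal sketch follows, using the TEC and PEC values quoted above; the function name and bin labels are hypothetical, and the three example inputs are the subgroup medians reported in the text.

```python
TEC_MG_PER_KG = 1.61  # threshold effect concentration for total PAHs (from the text)
PEC_MG_PER_KG = 22.8  # probable effect concentration for total PAHs (from the text)


def classify_sqg(sum_pah12_mg_per_kg):
    """Bin a sum-PAH12 concentration (mg/kg dw) relative to the TEC and PEC."""
    if sum_pah12_mg_per_kg < TEC_MG_PER_KG:
        return "below TEC"
    if sum_pah12_mg_per_kg > PEC_MG_PER_KG:
        return "above PEC"
    return "between TEC and PEC"


# Subgroup medians reported in the text (mg/kg dw).
median_labels = [classify_sqg(v) for v in (1.26, 4.1, 49.2)]
```

The three medians fall into the three bins in order: below the TEC, between the TEC and PEC, and above the PEC.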

Table 6 Summary results of ecological and human health indicators

The three composite pond samples with the highest concentrations of ∑PAH12 (i.e., 05-C, 09-C, and 12-C) were also the only study ponds to exceed the USEPA’s ESB Toxic Units of 1.0 (Table 6), indicating that adverse effects to benthic organisms in these ponds are likely to occur through narcosis (resulting in the alteration of cell membrane function; Burgess 2009). Although predictions of adverse effects using ∑PAH34 in the ESB Toxic Units model have less uncertainty than estimates based on comparisons of ∑PAH12 with PEC values, both methods identified the same samples of concern to benthic invertebrates. This concurrence was likely aided by the lower mean percentage of alkylated PAHs (21.5 %) in the stormwater pond sediments because benthic invertebrates are more sensitive to toxic effects from alkylated PAHs (Hawthorne et al. 2007).
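As a rough illustration of how a toxic-unit sum of this kind is commonly assembled, the sketch below assumes the usual equilibrium-partitioning form: each PAH concentration is normalized to the organic carbon fraction of the sediment, divided by a compound-specific benchmark, and the quotients are summed and compared with 1.0. The function name, the compound labels, and every numeric input are hypothetical placeholders; they are not values from this study or from USEPA guidance.

```python
def esb_toxic_units(conc_ug_per_g_dw, benchmark_ug_per_g_oc, f_oc):
    """Sum of toxic units over a PAH mixture (illustrative form only).

    conc_ug_per_g_dw      -- dict: PAH name -> sediment concentration (ug/g dry weight)
    benchmark_ug_per_g_oc -- dict: PAH name -> benchmark (ug/g organic carbon)
    f_oc                  -- organic carbon fraction of the sediment (0 < f_oc <= 1)
    """
    if not 0 < f_oc <= 1:
        raise ValueError("f_oc must be a fraction in (0, 1]")
    total = 0.0
    for pah, conc in conc_ug_per_g_dw.items():
        conc_oc = conc / f_oc                       # organic-carbon-normalized concentration
        total += conc_oc / benchmark_ug_per_g_oc[pah]
    return total


# Placeholder mixture of two unnamed PAHs; all numbers are made up for the example.
example_conc = {"PAH_A": 2.0, "PAH_B": 5.0}
example_benchmark = {"PAH_A": 400.0, "PAH_B": 500.0}
units = esb_toxic_units(example_conc, example_benchmark, f_oc=0.05)
print("Exceeds 1.0 toxic unit" if units > 1.0 else "Does not exceed 1.0 toxic unit")
```

In the study, the sum is taken over 34 PAHs (∑PAH34), and the three ponds above 1.0 are the ones listed in Table 6.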

Although stormwater ponds are considered treatment systems in Minnesota, and ecological effects have not typically been considered, these results may be an indicator of potential downstream ecological impacts, particularly as stormwater ponds fill up and become less efficient. In addition, some benthic invertebrates inhabiting stormwater ponds may drift downstream and colonize other areas or be consumed by higher trophic–level organisms. Because aquatic invertebrates have varying abilities to metabolize PAHs, selective bioaccumulation of some compounds may occur and be transferred to predators (Gewurtz et al. 2000). In addition, exposure to UV-A light in shallow areas may enhance the phototoxicity of PAHs to benthic invertebrates (Diamond et al. 2003) and larval fish (Niemi et al. 2006). Other biological concerns from exposure to PAHs due to CT-sealants are summarized in Mahler et al. (2012).

Human Health Indicator of Risk

Human cancer risks tend to be evaluated internationally by the use of B[a]P relative potencies because the carcinogenicity of B[a]P is well understood (CCME 2010). Because PAHs always occur as mixtures, an additive dose–response is assumed when calculating B[a]P equivalents (CCME 2010). Although there are consistent procedures for calculating B[a]P equivalents, public health and environmental agencies from different jurisdictions develop their own guidelines based on various human health exposure scenarios by which to compare B[a]P equivalent concentrations. Excavated sediment from stormwater ponds in Minnesota is evaluated based on upland disposal of the material. The B[a]P equivalents are compared to the MPCA’s Soil Reference Values (SRVs), which are soil contaminant concentrations above which an unacceptable risk to human health is predicted under two different exposure scenarios for residential and industrial land uses (MPCA 1999).

Composite samples from nine ponds exceeded the industrial SRV of 3 mg/kg B[a]P equivalents (Tables 6 and S22), and dredged material from these ponds would be characterized as having significant contamination warranting restricted disposal (e.g., in lined landfills; Stollenwerk et al. 2012). Two ponds were less than the residential SRV of 2 mg/kg B[a]P equivalents, implying unrestricted disposal of dredged material (Stollenwerk et al. 2012). B[a]P equivalents could not be calculated for the remaining four ponds that had 100 % censored cPAH data. A greater percentage of detected cPAHs (Table S2) probably would have been achieved if chemical interferences had been removed from the sample extracts using column chromatography instead of diluting them (which caused the reporting limits to increase).
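The additive calculation and SRV comparison described in this section can be sketched as follows. The two SRVs are those quoted in the text; the function names, the category labels, the compound label "PAH_X", its relative potency, and the example concentrations are all hypothetical. Only the potency of 1.0 for B[a]P follows from its role as the reference compound. The sketch does not address censored (non-detect) data, which prevented the calculation for four ponds.

```python
# MPCA Soil Reference Values quoted in the text (mg/kg B[a]P equivalents).
RESIDENTIAL_SRV = 2.0
INDUSTRIAL_SRV = 3.0


def bap_equivalents(conc_mg_kg, relative_potency):
    """Additive B[a]P equivalents: sum of concentration x relative potency."""
    return sum(conc * relative_potency[pah] for pah, conc in conc_mg_kg.items())


def compare_with_srvs(bap_eq_mg_kg):
    """Place a B[a]P-equivalent concentration relative to the two SRVs.

    The labels are illustrative; the text only discusses the first two cases.
    """
    if bap_eq_mg_kg > INDUSTRIAL_SRV:
        return "above industrial SRV"
    if bap_eq_mg_kg < RESIDENTIAL_SRV:
        return "below residential SRV"
    return "between residential and industrial SRVs"


# Hypothetical two-compound example; B[a]P has potency 1.0 by definition,
# while the potency of 0.1 for "PAH_X" is a placeholder, not a regulatory value.
example_conc = {"B[a]P": 1.5, "PAH_X": 4.0}
example_potency = {"B[a]P": 1.0, "PAH_X": 0.1}
print(compare_with_srvs(bap_equivalents(example_conc, example_potency)))
```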

Management Implications

Regulations to control the use and disposal of stormwater pond sediments in the United States and Canada vary widely (Polta 2004). Some states, such as South Carolina, do not require any chemical testing of pond sediments before dredging (Weinstein et al. 2010a). Jurisdictions lacking chemical testing and sediment benchmarks may benefit from the approach used in Minnesota. There are also inconsistent policies for privately owned stormwater ponds that are not part of an MS4 system. Thus, sediments from thousands of ponds in the United States may be dredged and disposed of in ways that increase environmental releases of PAHs. Even with proper assessment, there are not enough specially lined landfills to accommodate the volume of dredge material from the most contaminated ponds.

An additional confounding factor is that intense rainfalls are getting larger and more frequent in the Midwest United States due to climate change, and these storms frequently overload stormwater systems (Cruce and Yurkovich 2011). Municipalities that delay maintenance dredging of their stormwater ponds have less capacity to manage intense rainfalls and risk the resuspension and transport of PAH-contaminated sediments to downstream areas where ecological and human health impacts may be more severe.

Product bans on CT-sealants can decrease PAHs in the environment because suitable alternatives either have much lower concentrations of PAHs (i.e., asphalt-based sealants) or no PAHs at all (i.e., acrylic sealants; Crane et al. 2010). One nationwide, commercial sealant applicator in the United States is phasing out CT-sealants, and several large retailers in the United States no longer provide this product to consumers (MPCA 2013b). The State of Washington, District of Columbia, and several local units of government in eight states have banned CT-sealants (MPCA 2013b). The MPCA increased awareness about this issue, and the Minnesota Legislature provided an incentive for municipalities to pass CT-sealant bans in 2009 (Crane et al. 2010), resulting in 29 municipal bans (MPCA 2013b); some municipalities that passed bans competed for legislative-funded grants to remove PAH-contaminated sediments from their stormwater ponds. During May 2013, the State of Minnesota enacted a law to prohibit the statewide sale and use of CT-sealants effective January 1, 2014. Federal United States legislation was proposed in 2013 to amend the Toxic Substances Control Act (TSCA) to prohibit the manufacture, processing, distribution in commerce, and use of coal tar sealants, and for other purposes (http://beta.congress.gov/bill/113th-congress/house-bill/1625/titles). Alternatively, Wagner (2009) suggested using competition-based regulation to augment TSCA and allow the manufacturer of a purportedly less toxic sealant to petition the USEPA to rule on whether CT-sealants produce an “unreasonable risk.” It may take more than a decade of monitoring to determine the effectiveness of any ban in the environment because the physical–chemical properties of PAHs lead to their persistence (Mahler et al. 2012). Pavlowsky (2012) used a regression model approach to suggest that a CT-sealant ban, if implemented, could potentially result in an 80–90 % decrease in PAH concentrations in stream and pond sediments in Springfield, MO, but that it could take ≥20 years to achieve this decreased level.

Because most of the coal tar pitch produced in North America is used by the aluminum industry in a controlled environment, the small amount used to produce CT-sealants could be shifted to other products if a nationwide ban were implemented. During 2006 and 2007, there were shortages of coal tar pitch, and manufacturers and commercial sealant applicators in the United States compensated by: (1) allocating CT-sealant to some customers, (2) producing CT-sealant blended with asphalt-based sealant, and (3) producing and marketing just asphalt-based sealant (Heydorn 2006). Thus, the sealant industry was able to adapt quickly to these shortages (Crane et al. 2010).

The supply of coal tar in North America is also decreasing, which is reducing the feedstock for producing coal tar pitch. In the late 1980s and early 1990s, the closing of United States coke ovens (of which coal tar is a by-product) accelerated due to economic and environmental considerations; the development of new processes for making steel has also decreased or eliminated the need for metallurgical coke (Baron et al. 1998). Between 1970 and 1990, production of metallurgical coke in the United States decreased by >50 %, whereas Japanese and Chinese production increased by 50 and 300 %, respectively (Kaegi et al. 2000). In 1999, China provided more than one-third of the world’s production of coke and more than half of global coke exports (Terjung 2000). The importation of coal tar or coal tar pitch from other countries for use in producing CT-sealants in the United States is performed through the RCRA exclusion (USEPA 1992), which allows this waste product to be recycled into CT-sealants.

Environmental programs can also educate consumers and businesses about using alternatives to regular asphalt pavements for driveways and parking lots, such as concrete, permeable asphalt or concrete, permeable pavers, or gravel (Frazer 2005; Dietz 2007). Parking lots are also usually larger than needed in North America, and urban planners and businesses can consider decreasing their size. For example, parking lots comprise approximately 5 % of urban land use in Illinois, Indiana, Michigan, and Wisconsin in the United States, and the number of parking spaces per registered vehicle in each state ranges from approximately 2.5 to 3 (Davis et al. 2010).

Of broader concern, the dark black pavement that results from application of CT-sealants absorbs solar radiation and radiates heat back to the atmosphere, contributing to the urban heat island effect, increased CO2 emissions, and increased potential of rutting failure in asphalt pavements (Mallick et al. 2009). Reflective coatings can be used to decrease pavement temperature and increase the albedo of urban areas, resulting in a decrease of atmospheric temperature (Akbari et al. 2012). Permeable materials used in these “cool” pavements can also decrease the need for stormwater ponds. Other stormwater BMPs, such as rain gardens and tree filters, can also decrease the need for stormwater ponds and provide more favorable conditions for microbial degradation of PAHs (LeFevre et al. 2012).

Conclusion

In this study, the CMB8.2 model provided a rigorous environmental forensic technique compared with PAH source ratios for determining the source apportionment of PAHs in stormwater pond sediments from the Minneapolis-St. Paul, MN, metropolitan area. For the 15 ponds sampled, CT-sealants comprised the major source of PAHs (67.1 %) followed by vehicle emissions (29.5 %) and wood combustion (3.4 %). The dominance of CT-sealants was consistent with other local, regional, and national urban sediment data sets where this model has been used.

For this study, ∑PAH34 did not differ significantly between groups of ponds in major land-use classes, partly due to the small sample size and large range of ∑PAH34 in ponds from industrial land-use areas. Thus, municipalities charged with monitoring stormwater ponds for PAHs cannot presently employ land use as a screening tool for prioritizing ponds for analysis. Ancillary physical and chemical parameters showed mixed results in accounting for the fate and transport of PAHs. The lack of correlation between ∑PAH34 and TOC in pond sediments was due in part to the influx of TOC from surrounding wetland vegetation. Particle size classes included both inorganic and organic particles, and no significant correlations were observed between subfractions of clay, silt, and sand with different groups of PAHs. Particles originating from impervious surfaces (e.g., road dust, CT-sealant particles) in stormwater runoff may have been diluted by eroded soil particles and decomposing plant and animal matter. Log ∑PAH34 displayed a significant correlation with black carbon (r = 0.564, p = 0.029), which increases the sorption of PAHs, and this sorption can affect the bioavailability and toxicity of PAHs.

Three stormwater ponds exceeded ecological benchmarks above which adverse effects to benthic invertebrates are likely, and follow-up sediment toxicity tests could be used to test for acute and chronic effects. Nine ponds exceeded the human health-based B[a]P equivalents guideline value for restricted disposal of stormwater pond sediments in Minnesota. This finding would result in high disposal costs for these sediments. Pollution prevention efforts and product bans of CT-sealants, such as Minnesota’s new statewide ban, provide management options for decreasing the release of PAHs in urban environments. Once implemented, sediment quality should improve over time so that ecological risks are reduced and beneficial reuse options for excavated sediment can be considered for more ponds.