This is the second in a series of three articles based on a baseline ecological risk assessment (BERA) of the Calcasieu Estuary, Louisiana (MacDonald et al. 2002). This study was undertaken based on the results of earlier investigations showing that several areas in the Calcasieu Estuary have been contaminated by historic releases of toxic or bioaccumulative substances into aquatic ecosystems. For example, Curry et al. (1997) reported that contamination from industrial development is apparent in sediments from the Calcasieu Estuary, with the highest concentrations evident in the areas of dense industrial activity. These areas included Bayou d’Inde, Bayou Verdine, Clooney Island Loop, Coon Island Loop, and Prien Lake (see Fig. 1 in MacDonald et al. 2010a; Curry et al. 1997). Similarly, McLaren/Hart-Chem Risk Environmental Engineering (MHEE 1998) reported that the concentrations of a number of chemicals of potential concern (COPCs) were sufficient to adversely affect benthic invertebrates at several locations within the estuary, including Bayou Verdine, Coon Island Loop, and upper and lower Bayou d’Inde. In a more localized assessment of the risks posed to ecological receptors associated with exposure to contaminated sediments, Entrix Inc. (2001) reported that various COPCs occurred in Bayou Verdine sediments at levels sufficient to adversely affect benthic invertebrates.

Although sediment chemistry data are essential for evaluating sediment-quality conditions, chemistry data alone provide incomplete information for classifying or managing contaminated sediments (Long et al. 1995). Interpretive tools are also required to relate sediment chemistry data to the potential for observing adverse biological effects (Wenning et al. 2005). Various toxicity and bioaccumulation tests can be performed to evaluate the biological significance of sediment-associated COPCs (MacDonald et al. 1996; Ingersoll et al. 1997; United States Environmental Protection Agency [USEPA] 2000a). In addition, numerical sediment-quality guidelines (SQGs) can be used to interpret data on the concentrations of COPCs in whole sediments. In this context, SQGs represent the concentrations of COPCs in whole sediments below which adverse effects on sediment-dwelling organisms are unlikely to occur or above which such effects are likely to be observed (Wenning et al. 2005).

Guidelines for assessing sediment quality relative to the potential for adverse effects on sediment-dwelling organisms in aquatic systems have been derived using a combination of mechanistic and empirical approaches (Wenning et al. 2005). Application of these methods has resulted in the derivation of numerical SQGs for many COPCs in freshwater, estuarine, and marine sediments (Long and Morgan 1990; Di Toro et al. 1991; Persaud et al. 1993; Ingersoll et al. 1996; Smith et al. 1996; Cubbage et al. 1997; USEPA 1997; New York State Department of Environmental Conservation 1999; Wenning et al. 2005). Additionally, various chemical mixture models have been developed to further support evaluations of sediment-quality conditions using sediment chemistry data (Swartz et al. 1995; Ankley et al. 1996; Berry et al. 1996; Long et al. 1998a, b; Long and MacDonald 1998; MacDonald et al. 2000; USEPA 2000b).

In the Calcasieu Estuary BERA, numerical SQGs were required to identify the concentrations of COPCs likely to be associated with adverse effects on sediment-dwelling organisms (MacDonald et al. 2002, 2010a). More specifically, numerical SQGs were needed to identify the whole-sediment samples in which the concentrations of COPCs were sufficient to adversely affect various groups of ecological receptors, including microorganisms (i.e., microbial community), benthic macroinvertebrates (i.e., benthic invertebrate community), or fish (i.e., fish community; MacDonald et al. 2002). In this application, SQGs were needed to classify individual sediment samples from the Calcasieu Estuary as likely toxic or likely not toxic and to calculate the predicted incidence of toxicity or the predicted magnitude of toxicity for various reaches and areas of concern (AOCs) in the Estuary (MacDonald et al. 2010b). In addition, such SQGs were needed to identify contaminants of concern (COCs) in sediment samples from the Calcasieu Estuary (i.e., the substances that are causing or substantially contributing to sediment toxicity or other adverse effects; MacDonald et al. 2002, 2010b).

The results of previous evaluations of various SQGs and associated chemical mixture models indicate that SQGs generally provide reliable and/or predictive tools for assessing sediment-quality conditions throughout the United States (Barrick et al. 1988; Long et al. 1995, 1998a, b; MacDonald et al. 1996, 2000, 2010c; Long and MacDonald 1998; Field et al. 1999, 2002; USEPA 2000b; Ingersoll et al. 2005; Word et al. 2005). Nevertheless, further information was needed to support the selection of the tools most appropriate for application in the Calcasieu Estuary. For this reason, three sets of SQGs for the protection of benthic invertebrates were evaluated to determine their applicability in the Calcasieu Estuary (i.e., for assessing the effects of mixtures of a broad range of COPCs): (1) effects range median values (ERMs; Long et al. 1995); (2) probable effect concentrations (PECs; MacDonald et al. 2000); and (3) T50s (concentrations associated with a 50% probability of observing toxicity in marine amphipods; Field et al. 2002). More specifically, the ability of the SQGs to correctly predict the presence and absence of whole-sediment toxicity to amphipods, based on sediment chemistry data alone, was evaluated (i.e., the predictive ability of the SQGs; Word et al. 2005).

This article describes the approach that was used to evaluate the predictive ability of the SQGs in the Calcasieu Estuary. More specifically, this article describes the steps that were taken to acquire matching sediment chemistry and toxicity data from the Calcasieu Estuary, to review and evaluate each of the candidate data sets, and to compile data sets into an estuary-specific sediment toxicity database. The methods used to evaluate the predictive ability of the SQGs and the results of that evaluation are also presented.

Methods

The evaluation of the predictive ability of the effects-based SQGs involved several steps. In the first step of the process, candidate SQGs from various sources were identified and compiled. Next, matching sediment chemistry and toxicity data were obtained from several studies conducted in the Calcasieu Estuary. Subsequently, the acquired data were reviewed and evaluated to determine their scientific and technical validity (i.e., relative to the project-specific data-selection criteria). Data that met the data selection criteria were incorporated into a relational project database. Finally, the compiled data were used to evaluate the predictive ability of the contaminant mixture models that have been developed for three types of effects-based SQGs, including the ERM-quotient model (i.e., ERM-Q; Long et al. 1998a), the PEC-Q model (MacDonald et al. 2000; USEPA 2000b), and the P-Max/P-Avg models (Field et al. 2002).

Identification and Compilation of Candidate SQGs

Identification and compilation of candidate SQGs represents an essential element of the overall sediment risk-assessment process. As a first step, the published SQGs derived by various investigators and jurisdictions for assessing the quality of freshwater, estuarine, and marine sediments were identified and collated. More specifically, SQGs that were applicable for assessing effects on the benthic invertebrate community associated with exposure to whole sediments were compiled. Next, the SQGs obtained from all sources were evaluated to determine their applicability to this study. To facilitate this evaluation, the supporting documentation for each of the SQGs was reviewed. The collated SQGs were further considered for use in this study if (1) the methods that were used to derive the SQGs were readily apparent; (2) the SQGs were based on unique empirical data that related contaminant concentrations to harmful effects on sediment-dwelling organisms or were intended to be predictive of effects on sediment-dwelling organisms (i.e., not simply an indicator of background contamination); (3) the SQGs had been derived on a de novo basis (i.e., not simply adopted from another jurisdiction or source); and (4) mixture models had been developed for assessing the combined effects of COPCs. A total of three sets of SQGs met these criteria: the ERMs developed by Long et al. (1995), the PECs developed by MacDonald et al. (2000), and the logistic regression models (LRMs) formulated by Field et al. (2002).

Acquisition of Candidate Data Sets

An extensive search of the scientific literature was conducted to acquire matching sediment chemistry and toxicity data from the Calcasieu Estuary. To support the predictive ability evaluation, information was acquired on the concentrations of COPCs (i.e., trace metals; polychlorinated biphenyls [PCBs]; polycyclic aromatic hydrocarbons [PAHs]; certain organochlorine pesticides, such as chlordane, endrin, and dieldrin; and several other classes of organic contaminants) in sediments from the estuary and the matching data on the effects on sediment-dwelling organisms associated with exposure to those sediments (i.e., based on the results of whole-sediment toxicity tests). Candidate data sets were acquired by first accessing the information contained in MacDonald Environmental Sciences Limited’s (MESL) database on the effects of sediment-associated contaminants on aquatic organisms (i.e., Biological Effects Database for Sediments; MacDonald et al. 1994). Then the information contained in the National Oceanic and Atmospheric Administration’s (NOAA) Query Manager database was obtained. On-line searches of a number of bibliographic databases were also conducted to obtain recently published articles from peer-reviewed journals. The recent volumes of peer-reviewed journals that routinely publish papers on the effects of sediment-associated contaminants were also reviewed to access recently published data. Finally, various practitioners in the sediment quality-assessment field were contacted, either by letter or phone, to obtain additional published and unpublished data sets relevant to this project (Table 1).

Table 1 Listing of matching sediment chemistry and toxicity data sets compiled in the Calcasieu Estuary database

Review and Evaluation of Candidate Data Sets

All of the historic data sets and associated documentation retrieved during the course of this study were critically evaluated to determine their scientific and technical validity. To support this evaluation, selection criteria were developed based on the performance criteria for measurement data established in the project sampling and analysis plans (MacDonald et al. 2001). These selection criteria provided a means of consistently evaluating the procedures used in each study, including the methods used to collect, handle, and transport sediment samples (e.g., consistent with American Society for Testing and Materials [ASTM] 2009a); the methods applied to conduct sediment toxicity tests (e.g., consistent with ASTM 2009b, c); the methods used to determine the concentrations of COPCs in sediments (e.g., consistent with MacDonald et al. 2008); and the statistical tests applied to the study results (MacDonald et al. 2002). In some cases, additional communications with investigators or professional judgment were needed to determine if the selection criteria had been satisfied.

The estuary-specific database on the toxicity of sediment-associated COPCs to Hyalella azteca in 10-day exposures contained matching sediment chemistry and toxicity data for 127 sediment samples (Tables 1, 2). These data were obtained from three sources, including an investigation of sediment quality conditions in PPG Canal and lower Bayou d’Inde conducted by MHEE (1998), a BERA of Bayou Verdine conducted by Entrix Inc. (2001), and the Phase II Remedial Investigation of the Calcasieu Estuary (MESL 2001; Table 1). In each of these evaluations, the acute toxicity associated with exposure to whole-sediment samples was evaluated using the results of 10-day toxicity tests with H. azteca. All of these toxicity tests were conducted at overlying water salinities of approximately 10‰. The concentrations of a wide variety of COPCs were measured in each sample, including trace metals, PAHs, PCBs, organochlorine pesticides, chlorinated benzenes, phthalates, polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDDs/PCDFs), and other substances. Conventional variables, such as grain size and total organic carbon (TOC), were also measured to support the interpretation of the resultant data.

Table 2 Summary of whole-sediment toxicity data used to evaluate the predictive ability of the preliminary sediment quality-assessment guidelines for the Calcasieu Estuary

A substantial quantity of data are available from 28-day toxicity tests with the amphipod H. azteca to evaluate the predictive ability of the SQGs in the Calcasieu Estuary (n = 100; Table 2). These data were obtained from the Phase II Remedial Investigation of the Calcasieu Estuary (MESL 2001; Table 1). In this evaluation, the toxicity of whole-sediment samples was evaluated using the results of 28-day toxicity tests with H. azteca. All of these toxicity tests were conducted at overlying water salinities of approximately 10‰. The concentrations of a wide variety of COPCs were measured in each sample, including trace metals, PAHs, PCBs, organochlorine pesticides, chlorinated benzenes, phthalates, PCDDs/PCDFs, and other substances. Conventional variables, such as grain size and TOC, were also measured to support the interpretation of the resultant data.

The estuary-specific database on the toxicity of sediment-associated COPCs to Ampelisca abdita in 10-day toxicity tests contained matching sediment chemistry and toxicity data for 165 sediment samples from the Calcasieu Estuary (Table 2). These data were obtained from four sources, including (1) an investigation of Calcasieu Estuary sediments conducted by the USEPA in 1988 and 1989 (Redmond et al. 1996), (2) a follow-up study conducted by the USEPA in 1997 (Toxicon Environmental Sciences [TES] 1997), (3) samples collected under the Environmental Monitoring and Assessment Program (USEPA, unpublished data), and (4) the Phase II Remedial Investigation of the Calcasieu Estuary (Harding ESE 2001; MESL 2001; Table 1). In each of these evaluations, the toxicity of sediment samples was evaluated using the results of 10-day toxicity tests with A. abdita. All of these toxicity tests were conducted at overlying water salinities of approximately 30‰. The concentrations of a wide variety of COPCs were measured in each sample, including trace metals, PAHs, PCBs, organochlorine pesticides, chlorinated benzenes, phthalates, PCDDs/PCDFs, and other substances. Conventional variables, such as grain size and TOC, were also measured to support the interpretation of the resultant data.

Development of a Sediment Toxicity Database

All of the matching sediment chemistry and toxicity data assembled that met the screening criteria were incorporated into the project database on a per-sample basis. Each record in the resultant database included the citation; a brief description of the study area (i.e., by water body and reach); a description of the sampling locations (including georeferencing data if available); information on the toxicity tests that were conducted (including species tested, endpoint measured, test duration); type of material tested (whole sediment or pore water); the TOC concentrations (if reported); and the chemical concentrations (expressed on a dry-weight basis). Other supporting data, such as simultaneously extracted metals concentrations, acid-volatile sulfides, particle size distributions, and other variables, were also included in the individual data records (as available).

The information compiled includes data on a substantial number of samples with a broad range of concentrations and various toxicity endpoints (Table 1). These data sets provided information on the toxicity of whole-sediment samples to the following benthic invertebrate species: the amphipod H. azteca (endpoints = survival and growth in 10- and 28-day whole-sediment exposures); the amphipod A. abdita (endpoint = survival in 10-day whole-sediment exposures); and the polychaete Nereis virens (endpoint = survival in 28-day whole-sediment exposures). Both amphipod species were tested in this study because the Calcasieu Estuary exhibits a wide range of salinities and supports a host of freshwater, estuarine, and marine organisms. Additionally, the results of pore-water toxicity tests on the sea urchin Arbacia punctulata (endpoints = fertilization and gamete development) were incorporated into the project database. These four invertebrate species were selected because they have been routinely used in sediment assessments, because standard methods are available for conducting the toxicity tests, and because Gulf Coast-specific toxicity tests have not been widely recommended. The results of benthic invertebrate community structure analyses were also incorporated into the project database. However, only the toxicity data for the two amphipod species were used to evaluate the predictive ability of the candidate SQGs.

Although matching sediment chemistry and toxicity data were available on various species and toxicity test endpoints, only a subset of these data were selected for evaluating the predictive ability of the SQGs. More specifically, the results of the following toxicity tests were used in the predictive ability evaluation (Table 2): (1) 10-day whole-sediment toxicity tests with the salt-water acclimated amphipod H. azteca (endpoints = survival or growth); (2) 28-day whole-sediment toxicity tests with the salt-water acclimated amphipod H. azteca (endpoints = survival or growth); and (3) 10-day whole-sediment toxicity tests with the amphipod A. abdita (endpoint = survival).

Amphipod toxicity test results were selected for use in the evaluation of the SQGs for several reasons. First, sediment toxicity tests with amphipods have been standardized (ASTM 2009b, c; USEPA 2000a). In addition, a large quantity of data on the effects of sediment-associated COPCs on freshwater and marine amphipods are available nationally (USEPA 2000b; Field et al. 2002). As such, the results of tests conducted with sediment samples from the Calcasieu Estuary could be readily compared with the results of tests conducted elsewhere in the United States. In this respect, the results of amphipod toxicity tests compiled for the estuary can be compared with the concentration–response relations (e.g., mean PEC-Q, mean ERM-Q, P-Avg, and P-Max [i.e., percent incidence of toxicity]) developed for amphipods using larger databases (Long et al. 1998a; Long and MacDonald 1998; USEPA 2005, 2000b; Field et al. 2002).

Individual sediment samples were designated as toxic or not toxic based on comparison of the measured response for that sample with the response for the laboratory control samples (i.e., consistent with the approach used in the larger databases). More specifically, the sediment samples tested with A. abdita were designated as toxic if survival was significantly different from the control (based on analysis of variance [ANOVA]) and control-adjusted survival was <80% (Thursby et al. 1997). For H. azteca survival, sediment samples were designated as toxic if there was a significant decrease in survival relative to a control (based on ANOVA) and the control-adjusted survival was <80% (according to Long and MacDonald 1998). For H. azteca growth, sediment samples were designated as toxic if there was a significant decrease in amphipod length relative to a control (based on ANOVA) and the control-adjusted length was <90% (USEPA 2001). If the results for the control treatment were unavailable, then the responses for sediment samples from the study areas were compared with those for appropriately selected sediment samples from reference areas (i.e., reference sediments; ASTM 2009b).
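The designation rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the function and variable names are hypothetical, the significance flag is assumed to come from a separate ANOVA that is not reproduced here, and control-adjusted response is assumed to mean the sample response expressed as a percentage of the control response.

```python
# Thresholds quoted in the text: <80% of control for survival,
# <90% of control for length (growth).
THRESHOLDS = {"survival": 80.0, "growth": 90.0}


def control_adjusted(sample_mean: float, control_mean: float) -> float:
    """Express a sample response as a percentage of the control response."""
    return 100.0 * sample_mean / control_mean


def is_toxic(sample_mean: float, control_mean: float,
             significant: bool, endpoint: str) -> bool:
    """A sample is toxic only if BOTH conditions hold: the difference from
    the control is statistically significant (supplied by the caller) and
    the control-adjusted response is below the endpoint threshold."""
    adjusted = control_adjusted(sample_mean, control_mean)
    return bool(significant) and adjusted < THRESHOLDS[endpoint]


# Hypothetical survival percentages for illustration.
print(is_toxic(70.0, 95.0, True, "survival"))   # about 74% of control
print(is_toxic(85.0, 95.0, True, "survival"))   # about 89% of control
print(is_toxic(70.0, 95.0, False, "survival"))  # not significant
```

Requiring both conditions means that a small but statistically significant decrease, or a large but nonsignificant one, does not by itself lead to a toxic designation.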

To support subsequent interpretation of the sediment chemistry data, the total concentrations of several chemical classes were determined for each sediment sample using the methods described by MacDonald et al. (2000). In calculating the total concentrations of the various chemical classes, values less than the detection limit were assigned a value of one half of detection limit, except when the detection limit was greater than the consensus-based PEC (or an alternate SQG if a PEC was not available; MacDonald et al. 2000). In this latter case, the value that was less than detection limit was not used in the calculation of the total concentration of the substance, in the calculation of mean PEC-Qs, mean ERM-Qs, or in the evaluation of the P-Avg/P-Max model. Increased detection limits were only rarely observed for the chemicals included in these chemical mixture models (e.g., metals, PAHs, PCBs, and/or organochlorine pesticides).
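The treatment of values below the detection limit can be illustrated with a short Python sketch. The names are hypothetical, and the sketch assumes that each nondetect is compared with a single guideline value supplied by the caller; the source does not state whether the comparison was made substance by substance or against a class-level guideline.

```python
from typing import NamedTuple, Optional, Sequence


class Result(NamedTuple):
    value: float    # reported concentration, or the detection limit if not detected
    detected: bool  # False when the result was reported as less than the detection limit


def substituted_value(result: Result, sqg: float) -> Optional[float]:
    """Return the value to use in a class total, or None to exclude it."""
    if result.detected:
        return result.value
    if result.value > sqg:
        return None              # detection limit exceeds the guideline: exclude
    return result.value / 2.0    # otherwise substitute one half of the detection limit


def class_total(results: Sequence[Result], sqg: float) -> float:
    """Sum a chemical class (e.g., individual PAHs) after substitution."""
    used = [v for v in (substituted_value(r, sqg) for r in results) if v is not None]
    return sum(used)


# Hypothetical concentrations and guideline, in arbitrary but consistent units.
results = [Result(120.0, True), Result(40.0, False), Result(900.0, False)]
print(class_total(results, sqg=500.0))  # 120 + 40/2; the 900 nondetect is excluded
```

Excluding nondetects with elevated detection limits avoids inflating a total with a substituted value that would, by itself, exceed the guideline.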

Evaluation of the Predictive Ability of SQGs

Previous evaluations of numerical SQGs have typically focused on determining the reliability and predictive ability of SQGs. Reliability is evaluated to determine if the SQGs for individual substances are consistent with their stated narrative intent (e.g., threshold effect concentrations [TECs] are intended to define COPC concentrations below which adverse effects on benthic invertebrates are only rarely observed) and is assessed with the data used to derive the original SQGs. Predictive ability evaluations are also conducted to determine if the SQGs perform in a manner consistent with their narrative intent but are conducted with independent data sets (i.e., data sets not used to derive the SQGs). For all three sets of SQGs examined (i.e., ERMs, PECs, and LRM point estimates), the results of previous evaluations have demonstrated the reliability of the SQGs for individual substances and COPC mixtures (Long et al. 1995; MacDonald et al. 2000; Field et al. 2002; USEPA 2000b).

In this investigation, the three sets of SQGs for benthic invertebrates were evaluated based on the performance of their associated chemical mixture models in the Calcasieu Estuary (i.e., to evaluate predictive ability). As a first step, the incidence of sediment toxicity within ranges of mean ERM-Qs or mean PEC-Qs was calculated for the Calcasieu Estuary and compared with the incidence of effects that was observed for sediments collected at sites located throughout the United States. More specifically, the incidence of toxicity for each of the selected toxicity tests was determined for the following categories of mean ERM-Qs or mean PEC-Qs: <0.1, 0.1 to <0.5, 0.5 to <1.0, 1.0 to <5.0, ≥1.0, and ≥5.0. These ranges are the same as those used in the USEPA (2000b) evaluation of the predictive ability of the consensus-based PECs.
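The tabulation described above can be sketched as follows. This is an illustrative Python example, not the procedure used in the assessment: the mean quotient is computed here as a simple arithmetic mean of concentration-to-guideline ratios, whereas the cited sources may group substances by chemical class before averaging. All names and numerical values are hypothetical.

```python
import math

# (label, lower bound inclusive, upper bound exclusive); the >=1.0 range
# deliberately overlaps the two ranges above it, as in the text.
RANGES = [
    ("<0.1", 0.0, 0.1),
    ("0.1 to <0.5", 0.1, 0.5),
    ("0.5 to <1.0", 0.5, 1.0),
    ("1.0 to <5.0", 1.0, 5.0),
    (">=1.0", 1.0, math.inf),
    (">=5.0", 5.0, math.inf),
]


def mean_quotient(concentrations: dict, guidelines: dict) -> float:
    """Average of concentration / guideline over substances that have a guideline."""
    quotients = [concentrations[s] / guidelines[s]
                 for s in concentrations if s in guidelines]
    return sum(quotients) / len(quotients)


def incidence_by_range(samples):
    """samples: iterable of (mean quotient, is_toxic) pairs.
    Returns {label: (n, percent toxic)}; percent is None for empty ranges."""
    samples = list(samples)
    table = {}
    for label, low, high in RANGES:
        flags = [toxic for q, toxic in samples if low <= q < high]
        n = len(flags)
        table[label] = (n, 100.0 * sum(flags) / n if n else None)
    return table


# Hypothetical substances, guideline values, and samples.
print(mean_quotient({"a": 5.0, "b": 30.0}, {"a": 10.0, "b": 20.0}))
demo = [(0.05, False), (0.08, True), (0.3, False), (0.7, True), (2.0, True), (6.0, True)]
for label, (n, pct) in incidence_by_range(demo).items():
    print(label, n, pct)
```

Because the ≥1.0 range overlaps the ranges above it, the sample counts across all six categories do not sum to the total number of samples.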

The mean ERM-Qs and mean PEC-Qs were also evaluated by deriving concentration-response relations from the information contained in the Calcasieu Estuary database. The relation between mean ERM-Qs or PEC-Qs (concentration) and incidence of toxicity (response) was evaluated by regression analysis applied to the summarized data for each toxicity test endpoint (i.e., using statistical software). More specifically, the underlying sediment chemistry and toxicity data were sorted by increasing mean ERM-Q or PEC-Q and compiled into groups of ≤15 samples depending on the number of samples available for each endpoint (i.e., to yield a minimum of 10 groups of samples; consistent with the methods of MacDonald et al. 2000; USEPA 2000b). For each group of samples, termed a “concentration interval,” the incidence of toxicity (i.e., percent of samples toxic) and the geometric mean of the mean ERM-Q or mean PEC-Q were calculated. These summarized data were then plotted and used to generate the regression models for each toxicity test endpoint. The coefficient of determination (r²) and level of significance (p) were then determined for each model using statistical software. Subsequently, the estuary-specific concentration-response relations for H. azteca were compared with the concentration-response curves (based on mean PEC-Qs) generated by the USEPA (2000b) using the results of 10- to 14-day or 28- to 42-day toxicity tests with H. azteca; the comparison was made using an F test (Zar 1999). Similarly, estuary-specific concentration-response relations for A. abdita were compared with concentration-response curves based on mean ERM-Qs or PEC-Qs that were generated using a national database for two marine amphipod species, A. abdita and Rhepoxynius abronius (Field et al. 1999). Field et al. (1999) evaluated matching chemistry and toxicity data for these two species and concluded that their sensitivities to lead, mercury, zinc, and phenanthrene were similar. Therefore, data for both marine species were compiled in the national database and treated as equivalent. The regional data were considered to be consistent with the national data if the regional dose-response curve generally fell within the 95% prediction limits for the relation that was generated using the information contained in the national database (as determined using statistical software).
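The summarization step that precedes the regression can be sketched in Python as shown below. The exact rule used to size the groups is not given in the text, so this sketch assumes near-equal groups, with the number of groups chosen as the larger of 10 and the smallest count that keeps every group at 15 samples or fewer. The function name and the sample data are hypothetical, and the regression and F test themselves are not reproduced.

```python
import math


def concentration_intervals(samples, max_size=15, min_groups=10):
    """samples: iterable of (mean quotient > 0, is_toxic) pairs.
    Returns one (geometric mean quotient, percent toxic, n) tuple per interval."""
    ordered = sorted(samples, key=lambda s: s[0])
    n = len(ordered)
    if n == 0:
        return []
    # Assumed grouping rule: at least min_groups groups, none larger than max_size.
    n_groups = min(n, max(min_groups, math.ceil(n / max_size)))
    base, extra = divmod(n, n_groups)
    summary, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        group = ordered[start:start + size]
        start += size
        geo_mean = math.exp(sum(math.log(q) for q, _ in group) / len(group))
        pct_toxic = 100.0 * sum(1 for _, toxic in group if toxic) / len(group)
        summary.append((geo_mean, pct_toxic, len(group)))
    return summary


# Hypothetical data: 127 samples with increasing quotients.
demo = [(0.01 * (i + 1), i % 3 == 0) for i in range(127)]
for geo_mean, pct_toxic, size in concentration_intervals(demo):
    print(f"{geo_mean:.3f}  {pct_toxic:5.1f}%  n={size}")
```

Under these assumptions, 127 samples yield 10 intervals of 12 or 13 samples each, and 165 samples yield 11 intervals of 15. The resulting pairs of geometric mean and percent toxic are the points to which a regression model would then be fitted.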

The procedures developed by Field et al. (2002) were used to evaluate the predictive ability of the P-Max/P-Avg models. More specifically, the data on the concentration of each COPC in each sediment sample from the Calcasieu Estuary were used in conjunction with the corresponding LRMs to determine the probability that each sample would be toxic to A. abdita or H. azteca (i.e., a probability was calculated for each COPC). Subsequently, the P-Max or P-Avg for each sediment sample was determined based on the probabilities that were calculated for the various COPCs that were measured. Next, the P-Max model or P-Avg model from Field et al. (2002) was used to determine the final probability of observing toxicity to amphipods for each sediment sample (i.e., expressed as a decimal fraction between 0.0 and 1.0). Finally, the incidence of toxicity for sediment samples included within four ranges of probabilities (i.e., 0.0 to 0.25, >0.25 to 0.50, >0.50 to 0.75, and >0.75) was determined and then compared with the predicted incidence of toxicity for each range of probabilities (e.g., as calculated based on the average P-Max or P-Avg for the samples within the range applied to the national P-Max/P-Avg models; Field et al. 2002).
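The first and last steps of this procedure can be sketched in Python. Several assumptions are made and should be checked against Field et al. (2002): the per-substance model is assumed to be a logistic function of log10 concentration, the coefficients shown are invented for illustration, and the second-stage P-Max and P-Avg models that convert these summary values into a final probability are omitted. All names are hypothetical.

```python
import math

# (label, lower bound exclusive, upper bound inclusive); 0.0 itself falls in the first bin.
BINS = [
    ("0.0 to 0.25", 0.0, 0.25),
    (">0.25 to 0.50", 0.25, 0.50),
    (">0.50 to 0.75", 0.50, 0.75),
    (">0.75", 0.75, 1.0),
]


def logistic_probability(conc: float, b0: float, b1: float) -> float:
    """Assumed model form: p = 1 / (1 + exp(-(b0 + b1 * log10(conc))))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * math.log10(conc))))


def p_max_and_p_avg(concentrations: dict, coefficients: dict):
    """Maximum and mean of the per-substance probabilities for one sample."""
    probs = [logistic_probability(concentrations[s], *coefficients[s])
             for s in concentrations if s in coefficients]
    return max(probs), sum(probs) / len(probs)


def compare_by_bin(samples):
    """samples: iterable of (predicted probability, is_toxic) pairs.
    Returns {label: (n, mean predicted, observed fraction toxic)}."""
    samples = list(samples)
    table = {}
    for label, low, high in BINS:
        members = [(p, t) for p, t in samples
                   if low < p <= high or (low == 0.0 and p == 0.0)]
        n = len(members)
        if n == 0:
            table[label] = (0, None, None)
            continue
        mean_pred = sum(p for p, _ in members) / n
        observed = sum(1 for _, t in members if t) / n
        table[label] = (n, mean_pred, observed)
    return table


# Hypothetical coefficients and concentrations, for illustration only.
coefficients = {"a": (0.0, 1.0), "b": (0.0, 0.0)}
print(p_max_and_p_avg({"a": 10.0, "b": 10.0}, coefficients))
print(compare_by_bin([(0.2, False), (0.3, False), (0.4, True), (0.6, True), (0.9, True)]))
```

Comparing the mean predicted probability with the observed fraction of toxic samples in each bin gives the kind of predicted-versus-observed summary described in the text.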

Results and Discussion

The results of the predictive ability evaluation are presented separately for the three types of whole-sediment toxicity tests considered: 10-day toxicity tests with the amphipod H. azteca, 28-day toxicity tests with H. azteca, and 10-day toxicity tests with the amphipod A. abdita. Subsequently, the implications of these results relative to the selection of a chemical mixture model for assessing sediment-quality conditions in the Calcasieu Estuary are discussed.

Toxicity to the Amphipod H. azteca in 10-Day Exposures

The incidence of sediment toxicity within ranges of mean PEC-Qs provides useful information for assessing the predictive ability of the SQGs. Importantly, the results of this analysis indicated that the incidence of toxicity in 10-day whole-sediment toxicity tests with the amphipod H. azteca (endpoints = survival or growth) increased consistently and markedly with increasing concentrations of COPCs in Calcasieu Estuary sediments (e.g., as indicated by mean PEC-Qs; n = 127; Table 3). At mean PEC-Qs <0.1, the incidence of toxicity was 25% (n = 40). The incidence of sediment toxicity increased to 46% at mean PEC-Qs 0.1 to <1.0 (n = 80). Greater mean PEC-Qs were associated with a greater incidence of toxicity to H. azteca (e.g., 86% for mean PEC-Qs ≥1.0 [n = 7] and 100% for mean PEC-Qs ≥5.0 [n = 2]). Overall, the incidence of toxicity (10-day survival or growth) in the estuary-specific data set for H. azteca was 42% (n = 127). By comparison, the incidence of toxicity to H. azteca (e.g., based on the results of 10- to 14-day whole-sediment toxicity tests) in the national database was 18% for mean PEC-Qs <0.1 (n = 147), 20% for mean PEC-Qs 0.1 to <1.0 (n = 361), 54% for mean PEC-Qs ≥1.0 (n = 162), and 71% for mean PEC-Qs ≥5.0 (n = 70; Table 3; USEPA 2000b). Overall, the incidence of toxicity in the national data set was 28% (n = 670). These results show that the incidence of toxicity to H. azteca exposed to Calcasieu Estuary sediments for 10 days was generally similar to that in the national database at low PEC-Qs (e.g., <0.1). However, Calcasieu Estuary sediments tended to exhibit more toxicity to H. azteca at greater PEC-Qs (e.g., ≥0.1) than was the case for the sediment samples represented in the national database.

Table 3 Incidence of sediment toxicity to H. azteca within ranges of mean PEC-Qs for sediments from the Calcasieu Estuary and elsewhere in the United States (from USEPA 2000b)

The concentration-response relations developed using the information contained in the national database provide a basis for understanding how the incidence of toxicity is likely to change with increases in the concentrations of sediment-associated contaminants (USEPA 2000b). Based on the results of the analyses conducted using the matching sediment chemistry and 10-day toxicity data with H. azteca from the Calcasieu Estuary, the estuary-specific concentration-response curves (e.g., Fig. 1 for survival and Fig. 2 for survival or growth) generally exceeded the 95% prediction limits for the concentration-response relation generated using the information contained in the national database. Furthermore, comparison of the models using an F test shows that the Calcasieu Estuary data are significantly different from those in the national database (F0.05(4,40) = 7.03 [p < 0.001] and F0.05(4,40) = 7.40 [p < 0.001], respectively). When taken together with the information on the incidence of toxicity within ranges of SQG-Qs, it is apparent that sediment samples from the Calcasieu Estuary tend to be more toxic to H. azteca in 10-day exposures than those with similar levels of contamination from elsewhere in the United States.

Fig. 1 Relation between the geometric mean of the mean PEC-Q and the incidence of 10- to 14-day toxicity to H. azteca (endpoint = survival) in the national and Calcasieu Estuary databases (DB)

Fig. 2 Relation between the geometric mean of the mean PEC-Q and the incidence of 10- to 14-day toxicity to H. azteca (endpoint = survival or growth) in the national and Calcasieu Estuary databases (DB)

The national LRMs also provide a basis for predicting 10-day toxicity to the amphipod H. azteca (e.g., by applying the national P-Max and P-Avg models to the sediment chemistry data for the Calcasieu Estuary; Field et al. 2002). The results of this evaluation indicate that P-Avg values ranged from 0.28 to 0.82 and P-Max values ranged from 0.29 to 0.90 for the 127 samples considered in this evaluation. The predicted incidence of toxicity to amphipods (e.g., using the LRM for marine amphipods) and observed incidence of toxicity (based on results of 10-day toxicity tests with H. azteca) for each of the four ranges of P-Max values are presented in Fig. 3, and those for the four ranges of P-Avg values are presented in Fig. 4. At probabilities >0.5, the predicted and observed incidence of toxicity increased with increasing concentrations of sediment-associated COPCs for both the P-Max and P-Avg models. However, better agreement was evident between the observed and predicted incidence of toxicity using the P-Avg model (Fig. 4). These results indicate that the toxicity of sediment samples from the Calcasieu Estuary to H. azteca in 10-day toxicity tests tends to be similar to that of sediment samples with similar levels of COPCs from elsewhere in North America based on the information contained in the national database for marine amphipods.

Fig. 3

Average predicted and proportion toxic within probability quartiles, using the P-Max model, based on the survival or growth of H. azteca in 10-day toxicity tests

Fig. 4

Average predicted and proportion toxic within probability quartiles, using the P-Avg model, based on the survival or growth of H. azteca in 10-day toxicity tests

Toxicity to the Amphipod H. azteca in 28-Day Exposures

The incidence of toxicity to H. azteca in 28-day exposures increases consistently and markedly with increasing concentrations of COPCs (e.g., as indicated by mean PEC-Qs, n = 100; Table 3). Considering either survival or growth, the incidence of sediment toxicity was low (e.g., 6%; n = 34) at mean PEC-Qs <0.1 (Table 3). The incidence of toxicity was greater (e.g., 30%; n = 60) at mean PEC-Qs 0.1 to <1.0. By comparison, amphipod survival or growth was significantly decreased in 67% of the sediment samples with mean PEC-Qs ≥1.0 (n = 6). Overall, the incidence of toxicity (28-day survival or growth) in the estuary-specific data set was 24% (n = 100). Using the information contained in the national database, the incidence of toxicity to H. azteca in 28-day exposures was determined to be 10% (n = 63) at mean PEC-Qs <0.1, 30% (n = 66) at mean PEC-Qs 0.1 to <1.0, and 97% (n = 31) at mean PEC-Qs ≥1.0 (Table 3). Overall, the incidence of toxicity in the national data set was 35% (n = 160). These results indicate that the incidences of toxicity are similar for the estuary-specific and national data sets at mean PEC-Qs between 0.1 and 1.0. However, the incidence of toxicity appears to be lower in Calcasieu Estuary sediments at mean PEC-Qs ≥1.0 (e.g., 67%; n = 6) than in the sediment samples represented in the national database (97%; n = 31).
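The overall incidences reported above can be approximated from the tabulated ranges by weighting each range by its sample size. The short Python sketch below does this using the rounded percentages and sample sizes quoted in this section. Because the percentages are rounded, the pooled values are approximate, and the function name is illustrative only.

```python
def pooled_incidence(bins):
    """Pool per-range incidences.

    bins: list of (n_samples, percent_toxic) for each mean PEC-Q range.
    Returns (total number of samples, pooled percent toxic).
    """
    total_n = sum(n for n, _ in bins)
    toxic = sum(n * pct / 100.0 for n, pct in bins)  # estimated toxic samples
    return total_n, 100.0 * toxic / total_n


# (n, % toxic) at mean PEC-Qs <0.1, 0.1 to <1.0, and >=1.0, as quoted above
estuary = [(34, 6), (60, 30), (6, 67)]
national = [(63, 10), (66, 30), (31, 97)]

for label, bins in (("Calcasieu Estuary", estuary), ("National", national)):
    n, pct = pooled_incidence(bins)
    print(f"{label}: n = {n}, pooled incidence ~ {pct:.0f}%")
```

With these inputs, the three estuary-specific ranges sum to 100 samples with a pooled incidence of about 24%, and the three national ranges sum to 160 samples with a pooled incidence of about 35%.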

Compared with the tabulated incidence of toxicity results, examination of the estuary-specific concentration-response curves indicates that 28-day survival of H. azteca is decreased at lower levels of sediment-associated COPCs in the Calcasieu Estuary than is the case for sediment samples from elsewhere in the United States (Fig. 5; Table 3; F0.05(4,12) = 5.98, p = 0.007), particularly at moderate PEC-Qs (e.g., 0.3 to 1.0). However, such differences were less apparent when survival or growth was considered (Fig. 6; the concentration-response relation for survival or growth generated using data from the Calcasieu Estuary generally falls within the 95% prediction limits for the national relation; F0.05(4,12) = 1.39, p = 0.294). Therefore, it is reasonable to conclude that the differences between the national and estuary-specific concentration-response relations are relatively minor when toxicity to H. azteca in 28-day exposures is considered.
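One common way to test whether two data sets share a single concentration-response curve is an extra-sum-of-squares F test, which compares the residual sum of squares of one pooled fit with that of separate fits to each data set. The resulting statistic has the same F(df1, df2) form as the values reported above, although whether this exact procedure was used here is an assumption. The Python sketch below computes the statistic; the residual sums of squares and degrees of freedom are hypothetical.

```python
def extra_ss_f(sse_pooled, df_pooled, sse_separate, df_separate):
    """Extra-sum-of-squares F statistic for nested model fits.

    sse_pooled, df_pooled: residual sum of squares and residual degrees of
        freedom when one curve is fitted to both data sets combined.
    sse_separate, df_separate: the same quantities when each data set
        receives its own curve (sums over the two fits).
    Returns (F, numerator df, denominator df).
    """
    df_num = df_pooled - df_separate
    numerator = (sse_pooled - sse_separate) / df_num
    denominator = sse_separate / df_separate
    return numerator / denominator, df_num, df_separate


# Hypothetical values chosen only to illustrate the arithmetic
f_stat, df1, df2 = extra_ss_f(sse_pooled=2.0, df_pooled=44,
                              sse_separate=1.0, df_separate=40)
print(f"F({df1},{df2}) = {f_stat:.2f}")
```

A large F relative to the critical value for the stated degrees of freedom indicates that separate curves describe the data significantly better than a single pooled curve.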

Fig. 5

Relation between the geometric mean of the mean PEC-Q and the incidence of 28- to 42-day toxicity to H. azteca (endpoint = survival) in the national and Calcasieu Estuary databases (DB)

Fig. 6

Relation between the geometric mean of the mean PEC-Q and the incidence of 28- to 42-day toxicity to H. azteca (endpoint = survival or growth) in the national and Calcasieu Estuary databases (DB)

The national LRMs can also be used to predict toxicity to the amphipod H. azteca in 28-day exposures to Calcasieu Estuary sediments. The predicted and observed incidences of toxicity for each of the four ranges of P-Avg values are presented in Fig. 7. The predicted and observed incidences of toxicity increased with increasing probability range (increasing chemistry) for both the P-Max model (data not shown) and the P-Avg model. However, better agreement was evident between the observed and predicted toxic samples using the P-Avg model. These results indicate that the toxicity of sediment samples from the Calcasieu Estuary to H. azteca in 28-day exposures tends to be similar to that for sediment samples from elsewhere in the United States with similar levels of COPCs (e.g., considering toxicity to marine amphipods in 10-day exposures). Nevertheless, both models tended to overpredict toxicity to H. azteca exposed for 28 days to Calcasieu Estuary sediments.

Fig. 7

Average predicted and proportion toxic within probability quartiles, using the P-Avg model, based on the survival or growth of H. azteca in 28-day toxicity tests

Toxicity to the Amphipod A. abdita in 10-Day Exposures

The incidence of toxicity to A. abdita in 10-day exposures was greater than that for H. azteca in 10-day exposures. More specifically, the incidence of toxicity to A. abdita was 61% in sediments at mean ERM-Qs <0.1 (n = 84; Table 4). At mean ERM-Qs 0.1 to <1.0, the incidence of sediment toxicity increased to 81% (n = 68). The incidence of toxicity was greater still (e.g., 100%) in sediments with mean ERM-Qs ≥1.0 (n = 13). Overall, the incidence of toxicity in the estuary-specific data set was 72% (119 of 165 sediment samples). By comparison, the incidences of toxicity to A. abdita (e.g., based on the results of 10-day whole-sediment toxicity tests with the survival endpoint) in the national database were 25% for mean ERM-Qs <0.1 (n = 1033), 49% for mean ERM-Qs 0.1 to <1.0 (n = 1169), and 77% for mean ERM-Qs ≥1.0 (n = 140; Table 4). Overall, the incidence of toxicity in the national data set was 40% (935 of 2342 sediment samples). These results indicate that sediment samples from the Calcasieu Estuary were more toxic to A. abdita than were sediment samples from elsewhere in the United States tested with marine amphipods (e.g., at comparable levels of contamination).

Table 4 Incidence of sediment toxicity to A. abdita within ranges of mean ERM-Qs for sediments from the Calcasieu Estuary and elsewhere in the United States

The matching sediment chemistry and toxicity data contained in the national database were also used to develop concentration-response relations for A. abdita (e.g., for mean ERM-Qs or mean PEC-Qs). These relations indicate that the incidence of toxicity to A. abdita in the national database was positively correlated with mean ERM-Qs (r² = 0.82; Fig. 8) and mean PEC-Qs (r² = 0.80; Fig. 9). The incidence of toxicity in the A. abdita tests with Calcasieu Estuary sediments also increased with increasing mean ERM-Qs (r² = 0.70; Fig. 8) and with increasing mean PEC-Qs (r² = 0.51; Fig. 9). The results of these analyses indicate that the relation between the incidence of toxicity and the SQG-Qs is more variable in the Calcasieu Estuary sediments than in the national database. Additionally, the incidence of toxicity to A. abdita was greater in Calcasieu Estuary sediments than it was in sediments from elsewhere in the United States (e.g., with similar levels of contamination; F0.05(4,17) = 10.20 [p < 0.001] and F0.05(4,17) = 9.09 [p < 0.001], respectively).

Fig. 8

Relation between the geometric mean of mean ERM-Qs and the incidence of 10-day toxicity to A. abdita or R. abronius (endpoint = survival) in the national and the incidence of 10-day toxicity to A. abdita (endpoint = survival) in the Calcasieu Estuary databases (DB)

Fig. 9

Relationship between the geometric mean of mean PEC-Q and the incidence of 10-day toxicity to A. abdita or R. abronius (endpoint = survival) in the national database and the incidence of 10-day toxicity to A. abdita (endpoint = survival) in the Calcasieu databases (DB)

The national LRMs (e.g., P-Avg models) were used to predict the incidence of toxicity to A. abdita exposed to Calcasieu Estuary sediments for each of the four ranges of P-Avg values (Fig. 10). These results show that the predicted and observed incidences of toxicity for Calcasieu Estuary sediment samples increased with increasing probability range (increasing chemistry) for both the P-Max model (not shown) and the P-Avg model. However, better agreement was evident between the observed and predicted toxic samples using the P-Avg model. These results also indicate that sediment samples from the Calcasieu Estuary tended to be more toxic than sediment samples with similar levels of COPCs from elsewhere in the United States.

Fig. 10

Average predicted and proportion toxic within probability quartiles, using the P-Avg model, based on the survival of A. abdita in 10-day toxicity tests

Implications for the Calcasieu Estuary BERA

The results of the predictive ability evaluation indicate that the incidence of sediment toxicity in Calcasieu Estuary sediments generally increases with increasing concentrations of sediment-associated contaminants. This general relation was apparent for all of the toxicity test endpoints considered and all of the chemical mixture models applied (e.g., mean PEC-Qs, mean ERM-Qs, P-Avg, and P-Max). The strongest relations between incidence of toxicity and level of contamination were observed for the results of the 28-day toxicity tests with H. azteca, as evaluated using the mean PEC-Qs or the P-Avg model (Figs. 5, 6, and 7).

The concentration-response relations derived using the estuary-specific data sets (Table 5) provide a basis for calculating point estimates of sediment toxicity (e.g., P50 values, which represent the level of chemical contamination associated with a 50% incidence of toxicity). For the 10-day toxicity tests with H. azteca, the site-specific P50 values (expressed as mean PEC-Qs) for survival (0.46) and for survival or growth (0.24) were substantially lower than the P50 values that were determined using the sediment samples represented in the national database (e.g., 4.5 for survival and 3.4 for survival or growth; Figs. 1 and 2; USEPA 2000b). Similarly, the P50 values calculated for survival (0.44) and survival or growth (0.37) using the results of the 28-day toxicity tests with H. azteca (conducted with Calcasieu Estuary sediments) were also lower than those that were calculated using the national database (e.g., 3.2 for survival and 0.63 for survival or growth; Figs. 5 and 6; USEPA 2000b). For A. abdita, the high incidence of toxicity observed in the toxicity tests conducted using sediments from the Calcasieu Estuary precluded the calculation of a P50 value using mean ERM-Qs (Fig. 8) or mean PEC-Qs (Fig. 9). However, examination of the data plots indicates that the P50 value would be less than a mean ERM-Q or mean PEC-Q of 0.05 (Figs. 8 and 9). These values are substantially lower than the P50 value of 0.28 (e.g., for ERM-Qs) calculated using the national database (Fig. 8; MacDonald et al. 2004). These results indicate that sediment samples from the Calcasieu Estuary tend to be more toxic than those with similar levels of contamination from elsewhere in the United States.
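To illustrate how a P50 value follows from a fitted concentration-response curve, the Python sketch below inverts a simple two-parameter logistic model in which the incidence of toxicity depends on the log10 of the mean SQG-Q. The model form, the function names, and the coefficients are assumptions made for illustration only; they are not the regression models or parameter estimates summarized in Table 5.

```python
import math


def incidence(quotient, b0, b1):
    """Predicted incidence of toxicity (0 to 1) at a given mean SQG-Q.

    Assumes a logistic curve in log10(quotient) with hypothetical
    intercept b0 and slope b1.
    """
    z = b0 + b1 * math.log10(quotient)
    return 1.0 / (1.0 + math.exp(-z))


def p50(b0, b1):
    """Mean SQG-Q at which the assumed curve predicts 50% incidence.

    Setting b0 + b1 * log10(x) = 0 gives x = 10 ** (-b0 / b1).
    """
    return 10 ** (-b0 / b1)


# Hypothetical coefficients; a steeper or left-shifted curve lowers the P50
b0, b1 = 1.0, 2.0
x50 = p50(b0, b1)
print(f"P50 = {x50:.3f}, incidence at P50 = {incidence(x50, b0, b1):.2f}")
```

Under this assumed form, a lower P50 means that a 50% incidence of toxicity is reached at a lower level of contamination, which is the sense in which the site-specific and national values are compared above.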

Table 5 Regression summaries for concentration-response curves developed using data collected from the Calcasieu Estuary and from the national database

Because comparison of whole-sediment chemistry to chemical benchmarks represents an essential element of the overall risk-characterization process for the Calcasieu Estuary BERA, it is important to examine the possible reasons for the apparent enhanced toxicity of Calcasieu Estuary sediments. Quality-assurance measures, implemented in conjunction with the whole-sediment toxicity tests, provide a basis for identifying possible causes of enhanced toxicity. More specifically, negative and positive control results provide a basis for evaluating the validity and reliability of toxicity tests. However, all three tests had negative control survival and/or growth rates that were within acceptable ranges, and the reference toxicity test results were consistent with the results that had been obtained previously by the two toxicity testing laboratories (MESL 2001; Harding ESE 2001). Furthermore, there was no evidence of laboratory contamination in the toxicity data (e.g., replicates within treatments did not exhibit anomalous results). As such, the enhanced toxicity could not be attributed either to unacceptable conditions during the tests or to increased sensitivity of the organisms used in the exposures relative to the sensitivities of test organisms used in other studies by these two testing laboratories.

Information on the incidence of toxicity in sediment samples with low mean PEC-Qs or ERM-Qs also provides a basis for investigating the factors contributing to the enhanced toxicity observed in sediment samples from the Calcasieu Estuary. Based on the data contained in the estuary-specific database, the highest incidence of toxicity in relatively clean sediment samples (e.g., mean PEC-Qs or ERM-Qs <0.1) was observed in the 10-day toxicity tests with A. abdita (e.g., 61%; n = 84 for the survival endpoint; Table 4). By comparison, the incidence of toxicity in relatively clean sediment samples was substantially lower for the 10- (25%; n = 40) and 28-day (3%; n = 34) toxicity tests with H. azteca when the survival endpoint was considered (Table 3).

Although the results of the 28-day toxicity tests conducted on Calcasieu Estuary sediments using H. azteca are similar to those obtained for sediment samples from elsewhere in the United States, the incidences of toxicity observed in the 10-day tests with H. azteca and A. abdita were greater than those determined using the national databases. These results suggest that some factors, in addition to the concentrations of metals, PAHs, PCBs, and/or organochlorine pesticides (e.g., which were considered in the calculation of mean SQG-Qs), contributed to the toxicity observed for H. azteca or A. abdita. However, the results of Spearman rank correlation analyses indicated that the levels of sand, silt, clay, total ammonia, unionized ammonia, pH, salinity, water hardness, dissolved oxygen, dissolved organic carbon, TOC, and moisture were not significantly negatively correlated with the survival of H. azteca or A. abdita. As such, it is unlikely that these factors explain the apparent enhanced toxicity of Calcasieu Estuary sediments to H. azteca or A. abdita in 10-day tests. Hence, it is likely that the levels of other contaminants that were frequently detected in sediment samples but not included in the mean SQG-Q calculations (e.g., hexachlorobutadiene, which exceeded generic sediment benchmarks in certain samples) and/or the levels of unmeasured contaminants contributed to enhanced toxicity. Consequently, predictions of sediment toxicity using the chemical mixture models have the potential to underestimate the actual toxicity of Calcasieu Estuary sediments to amphipods.
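The reason that substances omitted from the mixture models can lead to underestimates of toxicity is apparent from the way a mean SQG-Q is calculated. In its simplest form, a mean SQG-Q is the average of the ratios of measured concentrations to their corresponding guideline values; the published procedures may group or normalize substances differently. The Python sketch below uses this simplified form. The guideline values, concentrations, and function name are hypothetical and are not actual PECs, ERMs, or Calcasieu Estuary measurements.

```python
def mean_sqg_quotient(concentrations, guidelines):
    """Average of concentration/guideline ratios for substances with a guideline.

    Substances that were measured but have no entry in `guidelines`
    do not influence the result.
    """
    quotients = [
        concentrations[name] / value
        for name, value in guidelines.items()
        if name in concentrations
    ]
    return sum(quotients) / len(quotients)


# Hypothetical guideline values (same units as the concentrations)
guidelines = {"copper": 100.0, "lead": 50.0, "total_PAHs": 20.0}

# Hypothetical sample, without and with a substance lacking a guideline
sample = {"copper": 50.0, "lead": 25.0, "total_PAHs": 4.0}
sample_with_hcbd = dict(sample, hexachlorobutadiene=10.0)

q_without = mean_sqg_quotient(sample, guidelines)
q_with = mean_sqg_quotient(sample_with_hcbd, guidelines)
print(f"mean SQG-Q without HCBD: {q_without:.2f}; with HCBD: {q_with:.2f}")
```

Both calls return the same quotient (approximately 0.4 for these inputs), because a substance without a guideline in the calculation cannot raise the predicted level of contamination, however toxic it may be.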

Conclusion

Numerical SQGs are frequently used to support large-scale investigations of sediment-quality conditions at contaminated sites with multiple sources and types of contamination. Because the scope and costs of management actions to address risks to benthic invertebrates can be substantial, it is appropriate to determine whether such tools provide an accurate basis for classifying sediment samples relative to the risks that they pose to benthic invertebrates. In this study, an evaluation of three chemical mixture models (considering multiple toxicity test endpoints) was conducted to determine their potential application for assessing sediment-quality conditions in the Calcasieu Estuary (e.g., as part of a BERA), including mean PEC-Qs (USEPA 2000b), mean ERM-Qs (Long et al. 1998a, b; Long and MacDonald 1998), and LRMs (Field et al. 2002). The results of this evaluation indicate that all three models tend to underestimate the 10-day toxicity of COPC mixtures in sediment samples from the Calcasieu Estuary. It is likely that chemical factors, in addition to the concentrations of metals, PAHs, PCBs, and/or organochlorine pesticides, contributed to the 10-day toxicity of whole-sediment samples from the Calcasieu Estuary to amphipods. Importantly, the mean PEC-Q model derived using data from Calcasieu Estuary sediments generally provided a more reliable basis for predicting toxicity to H. azteca in sediment samples from the study area than did the models derived using the national databases for freshwater or marine amphipods.

The results of the predictive ability evaluation provide important insights for selecting a chemical mixture model for use in the Calcasieu Estuary BERA (e.g., for predicting the incidence of toxicity to sediment-dwelling organisms using estuary-specific sediment chemistry data) or at other sites. More specifically, the results of the predictive ability evaluation showed that the site-specific mean PEC-Q models generated with exposures of H. azteca and A. abdita would provide the most appropriate tools for predicting toxicity to benthic invertebrates in the Calcasieu Estuary. The concentration-response relations (for both incidence and magnitude of toxicity) that were developed to support the Calcasieu Estuary BERA are described in a companion article (MacDonald et al. 2010b).