Introduction

The detection and assessment of nutrient enrichment is an important environmental and socio-economic issue in the management and conservation of coastal and marine waters. The primary biological response to nutrient enrichment in aquatic environments, given suitable environmental conditions (such as light availability and water temperatures), is the growth of phytoplankton and higher plants (Fig. 1). Known consequences of nutrient enrichment include increased primary production, increased biomass of primary producers such as phytoplankton (indicated by concentrations of chlorophyll-a [hereafter Chl-a]), and depletion of dissolved oxygen (hereafter DO) due to decomposition of accumulated biomass, resulting in local hypoxic or anoxic conditions. Other consequences can include shifts in species composition, blooms of nuisance and toxic algae and macroalgae, increased growth of epiphytic algae, red tides, water discolouration and foaming, loss of submerged aquatic vegetation (hereafter SAV) due to shading, and changes in benthic community structure due to oxygen deficiency or the presence of toxic phytoplankton species (Tett 1987; Gillbricht 1988; Lancelot et al. 1987; Boynton et al. 1996; Bricker et al. 1999, 2003, 2007; Smayda and Reynolds 2001). The impacts of nutrient enrichment can range from changes in single cell production to major trophic shifts. It is not feasible to monitor or assess all parts of a marine system to identify all impacts, so a selection of indicators is required that will adequately describe the eutrophic status of the environment.

Fig. 1

Potential impacts of anthropogenic nutrient enrichment in coastal and marine waters. a Summary of primary and secondary impacts, and consequences of symptoms. Secondary impacts indicate undesirable disturbance to the system. b Gradient of secondary impacts, ranging from no problem/low (best) to high (worst) in response to influencing factors. N nitrogen, P phosphorus, SAV submerged aquatic vegetation, DO dissolved oxygen (from Bricker et al. 2007, 2008)

The development of classification systems and ecological assessment tools is an important and technically challenging aspect of assessing the consequences of nutrient enrichment. Assessments typically combine a selection of key indicators that allow a reasonable evaluation of the overall status of eutrophication in coastal and marine waters, so that managers and policy makers can make decisions about the mitigation of problems linked to nutrient enrichment. In recent years, a number of assessment methods have been developed and used to assess the eutrophication status of coastal and marine waters for management purposes. Historically, in both the United States (US) and European Union (EU), the management and protection of coastal and marine environments from impacts of nutrient enrichment has been the responsibility of local and national agencies.

In the EU, water-related directives have set objectives for the protection and maintenance of EU coastal and marine water quality. Various EU directives (such as the Urban Waste Water Treatment Directive [UWWTD, CEC 1991a], the Nitrates Directive [ND, CEC 1991b], the Habitats Directive [HD, CEC 1992], the Water Framework Directive [WFD, CEC 2000] and the Marine Strategy Framework Directive [MSFD, CEC 2008]), the Oslo Paris Convention (OSPAR 2003a, b), the Helsinki Commission (HELCOM, Andersen and Laamanen 2009) and TRIX (Vollenweider et al. 1998) all consider the assessment of eutrophication through measurement of key indicators such as concentrations of nutrients, Chl-a and DO (Table 1). Some of the more recent directives (WFD, MSFD) and OSPAR include the additional identification of secondary impacts and undesirable disturbance to the ecosystem (e.g. low DO events, toxic algal blooms).

Table 1 Key indicators used by the different approaches for assessing eutrophic conditions

In the US, the Environmental Protection Agency (EPA) generally delegates responsibility to state water management bodies to implement standards and criteria to assess compliance with the Clean Water Act of 1972. This Act is the primary federal water protection law in the US, with goals of eliminating discharge of contaminants to surface waters and ensuring that they meet standards necessary for human sports and recreation. The EPA and the National Oceanic and Atmospheric Administration (NOAA) support regulatory decisions by providing research and assessment results (Borja et al. 2008). The data and information required by the US Clean Water Act, Coastal Zone Management Act, Harmful Algal Bloom and Hypoxia Research and Control Act and Oceans Act are the basis for the development and application of the ASSETS (Bricker et al. 1999, 2003, 2007; Xiao et al. 2007; Ferreira et al. 2007) and EPA NCA (USEPA 2001a, 2005, 2008) assessment methods. Both methods compare measurements of key indicators to reference conditions to provide information on pollutants of concern.

The overall aim of this study was to apply these approaches to two estuarine systems in the United Kingdom (UK) and to test the hypothesis that they would all result in the same final assessment of eutrophication status when applied to the same data. The intent was to make a side-by-side comparison of these methods to provide managers with information they could use to select one or more of the methods for use, or to improve methods already in use. The Thames and Medway estuaries, located on the east coast of England (Fig. 2), were selected as test areas to apply and compare results of the five eutrophication assessment methods. Both are large, sheltered, macro-tidal, well-mixed estuaries (Rogers et al. 2003; Table 2). Data were obtained from national programmes that monitor water quality, including measurements of phytoplankton community structure, Chl-a, DO, harmful algal blooms (HABs), macroalgae and SAV.

Fig. 2

The study areas, where dots denote sites monitored under a long-term national program. a Thames estuary—indicating divisions into lower, middle and upper regions under the WFD (see scale). Data from the lower and middle regions were combined for application of the WFD, OSPAR COMPP, TRIX and EPA NCA assessments. For ASSETS, 2 salinity zones (1–25 psu and >25 psu) were evaluated separately and then combined for an estuary-wide area weighted final score. b Medway estuary, located at the lower end of the Thames. Data were combined as for the Thames

Table 2 Characteristics of the two study sites, Medway and Thames estuaries, and summary of main data used in this study

Methods

General description of study areas

The Thames estuary (248 km2) is one of the largest of 170 inlets on the coast of Great Britain and includes the area in which the River Thames meets the North Sea. This estuary has one of the largest watersheds in the UK (12,935 km2) and constitutes a major shipping route. The estuary supports regions of mudflats, open beaches, salt marshes and over 900 km of chalk stream systems. The Thames river basin district also contains 40 special areas of conservation and nearly 500 Sites of Special Scientific Interest, which identify the very best wildlife and geological sites in the UK, with a high level of protection and management (www.sssi.naturalengland.org.uk/). However, much of the watershed is dominated by urban areas including parts of London, which has a population of over 13 million people (Table 2). The Thames estuary is impacted by pollution as a result of the high population density. In addition, there are five main sewage treatment works and over 40 combined sewer overflows entering the Thames estuary. To the west, the watershed is dominated by agriculture and experiences a large volume of land run-off.

The Medway estuary is smaller than the Thames (57 km2) with a much smaller human population (Table 2), and consequently is subject to lower urban and agricultural pressures than the Thames. The shoreline is deeply indented with many islands, salt marshes, and large areas of brackish marshes. Like the Thames, the Medway serves as a shipping route.

Data description

To apply the five assessments to the study estuaries we used water quality data from the UK Environment Agency database containing long-term records of nutrients, Chl-a and DO. Consistent national standards for marine environmental quality were developed for UK waters and regular sampling at a network of estuarine, coastal and offshore monitoring stations (the National Marine Monitoring Programme, NMMP) was implemented in 1987. Sixty-eight sampling sites are located along the estuarine gradient in the Thames Estuary and twenty are located in the Medway Estuary (Fig. 2). Nutrients, Chl-a, phytoplankton counts and DO data obtained at least monthly between 2000 and 2007 were used in this study. Although the frequency of sampling was variable among these sites, there were >2500 Chl-a and >1600 DO data points for the Medway Estuary, and >1800 Chl-a and >2800 DO data points for the Thames Estuary (Table 2; Fig. 3). Data from sites where salinity was <1 psu were not included to avoid potential bias from freshwater species and sampling and analytical inconsistencies. Data required by the EPA NCA method were subsampled from the database during the index period to reproduce the sampling requirements of one sample per year.
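The subsampling step for the EPA NCA method can be sketched in Python as follows. This is a minimal illustration only: the function name, the record layout and the example values are hypothetical, the index period is taken as June through September (as described for the EPA NCA method), and random selection of the single yearly sample is an assumption, because the text does not state how that sample was chosen.

```python
import random
from datetime import date

INDEX_MONTHS = {6, 7, 8, 9}  # June through September (summer index period)

def subsample_index_period(records, seed=0):
    """Pick one record per year from those sampled in the index period.

    records: iterable of (date, value) tuples for a single site.
    Random selection is an assumption; the selection rule is not stated
    in the text.
    """
    rng = random.Random(seed)
    by_year = {}
    for day, value in records:
        if day.month in INDEX_MONTHS:
            by_year.setdefault(day.year, []).append((day, value))
    return {year: rng.choice(rows) for year, rows in sorted(by_year.items())}

# Hypothetical Chl-a records (date, concentration) for one site.
records = [
    (date(2000, 1, 15), 2.1),
    (date(2000, 6, 20), 14.0),
    (date(2000, 8, 3), 9.5),
    (date(2001, 7, 11), 22.3),
    (date(2001, 11, 2), 3.0),
]
picked = subsample_index_period(records)
```

A fixed seed keeps the subsample reproducible, which matters when the same reduced dataset has to be compared against the full annual record.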

Fig. 3

Data (2000–2007) for concentrations of dissolved inorganic nitrogen (DIN), chlorophyll (Chl-a) and dissolved oxygen (DO) in the Thames (T1–T6) and Medway (M1–M6) estuaries. Data shown per year (T1, T3, T5 and M1, M3, M5) include all data from all stations sampled (Fig. 2), at salinities >1 psu. Data are also shown by month (T2, T4, T6 and M2, M4, M6) where Jan–Dec = 1–12. Note that data points may represent many superimposed samples. Very high values for DIN and Chl-a typically represent the low salinity end of each estuary, where nutrients enter the systems and freshwater phytoplankton dominate

Data on macroalgae and SAV were obtained from the Environment Agency and historical monitoring sources (Butcher 1941; Hughes and Paramor 2004; www.uktag.co.uk). Data on HABs were obtained directly from the Environment Agency and from other monitoring programmes (e.g. the shellfish monitoring programme of the UK Food Standards Agency, FSA) via the Centre for Environment, Fisheries and Aquaculture Science (CEFAS).

Overview of assessments

Five methods were used to evaluate the eutrophication status of the Thames and Medway estuaries in the UK: the WFD as used in the UK (UK WFD), the OSPAR Comprehensive Procedure (OSPAR COMPP), the Trophic Index (TRIX), the EPA National Coastal Assessment (EPA NCA), and the Assessment of Estuarine Trophic Status (ASSETS). These methods are described briefly below, and in greater detail in Tables 3, 4 and 5. The application of these methods to the Medway estuary using data from 2000 to 2007 is demonstrated in Tables 6, 7, 8, 9 and 10 (see “Results and discussion”).

Table 3 Details of approaches used for assessment of eutrophication status, including indicator variables, terminology for grouping of variables (e.g. Primary Symptoms are equivalent to Direct Effects), time and space scales for data analysis, and combination methods
Table 4 Summary of procedures used for evaluating the eutrophic status of estuarine and coastal waters and categories used for final classification
Table 5 Thresholds for concentrations of chlorophyll-a (Chl-a), dissolved oxygen (DO) and dissolved inorganic nitrogen (DIN) for the four assessments
Table 6 Results for the Medway estuary from application of the ASSETS method to annual data, 2000–2007
Table 7 Results for the Medway estuary from application of the EPA NCA method and data from the summer index period, 2000–2007
Table 8 Results for the Medway estuary from application of the UK WFD approach to annual and seasonal data, 2000–2007
Table 9 Results for the Medway from application of the OSPAR COMPP method to annual and seasonal data 2000–2007
Table 10 Results of application of the trophic index (TRIX) to data for the Medway estuary, 2000–2007

WFD (UK)

The overall aim of the WFD is to achieve good ecological status in all EU member state waterbodies by 2015, where good status represents a slight (no more than 50%) deviation from reference conditions. Assessments are carried out at a waterbody level where waterbodies are differentiated by typology. Specific reference conditions have been developed for each type of system where waterbody type is defined by characteristics including tidal range, mixing, exposure and salinity (Tables 3, 4 and 5). Each EU member state is required to adopt the WFD assessment process, though the selection of waterbody types, reference conditions, specific indicator variables and assessment methods can differ among member states (Vincent et al. 2002; Anon 2009). Each state must report the ecological status of each biological quality element (BQE) (including marine plants, benthic invertebrates and transitional fish). The WFD assessment process as used in the UK for eutrophication assessment and presented here for the marine plants component, focuses primarily on nutrient concentrations and marine primary producers, including phytoplankton (biomass, abundance and composition), macroalgae (biomass and areal extent) and angiosperms (biomass and areal extent) (Devlin et al. 2007a, b; Painting et al. 2007; Best et al. 2007). This approach is a shift from targets based only on water chemistry to targets based on the ecological structure of natural systems (Pollard and Huxham 1998).

The WFD requires the classification of all surface waterbodies into one of five ecological status classes: High, Good, Moderate, Poor or Bad. The ecological status of each waterbody is assessed using three BQEs, each of which yields an Ecological Quality Ratio (EQR) between zero (Bad status) and one (High status). The BQEs related to eutrophication assessment are phytoplankton, macroalgae and angiosperms; they are applied after an initial assessment of the risk of eutrophication that is based on nutrient load, turbidity, flushing time and tidal range. The final assessment also includes a measure of physico-chemical status, focusing on dissolved inorganic nitrogen (DIN) and DO measurements. The final score for each BQE within a waterbody is accompanied by a data quality measurement and a confidence of class (CofC). The CofC quantifies the uncertainty in the estimate of each EQR to determine the most probable class (the one with the highest probability of occurring) and states the level of confidence that status is good or better, or moderate or worse.

Each BQE consists of one or more indicators that measure different aspects of the biological community. For example, in the UK WFD method for estuarine waters, the phytoplankton element consists of two different indicators, Chl-a concentration and phytoplankton cell abundance (Devlin et al. 2007a, b). For each indicator, final measurements are converted into a normalised ecological quality ratio (EQR) via a two-step process. The first step converts the data to a numerical scale between zero and one, where the status class boundaries are not equidistant. The second step transforms these values into an equal-width class scale between zero and one. The final score for the phytoplankton BQE is calculated as the average of the two normalised scores from each indicator, and related to one of the five assessment classes (Table 5). This process is repeated for the indicators that make up the macroalgae and angiosperm BQEs. Overlaying this is the integration of the physico-chemical assessment (nutrient and other chemical conditions) with the final BQE assessments (Table 4). Classification of final status is determined for each waterbody using a one-out-all-out approach; if results for one BQE or the physico-chemical element are less than good (i.e. Moderate to Bad), the waterbody is assigned the worst among the ratings (Anon 2007). For example, a waterbody with a High (best) status for phytoplankton but with a Moderate status for macroalgae will be assigned a Moderate classification for eutrophication status.
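The averaging and one-out-all-out steps can be sketched in Python. This is an illustration under stated assumptions rather than the UK WFD tool itself: the function names and example EQR values are hypothetical, and the class boundaries at 0.2, 0.4, 0.6 and 0.8 are inferred from the description of five equal-width classes between zero and one.

```python
# Hypothetical sketch of WFD-style classification; names and values are illustrative.
CLASSES = ["Bad", "Poor", "Moderate", "Good", "High"]  # worst to best
BOUNDARIES = [0.2, 0.4, 0.6, 0.8]  # assumed equal-width class boundaries

def eqr_to_class(eqr):
    """Map a normalised EQR (0..1) to one of the five status classes."""
    if not 0.0 <= eqr <= 1.0:
        raise ValueError("EQR must lie between 0 and 1")
    return CLASSES[sum(eqr >= b for b in BOUNDARIES)]

def bqe_score(indicator_eqrs):
    """Average the normalised EQRs of the indicators that make up one BQE."""
    return sum(indicator_eqrs) / len(indicator_eqrs)

def one_out_all_out(element_eqrs):
    """Assign the waterbody the class of its worst-scoring element."""
    return eqr_to_class(min(element_eqrs.values()))

elements = {
    "phytoplankton": bqe_score([0.9, 0.8]),  # Chl-a and cell abundance EQRs
    "macroalgae": 0.5,
    "angiosperms": 0.7,
    "physico-chemical": 0.65,
}
status = one_out_all_out(elements)  # Moderate, driven by macroalgae
```

The example mirrors the case described above: a High phytoplankton element does not offset a Moderate macroalgae element.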

The UK WFD method is a multi-dimensional ecological assessment (i.e. not specifically for eutrophication) that also analyses fish, benthic invertebrates and pollutants. However, for the purposes of this paper, only the nutrients and marine plants associated with eutrophic pressures will be discussed.

OSPAR Comprehensive Procedure

The OSPAR Common Procedure (OSPAR CP) is designed to fulfil requirements of the OSPAR Commission, which seeks to reduce nutrient inputs by 50% relative to 1985 levels in areas where nutrient inputs are likely to cause nutrient pollution (OSPAR 2008; Claussen et al. 2009; Topcu et al. 2009). This procedure uses a two-step process: a screening procedure and a full assessment. Only areas designated as Potential Problem Area or Problem Area in the screening procedure require the second step, the application of the full OSPAR Comprehensive Procedure (COMPP, Fig. 4; Table 4). Problem Areas are defined as areas for which there is evidence of anthropogenically induced eutrophication (OSPAR 2003a). The OSPAR COMPP is applied to areas defined by salinity-based typology as estuarine (0–20 psu), coastal (30–34.5 psu) or offshore (>34.5 psu).

Fig. 4

Comparison of classification categories used by the different approaches, shown relative to WFD categories. Terminologies are different but the scaling is similar, e.g. high condition status in ASSETS is equivalent to bad condition status in the WFD. The OSPAR COMPP and TRIX methods do not use colour coding (adapted from Claussen et al. 2009). * For OSPAR COMPP, potential problem area is not represented since it is more a reflection of ‘unknown’ than of conditions between problem area and non problem area

The method includes four components: I (Causative Factors), II (Direct Effects), III (Indirect Effects) and IV (Other Possible Effects of nutrient enrichment). Nutrients are used to determine Causative Factors; Chl-a, phytoplankton, SAV, macroalgae and microphytobenthos are used to evaluate Direct Effects; DO, changes in species composition and kills of zoobenthos and/or fish are used to determine Indirect Effects; and Other Possible Effects include the presence of algal toxins (OSPAR 2002; Claussen et al. 2009). Salinity-related and/or region-specific threshold values are used (except for fish kills) to determine scores (+ or −), where (+) indicates a Problem Area, i.e. the indicator exceeds the threshold for acceptable water quality (reference value +50%, Table 3; Foden et al. 2009, this volume). Status is classified for each component using a one-out-all-out approach: if one indicator within the component indicates a Problem Area, the whole component receives a Problem Area rating. The final classification of eutrophication status is determined from the four component ratings, also using a one-out-all-out approach (Table 4).
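The scoring logic of the OSPAR COMPP can be illustrated with a short Python sketch. The function names, reference values and observed values below are hypothetical and chosen only to show the mechanics. The sketch assumes that an indicator scores (+) when it deviates by more than 50% from its reference value in the undesirable direction, and it omits the Potential Problem Area outcome.

```python
def indicator_score(value, reference, elevated_is_bad=True):
    """Return True for a (+) score: more than 50% deviation from reference."""
    if elevated_is_bad:
        return value > 1.5 * reference  # e.g. nutrients, Chl-a
    return value < 0.5 * reference      # e.g. DO, where low values are the problem

def component_is_problem(indicator_scores):
    """One-out-all-out within a component: any (+) flags the component."""
    return any(indicator_scores)

def final_classification(components):
    """One-out-all-out across the four components."""
    problem = any(component_is_problem(s) for s in components.values())
    return "Problem Area" if problem else "Non-Problem Area"

# Hypothetical values: (observed, reference) for one indicator per component.
components = {
    "I Causative Factors": [indicator_score(30.0, 25.0)],
    "II Direct Effects": [indicator_score(16.0, 10.0)],
    "III Indirect Effects": [indicator_score(5.0, 8.0, elevated_is_bad=False)],
    "IV Other Possible Effects": [False],
}
status = final_classification(components)
```

Here only the Direct Effects indicator exceeds its threshold, which is enough to classify the whole area as a Problem Area.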

TRIX

The trophic index (TRIX) is defined by a linear combination of the logarithms of four state variables: Chl-a, DO as the absolute percent deviation from saturation (aD%O), DIN and total phosphorus (TP; Tables 3, 4). These indicators are expressions of productivity (Chl-a, DO) and nutritional factors (DIN, TP). Scaling coefficients are introduced to fix the lower and upper limit values of the index to provide scores (Fig. 4) in the range of zero (scarcely productive-open sea) to ten (highly productive) TRIX units (Giovanardi and Vollenweider 2004). There is no weighting of indicators. TRIX has typically been applied for comparison of eutrophication impacts at the regional level, and has been used to compare waterbodies that varied in condition from oligotrophic to eutrophic, such as the Black, Aegean, Adriatic, Tyrrhenian and Baltic Seas (Giovanardi and Vollenweider 2004; Vascetta et al. 2004).
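As a worked illustration, the index can be computed with a few lines of Python. The scaling constants (k = 1.5, m = 1.2) and the units (Chl-a, DIN and TP in mg m−3, aD%O in percent) are quoted here from memory of Vollenweider et al. (1998) and should be verified against that source before use; the function name is hypothetical.

```python
import math

def trix(chl_a, ad_o2, din, tp, k=1.5, m=1.2):
    """Trophic index as a scaled sum of log10-transformed state variables.

    chl_a, din, tp: concentrations in mg m-3 (assumed units).
    ad_o2: absolute percent deviation of DO from saturation.
    k, m: scaling constants, assumed to be 1.5 and 1.2.
    """
    variables = (chl_a, ad_o2, din, tp)
    if any(v <= 0 for v in variables):
        raise ValueError("all state variables must be positive")
    return (sum(math.log10(v) for v in variables) + k) / m

low = trix(chl_a=1.0, ad_o2=1.0, din=1.0, tp=1.0)
high = trix(chl_a=20.0, ad_o2=30.0, din=500.0, tp=60.0)
```

Because the four variables enter as an unweighted sum of logarithms, a tenfold increase in any one of them raises the index by the same amount (1/m units).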

EPA NCA

The US EPA’s National Coastal Assessment (NCA) is implemented through a federal-state partnership, and is designed to answer questions on environmental conditions in coastal waterbodies. The results supplement the US Clean Water Act (CWA), under which waterbodies identified as not meeting state water quality criteria for designated uses require actions to correct pollution-caused impairments (USEPA 2001a, 2003, 2005, 2008). Of the five EPA NCA indices of condition in coastal waterbodies, the Water Quality Index (WQI) is the one that describes nutrient-related conditions and is the only one discussed here.

This method uses five indicators: DIN, dissolved inorganic phosphorus (DIP), Chl-a, water clarity (by Secchi depth and by comparison of light reaching the water surface and at 1 m depth) and DO (Table 3). The WQI uses the EPA Environmental Monitoring and Assessment Program’s (EMAP) probabilistic randomly selected sampling framework where samples are taken once per year by region during a summer index period (June through September; USEPA 2001a). An evaluation is made for each of the five indicators at each site by comparison with regionally defined reference conditions determined from national studies (Table 3). A combined water quality index rating is calculated for each site (Tables 3, 4), then for the region and the nation based on the ratio of individual indicators that are rated as Good, Fair or Poor (e.g. condition is Good if none are rated Poor and only one is rated Fair).
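The site-level combination rule can be sketched in Python. Only the rule for Good is stated above; the rules used here for Poor (two or more indicators rated Poor) and Fair (everything else) are assumptions added to make the example complete, and the function name and example ratings are hypothetical.

```python
def wqi_site_rating(indicator_ratings):
    """Combine five indicator ratings into a site-level WQI rating.

    Good: no indicator Poor and at most one Fair (as stated in the text).
    Poor: two or more indicators Poor (assumption).
    Fair: all remaining combinations (assumption).
    """
    ratings = list(indicator_ratings.values())
    poor = ratings.count("Poor")
    fair = ratings.count("Fair")
    if poor == 0 and fair <= 1:
        return "Good"
    if poor >= 2:
        return "Poor"
    return "Fair"

site = {
    "DIN": "Fair",
    "DIP": "Good",
    "Chl-a": "Good",
    "water clarity": "Poor",
    "DO": "Good",
}
rating = wqi_site_rating(site)
```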

ASSETS

The US National Estuarine Eutrophication Assessment method, as subsequently modified and called Assessment of Estuarine Trophic Status (ASSETS), is designed to address requirements of the US Clean Water Act and the Harmful Algal Bloom and Hypoxia Research and Control Act (Bricker et al. 1999, 2003, 2006, 2007, 2008, 2009; Ferreira et al. 2007; Xiao et al. 2007; Scavia and Bricker 2006; http://www.eutro.org/register). ASSETS has been applied to 141 US waterbodies, several waterbodies under EU WFD requirements (e.g. Ferreira et al. 2003; Nobre et al. 2005; Bricker et al. 2007), and to systems in China and Australia (Xiao et al. 2007; Bricker et al. 2006, 2007; http://www.eutro.org/syslist.aspx). The ASSETS assessment includes examination of influencing factors (nutrient inputs as they are modified by the natural hydrology of the system), eutrophic condition (nutrient-related water quality conditions), and future outlook (forecast of future conditions based on predicted changes in nutrient loads). The assessment then combines results of the three components into a single overall rating (Tables 3, 4). The eutrophic condition component is the only one that will be discussed here.

ASSETS assigns a eutrophic condition rating based on five indicators, or symptoms, that are assessed by salinity zone (tidal 0–0.5 psu, mixing 0.5–25 psu, seawater >25 psu). An area-weighted system score is determined from the salinity zone results. Ratings for Chl-a and macroalgae, considered primary symptoms, are averaged, while the worst of the three secondary symptom ratings (DO, changes in areal coverage of SAV, nuisance and toxic bloom occurrence) is selected in a precautionary approach. This approach is used because the secondary symptoms are indicative of well-developed eutrophication impacts and an average might under-represent eutrophication status. Using the worst of the three ratings guards against neglecting to apply management measures or more intensive study in cases where they are needed. Symptom scores are determined by combining the occurrence, spatial coverage and frequency of occurrence of concentrations that are considered a problem and comparing them to reference conditions and thresholds determined from national studies. Confidence in the assessment is determined as the Data Confidence and Reliability (DCR), which is based on sampling frequency and spatial distribution of available data (Bricker et al. 1999, 2003). Primary and secondary results are combined by matrix into a single system score. The ASSETS assessment results in a eutrophic condition rating that falls into one of five grades: High (worst), Moderate High, Moderate, Moderate Low, Low (best). (The ASSETS desktop program is available at http://www.eutro.org/register.)
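The aggregation steps described for ASSETS can be sketched in Python. This is not the ASSETS desktop program: the function names, the 0 to 1 symptom scale (higher meaning worse) and all example values are hypothetical, and the final matrix that combines primary and secondary results is deliberately omitted because its contents are not given here.

```python
def area_weighted(zone_scores, zone_areas):
    """Area-weighted mean of a symptom score across salinity zones."""
    total_area = sum(zone_areas.values())
    return sum(zone_scores[z] * zone_areas[z] for z in zone_areas) / total_area

def primary_score(chl_a, macroalgae):
    """Primary symptoms are averaged."""
    return (chl_a + macroalgae) / 2

def secondary_score(do, sav_loss, nuisance_toxic_blooms):
    """Secondary symptoms: the worst (highest) rating is taken as a precaution."""
    return max(do, sav_loss, nuisance_toxic_blooms)

# Hypothetical zone areas (km2) and Chl-a symptom scores on an assumed 0-1 scale.
areas = {"mixing": 30.0, "seawater": 10.0}
chl_a_system = area_weighted({"mixing": 0.5, "seawater": 1.0}, areas)

primary = primary_score(chl_a_system, macroalgae=0.25)
secondary = secondary_score(do=0.25, sav_loss=0.0, nuisance_toxic_blooms=0.5)
```

Taking the maximum for secondary symptoms means that a single well-developed impact, such as a toxic bloom problem, cannot be diluted by low scores for the other two.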

Methodological differences and their potential impacts on assessment results

Spatial considerations

The assessments presented here are normally applied over different spatial scales to provide waterbody or region-specific assessments of nutrient impacts (Fig. 5). We have ignored these delineations and applied all five assessments to the entire area of each estuary (salinity 1–34.5 psu). The OSPAR COMPP, for example, is applied to estuarine, coastal and offshore waters, some of which are outside the boundary of areas typically evaluated by the WFD. The TRIX and EPA NCA methods use a regional spatial basis for assessment. The ASSETS assessment is normally applied to estuarine salinity zones that are then area-weighted to provide a system wide result and thus management can be targeted to the most impacted zone. Here, all methods were applied to the same area and existing data (2000–2007) were used to represent the spatial and temporal variability within the system. The change in waterbody area should not impact the assessment results since the reference conditions and thresholds of each method include waters of this type and salinity.

Fig. 5

A comparison of spatial scales used by the different assessment approaches. OSPAR COMPP regions for assessment are typically offshore of the WFD waterbodies, but like ASSETS and EPA NCA, the approach may be applied to any waterbody

Sampling timeframe and frequency

The five assessment methods use data that span different timeframes, which can potentially influence results. ASSETS and TRIX use data over an annual cycle for all indicators. The OSPAR COMPP is based on spring and summer data for Chl-a, summertime data for DO and mean winter concentrations of nutrients (Table 3). The UK WFD assessment is based on annual data for Chl-a and DO, and winter data for nutrients. The EPA NCA uses seasonal data for all indicators, with a once-per-year sample taken within an index period (i.e. June–September). This may underestimate nutrient conditions, since nutrient concentrations are lowered by the uptake of phytoplankton and other marine plants during summertime. Use of limited seasonal data can also bias results because the timing of peak blooms (as measured by Chl-a) varies, as shown in Narragansett Bay, where the traditional winter-spring bloom has been delayed or eliminated during the past several decades (Nixon 2009). In particular, Nixon (2009) shows that from 1958 to 1977, maximum blooms occurred in the first four months of the year 95% of the time, while in years since 1977 maximum blooms occurred in the first four months of the year only 43% of the time. Chl-a data restricted to that seasonal window would therefore have missed the annual maximum in more than half of the years since 1977 (57%), underestimating the worst-case conditions they are intended to capture. This has important implications for the final results of the eutrophication assessment and thus also for the required management response.

Reference conditions and thresholds

There is some confusion and considerable debate around the terminology for and use of reference conditions, background conditions and threshold values. For applications of the OSPAR COMPP, participating EU countries were required to determine reference conditions representing near pristine water quality, and to calculate acceptable thresholds allowing no more than 50% deviation from reference conditions. Threshold values applied in the UK application of the OSPAR COMPP and the reference conditions from which they were derived are summarised by Foden et al. (2009, this volume). Other EU countries (e.g. Germany, see Topcu et al. 2009) have determined reference conditions relevant to their waterbodies, which can therefore differ from those determined for UK waters. The UK WFD determines type specific thresholds that are intercalibrated with other EU countries. The ASSETS and EPA NCA methods use thresholds that are determined from national studies. The TRIX method does not use thresholds for individual indicators.

The methods are generally consistent in their treatment of three key indicators (Chl-a, DO and DIN). For example, DO values less than 4 mg l−1 and Chl-a values greater than 10 μg l−1 are taken as indicators of nutrient enrichment for all methods. This reflects the effectiveness of these key indicators in identifying stages of eutrophication (Table 5; see also Borja et al. 2009). To improve the accuracy of the assessments, some methods modify the thresholds for some indicators. For example, thresholds are lowered for systems that are sensitive to nutrients and thus show impacts at lower levels of nutrient input (e.g. Chl-a thresholds are lower in Florida Bay in ASSETS and EPA NCA; Bricker et al. 2007; EPA 2005), while thresholds are raised for systems that are less sensitive and less susceptible to impacts of nutrient enrichment (e.g. DIN thresholds are higher in turbid east coast UK estuaries in the UK WFD; Devlin et al. 2008, 2009; Table 5). While regional modification of thresholds is recommended where specific conditions (e.g. turbidity) are regionally uniform, it may inadvertently bias results if used without proper consideration. TRIX uses a fixed scale for assigning status based on the trophic index score rather than threshold values (Table 5).

Water column indicators: nutrients, chlorophyll, DO

The TRIX method uses a combination of N, P, Chl-a, and DO saturation (Tables 1, 3 and 4), which may produce biased results because it assumes that eutrophication processes are mainly reflected as changes in phytoplankton biomass. This does not hold true in estuaries and shallow systems where other primary producers (e.g. macroalgae [e.g. Nobre et al. 2005], seagrasses, etc.) may contribute a significant amount to total production. The EPA NCA method uses the same assumption as the TRIX method and may therefore introduce the same bias in the results.

While all methods except ASSETS include indicators based on water column nutrients (Tables 1, 5), concentrations may not necessarily correlate with a biological impact (Cloern 2001) and thus, results may not accurately represent eutrophic conditions. For example, a severely degraded system may exhibit low concentrations due to uptake by phytoplankton and macroalgae, particularly during the growing season. Conversely a relatively healthy system may have high concentrations due to high turbidity, low algal uptake, strong filter feeder populations, or may flush nutrients so quickly that primary producers do not have the opportunity to bloom extensively (Cloern 1999, 2001; Ferreira et al. 2005).

The OSPAR COMPP method uses 90th percentile and mean values, and sometimes the maximum of Chl-a in the growing season (OSPAR 2005), with concentrations >50% above a reference condition indicative of a Problem Area or (+) rating. The 5th percentile of DO concentrations is used by the OSPAR COMPP, with a value of >50% below a reference condition indicative of a Problem Area or (+) rating. The ASSETS method uses the 90th percentile of Chl-a and the 10th percentile for DO compared to reference concentrations. For both of these indicators, ASSETS combines the concentrations with the spatial extent of concentrations that are considered a problem, determined as the ratio of the number of stations above (for Chl-a) or below (for DO) the thresholds to the total number of stations (GIS determined spatial areas can also be used). The frequency of occurrence of concentrations that are considered a problem (as periodic, episodic or persistent) is included in the metric to determine the final indicator rating (Bricker et al. 1999, 2003, 2007; Table 5). The inclusion of the spatial distribution and frequency of occurrence is important for accurately evaluating the true scale and magnitude of nutrient impact on an estuary (Bricker et al. 1999, 2003).
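The percentile statistics and the station-based spatial extent described above can be sketched in Python. The percentile function uses linear interpolation between ranked values, which is one of several common definitions and is an assumption here; the function names, station labels and values are hypothetical, and the thresholds of 10 μg l−1 for Chl-a and 4 mg l−1 for DO are used only as examples.

```python
def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between ranks."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no data")
    rank = (len(xs) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (rank - lower)

def spatial_extent(station_values, threshold, problem_if_above=True):
    """Fraction of stations whose summary value is on the problem side."""
    flags = [
        (v > threshold) if problem_if_above else (v < threshold)
        for v in station_values.values()
    ]
    return sum(flags) / len(flags)

# Hypothetical Chl-a (ug/l) and DO (mg/l) samples per station.
chl_a = {"S1": [2, 4, 6, 8, 30], "S2": [1, 2, 3, 4, 5], "S3": [5, 9, 12, 18, 40]}
do = {"S1": [3.0, 6.0, 7.0, 8.0, 9.0], "S2": [6.5, 7.0, 7.5, 8.0, 9.0]}

chl_p90 = {s: percentile(v, 90) for s, v in chl_a.items()}
do_p10 = {s: percentile(v, 10) for s, v in do.items()}

chl_extent = spatial_extent(chl_p90, threshold=10)                       # stations above
do_extent = spatial_extent(do_p10, threshold=4, problem_if_above=False)  # stations below
```

Summarising each station by its own percentile before counting stations is one possible reading of the station ratio; pooled data or GIS-derived areas would give different extents.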

The WFD is applied slightly differently across EU member states. In the UK, a number of tools have been developed to assess impacts of nutrient enrichment on the phytoplankton community, which include a Chl-a measurement, an index for phytoplankton community composition, and (more recently) an index for seasonal succession (Devlin et al. 2007a, b). In estuarine waters, the Chl-a indicator includes five statistical measures applied to two salinity bands (total = 10 measures): mean, median, % of samples <10 μg Chl-a l−1, % of samples <20 μg Chl-a l−1, % of samples >50 μg Chl-a l−1 (Tables 1, 4 and 5). The index for community composition uses two separate measures of abundance in estuaries, viz. the total cell count and single species count (see Results and discussion section). The 5th percentile of annual DO concentrations and mean winter nutrient concentrations are applied in the physico-chemical element.

Other biological indicators: macroalgae, HABs, seagrasses

The measurement of secondary biological impacts that are indicative of nutrient enrichment is a key difference among the assessments. Inclusion of these indicators provides a more robust picture of the scope of nutrient related impairments including evidence of disturbance (e.g. low DO events, toxic algal blooms).

As previously discussed, TRIX and EPA NCA do not use additional biological indicators, which may lead to results that differ from those of the other three methods. The UK WFD, ASSETS and OSPAR COMPP methods all include additional biological measurements (Tables 1, 6, 7, 8, 9 and 10), although the way in which they are included varies among assessment methods.

The ASSETS method includes indicators for macroalgal abundance and nuisance and toxic bloom occurrences. While there are standards indicating the concentration of cells that result in nuisance (e.g. Aureococcus anophagefferens, Gastrich and Wazniak 2002) and toxic (e.g. Karenia brevis, USFDA 2007) blooms, there is no standard number of cells that indicates unacceptable or dangerous concentrations for mixed phytoplankton communities. Likewise, there is no standard measure for the level at which macroalgal biomass becomes a problem in US waters. Thus, for both macroalgae and nuisance and toxic bloom occurrences, a heuristic method is used to determine status: they are considered a problem if there is a negative impact on any biological resource and/or human activity (e.g. low DO, fish kills, illness in humans, smothering of benthic organisms, etc.). This manner of determination allows all observations to be used despite the lack of a standard measure or metric. Both indicators include the frequency of occurrence of problems, and the nuisance and toxic bloom indicator also includes the duration of blooms to try to reflect the magnitude of bloom problems in a comprehensive way. The ASSETS method also uses observed changes in the areal distribution of SAV, including the magnitude of observed areal losses, where losses are considered a problem.

The OSPAR COMPP and UK WFD methods both use quantitative measurements of secondary biological impacts with a macroalgae indicator based on levels of biomass and area covered. Larger marine plants such as SAV (i.e. angiosperms by WFD terminology) are based on number of species present and areal loss or gain. HABs are not specifically measured under WFD or OSPAR COMPP guidelines, though nuisance algae such as Phaeocystis sp. are measured as part of the indicator for phytoplankton abundances and indicator species. Algal toxins are included in the OSPAR COMPP Category IV: Other Possible Effects.

Combining indicator metrics into an overall rating

The calculations for combining the individual indicators into a final assessment of eutrophication status vary among methods (Table 4). The WFD as applied in the UK uses a combination of biological and physico-chemical outcomes, where the worst case assessment outcome is used to classify the waterbody as one of five categories: High, Good, Moderate, Poor or Bad.

The OSPAR COMPP method evaluates the components of Category I to determine whether there is evidence of anthropogenic nutrient enrichment. Evaluations of Category II indicators are used to determine evidence of accelerated growth of primary producers (i.e. phytoplankton, SAV, macroalgae, microphytobenthos). DO is used as an indicator of Indirect Effects that are caused by, for example, excessive phytoplankton or macroalgal biomass, and algal toxins are used as an indicator of Other Possible Effects. A one-out-all-out approach is used to determine the rating (+ or −) of each category. The final classification as Problem Area, Non Problem Area or Potential Problem Area is determined from the four Category scores, also using a one-out-all-out approach.
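The one-out-all-out logic described above can be illustrated with a short sketch. The function names and the example indicator ratings are hypothetical, an ASCII "-" stands in for the (−) rating, and the rule that yields a Potential Problem Area is not described in this section, so it is deliberately left out.

```python
def category_rating(indicator_ratings):
    """Return '+' if any indicator in the category is '+', otherwise '-'."""
    return "+" if any(r == "+" for r in indicator_ratings) else "-"


def final_classification(category_ratings):
    """One-out-all-out across categories: any '+' gives 'Problem Area'.

    The 'Potential Problem Area' outcome is not modelled here because the
    rule for it is not given in the text.
    """
    if any(r == "+" for r in category_ratings.values()):
        return "Problem Area"
    return "Non Problem Area"


# Hypothetical indicator ratings for the four OSPAR COMPP categories.
categories = {
    "I": ["+", "-"],
    "II": ["-", "-"],
    "III": ["-"],
    "IV": ["-"],
}
ratings = {name: category_rating(vals) for name, vals in categories.items()}
result = final_classification(ratings)
```

With these inputs a single (+) indicator in Category I is enough to rate that category (+) and the whole waterbody as a Problem Area, which is the defining feature of a one-out-all-out scheme.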

The ASSETS assessment, in a precautionary approach, combines the average of the primary and worst of the secondary symptom ratings by a matrix, thus giving higher weighting to the symptoms that indicate a greater level of disturbance. Here, if one indicator changes it is possible it will be reflected in the overall rating. The ASSETS assessment results in a rating based on a five grade scale: High (worst), Moderate High, Moderate, Moderate Low, Low (best).
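The two inputs to the ASSETS matrix, the average of the primary symptoms and the worst of the secondary symptoms, can be sketched as below. The numeric score scale (larger meaning worse), the function name and the example scores are assumptions for illustration; the matrix that converts the two values into a final grade is not reproduced here.

```python
from statistics import mean


def assets_symptom_inputs(primary_scores, secondary_scores):
    """Return (average primary score, worst secondary score).

    Scores are assumed to be numeric, with larger values meaning worse
    conditions; this scale is an assumption made for the sketch.
    """
    return mean(primary_scores), max(secondary_scores)


# Hypothetical symptom scores, for illustration only.
primary, secondary = assets_symptom_inputs([0.5, 1.0], [0.25, 0.5, 0.0])
```

Taking the maximum rather than the mean of the secondary symptoms is what gives the approach its precautionary character: one severe secondary symptom cannot be averaged away by milder ones.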

The EPA NCA method uses the proportion of Poor versus Good/Fair scores for individual indicators to determine the final rating (Good, Fair or Poor) with no weighting of indicators. TRIX does not have a scoring component for individual indicators but combines the four components in a linear model with a scaling factor, so that the resulting score lies between zero (oligotrophic) and ten (eutrophic) TRIX units, with grades of Highly Productive, Moderately Productive and Scarcely Productive. The TRIX and EPA NCA methods give equal weight to all indicators, but the lack of additional biological indicators means that the eutrophic rating can indicate worst case conditions without evidence of secondary biological disturbances.
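For orientation, the TRIX calculation can be sketched as below. The formula and the constants k = 1.5 and m = 1.2 are not stated in this section; they are taken from the commonly cited formulation of the index and should be checked against the original TRIX publication before use. The function and parameter names are hypothetical.

```python
import math


def trix(chl_a, abs_do_dev, din, tp, k=1.5, m=1.2):
    """TRIX score from its four components.

    chl_a, din and tp are concentrations in mg m-3; abs_do_dev is the
    absolute deviation of DO saturation from 100 (in %). The constants
    k and m are assumed from the commonly cited form of the index.
    """
    product = chl_a * abs_do_dev * din * tp
    if product <= 0:
        raise ValueError("all four components must be positive")
    return (math.log10(product) + k) / m


# Hypothetical component values, for illustration only.
score = trix(chl_a=10.0, abs_do_dev=10.0, din=100.0, tp=10.0)
```

Because the four components enter as a single product, each carries equal weight in the score, which matches the equal weighting noted above.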

Results and discussion

Despite differences, the five approaches compared here all identify a common action threshold where management intervention is recommended or required (Fig. 4). For the WFD, this is the boundary between Good and Moderate status, which is comparable with boundaries between Moderate Low and Moderate impact (ASSETS), Good and Fair (EPA NCA), Non Problem Area and Problem Area (OSPAR) and Scarcely Productive and Moderately Productive (TRIX). The differences in the number of final assessment categories (two for OSPAR COMPP, three for EPA NCA and TRIX, and five for the UK WFD, ASSETS) make it difficult to compare results directly. For example, it is not possible to know whether the OSPAR COMPP Problem Area rating is closer to the worst rating in the five grade scales, or is closer to a Moderate rating. Since these assessments are designed to evaluate conditions in order to guide development of management measures, the methods that give a more representative indication of conditions may be more useful for this task. While it may be easier to score a system on a scale that is either acceptable or unacceptable such as the OSPAR COMPP, it may be more difficult to identify small changes in eutrophic condition, as a result of management or of further degradation, than it is with methods that have a three or five grade scale.

Detailed results from the application of the five methods (ASSETS, EPA NCA, TRIX, OSPAR COMPP and UK WFD) are shown for the Medway estuary only (Tables 6, 7, 8, 9 and 10) to illustrate the calculations behind the assessment outputs. The same calculations were applied to the Thames, and the final results are presented. Ten indicators are used among the five assessments, with final outcomes indicating that both the Medway (Table 11) and the Thames (Table 12) have Moderate to High level impacts associated with nutrient enrichment.

Table 11 Final outcomes for the Medway estuary from application of five eutrophication assessment methods to data from 2000 to 2007
Table 12 Final outcomes for the Thames estuary from application of five eutrophication assessment methods to data from 2000 to 2007

Medway assessment outcomes

Results for the overall eutrophic condition based on application of the UK WFD and ASSETS approaches (Table 11) give an overall status of Moderate. The OSPAR COMPP classifies the Medway as a Problem Area. Outcomes from the EPA NCA and TRIX classify the status of the Medway as Poor and Highly Productive, respectively, which is a worse rating than results of the other methods. The reason for the worse rating is that the ASSETS and UK WFD use secondary biological indicators that show only Moderate (i.e. for macroalgae) problems, which mediate the worse scores of the nutrient and high biomass outcomes. However, all of the assessment scores indicate that the waterbody is below the threshold indicating acceptable water quality and thus would necessitate management intervention to reduce and manage nutrient loads.

The nutrient indicator assessment results are Moderate for UK WFD, (+) or Problem Area for OSPAR COMPP and Poor for EPA NCA (Table 11, see also Tables 7, 8 and 9). The EPA NCA shows a more severe rating than the UK WFD, despite the concentrations of the EPA NCA samples having a much lower summer mean (1.94 mg l−1) than mean winter concentrations of the UK WFD (11.7 mg l−1). The UK WFD assessment result of Moderate reflects the use of turbidity as a mitigating factor in the UK WFD process whereby a higher threshold is used for DIN (>0.98 mg l−1 = Moderate) in turbid systems than in clear systems (>0.42 mg l−1 = Moderate). Once a waterbody is designated as turbid (i.e. > 10 mg l−1 SPM) then assessment ratings for nutrient status cannot fall below Moderate (i.e. to Poor or Bad) due to low light conditions (Anon 2008). The EPA NCA results show a more severe rating than the UK WFD because the thresholds are not modified to accommodate the high turbidity. The accommodation for high turbidity conditions makes a difference since if the rating for the EPA DIN indicator, and by extension the DIP indicator, was Moderate then the rating for the waterbody would be Moderate. Likewise, the UK WFD rating for the Medway would be Poor if the threshold was not elevated (Tables 5, 7).
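The turbidity adjustment described above can be sketched as follows. Only the two DIN thresholds, the 10 mg l−1 SPM cut-off and the rule that turbid waters cannot be rated worse than Moderate come from the text; the function names and the full best-to-worst class ordering are assumptions for illustration.

```python
# Assumed ordering of UK WFD classes, best to worst.
WFD_CLASSES = ["High", "Good", "Moderate", "Poor", "Bad"]


def is_turbid(spm_mg_l):
    """A waterbody is treated as turbid when SPM exceeds 10 mg/l."""
    return spm_mg_l > 10.0


def din_moderate_threshold(spm_mg_l):
    """DIN concentration (mg/l) above which nutrient status is Moderate."""
    return 0.98 if is_turbid(spm_mg_l) else 0.42


def apply_turbidity_floor(rating, spm_mg_l):
    """In turbid waters, nutrient status cannot be worse than Moderate."""
    worse_than_moderate = (
        WFD_CLASSES.index(rating) > WFD_CLASSES.index("Moderate")
    )
    if is_turbid(spm_mg_l) and worse_than_moderate:
        return "Moderate"
    return rating
```

A method without this adjustment, such as the EPA NCA, would apply the same threshold regardless of SPM, which is why the two methods can give different ratings for the same turbid estuary.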

The scores for Chl-a are Good based on the UK WFD and EPA NCA methods, but are rated a Problem by the OSPAR COMPP method and High (worst) for ASSETS (Table 11). The differences are explained by the use of spatial and frequency components in the ASSETS method, the once per year sampling of the EPA NCA method, the use of only growing season data by OSPAR, and the use of annual Chl-a data in multiple statistical measurements used by the UK WFD. The ASSETS method calculated the 90th percentile of annual Chl-a data as 13.5 and 7.1 μg l−1 for the mixing (5 km2) and seawater (51 km2) zones respectively, both of which fall into the Moderate category (Table 5). When combined with high spatial coverage of the Moderate Chl-a values and periodic frequency of occurrence within both zones (see Fig. 3b), the end result is a High (worst) rating for Chl-a. The 90th percentile of the growing season (spring–summer) data for the OSPAR COMPP is 35.6 μg l−1, resulting in a rating of Problem Area or (+) when compared to the OSPAR Chl-a threshold (>15 μg l−1). The Good rating based on application of the EPA NCA method may be biased by the sampling timeframe since highest values of Chl-a typically occur in the months of April and June (Fig. 3b), and the sample index period is June–September. The once per index period samples may have missed the time of maximum bloom concentrations. In fact, for 2 years (2001 and 2003) samples used in the assessment were from July sampling dates while in all other years the samples used were from August to September.

Results for the UK WFD Chl-a indicator are rated as High (best) because 10 (out of a maximum of 10) statistical measurements did not exceed the threshold value (N = 10, EQR = 1.0; Table 5). Results from the phytoplankton indicators (single species and total taxa counts) are classified as Good (% exceedances = 19.8%, EQR = 0.61; Table 5). The two EQR scores are combined for a final classification of High (best; average EQR = 0.8) for the phytoplankton BQE (Table 8).
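The combination of the two EQR scores is a simple average, as the following sketch shows using the EQR values reported in this paper for the Medway and the Thames. The function name is hypothetical, and the class boundaries used to convert the average into a status class are not given in this section, so they are not reproduced.

```python
def combined_eqr(chl_eqr, phyto_eqr):
    """Average the Chl-a and phytoplankton EQR scores for the BQE."""
    return (chl_eqr + phyto_eqr) / 2.0


# EQR values as reported in the text.
medway = combined_eqr(1.0, 0.61)   # reported as an average EQR of 0.8
thames = combined_eqr(0.4, 0.85)   # reported as an average EQR of 0.63
```

Averaging means that a low score on one element can be offset by a high score on the other, in contrast to the worst case rule used when the biological and physico-chemical outcomes are combined.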

The results for DO are consistent among all methods (Table 11) despite the use of different timeframes (i.e. annual or seasonal) for data analysis (Table 5). Results from all methods indicate no significant problems with DO in the Medway estuary, where concentrations rarely fall below 4 mg l−1 (Fig. 3c).

Macroalgal surveys carried out on the Medway have shown the affected area to be greater than 15% and the mean biomass (over affected areas) to be greater than 500 g m−2 resulting in a status of Moderate based on the UK WFD method and Problem area or (+) based on the OSPAR COMPP method. The ASSETS method results show a Moderate level of macroalgal abundance due to observed problems on a periodic basis (Fig. 6a; Table 11). It should be noted that the heuristic method used by the ASSETS method and data from measured surveys give the same result. However, there are no problems indicated by any method for HABs, phytoplankton indicator species, or kills of zoobenthos or fish. Due to the high turbidity of estuarine waters, seagrasses are insignificant and thus are noted as Not Applicable. Seagrasses are not included in the UK WFD, OSPAR COMPP and ASSETS assessment calculations.

In summary, the combination of high nutrients, high periodic Chl-a biomass and the presence of macroalgal blooms indicates that this is a Problem Area requiring some form of management intervention.

Thames assessment outcomes

The results of the application of the WFD and ASSETS methods classify the Thames as Moderate in status (Table 12). Results of the EPA NCA and TRIX classify the Thames as Poor and Highly Productive respectively, signifying worse conditions than the results of the UK WFD and ASSETS. The OSPAR COMPP results show the Thames to be a Problem Area, which is consistent with both the Moderate and Poor results. As with the Medway, despite the differences in assessments, they all identify the Thames estuary as impacted by nutrients and requiring management intervention.

All approaches indicated anthropogenic nutrient enrichment of the Thames. There are, however, differences among ratings, which are due to the differences in the application of each indicator assessment. For the UK WFD, the high turbidity conditions found in the Thames estuary (Devlin et al. 2009) mean that a higher nutrient threshold is applied (see Table 5; Devlin et al. 2008), resulting in a Moderate rating. The EPA NCA method rates nutrient concentrations as Poor for both DIN and DIP, and the OSPAR COMPP method results in a rating of Problem levels or (+).

All approaches show evidence of accelerated growth of phytoplankton, measured as Chl-a, as a response to nutrient enriched conditions. Scores for Chl-a are Moderate for UK WFD, Problem levels or (+) for OSPAR COMPP, Poor for EPA NCA and High (worst) for ASSETS. The 90th percentile for the ASSETS analysis gives a result of 50 and 11.6 μg l−1 in the mixing and seawater zones, respectively. The combination of these higher concentrations with high spatial coverage (>50% of the waterbody) and the periodic or annual occurrence of such high concentrations leads to the most severe rating of High (worst). The EPA NCA assessment method results in a Chl-a rating of Poor since concentrations were above 5 μg l−1 at >93% of all stations and above 20 μg l−1 at more than 25% during the summertime sampling period (i.e. June–September; Fig. 3 T3 and T4). Results from the OSPAR COMPP show that the 90th percentile of summertime Chl-a values (68 μg l−1) is greater than the threshold of 15 μg l−1, resulting in a rating of Problem Area or (+). Results for the UK WFD Chl-a indicator are classified as Moderate (0.40) because 5 out of a maximum of 10 statistical measurements exceeded a reference value (N = 5, EQR = 0.4, Table 5). Results from the UK WFD phytoplankton indicators (single species and total taxa counts) are classified as High (best; % exceedances = 7.4%, EQR = 0.85, Table 5). The two EQR scores are combined for a final classification of Good (average EQR = 0.63) for the phytoplankton BQE.

The results for the assessment of DO vary among methods, with ASSETS results giving a rating of Low or best condition, compared to results of other methods of Good (UK WFD), Fair (EPA NCA) and Non Problem Area or (−) for OSPAR COMPP. The seasonal versus annual timeframes of data seem to account for these differences, as the use of annual data results in a higher concentration (better condition) of DO than growing season or summertime data only. For ASSETS, 10th percentile DO concentrations were calculated to be 3.2 and 6.7 mg l−1 in the mixing (47 km2) and seawater zones (201 km2) respectively, whereas for the OSPAR COMPP and UK WFD, DO was calculated to be 2.4 mg l−1 (Table 4) over the whole system (Fig. 3c). Additionally, once the spatial coverage (low in both zones) and frequency of occurrence (periodic) are considered, the ASSETS rating remains Low or best condition. The differences in sampling timeframes (annual data for ASSETS and UK WFD, summertime for OSPAR COMPP) and statistical criteria (5th percentile for UK WFD, 10th percentile for ASSETS), as well as the consideration of the spatial distribution of observed concentrations that are considered a problem, result in ASSETS giving a less severe score than the others, while still showing that there are sometimes low DO values within the mixing zone.

The indicators of ecosystem function are fairly consistent among the three methods that include such measures, showing no significant problems with HABs, phytoplankton indicator species or macroalgal abundance, and no recorded zoobenthos or fish kills. It should be noted that in this system, as for the Medway, high turbidity limits seagrass growth. This indicator is considered Not Applicable and is not used in the assessment calculations.

In contrast to the Medway, where the detection of macroalgal blooms provided evidence of secondary biological impacts, the Thames has no measured biological disturbance. However, the final assessment for all approaches was below the acceptable management guidelines, suggesting that management is recommended or required.

Conclusion

The application of the five different eutrophication assessment methods to two estuarine systems in the UK in this study resulted in similar final assessment results when applied to the same data (2000–2007). All approaches indicated that both waterbodies required management intervention to reduce the impacts of nutrient enrichment. However, detailed results among the methods differed; those that use more than one biological indicator indicated (except for Moderate macroalgal problems in the Medway) that secondary biological impacts of nutrient enrichment were minimal. There are differences in timeframes for analysis of data (seasonal versus annual), characteristics used in the indicator metrics (concentration, spatial coverage, frequency of occurrence) and the way that indicators are combined to determine the final eutrophication status. These differences resulted in variable results for some indicators, such as nutrients (e.g. Moderate for UK WFD and Poor for EPA NCA) and Chl-a (e.g. ASSETS was High [worst] and UK WFD and EPA NCA were both Good for Medway). However, all methods show that management actions are recommended or required for both estuaries.

Comparison of the application of and results from the five approaches highlighted differences and complexities of application among the methods. Since eutrophication assessment is required by many countries and, along with targeted research, contributes to the development of management measures, it is important that scientists, resource managers and legislators understand how well conditions are represented by the methods they use. Results from this study illustrate that there is a common understanding of the progression of eutrophication and the changes in key parameters that indicate eutrophication status. While we do not promote the use of any single method, specific methods have characteristics that appear to make them more accurate in evaluating conditions than others. The characteristics recommended for inclusion in assessment methods, to ensure that results are useful for management, include:

  • The selection of thresholds should accommodate the relative sensitivity (e.g. adjusted DIN threshold on account of high turbidity) of the system to nutrient related degradation so that conditions are not mis-represented;

  • Use of annual data and sampling through the year will increase the likelihood that eutrophic conditions will be observed and accurately evaluated;

  • The use of spatial coverage, frequency of occurrence, and duration (for HABs) in an indicator metric provides a way to capture the magnitude of the problem more comprehensively;

  • Inclusion of secondary biological indicators (e.g. macroalgal abundance, nuisance and toxic blooms) provides a fuller picture of the scope of nutrient related problems;

  • The combination of indicators into an overall condition rating should use a method that is able to detect changes due to successful management measures or pollutant related degradation. For example, the use of a multi-grade scale for indicators and for overall eutrophic condition provides greater sensitivity to detect changes in condition with successive assessments than a two-grade scale;

  • Nutrient loads should be linked to impairments within the assessment process so that results will be relevant and useful for the development of targeted and successful management measures.

While these recommendations should not be considered all-inclusive, we hope that these observations will help resource managers to select an appropriate eutrophication assessment method and to improve methods that are already in use.