Introduction

Eutrophication of estuarine environments is increasing globally as a result of increased urbanisation and intensification of agriculture in the coastal zone (Cloern 2001; Baird et al. 2003; Rabalais et al. 2007; Smith 2007). These changes impact heavily upon the chemical and biological characteristics of these ecosystems (Vaalgamaa 2004). Disproportionately high nutrient loads from anthropogenic sources within catchment areas can have significant detrimental effects on estuarine ecosystems (Loneragan and Bunn 1999). Therefore, nutrient over-enrichment, and the associated environmental stress it creates, is of great concern to estuarine scientists and managers (Flemer and Champ 2006). It is difficult, however, to determine the eutrophication trend in estuaries over long periods of time because of the complex nature of such environments (Colman et al. 2002) and the scarcity of long-term monitoring records, with only a small number of water quality datasets extending beyond the past few decades (Eyre 1997).

In cases where there is an absence of long-term data related to estuarine water quality, a paleoecological approach can be employed, by utilising the historical information stored in the stratigraphic record (Cooper et al. 2004). Diatom paleolimnological techniques permit qualitative and quantitative reconstruction of water quality by using information on the changing relative abundances of species identified in sediment cores. Trends in trophic status can then be inferred over long time periods (Tibby and Reid 2004). This involves identifying the relationships that exist statistically between modern water quality variables and surface sediment diatom assemblages, and applying these data to fossil diatom assemblages using the transfer function technique (Battarbee et al. 2001).

The use of paleoecological techniques and transfer functions in the Northern Hemisphere, specifically using diatoms in the coastal environment, is well established. Diatom-based transfer functions have been developed for sea-surface temperature (Jiang et al. 2002), salinity (Juggins 1992; Campeau et al. 1995; Fritz et al. 1999; Ryves et al. 2003), total phosphorus (Kauppila et al. 2002) and total nitrogen (Clarke et al. 2003; Ellegaard et al. 2006), with a number of other papers (Reavie et al. 1995; Andrén 1999; Reavie and Smol 2001; Ramstack et al. 2003; Weckström 2006) using the same techniques to investigate eutrophication. Whilst caution has recently been urged in the use of transfer functions (Ginn et al. 2007; Reavie and Juggins 2011), they remain of value in areas where region-specific diatom ecological tolerances are unknown and monitoring data are unavailable. In Australia the application of a paleoecological approach to understanding coastal systems is still very much a developing science (Taffs 2001; Saunders et al. 2007; Saunders et al. 2008; Saunders et al. 2009; Taffs et al. 2008; Saunders 2011; Tibby and Taffs 2011).

Paleoecological techniques can provide ranges of natural variability and baselines for management strategies (Saunders and Taffs 2009). However, applying such techniques within estuarine environments poses some challenges. Statistical calibration methods based on point sampling of water quality and surface sediment diatoms assume that such a sample represents the environment in which the diatom lives. This approach has received criticism (Birks 1998). Even so, studies that account for environmental variability in transfer function development have not been widespread (Bunbury and Gajewski 2008). Capturing ranges of environmental variability when creating calibration datasets is often hampered by unavoidable spatial and practical constraints. Where such constraints exist, research effort can be allocated in different ways. One strategy is to study a small number of estuaries intensively (a number of times) to capture variability over time or space. Another is to select a large group of sites and sample these once over a short time period, to capture the variability that exists between sites in a known geographic area at one point in time.

Australian estuaries are generally oligotrophic (Glibert et al. 2006). This can potentially hamper studies seeking to develop relationships between estuarine biota and water chemistry, because rapid nutrient cycling leaves bio-available components unmeasurable. This is in contrast to many Northern Hemisphere estuaries, where anthropogenic nutrient inputs have overwhelmed many systems, leaving significant measurable portions of bio-available nutrient forms in the water column that can be related to biological populations. At present, no dataset exists for inference of historical estuarine nutrient status in Australia. The aim of this paper is to establish the relationships that exist between diatoms and water quality parameters in east Australian sub-tropical estuaries, and to assess the validity and value of inference models that may be used to provide retrospective assessment of nutrient levels in estuarine systems.

Methods

Fifty-two sub-tropical estuaries were chosen for inclusion in an east Australian estuary diatom calibration set (Table 1). All sites were sampled in Spring 2006 when sub-tropical estuaries have maximum nutrient loads after the dry winter period. Sites were located between the most southern New South Wales site, Coffs Harbour Creek (30.31°S), and the most northern site in Queensland, the Calliope River (23.8°S), covering around 700 kilometres of coastline (Fig. 1). The fifty-two estuaries were selected based on location, landuse and ecological information from OzCoasts (2009), to maximise a gradient of trophic status.

Table 1 Selected information and environmental data from 52 East-Australian estuaries included in the training set
Fig. 1

Locality map showing the study area on the east Australian coastline, and the location of the study area containing the 52 sub-tropical estuaries used in the diatom calibration dataset. Estuary numbers correspond to those listed in Table 1

To ensure that samples were taken from areas where significant continuous sediment build-up was present, all sites were explored by boat or, if the ebb tidal influence was too great, on foot. Sites selected were generally on the margins of the estuary, with some located in a backswamp environment protected from the main estuary flow by mangrove forest. When collecting a calibration dataset, it is important to maximise the gradient of the variable of interest, and minimise the gradients of all other variables measured (Birks 1995). In estuaries, salinity is highly dynamic, depending upon tidal range and freshwater input. To minimise this gradient, all sites were sampled on the ebb tide, and within 10 km of the estuary mouth. This was necessary as the major goal of this research was to compile a dataset in which nutrients accounted for the significant variance in the diatom assemblage data. At each site, surface sediment (top 0.5 cm) was collected in one of two ways, both on the ebb tide: either with a Glew mini-corer (Glew 1991) from a boat, or by accessing the exposed depositional area at low tide (Hassan et al. 2006). A 42-mm-wide polycarbonate tube was used for both methods and was mounted on a core extraction device to remove the upper 0.5 cm of each core. Sediment samples were kept at 4 °C until processed.

A 1-L water sample was collected at each site, as close to the position of the surface sediment sample as possible, from a depth of ~20 cm in an area where water was >1 m deep. This sample was frozen as soon as possible for analysis of Total Phosphorus (TP), Total Nitrogen (TN), ortho-phosphate (PO4), ammonium (NH4), nitrite (NO2) and nitrogen oxide (NOx) in the Environmental Analysis Laboratory at Southern Cross University using the nutrient method standard APHA 4500 (APHA 1998). Field water quality measurements (temperature, pH, dissolved oxygen, and conductivity) were taken from the same position as the water sample with a Horiba U-10 meter.

Diatom samples were prepared using the method of Parr et al. (2004). Slides were inspected using phase contrast and oil immersion at 1000× magnification under an Olympus CX40 compound light microscope fitted with an Olympus DP10 digital camera. Between 300 and 400 diatom frustules were identified and counted from each sample to determine the diatom community assemblage. Diatoms were counted across several traverses of each slide to ensure that counting was representative. Taxonomy was based on the works of Witkowski et al. (2000) and Taffs (2005), with some input from Foged (1978) and Gell et al. (1999). All taxa were photographed and are archived with the first author.

All bio-available nutrients were removed from the dataset prior to analysis, due to the large number of values that were below detection for many estuaries. Twelve sites had PO4, NH4, NO2, and NOx concentrations all below detection limit. All other water quality variable measurements that were below detection limit, or were equal to zero, were given a value of half the detection limit for statistical programming purposes. A detrended correspondence analysis (DCA) with detrending by segments and down-weighting of rare species was performed on the species data to establish whether species distribution was unimodal or linear. As axis lengths were >3 SD units, canonical correspondence analysis (CCA) was deemed to be the most appropriate means of examining species responses to environmental gradients (ter Braak 1988). Prior to the CCA, environmental variables were checked for correlation and screened for distribution. Following this, TN was log10 transformed, and TP 1/sqrt transformed, to eliminate skewness. CCA of diatom data was then performed to identify relationships between diatom assemblages and water quality gradients. Forward selection was applied to determine which variables were found to make significant contributions to explanation of variance in the species data. Partial CCAs on each variable with the remaining environmental variables as co-variables were employed to determine if TP and TN were suitable for final modelling. Variance partitioning was used to determine the proportion of variation explained between TN and TP, and TP and pH. All ordinations were performed using the software program R (R Development Core Team 2006).
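The substitution and transformation steps described above can be sketched in a few lines of Python. This is a minimal illustration only: the detection limits and sample readings are assumed values rather than data from Table 1, and the function names are hypothetical.

```python
import math

def substitute_bdl(values, detection_limit):
    """Replace missing, zero, or below-detection readings with half the limit."""
    half = detection_limit / 2.0
    return [half if (v is None or v <= 0 or v < detection_limit) else v
            for v in values]

def log10_transform(values):
    """log10 transform, as applied to TN."""
    return [math.log10(v) for v in values]

def inv_sqrt_transform(values):
    """1/sqrt transform, as applied to TP."""
    return [1.0 / math.sqrt(v) for v in values]

# Illustrative readings in mg/L; None marks a below-detection result.
tn_raw = [0.247, 0.799, None, 0.667]
tp_raw = [0.017, 0.142, 0.0, 0.642]

# Assumed detection limits (not stated in the paper).
TN_LIMIT = 0.01
TP_LIMIT = 0.005

tn = log10_transform(substitute_bdl(tn_raw, TN_LIMIT))
tp = inv_sqrt_transform(substitute_bdl(tp_raw, TP_LIMIT))
```

Because the 1/sqrt transform reverses the ordering of TP (larger transformed values correspond to lower concentrations), a value y inferred on the transformed scale has to be back-transformed as TP = 1/y² before it is interpreted.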

Weighted Averaging (WA), with inverse de-shrinking, and Weighted Averaging Partial Least Squares (WA-PLS) were selected as the transfer function techniques to be employed. All models were created using transformed environmental data, and performance was assessed using leave-one-out cross validation (jack-knifing). Models were first run with all sample sites included, and then re-run after removing the 12 sites that had concentrations of PO4, NH4, NO2, and NOx below detection limit. To evaluate the reliability of the reconstructions, the performance of the transfer function was determined from the correlation (r) between the observed and inferred data, the root mean-squared error (RMSE, observed-inferred), and the jack-knifed \( r^{2} \) \( \left( r_{\text{jack}}^{2} \right) \) and RMSE, or RMSE of prediction (RMSEP). The first two measures give the ‘apparent’ error, while the latter two are a more robust indicator of the true predictive capacity of the transfer function (Dixon 1993). Under jack-knifing, the inferred value of the water quality variable for each sample is based on the optima and tolerances of the taxa in the remaining samples of the training set (Birks et al. 1990). All transfer functions were developed using the software program C2, version 1.4 (Juggins 2007).
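The WA procedure and its jack-knifed assessment can be illustrated with a small pure-Python sketch. It is not the C2 implementation: tolerance down-weighting and WA-PLS are omitted, all function names are hypothetical, and the abundance matrix and gradient values are invented toy data. Taxon optima are abundance-weighted means of the environmental variable, initial inferences are abundance-weighted means of those optima, and inverse de-shrinking regresses observed values on the initial inferences.

```python
import math

def wa_optima(abund, env):
    """Abundance-weighted mean of env per taxon (None if the taxon is absent)."""
    optima = []
    for k in range(len(abund[0])):
        total = sum(row[k] for row in abund)
        weighted = sum(row[k] * x for row, x in zip(abund, env))
        optima.append(weighted / total if total > 0 else None)
    return optima

def wa_initial(row, optima):
    """Abundance-weighted mean of taxon optima for one sample."""
    pairs = [(y, u) for y, u in zip(row, optima) if u is not None and y > 0]
    return sum(y * u for y, u in pairs) / sum(y for y, _ in pairs)

def linear_fit(x, y):
    """Ordinary least squares intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def fit_wa(abund, env):
    """Fit optima, then inverse de-shrinking (observed regressed on initial)."""
    optima = wa_optima(abund, env)
    initial = [wa_initial(row, optima) for row in abund]
    b0, b1 = linear_fit(initial, env)
    return optima, b0, b1

def predict_wa(row, model):
    optima, b0, b1 = model
    return b0 + b1 * wa_initial(row, optima)

def loo_predictions(abund, env):
    """Leave-one-out: refit without sample i, then predict sample i."""
    preds = []
    for i in range(len(abund)):
        model = fit_wa(abund[:i] + abund[i + 1:], env[:i] + env[i + 1:])
        preds.append(predict_wa(abund[i], model))
    return preds

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and inferred values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    return cov ** 2 / (sum((o - mo) ** 2 for o in obs)
                       * sum((p - mp) ** 2 for p in pred))

# Toy training set: 6 sites x 3 taxa (percent abundance) along a gradient.
abund = [[80, 20, 0], [60, 35, 5], [40, 45, 15],
         [20, 50, 30], [10, 40, 50], [0, 25, 75]]
env = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

model = fit_wa(abund, env)
apparent = [predict_wa(row, model) for row in abund]
jack = loo_predictions(abund, env)

print("apparent RMSE:", rmse(env, apparent), "r2:", r_squared(env, apparent))
print("RMSEP:", rmse(env, jack), "r2_jack:", r_squared(env, jack))
```

The apparent statistics reuse each sample in its own calibration, which is why the jack-knifed values are treated in the text as the more honest guide to predictive capacity.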

Results

Table 1 outlines the environmental data measured for each estuary. These estuaries range in trophic condition from the relatively undisturbed Burrum River (TP = 0.017 mg/L, TN = 0.247 mg/L) and Rodd’s Harbour (TP = 0.013 mg/L, TN = 0.268 mg/L) in QLD, to more heavily modified estuaries such as Belongil Creek (TP = 0.142 mg/L, TN = 0.799 mg/L) in northern NSW and Eprapah Creek (TP = 0.642 mg/L, TN = 0.667 mg/L) in south-east QLD. The TP gradient was not evenly spread from the minimum to the maximum, as eastern Australian estuaries tend to be either pristine to slightly influenced by anthropogenic activities, or heavily modified by urban and agricultural landuses. The pH values were all relatively constant due to significant marine influence, with an average of 8.44. Dissolved oxygen ranged from 3.37 mg/L (Pimpama River, QLD) to 9.81 mg/L (Cudgera Creek, NSW), averaging 7.35 mg/L among all sites. Temperature showed some variation considering all sampling was conducted in the spring, with Terranora Creek in NSW (16.9 °C) the lowest, and Tooway Creek in QLD (25.1 °C) the highest, recorded in this dataset. Average temperature of all sites was 21.8 °C. The electrical conductivity average was 47.9 mS/cm, with the standard deviation of 7.19 mS/cm indicating that the sampling strategy for minimising the salinity gradient showed some success. The highest recorded PO4 concentration was 0.437 mg/L at Eprapah Creek in QLD, with twelve sites recording levels below the detection limit. The maximum NH4 concentration measured was 0.312 mg/L at Currumbin Creek in QLD, with over half (n = 31) of the sites having levels of ammonium that were below detection.

The diatom species identified from surface sediments included 419 species from 74 genera, 100 of which (from 28 genera) had a relative abundance >1 % in at least two sites (Table 2) and were hence included in the statistical analyses. These species were mainly benthic taxa. The coastal diatom Planothidium delicatulum (Kützing) Round and Bukhtiyarova was the most dominant species, occurring in all but five surface sediments and contributing upwards of 29 % of the relative abundance in the Beelbi Creek, Caboolture River, Cudgen Creek and Kolan River sites. Mayamaea atomus (Kützing) Lange-Bertalot, a small (<10 μm) species associated with freshwater eutrophic conditions (Juttner et al. 2003; Noga and Olech 2004), occurred in less than one-third of the sites sampled, but was extremely dominant in two geographically related northern NSW sites, the Belongil Estuary and the Brunswick River, with respective abundances of 31.33 and 49.54 %. Other dominant species were the marine-brackish diatoms Amphora coffeaeformis (Agardh) Kützing, and Biremis lucens (Hustedt) Sabbe, Witkowski and Vyverman.
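The taxon screening rule used to build the training set can be expressed compactly. The sketch below is illustrative: the site names, taxon names and counts are invented, the function names are hypothetical, and the threshold is read as strictly greater than 1 % in at least two sites, which is an assumption given that the Table 2 caption words it as "at least 1 %".

```python
def relative_abundance(counts):
    """Convert raw frustule counts for one site to percentages."""
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}

def select_taxa(site_counts, min_percent=1.0, min_sites=2):
    """Keep taxa exceeding min_percent in at least min_sites sites."""
    hits = {}
    for counts in site_counts.values():
        for taxon, pct in relative_abundance(counts).items():
            if pct > min_percent:
                hits[taxon] = hits.get(taxon, 0) + 1
    return sorted(t for t, n in hits.items() if n >= min_sites)

# Invented counts (about 300 frustules per site).
site_counts = {
    "site_a": {"taxon_1": 150, "taxon_2": 148, "taxon_3": 2},
    "site_b": {"taxon_1": 200, "taxon_2": 95, "taxon_3": 5},
    "site_c": {"taxon_1": 298, "taxon_2": 2, "taxon_3": 0},
}

retained = select_taxa(site_counts)
print(retained)
```

Here taxon_3 exceeds 1 % at only one site and is dropped, while the other two taxa pass the screen.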

Table 2 Training set diatom taxa that occurred at least 1 % abundance in two or more estuaries, and their optima and tolerance values for TP, calculated by weighted averaging with inverse de-shrinking calibration and regression using the dataset that excluded all sites that had all bio-available nutrients below detection limit (BDL)

Canonical correspondence analysis indicated that no variable had a variance inflation factor >10, so all were tested in a partial CCA with each remaining variable as a co-variable (Table 3). This identified that the significant water quality variables (p value <0.05) were TN, TP, pH, dissolved oxygen and temperature. The eigenvalues indicated that these five parameters accounted for >16 % of the variance in diatom species data (Table 4). CCA biplots (Fig. 2) show that TN and TP are correlated with axis 1, and temperature is correlated with axis 2. A correlation matrix on this reduced dataset also indicated interaction (0.56) between TN and TP. Conductivity was identified prior to sampling as a variable within the estuarine environment that may exert too much influence over the species data, hence the protocol of always sampling on the ebb tide, and within 10 km of the estuary mouth. The relatively high p value of conductivity (0.11) given by the CCA and permutation test with conductivity as the single constraining variable indicates that this protocol was possibly successful in minimising the influence of salinity within this dataset. Variance partitioning was used after removing conductivity from the environmental data (Table 5) and indicated that the variation in the diatom data explained by TP was somewhat confounded with TN (p value 0.16); however, pH was not strongly confounded with TP (p value 0.35).
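As a rough check on the collinearity screening, a pairwise correlation can be converted into a two-variable variance inflation factor, VIF = 1/(1 − r²). The sketch below assumes only two predictors, whereas the VIFs in the CCA are computed against all environmental variables at once, so it is an illustration rather than a reproduction of the analysis; the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))

def pairwise_vif(r):
    """Variance inflation factor for a model with exactly two predictors."""
    return 1.0 / (1.0 - r * r)

# Using the TN-TP correlation of 0.56 reported in the text.
vif_tn_tp = pairwise_vif(0.56)
print(round(vif_tn_tp, 2))  # about 1.46, well below the threshold of 10
```

On this simplified reading, a correlation of 0.56 inflates variance only modestly, which is consistent with both variables being retained for the partial CCAs.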

Table 3 Individual CCA results for all the environmental variables
Table 4 Results of the CCA on species and environmental data (TN, TP, pH, DO and Temp)
Fig. 2

Canonical Correspondence Analysis of the diatom species and environmental data. SITES is the site distribution across the environmental gradients. Site numbers correspond to those in Table 1. SPECIES is the distribution of the diatom species across the environmental gradients. Species numbers correspond to those in Table 2

Table 5 Results of variance partitioning for TN and TP, TP and pH after removal of conductivity from the environmental dataset

Although pH, DO and temperature were significant variables, models for them were not developed, owing to the nutrient focus of the paper. Transfer functions were developed for TP and TN in the statistical package C2 (Juggins 2007) using both simple weighted averaging with inverse de-shrinking (WA) and weighted averaging partial least squares (WA-PLS) methods. Two sets of models were created: (1) models that used all study sites, and (2) models that excluded the 12 study sites which had concentrations of PO4, NH4, NO2 and NOx below detection limit (BDL).

Good correlation existed between observed and inferred values for TN using both WA and WA-PLS. However, the jack-knifed \( r^{2} \) values indicated that the WA \( \left( r_{\text{jack}}^{2} = 0.19 \right) \) and WA-PLS \( \left( r_{\text{jack}}^{2} = 0.23 \right) \) transfer functions did not perform strongly, and would be unlikely to produce reliable TN reconstructions. The removal of the 12 study sites that had PO4, NH4, NO2 and NOx BDL resulted in only a slight increase in the \( r_{\text{jack}}^{2} \) values for TN \( \left( \text{WA}\ r_{\text{jack}}^{2} = 0.27,\ \text{WA-PLS}\ r_{\text{jack}}^{2} = 0.29 \right) \).

Inference models developed for TP also displayed good correlation between observed and inferred values for both WA and WA-PLS methods. Jack-knifed \( r^{2} \) values for the WA model were almost identical to WA-PLS component 2 \( \left( r_{\text{jack}}^{2} = 0.25 \text{ and } 0.24 \text{ respectively} \right) \), with the WA model performing slightly better based on RMSEP scores. The removal of the 12 study sites which had PO4, NH4, NO2 and NOx BDL resulted in a substantial improvement in the \( r_{\text{jack}}^{2} \) values for TP, for both WA \( \left( r_{\text{jack}}^{2} = 0.65 \right) \) and WA-PLS \( \left( r_{\text{jack}}^{2} = 0.69 \right) \) models, indicating that the TP transfer function from this dataset may be used to form reliable TP reconstructions, as shown in Logan et al. (2011) and Logan and Taffs (2011). Performance scores for each model, and model specifications, can be viewed in Table 6. The diatom species optima and tolerance ranges for TP developed using WA can be viewed in Table 2.

Table 6 Weighted averaging with inverse de-shrinking (WA) and weighted averaging partial least squares component 2 (WA-PLS 2) calibration and regression performance results for TP and TN

Diatoms that are shown to be good indicators of nutrient status along a TP gradient by this study are shown in Fig. 3. The highest abundances of the genus Cyclotella, which may provide evidence for increased productivity due to nutrient enrichment (Whitmore et al. 1996; Koster et al. 2005; Weckström 2006), were found at Eprapah Creek and the Evans River (Cyclotella meneghiniana Kützing), as well as Tooway Creek, the Nerang River and Pimpama River (Cyclotella striata (Kützing) Grunow). Two other planktonic species, Aulacoseira italica (Ehrenberg) Simonsen and Actinocyclus normanii (Gregory) Hustedt, occurred at relative abundances >15 % at two and three sites respectively.

Fig. 3

Diatom species distribution along the Total Phosphorus gradient from 52 sub-tropical east Australian estuaries

Discussion

The majority of diatom species in the calibration dataset were benthic species. The abundance of planktonic species increased at sites with nutrient enrichment. An increase in abundance of planktonic diatoms has been noted in response to nutrient enrichment in previous work (Cooper and Brush 1993; Andrén et al. 1999; Cooper et al. 2004; Weckström 2006; Weckström and Juggins 2006; Saunders et al. 2008). Two sites, Eprapah Creek and the Richmond River, had TP concentrations that were more than twenty times above the ANZECC (2000) water quality trigger values. This coincided with the highest relative abundances (18.32 and 18.1 % respectively) of the planktonic taxon C. meneghiniana, which provides strong evidence that this species is indicative of nutrient enrichment, and in particular elevated TP. The planktonic species A. italica recorded its two highest relative abundances in the Logan River and Burpengary Creek, sites which had TN concentrations (0.862 and 0.561 mg/L respectively) well in excess of the ANZECC (2000) trigger value, which could imply that this species also has value as an indicator of nutrient enrichment, perhaps more specifically for elevated TN. Another planktonic species found in this dataset, A. normanii, has been related to nutrient enrichment in brackish ecosystems (Andrén et al. 1999). A. normanii had its highest abundances at two sites, the Clarence and Mary Rivers, both of which had below average TP concentrations and relatively low TN concentrations, which may be indicative of allochthonous input from a possible eutrophic section of each river upstream of the estuarine site sampled.

The distribution of diatom taxa along the TP gradient in this study identified some species that possess indicator value (Fig. 3). C. meneghiniana, a planktonic species associated with increasing nutrient levels in other research (Andrén et al. 1999; Saunders et al. 2008), was at the upper end of this gradient. These occurrences were not only associated with high TP concentrations, but also large abundances of this species. This supports the findings of Andrén et al. (1999) and Korhola and Blom (1996), and suggests that C. meneghiniana is likely to be a universal indicator of elevated nutrient status in estuaries. Conversely, the presence of the epipelic Achnanthes fogedii Håkansson and Navicula cryptocephala Kützing in either fossil or surface sediments is likely to be indicative of lower nutrient concentrations in estuaries, based on their positioning on the TP gradient in this study.

The CCAs indicated that TN and TP exerted significant influence on the diatom assemblages; however, robust transfer function scores were only achievable with TP. While pH was also a significant variable, variance partitioning showed that pH had little influence on the strength of the relationship between diatoms and TP. Further analysis through variance partitioning indicated that, although relatively strong, the signal from TP is not completely unique and may be shared with TN. The relationship between TN and TP is also identified by the correlation (0.56) between these two variables in the training set. Such a correlation reduces the ability to disentangle the effects of each variable completely.

This research, in particular the results of the TP transfer function, also suggests that measurements of ambient nutrient concentrations may only provide information about eutrophication status when nutrient inputs overwhelm the capacity of biological responses (i.e. increases in algal populations and biomass) to consume them. That is, when an “un-used” portion of nutrients is not consumed during primary production, it becomes a significant and measurable part of ambient water quality conditions. This interpretation is based on the performance of the TP model after the removal of sites where the bio-available nutrients were all BDL. Prior to the removal of these sites, jack-knifed \( r^{2} \) values for TP (Table 6) indicated a weak relationship between diatom assemblages and nutrient concentrations relative to the relationship formed between TP and diatom assemblages once these sites had been removed. It is possible that this reflects circumstances where diatoms, as part of primary production, have contributed to the consumption of all bio-available nutrients in the sites that had PO4, NH4, NO2 and NOx concentrations BDL. This implies that although diatoms are likely to have a relationship with these nutrients, it cannot be tested, as primary production has removed the bio-available component of nutrients from the water column to the point where it is unmeasurable, possibly reducing the strength of diatoms as nutrient status indicators at these sites. The removal of sites that had bio-available nutrients BDL left a dataset consisting of sites with at least one of PO4, NH4, NO2 or NOx as a measurable part of the ambient water quality. This component could then be related to the diatom assemblage that was formed during primary production by the use of bio-available nutrients.

This model has been used to provide reconstructions of nutrient conditions in a heavily modified estuary (Logan et al. 2011) and a pristine estuary (Logan and Taffs 2011). These estuaries are currently at either extreme of the sampled TP gradient. The reconstructions in these papers are well supported by geochemical results. However, further sampling is required in the mesotrophic range to add to the validity of this model (Telford and Birks 2011), and reconstructions of mesotrophic sites using this model should be approached with caution. Nevertheless, a transfer function for nutrient inferences is of value for this region, as local ecological tolerances of diatom species have not previously been identified and monitoring data for these estuaries are unavailable. A cautious reconstruction of nutrient changes within a heavily modified estuary has proven to be of value for land managers within the estuary catchment area (Hickey pers. comm. 2009).

Conclusion

This is the first study to use surface sediment diatoms to develop models for the inference of historical estuarine total nutrient concentrations in Australia and, to our knowledge, in the Southern Hemisphere. Given the lack of long-term data on estuarine water quality (Eyre 1997), and indeed on eutrophication processes in Australia (Tibby 2004), development of techniques that are efficient in inferring past TN and TP concentrations is required. Statistical analysis indicated that only TP can be used for modelling purposes, although TP may be confounded with TN. Given the complex nature of estuaries, there may be underlying processes and other variables affecting the diatom distribution in relation to both TN and TP. The distribution of diatom species along the TP gradient did identify two planktonic species, C. meneghiniana and A. italica, as indicators of enhanced nutrient status in estuaries in sub-tropical eastern Australia. Previous work (Admiraal et al. 1984) has pointed out that the relationship between estuarine nutrient characteristics and diatom production is neither simple nor linear. Thus, there is a need to develop a deeper understanding of the cycling of nutrients in Australian estuarine environments, and in particular, the manner in which diatoms interact with and influence nutrient cycling processes. For the moment, the TP inference model reported in this paper is of sufficient robustness to be used as a starting point to infer previous trophic status of sub-tropical east Australian estuaries. It is hoped that this will contribute to improvements in the understanding of these dynamic ecosystems over longer periods of time and, in turn, enhance management efforts directed towards them.