Introduction

Estuaries and their associated coastal ecosystems are among the most productive habitats in the world because they receive nutrients from both terrestrial and marine sources, and more recently, anthropogenic inputs (e.g., Nixon 1995; Wiseman Jr. et al. 1999; Anderson et al. 2002). Geologically, they are relatively transient environments that formed as sea level rose following the last ice age and they exist in a balance between the present day sea level, freshwater inflow, and sediment deposition and transport (Day et al. 1989; Perillo 1995; Guccione 1995). Changes in the relative influence of sea level pressure and freshwater flow affect the salinity and nutrient concentrations across coastal watersheds, which in turn control the composition and distribution of algal communities (Therriault and Levasseur 1986; Mallin et al. 1993). Enhanced freshwater flow can reconfigure estuarine environmental gradients and change algal communities in different ways: increases in precipitation reduce salinity and increase atmospheric nutrient deposition (e.g., Fisher and Oppenheimer 1991; Mallin et al. 1993; Paerl 1995; Nixon 1995) and runoff from the watershed increases inputs of terrestrial nutrients (e.g., Glibert et al. 2006; Zhang et al. 2009), which can lead to eutrophication and hypoxia (e.g., Paerl 2006; Stevens et al. 2006; Rabalais et al. 2010). Estuarine ecosystems are therefore particularly sensitive to human alteration of runoff, terrestrial nutrient inputs, and sea level fluctuations.

Estuarine ecosystems exhibit temporal and spatial variation in environmental conditions that are influenced by surface and groundwater exchange from freshwater inflows and the sea. Characterizing the spatial dynamics of biological assemblage structure in order to evaluate and predict changes in the ecosystem on different time scales is an important goal for understanding estuary dynamics and community ecology (Soininen et al. 2004; Cottenie 2005). A spatial perspective of ecological communities can also help define how and where anthropogenic activities affect ecosystems and help evaluate management plans and monitoring strategies (Craig and Bosman 2013).

Diatom assemblages are good indicators of environmental transitions across estuaries because of their widespread occurrence and measurable preferences and tolerances for different environmental conditions (Battarbee 1986; Cooper 1999). Their short generation times allow them to respond to changes in environmental conditions over time scales that integrate high variation in physicochemical features but that also provide an early signal of directional change (Battarbee 1986; Stevenson et al. 2010). Diatom diversity and composition can vary significantly over small spatial and temporal scales (Parsons et al. 1999; Smol and Cumming 2000; Reid and Ogden 2009), enabling interpretation of environmental characteristics in transitional areas where taxa might tolerate local conditions but do not thrive in them (Gaiser et al. 1998; Frankovich et al. 2006). Where diatoms have strong relationships with environmental conditions, ecological response functions or transfer functions can be developed to quantitatively infer environmental variables from their composition (Fritz et al. 1991 and 2010; Birks 1995 and 1998; Juggins 2013). Because they preserve well in sediment, diatom assemblages can thus provide inferences about environmental conditions occurring in estuaries prior to human alteration and over long time scales using sediment core analysis (Cooper et al. 2010).

We investigated the spatial distribution of diatoms relative to environmental gradients in the Charlotte Harbor watershed, on southwest Florida’s Gulf Coast. South Florida, with its shallow topographic relief, porous karst aquifer, and intensely developed coastline, is particularly vulnerable to sea level rise (e.g., Ross et al. 2009; Saha et al. 2011; summarized in Noss 2011) and has an extraordinary history of hydrological alteration due to early efforts to drain Everglades wetlands for agricultural and urban development (summarized in Grunwald 2006; McVoy et al. 2011). The Charlotte Harbor watershed encompasses a large area and includes three major rivers that provide most of the freshwater inflow to the estuary. The three rivers represent distinct regions within the Charlotte Harbor watershed that differ in the extent and nature of their surrounding land use, and history of human alteration. The river watersheds contain a complex matrix of hydrological modification, agricultural areas, phosphorus mines, urban areas, and relatively pristine conservation lands that provide contrasting environmental gradients for examining controls on diatom assemblage structure.

During periods of high freshwater inflow such as after a tropical storm, episodic hypoxia occurs in Charlotte Harbor in response to nutrient loading and stratification (Tomasko et al. 2006; Stevens et al. 2006; Kim et al. 2010), and in late summer the hypoxic zone can approach 90 km² (Camp Dresser and McKee, Inc. 1998). The spatial and temporal characteristics of nutrient loading across the watershed are complex, but previous research has linked nitrogen inputs to phytoplankton productivity that may cause expansion of the hypoxic zone (McPherson and Miller 1990; McPherson et al. 1990; Montgomery et al. 1991; Turner et al. 2006). There is evidence that nutrient limitation shifts from nitrogen limitation in most of this watershed to phosphorus limitation farther south (Heil et al. 2007). Algal productivity related to these hypoxia events responds differently to nutrient additions along a salinity gradient in the Charlotte Harbor estuary (Montgomery et al. 1991; Morrison et al. 1998). Further characterization of these environmental gradients and their control on biological communities is important for understanding the spatial structure of environmental change, as well as for guiding appropriate water quality management. Diatom assemblages may provide insight into changes along these gradients in the Charlotte Harbor estuary, since they have been used successfully to reveal changes in salinity and nutrient gradients in other parts of Florida (Gaiser et al. 2006; Huvane 2002; Frankovich et al. 2006; Wachnicka et al. 2010; Wachnicka and Gaiser 2011). Some studies have shown that taxon responses are consistent across watersheds, while others have shown that they are different on subbasin scales (Armitage et al. 2006; Telford et al. 2006; Wachnicka et al. 2010).
Extensive spatial sampling across the Charlotte Harbor estuary and its watershed was conducted to provide insight into the relationships between taxa and environmental gradients and enable inference about past conditions.

Specifically, our goals in this study were to (1) characterize diatom assemblage distribution across the Charlotte Harbor estuary and its inflows at a time of low freshwater flow that maximizes capacity for detecting underlying spatial patterns, (2) investigate the relative influence of water quality variables on diatom assemblage composition within and among regions of the watershed, and (3) determine how accurately local environmental conditions can be predicted using extant diatom assemblage structure. Where prediction models are identifiable and strong, resulting diatom-based environmental transfer functions can be used to determine environmental histories from sediment records to provide restoration targets, as well as to track contemporary ecological response to management decisions. These biological interpretations could prove particularly valuable as water management decisions are continually updated with the goals of improving ecosystem integrity and resilience while maintaining function in flood control and navigation (SFWMD 2008).

Methods

Study Sites

Charlotte Harbor has a large watershed, spanning 12,653 km² of central and south Florida, and an estuary surface area of approximately 700 km² (Hammett 1988). It receives freshwater from three major rivers, the Myakka, Peace, and Caloosahatchee, as well as numerous small creeks and streams. Total freshwater input to the estuary, including the three rivers, the remaining watershed (coastal area, creeks, and streams), and precipitation, averages more than 3,500 million gallons per day (Hammett 1988). Charlotte Harbor is particularly appropriate for examining environmental influences on diatom assemblage structure because of its spatial variability in hydrological histories associated with different land uses. In the late nineteenth century, the Caloosahatchee River was channelized and connected to Lake Okeechobee as part of an effort to drain the Everglades, significantly affecting its hydrology. The historically shallow, meandering flow way was deepened and straightened to facilitate flood control and navigation (Lane 1990). Freshwater flow through the Caloosahatchee River is now highly managed, as the economic interests of flood control and navigation require maintenance of the artificial flow way; however, recognizing that management decisions can have negative impacts on water quality, legislation also requires that management plans restore and protect the ecological integrity of Charlotte Harbor surface waters (SFWMD 2008). The flow of the Caloosahatchee River is now managed by the U.S. Army Corps of Engineers and South Florida Water Management District via three lock and dam structures: the Moore Haven locks at the junction with Lake Okeechobee, the Ortona locks, and the W.P. Franklin locks farthest downstream that now provide a barrier to tidal action and marine water influence upstream (Lane 1990; Chamberlain and Doering 1998).
Much of the drainage area of the Caloosahatchee is in agriculture; the Peace River watershed also includes extensive agriculture as well as phosphate mining (Pierce et al. 2004). For approximately twenty years in the late nineteenth and early twentieth centuries, the bottom of the Peace River was dredged for phosphate extraction until operations moved inland, though the impacts of mining continue to affect the watershed (Martin and Kim 1977; O’Donnell 1990). In contrast, the Myakka River watershed is relatively pristine, as it is largely undeveloped with much of its watershed in permanent conservation lands (Dorsey and Barry 1990). Its drainage basin has more freshwater wetland area than the rest of the Charlotte Harbor watershed and also expansive tidal wetlands. Streamflow analysis of the three rivers spanning 50 years showed a statistically significant declining trend in the Peace River flow volume, but no significant trend elsewhere in the watershed, which might be explained by decreasing pressure due to groundwater withdrawals from the Floridan aquifer (Hammett 1988; Pierce et al. 2004). The three rivers also differ widely in their drainage area; the Caloosahatchee is approximately 120 km long with a watershed of 3,570 km², the Myakka River is nearly the same length (110 km) but with a watershed of only 610 km², and the Peace River is 170 km long with a watershed of 3,540 km² (Lane 1990; O’Donnell 1990; Dorsey and Barry 1990).

Fifty sites were selected for sampling in Charlotte Harbor, the Peace River, the Caloosahatchee River, and the Myakka River (Fig. 1). Sites were selected to capture salinity and nutrient gradients in the watershed and included open marine areas, brackish, and freshwater reaches of each river (<0.5 ppt salinity). Sites located beyond the point where the mouth of each river exceeded 1 km without further narrowing and those adjacent to two barrier islands that separate the estuary from the Gulf of Mexico were designated as Harbor sites. One sample, collected in the center of the harbor in a depositional environment from which sediment cores were taken for paleoecological study (described in Van Soelen et al. 2012), was selected to represent the central harbor and distance from that point was measured for all other samples. The upstream-most site (furthest from the harbor center) in each river was 149 km upriver (Peace River), 119 km upriver (Caloosahatchee River), and 48 km upriver (Myakka River). Caloosahatchee River samples were taken from either side of the W.P. Franklin locks (the farthest locks downstream), and the most upstream site was immediately downstream of the Ortona locks.

Fig. 1 Map showing location of study sites in Charlotte Harbor, Florida, and three major river inflows

Sample Collection, Preparation, and Laboratory Analysis

Samples were collected in March 2012, which is during the dry season for southwest Florida and near the middle of the period of minimal freshwater flow that generally occurs between November and May (Hammett 1988). Sediments were collected by hand, by scraping the uppermost estimated 1-cm layer from the surface to capture the material deposited during approximately the last year; accretion rates during the past 100 years have increased to approximately 0.74 cm/year (Van Soelen et al. 2012). This upper sediment layer contains diatoms from a variety of habitats and is expected to include a mix of benthic, epiphytic, and planktonic taxa (Cooper 1999). Where deep water prohibited collection by hand, an Ekman dredge was used to retrieve undisturbed surface sediments and the upper 1 cm was scraped from the surface. Samples were transported to the laboratory on ice and then frozen. At the time of sample collection, salinity, conductivity, temperature, and pH were measured at mid-depth in the water column using a YSI multiparameter Sonde.

In the laboratory, samples were thawed and homogenized by hand. Approximately 10 ml of sediment was reserved for diatom analysis and the remainder was dried and ground for analysis of total phosphorus (TP) with a UV-2101PC Scanning Spectrophotometer, total nitrogen (TN) with an ANTEK 7000 N Nitrogen Analyzer, and total carbon (TC) with a Perkin Elmer Series II CHNS/O (2400) Analyzer by the Southeast Environmental Research Center Nutrient Analysis Laboratory at Florida International University. Diatom samples were cleaned using a series of acid baths following the oxidation technique described by Battarbee (1986) and then diluted until a neutral pH was achieved. A measured aliquot of 0.006 to 0.1 ml (depending on diatom concentration) of the resulting mixture was removed by calibrated pipette, placed on a coverslip, and dried. Coverslips were then permanently mounted onto glass slides using Naphrax®. A minimum of 500 diatom valves were identified from each sample along measured, random transects using a Nikon E4000 light microscope at ×1,000 magnification. Identification of diatoms was based on regional and standard diatom taxonomic literature (e.g., Peragallo and Peragallo 1908; Foged 1984; Hendey 1964; Hein et al. 2008; Witkowski et al. 2000).

Data Analysis

The abundance of all taxa in each sample was calculated relative to total abundance of diatom valves counted. Taxa occurring in at least three samples with a relative abundance of at least 0.5% in at least one sample were included for statistical analyses, because inclusion of rare taxa increases noise in the dataset (McCune et al. 2002). Relative abundance data were fourth-root transformed to more closely approximate a normal distribution and to downscale the relative importance of very abundant taxa. Indicator species analysis was used to identify taxa that were strongly associated with a particular region. Indicator values are based on affinity for a group from taxon persistence in the group (in most samples from that group) and exclusivity to that group (not found in most other groups) (McCune et al. 2002).
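The taxon screening and transformation steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' workflow: the function names, the taxon labels, and the valve counts are all hypothetical, and the screening rule is read here as "present in at least three samples, and reaching at least 0.5% relative abundance in at least one of them".

```python
def relative_abundance(counts):
    """Convert raw valve counts for one sample to percent relative abundance."""
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def select_taxa(samples, min_samples=3, min_percent=0.5):
    """Keep taxa present in >= min_samples samples that reach
    >= min_percent relative abundance in at least one sample.
    (One reading of the screening rule; an assumption.)"""
    all_taxa = {taxon for sample in samples for taxon in sample}
    kept = set()
    for taxon in all_taxa:
        values = [sample.get(taxon, 0.0) for sample in samples]
        occurrences = sum(1 for v in values if v > 0)
        if occurrences >= min_samples and max(values) >= min_percent:
            kept.add(taxon)
    return kept


def fourth_root(samples, taxa):
    """Fourth-root transform the relative abundances of the retained taxa."""
    return [{t: s.get(t, 0.0) ** 0.25 for t in sorted(taxa)} for s in samples]


# Hypothetical valve counts for three samples (500 valves each).
raw_counts = [
    {"taxon_a": 450, "taxon_b": 50, "taxon_c": 0},
    {"taxon_a": 300, "taxon_b": 199, "taxon_c": 1},
    {"taxon_a": 250, "taxon_b": 250, "taxon_c": 0},
]
percent = [relative_abundance(c) for c in raw_counts]
retained = select_taxa(percent)          # taxon_c is dropped as rare
transformed = fourth_root(percent, retained)
```

The fourth-root transformation compresses the range of abundances, so a taxon at 90% no longer outweighs one at 10% by a factor of nine in later similarity calculations.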

Environmental data were transformed using the method that best reduced the skewness for each variable. Conductivity and salinity values were arcsine transformed, nutrient data (TP, TN, and TC) were log-10 transformed, pH was inverse transformed, and distance from the Harbor center was square-root transformed. All environmental data were then normalized (mean subtracted and divided by standard deviation) to equalize values to a common scale (Clarke and Warwick 2001). Pearson’s correlation coefficients were calculated to evaluate the strength of associations among environmental variables.
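As a small illustration of the transformation and normalization sequence, the sketch below log-10 transforms a nutrient variable and then standardizes it. The concentrations are invented for the example, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the text does not say which form was used.

```python
import math
import statistics


def log10_transform(values):
    """Log-10 transform, as applied to the nutrient variables (TP, TN, TC)."""
    return [math.log10(v) for v in values]


def normalize(values):
    """Subtract the mean and divide by the standard deviation (z-scores)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption
    return [(v - mean) / sd for v in values]


# Hypothetical sediment TP concentrations (ug/g) at four sites.
tp = [250.0, 900.0, 1800.0, 5200.0]
tp_z = normalize(log10_transform(tp))
```

After this step every variable has a mean of zero and unit standard deviation, so Euclidean distances among sites are not dominated by whichever variable happens to have the largest numerical range.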

Similarity matrices were created using Bray–Curtis similarity to evaluate diatom assemblage patterns and Euclidean distance for the environmental data (Clarke and Warwick 2001). Analysis of similarity (ANOSIM) was used to determine the similarity of diatom assemblages and environmental values among the four regions and among downstream and upstream river sites (Clarke and Gorley 2006). Reported R values represent the scaled difference between the mean ranks of between-group and within-group dissimilarities and typically range from 0 to 1 with increasing dissimilarity among groups; comparisons having a p < 0.05 were considered significantly different (Clarke and Warwick 2001). Non-metric multi-dimensional scaling (NMDS) ordination (Kruskal and Wish 1978) was used to visualize assemblage patterns within and among regions. After evaluating the stress of ordinations with up to five axes, two-dimensional NMDS was selected to ease interpretability of results, given that stress values did not decrease greatly with additional axes. Stress is a measure of the departure from fit of the sample dissimilarity to distance in the ordination; it tends to decrease with additional axes, as more of the sample variance is represented, but additional dimensions make interpretation increasingly difficult (McCune et al. 2002). Environmental vectors representing the direction and strength of each variable’s correlation with assemblage differences were overlaid on the assemblage ordinations.
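To make the ANOSIM statistic concrete, the sketch below computes Bray–Curtis dissimilarities among a handful of samples and then the R statistic as the difference between the mean rank of between-group and within-group dissimilarities, divided by M/2, where M is the number of sample pairs. It is an illustrative stand-in for the Primer routine, not a reproduction of it: the data and function names are hypothetical, and the permutation test used to obtain p values is omitted.

```python
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(samples, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    pairs = list(combinations(range(len(samples)), 2))
    dissim = [bray_curtis(samples[i], samples[j]) for i, j in pairs]
    ranks = average_ranks(dissim)
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    r_within = sum(within) / len(within)
    r_between = sum(between) / len(between)
    return (r_between - r_within) / (len(pairs) / 2)


# Hypothetical two-taxon assemblages from two regions.
samples = [[10, 0], [9, 1], [0, 10], [1, 9]]
groups = ["harbor", "harbor", "river", "river"]
r_stat = anosim_r(samples, groups)
```

In this toy case every between-region dissimilarity is larger than every within-region dissimilarity, so R reaches its maximum of 1. Values near 0 indicate that regions are no more different than samples within a region, and slightly negative values are possible in principle.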

Pearson’s and Kendall correlation coefficients were calculated representing the linear and rank relationship, respectively, between each variable and the NMDS ordination of diatom assemblage (McCune et al. 2002). Bio-env stepwise analysis (BEST) was also used to evaluate the dissimilarity matrices. This analysis conducts a Mantel test on the dissimilarities among taxa and environmental data to determine which subsets of environmental variables have the strongest correlation with assemblage dissimilarity (Clarke and Gorley 2006). This analysis was conducted for the full dataset and for each of the four regions separately to evaluate how relationships differ on different spatial scales. The above analyses were all conducted using the software Primer version 6 (Clarke and Gorley 2006) and PC-ORD version 5 (McCune and Mefford 1999).
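The matrix comparison at the heart of this analysis can be illustrated as follows. The sketch computes a rank correlation between assemblage dissimilarities (Bray–Curtis) and environmental distances (Euclidean) for one candidate subset of variables; a stepwise search would repeat this for each subset and keep the best. The Spearman-type rank correlation is an assumption made for illustration, and the data and function names are hypothetical; the search over subsets and the permutation-based significance test are omitted.

```python
import math
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))


def euclidean(x, y):
    """Euclidean distance between two vectors of normalized variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)


def matrix_rank_correlation(assemblages, env):
    """Rank correlation between pairwise assemblage and environmental distances."""
    pairs = list(combinations(range(len(assemblages)), 2))
    bio = [bray_curtis(assemblages[i], assemblages[j]) for i, j in pairs]
    envd = [euclidean(env[i], env[j]) for i, j in pairs]
    return pearson(average_ranks(bio), average_ranks(envd))


# Hypothetical data: four sites, two taxa, one normalized variable.
assemblages = [[10, 0], [7, 3], [3, 7], [0, 10]]
salinity_only = [[0.0], [1.0], [2.0], [3.0]]
rho = matrix_rank_correlation(assemblages, salinity_only)
```

Because the comparison is made on ranks of pairwise distances, it asks only whether sites that differ more in the chosen variables also differ more in their assemblages, without assuming a linear response.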

The optimum of each taxon for environmental variables of interest was determined by averaging the values of each variable at sites where the taxon occurs and weighting it by its relative abundance at each site. This enables calculation of a quantitative model of each variable for each site, based on the optima and abundance of the taxa present. This weighted-averaging regression-calibration approach assumes that species with optima closest to the observed value of a parameter will be most abundant at that site (Birks 1995). The breadth of tolerance of each taxon along each gradient was calculated as the abundance-weighted standard deviation of each variable (Birks 1995). Weighted averaging partial least squares (WA-PLS) regression with cross-validation was used to develop predictions of salinity, geographic distance from the central harbor point, TN, and TP. This method incorporates the residual correlations in the assemblage data (Ter Braak et al. 1993) and improves model performance in datasets where taxa have broad tolerances (Battarbee et al. 1999). The accuracy of diatom-based prediction models for these variables was evaluated by examining the relationship between observed environmental measurements at each site and values predicted by the optima, tolerances, and relative abundances of taxa in the assemblage at that site (the diatom-inferred value), and through evaluation of the root mean square error of prediction (RMSEP). Observed values of all variables were plotted against the residuals from these prediction models to evaluate relationships that might explain bias. Taxon optima calculated within each region were also plotted against optima estimated by the whole dataset to investigate regional bias, or skewness in applying a regionally-based model for the whole dataset. These analyses were conducted using C2 version 1.7.2 software (Juggins 2011).
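The optimum, tolerance, and inference calculations described above reduce to abundance-weighted means and standard deviations. The sketch below shows simple weighted averaging only; it is not WA-PLS, it applies no deshrinking, and the error it reports is an apparent error rather than a cross-validated RMSEP. The taxon names, abundances, and salinities are invented for the example.

```python
import math


def wa_optimum(abundances, env):
    """Abundance-weighted mean of an environmental variable for one taxon."""
    return sum(a * x for a, x in zip(abundances, env)) / sum(abundances)


def wa_tolerance(abundances, env):
    """Abundance-weighted standard deviation around the taxon optimum."""
    opt = wa_optimum(abundances, env)
    weighted_ss = sum(a * (x - opt) ** 2 for a, x in zip(abundances, env))
    return math.sqrt(weighted_ss / sum(abundances))


def wa_infer(sample, optima):
    """Diatom-inferred value: abundance-weighted mean of the taxon optima."""
    total = sum(sample.values())
    return sum(a * optima[taxon] for taxon, a in sample.items()) / total


def rmse(observed, predicted):
    """Root mean square error between observed and inferred values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical training set: salinity (ppt) at three sites and the
# relative abundance of two taxa at those sites.
salinity = [0.0, 10.0, 30.0]
abundance = {"fresh_taxon": [80, 20, 0], "marine_taxon": [0, 20, 80]}

optima = {t: wa_optimum(a, salinity) for t, a in abundance.items()}
tolerances = {t: wa_tolerance(a, salinity) for t, a in abundance.items()}

# Infer salinity at each site from its assemblage.
site_samples = [{t: abundance[t][i] for t in abundance} for i in range(3)]
inferred = [wa_infer(s, optima) for s in site_samples]
apparent_error = rmse(salinity, inferred)
```

In this toy example the inferred values (2, 14, and 26 ppt) are pulled toward the middle of the gradient relative to the observed values (0, 10, and 30 ppt). This shrinkage is a known property of averaging twice, and it is one reason more elaborate calibration methods are generally preferred over plain weighted averaging.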

Results

Diatom Assemblage Composition

A total of 296 diatom taxa were identified from samples across the Charlotte Harbor watershed. The most abundant taxa in the watershed were Catenula adhaerens Mereschkowsky, Planothidium delicatulum (Kützing) Round and Bukhtiyarova, Staurosirella martyi (Héribaud) Morales and Manoylov, Amphicocconeis disculoides (Hustedt) De Stefano and Marino, and Achnanthes lanceolata (Brébisson ex Kützing) Grunow, all with a mean relative abundance of greater than 4 % across all sites. Species richness across sites ranged from 18 to 65, and was not correlated with any environmental variable including distance. Average species richness was lowest in the Caloosahatchee River sites (37) and highest in the Myakka River (48). Peace River and Harbor sites had an average of 45 and 44 taxa per site, respectively. Diatom assemblages in each region were significantly different from assemblages in every other region (p < 0.03), and the two rivers with the broader sampling area, the Peace and Caloosahatchee, were especially different from the Harbor sites with R values of 0.58 and 0.56, respectively (Table 1). The assemblages in the furthest downstream river sites where tidal mixing occurs were significantly different between the Caloosahatchee and Myakka Rivers (R = 0.31; p < 0.05) and the Peace and Caloosahatchee Rivers (R = 0.24; p < 0.02), but there was no significant difference between assemblages in the downstream Peace and Myakka Rivers. In the upstream reaches of the rivers, there were significant differences between the Peace and Caloosahatchee River assemblages (R = 0.83; p < 0.001) and the Peace and Myakka River assemblages (R = 0.50; p < 0.03), but no significant difference between assemblages in the upstream Caloosahatchee and Myakka Rivers (Table 2).

Table 1 ANOSIM comparing regions by (a) diatom assemblage composition and (b) environmental measurements
Table 2 ANOSIM comparing diatom assemblages in (a) upstream river reaches, (b) downstream river reaches, and environmental characteristics in (c) upstream river reaches, and (d) downstream river reaches

The most abundant taxa (5 % and greater average relative abundance) in each region included two or more taxa that were common throughout the watershed, as well as some that were only abundant in that region (Table 3). Indicator species analysis revealed statistically significant indicators for each region (Table 4). Gomphonema brasiliense Grunow was the best indicator for the Caloosahatchee River (p = 0.003), Fragilaria sopotensis was the best indicator for the Peace River (p < 0.001), Nitzschia coarctata Grunow was the best indicator for the Myakka River (p < 0.001), and Fallacia nyella (Hustedt ex Simonsen) Mann was the best indicator for the Harbor region (p = 0.006).

Table 3 Mean relative abundance of the most abundant taxa in site assemblages grouped by region. Taxa that comprise at least 5% mean relative abundance in a region are shown
Table 4 Optima (Opt.) and tolerances (Tol.) for salinity (Sal.), total phosphorus (TP), total nitrogen (TN), and distance (Dist.) for abundant and indicator diatom species

Environmental Conditions and Relationships to Assemblage Structure

Gradients of salinity, conductivity, pH, TP, TN, and TC were apparent but varied in length and degree of overlap among the four regions (Table 5). A strong salinity gradient was present in each river with values ranging from almost completely fresh (<0.5 ppt) to brackish (>19 ppt). Nutrient gradients were also evident and varied in length among regions. The highest value for TN in the Caloosahatchee River (1.6 mg/g) was equal to the mean value in the Peace River, which had a much longer gradient (mean 3.0 mg/g); the Harbor sites had the highest mean TN and longest gradient of all the regions. The mean TP concentration in the Peace River (4,670 μg/g) was almost four times the mean for the Caloosahatchee River (1,255 μg/g), which had the next longest TP gradient and next highest concentrations. The Myakka River had the shortest TP gradient of the three rivers, which fell entirely within the range found in the Caloosahatchee River, and the Harbor had an even smaller range and the lowest mean concentration for TP.

Table 5 Mean, standard deviation (in parentheses below), and range (in italics below) of environmental variables for each region and for all sites: conductivity (Cond.), salinity (Sal.), total phosphorus (TP), total nitrogen (TN), total carbon (TC), and distance (Dist.) from site in harbor center where sediment cores were collected

Pearson’s correlation coefficients showed that relationships were strongly spatial, with distance correlated with all variables except for TN and TC (Table 6). Nutrient concentrations generally had weak correlations with other variables and one another. ANOSIM revealed significant environmental differences among some but not all of the subregions (Table 1). Water quality of the Myakka River was not significantly different from either the Harbor or the Peace River but was different from the Caloosahatchee River (p < 0.001). The Caloosahatchee River water quality was also significantly different from the Peace River and the Harbor (p < 0.006 and p < 0.001, respectively), and the Peace River and Harbor also differed in water quality (p < 0.001). When separated into downstream reaches (where tidal mixing occurs) and upstream reaches (no tidal mixing), the three rivers were all significantly different from each other in the downstream reaches and only the Peace River and Caloosahatchee River were not significantly different in the upstream reaches (Table 2).

Table 6 Pearson’s correlation coefficients of the environmental variables across the Charlotte Harbor watershed: conductivity (Cond.), salinity (Sal.), total phosphorus (TP), total nitrogen (TN), total carbon (TC), and distance (Dist.) from site in harbor center where sediment cores were collected

Different environmental variables correlated with assemblage patterns at different spatial scales (Fig. 2a–e), as shown in two-dimensional NMDS ordinations (2D stress = 0.18; 3D stress = 0.12) for all sites. Salinity had the strongest correlation with the assemblage patterns across the whole watershed (R = 0.89; p < 0.001), but not within individual regions (Table 7). Distance was also strongly correlated with assemblage patterns, and had the strongest correlation with the Peace River assemblages (R = 0.89; p < 0.001). Conductivity had the strongest relationship with Myakka River assemblages (R = 0.97; p < 0.001), but salinity and distance were nearly as strong (R = 0.96; p < 0.001 and R = 0.95; p < 0.01, respectively). In the Caloosahatchee River and Harbor sites, TC had the strongest relationship with the assemblages (R = 0.67 and 0.61, respectively; p < 0.05), and TC was the only statistically significant relationship for the Harbor sites.

Fig. 2 Nonmetric multidimensional scaling ordination diagrams based on Bray–Curtis similarity in diatom composition of (a) the whole Charlotte Harbor watershed dataset, (b) Harbor sites, (c) Caloosahatchee River sites, (d) Peace River sites, and (e) Myakka River sites. Overlaid trajectories show the magnitude and direction of the correlation of the species ordination with environmental variables. Each diagram is oriented with salinity aligned on the horizontal axis

Table 7 Pearson’s correlation coefficients for each environmental variable with the species ordination

The Mantel test (BEST) revealed which combinations of variables have the strongest correlation with assemblage dissimilarity in each region and across the whole watershed (Table 8). Salinity, distance, and conductivity comprised the strongest correlations with differences in assemblage composition and were significantly correlated with one another. For the whole watershed, the strongest association between assemblage dissimilarity and environmental dissimilarity was with salinity alone, followed by salinity with distance (R = 0.69 and 0.68, respectively; p < 0.01). When the overriding, covarying environmental variables (salinity, distance, and conductivity) were excluded from consideration, the correlation between assemblage dissimilarity and the environmental dissimilarity explained by the remaining variables remained strong. The strongest combination with the exclusion of these variables was pH, TC, and TN (R = 0.31; p < 0.01). In the Caloosahatchee River sites, the strongest correlation with assemblage dissimilarity was also salinity, followed by salinity with distance (R = 0.45 and 0.43, respectively; p < 0.05). Excluding salinity, distance, and conductivity, TC had the highest correlation with assemblage dissimilarity but was not statistically significant. In the Myakka River, conductivity alone had the highest correlation with assemblage dissimilarity, followed by the combination of salinity, distance, and conductivity (R = 0.92 and 0.90, respectively; p < 0.01). Excluding these three, TP and pH both had significant correlations with assemblage dissimilarity (R = 0.68 and 0.64, respectively; p < 0.04). In the Peace River, salinity and distance had the highest correlation with assemblage dissimilarity, followed by distance alone (R = 0.84 and 0.81, respectively; p < 0.01). TP, pH, and TC were also correlated with assemblage dissimilarity. In the Harbor sites, pH (R = 0.36) had the highest correlation with assemblage dissimilarity, but no relationships were statistically significant.
Upstream assemblages did not have any statistically significant relationships with environmental variables, but downstream assemblages were correlated with pH, TP, and distance (R = 0.60; p < 0.01). With distance, salinity, and conductivity excluded, TP had the strongest correlation with downstream assemblages (R = 0.20; p < 0.04).

Table 8 Three strongest correlations of combinations of variables with species dissimilarity matrices when (a) all environmental variables are included, and (b) salinity, distance, and conductivity are excluded

Environmental Preferences and Predictions

Optima and tolerances for diatom taxa across all sites were calculated for salinity, distance, TN, and TP to develop diatom-inferred prediction models using the weighted-averaging regression-calibration approach (Table 4). Prediction models showed that taxa across the entire watershed and within each region reliably predict these variables; however, optima calculated within a given region often differed from those calculated from the whole dataset, causing regional prediction models to over- or underpredict values for the whole dataset. Therefore, the strongest inference models were those developed using the watershed as a whole (Fig. 3a–d). Prediction models for salinity and distance were very strong (R² = 0.96 and 0.93, respectively). Nutrient prediction models were also strong (R² = 0.82 and 0.83 for TN and TP, respectively). There were no significant correlations found between residuals and any other environmental variable.

Fig. 3 Plots of observed vs. diatom-predicted (a) salinity, (b) distance, (c) total nitrogen, and (d) total phosphorus using weighted averaging partial least squares regression with cross validation on the dataset of all Charlotte Harbor watershed sites. Line represents a 1:1 relationship

Discussion

Diatom Distribution and Relationships to Environmental Gradients

The significant environmental variation across the Charlotte Harbor watershed provided an excellent opportunity to characterize spatial controls on algal assemblages and their relationships to regional characteristics and water quality patterns. These relationships should be applicable across the Gulf Coast, since diatom assemblages in Charlotte Harbor included many taxa reported in other Gulf Coast estuaries. Three of the most abundant taxa in the Harbor sites, Opephora pacifica, A. disculoides, and Dimeregramma minor, were reported by Cremer et al. (2007) in Rookery Bay, Florida, south of Charlotte Harbor; Opephora species and A. disculoides were also reported by Van Soelen et al. (2010) in Tampa Bay, Florida, north of Charlotte Harbor, as well as other taxa that were common in the Harbor sites such as P. delicatulum (5% mean relative abundance), Cyclotella litoralis Lange and Syvertsen, and Paralia sulcata (Ehrenberg) Cleve (each 1% mean relative abundance). In contrast, diatom assemblages found in the Myakka, Peace, and Caloosahatchee Rivers are poorly documented. Morales (2002, 2005) noted the importance of small, fragilarioid diatoms in the brackish waters of the Peace and Caloosahatchee Rivers and resolved some taxonomic discrepancies. Some of the genera that were the focus of these studies were common in the Peace and Caloosahatchee River sites, including Staurosirella, Fragilaria, and Opephora. No previous documentation of the diatom flora of the Myakka River was found, except that it (along with the Peace and Caloosahatchee Rivers, as well as many other Florida streams) was noted as a location where a recently described species, Staurosira stevensonii Manoylov, Morales and Stoermer, which was not identified in our study, was found (Manoylov et al. 2003). Several of the most common taxa in the river regions, including A. lanceolata, C. adhaerens, and Staurosira construens var. venter (Ehrenberg) Hamilton, have been reported in eastern USA estuaries such as the Chesapeake Bay and the Pamlico and Neuse estuaries of North Carolina (Cooper 1995 and 2000); however, they have rarely been reported elsewhere along the Gulf Coast, suggesting that additional characterization and taxonomic evaluation of the diatom flora is needed along river-to-sea gradients in this region.

Several diatom taxa exhibited strong affinities for a particular region, as evidenced by the relatively high number (18, or 6 %, of all identified taxa) of significant indicator species. Region-specific taxa optima differed from optima based on the whole watershed. This difference may be related to the smaller environmental gradient represented in each region compared to the whole watershed, but it also suggests that assemblage composition may be controlled by different environmental factors on different spatial scales. The diatom assemblages differed among regions, suggesting variable environmental controls among subbasins. Differing correlations of environmental variables to assemblage dissimilarity among sites within each region also suggest variable environmental control on different spatial scales and are consistent with the findings of previous research on diatom community spatial structure in freshwater streams (e.g., Pan et al. 2000; Soininen et al. 2004).

The best indicator species for the Harbor and Myakka River sites, F. nyella and N. coarctata, have previously been reported in other coastal areas including Brazilian beaches (Garcia 2003) and intertidal mudflats in Europe (e.g., Haubois et al. 2005). Indicator species for the Peace River were generally those having the highest optima for total phosphorus, but G. brasiliense, an indicator for the Caloosahatchee River, had the second highest total phosphorus optimum of all abundant or indicator species and was not found in the Peace River sites. Its overall low occurrence (average 1 % relative abundance in the Caloosahatchee River assemblages and not found anywhere else) might explain its limited distribution (six sites, all in the upper Caloosahatchee); the only site where G. brasiliense constituted more than 10 % relative abundance was the site with the highest total phosphorus concentrations found outside of the Peace River, which was immediately upstream from the Franklin locks that separate the upstream flow way from marine influence. The optimum value for distance from the Harbor for G. brasiliense was comparable to the optima for Peace River indicator species, which may be a stronger indication of its habitat affinity given its overall low occurrence in the watershed, although its salinity optimum was lower than most Peace River indicators. Interestingly, good indicators for the Harbor sites typically had lower salinity optima than indicators for the Myakka River, which could suggest that environmental variables other than salinity are more important drivers of the distribution of these taxa. Additional sampling of the Myakka River in farther upstream reaches may help to clarify environmental drivers that are not related to salinity. Additional quantification of the autecology of diatom taxa in this watershed will improve the ability to interpret these findings and evaluate impacts of water management decisions. For example, because G. brasiliense was only found at sites upstream of the lock separating the Caloosahatchee from marine waters, it may be a reliable indicator of this unique habitat. This species has not previously been reported in studies of Lake Okeechobee diatom flora (e.g., Stoermer et al. 1992), but may be associated with water chemistry changes that occur in response to freshwater releases from the lake.

Significant water quality differences among regions and subregions provided insight into how environmental variables affect biological patterns on different spatial scales. Water quality differences were evident among most regions, and some differences were stronger on the subregional scale. When considering only the downstream river sites that are subject to tidal mixing, each river was significantly different from the others, and the difference between the Peace and Caloosahatchee Rivers was stronger among downstream sites than upstream, which suggests that tidal mixing is not the dominant environmental control in at least one of these areas. Upstream sites were significantly different (both with distance included and excluded as a variable) except for the Peace and Caloosahatchee Rivers, which had the longest spatial reaches; this similarity may reflect moderation of environmental gradients at reaches farther inland. The salinity gradient, for example, is much gentler in the upstream reaches of these two rivers, with a much smaller range over a larger distance. Although differences among mean salinity values for each river were apparent (e.g., the mean salinity in the Myakka River was nearly four times the mean salinity in the Peace River), these differences were not surprising given the geographical characteristics of each river and the number of samples taken from each; the lower salinity means in the Peace and Caloosahatchee Rivers were likely because they are longer rivers and the sample size in inland reaches was higher, as reflected in their higher mean values for geographic distance. Salinity was clearly expected to covary with distance from the Harbor, but the relationship was not strictly linear. Where salinity was below 1 ppt, which it was in 36 % of all sites, there was no significant relationship between salinity and distance, suggesting that the expected covariation derived from the salinity gradient is greatly diminished in these inland reaches.
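The subset comparison described above (salinity versus distance across all sites, and again only where salinity is below 1 ppt) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used in this study: the site data are synthetic, the names `distance_km`, `spearman`, and `permutation_p` are hypothetical, and a Spearman rank correlation with a permutation test stands in for whatever test was actually applied.

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation; assumes no tied values in x or y."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]

def permutation_p(x, y, n_perm=999, seed=0):
    """Two-sided permutation p-value for the Spearman correlation."""
    rng = np.random.default_rng(seed)
    observed = abs(spearman(x, y))
    exceed = sum(abs(spearman(x, rng.permutation(y))) >= observed
                 for _ in range(n_perm))
    return (exceed + 1) / (n_perm + 1)

# Synthetic sites (hypothetical): salinity decays with distance from the
# harbor until it reaches a noisy freshwater floor.
rng = np.random.default_rng(42)
distance_km = np.linspace(0, 60, 40)
floor = rng.uniform(0.05, 0.5, distance_km.size)
salinity = np.maximum(35 * np.exp(-distance_km / 8), floor)

# Correlation over the whole gradient
rho_all = spearman(distance_km, salinity)
p_all = permutation_p(distance_km, salinity)

# Correlation restricted to sites with salinity below 1 ppt
low = salinity < 1.0
rho_low = spearman(distance_km[low], salinity[low])
p_low = permutation_p(distance_km[low], salinity[low])

print(f"all sites:  n={salinity.size}, rho={rho_all:.2f}, p={p_all:.3f}")
print(f"< 1 ppt:    n={low.sum()}, rho={rho_low:.2f}, p={p_low:.3f}")
```

Whether the low-salinity subset shows a significant relationship depends entirely on the data; the sketch only shows how the two tests are set up side by side.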

Salinity was the variable most strongly related to assemblage composition across the watershed, which has been shown to be an important factor controlling diatom distribution within estuarine ecosystems (Snoejis 1994; Underwood et al. 1998; Frankovich et al. 2006; Wachnicka et al. 2010). Within each catchment inflow to the estuary, however, environmental variables differed in the strength of their correlation with assemblage dissimilarity and tended to differ in the order of dominant controls. The covariation of salinity with other variables can be confounding because it becomes difficult to evaluate the relative influence of each variable on assemblages independently (Thornton et al. 2002); by sampling across a large spatial gradient that encompassed the freshwater reaches in each river where covariation is diminished, we believe we can begin to tease out some of these relationships. Nitrogen, which is the limiting nutrient in this watershed, particularly during the dry season (Montgomery et al. 1991; Morrison et al. 1998; Heil et al. 2007), did not significantly correlate with assemblage differences in any region of the watershed, which could be due to relatively low extant concentrations throughout the watershed. TP concentrations were significantly related to compositional dissimilarity in the Myakka River region and the watershed as a whole, and TP in conjunction with pH and TC had a lesser but still significant relationship with community structure in the Peace River. While it is not surprising that TP was related to assemblage structure in the Peace River because it had elevated TP concentrations compared to the other regions, it is somewhat surprising that it was not as strong a correlation as it was in the Myakka River, where concentrations were lower. This suggests that TP has a stronger relative control on diatom assemblages in the Myakka River than it does in the Peace River. 
All measured variables in the Peace River were significantly correlated with assemblage dissimilarity except for TN, while in the Myakka River only TP, salinity, conductivity, and distance were correlated with assemblage dissimilarity; relationships between Myakka River assemblages and TN, TC, and pH were not significant.

Estuarine diatom assemblages have complex spatial dynamics. Previous studies have demonstrated that different diatom assemblages can be found in similar environmental conditions across estuarine and salt marsh habitats (e.g., Sullivan 1978 and 1982; Navarro and Torres 1987), which is consistent with our findings in the Charlotte Harbor watershed. The notion that microorganisms should be cosmopolitan across tolerable conditions (Baas-Becking 1934) has been challenged by evidence of endemism and spatial limitation in diatoms (Kociolek and Spaulding 2000; Telford et al. 2006; Vanormelingen et al. 2008). Kociolek and Spaulding (2000) suggested that historical landscape factors have a greater effect on modern-day diatom assemblage patterns than does dispersal capacity. Our results appear to support this idea, in that the upstream reaches of the Peace and Caloosahatchee Rivers did not differ significantly in environmental characteristics among sites but the diatom assemblages were significantly different (R = 0.83, p < 0.001; Table 2). In a contiguous watershed where dispersal should not be limited, similar assemblages should occur where environmental differences are not substantial. While difficult to quantify, the difference in the hydrological history of these two upstream areas may explain the dissimilar diatom assemblages, or it may contribute additional environmental controls not measured; for example, sediment grain size was not evaluated in this study but can influence diatom distribution and can be altered through anthropogenic activity such as the mining that occurred in the Peace River (Cahoon et al. 1999). Similarly, Sullivan (1978, 1982) found elevation, which was poorly correlated with other variables, to explain most of the variability in diatom assemblage similarity across Gulf Coast salt marshes (most likely due to desiccation regimes, which were not a factor for our study sites). 
Although it seems unlikely that factors like hydrodynamic energy play a significant role in driving assemblage composition in these upstream reaches during the dry season, as flow drops to nearly zero (Hammett 1988), it is possible that they are important during other seasons and could be important at other sites.
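To make the group comparison above concrete, the sketch below computes an analysis-of-similarities (ANOSIM) style R statistic with a permutation p-value. It assumes that the reported R is of this kind and that dissimilarities are Bray-Curtis; neither is confirmed by this section. The abundance data, group labels, and function names are hypothetical.

```python
import numpy as np

def bray_curtis(abund):
    """Pairwise Bray-Curtis dissimilarity between rows (sites)."""
    abund = np.asarray(abund, dtype=float)
    diff = np.abs(abund[:, None, :] - abund[None, :, :]).sum(axis=2)
    total = (abund[:, None, :] + abund[None, :, :]).sum(axis=2)
    return diff / total

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = np.argsort(values, kind="mergesort")
    sorted_vals = values[order]
    ranks = np.empty(len(values))
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1
        i = j + 1
    return ranks

def anosim_r(dist, groups):
    """R = (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    groups = np.asarray(groups)
    n = len(groups)
    rows, cols = np.triu_indices(n, k=1)
    ranks = average_ranks(dist[rows, cols])
    within = groups[rows] == groups[cols]
    return (ranks[~within].mean() - ranks[within].mean()) / (n * (n - 1) / 4.0)

def anosim(dist, groups, n_perm=999, seed=0):
    """Observed R plus a one-sided p-value from permuted group labels."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    observed = anosim_r(dist, groups)
    exceed = sum(anosim_r(dist, rng.permutation(groups)) >= observed
                 for _ in range(n_perm))
    return observed, (exceed + 1) / (n_perm + 1)

# Synthetic example (hypothetical): two upstream groups of six sites each,
# dominated by different taxa.
rng = np.random.default_rng(1)
base_a = np.array([40.0, 30.0, 20.0, 5.0, 5.0])
base_b = np.array([5.0, 5.0, 20.0, 30.0, 40.0])
sites = np.vstack([base_a + rng.uniform(0, 5, (6, 5)),
                   base_b + rng.uniform(0, 5, (6, 5))])
groups = np.array(["river_A"] * 6 + ["river_B"] * 6)

r_stat, p_value = anosim(bray_curtis(sites), groups)
print(f"R = {r_stat:.2f}, p = {p_value:.3f}")
```

R approaches 1 when every between-group dissimilarity exceeds every within-group dissimilarity and is near 0 when group membership carries no information, which is why a value such as 0.83 indicates strongly separated assemblages.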

Weighted Averaging Inference Models

The WA-PLS models based on the diatom optima predictions were strong for the variables of interest, particularly salinity (R² = 0.96). This diatom-based inference value for salinity was slightly higher, but comparable to others generated previously for other regions including the southeastern coast of Florida (R² = 0.91; Gaiser et al. 2005) as well as lakes of the Great Plains region (R² = 0.91; Fritz et al. 1991), and nearly identical to one generated for Florida Bay at Florida’s southern tip that also covered a broad salinity gradient (R² = 0.97; Wachnicka et al. 2010). The prediction error in the salinity estimation model is quite low (RMSEP = 0.1), supporting salinity as the primary controlling variable across the watershed even though other variables have differing influences in different regions. Therefore a transfer function for salinity provides a reliable tool for quantitative predictions of salinity values across the watershed and for paleoecological reconstructions from diatoms preserved in sediment cores. The distance model was also strong (R² = 0.93, RMSEP = 1.6) but significantly correlated with salinity. Employed in concert, these models could provide useful insights into historic sea level position and predictions about sea level rise. Nutrient models were also strong and provide useful inferences. In almost all cases, the nutrient models were stronger within regional catchments than in the watershed as a whole, suggesting that different regions respond to nutrient gradients in different ways. Regionality in transfer function strength has interesting implications for water management strategies because of the scope of the watershed. For example, models to predict phosphorus concentration using diatom assemblages from the Peace River do not apply well to the watershed as a whole because they over- or underestimate some values along the gradient, but they do show a strong ability to predict conditions in the Peace River. 
Within-region strength could be an indication that the salinity gradient is a confounding effect on nutrient models on the larger spatial scale, but in the smaller river reaches nutrient models can be powerful. Regional inference models are limited by their relatively small sample size (the Peace River, the largest river subset, contained only 16 sites), but additional sampling could further refine the predictive models and provide valuable information for managing each catchment individually.
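The logic of a diatom-based transfer function and its cross-validated error can be sketched as follows. This is a simplified stand-in, not the WA-PLS procedure used here: it uses plain weighted averaging with inverse deshrinking and leave-one-out cross validation. The training set is synthetic, and all function and variable names are hypothetical.

```python
import numpy as np

def wa_raw(abund, optima):
    """Abundance-weighted mean of taxon optima for each site."""
    ok = ~np.isnan(optima)
    weights = abund[:, ok]
    return (weights * optima[ok]).sum(axis=1) / weights.sum(axis=1)

def wa_fit(abund, env):
    """Estimate taxon optima, then fit inverse deshrinking coefficients."""
    totals = abund.sum(axis=0)
    present = totals > 0
    optima = np.full(abund.shape[1], np.nan)
    # Optimum = abundance-weighted mean of the variable where the taxon occurs
    optima[present] = (abund[:, present] * env[:, None]).sum(axis=0) / totals[present]
    # Inverse deshrinking: regress observed values on the initial estimates
    slope, intercept = np.polyfit(wa_raw(abund, optima), env, 1)
    return optima, slope, intercept

def wa_predict(abund, model):
    """Apply fitted optima and deshrinking coefficients to new samples."""
    optima, slope, intercept = model
    return intercept + slope * wa_raw(abund, optima)

def leave_one_out(abund, env):
    """Predict each site from a model fitted to all other sites."""
    n = len(env)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        model = wa_fit(abund[keep], env[keep])
        preds[i] = wa_predict(abund[i:i + 1], model)[0]
    rmsep = np.sqrt(np.mean((preds - env) ** 2))
    r2 = np.corrcoef(preds, env)[0, 1] ** 2
    return preds, rmsep, r2

# Synthetic training set (hypothetical): 30 sites and 12 taxa with Gaussian
# responses along a salinity gradient, converted to relative abundances.
salinity = np.linspace(0, 35, 30)
true_optima = np.linspace(0, 35, 12)
abund = np.exp(-0.5 * ((salinity[:, None] - true_optima[None, :]) / 6.0) ** 2)
abund = abund / abund.sum(axis=1, keepdims=True)

preds, rmsep, r2 = leave_one_out(abund, salinity)
print(f"cross-validated R2 = {r2:.2f}, RMSEP = {rmsep:.2f}")
```

Fitting the same functions to a regional subset rather than the full set of sites mirrors the regional versus watershed comparison discussed above; with few sites, each left-out sample removes a larger share of the training information, so cross-validated error estimates become less stable.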

This study provides evidence that different environmental characteristics of freshwater inflows to an estuary generate distinctive diatom assemblages that can be used to characterize regional differences and the spatial patterns of environmental gradients across the watershed. Results from this study demonstrate that diatoms are excellent indicators of the relationships between the interacting influences of salinity and nutrient gradients in estuarine ecosystems. As changes in freshwater flow result from climate change and/or human alterations to hydrologic regimes, understanding the nature of environmental changes on different spatial scales will be increasingly important to predict ecosystem responses. Our models provide a powerful tool for quantifying these changes independent of direct measurements, allowing for interpretation of environmental changes not only across spatial gradients but also through time; the ability to infer conditions through periods prior to and throughout anthropogenic impacts will help to inform water quality management goals, particularly at the subregional scale.