Introduction

A major goal of paleolimnological studies is the quantification of past environmental conditions. Frequently, it is the change through time in some environmental parameter (e.g. pH, temperature) that is of interest, but these parameters may leave no direct trace in the sediment. However, it may be possible to infer this change through the use of some proxy indicator. Commonly, subfossil assemblages of some organism can be used to derive such environmental series through the application of a transfer function. This is based on a space-for-time substitution, where the present-day organism assemblages are sampled from the modern sediments of a spatial series of lakes. Water chemistry and other characteristics of the lake environment are sampled at the same time.

The first step in a paleolimnological study is thus to assess the organism-environment relations from a set of lakes along some environmental gradient (e.g. temperature, pH, total phosphorus). These are referred to as calibration or training sets, and different aquatic organisms including chironomids (e.g. Barley et al. 2006), ostracodes (e.g. Viehberg 2006), and diatoms (e.g. Bouchard et al. 2004) have been used in these studies. The statistical calibration methodology is based on the assumption that the point sample represents the “environment” in which the organism is living, just as the subfossils extracted from the sediment sample are supposed to be a representative sample of the aquatic community. It is recognized that this is only a crude approximation, but practical considerations usually do not allow the complete characterization of the lake environment over the entire lake volume, for the entire season, and over several years. Violations of this assumption are probably less critical for the sediment sample, as the sediment integrates several years of accumulation and contains organisms from many parts of the lake. In comparison, the properties of the water column are more variable. The methodology works only to the extent that the spatial differences between the sites are much greater than the annual and interannual differences within any lake.

Paleolimnologists acknowledge this and other problems (e.g. Birks 1995, 1998), and there have been a few studies addressing the impact of environmental variability on the development of transfer functions. For example, Siver and Hamer (1992) studied the seasonality of certain algae in relation to their potential for reconstructing the seasonal variability of water chemistry. They found that the flora represented in the surface sediments grew primarily during late autumn. Bradshaw et al. (2002) studied the seasonal variability of total phosphorus in relation to the diatoms in an effort to improve calibration sets. Models based on spring total phosphorus concentrations only slightly improved the results, but the lack of improvement may be due to the small size of the dataset (23 lakes). Köster and Pienitz (2006) investigated diatom seasonality in relation to environmental variables, derived a transfer function, and applied this to a 1,000-year fossil record. They suggest that accounting for seasonality may improve paleolimnological inference models. Other studies have assessed the influence of within-lake variability on, for example, cladoceran (e.g. Kattel et al. 2007), chironomid (e.g. Heiri 2004; Eggermont et al. 2007), and macrofossil (e.g. Zhao et al. 2006) sediment assemblages. However, whether interannual variability of the environment affects the parameterization of transfer functions has not been sufficiently studied.

Many studies document the chemistry of spatial series of lakes (e.g. Armstrong and Schindler 1971; Hamilton et al. 2001) and other studies have monitored the seasonal or interannual variability of the water chemistry and physical properties (e.g. Schindler et al. 1974; Sorvari et al. 2000), and phytoplankton populations and communities (e.g. Pearsall 1930, 1932; Riley 1939; Scheffer and Robinson 1939; Hutchinson 1944; Brock 1985; Talling 1993; Harris and Baxter 1996; Watson et al. 1997) of one or several lakes. One conclusion of these studies is that the chemical and physical parameters are to some extent correlated with climate, suggesting that the year of sampling may strongly influence the species-environment relations and ultimately, the computation of paleoenvironmental transfer functions.

In this paper we address one question: does the year of sampling of the environmental variables affect the parameterization of organism-environment relations? In the years 2000 and 2006, we measured environmental variables and collected water samples from a set of 41 lakes in the southwest Yukon Territory and northern British Columbia, Canada. Diatoms, chironomids, and ostracodes have been identified from the surface sediments of these lakes and the organism-environment relations have been studied (Wilson and Gajewski 2002, 2004; Bunbury and Gajewski 2005). The combination of two years of environmental data collected during a similar time period (early summer), together with three groups of organism assemblage data from the same set of 41 lakes, has resulted in a dataset that can be used to investigate this question.

We recognize that there is a limit to the results that can be obtained because this was not a planned experiment. However, this will provide a first approximation towards answering this question. In this study, we assume the microfossils in the sediment are measured without error, as do all workers using this approach, and that the sediment assemblage is representative of the organisms in the lake. In addition, because a sediment sample accumulates several years of deposition and integrates the microfossils that grew under the conditions present in those years, it is reasonable to compare these organism assemblages to environmental datasets collected in different years, as this is what is done in practice.

Methods

Field and laboratory methods

Descriptions of the field methods, water chemistry analysis, organism extraction and identification, and the study area (Fig. 1) are presented in Wilson and Gajewski (2002, 2004), and Bunbury and Gajewski (2005). Briefly, a surface sediment sample was collected from the deepest part of each lake using either a Glew gravity corer (Glew 1991) or an Ekman dredge. Water samples were collected and analysed using standard methods.

Fig. 1 Map of the southwest Yukon and the 41 lakes sampled for this study

Diatom data from 41 lakes were available in Wilson and Gajewski (2002). Only 39 of the sites were used in Wilson and Gajewski (2004), therefore chironomids were extracted and identified from the two remaining sites (Appendix 1) in order to retain the same set of lakes in all three comparisons performed here. Ostracode data were available from 31 of the sites (Bunbury and Gajewski 2005), and data from 10 lakes were added for this study (Appendix 2).

Environmental data for the year 2000 are found in Wilson and Gajewski (2002). In 2006, we returned to all 41 lakes to collect water chemistry samples and measure environmental parameters to compile a dataset (Appendix 3) comparable to the one that had been collected in 2000. Secchi depth was removed as an environmental variable from all analyses because measurements were often constrained by shallow lake depths, thereby not accurately representing water clarity.

Data analysis

Organism-environment relations

Wilson and Gajewski (2002, 2004) document datafile preparation for the year 2000 data. For example, mean values were substituted for missing data, and the trace metal variables were selected based on 50% of the data points being above detection limits; see the original papers for full details. The same approach was used in 2006 with the following exceptions. Surface-water pH was averaged from two different measurements taken by Oakton handheld pH meters. Alkalinity measured in the laboratory was missing for Cub, Kusawak, Little Louise, Scout and West Twin Lakes, therefore a regression was computed relating field and laboratory measurements of alkalinity (not shown), and the modelled values substituted. Alkalinity values of Grayling Lake appeared unreasonable for both field and laboratory values based on previous years’ data (unpublished); the mean of the modelled alkalinity values was substituted in this one instance.
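The regression-based substitution of missing laboratory alkalinity described above can be illustrated with a minimal sketch. It assumes an ordinary least-squares line relating laboratory to field alkalinity; the form of the regression actually used is not shown in the paper, and the function names and all numbers below are hypothetical.

```python
# Sketch only: fill missing laboratory alkalinity from field alkalinity with
# an ordinary least-squares line. Function names and data are hypothetical.

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute_lab_alkalinity(field, lab):
    """Replace None entries in `lab` with values predicted from `field`."""
    # Fit the line only on lakes where both measurements exist.
    pairs = [(f, l) for f, l in zip(field, lab) if l is not None]
    slope, intercept = fit_line([f for f, _ in pairs], [l for _, l in pairs])
    return [l if l is not None else slope * f + intercept
            for f, l in zip(field, lab)]


# Made-up values; the third lake has no laboratory measurement.
field_alk = [50.0, 80.0, 120.0, 150.0, 200.0]
lab_alk = [55.0, 85.0, None, 155.0, 205.0]
filled = impute_lab_alkalinity(field_alk, lab_alk)
```

Measured values are left untouched; only the missing entries receive modelled values.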

Detrended correspondence analysis (DCA) (Hill and Gauch 1980) and canonical correspondence analysis (CCA; CANOCO 4.5) (ter Braak and Šmilauer 2002) were used for all analyses. DCA with detrending by linear segments and non-linear rescaling of axes revealed gradient lengths indicating unimodal methods were appropriate for chironomids (2.7 standard deviation units (SD)), ostracodes (3.8 SD), and diatoms (4.1 SD). The DCA gradient length was shorter for chironomids (2.7 SD), suggesting a redundancy analysis (RDA) might also be applicable. However, the results of the CCAs and the RDAs were comparable (not shown), therefore the CCAs are presented to make comparisons between the different proxies easier to interpret.

Different environmental variables are important for each group of organisms and those used in the analyses of the organism-environment relations are outlined in Wilson and Gajewski (2002, 2004) and Bunbury and Gajewski (2005). Bottom water temperature (Tb), and bottom dissolved oxygen (DOb) were added to the ostracode environmental dataset as we thought these to be potentially important variables not included in the original analysis. Strontium (Sr) and the strontium–calcium ratio (Sr/Ca) were not included in the analysis as strontium was not measured in 2000.

The ordinations for chironomids and diatoms were performed using 38 species and 29 environmental variables, and 130 species and 24 environmental variables, respectively. Ostracodes were absent in the sediments of 4 lakes. Since the ordination techniques we used do not accommodate complete species absence at a site (Lepš and Šmilauer 2003), the analyses for ostracodes were performed on the remaining 37 lakes using 30 species and 31 environmental variables.

Environmental variable transformations were determined based on the frequency distribution of the values of the variables in each year (not shown). It was necessary to apply different transformations to sulphate (SO4), total phosphorus (TP), and filtered total phosphorus (TPF) because the shape of the histograms varied between the 2 years. Depth, specific conductance (Cond), calcium (Ca), magnesium (Mg), dissolved organic carbon (DOC), total Kjeldahl nitrogen (TKN), silica (Si), and iron (Fe) were square root transformed; area, sodium (Na), chloride (Cl), chlorophyll a (Chla), aluminum (Al), manganese (Mn) and molybdenum (Mo) were log transformed; and all other variables were left untransformed. Sulphate, TP, and TPF were log transformed in 2000 and square root transformed in 2006.
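The transformation scheme above can be written as a small lookup, which makes the year-dependent treatment of SO4, TP, and TPF explicit. This is a sketch only: the variable abbreviations follow the text, the function names are hypothetical, and the use of a base-10 logarithm without an offset is an assumption, since the text does not state the base or any added constant.

```python
import math

# Variables and their transformations as listed in the text.
SQRT_VARS = {"Depth", "Cond", "Ca", "Mg", "DOC", "TKN", "Si", "Fe"}
LOG_VARS = {"Area", "Na", "Cl", "Chla", "Al", "Mn", "Mo"}
YEAR_DEPENDENT = {"SO4", "TP", "TPF"}  # log in 2000, square root in 2006


def transform_kind(variable, year):
    """Return 'log', 'sqrt' or 'none' for a variable in a given sampling year."""
    if variable in YEAR_DEPENDENT:
        if year == 2000:
            return "log"
        if year == 2006:
            return "sqrt"
        raise ValueError("only the 2000 and 2006 datasets are described")
    if variable in SQRT_VARS:
        return "sqrt"
    if variable in LOG_VARS:
        return "log"
    return "none"  # all other variables were left untransformed


def transform(value, variable, year):
    """Apply the transformation for `variable` in `year` to a single value."""
    kind = transform_kind(variable, year)
    if kind == "sqrt":
        return math.sqrt(value)
    if kind == "log":
        # Assumption: base-10 log of the raw value, no offset.
        return math.log10(value)
    return value
```

For example, `transform_kind("TP", 2000)` gives `"log"` while `transform_kind("TP", 2006)` gives `"sqrt"`.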

Transformations applied to the species data were based on the original papers; chironomids were square root transformed, ostracodes were log transformed, diatoms were left untransformed, and rare species were downweighted in the DCAs and the CCAs. Biplot scaling was applied to the chironomid analyses, whereas Hill’s scaling was applied to the ostracode and diatom CCAs. Leverage diagnostics in CANOCO 4.5 were used to check on the influence of outliers and, where applicable, samples were made supplementary.

To reduce the number of collinear environmental variables, we first performed a series of constrained CCAs where each environmental variable was individually selected to determine its marginal effect in the analysis (Lepš and Šmilauer 2003). Only significant (P < 0.05, 499 Monte Carlo permutations) variables were included in further analyses. Second, preliminary CCAs were run to check for variables with high variance inflation factors (VIF), which were sequentially removed, resulting in VIFs <10 in all analyses. Lastly, step-wise regression (forward selection) was used to identify the variables that together best explain the variation in the species data.

Intra-set correlations are presented for the variables chosen in the forward selection procedure, and to assist in explaining the relative importance of each environmental variable to the composition of the given organism community (ter Braak 1986). More specifically, these correlations are related to the rate of change in the assemblages that can be expected by the per unit change of the environmental variable in question. We chose to use these correlations instead of the canonical coefficients, as the covariance of other variables incorporated into the model is considered in the computation of the intra-set correlations, thereby increasing their stability (ter Braak 1986).

Development of inference models

For the chironomid data, inference models were derived to determine the potential impact of inter-annual variability on the reconstructions that would be estimated from those models. Sediment loss-on-ignition (LOI) was measured only once using the same sediment from which the organisms were extracted, therefore development of an inference model of this variable was not appropriate. The CCAs conducted on the 2 years’ data revealed bottom water temperature (Tb) as an important variable explaining the variance in the species data in 2000, and surface water temperature (Ts) in 2006. Tb was selected as the variable for which to prepare inference models for the 2 years, and to assess differences in species optima and tolerances between years. This variable was selected over Ts because the ratio between the first and second eigenvalues (λ1/λ2) was greater in CCAs constrained to Tb (2000 = 0.527, 2006 = 0.507) than it was in CCAs constrained to Ts (2000 = 0.158, 2006 = 0.355), indicating a better relation between the chironomids and Tb. Generally, better inference models can be developed for environmental variables that have higher λ1/λ2 (ter Braak and Prentice 1988).

Detrended canonical correspondence analysis (DCCA) (ter Braak 1986) was used to evaluate the percent variance explained in the chironomid data, with Tb as the sole constraining variable in each year. DCCA with detrending by linear segments revealed gradient lengths of 1.2 standard deviation units (SD) when using the 2000 data and 1.1 SD using the 2006 data. Transfer functions were developed for both years using partial least squares regression and calibration (PLS), weighted averaging regression and calibration (WA), and weighted averaging partial least squares regression and calibration (WA-PLS). The cross-validation performance statistics indicated that the PLS model outperformed the WA and WA-PLS models in both years (not shown). Therefore, the PLS results are presented here.
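The performance statistics used to compare the models can be sketched as follows. This is an illustration, not the C2 implementation. It assumes that r² is the squared correlation between observed and inferred values, that RMSE (or RMSEP, when the inferred values come from leave-one-out cross-validation) is the root of the mean squared residual, and that maximum bias is the largest absolute mean residual across ten equal-width intervals of the observed gradient. Function names are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


def rmse(observed, predicted):
    """Root mean square error; RMSEP if `predicted` is cross-validated."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def max_bias(observed, predicted, n_bins=10):
    """Largest absolute mean residual over equal-width bins of the gradient."""
    lo, hi = min(observed), max(observed)
    width = (hi - lo) / n_bins
    residuals = [p - o for o, p in zip(observed, predicted)]
    if width == 0:
        # Degenerate gradient: a single bin holding every sample.
        return abs(sum(residuals) / len(residuals))
    bins = [[] for _ in range(n_bins)]
    for o, r in zip(observed, residuals):
        idx = min(int((o - lo) / width), n_bins - 1)
        bins[idx].append(r)
    # Empty bins are skipped.
    return max(abs(sum(b) / len(b)) for b in bins if b)


# Made-up bottom water temperatures (°C), for illustration only.
obs = [4.0, 6.0, 8.0, 10.0, 12.0]
pred = [5.0, 6.0, 8.0, 10.0, 11.0]
example_rmse = rmse(obs, pred)
example_max_bias = max_bias(obs, pred)
```

Passing jack-knifed predictions instead of apparent ones to the same functions yields the cross-validated versions of each statistic.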

The PLS models for each year were assessed using leave-one-out cross-validation (jack-knifing), and compared based on model performance statistics. Weighted averaging regression and calibration (WA) was used to estimate the optima and tolerances of the 38 chironomid species to Tb in both years. The predicted (jack-knifed) optimum and tolerance values were used instead of the estimated (apparent) values as the estimated values tend to be over-optimistic (Birks 1995). Transfer functions were developed using the computer program C2, version 1.4.3 (Juggins 2003), and the chironomid data were square-root transformed in both the PLS and WA analyses.
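For readers unfamiliar with weighted averaging, the apparent optimum of a taxon is the abundance-weighted mean of the environmental variable across lakes, and its tolerance is the abundance-weighted standard deviation around that optimum. The sketch below shows only this basic calculation, with hypothetical function names and made-up data. It does not reproduce the jack-knifed estimates or any corrections that C2 may apply.

```python
import math


def wa_optimum(abundances, env):
    """Abundance-weighted mean of `env` over the lakes (apparent optimum)."""
    total = sum(abundances)
    if total == 0:
        raise ValueError("taxon is absent from all lakes")
    return sum(y * x for y, x in zip(abundances, env)) / total


def wa_tolerance(abundances, env):
    """Abundance-weighted standard deviation of `env` around the optimum."""
    u = wa_optimum(abundances, env)
    total = sum(abundances)
    return math.sqrt(
        sum(y * (x - u) ** 2 for y, x in zip(abundances, env)) / total
    )


# Made-up example: one taxon in five lakes along a Tb gradient (°C).
taxon_abundance = [0.0, 10.0, 30.0, 10.0, 0.0]
tb = [4.0, 8.0, 12.0, 16.0, 20.0]
optimum = wa_optimum(taxon_abundance, tb)
tolerance = wa_tolerance(taxon_abundance, tb)
```

Because the weights are abundances, a taxon found in only a few lakes has an optimum determined by very few observations, which is why such estimates are less reliable.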

To assess if proxy-inferred values differed depending on which year’s data was used, we ran the PLS model for 2000 and 2006 and applied it to the chironomid data. A paired t-test was used to evaluate whether the difference between the predicted (jack-knifed) reconstructed bottom water temperature values between the years was statistically significant.
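A paired t-test of this kind works on the per-lake differences between the two sets of inferred temperatures. The following is a minimal sketch with a hypothetical function name and made-up values; it returns only the t statistic and degrees of freedom, and the P-value would then be read from a t distribution (for example with `scipy.stats.ttest_rel`, which performs the whole test).

```python
import math


def paired_t(a, b):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1


# Made-up inferred bottom water temperatures (°C) for five lakes.
inferred_2000 = [10.0, 12.0, 9.0, 14.0, 11.0]
inferred_2006 = [11.0, 12.5, 10.5, 14.5, 12.5]
t_stat, df = paired_t(inferred_2000, inferred_2006)
```

With 41 lakes, the same calculation gives 40 degrees of freedom.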

Results

Organism-environment relations

Chironomids

In the 2000 and 2006 data, nine variables were identified as statistically significant (P < 0.05, 499 Monte Carlo permutations) in the constrained CCAs after the removal of those with the highest variance inflation factors (Table 1). LOI explained the largest proportion of the variance in the species data in both years (14.8%), followed by Tb (10.9% and 10.6%), and Depth (12.1% and 10.0%). Statistically significant variables in the forward selection process using the 2000 data were LOI (P = 0.002), Tb (P = 0.002), Alk (P = 0.014), and TP (P = 0.010), and using the 2006 data were LOI (P = 0.002), Ts (P = 0.012), and Alk (P = 0.024) (Fig. 2). These variables account for 18.0% of the variance in the species data on the first two CCA axes (eigenvalues of 0.17 and 0.11, respectively) using the 2000 data, and 15.8% of the variation on CCA axes 1 and 2 (eigenvalues of 0.16 and 0.09, respectively) using the 2006 data.

Table 1 P-values (P < 0.05, 499 Monte Carlo permutations) and percent variance explained by each environmental variable in a constrained CCA where each variable is used as the sole constraining variable

Fig. 2 Canonical correspondence analysis biplots of lakes and species with the environmental variables that explain the majority of the variation in the chironomid data using environmental data from 2000 and 2006. For corresponding taxon list refer to Appendix 1

Correlations between the environmental variables and the ordination axes (as revealed by intra-set correlations) indicate that LOI and bottom temperature contribute to CCA axis 1 in 2000, TP contributes to axis 2, and Alk to axes 2 and 3 (Table 2). Similar intra-set correlations exist between variables and axes in 2006; however, Ts replaces Tb as the second variable contributing to axis 1, and also contributes to axes 2 and 3. Scatterplots of Tb against Ts indicate that this dataset contains two populations of lakes, deep and shallow (Fig. 3).

Table 2 Intra-set correlations for the first three axes of the forward selected variables in the chironomid analyses for 2000 and 2006

Fig. 3 Scatterplots of the relation between surface water temperature (Ts) and bottom water temperature (Tb) in 2000 and 2006

Lakes, species, and variables are situated in the same relative location on the biplots using data from both years (Fig. 2). Three Guardsmen and Little Louise Lakes have lower surface and bottom water temperatures, whereas Sulphur, WHA and Little Hungry Lakes have higher surface and bottom water temperatures due to their shallow lake depth (i.e. 3 m or less). Stella, L1, L2 and Blanchard Lakes all have sediment with high percentages of organic content (LOI), while Fox Point and Small Lakes have sediment with low percentages of organic content, and alkaline lake water.

Psectrocladius (Allo/Mesopsectrocladius), Tanytarsus sp. C, and Tanytarsus pallidicornis type occur in lakes with greater sediment organic content (LOI), whereas percentages of Nanocladius and Parakiefferiella cf. sp. B are negatively correlated with LOI and positively correlated with Alk. Glyptotendipes, Tanytarsus lugens group, and to a lesser extent Cladopelma and Microtendipes inhabit lakes with warmer surface and bottom water, while Heterotrissocladius and Micropsectra atrofasciata are found in lakes with cooler surface and bottom water.

Ostracodes

Leverage diagnostics revealed that the Mg/Ca ratio in Emerald Lake had an extreme influence on the analysis using both the 2000 and the 2006 data (23.1× and 16.4×, respectively). In the 2006 analysis, St. Elias Lake had an influence on the environment ordination (3.4×). These two samples were made supplementary in the CCAs so as not to influence the definition of the ordination axes (ter Braak and Šmilauer 2002), but they were included on the biplot to observe where they ordinated.

The constrained CCAs indicate six variables are statistically significant in the 2000 data and seven variables are significant in the 2006 data, after the removal of variables with high VIFs (Table 1). Mg/Ca individually explained a large amount of the variance (33.1% in 2000 and 30.4% in 2006) in the species data in both years, as did several other variables (see Table 1). Using the 2000 data, forward selection chose Mg/Ca (P = 0.002), Tb (P = 0.002), Depth (P = 0.010), and K (P = 0.016) as the statistically significant variables that explain 18.7% of the variance in the ostracode data on the first two canonical axes (Fig. 4). This procedure selected only two variables using the 2006 data, Mg/Ca (P = 0.002) and Depth (P = 0.002), yet a similar amount of variance in the ostracode data was explained (17.3%). Eigenvalues for CCA axes 1 and 2 are 0.39 and 0.21 in 2000, and slightly lower in 2006 (0.36 and 0.19).

Fig. 4 Canonical correspondence analysis joint plots of lakes and species with the environmental variables that explain the majority of the variation in the ostracode data using environmental data from 2000 and 2006

Intra-set correlations revealed that the Mg/Ca ratio is correlated with CCA axis 1, and Depth is correlated with CCA axis 2 in both the 2000 and 2006 analyses (Table 3). Tb and K are also highly correlated with axis 2 in 2000.

Table 3 Intra-set correlations for the first three axes of the forward selected variables in the ostracode analyses for 2000 and 2006

The plot of lakes and environmental variables reveals a similar relation among the variables and cases between the 2 years, as does the species plot (Fig. 4). Teapot, Grayling, Long, and Fox Point are deep lakes with cooler bottom water and ordinate together in the 2000 analysis. Several lakes have very low Mg/Ca ratios (e.g. Otter Falls, Decourcy, Ash, K819, Atthilu), whereas only a few have high Mg/Ca ratios in both years (Emerald, Keyhole, Jenny, Sulphur).

Ilyocypris bradyi, Limnocythere itasca, Candona acutula, and Candona compressa inhabit lakewater with high Mg/Ca ratios, whereas Cypria globosa, Cypria serena, and Cyclocypris sp. are found in water with low Mg/Ca ratios. Although their scores differ on CCA axis 1 in the 2 years, values of Candona ikpikpukensis, Limnocythere sp., Ilyocypris gibba, Candona rawsoni, and Cytherissa lacustris are correlated with CCA axis 2 (defined by variables associated with Depth) in both years.

Diatoms

Seven variables were identified as statistically significant in the constrained CCAs using both the 2000 and the 2006 environmental data, after variables with high VIFs were removed (Table 1). In 2000 and 2006, when selected as the sole constraining variable, Depth explained 25.5% and 22.4% of the variance, and Alk explained 24.4% and 22.3%, respectively. Other variables could also explain much of the variance when selected individually (see Table 1).

Depth (P = 0.002; P = 0.004), Alk (P = 0.002; P = 0.004), and Area (P = 0.024; P = 0.018) were the variables chosen using forward selection in both 2000 and 2006 (Fig. 5). Together, these variables explained 11.5% of the variance when using the 2000 environmental data, and 10.5% variance with the 2006 data. Eigenvalues for CCA axis 1 and 2 in 2000 were 0.28 and 0.22, and slightly lower in 2006 (0.25 and 0.19). Intra-set correlations from analyses conducted on both years of data revealed Depth and Alk were correlated with CCA axis 1, and Area was correlated with axes 2 and 3 (Table 4).

Fig. 5 Canonical correspondence analysis joint plots of lakes and species with the environmental variables that explain the majority of the variation in the diatom data using environmental data from 2000 and 2006. For corresponding taxon list refer to Appendix 4

Table 4 Intra-set correlations for the first three axes of the forward selected variables in the diatom analyses for 2000 and 2006

As with the analyses based on the other two groups of organisms, lakes, species, and environmental variables were positioned in the same relative location on the ordination plots in both years (Fig. 5). Three Guardsmen, Pine and Wolverine Lakes have a relatively large surface area, whereas Little Hungry, Decourcy and Patrick Lakes are small. Long, Grayling and PC Lakes are relatively deep, whereas Ash and KL1 Lakes are shallow. Emerald, Keyhole, and Small Lakes have higher values of alkalinity than other lakes on the joint plot.

Abundances of Nitzschia frustulum v. bulnheimiana, Denticula valida, and Cymbopleura cf. subaequalis were correlated with Alk, percentages of Fragilaria cyclopum and Cyclotella aff. comensis were positively correlated with Depth, whereas values of Encyonopsis silesiacum (Bleisch) Mann and Nitzschia fossilis were negatively correlated. Abundances of Navicula aff. veneta were correlated with both Depth and Alk, while percentages of Achnanthes ricula aff., Fragilaria brevistriata and Fragilaria capucina were correlated with Area.

Inference models

The PLS one-component model that was developed using the 2000 Tb data had a higher jack-knifed r², a lower jack-knifed maximum bias, and a lower RMSEP than the model developed using the 2006 Tb data (Fig. 6). A one-component model was selected because the prediction error of the second component was greater in the models produced using both years’ data (Birks 1998). The estimated (apparent) and predicted (jack-knifed) r² were higher with the 2000 model (estimated = 0.61, predicted = 0.38) than they were with the 2006 model (0.60, 0.35). The root mean square error (RMSE) and the root mean square error of prediction (RMSEP) were both lower with the 2000 model (RMSE = 2.51°C, RMSEP = 3.18°C) than with the 2006 model (2.87°C, 3.67°C). In addition, the estimated and predicted maximum bias was lower with the 2000 model (estimated = 3.90°C, predicted = 5.04°C) than with the 2006 model (4.50°C, 6.94°C). A paired t-test revealed that the predicted proxy-inferred values generated by the transfer functions were statistically different (t = −4.16, df = 40, P = 0.0002).

Fig. 6 Estimated (apparent) and predicted (jack-knifed) chironomid-inferred bottom water temperatures using a one-component partial least squares (PLS) model for 2000 and 2006

The optimum and tolerance of each of the 38 chironomid taxa to Tb were generated for both years using WA (Table 5). In general, values of both optima and tolerances were comparable whether computed using the 2000 or 2006 data. Approximately 50% of the optima were within 0.5°C of each other when computed with the 2 years’ data and nearly 80% were within 1°C. Many of the taxa for which the difference in the optima computed using the 2000 and 2006 data was greater than 1°C were those found in a small number of lakes, where it would be expected that the estimated value would be less reliable. Although optima computed using data from the year 2000 were about as often smaller as larger than those computed using data from the year 2006, the tolerances from the year 2000 were in most cases smaller than those of 2006. However, the tolerances generated from the 2000 data overlap those produced using the 2006 data, indicating that the statistics are similar between the years (Table 5).

Table 5 Number of occurrences and predicted optimum, tolerance, and range of 38 chironomid taxa to bottom water temperature for 2000 and 2006

Discussion

The fundamental issue is whether a one-point sample can adequately characterize the lake environment, as this is the basis of quantitative paleolimnology using various transfer function methods. Within-lake spatial differences and seasonal changes all contribute to the total variability of environmental samples, and there have been studies that address the importance of this variability. Our interest was in the importance of interannual variability in exploratory multivariate analyses of organism-environment relations. We approached this research by inquiring whether the year of sampling of the environmental variables affects the parameterization of organism-environment relations.

It appears that, in spite of the differences in the environmental conditions between the 2 years, the year of sampling does not overly affect the organism-environment relations derived from these data. Environmental variables were positioned in the same relative location on the ordination plots, and the primary forward selected variables were the same in both years for each of the three groups of organisms. The eigenvalues and percent variance explained by CCA axes 1 and 2 are consistently lower in 2006 than in 2000 for all groups, suggesting that more of the variation in the species data is explained by the environmental variables in 2000 than in 2006 (ter Braak 1995); however, the differences are not great.

Physical features of the lake (temperature, surface area and/or depth) are important variables in explaining the distribution of chironomids, ostracodes, and diatoms, as the interaction among these variables determines the relative amounts of shallow and deep water environments available to the organisms (Håkanson 2004), as well as the temperature of the water in these environments. The chironomid-environment analysis included Tb in 2000 and Ts in 2006 in the forward selection process (Fig. 2). A positive relation appears to exist between these variables in shallow lakes; however, this relation is not as clear in deep lakes (Fig. 3), although there are fewer cases of the latter. In the ostracode-environment analysis based on the 2000 data, both Depth and Tb were selected by the forward selection process, and were correlated with the same axis (Table 3). Warmer temperatures and higher insolation at the surface of the lake during the summer months cause shallow lakes to heat through to the bottom, and deep lakes to thermally stratify. This affects both the chironomid and ostracode community composition in this region.

Water depth can also affect parameters such as sediment size distribution, which is related to organic matter and may influence community composition (Brinkhurst 1974). Chironomids and ostracodes are benthic organisms that can be found in both deep and shallow lakes, however few species dwell in both environments. In comparison, diatoms can be planktonic, benthic, or periphytic, and the majority of the lakes in this set with a large surface area are relatively shallow and provide a variety of habitats to support different species.

Chironomids and diatoms are ubiquitous organisms that inhabit both high and low pH environments. In the southwest Yukon, lakes are neutral to alkaline (pH values 7.5–9.4 in 2000; 7.95–9.3 in 2006) and different taxa from these two groups are found in neutral and alkaline lakes (Wilson and Gajewski 2002, 2004). However, most ostracode species require a pH above 7.0 to exist (Delorme 1991), and even higher values to be well preserved in the sediments after death. Therefore, the alkalinity gradient is more important as an explanatory variable for the chironomid and diatom taxa; since ostracodes require an alkaline environment simply to exist, they are found only along a small portion of the gradient. With all three groups of organisms, the dominant variables that would be expected to influence the distribution are identified in the ordinations irrespective of the year of environmental data used in the analysis.

The chironomid-derived bottom water temperature models developed for the 2 years are comparable; however, the 2000 model outperforms the 2006 model (Fig. 6). The cross-validation maximum bias is high in both years given the range of the data, but this is likely due to the dominance of shallow lakes with warm bottom temperatures in this dataset (33 lakes <10 m; 8 lakes >10 m). The WA models produced generally lower taxon tolerances in 2000 than in 2006, suggesting less between-lake variability in environmental measurements during the year 2000. The estimated optima, however, are quite similar in both years, especially where the taxon occurs in a larger number of lakes. It is perhaps not surprising that these estimated parameters are similar, as the organisms would only be found in a lake if they could tolerate the inter-annual variability present in the region.

For our purposes, this study indicates that the limited environmental measurements typically made in the context of a paleolimnological study sufficiently characterize the environment for the statistical methodology used. However, this study would need to be repeated in different regions to determine if this is generally the case.

Conclusions

Interannual variability in the lake environment is of concern when deriving inference models relating organisms to environmental quantities. A sediment sample may integrate several years of input, so the year of sampling may cause statistical calibration to differ, depending on when the water sample was collected and other environmental parameters were measured. However, we found that the year of sampling is of secondary importance when relating the organism assemblages to environmental variables, but only with the major explanatory variables.

These results are limited to two different years and three organism datasets in one region. Nevertheless, they provide guarded optimism that the methodology of estimating transfer functions as currently applied is not entirely determined by the particular year when the data were collected. Further work is needed to ensure that these results are general and not simply due to the particular situation studied here.