Introduction

The estimation of biomass of freshwater macroinvertebrates is a necessary step when studying life histories, community relationships, transfer of energy, and turnover of biomass in food webs (Rigler & Downing, 1984). As an alternative to the direct determination of body mass, indirect methods based on functions describing length–mass relationships are widely used to obtain rapid estimates of individual mass from measurements of macroinvertebrate body dimensions (Burgherr & Meyer, 1997). In addition to their efficiency, indirect methods have the advantage that the measured individuals are not destroyed and are available for further analysis (Meyer, 1989).

Since it is not always possible to establish length–mass relationships using organisms from the ecosystem under study, it is common practice to use published relationships (e.g., Smock, 1980; Benke et al., 1999). It is important that these were determined using organisms from the same ecosystem type and geographic region because length–mass relationships can differ between habitats, leading to serious under- or overestimations of the true body mass when relationships from a different habitat are used (Baumgärtner & Rothhaupt, 2003; Méthot et al., 2012). Many compilations of length–mass relationships published for freshwater macroinvertebrates are now available (e.g., for North America: Smock, 1980; Benke et al., 1999; Johnston & Cunjak, 1999; Méthot et al., 2012; New Zealand: Towers et al., 1994; Europe: Mason, 1977; Poepperl, 1998; Baumgärtner & Rothhaupt, 2003), but most were established for macroinvertebrates from streams or rivers. For European lakes, only two compilations are available, one for shallow lakes in the United Kingdom (Mason, 1977) and one for the prealpine Lake Constance (Baumgärtner & Rothhaupt, 2003).

Aside from the lack of relationships for lakes, there is some dispute in the literature about the best way to estimate length–mass relationships, particularly when the aim is to use them to predict mass from length. Macroinvertebrate length–mass relationships are usually nonlinear and can be described by a power function:

$$M = a \cdot L^{b},$$
(1)

where M = mass, L = length of body dimension and a and b are parameters estimated by fitting the function to data (Wenzel et al., 1990; Johnston & Cunjak, 1999; Baumgärtner & Rothhaupt, 2003). As well as being nonlinear, the variance, or “scatter,” around length–mass relationship is usually greater for large than for small individuals; in statistical terms, the variance of the relationship is proportional to the mean and the error structure is multiplicative (Xiao et al., 2011). It is common practice to logarithmically transform the length and mass measurements so that the power function becomes the linear function (Bottrell et al., 1976):

$$\ln M = \ln a + \ln L \cdot b .$$
(2)

On the transformed logarithmic scale, the error structure becomes additive, the variance in mass is equal for all lengths, and the linear function (2) can be easily fitted to data using simple linear regression (Xiao et al., 2011). The resulting equation predicts log mass from log length, but these can be back-transformed to get predictions on the original unlogged scale (Xiao et al., 2011).

However, this procedure has been criticised on the grounds that (a) log transformation makes it more difficult to identify outliers in the data, (b) the log transformation makes the assumption that errors (variation) are multiplicative rather than additive, and (c) the resulting equation predicts geometric mean mass for a given length and not the arithmetic mean (Packard, 2009; Packard et al., 2010). Packard (2009) and Packard et al. (2010) recommend instead that nonlinear regression should be used on untransformed data. In response, Kerkhoff & Enquist (2009) argue that a) on the original arithmetic scale it is only outliers at the “long” end of the scale that will be easily seen, and b) the assumption of multiplicative errors is a feature, and not a bug, of the log transformation because in nature variation is usually multiplicative (Kerkhoff & Enquist, 2009; Glazier, 2013). Xiao et al. (2011) used simulation to demonstrate that the correct method depends on the error structure of the data and that assuming an incorrect error distribution will lead to biased estimates of the parameters and predictions that are poor over some range of the data, e.g., by consistently over- or underpredicting the mass of small individuals. Xiao et al. (2011) recommend comparing the likelihood of models with additive and multiplicative error structures to determine the best regression method.

If likelihood analysis indicates that log-linear regression should be used, the problem remains that back-transformed mass predictions will be biased. This is because in the log-linear regression models the mean of the log-transformed mass, i.e. the geometric mean, and the geometric mean is always less than the arithmetic mean (Smith, 1993; Hayes & Shonkwiler, 2006). For example, the geometric mean for the values 10, 100, and 1000 is 100, whereas the arithmetic mean of the same numbers is 370 (Hayes & Shonkwiler, 2006). However, rather than avoiding log transformation, correction factors can be estimated to correct this bias (Hayes & Shonkwiler, 2006), and here we use and test the smearing factor (Duan, 1983) a simple and robust nonparametric correction factor, which makes no assumptions about the error distribution (Smith, 1993).

While most published length–mass relationships have been estimated using unpreserved or frozen animals (e.g., Smock, 1980; Benke et al., 1999; Baumgärtner & Rothhaupt, 2003), most studies use these relationships on preserved animals (Leuven et al., 1985; Edwards et al., 2009). Preservation is especially needed for studies investigating biomass or secondary production of the entire macroinvertebrate community, where it is impossible to process the samples immediately after sampling (Edwards et al., 2009). In older studies preservation with hazardous substances such as, formalin or Kahle’s solution was conducted, as it has less influence on the preserved objectives. Meanwhile the most common preservative is ethanol. However, ethanol preservation causes a release of organic components such as enzymes or lipids resulting in mass changes, with more than 50% loss observed in some cases (Howmiller, 1972). The mass loss has to be accounted for and many conversion factors are available (e.g., Howmiller, 1972; Dermott & Paterson, 1974; Wiederholm & Eriksson, 1977; Landahl & Nagell, 1978; Leuven et al., 1985; von Schiller & Solimini, 2005). But using conversion factor to correct for changes in mass of preserved animals also leads to incorrect biomass estimates when body mass is predicted from regressions established on unpreserved animals. This is also a consequence of preservation in ethanol causing changes in macroinvertebrate length due to dehydration of the internal tissues and contraction of muscles (Britt, 1953; Lasenby et al., 1994; Leuven et al., 1985; von Schiller & Solimini, 2005). This has fostered other authors to establish conversion factors for length changes (Britt, 1953; Lasenby et al., 1994; Edwards et al., 2009). Another possibility is to use length–mass relationships based on preserved animals and subsequently apply a factor to convert from preserved to unpreserved mass (Leuven et al., 1985). The application of conversion factors for mass changes instead of conversion factors for length changes has the advantage that they also could be used to correct for mass changes of preserved animals weighed directly, where length is not measured.

The main objective of this study was to provide length–mass relationships for macroinvertebrates from temperate lakes of the central European lowland. In order to fulfil the demanding requirements of sound length–mass relationships, our associated research objectives were fourfold: (1) We aimed to clarify the appropriate statistical approach by comparing log-linear and nonlinear methods of estimating these relationships, and (2) by testing the smearing correction factor for removing the bias in mass estimates that is introduced by log- and back-transformation when using log-linear models. (3) We present conversion factors to correct for mass changes caused by preservation in ethanol. Lastly, (4) we aimed to assess the transferability of these length–mass relationships by comparing within- and between-lake mass predictions using our data, and by comparing our length–mass data with comparable published relationships from other regions.

Material and methods

Sampling and sample processing

Macroinvertebrates were sampled in 2008 in Lake Schulzensee (53°14′46″N, 13°16′26″E) and Lake Rathsburgsee (53°14′46″N, 13°16′26″E), and in 2011 in the littoral zone (0–4 m depth) of Lake Scharmützelsee (52°15′0″N, 14°3′0″E). All three lakes are located in Northeast Germany in the federal state of Brandenburg. Lake Schulzensee and Lake Rathsburgsee have surface areas of around 0.03 km2 and maximum depths of 4–5 m, while Lake Scharmützelsee has a surface area of 12 km2 and a maximum depth of 29.5 m (Grüneberg et al., 2011). The sampled area of all three lakes is characterised by sandy substrate mostly covered with macrophytes, flat or shallowly sloping shores, exposed and unexposed shores as well as low water level fluctuations. Macroinvertebrate sampling was carried out with a modified Ekman-Birge-grab and a hand net (500 µm mesh size) in different habitats, depths of the littoral, and seasons to cover the natural variability in length and mass. Immediately after sampling, macroinvertebrates were preserved in 96% ethanol. In the laboratory, the individuals were identified to the lowest taxonomic level possible and then stored in glass vials with 70% ethanol for at least 50 days until the mass loss due to preservation was stable (Leuven et al., 1985).

After mass stabilization, length and mass measurements were conducted on undamaged individuals having all appendages. For each taxon, an appropriate body dimension was measured to the nearest 0.01 mm (Fig. 1; Table 1). For the head width of insects, we did not measure the broadest section of the head as usually recommended in the literature, because we observed that the position of the broadest section of the head capsule varies between different larval stages of the same species; e.g., it is sometimes found in front of the eyes in younger stages but behind the eyes in older stages. Instead, we choose easy to find fixed points for taxa with similar characteristics (Fig. 1). The measured individuals were then dried for 24 h at 60°C in pre-weighed aluminium dishes, and the dry mass (DM) was weighed to the nearest 0.01 mg (Mettler AT261). For the zebra mussel Dreissena polymorpha Pallas, 1771, we removed the shell using hot water to determine DM without shell (Zwarts, 1991). The removal of shells from Gastropoda and small Sphaeriidae was impossible, therefore we determined the ash-free dry mass (AFDM) by combusting individuals for five hours at 450°C. In general, only large animals were weighed individually, otherwise we weighed several individuals of a similar length together and calculated a mean individual mass to reduce measurement error.

Fig. 1
figure 1

Black lines illustrate the measured body parts for the studied taxa. Some of the taxa presented stand for multiple taxa measured in the same way, these are: Phryganeidae BL for Athripsodes sp. BL, Hydroptilidae BL, Molanna angustata BL, Mystacides longicornis/nigra BL and Oecetis sp. BL; Gammaridae BL and HL for Chelicorophium curvispinum BL and HL; Hippeutis complanatus SW for Gyraulus sp. SW and Valvata cristata SW; Pisidium sp. SL for Valvata piscinalis SL; Caenis sp. HW for Ephemeroptera HW, Trichoptera HW and Odonata HW. (BL body length, HL head length, HW head width, SH shell height, SL shell length, SW shell width)

Table 1 Parameters of the length–mass relationship \(M_{\text{SE }} = e^{{\ln a + b \cdot \ln L}} \cdot {\text{SF}}\) for preserved benthic macroinvertebrates of combined lake, where M = mass [mg] and L = body dimension [mm]

To establish preservation conversion factors, a separate set of macroinvertebrate samples was taken from Lake Scharmützelsee in January 2013. Individuals were identified and processed on the same or the following day. For each conservation factor established for aggregated major taxonomic groups, eight to 22 replicates with 1–8 individuals from one taxon covering different sizes were used. Different numbers of identified taxa were only used for Hirudinea (4 taxa) and Trichoptera (6 taxa). Half of the individuals for each major taxonomic group were weighed directly and the other half stored in 10 ml glass vials filled with 70% ethanol in the dark to exclude potential effects on mass due to light (Leuven et al., 1985). The unpreserved animals were first carefully dried on filter paper and then weighed to the nearest 0.01 mg to determine the fresh mass (FM). Subsequently, unpreserved DM was determined by drying individuals for 24 h at 60°C. Small molluscs were combusted at 450°C for 5 h to measure the unpreserved AFDM. The DM and AFDM of the preserved individuals for each major taxonomic group were measured in the same way as the unpreserved individuals after 50 days (Leuven et al., 1985).

Statistical analysis

Macroinvertebrate length–mass relationships were established at species level if possible. In cases where species level identification was not possible, or there were not enough individuals, species were grouped into the subsequent higher taxonomic level. Based on this data, the following steps were carried out to create bias-corrected length–mass regression for preserved macroinvertebrates.

Data were first tested for gross outliers by plotting log-transformed mass against length estimates for each taxon and mass measurement. For many taxa, there was a problem with the mass estimates of very short individuals, particularly when mass was given as AFDM. There were two related problems. (1) The absolute portion of measurement error was constant and was therefore proportionally larger for small individuals than for large individuals. After log transformation, this measurement error showed up as increased variance at the small end of the length scale. (2) When the true mass of individuals was low enough to be close to the lower limit of the mass balance, some mass estimates become zero, or even negative during the estimation of AFDM. During log transformation, these zero or negative estimates have to be excluded and this introduces a bias at very short lengths because those individuals whose mass was overestimated are retained but those for whom mass was underestimated are lost from the sample. This problem was solved by determining, for each taxon, a lower length threshold below which estimates of mass become unreliable. All individuals below this length were removed from the data set which eliminated the bias caused by exclusion and removed the very variable mass measurements of extremely short individuals.

A log-linear regression (LLR) and a nonlinear regression (NLR) were fitted to the screened DM and AFDM data for each taxon. The LLR was fitted using R’s standard function for fitting linear models, “lm” (R Core Team, 2013). This model assumes an additive, normally distributed error distribution after log transformation, and therefore a multiplicative, lognormal error distribution on the untransformed scale.

$$\ln M = \ln a + b \cdot \ln L + \varepsilon \quad \varepsilon \sim N ( 0 , \sigma ).$$
(3)

The parameters ln a and b are the intercept and slope of the linear regression function, M = mass, L = length of body dimension, ε = a normally distributed error term with mean = 0, and standard deviation = σ. We write ln a here to indicate that once back-transformed, i.e., \(e^{a}\), it is equivalent to parameter a in the nonlinear model. The nonlinear regression model (NLR) was fitted to the untransformed length and mass values using R’s function, “nls”, for nonlinear regression. This model assumes an additive normally distributed error distribution on the untransformed scale.

$$M = a \cdot L^{b} + \varepsilon \quad {{\varepsilon \sim }}N ( 0 , \sigma ).$$
(4)

For each taxon, the likelihood of the data given each of the two fitted models was compared and used to determine the most appropriate regression model. Because the two models were fitted to different versions of the mass data, original, and log transformed, the likelihoods reported by the software could not be compared. However, comparable likelihoods were calculated by using the fitted parameters and variance components from the models with the appropriate probability density functions, as described by Xiao et al. (2011). For each log-linear regression, likelihoods for each observed mass were calculated using the lognormal probability density function (because the LLR assumes lognormal errors on the original scale) parameterized using the predicted mass for each observation as the mean (i.e., a different mean parameter for each observation), and the standard deviation of the residuals as the standard deviation. The product of these likelihoods gave the likelihood of the data given in the model. The same procedure was used for each nonlinear regression but with the normal probability density function instead of the lognormal one (because the NLR assumes normal errors on the original scale).

For each log-linear model (LLR), a smearing factor (\(SF\)) (Duan, 1983; Hayes & Shonkwiler, 2006) was calculated to adjust to the fact that the geometric mean mass is being predicted and not the arithmetic mean. The smearing factor is calculated by taking the mean of the back-transformed residuals from the fitted model, in this case a log e (ln) transformation was used so the formula is as follows:

$$SF = \frac{1}{n} \cdot \mathop \sum \limits_{i = 1}^{n} e^{{\varepsilon_{i} }} ,$$
(5)

where \(\varepsilon_{i}\) are the residuals from the fitted log-linear model.

The corrected mass estimate for an individual of a given length L is then calculated as follows

$${\text{M}}_{SE} = e^{{\ln a + b \cdot \ln {{L}}}} \cdot {\text{SF}} .$$
(6)

To evaluate the fit of the models, the total predicted mass of each taxon was compared to the measured total mass of each taxon. Linear regressions were fitted between percentage errors in individual mass estimates and individual lengths, and the slopes of these relationships were used to test for systematic biases such as a tendency to over- or underpredict the mass of long or short individuals.

To estimate factors to convert between dry mass of preserved and unpreserved individuals, a regression model was fitted predicting log DM from log FM (Fig. 2), with the same slope for preserved and unpreserved individuals but with different intercepts. The difference between these intercepts (Fig. 2) gives the estimated conversion factor. Initial testing indicated that the slope of the relationship did not differ between taxa (ANOVA, F = 1.178, P = 0.278, df = 24), and so a common slope (but different intercepts) was used for all taxa (equivalent to an ANCOVA). Doing so reduced the variance in the estimated relationships. Conversion factors were tested whether they differed significantly between taxa or whether it would be appropriate to use common correction factors for broad taxonomic groups. In addition, conversion factors for calculating AFDM from DM were also established, in order to use them for energy budget or material flow studies.

Fig. 2
figure 2

Example of the relationship between fresh and dry mass for preserved and unpreserved individuals for Chironomidae. The arrow indicates the correction factor for converting preserved into unpreserved mass

To assess the transferability of the fitted length–mass relationships between different lakes, lake-specific log-linear models and smearing factors were estimated for taxa that had minimum sample sizes of 20 in more than one lake. These lake-specific models were then used to predict mass for the same and different lake(s) and the accuracies of these predictions were compared. Finally, length–mass data collected here were compared with length–mass relationships published in Méthot et al. (2012), which used very similar methods to those here and included five taxa identified to a similar taxonomic level.

Example R code for fitting log-linear regression models and estimating the smearing correction factor is provided in Online Resource 1. The length–mass and alcohol preservation data used in this study are provided in Online Resource 2.

Results

For the vast majority of taxon and body dimension combinations, the likelihood of the data was higher given a log-linear versus a nonlinear regression model: 41 of 42 for length–dry mass and 10 of 10 for length–ash-free dry mass relationships. The one exception was for DM of Caenis horaria (Ephemeroptera) Stephens, 1835, where only 7 data points were available. A table of likelihoods, and likelihood ratios between the log-linear and nonlinear models, is provided in Online Resource 3.

Using nonlinear regression, mass estimates for complete samples were unbiased (Fig. 3), however this was only because biased estimates for short and long individuals cancelled each other out. For most taxon body combinations, nonlinear additive error models resulted in biased parameters and a poor fit over some range of the data. For example, for individuals of Caenis robusta (Ephemeroptera) Eaton, 1884 with head widths of less than 0.7 mm, almost all data points were above the estimated nonlinear regression line (Fig. 4). We found statistically significant relationships between individual prediction errors (residuals) from nonlinear models and measured length for 29 of 52 taxon and body dimension combinations (P < 0.05 for 29 out of 52). Residual plots for Anisoptera and C. robusta are given as examples (Fig. 5).

Fig. 3
figure 3

Error in total predicted mass. Each dot represents the estimated total sample mass for one taxon predicted from a log-linear, nonlinear, and smearing-corrected log-linear regression model. Estimates are biased to be too low when an uncorrected log-linear model is used. After smearing correction the estimates are unbiased

Fig. 4
figure 4

The relationship between length and mass for Caenis robusta, estimated by log-linear and nonlinear regression, shown on untransformed axes (a) and log10 transformed axes (b). Nonlinear regression results in a function that underestimates mass for short individuals, but this is only clear when viewed on log-transformed axes. The vertical dashed line indicates a length of 0.7 mm which is referred to in the main text

Fig. 5
figure 5

The relationship between length and percentage error in predicted mass of individuals for two example taxa, Anisoptera and Caenis robusta. Solid lines show the fitted regression models, dashed lines indicate the ideal situation with no relationship between length and % error. For the log-linear model the dotted line is obscured by the regression line. The nonlinear model results in very large proportional errors for small individuals

In contrast, log-linear regression models gave a good fit to mass for both long and short individuals (e.g., Fig. 4), with no significant (P > 0.05) relationships between prediction errors and length measurements (Fig. 5). However, due to the log transformation, relationships gave biased estimates of total mass on the original untransformed scale (Fig. 3). Estimated total mass was on average 7% lower than the observed total mass (Fig. 3) and for some taxa as much as 20% too low. Applying the smearing factor almost completely removed this bias. After correction, estimated total sample mass was on average only 1% lower than real total sample mass (Fig. 3). We provide summary information for these fitted length–mass relationships and smearing factors in Table 1, corresponding plots are provided in Online Resource 4.

Macroinvertebrate taxa lost between 16% (Hirudinea, see Table 2) and 30% (Caenis sp.) DM, or 14 to 37% AFDM during the 50-day preservation period. A likelihood ratio test indicated that there was no statistically significant difference between the ten taxa in the proportion of mass lost during preservation (F = 0.53, df = 9, P = 0.85) and therefore we also calculated overall conversion factors for lake macroinvertebrates based on all ten taxa (DM 22%; AFDM 20%).

Table 2 Conversion factors for mass (ash-free dry mass (AFDM) and dry mass (DM)) allowing for calculation from preserved (pres.) mass to unpreserved (unpres.) mass

There were five taxa that had sample sizes of 20 or more in multiple lakes. With the exception of two large outliers for the Chironomidae, whole sample mass predictions using lake-specific log-linear models and smearing correction factors had a similar range of error when they were applied to the same or to a different lake (Fig. 6). Of the six taxa in common between this study and Méthot et al. (2012), four of their length–mass relationships corresponded well with our data (Fig. 7). Méthot’s equation for Caenidae was a poor fit to our data, however, their relationship had an R 2 of 0.12.

Fig. 6
figure 6

Percentage errors in estimates of total sample mass when length–mass relationships estimated using individuals from a single lake were used to predict the mass of individuals from the same lake, and from a different lake. Comparisons were made for 7 taxa with at least 20 individuals sampled in each of two lakes

Fig. 7
figure 7

Length–mass relationships estimated by Méthot et al. (2012) plotted over individuals sampled for this study. The solid lines show regression equations, including smearing factors, from Table 2 of Méthot et al. (2012). Dashed lines show ± 1 SE for the parameters of the equation. Grey points are length–mass data from this study

Discussion

Our results clearly demonstrate that the establishment of length–mass relationships for lake macroinvertebrates relies on the appropriate processing of samples and sound statistical treatment of the data. We found that log-linear regression was much better than nonlinear regression for fitting power law relationships to macroinvertebrate length–mass data, because the underlying error structure is multiplicative. Although Xiao et al. (2011) found that some (17%) of the 471 allometric relationships they tested were better characterised by additive error, their data included many “morphological and physiological allometries between organismal traits” that were unlike those between body dimensions and mass. We expect multiplicative error to be the general case for length–mass relationships of macroinvertebrates. Xiao et al. (2011) demonstrated that models that assume the wrong error structure produce biased estimates of the parameters a, and b, of the power law function (Eq. 1) and result in curves that are a poor fit over some range of the data. In our case, using nonlinear regression would result in a poor fit to the small individuals in a sample and very large proportional errors in their mass estimates (Figs. 4, 5).

Although log-linear regression provided a good fit to the length–mass relationships along the entire range of body lengths, the mass predictions themselves are slightly biased because it is the geometric mean, rather than the arithmetic mean, that is being predicted. We found mass to be underestimated by an average of 7%, but the smearing factor (Duan, 1983) was very effective at removing this bias. We estimated smearing factors of between 1.00 (BL-DM Ceratopogonidae) and 1.15 (HW-DM Cloeon dipterum, Ephemeroptera (Linnaeus, 1761)), underestimation was more pronounced, and hence smearing factors larger, for relationships with more scatter (lower R 2). Thus, the use of a correction factor will be more important for taxa with more variable body forms. For example, in comparison to Arthropoda, shelled Mollusca are not so variable in their body form due to their stable inflexible shell and thus received low smearing factors (<1.08 in all our cases). The study of Méthot et al. (2012) is the only other study we know to have used the smearing factor for benthic freshwater macroinvertebrate length–mass relationships, and they too found that smearing factors for Mollusca (<1.12) were relatively low compared to those for other taxa in their study.

The smearing factors in Table 1 should be applied when using our estimated length–mass relationships to predict mass from the lengths of newly measured organisms. However, since the value of a smearing factor depends on the distribution of residuals in a specific regression, it is specific to that estimated length–mass relationship. In other words, our smearing factors are only valid for their corresponding relationship published in this study, and they cannot be used to correct predictions from other, previously published, length–mass relationships. Likewise the smearing factors in Méthot el al. (2012) only apply to the relationships in that study.

The length–mass regressions in this study were established with specimens preserved in 70% ethanol for at least 50 days. This duration is recommended by Leuven et al. (1985) if processing of the samples within 2 days after collection is not possible. It allows stabilization of length and mass changes (e.g., Leuven et al., 1985; Lasenby et al., 1994; Edwards et al., 2009) and enables comparable length–mass regressions to be established on specimens preserved for 50 days or longer. If estimates of unpreserved mass are desired then a preservation conversion factor should be applied to convert estimated preserved DM to unpreserved DM, or preserved AFDM to unpreserved AFDM, respectively. We provide conversion factors for 10 macroinvertebrate taxa that correspond to a remaining mass of between 70 and 84% after preservation. These are consistent with the majority of published values. For example, Leuven et al. (1985) observed a remaining DM of 80% for Erpobdella octoculata (Hirudinea) (Linnaeus, 1758), 80% for Glyptotendipes sp. (Chironomidae) and 84% for Asellus aquaticus (Crustacea) (Linnaeus, 1758) after three months preservation with ethanol. Although existing studies report preservation effects for single taxa, we did not find large differences in the size of the preservation effect between our 10 studied taxa. Therefore, we provide an overall preservation conversion factor (DM = 1.288, AFDM = 1.252), estimated using all 10 taxa, for use on similar taxa that have been weighed without shells.

Comparing the two potential sources of error that we have quantified, the bias due to log transformation was relatively small, a 7% underestimate on average, compared with a bias of 20-30% if the effects of preservation were not accounted for. Other errors may be introduced if the length measurement is not performed on precisely the same body part as done here (Fig. 1), or if the regressions are applied to individuals whose lengths lie outside range of those used to fit the models (Table 1). Furthermore, it is recommended to use the lowest taxonomic level possible, because generalization may lead to inaccurate estimates (Benke et al., 1999; Méthot et al., 2012). However, for groups such as Chironomidae, identification is often only feasible to subfamily (e.g., Orthocladiinae and Tanypodinae) or tribe (e.g., Tanytarsini and Chironomini), and therefore regression equations from groups can also be valuable. We therefore provide both species level regressions and some for higher taxonomic groupings.

Our ability to characterise between-lake variation in length–mass relationships was limited, because for most taxa our data came from just two lakes. For those comparisons we could make, prediction errors were only slightly larger between lakes than within a lake. The one exception was for the Chironomidae, but in this case the large difference was likely due to a difference in the species composition, and so the error had more to do with using relationships for higher taxa, than it did with using a relationship from a different location. The similarity of the length–mass relationships between the three lakes allowed us to establish combined lake regressions. Since all three lakes are located in Central Europe, it is likely that they can be used for most lakes with similar characteristics in this region.

A comparison of our length–mass relationships with regressions provided in the literature was only possible for those in Méthot et al. (2012), which were estimated using species from the large lowland Lake Saint-Pierre in the cool temperate zone of Québec, Canada. While the relationship for Caenidae provided by Méthot et al. (2012) deviated substantially from our Caenis sp. data, we cannot be sure that the species involved were the same, and the uncertainty of measuring the broadest section of the head capsule in Méthot et al. (2012) may have contributed to the very low R 2 (0.12) they obtained. In contrast, when the taxon was precisely identified, our data compare well with the equations of Méthot et al. (2012). This suggests that our relationships can be quite confidently transferred to lakes in other geographic regions with similar environmental conditions, such as the Central European lowlands (covering parts of Germany, Poland, Denmark, Netherlands and Belgium) or the temperate lowland zone of North America.

However, care should be taken when transferring length–mass relationships between locations with different physical characteristics. Baumgärtner & Rothhaupt (2003) found intraspecific differences in length–mass relationships between individuals living in stream versus littoral habitats of Lake Constance. They concluded that differences were explained by differences in the type of flow velocity between the sites. We would advocate that our regressions should not be applied to ecosystem with fundamentally different physical characteristics such as climate or flow velocity that influence the growth of macroinvertebrates. Further research to estimate the variability of length–mass relationships for the same taxa between lakes, and between habitats with different characteristics, would be valuable and lead to a better understanding of intraspecific variation in allometric relationships.

In summary, our study provides 52 length–mass relationships for littoral macroinvertebrates sampled from three Central European lakes, together with correction factors for the bias induced by log transformation and the effects of preservation in ethanol. These relationships can be used to obtain rapid estimates of body mass when studying ecosystems functioning of lakes (Rigler & Downing, 1984). Furthermore, we show that log-linear regression, with smearing correction factors, is superior to nonlinear regression for those who need to establish length–mass relationships for new taxa and or regions.