Introduction

The biology of sperm has recently received increasing attention (reviewed in Birkhead et al. 2009), including in passerine birds (e.g. Helfenstein et al. 2010a; Immler et al. 2010; Schmoll and Kleven 2011; Lüpold et al. 2012; Albrecht et al. 2013; Hermosell et al. 2013; Laskemoen et al. 2013a). Passerine birds are highly suitable for studying sperm form and function in natural populations because free-living males are easy to capture and sperm samples are easy to collect from them. Massaging the cloacal protuberance (the storage organ for spermatozoa) of a male passerine produces an ejaculate droplet on top of the conical papilla, where it can be easily collected (Wolfson 1952). Such sperm samples are then commonly preserved in formalin (formaldehyde dissolved in water) for later morphometric analyses (e.g. Helfenstein et al. 2010b; Schmoll and Kleven 2011; Lüpold et al. 2012; Albrecht et al. 2013).

Important research topics such as individual-based longitudinal studies of sperm traits naturally require comparisons of sperm samples across multiple sampling events, which may often be separated by years. Furthermore, sperm samples may sometimes be available only from particular years for purely logistical reasons. Finally, it has been recognized that systematically archiving sperm samples in museum collections represents a valuable resource for synthetic analyses, which will also often include samples from different years (e.g. Immler et al. 2012; Cramer et al. 2013; Hermosell et al. 2013). In studies with such temporally structured sampling regimes, formalin-stored sperm samples typically differ in storage duration prior to sperm morphometric analysis on the order of years unless spermatozoa are routinely pictured within a standardized time interval after sampling. Such studies thus presuppose that prolonged storage in formalin does not affect sperm morphology, an assumption that is often implicitly made in the analysis of avian sperm morphology but has never been tested. Preservation in formalin, however, may lead to tissue shrinkage (e.g. Briskie and Birkhead 1993; Jonmarker et al. 2006) or to other morphological modifications, and for many study designs, any effect of sperm storage duration could profoundly confound the focal analysis. For comparative studies with relatively large effect sizes this may be less of a problem, but not necessarily so for studies conducted in a within-species context where biologically relevant effect sizes may be smaller. For example, Schmoll and Kleven (2011) discuss the potential of differential sperm storage duration to confound geographical variation in sperm morphology in Coal tits Periparus ater (see also Lifjeld et al. 2012), although the same reasoning applies just as well to variation between years (e.g. Cramer et al. 2013) or to variation between individuals (e.g. Lifjeld et al. 2013).

To avoid clumping of spermatozoa, sperm samples are typically diluted in either phosphate buffered saline (PBS) or Dulbecco’s Modified Eagle Medium (DMEM), a standard culture medium for videotaping live sperm, before preservation and storage in formalin. Like samples stored for different time intervals (see above), samples diluted in different media may end up in a combined sperm morphometric analysis.

To the best of our knowledge, no study has tested for effects of storage duration on sperm morphology or for effects of using PBS versus DMEM for diluting sperm samples. The aim of this methodological study, therefore, was to assess the effect of long-term storage of formalin-preserved sperm samples on the length of avian spermatozoa, a fundamental trait of major interest in many studies. In addition, we examined the effect of initially diluting sperm samples in PBS as compared to DMEM before preservation and storage in formalin.

Methods

Sperm collection

We sampled sperm of territorial male Blue tits Cyanistes caeruleus, Coal tits and Great tits Parus major near Lingen/Ems in NW Germany (52°27′N, 7°15′E) non-invasively by cloacal massage (Wolfson 1952; Laskemoen et al. 2013b) during the nestling feeding period when nestlings were approximately 10–14 days old. The storage duration analysis was based on sperm samples obtained in May 2010, which were diluted in approximately 3 µL standard PBS, mixed well and preserved immediately in 250 µL of an approximately 5 % formaldehyde solution (equivalent to an approximately 12.5 % formalin solution assuming a stock solution of 40 % formaldehyde). The solvent medium analysis was based on experimental ejaculates obtained in May 2012, which were split in half by attaching two microcapillaries in parallel to the ejaculate droplet. Half of each ejaculate was treated as described above for the storage duration analysis, while the other half was diluted in 30 µL prewarmed (37 °C) advanced DMEM (Fisher Scientific product VX12491015, Fisher Scientific, Schwerte, Germany) in a 0.5-mL Eppendorf tube for the purpose of videotaping live sperm. The suspension was carefully pipetted up and down approximately five times to achieve proper dilution, and the fraction of the suspension not used for videotaping (~25 µL) remained in the tube at 37 °C. Approximately 5 min later, when videotaping had been completed, this fraction was preserved in 250 µL of an approximately 5 % formaldehyde solution. All sperm samples were stored at ambient (room) temperature in 2-mL tubes with screw-on lids and sealing rings (Sarstedt product 72.694.006, Sarstedt AG & Co, Nümbrecht, Germany).
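The stated equivalence between formaldehyde and formalin concentrations can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes, as in the text, that undiluted formalin stock contains 40 % formaldehyde.

```python
def formalin_percent(formaldehyde_percent, stock_formaldehyde_percent=40.0):
    """Formalin concentration (% of stock solution) matching a formaldehyde concentration."""
    return 100.0 * formaldehyde_percent / stock_formaldehyde_percent


# A 5 % formaldehyde solution expressed as a formalin concentration
print(formalin_percent(5.0))  # 12.5
```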

Sperm morphometry

Approximately 3 µL of sperm solution from each sperm sample was transferred onto a standard microscope slide and air-dried overnight. The slide was then carefully rinsed with distilled water to remove dirt and salt crusts and air-dried again. Slides were subsequently examined by light microscopy at 400× magnification under bright-field conditions using an Olympus BX50 microscope, and all pictures were taken by the same person using a Canon EOS 600 digital camera. A micrometer scale was pictured for each sperm sample immediately before slides were screened for spermatozoa that showed no obviously artefactual morphology. Pictures of ten spermatozoa per sperm sample were selected for further analysis because measuring ten spermatozoa has been shown to provide a sufficiently precise estimate of a sample’s mean sperm total length (Laskemoen et al. 2007).

Spermatozoa from the samples used in the storage duration analysis were initially pictured in February and March 2012. The samples were re-pictured using material from the same physical sperm samples in late April 2013, resulting in subsamples of pictures that originated from the same physical sperm sample but differed in storage duration by 13–14 months. Spermatozoa from the samples used in the solvent medium analysis were all pictured in February and March 2013. To ensure blind measurements with respect to sperm sample identity and thus formalin storage duration or exposure to solvent medium type, all samples were anonymized before analysis by TS, including three replicated samples (one per species) in each of the two analyses, which the observer unknowingly measured twice for repeatability analysis. Sperm head, midpiece, and tail length were subsequently measured to a precision of 0.01 μm during a continuous measuring period by a single observer per analysis using ImageJ 1.46 (Rasband 1997–2012). Sperm total length was calculated as the sum of these components.

In the storage duration analysis, eight Blue tit, seven Coal tit, and nine Great tit sperm samples were initially included. We excluded from analysis two Blue tit samples that provided only three and five analysable spermatozoa in their re-pictured subsamples. Thus, a total of 22 (physical) sperm samples with 44 subsamples with on average 9.8 ± 0.6 (SD) spermatozoa per subsample were measured in the storage duration analysis. In the solvent medium analysis, a total of 29 experimental ejaculates (nine Blue tit, ten Coal tit, and ten Great tit) were included, resulting in a total of 58 (physical) sperm samples with exactly ten spermatozoa measured for each. Across species, repeatability (sensu Lessells and Boag 1987) of measurements for sperm total length amounted to 0.97 ± 0.01 (SE; F = 72.0, df = 29,30, p < 0.001) for the storage duration analysis and to 0.99 ± 0.004 (SE; F = 218.1, df = 29,30, p < 0.001) for the solvent medium analysis.
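For readers who wish to check the repeatability estimates, the sketch below relates the reported F ratios to the intraclass correlation coefficient of Lessells and Boag (1987), r = s_A² / (s² + s_A²), where s² = MS_within and s_A² = (MS_among − MS_within) / n0. Dividing through by MS_within gives r = (F − 1) / (F − 1 + n0). The function name is hypothetical, and n0 = 2 measurements per spermatozoon is an assumption inferred from the reported degrees of freedom (29, 30), not a value stated in the text.

```python
def repeatability_from_f(f_ratio, n0):
    """Intraclass correlation r = (F - 1) / (F - 1 + n0) from a one-way ANOVA F ratio."""
    return (f_ratio - 1.0) / (f_ratio - 1.0 + n0)


# F ratios as reported in the text; n0 = 2 is an assumption (see lead-in)
for label, f_ratio in [("storage duration", 72.0), ("solvent medium", 218.1)]:
    r = repeatability_from_f(f_ratio, n0=2)
    print(f"{label}: r = {r:.2f}")
# storage duration: r = 0.97
# solvent medium: r = 0.99
```

Under this assumption both values agree with the repeatabilities reported above.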

Statistical analysis

We used R 3.1.1 (R Development Core Team 2014) and linear mixed effects models (LME, R function lmer from the package lme4, Bates et al. 2014) to test for effects of sperm storage duration or exposure to solvent medium type (both two-level categorical predictor variables) on sperm total length. For the storage duration analysis, we included subsample identity nested in physical sperm sample identity as random effects to account for the dependency of measurements of spermatozoa from the same subsample of pictures and the same physical sperm sample, respectively. For the solvent medium analysis, we included physical sperm sample identity nested in experimental ejaculate identity as random effects to account for the dependency of measurements of spermatozoa from the same sperm sample and the same ejaculate, respectively. Note that in both cases the higher-order random effects effectively reproduce the paired designs of the analyses. We included species identity as a fixed effect in all models to control for confounding variation resulting from species-specific sperm morphology. Significance of fixed effects was determined by removing the focal term from a maximum likelihood fit of the current LME model. In the context of LME analyses, p values refer to the increase in model deviance when a term is removed from a model, compared against a χ² distribution in a likelihood ratio test. All statistical tests were two-tailed, and we rejected the null hypothesis at p < 0.05.
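The likelihood ratio test described above can be illustrated without fitting a model. The minimal sketch below converts a deviance increase into a p value using closed-form χ² upper-tail probabilities for 1 and 2 degrees of freedom. It is not the authors' R code: the function name and the example deviance value are hypothetical, and in practice the deviance increase would come from comparing two maximum likelihood fits of nested models.

```python
import math


def lrt_p_value(deviance_increase, df):
    """Upper-tail chi-square probability for a likelihood ratio statistic (df = 1 or 2)."""
    if deviance_increase < 0:
        raise ValueError("deviance increase must be non-negative")
    if df == 1:
        # With 1 df the chi-square variable is a squared standard normal deviate.
        return math.erfc(math.sqrt(deviance_increase / 2.0))
    if df == 2:
        # With 2 df the chi-square distribution is exponential with mean 2.
        return math.exp(-deviance_increase / 2.0)
    raise NotImplementedError("only df = 1 and df = 2 are covered in this sketch")


ALPHA = 0.05  # significance threshold stated in the text
p = lrt_p_value(3.84, df=1)  # hypothetical deviance increase near the critical value
print(f"p = {p:.3f}")
```

A general implementation would call a χ² survival function from a statistics library instead of these two special cases.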

Results

Controlling for species identity (LME: χ² = 48.8, df = 2, p < 0.001), there was no significant difference in mean sperm total length between spermatozoa from the same physical sperm sample when pictured twice at intervals of approximately 13–14 months (χ² = 0.0009, df = 1, p = 0.98; Figs. 1a, 2a). An equivalent paired t test performed on the means per subsample gave similar results (t = −0.12, df = 21, p = 0.90).

Fig. 1 Mean (±SE) sperm total length per sample in relation to storage duration and solvent medium. (a) Spermatozoa from the same physical sperm sample when pictured twice at intervals of approximately 13–14 months while stored in 5 % formaldehyde for N = 6 Blue tit (blue), 7 Coal tit (black), and 9 Great tit (green) physical sperm samples. (b) Spermatozoa from the same experimental ejaculate split and diluted in either DMEM or PBS before storage in 5 % formaldehyde for N = 9 Blue tit (blue), 10 Coal tit (black), and 10 Great tit (green) experimental ejaculates (colour figure online)

Fig. 2 Mean pairwise difference in sperm total length with 95 % confidence interval (CI) between (a) spermatozoa from the same physical sperm sample when pictured twice at intervals of approximately 13–14 months while stored in 5 % formaldehyde (N = 6 Blue tit, 7 Coal tit, and 9 Great tit physical sperm samples) and (b) spermatozoa from the same experimental ejaculate split and diluted in either PBS or DMEM before storage in 5 % formaldehyde (N = 9 Blue tit, 10 Coal tit, and 10 Great tit ejaculates). Results are derived from linear mixed effects models which control for the fixed effect of species identity and include as random effects subsample identity nested in physical sperm sample identity for (a) or physical sperm sample identity nested in experimental ejaculate identity for (b)

Controlling for species identity (χ² = 42.1, df = 2, p < 0.001), there was no significant difference in mean sperm total length between spermatozoa from the same experimental ejaculate diluted in either DMEM or PBS before storage in 5 % formaldehyde solution (χ² = 0.61, df = 1, p = 0.44; Figs. 1b, 2b). An equivalent paired t test performed on the means per physical sperm sample gave similar results (t = −0.22, df = 28, p = 0.83).

Discussion

To exclude potentially confounding effects of sperm storage duration during sperm morphometric analyses, spermatozoa would have to be pictured and/or measured in a defined time window after sampling. For logistical reasons, this may often not be possible, particularly for studies involving museum samples. These may have been collected by different scientists and may be selected for inclusion in specific analyses only a posteriori after long and variable times in storage. Note that even if measuring were routinely done in a defined time window after sampling, data obtained this way could not be analysed blindly with respect to sampling date and thus often not blindly with respect to the hypothesis under test, for example, when testing for age effects. Furthermore, to avoid observer effects, sperm morphometric measurements should be taken by the same observer and during a continuous measuring period, as measuring rules may drift over longer time periods even within individual observers. Thus, measuring sperm in a defined time window after sampling and storing the results in a database for later synoptic analyses is also problematic.

Alternatively, samples pictured and measured after being stored for different periods of time could be merged into a combined analysis, provided that strong effects of differential storage duration can be ruled out. Spermatozoa from samples used for our storage duration analysis were first pictured approximately 22 months after they were initially stored, and they were re-pictured approximately 13–14 months later to mimic a typical situation in which samples are obtained, and must also be analysed, across study years. Our analysis of differential storage duration indicates that prolonged storage of avian sperm samples in a 5 % formaldehyde solution is unlikely to affect sperm morphology. With an estimated effect size close to zero and, importantly, a 95 % confidence interval that limits any existing effect to at most about half a micrometer (corresponding to approximately 0.5 % of sperm total length), even within-species approaches with small expected effect sizes seem unlikely to be severely confounded by pooling samples with different storage durations. We thus conclude that sperm samples from museum collections or individual long-term studies, which often have experienced different storage durations, can be merged and used in combined sperm morphometric analyses with little risk of confounding differential storage duration with e.g. natural variation between populations (Schmoll and Kleven 2011), between years (e.g. Cramer et al. 2013), between individuals (Lifjeld et al. 2013), or with age effects in individual-based longitudinal studies (Cramer et al. 2013). We would like to note that our study was not designed to test whether the formalin fixation process itself affects sperm morphology. An interesting additional study would, therefore, be to split freshly obtained ejaculates and compare the morphology of spermatozoa between fresh and formalin-fixed subsamples from the same experimental ejaculate.

Before preservation and storage in formalin, avian sperm samples are normally diluted in either PBS or DMEM, the latter a standard culture medium for videotaping live sperm. Our study also revealed no evidence that the solvent medium (PBS vs. DMEM) affected sperm length. Similar to the results obtained for differential storage durations, the 95 % confidence interval around our estimate is in the range of half a percent of sperm total length. Thus, spermatozoa from samples that were or were not used for the analysis of sperm velocity before preservation in formalin may also be pooled and used in combined sperm morphometric analyses.

In our solvent medium analysis, split ejaculates were diluted in either 3 µL PBS or 30 µL DMEM before transfer into 250 µL formalin. Even though at least 5 µL of DMEM suspension was normally used up for videotaping live sperm, the different volumes led to slightly different final formaldehyde concentrations in the storage tubes for PBS and DMEM samples. This could have confounded our solvent medium analysis and could potentially confound future studies, too. To avoid this and allow pooling of samples across solvent media, sperm samples should ideally be diluted in volumes that result in similar final formaldehyde concentrations across solvent media.
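The size of this concentration difference can be estimated with a simple dilution calculation. The sketch below is an approximation under stated assumptions: the function name is hypothetical, the volume of the ejaculate itself is ignored, and the DMEM sample is assumed to contribute about 25 µL of suspension (the fraction left after videotaping, as given in the Methods).

```python
def final_concentration(fixative_percent, fixative_ul, sample_ul):
    """Formaldehyde concentration (%) after adding a sample volume to the fixative."""
    return fixative_percent * fixative_ul / (fixative_ul + sample_ul)


pbs = final_concentration(5.0, 250.0, 3.0)    # PBS sample: about 3 µL added
dmem = final_concentration(5.0, 250.0, 25.0)  # DMEM sample: about 25 µL added (assumed)
print(f"PBS: {pbs:.2f} %, DMEM: {dmem:.2f} %")
# PBS: 4.94 %, DMEM: 4.55 %
```

Under these assumptions the two treatments differ by roughly 0.4 percentage points of formaldehyde; matching the dilution volumes would remove this difference.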