Introduction

In 1929, Macomber and Sanders published one of the first studies on sperm counts in human ejaculates [1]. Since then, the evaluation of semen parameters has become standardized and reference values have been proposed, as seen in the World Health Organization (WHO) laboratory manual for the examination and processing of human semen [2]. In 1992, a review of publications on semen quality over the previous 50 years concluded that semen parameters had declined significantly [3]. That review and subsequent similar ones garnered both attention and criticism challenging the validity of the conclusion [4,5,6]. A comparison of the reference semen parameters in successive WHO guidelines shows that the reference ranges have decreased, albeit in different populations and, more importantly, using different criteria for sperm assessment. These changes are notable at 10- to 12-year intervals. A more recent meta-analysis on temporal trends in sperm count tried to avoid the methodological flaws noted in previous reviews but nevertheless came to the same conclusion: sperm quality declined significantly over time [7]. This meta-analysis had a considerable sample size (42,953 men) and used meta-regression models, thereby avoiding the limitation of previous studies; however, some limitations are difficult to overcome in a retrospective meta-analysis, such as differences in sperm-counting methods between laboratories, especially over time, and the high degree of nonresponse bias in most studies assessing sperm quality. In response, the popular media has published articles about a fertility apocalypse. Nevertheless, the evidence for declining sperm count is debatable, and some studies concluded that sperm counts remained stable or even improved [8,9,10].
With this in mind, we decided to retrospectively analyze semen parameters from patients attending a single fertility center over a 10-year period (2009–2018) to see whether sperm quality in the infertile population showed trends over time. By studying a single center, we hoped to mitigate some of the bias introduced by different populations and different methods of semen analysis.

Material and methods

Initial data included 17,915 semen samples, collected between January 2009 and December 2018, from individuals who attended the reproductive center of the McGill University Health Center (MUHC) in Montreal, Canada. All samples were produced after a 3- to 5-day period of ejaculatory abstinence. When more than one semen sample was available from the same individual, we used only the most recent sample, leaving a total of 12,188 semen samples (which we refer to as ‘the entire dataset’), of which the majority (86.8%) were from individuals between 30 and 50 years old. The vast majority (97.4%) of these semen samples were from individuals who attended the reproductive center for fertility treatments; the rest (2.6%) were from individuals who attended the clinic to preserve sperm because of planned chemotherapy or near-future use of gonadotoxic drugs. Cases with missing data were excluded from the final analysis. We considered comparing samples within the same individual, but in most cases specimens were spaced just a few months apart, which did not permit evaluation of changes within the same individual over time.
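The selection rule described above (one sample per individual, keeping the most recent) can be sketched as follows. This is a minimal illustration in Python rather than the authors' own code; the field names `patient_id` and `collected` and the example records are hypothetical and are not study data.

```python
from datetime import date


def keep_most_recent(samples):
    """Return one record per individual: the one with the latest collection date."""
    latest = {}
    for sample in samples:
        pid = sample["patient_id"]  # hypothetical identifier field
        if pid not in latest or sample["collected"] > latest[pid]["collected"]:
            latest[pid] = sample
    return list(latest.values())


# Hypothetical records for illustration only.
samples = [
    {"patient_id": "A", "collected": date(2010, 3, 1), "concentration": 40.0},
    {"patient_id": "A", "collected": date(2010, 7, 15), "concentration": 35.0},
    {"patient_id": "B", "collected": date(2014, 1, 9), "concentration": 12.0},
]
deduplicated = keep_most_recent(samples)
```

Here individual A contributes only the July 2010 sample, so each individual appears exactly once in the result.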

Data collection

Baseline semen analysis was performed within an hour of obtaining the sample, after liquefaction, with the semen maintained in a warming bath during that time. The concentration, total sperm count, motility, and progressive motility of the sperm were determined either manually or, when the concentration was sufficient, on 200 spermatozoa by a computer-assisted sperm analysis (CASA) system (HTM-IVOS, version 12.3; Hamilton Thorne Biosciences Inc., MA), with intra- and inter-assay coefficients of variation lower than 10%. All CASA results were subsequently verified manually (using a slide of fresh semen) by one of the three andrology technicians. For sperm morphology evaluation, a smear was prepared from 5 to 20 µl of semen and stained with the ‘Siemens Diff Quick stain kit’ (VWR, Siemens Healthcare LTD., CA), and the morphology of at least 200 sperm cells was determined for each sample under a compound microscope at 1000 × magnification. From 2011, the criteria for normal morphology were changed to adhere to the 5th edition of the WHO laboratory manual for the examination and processing of human semen [2]. The semen analysis results were saved in the archives of the clinic and were retrieved for analysis. Ethical approval was obtained through the Institutional Review Board (IRB) and the Institutional Ethics Committee of the MUHC, number 2020–5643.

Data analysis

Analysis was done after dividing the dataset into two groups: at or above the WHO 2010 lower reference limits (ARL) (N = 6325) and below the reference limits (BRL) (N = 5521); 342 specimens were removed from this analysis due to missing data in one or more of the parameters. Analysis was performed using the R programming language. The distributions per year were plotted with confidence intervals. P values for differences between groups were calculated using a one-way Kruskal–Wallis rank-sum test. Regarding power analysis, with 10 groups of 800 individuals each (most groups included > 1000 samples), a low type I error of 0.01, and a very high power of 0.99 (type II error of 0.01), an effect size of 0.06 could be detected.
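The two steps of this analysis, grouping by reference limits and the Kruskal–Wallis rank-sum test, can be sketched as below. The study itself was run in R; this Python sketch is illustrative only and rests on stated assumptions: that a sample counts as ARL only when every listed parameter is at or above its WHO 2010 lower reference limit (the text does not specify which parameters defined the groups), that the compared groups are calendar years, and that the p value comes from the usual chi-square approximation. All names and example values are hypothetical.

```python
import math

# WHO 2010 (5th edition) lower reference limits. Which of these the authors
# used to define ARL/BRL is an assumption of this sketch.
WHO_2010_LIMITS = {
    "volume_ml": 1.5,
    "concentration_m_per_ml": 15.0,
    "total_count_m": 39.0,
    "total_motility_pct": 40.0,
    "progressive_motility_pct": 32.0,
    "normal_morphology_pct": 4.0,
}


def classify(sample):
    """Return 'ARL' if every parameter is at or above its limit, else 'BRL'."""
    ok = all(sample[key] >= limit for key, limit in WHO_2010_LIMITS.items())
    return "ARL" if ok else "BRL"


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:  # odd df: start from df = 1
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:       # even df: start from df = 2
        a, q = 1.0, math.exp(-half)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def kruskal_wallis(groups):
    """Return (H, p) with tie correction; p uses the chi-square approximation."""
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    if n < 2 or tie_term == n ** 3 - n:  # all values identical: no evidence
        return 0.0, 1.0
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    h /= 1.0 - tie_term / (n ** 3 - n)
    return h, chi2_sf(h, len(groups) - 1)


# Hypothetical values per year, not study data.
by_year = {2009: [1.0, 2.0, 3.0], 2010: [4.0, 5.0, 6.0], 2011: [7.0, 8.0, 9.0]}
h_stat, p_value = kruskal_wallis(list(by_year.values()))
```

For this toy input the three years are completely separated, giving H = 7.2 on 2 degrees of freedom and p ≈ 0.027. With ten yearly groups the same function would use 9 degrees of freedom.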

Results

As can be seen in Fig. 1, semen volume increased slightly in the ARL group (p = 0.049) before returning to baseline and was stable in the BRL group (p = 0.59). Sperm concentration and total count in both the BRL and ARL groups declined initially and then recovered slightly (p < 0.0001, in all cases) (Figs. 2 and 3, respectively). Although these changes were statistically significant, they are likely stochastic and related to the large study population; clinically, they were quite mild and would not have affected fertility potential. Total motility and progressive motility in both the BRL and ARL groups increased slightly from 2009 until 2015 and then decreased back to baseline (p < 0.0001) (Figs. 4 and 5, respectively). This change offset the decrease in count seen in those years. For the total motile sperm count (TMSC), no clinically significant changes were found over the 10-year period. The TMSC in the BRL group remained unchanged (p = 0.12), while in the ARL group, TMSC decreased in a manner that was statistically significant (p < 0.0001) but not clinically significant (Fig. 6). A spurious change was observed in sperm morphology, which declined after the first 2 years and remained stable thereafter (p < 0.0001, in both groups) (Fig. 7). However, this change was attributed to a contemporaneous change in the method of analyzing strict morphology. In 2011, our lab started implementing the WHO 2010 criteria for sperm morphology, and a new lab technician manually verified the morphology in accordance with those guidelines, which resulted in stricter criteria for morphology being followed. In the following years, no clinically significant changes in sperm morphology were noted. No other changes in the analysis of semen parameters occurred during the study period.
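To illustrate how a rise in motility can offset a fall in concentration, the sketch below computes TMSC with the commonly used definition (semen volume × sperm concentration × motile fraction). The function name and the numbers are hypothetical and are not study data. Whether total or progressive motility entered the TMSC calculation in this study is not stated, so the use of total motility here is an assumption.

```python
def total_motile_sperm_count(volume_ml, concentration_m_per_ml, total_motility_pct):
    """TMSC in millions: volume (mL) x concentration (million/mL) x motile fraction."""
    return volume_ml * concentration_m_per_ml * total_motility_pct / 100.0


# Hypothetical samples: lower concentration paired with higher motility.
earlier = total_motile_sperm_count(3.0, 50.0, 40.0)
later = total_motile_sperm_count(3.0, 40.0, 50.0)
```

Both hypothetical samples give a TMSC of 60 million, although concentration fell from 50 to 40 million/mL, because motility rose from 40% to 50%.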

Fig. 1 Trends in sperm volume across the study period

Fig. 2 Trends in sperm concentration across the study period

Fig. 3 Trends in total sperm count across the study period

Fig. 4 Trends in sperm motility across the study period

Fig. 5 Trends in sperm progressive motility across the study period

Fig. 6 Trends in total motile sperm count across the study period

Fig. 7 Trends in sperm morphology across the study period

Discussion

An internet search for sperm trends returns alarming results. The common belief is that sperm quality is decreasing and that we might be heading towards an era of limited male fertility. Not only is the popular media promoting this notion, but scientific publications are increasingly reporting decreased sperm quality. From one of the first reviews on the subject to the more recent meta-analysis, the conclusion has been that sperm count is on a downward slope. Although these meta-analyses have many methodological limitations and other studies report conflicting results, the prevailing belief is that male fertility is declining. This stands in contrast with the fact that the rate of infertility does not show the expected upward slope [11].

A common caveat of most meta-analyses is the comparison of samples from different geographical locations and from different labs using different protocols for semen analysis. This limitation is amplified when results are assessed over a long span of time, which adds further bias from changing protocols for assessing human sperm. Another possible bias in many studies of unselected or fertile populations is that subjects who are willing to participate may differ from the general population, having some reason to want their sperm count evaluated, such as suspected infertility.

Single-center studies have given conflicting results regarding trends in sperm quality. Jorgensen et al., in a cross-sectional, prospective study on 4867 Danish men over 15 years (1996–2010), found a slight increase in median sperm concentration and total sperm count; however, when compared to historic data from a Danish infertility clinic in the 1940s, a negative trend in sperm concentration was noted [8]. It is possible that changes in the methods of calculating these values contributed to these results. A continuation of the study, on 6386 volunteers in Denmark who presented at different time periods, found no change in sperm counts over a period of 21 years (from 1996 to 2016) [12]. The authors had hypothesized that they would note an improvement in sperm count over this period secondary to a reduction in the incidence of maternal smoking during the corresponding neonatal period of these volunteers. Maternal smoking may be associated with a decrease in sperm counts in the offspring [13]. Therefore, the authors expected a rise in sperm counts among participants reciprocal to the decrease in maternal smoking. They concluded that, in the absence of this expected improvement, the stability in the parameters might actually represent a deterioration in sperm quality.

The aforementioned meta-analysis and studies excluded patients suffering from infertility. However, studies assessing sperm quality among infertile couples also come to a similar conclusion: sperm quality is declining [14, 15]. In a study of 119,972 first semen analyses from selected infertility centers in 2 countries, the proportion of men with a TMSC > 15 million declined between the years 2002 and 2017 and a reciprocal increase was found in men with a TMSC between 5 and 15 million and with a TMSC below 5 million [14]. In our study, no statistically significant change in the TMSC of the BRL group was found (p = 0.13) and as such we would not expect a deterioration at the low levels of TMSC. A retrospective study on seminal parameters from 23,504 men found a decrease in all sperm parameters over a period of seven years (2010–2017) [16]. Lastly, a prospective single-center study including 936 men and 1618 samples found a significant decrease in semen volume and in sperm concentration, count, motility, and morphology over a period of 17 years (2000–2017) [17].

Many different factors have been implicated as causative in the decline in sperm parameters. Lifestyle factors such as smoking [18, 19], alcohol use [20], obesity [21, 22], and even stress have been associated with a decrease in sperm parameters. Endocrine disruptors such as phthalates have been associated with decreased sperm concentration and motility in some studies [23,24,25]. Even the use of mobile phones has been associated with reduced sperm quality [26, 27]. It should be acknowledged that rates of infertility have not increased during this time period.

All these presumptive causative agents would be expected to cause a decline in more than one sperm parameter, yet in our study of 12,188 men, we found no clinically significant changes in any of the sperm parameters over the 10-year period. We analyzed not only the sperm parameters used in the WHO reference ranges but also the TMSC, which has been reported to correlate better with male factor infertility [28, 29]. This analysis likewise showed no clinically significant deterioration over the study period, providing reassurance that any impact on fertility is minimal. Whether a decline in sperm quality never occurred, occurred previously but has now reached a plateau, or is evident only over a longer period or in an unselected population remains a matter for further, preferably prospective, studies.

The limitations of our study are its retrospective nature and the fact that the study group consists of male partners from couples referred due to infertility and of men referred for sperm cryopreservation prior to initiating gonadotoxic treatment. Although we tried to address this possible limitation by separately analyzing the group of men with normal semen parameters, it remains possible, if unlikely, that trends in sperm quality in the general population are not reflected in the infertile population. It would have been interesting to compare the same male over a long period of time; however, our samples did not permit this because repeat semen analysis was performed over a relatively short time period, and such a comparison might also add a bias related to the changes in sperm quality seen with age [30]. Another limitation of the study is that many of the men treated in our clinic came from different geographic locations, which in itself could have an effect on sperm parameters; however, since this was true throughout the study period, it is unlikely to have had a significant effect on our results. The above confounders were not controlled for in this analysis, which may have affected the results.

The strength of this study lies in the large number of subjects included, each included only once, and that the study was performed in a single IVF center, thus minimizing confounders by not altering the study population or the analysis methods. Another strength of this study is the evaluation of a broad range of semen parameters.

Conclusion

In a group of over 12,000 men attending a single fertility center over 10 years, semen analysis parameters remained relatively stable, with small, clinically insignificant stochastic changes. These findings held whether the analysis considered the ARL group, representing fertile parameters, or the BRL group, representing the male factor infertility population. Longer evaluations should be undertaken to confirm this effect. However, male fertility does not seem to be steadily decreasing, despite the mild statistically significant changes in semen quality noted.