
5.1 Introduction

Infertility is defined as the inability to conceive after one year of unprotected intercourse and affects about 15–20 % of couples worldwide. Of these cases, 30–40 % can be attributed to an identifiable male factor, 30–40 % to female factors, and the remaining 20 % to a combination of both male and female factors (Guttmacher 1956; Thonneau et al. 1991). Although many couples present with an obvious and identifiable cause for the subfertility, in some cases the delay in conception remains unexplained. Traditional semen analysis is the first test used to evaluate the male partner. This chapter discusses the basics of semen analysis and interpretation of the results in the light of the 2010 WHO Laboratory Manual for the Examination and Processing of Human Semen.

Evaluation of the man begins with a thorough history and physical examination and proceeds to laboratory examination. The initial screening evaluation of the male should include, at a minimum, a reproductive history and two semen analyses. If possible, the two semen analyses should be separated by a time period of at least 1 month. It is important to correlate history with the results of semen analysis because a person’s medical history might affect the results of the semen analysis and, hence, fertility potential. Important considerations include past exposures to chemicals, heavy metals, pesticides, and extreme heat (specifically the workplace environment as well as recreational activities, such as frequent use of hot tubs or of a heated waterbed). Recreational drug use, prescription medications, history of sexually transmitted infections and other communicable diseases, genital infections, genital injuries, and fertility history should also be considered in the evaluation (Sigman et al. 2009).

5.2 Semen Analysis

Semen analysis is the most widely used test to predict male fertility potential. It provides information on the functional status of the seminiferous tubules, epididymis, and accessory sex glands, and its results are often taken as a surrogate measure of a man’s ability to father a pregnancy. Although this test reveals useful information for the initial evaluation of the infertile male, it is not a test of fertility (Jequier 2010). It provides no insights into the functional potential of the spermatozoon to fertilize an ovum or to undergo the subsequent maturation processes required to achieve fertilization. It is important to understand that while the results may correlate with “fertility,” the assay is not a direct measure of fertility (Guzick et al. 2001; Smith et al. 1977; Brazil 2010). An understanding of the physiology and pathophysiology associated with ejaculation and semen collection is also critical to the interpretation of the results of semen analysis.

Routine semen analysis includes (a) physical characteristics of semen, including liquefaction, viscosity, pH, color, and odor, (b) specimen volume, (c) sperm concentration, (d) sperm motility and progression, (e) sperm morphology, (f) leukocyte quantification, and (g) fructose detection in cases where no spermatozoa are found and ejaculate volume is low (Esteves et al. 2011). Routine semen analysis is the main pillar in male fertility investigation. In order to establish consistency in laboratory procedures, the WHO first published a manual for the examination of human semen and semen-cervical mucus interaction in 1980. The WHO criteria of 1987 and 1992 (World Health Organization 1987, 1992; Kruger et al. 1988) which classify more sperm in the normal category are also widely used in the routine semen evaluation. True reference ranges have not been established for semen parameters. The WHO manual also identified standards to exclude influences, such as the health of patient over the previous spermatogenic cycle, length of sexual abstinence, time, and temperature. The manual has been regularly updated (1980, 1987, 1992, and 1999). The addition of normal reference values in the WHO manuals has been of significant help in establishing some consistency of what constitutes a normal value. The WHO Laboratory Manual for the Examination and Processing of Human Semen serves as the basis for semen analysis in most of the recognized laboratories throughout the world. Table 5.1 shows the cutoff values of various parameters as per the previous WHO manuals.

Table 5.1 Changes in cutoff reference values in WHO guidelines

5.3 Semen Collection

Semen must be collected after a standardized period of abstinence, usually 3 days (2–4 days), and the period must be indicated in the laboratory report. The time of collection and the time at which the semen liquefied must be reported, as a delay of longer than an hour may adversely affect sperm motility (World Health Organization 2010). Standardization is essential to minimize fluctuations in semen quality, especially sperm count and sperm motility, caused by short or long abstinence.

5.4 Macroscopic Evaluation of Semen

Semen samples can show substantial variation in physicochemical properties including color and odor. Pathologically, seminal discoloration may be due to fresh blood, drugs (pyridium), jaundice, or contamination of semen with urine (e.g., bladder neck dysfunction). The complete semen analysis includes analysis of the semen volume, pH, liquefaction or non-liquefaction, and viscosity.

Volume

The normal volume of ejaculate after 2–7 days of sexual abstinence ranges from 2 to 6 ml. However, there are other possibilities as follows (Vasan 2011; Bornman and Aneck-Hahn 2012):

  1. Aspermia: No ejaculate after orgasm.

  2. Hypospermia: Less than 0.5 ml of semen. Causes of low volume include improper collection, hypogonadism, partial retrograde ejaculation, congenital bilateral absence of the vas deferens (CBAVD), and obstruction of the lower urinary tract.

  3. Hyperspermia: More than 6 ml of semen. This can be attributed to prolonged abstinence or excessive secretion from the accessory sex glands and also occurs in cases of male accessory gland infection (World Health Organization 2010).

pH

The main component of semen is a coagulated alkaline fluid that comes from the seminal vesicles. This fluid along with the sperm from the vas deferens empties through the ejaculatory duct. Prostatic fluid, the second largest component of seminal volume, generally has a relatively acidic pH of 6.5 and combines with the seminal fluid and sperm in the urethra. Prostatic fluid does not traverse the ejaculatory ducts. Normal semen pH is in the range of 7.2–8.2 and it tends to increase with time after ejaculation. Changes in pH of semen are usually due to inflammation of the prostate or seminal vesicles. A low volume sample with measured pH below 7.0 indicates obstruction of the ejaculatory ducts.

Liquefaction

Liquefaction of semen depends on coagulation of proteins found in the seminal fluid as well as the liquefaction by prostate-specific antigen, a proteolytic enzyme, secreted by the prostate. This process may take up to 60 min. If complete liquefaction does not occur after 60 min, it should be noted. Exact liquefaction time is of no diagnostic importance unless > 2 h elapse without any change. The clinical significance of abnormalities in liquefaction is still controversial (Keel 1990). Failure to liquefy is usually a sign of inadequate secretion of the proteolytic enzymes – fibrinolysin, fibrinogenase, and aminopeptidase – by the prostate (Amelar 1962). On the other hand, absence of coagulation may indicate ejaculatory duct obstruction or congenital absence of seminal vesicles.

Semen Viscosity

Viscosity measures the resistance of the seminal fluid to flow. High viscosity may interfere with determination of sperm motility, concentration, and antibody coating of spermatozoa. Normally, semen coagulates upon ejaculation and usually liquefies within 15–20 min. Semen that remains a coagulum is termed non-liquefied, whereas that which pours in thick strands instead of drops is termed hyperviscous. Importantly, liquefaction should be differentiated from viscosity, as abnormalities in viscosity can be the result of abnormal prostate function and/or the use of an unsuitable type of plastic container. Viscosity of semen is noted after liquefaction, although the clinical significance of hyperviscous semen is controversial. There is no correlation between seminal hyperviscosity and semen cultures, leukocytes, or presence of sperm antibodies; however, worse outcomes after in vitro fertilization (IVF) with seminal hyperviscosity have been observed (Munuce et al. 1999; Esfandiari et al. 2008). Sperm processing prior to intrauterine insemination (IUI) can be considered, if there is a clinical concern for hyperviscosity.

5.5 Microscopic Sperm Analysis

Sperm Concentration

Concentration of sperm in unstained preparations of fresh/washed semen sample is determined using, preferably, a phase-contrast microscope with volumetric dilution and hemocytometry. Sperm count is typically reported as concentration (millions of sperm per milliliter) as well as total sperm count (sperm concentration × ml of semen) in the ejaculate. Normozoospermia, oligozoospermia, and azoospermia are diagnosed based upon total sperm count. Azoospermia refers to the absence of sperm in the seminal plasma. Prior to the diagnosis of azoospermia, the sample should be centrifuged and the pellet examined for the presence of sperm. Oligozoospermia (also often called oligospermia) refers to seminal plasma concentration < 15 million/ml. This finding can accompany a variety of defects and has implications for the type of assisted reproductive options that can be utilized, as there are significant reductions in pregnancy rates (Smith et al. 1977).
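The arithmetic described above can be illustrated with a minimal sketch. The function names are hypothetical, the 15 million/ml cutoff is the one quoted in the text, and the classifier works on concentration only; it is an illustration of the calculation, not a diagnostic tool.

```python
# Cutoff quoted in the text for oligozoospermia (million sperm per ml).
OLIGO_CUTOFF_MILLION_PER_ML = 15.0


def total_sperm_count(concentration_million_per_ml, volume_ml):
    """Total sperm count (millions) = concentration x ejaculate volume."""
    if concentration_million_per_ml < 0 or volume_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_million_per_ml * volume_ml


def classify_concentration(concentration_million_per_ml):
    """Label a sample by sperm concentration (hypothetical helper).

    A concentration of zero is labelled azoospermia here; in practice the
    centrifuged pellet must be examined before that diagnosis is made.
    """
    if concentration_million_per_ml < 0:
        raise ValueError("concentration must be non-negative")
    if concentration_million_per_ml == 0:
        return "azoospermia"
    if concentration_million_per_ml < OLIGO_CUTOFF_MILLION_PER_ML:
        return "oligozoospermia"
    return "normozoospermia"


if __name__ == "__main__":
    # Illustrative values only: 20 million/ml in a 3 ml ejaculate.
    print(total_sperm_count(20.0, 3.0), classify_concentration(20.0))
```

Reporting both figures matters because a low-volume ejaculate can have a normal concentration but a low total count, and vice versa.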

Motility

The efficient passage of spermatozoa through cervical mucus is dependent on rapid-progressive motility, that is, spermatozoa must have a forward progression of a minimum of 25 μm/s (Björndahl 2010; Lindholmer 1974). Reduced sperm motility can be a symptom of disorders related to male accessory sex gland secretion and the sequential emptying of these glands. Motility is graded by the speed of forward movement produced by the flagellum; each grade is reported as a percentage (range 0–100 %) of 200 spermatozoa counted in a given volume, and the grades are classified as follows:

  A. Rapid progressive motility: > 25 μm/s at 37 °C and > 20 μm/s at 20 °C (note: 25 μm is approximately equal to 5 head lengths or half a tail length)

  B. Slow or sluggish progressive motility

  C. Nonprogressive motility (<5 μm/s)

  D. Immotile

A normal semen analysis must contain at least 50 % grade A and B progressively motile spermatozoa (Table 5.1). If greater than 50 % of sperms are immotile, then the sperms should be checked for viability. Persistent poor motility is a good predictor of failure in fertilization, an outcome that is actually more important when making decisions regarding a couple’s treatment options (Aitken et al. 1985).
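The grading scheme and the 50 % cutoff for grades A and B can be sketched in code as follows. This is an illustration under stated assumptions: the thresholds are those given for 37 °C, the 5–25 μm/s range for grade B is inferred from the bounds of grades A and C rather than stated in the text, and the function names are hypothetical. In manual analysis grades are assigned visually, not from measured velocities.

```python
from collections import Counter


def motility_grade(velocity_um_per_s):
    """Assign a motility grade from a forward velocity measured at 37 C."""
    if velocity_um_per_s <= 0:
        return "D"  # immotile
    if velocity_um_per_s < 5:
        return "C"  # nonprogressive (<5 um/s)
    if velocity_um_per_s <= 25:
        return "B"  # slow progressive (range assumed, see lead-in)
    return "A"      # rapid progressive (>25 um/s)


def grade_percentages(velocities):
    """Percentage of counted spermatozoa falling in each grade."""
    if not velocities:
        raise ValueError("at least one spermatozoon must be counted")
    counts = Counter(motility_grade(v) for v in velocities)
    total = len(velocities)
    return {grade: 100.0 * counts.get(grade, 0) / total for grade in "ABCD"}


def meets_progressive_cutoff(percentages, cutoff=50.0):
    """True when grades A + B reach the cutoff quoted in the text."""
    return percentages["A"] + percentages["B"] >= cutoff


if __name__ == "__main__":
    # Illustrative count of 200 spermatozoa (made-up velocities).
    sample = [30] * 60 + [10] * 50 + [2] * 40 + [0] * 50
    pct = grade_percentages(sample)
    print(pct, meets_progressive_cutoff(pct))
```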

Vitality

Supravital staining differentiates between live and dead sperm and is assessed when sperm motility is < 50 %. A large proportion of vital, but immotile, sperm may indicate structural defects in the sperm tail (World Health Organization 1999) or Kartagener’s syndrome. A high percentage of immotile, nonviable (dead) sperm may indicate epididymal pathology (World Health Organization 2010). Antisperm antibodies (ASA) may also be present, if the immotile sperms are dead (Björndahl et al. 2010a).

Morphology

The staining of a seminal smear (Papanicolaou, Giemsa, Shorr, and Diff-Quik) allows the quantitative evaluation of normal and abnormal sperm morphological forms in an ejaculate. Smears can be scored for morphology using the World Health Organization (WHO) classification or by Kruger’s strict criteria classification (Menkveld et al. 1990). The WHO method classifies abnormally shaped spermatozoa into categories based on specific head, tail, and mid-piece abnormalities; its definition of normal is based on the appearance of sperm recovered from postcoital cervical mucus or from the surface of the zona pellucida (>30 % normal forms). In contrast, Kruger’s strict criteria classify sperm as normal only if the sperm shape falls within strictly defined parameters, and all borderline forms are considered abnormal (>14 % normal forms). Defects are grouped into the following categories:

  1. Head defects: Large, small, tapered, pyriform, round, amorphous, or vacuolated (>20 % of the head area occupied by unstained vacuolar areas) heads; heads with a small acrosomal area (<40 % of head area); double heads; or any combination of these

  2. Neck and mid-piece defects: Bent neck; asymmetrical insertion of mid-piece into the head; thick, irregular mid-piece; abnormally thin mid-piece; or any combination of these

  3. Tail defects: Short, multiple, hairpin, broken, bent, kinked, or coiled tails, or any combination of these

  4. Cytoplasmic droplets: Greater than one-third of the area of a normal sperm head

Morphology should be used along with other parameters, and not as an isolated parameter, when determining clinical implications. It is important to realize that, in general, pregnancy is possible with low morphology scores and that both motility and morphology have demonstrated prognostic value, as do combinations of parameters (Van Waart et al. 2001; Keegan et al. 2007). The clinical implications of poor morphology scores remain highly controversial. The initial studies using rigid criteria reported that patients undergoing in vitro fertilization (IVF) who had greater than 14 % normal forms had better fertilization rates (Coetzee et al. 1998). Later studies reported that most impairment in fertilization rates occurred with morphology scores of less than 4 % (Menkveld et al. 1990).

Morphology is a particularly challenging parameter to interpret because of the subjective nature of the classification and the presence of multiple classification systems, as well as controversy about the implications of various morphological features. There are studies correlating fertilization rates with morphology scores and other studies which show no relationship between morphology scores and IVF results (Deck and Berger 2000; Schlegel 1997). As there are a number of scoring methodologies, the clinician should explore and adopt a particular methodology and reporting for their laboratory. In spite of the controversy about overall morphology scores, absence of acrosomes or globozoospermia is highly predictive of failure of fertilization (Male Infertility Best Practice Policy Committee of the American Urological Association and Practice Committee of the American Society for Reproductive Medicine 2006).

In view of these findings, it is beneficial for the physician to have a detailed analysis of the morphological defects in addition to the percentage of normal forms. In the case of globozoospermia, treatment with intracytoplasmic sperm injection (ICSI) can be more successful than IUI (Baker et al. 1994). With some morphological defects, such as pinhead or short-tailed sperm, pronuclear fusion fails, so that even ICSI is unsuccessful (Dunson et al. 2004). Overall, there is significant difficulty with defining the relationship between morphology and pregnancy rates, especially with the management of patients with low morphology scores (Abbey et al. 1992). The current evidence suggests that, in general, sperm morphology scores should not be used in isolation to make patient management decisions.

5.6 Computer-Aided Sperm Analysis (CASA)

Advances in technology and the use of fluorescent DNA stains have facilitated development of computer-aided sperm analysis (CASA). Determination of sperm concentration and concentration of progressively motile spermatozoa has been possible due to the availability of advanced tail detection algorithms (Zinaman et al. 1996; Garrett et al. 2003). CASA can be used for routine diagnostic applications when specimen is prepared with proper care and adequate quality control procedures are in place. CASA systems with semiautomated morphology units are available and can be used to measure sperm concentration, motility, kinematics, and morphology with high precision. Studies have also shown the significance of CASA sperm concentration and kinematic parameters in the determination of in vitro and in vivo fertilization rates (Garrett et al. 2003; Liu et al. 1991; Barratt et al. 1993). Progress in digital image analysis has brought about greater objectivity and improved precision to quantitative assessment of sperm morphology (Garrett and Baker 1995).

Manual semen analysis lacks the ability to measure the kinematics of sperm motion. CASA is potentially useful because of its capacity to analyze sperm motion (sperm head and flagellar kinetics), some of which have been shown to be related to IVF outcome (Fréour et al. 2010). Important kinematic parameters are as follows:

  A. Curvilinear velocity (VCL): The measure of the rate of travel of the centroid of the sperm head over a given time period.

  B. Average path velocity (VAP): The velocity along the average path of the spermatozoon.

  C. Straight-line velocity (VSL): The linear or progressive velocity of the cell.

  D. Linearity of forward progression (LIN): The ratio of VSL to VCL, expressed as a percentage.

  E. Amplitude of lateral head displacement (ALH): Calculated from the amplitude of the lateral deviation of the sperm head about the cell’s axis of progression or average path.
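The kinematic parameters listed above can be illustrated with a short sketch that derives VCL, VSL, VAP, and LIN from a tracked sequence of sperm-head centroid positions. Several points are assumptions made for illustration: the track format (a list of (x, y) positions in μm sampled at a fixed frame rate), the function names, and the use of a simple moving average to obtain the average path. Commercial CASA systems use their own smoothing algorithms, and ALH is omitted here.

```python
import math


def path_length(points):
    """Sum of the distances between consecutive (x, y) positions."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def smooth_track(points, window=5):
    """Moving-average path; the window shrinks symmetrically at the ends
    so that the first and last positions are left unchanged."""
    half = window // 2
    last = len(points) - 1
    smoothed = []
    for i in range(len(points)):
        k = min(half, i, last - i)
        chunk = points[i - k:i + k + 1]
        smoothed.append((sum(x for x, _ in chunk) / len(chunk),
                         sum(y for _, y in chunk) / len(chunk)))
    return smoothed


def kinematics(points, frame_rate_hz, window=5):
    """Return VCL, VSL, VAP (um/s) and LIN (%) for one tracked sperm head."""
    if len(points) < 2:
        raise ValueError("a track needs at least two positions")
    duration_s = (len(points) - 1) / frame_rate_hz
    vcl = path_length(points) / duration_s                  # actual path
    vsl = math.dist(points[0], points[-1]) / duration_s     # first to last
    vap = path_length(smooth_track(points, window)) / duration_s
    lin = 100.0 * vsl / vcl if vcl > 0 else 0.0             # VSL / VCL as %
    return {"VCL": vcl, "VSL": vsl, "VAP": vap, "LIN": lin}


if __name__ == "__main__":
    # Made-up zigzag track, one position per second.
    track = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
    print(kinematics(track, frame_rate_hz=1, window=3))
```

For a perfectly straight track the three velocities coincide and LIN is 100 %; the more the head oscillates about its path, the further VCL rises above VAP and VSL.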

Although CASA is very accurate for determining the details of sperm kinetics, manual assessment of semen is much more accurate in distinguishing among debris, crystals, and immotile and dead sperm heads. Therefore, manually assessed sperm concentrations and numbers of immotile spermatozoa are much more reliable than the corresponding data obtained by CASA, provided that the operator is adequately trained and appropriate internal and external quality control measures are in place (Makler 1978; Ginsburg and Armant 1990).

5.7 Other Markers

The secretion of zinc by the prostate is androgen dependent, and a level of < 2.4 μmol/ejaculate indicates a low contribution of prostatic fluid to the ejaculate, incomplete collection of the ejaculate, prostatic inflammation, or androgen insufficiency (Björndahl et al. 2010b). Fructose is another androgen-dependent secretion emanating mainly from the seminal vesicles, with a small contribution from the secretory epithelium in the ampulla of the vas deferens. Seminal fructose is used as a marker of seminal vesicle function, and < 13.0 μmol/ejaculate is considered abnormal. Low values are seen in hypogonadal men, after a short abstinence time, and where ejaculation or emission of fluid is impaired, such as in neuromuscular diseases, after surgery, in cases of drug use, in obstruction of the ejaculatory ducts, or with inflammation in the vesicles or prostate that may hinder emission (Bornman and Aneck-Hahn 2012).

Infection of the male reproductive tract can directly or indirectly cause infertility (Mortimer 1994a). Pyospermia is a laboratory finding categorized as the abnormal presence of leukocytes in human ejaculate and may indicate genital tract inflammation (Anderson 1995). Polymorphonuclear (PMN) leukocytes are the primary sources of reactive oxygen species (ROS) that cause inflammation, and peroxidase staining is used to detect their presence (Wolff et al. 1992).

Presence of agglutinated clumps of moving sperm in the semen sample could hamper the passage of sperm through the cervical mucus, and zonal binding and passage (Mortimer 1994b). Such clumps are formed by the exposure of spermatozoa to systemic immune defense system, due to the release of antisperm antibodies (ASA). ASA can also cause cell death and immobilized sperm cells. Detection of ASA bound to the surface of motile sperm is carried out by the mixed agglutination reaction assay (MAR test; only for IgGs) and the immuno-bead binding assay (for IgA, IgG, and IgMs) (Jarow and Sanzone 1992).

5.8 Sperm Functional Tests

Tests of Sperm Capacitation

Capacitation is a series of biochemical and structural changes that spermatozoa go through to undergo acrosome reaction (AR) and be able to fertilize. The process takes place in the female genital tract but can be induced in vitro by incubating spermatozoa with capacitation-inducing media. It is thought to have a role in preventing the release of lytic enzymes until spermatozoa reach the oocyte (Tesarik 1989). One of the signs of capacitation is the display of hyper-activation by spermatozoa. At the present time, the clinical value of sperm capacitation testing remains to be determined.

Tests of Hemizona and Zona Pellucida Binding

The interaction between spermatozoa and the zona pellucida is a critical event leading to fertilization and reflects multiple sperm functions (i.e., completion of capacitation as manifested by the ability to bind to the zona pellucida and to undergo ligand-induced AR) (Oehninger et al. 1994; Liu and Baker 2003; Consensus workshop on advanced diagnostic andrology techniques. ESHRE (European Society of Human Reproduction and Embryology) Andrology Special Interest Group 1996). The most common sperm-zona pellucida binding tests currently utilized are the hemizona assay (or HZA) and a competitive intact-zona binding assay (Quintero et al. 2005; Fénichel et al. 1991). The HZA which uses non-fertilized oocytes is useful to determine the cause, in couples who have failed to fertilize during regular IVF. As the binding is species-specific, human zona must be used, thus limiting the utility of these assays (Fénichel et al. 1991; Cross et al. 1986). The induced AR assays appear to be equally predictive of fertilization outcome and are simpler in their methodologies. The use of a calcium ionophore to induce AR is currently the most widely used methodology (Henkel et al. 1993; Katsuki et al. 2005).

Sperm Penetration Assay

This assay is also called the sperm capacitation index or the zona-free hamster oocyte penetration assay. The concept of the sperm penetration assay was introduced by Yanagimachi (1972; Yanagimachi et al. 1976). It yields information regarding the fertilizing capacity of human spermatozoa by testing capacitation, AR, sperm/oolemma fusion, sperm incorporation into the ooplasm, and decondensation of the sperm chromatin during the process. However, penetration of the zona pellucida and normal embryonic development are not tested. The sperm penetration assay (SPA) utilizes the golden hamster egg, which is unusual in that removal of its zona pellucida results in loss of all species specificity to egg penetration. Thus, a positive SPA does not guarantee fertilization of intact human eggs nor their embryonic development, whereas a negative SPA has not been found to correlate with poor fertilization in human IVF (Yanagimachi et al. 1976). The acrosin assay, an indirect measure of sperm’s penetrating capability, measures acrosin, which may be responsible for penetration of the zona pellucida and also for triggering the AR (Rogers and Brentwood 1982). Measurement of acrosin is thought to correlate with sperm binding to and penetration of the zona pellucida (Cross et al. 1986; Cummins et al. 1991).

Tests of Sperm DNA Damage

Mammalian fertilization involves the direct interaction of the sperm and the oocyte, fusion of the cell membranes, and union of male and female gamete genomes. Although a small percentage of spermatozoa from fertile men also possess detectable levels of DNA damage, which is repaired by oocyte cytoplasm, there is evidence to show that the spermatozoa of infertile men possess substantially more DNA damage that may adversely affect reproductive outcomes (Evenson et al. 1999; Zini et al. 2001). There appears to be a threshold of sperm DNA damage which can be repaired by oocyte cytoplasm (i.e., abnormal chromatin packaging, protamine deficiency) beyond which embryo development and pregnancy are impaired (Ahmadi and Ng 1999; Cho et al. 2003).

  A. DNA damage – direct tests

    (a) Terminal deoxynucleotidyl transferase-mediated deoxyuridine triphosphate (dUTP) nick end-labeling (TUNEL) assay

    (b) COMET assay

  B. DNA damage – indirect tests

    (a) Sperm chromatin structure assay (SCSA)

    (b) Sperm chromatin dispersion assay

    (c) Sperm fluorescence in situ hybridization analysis (FISH)

Overall, studies suggest that there is no significant relationship between sperm DNA damage and fertilization rate or pregnancy outcomes at IVF or IVF/ICSI (Bungum et al. 2007; Payne et al. 2005; Zini et al. 2005; Borini et al. 2006; Benchaib et al. 2007). However, there is evidence to suggest that sperm DNA damage is associated with poor pregnancy outcome after standard IVF (Lin et al. 2008; Frydman et al. 2008). Sperm FISH analysis may be useful in (a) infertile men with sex chromosome numerical anomalies, prior to ICSI; (b) infertile men with structural chromosome anomalies, prior to ICSI; (c) infertile men with severe oligozoospermia, prior to ICSI; and (d) couples with a history of recurrent miscarriages and trisomic pregnancies.

Assessment of Reactive Oxygen Species

Reactive oxygen species (ROS), also referred to as free radicals, are formed as a by-product of oxygen metabolism. Contaminating leukocytes are the predominant source of ROS in sperm suspensions (Aitken et al. 1989a, b). ROS can be eliminated by enzymes (e.g., catalase or glutathione peroxidase) or by nonenzymatic antioxidants, such as albumin, glutathione, and hypotaurine, as well as by vitamins C and E. Small amounts of ROS may be necessary for the initiation of critical sperm functions, including capacitation and AR. On the other hand, a high ROS level produces a state known as oxidative stress that can lead to biochemical or physiologic abnormalities with subsequent cellular dysfunction or cell death. Significant levels of ROS can be detected in the semen of 25 % of infertile men, whereas fertile men do not have a detectable level of semen ROS (Aitken et al. 1989b, 1991; Agarwal et al. 2006). Sperm ROS can be measured by using cellular probes coupled with flow cytometry or by the detection of chemiluminescence (Marchetti et al. 2002). Briefly, the chemiluminescence assay is performed by incubating fresh semen or sperm suspensions with a redox-sensitive, light-emitting probe (e.g., luminol) and by measuring the light emission over time with a luminometer. The clinical value of semen ROS determination in predicting IVF outcome remains unproven, but identifying oxidative stress as an underlying cause of sperm dysfunction has the advantage of suggesting possible therapies. Administration of antioxidants has been attempted in several trials with mixed results. Currently, there are no established semen ROS cutoff values that can be used to predict reproductive outcomes (Agarwal et al. 2005, 2008).

Sperm Proteomics

Sperm proteomics, an experimental technique used extensively in several branches of medicine, may identify some of the molecular targets implicated in sperm dysfunction (Aitken and Baker 2008). Sperm proteomics allows comparison of the protein structure of normal and defective spermatozoa (Aitken 2010).

5.9 Pre-2010 WHO Guidelines

Prior to 2010, semen analyses were performed mainly according to the WHO guidelines (World Health Organization 1992) to obtain volume, pH, sperm concentration, motility, and morphology. Sperm concentration was determined with the use of a Makler counting chamber. Motility was expressed as the percentage of motile spermatozoa and their mean speed, or motility quality (on a scale of 1–6, where 1 stands for immotile and 6 for very fast progressive motile, i.e., 100 μm/s). For sperm morphology evaluation, two slides were prepared of each sample after incubation of the semen samples with trypsin (10 min at room temperature); one slide was used for routine morphology evaluation by WHO criteria and the other for strict criteria evaluation. For evaluation according to WHO criteria, smears were flame-fixed and stained with methylene blue/eosin. At least 100 cells were examined per slide, with a final magnification of ×1000. Each slide was evaluated independently by two technicians. There should not be any statistically significant difference (by Pearson’s correlation matrix analysis) between the results of the two observers. The slides for evaluation by strict criteria were stained according to the Papanicolaou method and evaluated (Menkveld et al. 1990). In addition to the morphology evaluation according to strict criteria, the acrosome index (AI) and teratozoospermia index (TZI) were also determined (World Health Organization 1992; Menkveld and Kruger 1996).

Teratozoospermia Index

The TZI is an indication of the number of abnormalities present per abnormal spermatozoon. According to the 1992 WHO manual (World Health Organization 1992), each abnormal spermatozoon can have one to four abnormalities, viz., a head abnormality, a neck/mid-piece abnormality, a tail abnormality, or the presence of a cytoplasmic residue. These abnormalities can occur as a single defect or in a combination of two, three, or all four abnormalities simultaneously. The classification of spermatozoa for the TZI is recorded on a five-key laboratory counter at the same time as spermatozoa are recorded as normal or abnormal, in specific classes. The abnormalities recorded are added together, and the total is divided by the number of abnormal spermatozoa, i.e., 100 minus the percentage of morphologically normal spermatozoa.
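The TZI calculation described above can be sketched as follows. The input format (one integer per spermatozoon giving the number of defect categories recorded for it, with 0 meaning morphologically normal) and the function name are assumptions made for illustration.

```python
def teratozoospermia_index(defects_per_sperm):
    """TZI = total abnormalities / number of abnormal spermatozoa.

    Each entry is the number of defect categories (head, neck/mid-piece,
    tail, cytoplasmic residue) recorded for one spermatozoon: 0 for a
    normal cell, 1 to 4 for an abnormal one.
    """
    if any(n < 0 or n > 4 for n in defects_per_sperm):
        raise ValueError("each spermatozoon can have 0 to 4 abnormalities")
    abnormal = [n for n in defects_per_sperm if n > 0]
    if not abnormal:
        raise ValueError("TZI is undefined when no abnormal spermatozoa are counted")
    return sum(abnormal) / len(abnormal)


if __name__ == "__main__":
    # Made-up count of 100 cells: 10 normal, 50 with one defect,
    # 30 with two defects, 10 with three defects.
    counts = [0] * 10 + [1] * 50 + [2] * 30 + [3] * 10
    print(round(teratozoospermia_index(counts), 2))
```

Because every abnormal cell contributes at least one and at most four abnormalities, the index always lies between 1 and 4.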

Acrosome Index

Sperm acrosomal morphology was evaluated by light microscopy at ×1250 oil magnification based on acrosomal size and form as well as staining characteristics (Menkveld and Kruger 1996). Results were expressed as the AI (% normal acrosomes). For the evaluation of acrosome morphology, the same principles as for the evaluation of normal sperm morphology according to strict criteria are applicable. For an acrosome to be regarded as normal, the acrosome must have a smooth normal oval shape, with the same dimensions as for a normal spermatozoon. Acrosomes must be well-defined and comprise about 40–70 % of the normal-sized sperm head. The post-acrosomal part of the sperm head can be abnormal, but the rest of the spermatozoon must be normal; thus no neck/mid-piece and tail abnormalities and no cytoplasmic residue may be present. If the spermatozoon is classified as normal, the acrosome must always be classified as normal. The acrosome evaluation can be performed simultaneously with the routine morphology evaluation and the TZI, with the use of two laboratory counters. As with the normal sperm morphology, at least 100 spermatozoa are evaluated. The repeatability of the AI should be determined and should be within acceptable limits.

Reference Intervals

Reference intervals are the most widely used tool for the interpretation of clinical laboratory results. Reference interval development has classically relied on concepts elaborated by the International Federation of Clinical Chemistry Expert Panel on Reference Values during the 1980s. These guidelines involve obtaining and classifying samples from a healthy population of at least 120 individuals and then identifying the outermost 5 % of observations to use in defining limits for two-sided or one-sided reference intervals. Pre-2010 WHO guidelines were based on data obtained from laboratories that used different methodologies and examined different male populations, not supported by standardized methods or without the definition of fertile population. The male population studied included men without proven paternity, patients of human reproduction clinics that sought treatment, semen donors, and vasectomy candidates. Semen donors can be fertile and vasectomy candidates are very likely to be fertile, although there are no data about how long it took for their partners to get pregnant (Cooper et al. 2010). The cutoff point of 20 × 10⁶/ml was suggested as the lower normal value for sperm concentration in an ejaculate (World Health Organization 1999). However, there are studies indicating sperm concentrations of subfertile men to be less than 13.5 × 10⁶/ml (Guzick et al. 2001) and 31.2 × 10⁶/ml for fertility status (Nallella et al. 2006). Therefore, caution must be exercised with interpretation of the semen analysis based upon the reference values as men may be infertile with “normal” semen parameters or alternatively can be fertile with markedly “abnormal” semen profiles. There is likely no upper limit of semen morphology, motility, or count as pregnancy rates appear to generally increase with increasing numbers as well as improved sperm morphology and motility (Garrett et al. 2003).
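The nonparametric approach described above can be sketched with Python's standard library. This is an illustration under stated assumptions: "the outermost 5 %" is read as the 2.5th and 97.5th percentiles for a two-sided interval and the 5th percentile for a one-sided lower limit, the percentile interpolation is the default "exclusive" method of `statistics.quantiles` rather than a method prescribed by any guideline, and the function name is hypothetical.

```python
import statistics


def reference_limits(values, min_n=120):
    """Nonparametric reference limits from a reference population.

    Returns the two-sided interval (2.5th to 97.5th percentile) and the
    one-sided lower limit (5th percentile).
    """
    if len(values) < min_n:
        raise ValueError(f"at least {min_n} reference individuals are required")
    # n=40 yields 39 cut points spaced 2.5 percentile points apart:
    # cuts[0] = 2.5th, cuts[1] = 5th, cuts[-1] = 97.5th percentile.
    cuts = statistics.quantiles(values, n=40, method="exclusive")
    return {
        "two_sided": (cuts[0], cuts[-1]),
        "one_sided_lower": cuts[1],
    }


if __name__ == "__main__":
    # Made-up data: 120 evenly spaced values standing in for measurements.
    print(reference_limits(list(range(1, 121))))
```

For semen parameters, where only low values are of clinical concern, a one-sided lower limit is the more natural choice.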

Sperm Morphology

The clinical implications of poor morphology scores remain highly controversial. Initial studies evaluating the utility of strict sperm morphology in predicting fertilization rates during IVF used a score of greater than 14 % for normal. However, subsequent studies reported fertilization rates were lowest for patients with morphology scores of less than 4 %. Pregnancy rates have also been reported to be suboptimal with lower scores (Coetzee et al. 1998), but some recent studies have reported no relationship of morphology to IVF results (Keegan et al. 2007). The relationship between morphology scores and pregnancy rates with intrauterine insemination (IUI) (Van Waart et al. 2001; Spiessens et al. 2003; Shibahara et al. 2004) and intercourse (Guzick et al. 2001; Gunalp et al. 2001) has been examined; however, there has been no consensus on thresholds and management implications of poor morphology scores. Certain rare morphological abnormalities, such as sperm without acrosomes (globozoospermia), are highly predictive of failure to fertilize ova, yet in most cases fertilization and pregnancy are possible even with very low morphology scores. Although most clinicians utilize strict morphology in everyday practice, most studies have not addressed the significance of isolated low morphology in patients with otherwise normal semen parameters.

5.10 Need for Revised Guidelines

Human semen is very different from other body fluids, mainly because of its heterogeneity. Heterogeneity leads to several negative effects on the quality of the semen analysis. Some of the problems with the interpretation of semen analysis arise from the fact that production of spermatozoa is known to vary in the same individual and that semen analysis technique is poorly standardized. Many conditions including the duration of ejaculatory abstinence, activity of the accessory sex glands, analytical errors, and inherent biological variability account for the discrepancies (Berman et al. 1996; Carlsen et al. 2004; Sanchez-Pozo et al. 2013; Hamada et al. 2012). Analysis on multiple ejaculates from the same individual is recommended before characterizing a man as normal or infertile due to the large within-subject variation in sperm parameters (Keel 2006). In one study, the within-subject variability of 20 healthy subjects assessed over a 10-week follow-up ranged from 10.3 to 26.8 % (Alvarez et al. 2003). Concentration showed the highest within-subject variation (26.8 %), followed by morphology (19.6 %) and progressive motility (15.2 %), whereas vitality had the lowest variation (10.3 %). For these reasons, it would not be suitable to take the results of a single semen specimen as a surrogate for a man’s ability to father a child, unless it is at extremely low levels (Jequier 2005). Hence, it is prudent that clinicians request at least two semen specimens following 2–5 days of ejaculatory abstinence to allow a better understanding of the baseline semen quality status of a given individual (Berman et al. 1996; Carlsen et al. 2004; Sanchez-Pozo et al. 2013). In view of the intra- and interindividual variations in semen quality, population-based reference values are expected to have better utility in assessments of fertility (Esteves 2014). 
In addition, conventional semen analysis does not test for the diverse array of biological properties of spermatozoa that are responsible for bringing about pregnancy. Table 5.1 displays the changes in semen analysis reference values in different editions of the WHO manual. Following publication of the WHO 2010 manual, several semen analysis parameters and their recommended ranges prescribed by the new guidelines became the topic of intense discussion. In this section, each of them is considered in some detail.
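The within-subject variability figures quoted above are coefficients of variation. As a minimal sketch, assuming the usual definition (sample standard deviation divided by the mean, expressed as a percentage), the calculation for repeated ejaculates from one man could look as follows; the concentration values are hypothetical and are not taken from the cited study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the coefficient of variation in percent (sample SD / mean * 100)."""
    if len(values) < 2:
        raise ValueError("at least two measurements are required")
    average = mean(values)
    if average == 0:
        raise ValueError("mean of zero: coefficient of variation is undefined")
    return stdev(values) / average * 100.0


# Hypothetical sperm concentrations (10^6/ml) from four ejaculates of one man.
concentrations = [60.0, 45.0, 80.0, 52.0]
print(round(coefficient_of_variation(concentrations), 1))
```

A large value for a single individual illustrates why one specimen is a poor surrogate for baseline semen quality.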

Sperm Motility

Surprisingly, the 2010 WHO manual abandons the distinction between slow- and rapid-progressive spermatozoa. The reasoning for this change appears to be primarily based on the observation that poorly trained technicians cannot distinguish between the two categories in a repeatable and reliable manner (Mortimer 1994c). In fact, the quality of sperm motility is a prime factor to be considered in a semen analysis (MacLeod and Gold 1951). In addition, proper training and achievement of intra- and interobserver standardization are essential to assess sperm motility. The arguments posited by the WHO have been refuted elsewhere (Björndahl 2010; Eliasson 2010). Very importantly, there are clinical data both from manual sperm motility assessments and computer-aided sperm analysis showing the distinction of rapidly progressive spermatozoa to be biologically, and hence clinically, important. This evidence ranges from the ability of spermatozoa to penetrate cervical mucus (Aitken et al. 1985; Mortimer et al. 1986) and in vivo conceptions (Comhaire et al. 1988; Barratt et al. 1992) to clinical outcome studies in donor insemination (Irvine and Aitken 1986), IUI (Bollendorf et al. 1996), and IVF (Bollendorf et al. 1996; Sifer et al. 2005). Even with regard to ICSI, the straight-line velocity of the individual spermatozoa subsequently injected into the oocyte has been shown to have a significant effect on fertilization outcome (Van den Bergh et al. 1998). In view of this evidence, it is scientifically and clinically inappropriate to abandon the differentiation of rapid- and slow-progressive spermatozoa.

Sperm Morphology

The WHO 2010 manual has fully adopted the Tygerberg Strict Criteria for normal sperm morphology (Menkveld et al. 1990). These criteria are based on the typical morphology of spermatozoa that are able to migrate through cervical mucus and bind to the zona pellucida, even though in “normal” men only a small proportion of spermatozoa correspond to the typical morphology (Menkveld et al. 2011). As a consequence, an extra measure that includes the different types of abnormalities can provide additional useful information by identifying men with more severe disturbances in sperm form and related function, e.g., the multiple anomalies index (MAI) (Jouannet et al. 1988) and the teratozoospermia index (TZI) (Menkveld et al. 1998; Mortimer et al. 1990; Mortimer and Menkveld 2001).

The TZI is an indirect indication of (i) the risk of what appeared to be normal spermatozoa actually having defects that were invisible at the level of observation and (ii) just how badly affected spermiogenesis was in the man and hence how impaired his sperm fertilizing ability might be (Mortimer and Menkveld 2001). The TZI can provide extra information in cases where there are very few morphologically normal forms, as the presence of 4 or 6 % normal forms is considered to reflect a major difference in clinical significance. The TZI is highly pertinent when interpreting sperm morphology assessments based on counts of just 200 spermatozoa, because with such counts there is no statistically significant difference between 4 and 6 % normal forms at the 95 % confidence level (Björndahl et al. 2010a).
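The sampling argument above can be checked with a short calculation. The sketch below assumes a Wilson score interval for a binomial proportion, which is one common choice and not necessarily the method used in the cited work, and it treats overlapping intervals as an informal indication only, not as a formal significance test.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95 %)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denominator = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    return center - half_width, center + half_width


# 4 % and 6 % normal forms in a count of 200 spermatozoa (8 and 12 cells).
low4, high4 = wilson_interval(8, 200)
low6, high6 = wilson_interval(12, 200)
intervals_overlap = low6 <= high4
print((low4, high4), (low6, high6), intervals_overlap)
```

Under these assumptions the two intervals overlap substantially, which is consistent with the statement that a count of 200 cells cannot reliably separate 4 % from 6 % normal forms.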

In the 2010 WHO manual, the assessment of multiple sperm defects has been relegated to “Optional Procedures,” although calculation of the TZI has been corrected to be out of four instead of three, as erroneously used in the 4th edition (World Health Organization 1999). Even if only % normal spermatozoa is reported, the actual assessment procedure should include all the characteristics/criteria needed for TZI since recording the prevalence of the four categories of morphological deviations is essential for quality control (internal and external) purposes. In terms of clinical application of the TZI, the consensus-based WHO Manual for the Standardized Investigation, Diagnosis and Management of the Infertile Male (Rowe et al. 2000) has commented that, together with the introduction of the Tygerberg Strict Criteria in the 1999 WHO laboratory manual, the TZI had been included to provide additional information to facilitate discrimination of the extent of impairment of sperm functional potential in men with very low numbers of normal spermatozoa. In addition, applicable reference values based on the four-defect-category TZI could have been included in the manual.
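As a minimal sketch of a TZI scored out of four, the index can be computed as the total number of defect categories recorded divided by the number of abnormal spermatozoa. The four category labels used here (head, midpiece, tail, cytoplasm) and the function name are assumptions chosen for illustration; a laboratory would use the category definitions of its own protocol.

```python
DEFECT_CATEGORIES = ("head", "midpiece", "tail", "cytoplasm")


def teratozoospermia_index(defects_per_sperm):
    """Mean number of defect categories per abnormal spermatozoon (1.00 to 4.00).

    defects_per_sperm: one collection of category labels per spermatozoon
    assessed; an empty collection denotes a normal spermatozoon.
    """
    total_defects = 0
    abnormal_count = 0
    for defects in defects_per_sperm:
        categories = set(defects)
        unknown = categories - set(DEFECT_CATEGORIES)
        if unknown:
            raise ValueError("unknown defect categories: %s" % sorted(unknown))
        if categories:
            # Each category counts once per spermatozoon, so the maximum is four.
            abnormal_count += 1
            total_defects += len(categories)
    if abnormal_count == 0:
        raise ValueError("no abnormal spermatozoa: TZI is undefined")
    return total_defects / abnormal_count


# Hypothetical assessment of four spermatozoa, one of which is normal.
assessment = [set(), {"head"}, {"head", "tail"},
              {"head", "midpiece", "tail", "cytoplasm"}]
print(round(teratozoospermia_index(assessment), 2))
```

Because normal spermatozoa are excluded from the denominator, the index cannot fall below 1.00, and with four categories it cannot exceed 4.00.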

Retention of the Use of Nomenclature Terms

The WHO 2010 manual retains the use of nomenclature terms such as oligozoospermia. Such terms simply classify the perceived quality of the semen but do not identify, or even suggest, biological cause or real fertility potential (Eliasson 1977, 2010; Eliasson et al. 1970; Bostofte et al. 1981) and hence are not very helpful. Many experts have discussed the possible reference values and such nomenclature, and probably the most useful approach is to provide three interpretation categories: normal, doubtful, and pathological or not normal (Guzick et al. 2001; Björndahl 2010; Eliasson 1977).

Multiple Methods and Nonlinear Method Presentation

WHO 2010 still includes multiple methods for performing some of the tests, with poor explanations of their relative merits or otherwise, e.g., determination of low sperm concentrations in semen, alternative stains for sperm morphology assessment (e.g., Diff-Quik™), and the use of eosin without a counterstain for sperm vitality assessment. Some of the methods (e.g., sperm concentration) are also presented in a complex manner (World Health Organization 2010). These issues diminish the practical usefulness and will delay adoption of the WHO 2010 guidelines. Lack of clear step-by-step protocols for easy implementation and routine use, information on the limitations of the methods, etc., make it harder for a laboratory to adapt a method into its standard operating procedure.

Inconsistencies and Errors

There are several errors and inconsistencies in WHO 2010. One method particularly affected by this is the determination of sperm vitality using eosin-nigrosin staining. (1) The cutoff to perform a vitality assessment has been changed from 50 % immotile spermatozoa (World Health Organization 1992, 1999) to “less than about 40 % progressively motile spermatozoa” (World Health Organization 2010). The change is illogical since nonprogressively motile spermatozoa are clearly still “live.” (2) The interpretation criteria for eosin staining have been changed arbitrarily so that “light pink heads are considered alive” (World Health Organization 2010). This is contrary to papers on eosin exclusion staining for mammalian sperm vitality going back 60 years. The standard criterion is that any degree of pink coloration indicates that a spermatozoon is not “live” (Mortimer 1994a) with the sole, strict exception of the “leaky neck” staining artifact where faint pink coloration might be seen in the very posterior region of the sperm head (Björndahl et al. 2003, 2004). The revised criterion in WHO 2010 is clearly wrong and will affect the results obtained.

Unnecessary Extra Work

In WHO 2010, it is stated that both sperm vitality and sperm morphology assessments must be made in duplicate, evaluating 200 spermatozoa in each replicate “in order to achieve an acceptably low sampling error” (World Health Organization 2010). These requirements represent substantial extra work for what are unestablished improvements in accuracy and/or precision in the final results. Indeed, Menkveld has previously established the adequacy of a single assessment of sperm morphology on 200 cells from a single slide (Menkveld et al. 1990), and with a binary endpoint such as vitality, any possible improvement will be minimal. Similarly, the improved method for determining low values of sperm concentration leads to substantial extra work to improve accuracy or precision, which may not provide any increase in clinical value to be useful from a diagnostic or prognostic perspective. For each of these changes, the WHO manual should have provided justifications for the substantial extra effort and hence costs involved.

Illogical Sperm Preparation Methods

WHO 2010 still allows simple centrifugal washing of spermatozoa for “good-quality” semen samples. Unfortunately, one cannot be certain that an ejaculate is free from the attendant risks of reactive oxygen species damage (Aitken and Clarkson 1987, 1988; Mortimer 1991) without assessing both sperm morphology for spermatozoa with retained cytoplasm and verifying the absence of peroxidase-positive leukocytes. To achieve both of these between completion of semen liquefaction and the need to commence sperm preparation by 30 min post-ejaculation is clearly impossible on a routine basis (Björndahl et al. 2010a; Mortimer 2000). The density-gradient method described in WHO 2010 contains numerous errors. It requires the addition of 10 ml of a 10× medium to 90 ml of a “density-gradient medium” of silane-coated colloidal silica, although all commercially available silanized colloidal silica sperm preparation products since 1997 are already isotonic. The only colloidal silica product that is not already isotonic is Percoll (which is polyvinylpyrrolidone-coated silica), and it has been banned from clinical use by its manufacturer effective 1 January 1997 (Mortimer 2000). WHO 2010 perpetuates the incorrect colloid layers that have been in the WHO laboratory manual since 1992 (World Health Organization 1992), using a 72 % colloid-equivalent lower layer, which is too low in density (i.e., 1.1 g/ml). While this will provide an apparently higher yield, it only does so by allowing poorer quality spermatozoa into the pellet (Björndahl et al. 2010a; Mortimer 2000). Finally, WHO 2010 still recommends Ham’s F10 medium for all sperm preparation methods, even after 15 years of a clear recommendation that it should not be used for this purpose due to its iron content (Gomez and Aitken 1996).

The Delusion of Suddenly Changed Limits Between Fertile and Subfertile Men

The part of WHO 2010 that has caught most attention in the field of reproductive medicine is the lowered reference limits calculated from results on semen provided by recent fathers and men in a general population. It appears that there is a common belief that the biology of subfertility has changed as a result of the lowering of the “normal/fertile” reference limits or ranges. There are, however, a number of problems related to the establishment of reference ranges based only on individuals without the disorder, i.e., men who are not subfertile (Björndahl 2011). Furthermore, since the data were collected during a long period of time, and external quality control had not been implemented in all contributing laboratories (Cooper et al. 2010), the validity of the suggested reference limits can be questioned. Due to the considerable overlap of results from fertile and subfertile men, a valid approach would be to identify three zones: (i) “normal results,” i.e., a low probability of subfertility and high probability of fertility; (ii) “abnormal results,” i.e., a high probability of subfertility and low probability of fertility; and (iii) “borderline results,” i.e., no clear discrimination between subfertility and fertility (Björndahl 2010; Björndahl et al. 2010a). Dividing the range of results into these three zones is well established in andrology (Mortimer 1994a; Eliasson 1977), and the material presented in WHO 2010 provides no evidence that might contradict the validity of this principle.

A further concern regarding the origin of the WHO 2010 reference values is that the data came from studies on semen samples obtained after 2–7 days of abstinence, as has been advocated in all five editions of the WHO manual. This persistently ignores the fact that MacLeod and Gold (1952) clearly demonstrated that ejaculate volume, and sperm concentration in particular, increase considerably with each day of increasing abstinence: e.g., sperm concentration more than doubled when the abstinence increased from 3 to 10 days. Similar results have been reported by others (Mortimer et al. 1982). For the purpose of standardization, and especially comparisons between groups, it is therefore of the utmost importance that the prescribed period of abstinence before a semen analysis should be from 3 to 4 days (Björndahl et al. 2010a; Menkveld 2007). The fact that abstinence periods were not so standardized in the source studies for the WHO 2010 casts further doubt on the usefulness of the derived reference values.

In the most recent 2010 manual (World Health Organization 2010), the WHO has published new criteria for human semen characteristics that are markedly lower than those previously reported. It is noteworthy that the WHO manual reports reference values identified in fertile population rather than the minimum requirements for male fertility. The reference ranges have been identified based on the assessment of 4,500 men from 14 different countries whose partners were able to conceive within 12 months (Cooper et al. 2010). Cooper et al. have published updated reference values obtained from analyses of multi-country data from laboratories that have used the WHO standard methodology for semen analysis (World Health Organization 1987, 1992, 1999). For the first time, semen analysis results from recent fathers with known time-to-pregnancy (TTP), defined as months (or cycles) from stopping contraception to achieving a pregnancy, were analyzed. Raw data obtained from five studies of seven countries in three continents were pooled then assessed (Stewart et al. 2009; Slama et al. 2002; Swan et al. 2003; Jensen et al. 2001; Haugen et al. 2006; Auger et al. 2001). Approximately 1,900 men, who had fathered a child within 1 year of trying to initiate pregnancy, provided a sample of semen each for sperm counts, motility, and volume assessments. Data on sperm morphology were extracted from four studies comprising approximately 1,800 men, whereas sperm vitality, assessed by the eosin-nigrosin method, was obtained from approximately 400 men of two countries (Stewart et al. 2009; Swan et al. 2003; Haugen et al. 2006; Auger et al. 2001). The mean ± SD male age was 31 ± 5 years (range 18–53 y) and only ten men were over 45 years old. Participating laboratories practiced internal and external quality control and used standardized methods for semen analysis according to the WHO manual for the examination of human semen current at the time of the original studies (Cooper et al. 2010). 
The 95 % reference intervals are commonly referenced, with the lower 2.5th and 5th percentiles being used as limits for two- and one-sided distributions, respectively (Table 5.2). The fifth centile was proposed as the lower reference cutoff limit for “normality” (Cooper et al. 2010).

Table 5.2 Distribution of values, lower reference limits, and their 95 % CI for semen parameters from fertile men whose partners had a time-to-pregnancy of 12 months or less

Data from three other groups have been used for comparison: (1) “unscreened” men from the general population or young volunteers participating in hormonal contraception studies, considered representatives of the general population (965 samples, 7 studies, 5 countries, 3 continents); (2) “screened” men from different origins, of unknown fertility but with semen analysis within reference values (934 samples, 4 studies, 4 countries, 3 continents, 2 WHO multinational studies); and (3) fertile men with unknown TTP, representing the group and all ranges of fecundity – normal or moderately or severely impaired (817 samples, 2 studies, 2 continents, 2 WHO multinational studies).

The assessment of progressive motility according to grades, as recommended by the previous WHO manuals, has been replaced by categorizing motile sperm as being “progressive” or “nonprogressive.” In addition, the strict criterion for morphology assessment was incorporated as the standard method. The lower limits of these distributions were lower than the values presented in previous editions except for the total sperm number per ejaculate (World Health Organization 1987, 1992, 1999, 2010).

The very low cutoff value for sperm morphology of 4 % morphologically normal spermatozoa, as proposed in the new edition of the WHO manual on semen analysis, is in agreement with recently published values and reflects the trend of a decline in reported mean values for normal sperm morphology. The reduced value for morphologically normal spermatozoa over the years may be due to several factors. The first is the introduction of strict criteria for the evaluation of sperm morphology. Other reasons may include the introduction of additional criteria for sperm morphology abnormalities and the suggested decrease in semen parameters because of increasing negative environmental influences. The newly proposed very low normal value may not provide strong predictive value for a male’s fertility potential. However, certain morphology patterns and sperm abnormalities are now known to be of strong prognostic value. A good predictive value can be obtained by following the holistic, strict approach for sperm morphology and related parameter evaluation (Menkveld 2010).

5.11 Impact of 2010 WHO Guidelines

Several studies have evaluated consequences of revised reference limits and other parameters proposed in the 2010 WHO guidelines. Catanzariti et al. have reevaluated the results of semen analysis of 427 men using the new criteria. Almost 16 % of the patients, considered infertile according to the old criteria, were evaluated to be normal by the new classification and they would not need any treatment for infertility (Catanzariti et al. 2013). Their study also demonstrated that none of the patients that were previously considered normal changed to abnormal, according to the new classification, but some patients, about 15 %, changed from abnormal to normal by the new classification.

In a recent study by Baker et al., fertility categories were assigned as follows: BE (below WHO 2010 criteria), BTWN (above WHO 2010 criteria but below WHO 1999 criteria), and N (above WHO 1999 criteria) (Baker et al. 2015). A total of 82.3 % of initial semen tests were categorized as BE, and the predominance of this category was unchanged by publication of the WHO 2010 criteria. Men with initial semen analysis categorized as BTWN or N represented 16.2 and 1.5 % of the referral population, respectively. Subjects initially categorized as BTWN were more likely to change fertility categories, and overwhelmingly this migration was downward. Analysis of normal individual semen parameters revealed statistically worse mean concentration and motility when at least one other parameter fell below the WHO 2010 criteria (Baker et al. 2015).

Esteves et al. also reported the reclassification of data involving 982 men who had abnormal semen analysis results based on the 1999 WHO criteria. Approximately 39 % of these men would be reclassified as “normal” by the new 2010 criteria. Morphology itself accounted for over 50 % of the reclassifications (Esteves 2014).

Semen parameters below the WHO 2010 reference limits will be used to define male infertility and to recommend further evaluation and treatment. Such a recommendation will not address the case of unexplained infertility presenting with at least two normal semen analyses and no identifiable causes after a thorough work-up including history, physical examination, and endocrine laboratory testing in the absence of female infertility (Hamada et al. 2012). The use of the new WHO 2010 reference values will lead to more men being classified as “fertile.” As a result, assessment of semen analysis alone as a surrogate measure for male fertility may lead to nondiagnosis or delayed diagnosis of male infertility.

The WHO 2010 reference limit will also impact recommendation for further treatment based on the results of semen analysis. Current guidelines propose treatment to men with clinical varicoceles in the presence of abnormal semen analyses (Male Infertility Best Practice Policy Committee of the American Urological Association and Practice Committee of the American Society for Reproductive Medicine 2004; Dohle et al. 2012; de Radiologia and Projeto Diretrizes da Associacao Medica Brasileira 2013; Practice Committee of American Society for Reproductive Medicine 2008), but the application of the new WHO reference values might lead to their ineligibility for treatment if their semen parameters are above the fifth centile. This may prevent them from achieving a substantial improvement in semen parameters and a greater chance of spontaneous pregnancy (Esteves 2014).

The threshold for normal sperm in terms of sperm morphology (strict criteria; Tygerberg method) has been lowered to 4 % in the WHO 2010 criteria compared to 14 % in the previous 1999 standards. Murray et al. have shown that 15.9–19.3 % of men would be reclassified as having normal morphology of > 4 % from having been abnormal in the past, i.e., < 14 % (Murray et al. 2012). This could lead to increased recommendation of intracytoplasmic sperm injection (ICSI) instead of conventional IVF or intrauterine insemination (IUI), as the pregnancy outcomes of IVF and IUI are significantly lower when semen with a low proportion of normal sperm is used (Van Waart et al. 2001; Coetzee et al. 1998). However, 5 % of the fertile men used to determine the reference limits themselves had fewer than 4 % morphologically normal spermatozoa. Medical disease associated with male infertility may also be missed, with fewer men potentially being defined as infertile by the new reference values. Kolettis and Sabanegh (2001) found that 6 % of infertile men had significant medical pathology detected by the infertility work-up.
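The effect of the lowered morphology threshold on classification can be sketched with the two limits quoted above (4 % and 14 %). The category labels are borrowed from the Baker et al. scheme described earlier, applied here to morphology alone as an illustrative assumption; values equal to a limit are treated as meeting it, which is also an assumption, and the morphology scores are hypothetical.

```python
def morphology_category(normal_forms_pct, limit_2010=4.0, limit_1999=14.0):
    """Classify a strict-morphology score against the 2010 and 1999 limits.

    Returns "BE" (below the 2010 limit), "BTWN" (at or above the 2010 limit
    but below the 1999 limit), or "N" (at or above the 1999 limit).
    """
    if not 0 <= normal_forms_pct <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if normal_forms_pct < limit_2010:
        return "BE"
    if normal_forms_pct < limit_1999:
        return "BTWN"
    return "N"


def reclassified_share(scores):
    """Share of all men who are abnormal by the 1999 limit but normal by 2010."""
    if not scores:
        raise ValueError("no scores supplied")
    between = sum(1 for s in scores if morphology_category(s) == "BTWN")
    return between / len(scores)


# Hypothetical strict-morphology scores (% normal forms) for six men.
scores = [2, 3, 5, 9, 14, 20]
print([morphology_category(s) for s in scores], reclassified_share(scores))
```

Men falling in the intermediate category are exactly those whose label changes when the newer limit replaces the older one, without any change in their semen.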

5.12 Limitations of the 2010 WHO Standards

The significance of a cutoff value defining fertile from nonfertile men without knowledge of the overall clinical history is a concern (World Health Organization 1999, 2010). The values created in the 2010 WHO study were from 4,500 fertile men, and analyses of semen from infertile men were not performed. Therefore, WHO did not define men as infertile if they were below the one-sided 95 % confidence interval of fertile men. The value of semen analysis parameters themselves has been questioned with other functional sperm abnormalities potentially evident that are independent from the current measured parameters (Barratt et al. 2011; Esteves et al. 2012).

The assignment of 5th centile as a discriminating cutoff value in male reproductive potential is a new development. More specifically, the 5th-centile values for semen parameters were generated on the basis of broader statistical norms and not on the basis of any clinical outcomes from fertile and infertile men. In other words, there is no clear evidence that application of these values effectively segregates men on the basis of their fertility, yet that is exactly how these new ranges are being applied clinically all over the world. As noted by Niederberger (2011), although 5th-centile values are commonly used as cutoff markers in statistics, the ability of this arbitrarily assigned cutoff point to provide meaningful information about a male’s fertility potential is questionable (Yerram et al. 2012).

There are pitfalls with reference limits and the proper use of such limits is essential for the interpretation of the results of semen analysis. It is critical to understand the statistical basis of reference ranges and cutoff limits and the importance of standardizing methods and practical laboratory training. Proper understanding of biological and physiological variability is also essential for the correct interpretation of semen analysis results. Understanding all the factors influencing semen analyses is of great importance for the development of the entire field of reproductive medicine (Björndahl 2011).

The reference population needs to be carefully defined for the intended clinical use of semen analysis. To determine appropriate reference intervals for use in male fertility assessment, a reference population of men with documented time-to-pregnancy of <12 months would be the most suitable. However, for epidemiological assessment, a reference population made up of unselected healthy men would be preferred. Currently, reference and decision limits derived for individual semen analysis test results are the interpretational tools of choice. In the long term, interpretation of semen analysis in combination with information from the female partner using multivariate methods will be necessary for the assessment of the likelihood of achieving a successful pregnancy in a subfertile couple (Boyd 2010).

Appropriate interpretation of the seminal analysis should be based on the dependability of the laboratory and the medical knowledge about the meaning of the seminal alterations. A recent study by Rivera-Montes et al. compared the evaluation of semen parameters from three laboratories, using the WHO recommendations for reporting sperm count, motility, and morphology. There was statistically significant interlaboratory variability in the parameters studied (p < 0.001). The observed mean intra-observer coefficients of variation (CVs) were 3.6 % for sperm count, 20.3 % for motility, and 9.4 % for sperm morphology (Rivera-Montes et al. 2013). Procedures for the quality control of semen analysis methods have been introduced recently. However, there are issues relating to the methodology of Cooper et al. (1999, 2002). Even with internal and external quality controls, semen analysis is an operator-dependent and subjective assessment, especially so for sperm morphology (Menkveld 2010; Keel et al. 2000).

The methodology employed to determine the reference ranges in the WHO 2010 manual gives rise to important concerns on careful examination. It appears unsound to assume that the 2010 reference standards represent the distribution of fertile men across the globe (Esteves et al. 2012; Vieira 2013). The group of studied men represented a limited population of individuals who lived in large cities in the Northern hemisphere, except for a small subset of men from Australia. Of note was the absence of men from densely populated areas in Asia, the Middle East, Latin America, and Africa. This fact precludes the examination of regional and racial discrepancies that could account for semen quality variability. The selection criteria were arbitrary, as stated by Cooper et al.: “laboratories and data were identified through the known literature and personal communication with investigators and the editorial group of the fifth edition of the WHO laboratory manual” (Cooper et al. 2010). The heterogeneity of human semen further diminishes the clinical significance of the WHO reference values. Data indicate that there are subtle variations in semen parameters between men in different geographic areas and even between samples from the same individual (Alvarez et al. 2003; Jorgensen et al. 2001).

The lowered 2010 WHO thresholds have also been attributed to the decline in sperm count caused by endocrine disruptors and other environmental pollutants, such as insecticides and pesticides (Handelsman 2001; Sadeu et al. 2010; Carlsen et al. 1992). However, the observed discrepancies are more likely to be associated with the methodological factors, such as patient selection criteria, the higher laboratory quality control standards, and the strict criteria for morphology assessment (Cocuzza and Esteves 2014).

Conclusions

What seems like a relatively small change has a large potential impact. It might result in previously subfertile men being classified as fertile by many providers, especially in idiopathic cases where the semen analysis may be the only basis for a decision on male factor. This will affect the reporting of data for research, and even demographics and outcomes. It may also misrepresent the definition of male infertility and lead to underrecognition of the cause, and of the need for subsequent work-up, of infertility in a couple. In addition, better international standardization of the technical methodology, consensus on the interpretation of sperm morphology evaluation criteria, and standardized international external quality control (EQC) schemes are of utmost importance to formulate robust guidelines that will have good predictive value for fertility.