Introduction

The Y chromosome harbors genes essential for testis development and function, such as the master gene for testis determination (SRY) and the genes residing in the azoospermia factor (AZF) regions. Since the discovery of the AZF region in 1976 (Tiepolo and Zuffardi 1976), and after the definition of three distinct AZF regions in 1996 (Vogt et al. 1996), this chromosome became the most important molecular genetic target in male infertility. First, the role of complete AZF deletions was addressed by several studies during the 1990s leading to the introduction of AZF testing into clinical practice (Krausz et al. 2014). Subsequently, rearrangements of the AZFc region became a hot topic both in relationship with impaired spermatogenesis and testicular germ cell tumors. While the above research on Y chromosome-linked copy number variations (CNV) was highly successful from a clinical point of view, the role of AZF genes in spermatogenesis remains largely unknown. This review is aimed at providing a comprehensive overview of Y chromosome-linked CNVs, genes, and Y chromosome haplogroups in relationship to spermatogenic failure.

The Y chromosome structure and its genes

The human Y chromosome differs markedly from the remaining chromosomes in terms of its size, genomic structure, content, and evolutionary trajectory (Navarro-Costa 2012). It is an acrocentric chromosome with two pseudoautosomal regions (PAR1 and PAR2) and the MSY (male-specific Y region), which represents about 95% of the entire Y chromosome length. The PARs correspond to X–Y homology blocks and contain 27 genes encoding products with diverse biological functions and, accordingly, variable expression patterns (Mangs and Morris 2007). PAR1 is involved in meiotic pairing, which is an essential process for successful male meiosis (Kauppi et al. 2011).

The MSY region does not recombine with the X chromosome and contains three classes of sequences: X-transposed (with 99% identity to the X chromosome), X-degenerate (single-copy genes or pseudogene homologues of X-linked genes) and ampliconic (Skaletsky et al. 2003). During evolution, MSY genes were driven either to decay or to specialization into functions selectively advantageous for males (Navarro-Costa 2012; Bellott et al. 2014). As a consequence of progressive gene decay, the number of genes on the Y chromosome is markedly reduced compared with the X chromosome (54 protein coding genes vs. approximately 700). According to the Y reference sequence, belonging to a single individual bearing haplogroup R1, MSY contains 156 transcription units including 78 protein-coding units (27 genes encoding proteins). It is important to note that the sequence, structure, and copy number of multicopy genes may vary among different Y chromosomes (Repping et al. 2006, see also other articles of this special issue).

Concerning specialization into male reproductive functions, the majority of genes with a predicted role in spermatogenesis map to ampliconic sequences. Ampliconic sequences are characterized by sequence pairs showing nearly complete (>99.9%) identity with one or more regions on MSY and are organized in eight massive palindromes. The presence of highly homologous repeated sequences with the same orientation predisposes to two recombination mechanisms with opposite consequences on male reproduction (Fig. 1). On the one hand, it allows gene conversion, i.e., a non-reciprocal transfer of sequence information occurring between duplicated sequences within the chromosome. This process preserves genes important for male fertility by preventing the gradual accumulation of deleterious mutations across evolutionary time in the absence of crossing over (Rozen et al. 2003). On the other hand, the same structure predisposes to deletions/duplications of the genetic material between repeated sequences through non-allelic homologous recombination (NAHR), with potential negative consequences on spermatogenesis. Finally, it has been speculated that palindromes may regulate the transcriptional permissibility of the MSY genes via changes in chromatin structure (Lemos et al. 2010).

Fig. 1

Schematic representation of the Y chromosome, the male-specific Y (MSY) region, and the ampliconic sequences organized into eight massive palindromes on the Yq. The presence of highly homologous repeated sequences (containing genes important for spermatogenesis) with the same orientation predisposes to two recombination mechanisms with opposite consequences on male fertility. AZF azoospermia factor, NAHR non-allelic homologous recombination, P palindrome, IR2 inverted repeats

Spermatogenesis is characterized by many unique features, which are especially relevant during meiosis and post-meiotic stages (Hecht 1998). One of these peculiarities is transcriptional silencing; a network of mRNA storage and translational control is therefore crucial for the accomplishment of this process. Similarly, ubiquitination in the testis is known to regulate many fundamental biochemical processes, including DNA repair, regulation of meiotic chromatin structure, and the histone-to-protamine transition during spermatogenesis. Moreover, germ cells are particularly rich in microtubule networks, which are mainly involved in meiosis. Many of the MSY genes participate in the above-mentioned specific processes.

MSY genes with a predicted role in spermatogenesis are classifiable into two categories: single-copy and ampliconic multicopy genes. Genes belonging to the first category have a single-copy homologue on the X chromosome (they map either to the so-called “X-degenerate” or to the “X-transposed” sequences) and these X–Y gene pairs show signs of dosage sensitivity. In fact, a study on the X and Y chromosomes of eight species demonstrated that a higher proportion of X-linked genes with surviving Y-linked homologues escape X-inactivation compared to those without surviving Y-linked homologues (Bellott et al. 2014). All of these genes are ubiquitously expressed except for TGIF2LY, which is expressed exclusively in the testis. Among these genes, those belonging to the AZFa and AZFb regions (see paragraph below) are the strongest spermatogenesis candidate genes (USP9Y, DDX3Y, EIF1AY, RPS4Y2, and KDM5D) and their deletion en bloc is responsible for spermatogenic failure in humans. Two ubiquitously expressed genes, USP9Y and DDX3Y (formerly DBY), map to the AZFa region.

USP9Y, a member of a family of deubiquitinating genes, encodes a protein with a ubiquitin C-terminal hydrolase activity. This protein may play an important regulatory role at the level of protein turnover by preventing degradation of proteins by the proteasome through the removal of ubiquitin from protein–ubiquitin conjugates (Ginalski et al. 2004). It has been demonstrated that in humans, the isolated absence of this gene is associated with a large spectrum of testis phenotypes (from azoospermia with hypospermatogenesis to normozoospermia) (Tyler-Smith and Krausz 2009). Based on this, USP9Y is more likely a fine tuner that improves efficiency, rather than a provider of an essential function. Moreover, the observed natural conceptions suggest that the protein is not required for the final sperm maturation process or for the acquisition of sperm-fertilizing ability (Krausz et al. 2006; Luddi et al. 2009).

DDX3Y encodes an ATP-dependent RNA helicase that is a member of the well-conserved DDX3 DEAD Box Helicase family (Mohr et al. 2002). Although DDX3Y’s expression at the RNA level is ubiquitous, there are specific transcript variants expressed only in the male germ line and its protein expression seems to be restricted to the testis (Ditton et al. 2004; Jaroszynski et al. 2011; Rauschendorf et al. 2014). DDX3Y protein was found predominantly in spermatogonia, whereas its X chromosome homologue was expressed after meiosis in spermatids. This observation suggests that the X–Y pairs have diverged functionally to fulfill different roles in the RNA metabolism of human spermatogenesis. Based on DDX3Y’s expression profile and the above-mentioned USP9Y deletion phenotype, it is highly likely that the removal of this gene is responsible for the SCOS phenotype in AZFa-deleted patients. In line with the role of DDX3Y in the early stages of spermatogenesis, a recent study in Drosophila melanogaster on the RNA helicase Belle (DDX3) shows that this protein is required for mitotic progression and survival of germline stem cells (GSCs) and spermatogonial cells (Kotov et al. 2016). Overall, DDX3Y protein is involved in various biological processes: (1) mRNA nuclear export, (2) RNA unwinding, (3) transcriptional regulation, i.e., DDX3Y may initiate global transcriptional remodeling in pre-spermatogonial cells, (4) translation initiation, i.e., by directly activating the expression of RNA metabolism genes through the binding to small ribosomal RNA or mRNA molecules, (5) cell-cycle control, (6) apoptosis (Kotov et al. 2014; Ramathal et al. 2015; Schröder 2010). Since no isolated DDX3Y deletions or pathogenic mutations have been reported in this gene so far, its exact biological role in spermatogenesis remains to be clarified.

In the AZFb region, there are three single-copy genes, EIF1AY, RPS4Y2, and KDM5D, which are involved in post-transcriptional or in epigenetic control.

EIF1AY belongs to the EIF-1A family, which is required for a high rate of protein biosynthesis by enhancing ribosome dissociation into subunits and stabilizing the binding of the 43S complex to the 5′ end of capped RNA (Mitchell and Lorsch 2008). In particular, EIF1AY encodes a protein acting as an essential translation initiation factor during spermatogenesis (Lahn and Page 1997; Kleiman et al. 2007). In the mouse, a Y-encoded subunit of the translation initiation factor, Eif2, has been demonstrated to be the mouse spermatogonial proliferation factor, which is also required to complete the first meiotic division (Mazeyrat et al. 2001). Mazeyrat et al. have also convincingly demonstrated that this gene is dosage-sensitive, and its haploinsufficiency is responsible for the early failure of spermatogenesis. Recent studies in the mouse provided further evidence for its fundamental role, showing that two Y-chromosome genes, the testis determinant Sry and the spermatogonial proliferation factor Eif2s3y, are sufficient to obtain haploid germ cells with successful assisted reproduction (Yamauchi et al. 2014). The replacement of Sry by transgenic activation of its downstream target Sox9, and of Eif2s3y by transgenic overexpression of its X chromosome-encoded homolog Eif2s3x, allowed the generation of a Y chromosome-less mouse with the capacity to produce haploid cells (Yamauchi et al. 2016). Unfortunately, these results are difficult to translate to humans since the Y chromosomes of these two species are highly divergent. For instance, only nine of 17 ancestral genes in the human MSY are conserved in the mouse Y chromosome and only 2.2% of mouse MSY sequence shares ancestry with the primate MSYs. Further, the mouse Y-chromosome long arm harbors genes that are absent from primate Y chromosomes (Soh et al. 2014).

RPS4Y2 encodes a testis-specific, essential ribosomal protein subunit involved in RNA processing that is required for mRNA binding to the ribosome (Bergen et al. 1998). Based on the above, it has been postulated that like many other MSY genes, it also plays a role in the post-transcriptional regulation of the spermatogenic program (Andrés et al. 2008; Lahn and Page 1997; Lopes et al. 2010).

KDM5D (alias SMCY) is a member of the evolutionarily conserved KDM5 family of four proteins, KDM5A/B/C and D, and it was conserved in the genomes of both sexes. This gene encodes a histone H3 lysine 4 (H3K4) demethylase. H3K4me is an important epigenetic mark that has been implicated in diverse biological functions, such as gene activation and repression, DNA damage response, genomic imprinting, and development (Shao et al. 2014). Proteins that regulate the methylation of H3K4 are known to be involved in neurological and developmental disorders. It has been demonstrated that KDM5D forms a protein complex with the MSH5 DNA repair factor during spermatogenesis (Akimoto et al. 2008; Lee et al. 2007). This complex localizes to condensed DNA during the leptotene/zygotene stage, suggesting an involvement in male germ cell chromatin remodeling and a role in chromosome condensation during meiosis (Navarro-Costa et al. 2010b and references therein). Recently, by studying Drosophila oocytes, Navarro-Costa et al. (2016) reported that the perturbation of the oocyte’s epigenome in early oogenesis, through depletion of the dKDM5 histone demethylase, results in the temporal deregulation of meiotic transcription and affects female fertility. KDM5D is the Y chromosome-specific homologue of dKDM5, and it is possible that the meiotic maturation defects seen in dKDM5-depleted oocytes may be extended to the human AZFb deletion phenotype, which is characterized by spermatocytic arrest.

For those genes that are outside the AZF regions, a role in gametogenesis has been proposed based on studies in various model organisms (ZFY) and on testis-specific expression (TGIF2LY). Given that no deletions or mutations of these genes have been observed in infertile men, direct evidence for their spermatogenic function is still lacking.

ZFY was the first coding gene identified on the human Y (Page et al. 1987). It encodes a transcription factor containing a zinc finger domain that regulates the transcription of a number of genes. The mouse contains two genes, Zfy1 and Zfy2, and through the use of knock-out mice and RNAi-mediated disruption of Zfx/Zfy in mouse testis, it has been demonstrated that Zfy1/Zfy2 are required for multiple aspects of spermatogenesis, especially for spermatocyte functions (Nakasuji et al. 2017 and references therein; Zhang et al. 2017). Zfy1 and Zfy2 act as ‘executioners’ for a checkpoint during the first meiotic division in which Zfy genes first promote meiotic sex chromosome inactivation (MSCI), then monitor its progress, and finally execute cells with MSCI failure. These genes also have a major role in ensuring that the second meiotic division occurs (Royo et al. 2010; Vernet et al. 2014) and in sperm head remodeling and sperm tail development (Vernet et al. 2016). In humans, a single ZFY gene expresses two major splice variants: a full-length version, which shows transactivation ability, and a short version, which lacks a key acidic domain and has no detectable transactivation activity (Mardon et al. 1990).

TGIF2LY belongs to the TGIFL family, a homeobox gene family encoding helix-turn-helix transcriptional regulators. In Drosophila, the TGIF-related homeobox genes vis (vismay) and achi (achintya) are required for spermatogenesis, and in the absence of vis a spermatogenic arrest was observed (Wang and Mann 2003). Since the Y-linked homeobox gene and its X-linked homologue are specifically expressed in adult human testes, a similar spermatogenic function for the human genes is possible.

Multicopy MSY genes map to the ampliconic regions and are divided into nine gene families, eight of them on the Yq and one on the Yp (Table 1). With the exception of two genes (BPY2 and PRY), they have either an autosomal homologue (CDY/CDYL and DAZ/DAZL) or an X-linked homologue. Three of the X–Y pairs are dosage-sensitive (VCY/VCX, HSFY, XKRY), whereas the X homologues of RBMY and TSPY genes are subjected to X inactivation and their Y-linked partners show signs of functional differentiation. The ampliconic MSY genes, which are discussed below, are involved in various biological processes such as (1) chromatin modification, (2) transcription, (3) splicing, (4) translation, and (5) ubiquitination.

Table 1 Human protein-coding Y genes mapping to Y chromosome amplicons and their homologues

As stated above, translational regulation and RNA metabolism are especially relevant in male gametogenesis due to the gradual reduction of mRNA synthesis in post-meiotic stages. DAZ and RBMY are two multicopy genes with RNA binding capacity and function as adaptors for target mRNA transport and activators of their translation (Vogt and Fernandes 2003).

DAZ is found exclusively in primates and it is homologous to an autosomal gene called “DAZL” (DAZ-like autosomal; Saxena et al. 1996; Yen et al. 1996) with a highly conserved RNA recognition motif (RRM) for binding target mRNAs and at least one characteristic sequence of 24 amino acids, termed the DAZ repeat. The Y-linked DAZ probably originated from the translocation and amplification of this ancestral autosomal gene. Some insights into human DAZ function came from the analysis of its autosomal homologues in other species and from in vitro studies on human embryonic stem cells (hESCs) (Fu et al. 2015 and references therein). Targeted disruption of Dazl in mice leads to a complete absence of gamete production in both testis and ovary, demonstrating that Dazl is essential for the development or survival of germ cells (Ruggiu et al. 1997). In Drosophila, mutation of the BOULE gene, another homologue of DAZ, results in spermatocyte arrest at the G2/M transition and complete azoospermia (Castrillon et al. 1993; Eberhart et al. 1996). Overexpression of Dazl, BOULE, and DAZ induced both human ES and induced pluripotent stem (iPS) cells to differentiate into primordial germ cell-like cells (PGCLCs), and enhanced their subsequent maturation and progression through meiosis (Kee et al. 2009; Medrano et al. 2012). These studies provided evidence for an involvement of human DAZL in primordial germ cell formation, whereas DAZ and BOULE promote later stages of meiosis and development of haploid gametes (Kee et al. 2009). While the downstream targets of the DAZ family are still unknown, it has been demonstrated that DAZ and DAZL interact with other proteins involved in mRNA transport/localization including DAZAP1, DAZAP2, and PUMILIO-2 (Fu et al. 2015 and references therein; Tsui et al. 2000).

RBMY1 contains an RNA-binding motif in the N-terminus and four SRGY (serine, arginine, glycine, tyrosine) boxes in the C-terminus. This gene derived from the amplification of an X-degenerate gene (RBMY1.1-6) (Chai et al. 1998; Skaletsky et al. 2003) and is involved in a series of biological functions related to RNA metabolism such as transport/storage of transcripts, splicing regulation, and signal transduction (Elliott 2004 and references therein). Immunostaining of human germ cells indicates that this protein is expressed in all stages of spermatogenesis and shows a dynamic shift in its subcellular distribution during spermatogenesis, suggesting differential functions in the individual stages of germ cell maturation (Elliott et al. 1997, 1998). More recent in vitro experiments have provided a detailed characterization of RBMY localization and function, showing a specific and dynamic interaction with some components of the exon-junction complex and with splicing factors (Dreumont et al. 2010). These data indicate that RBMY could act in germ cells as a co-regulator of a specific set of alternative splicing events. In addition to pre-mRNA splicing activators, RBMY interacts with the STAR (signal transduction and RNA processing) proteins called SAM68 and T-STAR. These proteins are implicated in cellular signaling pathways, pre-mRNA processing, and cell cycle control (for review see Elliott 2004). The Y-linked gene Rbmy1a1 is highly methylated in mature sperm and resists DNA demethylation post-fertilization thanks to the action of the maternal Trim28 (Sampath Kumar et al. 2017). In a mouse model characterized by the absence of the maternal Trim28, an ectopic activation of this gene was observed, causing male-specific peri-implantation lethality. The most likely explanation for this effect is the accumulation of abnormal splice variants ridden with skipped exons and alternative splice donor/acceptor sites, causing large deletions and out-of-frame truncations. These data further confirm the importance of RBMY in splicing. In addition, the presence of the RBMY protein in post-meiotic germ cells and also in the transcriptionally quiescent spermatozoa implies a role beyond RNA editing. It is predicted that, similarly to DAZ, RBMY is also involved in translational control (Navarro-Costa et al. 2010b and references therein; Vogt et al. 2008).

Two other ampliconic genes, CDY and HSFY, are also involved in transcriptional regulation. CDY1 and CDY2 gene families originate from a polyadenylated mRNA of the CDYL1 or CDYL2 locus on chromosome 6 and 16, respectively, which has been then retrotransposed to the Y chromosome. The CDY1 gene is expressed only in testis, whereas CDY2 is expressed ubiquitously (Lahn and Page 1999). The CDY genes contain two functional motifs: a chromodomain implicated in chromatin binding, and a catalytic domain involved in acetylation reactions. Accordingly, it has been demonstrated that CDY1 and CDY2 are involved in hyperacetylation of histones during the maturation of spermatids (Lahn et al. 2002; Lahn and Page 1997, 1999). This step is fundamental for the displacement of histones by the sperm-specific DNA packaging proteins, protamines, at this final stage of spermatogenesis (Meistrich et al. 1992).

HSFY presents two copies on the Y chromosome (HSFY1-2), derived from the amplification of an X-degenerate gene (Skaletsky et al. 2003). These genes encode three different mRNA transcripts that are expressed specifically in human testis localizing in the nuclei of germ cells and the cytoplasm of Sertoli cells (Shinka et al. 2004). Tessari et al. (2004) have demonstrated that only the protein translated from transcript variant 1 contains a heat shock factor–like DNA-binding domain, hence representing the critical HSFY transcript. However, this protein does not bind to heat shock elements and no HSFY-targeted promoters have been identified so far (Shinka et al. 2004). The identification of 4 oligo/azoospermic subjects with a deletion removing the entire P4 palindrome and thus both HSFY copies suggested that this gene is not an essential factor for spermatogenesis (Kichine et al. 2012).

The rest of the ampliconic genes are also specifically expressed in the testis and are involved in: (1) membrane transport (XKRY), (2) apoptosis of spermatids and spermatozoa (PRY), (3) cytoskeletal network of microtubules formed during the post-meiotic elongation phase of the male germ cell (BPY2), (4) mRNA stability (VCY), and (5) cell proliferation (TSPY). While data are rather scarce on XKRY and VCY, more insights into the function of PRY and BPY2 have been gained from expression studies.

PRY encodes a protein homologous to protein tyrosine phosphatase, non-receptor type 13, which is a signaling molecule involved in the regulation of several cellular processes, particularly in programmed cell death (Dromard et al. 2007). Based on the expression profile of the gene in spermatozoa (i.e., expression is higher in the defective germ cell fraction and is increased in spermatozoa of men with abnormal semen parameters), the involvement of this protein in apoptosis of spermatids/spermatozoa has been suggested (Stouffs et al. 2004).

The BPY2 (alias VCY2) gene family is generated from transposed autosomal non-protein-coding segments and presents three copies on the reference sequence (Cao et al. 2015). Its role in spermatogenesis is largely unknown, and data are derived only from expression studies and from the analysis of interacting proteins. An interaction with ubiquitin protein ligase E3A (UBE3A), a widely expressed member of the ubiquitin protein degradation system, has been demonstrated (Wong et al. 2002). Since UBE3A corresponds to a testis-expressed E3 ubiquitin protein ligase, BPY2 may modulate its target specificity. Additionally, the same authors demonstrated that BPY2 interacts with VCY2IP-1, which shows homology with microtubule-associated protein 1S (MAP1S) (Wong et al. 2004). They proposed that the interaction of VCY2 and VCY2IP-1 in the microtubule network might be regulated by UBE3A through the interaction of VCY2 and other proteins in the microtubule.

Moreover, there are three transcription unit families in the Y chromosome amplicons with testis-specific expression. The function of TTTY (testis-specific transcription Y) non-coding genes as well as the function of CSPG4LY and GOLGA2LY remain unknown (for review see Navarro-Costa 2012; Skaletsky et al. 2003).

There is growing evidence that MSY genes are also involved in functions other than spermatogenesis (Bellott et al. 2014; Kido and Lau 2015). For instance, dosage-sensitive X–Y gene pairs, expressed in multiple tissues, may also have an implication in male viability, since about 99% of human 45,X conceptuses are inviable, and those that survive to term are often mosaic for all or part of a second sex chromosome (Hassold et al. 1988; Bellott et al. 2014). The presence of Y-specific protein isoforms for eight X–Y gene pairs raises the possibility that even widely expressed ancestral genes on the Y chromosome may exhibit subtle functional differences and thus be responsible for sexual dimorphism in health and disease conditions (Bellott et al. 2014). Some of the X homologues of X–Y pairs (KDM5C, UTX, and NLGN4X) are responsible for X-linked intellectual disability syndromes in hemizygous males, whereas heterozygous females show mild intellectual impairment (Lindgren et al. 2013; Rujirabanjerd et al. 2010; Shao et al. 2014). It can be speculated that deletions or mutations in the Y copy of these genes may also lead to similar phenotypes (see paragraph below). Moreover, recent studies report that either the loss of the Y chromosome or ectopic expression of Y chromosome genes is closely associated with various male-biased somatic cancers (Kido and Lau 2015).

In summary, although all of the above-described MSY genes are likely to play a role (either as an essential factor or as a “fine tuner”) in spermatogenesis, their precise biological action remains poorly understood. In the absence of KO models of human Y genes, all of the available information on their putative spermatogenic function originates from testis mRNA and protein expression studies, KO of homologous genes, in vitro studies in hESC-derived germ cells, and predicted biological function. As stated above, the search for isolated gene deletions was largely unsuccessful, and provided data only on two AZF genes (USP9Y and HSFY), demonstrating their non-essential role in spermatogenesis.

Y chromosome-linked CNVs

As described in the previous paragraph, the Y chromosome should be considered a genetically dynamic chromosome prone to significant variation owing to the high proportion of segmental duplications, which provide the structural basis for the generation of CNVs. From a clinical point of view, Y-linked CNVs can be divided into three major categories: (1) AZF deletions (complete and partial AZFa and AZFb), (2) partial AZFc deletions/duplications, (3) the TSPY array (Fig. 2).

Fig. 2

Schematic representation of the Y chromosome, of the different regions/genes involved in spermatogenesis and of the Y-linked copy-number variations. a The TSPY1 gene is located on the short arm of the Y chromosome (Yp) arranged in a tandemly repeated array. b Azoospermia factor regions (AZFa, b, and c) are present on the long arm of the Y chromosome (Yq) with an overlap between AZFb and c regions. c Triangles represent the relative sizes and the arm locations of palindromes in AZF regions (gaps between opposed triangles are the non-duplicated Y sequences). d The reference sequence (Y haplogroup R1) shows the presence of multicopy genes and transcription units in the AZF regions. The arrows with the same motifs represent repeated homologous sequences, which may undergo NAHR. e Complete AZF deletions remove all genes in the region. Three partial AZFc deletions (gr/gr, b2/b3, b1/b3) are shown, which all remove half of the AZFc gene content. gr/gr deletion is depicted with three alternative breakpoints. An example of partial AZFc duplication (gr/gr) is shown (similarly to the gr/gr deletion, different breakpoints may give origin to different types of gr/gr duplications)

  1. Y chromosome microdeletions removing the entire AZF regions (complete deletions) are one of the leading causes of spermatogenic failure, and the screening for AZF deletions has become part of the routine diagnostic work-up of men with severe oligozoospermia/azoospermia (Krausz et al. 2014). In 1976, Tiepolo and Zuffardi demonstrated the first association between azoospermia and microscopically detectable deletions of the long arm of the Y chromosome (Yq). They proposed the presence of an AZF on distal Yq. Twenty years later, Vogt et al. (1996) divided the AZF region into proximal, middle, and distal Yq11 sub-regions (AZFa, AZFb, and AZFc, respectively). The final characterization of the AZF regions was obtained after the sequencing of the Y chromosome. According to the latest model, there are five different deletion patterns on Yq and the AZFb and AZFc regions are overlapping (Repping et al. 2002). In clinical practice, the five deletion hotspots are still called AZFa, AZFb, AZFb + c, and AZFc deletions, although the official nomenclature is based on the type of palindromes involved. Partial AZFa and AZFb deletions are extremely rare (Krausz et al. 2014 and references therein).

The complete AZFa deletion (which removes 792 kb) occurs after homologous recombination between identical sequence blocks within the retroviral sequences HERVyq1 and HERVyq2, which lie in the same orientation (Blanco et al. 2000; Kamp et al. 2000; Sun et al. 2000). The complete deletion of AZFb is caused by homologous recombination between the palindromes P5/proximal P1, which also removes the part of the AZFc region belonging to P1 (Repping et al. 2002). This deletion removes 6.2 Mb. The complete deletion of AZFc removes 3.5 Mb and originates from homologous recombination between amplicons b2 and b4 in palindromes P3 and P1, respectively (Kuroda-Kawaguchi et al. 2001). Deletions of both AZFb and AZFc regions have two potential breakpoints: (1) between P4/distal P1, removing 7.0 Mb of sequence with 38 gene copies, (2) between P5/distal P1, removing 7.7 Mb with 42 gene copies. The most frequent deletion type is the AZFc region deletion (~80%) followed by AZFa (0.5–4%), AZFb (1–5%), and AZFbc (1–3%) deletion (Krausz et al. 2014). Deletions that are detected as AZFabc are most likely related to an abnormal karyotype such as 46,XX male or iso(Y) (Lange et al. 2009).
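
The deletion classes described in this paragraph can be summarized as a small lookup table. The following Python sketch is purely illustrative: the record layout and the names `AzfDeletion`, `COMPLETE_AZF_DELETIONS`, and `deleted_size_mb` are assumptions introduced here, while the recombination substrates, deletion sizes, and gene-copy counts are the figures quoted in the text (gene-copy counts are left empty where the text gives none).

```python
from typing import NamedTuple, Optional


class AzfDeletion(NamedTuple):
    """One complete Yq deletion pattern, as summarized in the text."""
    substrates: str             # homologous sequences that recombine
    size_mb: float              # amount of sequence removed, in Mb
    gene_copies: Optional[int]  # gene copies removed, if stated in the text


# Hypothetical labels; sizes and substrates are taken from the paragraph above.
COMPLETE_AZF_DELETIONS = {
    "AZFa": AzfDeletion("HERVyq1/HERVyq2", 0.792, None),
    "AZFb": AzfDeletion("P5/proximal P1", 6.2, None),
    "AZFbc (P5/distal P1)": AzfDeletion("P5/distal P1", 7.7, 42),
    "AZFbc (P4/distal P1)": AzfDeletion("P4/distal P1", 7.0, 38),
    "AZFc": AzfDeletion("b2/b4", 3.5, None),
}


def deleted_size_mb(name: str) -> float:
    """Return the deleted sequence length (Mb) for a named deletion pattern."""
    return COMPLETE_AZF_DELETIONS[name].size_mb


if __name__ == "__main__":
    # List the five deletion patterns from smallest to largest.
    for name in sorted(COMPLETE_AZF_DELETIONS, key=deleted_size_mb):
        rec = COMPLETE_AZF_DELETIONS[name]
        print(f"{name}: {rec.size_mb} Mb via {rec.substrates}")
```

Keeping the two AZFbc breakpoints as separate entries mirrors the five deletion patterns of the sequence-based model rather than the four names used in clinical practice.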

The above deletions have a clear-cut cause–effect relationship with severe spermatogenic impairment since complete AZF deletions have never been reported in normozoospermic men. The screening for Y chromosome microdeletions has not only diagnostic but also prognostic and preventive value. Deletions removing the entire AZFa or AZFb regions (complete deletions) are associated with azoospermia due to Sertoli cell-only syndrome (SCOS) and spermatogenic arrest (SGA), respectively. The rare partial AZFa and partial AZFb deletions are associated with residual sperm production. The complete AZFc deletions are associated with a variable semen phenotype ranging from oligozoospermia (mainly below 2 million spermatozoa/ml) to azoospermia (ranging from SCOS to hypospermatogenesis). Accordingly, testicular sperm extraction (TESE) is not recommended in case of complete AZFa and AZFb deletions, whereas there is a 50% chance of sperm retrieval in azoospermic men carrying a complete AZFc deletion (Hopps et al. 2003; Kleiman et al. 2011; Krausz et al. 2000). In deletion carriers presenting with oligozoospermia, there is a potential risk of a progressive decrease of sperm concentration over time; therefore, sperm cryopreservation could be advised as a preventive measure (McElreavey and Krausz 1999; Krausz and Degl’Innocenti 2006 and references therein). Overall, the frequency of these deletions is highest in azoospermic men (5–10%, reaching the highest value in “idiopathic azoospermia”), followed by severe oligozoospermic men (2–5%) (Lo Giacco et al. 2014).
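
The genotype–phenotype relationships stated in this paragraph can be written down as a simple mapping. The sketch below is only an illustration of that reasoning and not a clinical decision tool: the names `Interpretation` and `interpret_complete_deletion` are hypothetical, and the table deliberately covers only the three complete deletions for which the paragraph gives an explicit statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    phenotype: str          # testis/semen phenotype stated in the text
    tese_recommended: bool  # whether TESE is considered worthwhile
    note: str


# Only the statements made in the paragraph above are encoded here.
_INTERPRETATIONS = {
    "AZFa": Interpretation(
        "azoospermia due to Sertoli cell-only syndrome (SCOS)",
        False,
        "TESE is not recommended for complete AZFa deletions",
    ),
    "AZFb": Interpretation(
        "azoospermia due to spermatogenic arrest (SGA)",
        False,
        "TESE is not recommended for complete AZFb deletions",
    ),
    "AZFc": Interpretation(
        "variable: oligozoospermia to azoospermia",
        True,
        "about 50% chance of sperm retrieval in azoospermic carriers",
    ),
}


def interpret_complete_deletion(region: str) -> Interpretation:
    """Look up the interpretation of a complete AZFa, AZFb, or AZFc deletion."""
    try:
        return _INTERPRETATIONS[region]
    except KeyError:
        # Partial deletions and other patterns are not covered by this sketch.
        raise ValueError(f"no interpretation encoded for {region!r}") from None


if __name__ == "__main__":
    for region in ("AZFa", "AZFb", "AZFc"):
        result = interpret_complete_deletion(region)
        print(region, "->", result.phenotype, "|", result.note)
```

Raising an error for any pattern outside the table, rather than returning a default, avoids implying a prognosis that the text does not state.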

If spermatozoa are present in the ejaculate or in the testis (after TESE), patients can father their own biological children either naturally or through assisted reproductive technology (ART). The large majority of deletions are de novo, but some naturally transmitted cases of partial AZFa or partial AZFb and complete AZFc deletions have also been described (Kichine et al. 2012; Kühnert et al. 2004 and references therein; Krausz et al. 2006; Luddi et al. 2009; Plotton et al. 2010). The deletion is obligatorily transmitted to male offspring, who will suffer from either oligozoospermia or azoospermia. Since the genetic background and exposure to environmental factors may modulate the phenotypic expression of AZFc deletions, the exact semen phenotype of the son is not predictable. An important issue, addressed only by a minority of studies, is whether AZF deletions might lead to other pathological conditions besides spermatogenic failure. Studies on men with Yq microdeletions (Jaruzelska et al. 2001; Siffroi et al. 2000) and on patients bearing a mosaic 46,XY/45,X karyotype with sexual ambiguity and/or Turner stigmata (Papadimas et al. 2001; Papanikolaou et al. 2003; Patsalis et al. 2002, 2005) suggested an association between some Yq microdeletions and an overall Y-chromosomal instability, which might result in the formation of 45,X cell lines. The limited data on children born to AZF deletion carriers show that they are apparently healthy. However, it is worth noting that fetuses with a 45,X karyotype have a very high risk of spontaneous abortion, and there are no data on the abortion rate in these couples. Preimplantation genetic diagnosis (PGD) has been performed by two groups, with conflicting data about the risk of monosomy X in embryos (Mateu et al. 2010; Stouffs et al. 2005); further investigations are therefore needed.

Jorgez et al. (2011) reported that 5.4% of men with AZF microdeletion and normal karyotype displayed haploinsufficiency of the SHOX gene located in the pseudoautosomal region PAR1 on the short arm of the Y chromosome. The authors proposed that the mechanism underlying Y-chromosome microdeletions might also be associated with the occurrence of PAR rearrangements, and that deletion carriers might therefore be at higher risk of PAR-related disorders such as SHOX haploinsufficiency, which is responsible for short stature and skeletal anomalies (Mangs and Morris 2007 and references therein). This highly alarming risk of developing PAR-related disorders prompted researchers to clarify the issue in a large multicenter study. Data on 224 patients with Y chromosome microdeletions and normal karyotype did not confirm the association between partial or complete microdeletions and SHOX haploinsufficiency (Chianese et al. 2013). These data have been confirmed by Castro et al. (2017), who reported PAR abnormalities exclusively in patients with terminal AZFbc deletion associated with isochromosome Yp and/or Y nullisomy. The authors of this recent study report neuropsychiatric disorders in 5/7 patients with terminal AZFbc deletion and abnormal karyotype and hypothesize that CNVs in the pseudoautosomal regions (PARs) and/or the removal of MSY genes such as NLGN4Y may play a role in the observed neuropsychiatric disorders. Although not mentioned by the authors, we can also speculate that the removal of the KDM5D gene (mapped to the AZFb region) may contribute to these pathological conditions. Indeed, KDM5D is among the seven sex chromosome genes showing sexually dimorphic expression in the developing mouse cortex and hippocampus, which may indicate a role for this gene in the development of neural circuits governing distinct behaviors between the sexes as adults (Armoskus et al. 2014).
The association between neuropsychiatric disorders and terminal AZFbc deletions needs further confirmation especially in view of the lack of such neurodevelopmental disorders in XX males (Vorona et al. 2007).

  2.

    The AZFc region is rich in amplicons; therefore, it is predisposed to a series of rearrangements including partial deletions, duplications, and deletions followed by duplication(s). The role of these rearrangements has been the object of long-lasting debate. The clinically most relevant partial AZFc deletion is called gr/gr, named after the fluorescent probes (‘green’ and ‘red’) used for its discovery by Repping et al. (2003). This deletion is associated with a highly variable phenotype ranging from azoo- to normozoospermia. However, its frequency is significantly higher in oligozoospermic men, and according to five meta-analyses (Tüttelmann et al. 2007; Visser et al. 2009; Navarro-Costa et al. 2010a; Stouffs et al. 2011; Bansal et al. 2016) and to the largest multi-ethnic study (Rozen et al. 2012), the gr/gr deletion increases the risk of reduced sperm output by 2–2.5 fold. Indeed, by removing half of the AZFc gene content, this deletion represents a significant risk factor for spermatogenic impairment. However, the magnitude of this risk varies between populations and is highest in the Mediterranean area, in particular in the Italian population (Ferlin et al. 2005; Giachini et al. 2005, 2008). The updated OR for the Italian cohort from Florence (642 patients and 685 controls) is 5.8 (CI 2.0–16.9, p = 0.001), whereas the combined data of the Spanish (Lo Giacco et al. 2014) and Italian cohorts show a similarly high value (OR 4.2, CI 2.0–8.8, p < 0.001) (Table 2). This suggests that the screening for gr/gr deletion should be performed according to the level of conferred risk in a given population.

    Table 2 Comparison of gr/gr deletion frequencies and ORs between the following three different study populations from the Mediterranean area: Giachini et al. (2008); the Giachini et al. study population up-dated to 2016; the Lo Giacco et al. (2014) cohort; combination of the three study populations
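For readers less familiar with how odds ratios such as those quoted above are obtained, the following minimal sketch computes an OR with a 95% confidence interval from a 2×2 table. It assumes the standard log (Woolf) method; the text does not state which method the cited studies used. The function name and the counts in the usage example are hypothetical and are not taken from any of the cohorts.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and confidence interval from a 2x2 table (Woolf log method).

    a: cases with the deletion      b: cases without the deletion
    c: controls with the deletion   d: controls without the deletion
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four counts must be greater than zero.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive for the log method")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_or, example_lo, example_hi = odds_ratio_ci(20, 80, 10, 90)
```

With rare variants such as the gr/gr deletion, one or more cells are small, which is why the confidence intervals reported for single cohorts tend to be wide.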

Although the deletion removes half of the AZFc gene content, it may have different breakpoints leading to the removal of distinct gene copies. First, it has been proposed that the variability in semen phenotype may depend on the remaining gene copies. In fact, a number of studies reported that DAZ1/DAZ2 copies (and in some papers CDY1a) are specifically or predominantly removed in patients with respect to normozoospermic controls (Fernandes et al. 2002; Ferlin et al. 2005; Giachini et al. 2005, 2008; Yang et al. 2008a, 2010; Li et al. 2013; Wang et al. 2016). Since some gr/gr deletions are followed by duplications, hence restoring gene dosage, it has been proposed that gene copy number may be another modulating factor of the semen phenotype (Noordam et al. 2011). A definitive answer to the question of whether Y chromosomal background, the type of remaining AZFc gene copies, and the dosage of genes may explain phenotypic variability comes from the largest-to-date cohort from the Western European population (Krausz et al. 2009). In this study, a similar distribution of the deletion of different DAZ and CDY1 copies and of Y chromosomal haplogroups was found between gr/gr deletion carriers with normal and abnormal sperm parameters (Krausz et al. 2009). Moreover, in contrast to the Dutch study (Noordam et al. 2011), the restoration of gene dosage due to duplication was not significantly more frequent in normozoospermic men. Notably, in line with a recent study (Yang et al. 2015), deletion followed by duplications affected the sperm count even more seriously than isolated gr/gr deletion. Since geographic differences in the subtypes of gr/gr deletion were detected in the multicenter study, it is possible that this genetic anomaly is more penetrant in certain populations than in others. This deletion type also shows large differences in its prevalence, from 2% (in the United States) to 15% (in Vietnam) (Rozen et al. 2012).
Interestingly enough, Y background seems to play a role in the clinical manifestation of gr/gr deletion in some Asian populations (see paragraph below).

In addition to the gr/gr deletion, two other partial AZFc deletions (b2/b3 and b1/b3) have been the object of a number of studies. Both deletions remove 12 gene copies and transcription units and do not seem to have a significant effect on male infertility (Repping et al. 2004; Bansal et al. 2016 and references therein; Rozen et al. 2012), with the exception of the Chinese, Moroccan, and South Indian populations, where the b2/b3 deletion shows a strong association with male infertility (Wu et al. 2007; Lu et al. 2009, 2014; Eloualid et al. 2012; Vijesh et al. 2015). Concerning b1/b3, data are scarce due to its low frequency. By analyzing 20,000 Y chromosomes, Rozen et al. (2012) found a 2.5-fold increased risk of developing severe spermatogenic failure in men carrying the b1/b3 deletion.

The question of a potential deleterious effect of excess AZFc gene dosage has been raised based on the fact that the copy number of AZFc-linked genes shows limited inter-individual variation. Hence, natural selection may act to conserve an “optimal” copy number by removing exceptionally high or low copy number variants from the population (Repping et al. 2006). Studies in the Han Chinese and Chinese-Yi populations reported an association between increased AZFc gene dosage and infertility (Lin et al. 2007; Ye et al. 2013; Yang et al. 2015). Similarly, Noordam et al. (2011) suggested that a higher DAZ gene dosage could be deleterious for spermatogenesis also in a cohort of multiethnic patients attending a Dutch infertility clinic. This association was not found in the Italian and Spanish populations, where the frequency of duplications in patients and normozoospermic controls was similar (Giachini et al. 2008; Lo Giacco et al. 2014). Whether this discrepancy has a molecular explanation remains to be established; the recently described method for distinguishing different amplified gene copies is expected to provide some clues (Vaszkó et al. 2017).

  3.

    Apart from the long arm of the Y chromosome, the short arm (Yp) also contains a multicopy gene, TSPY1, with a potential role in spermatogenesis. TSPY gene copies are arranged in tandemly repeated 20.4-kb units, with copy numbers varying among individuals (from 11 to 76) (Tyler-Smith et al. 1988). In about 65% of the Italian population, the copy number interval is restricted (21–35 copies) (Giachini et al. 2009), suggesting that a minimum TSPY1 copy number is likely to be maintained through selection (Tyler-Smith 2008). TSPY was originally described as the putative gene for the gonadoblastoma locus on the Y chromosome (GBY), with potential involvement also in other human cancers (Lau et al. 2009). Besides its oncogenic properties, expression analyses converge on a physiological involvement of the TSPY1 protein in spermatogenesis as a pro-proliferative factor expressed in gonocytes/pre-spermatogonia of embryonic testis, in spermatogonia or spermatocytes at meiotic prophase I in adult testis and in early fetal germ cells (Schöner et al. 2010; Lau et al. 2009). Since the mean copy number of TSPY is significantly different among different Y haplogroups (Mathias et al. 1994; Giachini et al. 2009), Y haplogroup-matching between cases and controls is crucial for a reliable analysis (Tyler-Smith 2008). The few studies on TSPY copy number in relation to male infertility reached different conclusions, which may derive from population stratification bias (Giachini et al. 2009; Vodicka et al. 2007; Nickkholgh et al. 2010). In fact, only the Italian study avoided this potential confounder by analyzing cases and controls matched for their Y hgr distribution. This original study from Florence has been further enlarged and reached the same conclusion, i.e., TSPY1 copy number influences spermatogenic efficiency and is positively correlated with sperm count (Giachini et al. 2009; Krausz et al. 2011).

Y chromosome background and spermatogenesis

Studies dealing with the Y chromosome background in relationship with male infertility can be divided into three groups: (1) search for “at-risk” Y hgrs for spermatogenic disturbances; (2) search for predisposing Y chromosome background to AZF deletion formation; (3) analysis of the role of Y hgrs in the phenotypical penetrance of partial AZFc deletions.

  1.

    The human Y chromosome belongs to different paternal lineages, called Y chromosome haplogroups, which display very specific patterns of geographical clustering (Seielstad et al. 1994; Tyler-Smith 2008). A chromosomal haplogroup refers to a group of chromosomes sharing a similar combination of allelic states at slowly mutating binary markers at multiple loci (such a combination is called a haplotype) (Jobling and Tyler-Smith 2003). The localization of these binary markers in the MSY region (which does not recombine with the X chromosome) is responsible for the largely intact passage of their combinations from generation to generation and for the possible tight association between these markers and potential functional variants (undetected microdeletions or sequence variants) located in this region and associated with Y-linked phenotypes. It is therefore plausible that comparing Y chromosome haplogroups in patients vs. disease-free men and/or the general population is an indirect way of exploring whether Y chromosome-linked factors are involved in the etiology of a specific disease (Krausz et al. 2004 and references therein). An “at-risk Y haplogroup” for spermatogenic disturbances was reported in the Danish, Han Chinese, and Latvian populations but not in others such as the Italian, Kinki, and Indian populations (Krausz et al. 2004 and references therein; Singh and Raman 2009). In Denmark, one class of Y chromosomes (haplogroup K+) was significantly over-represented in idiopathic oligo- or azoospermic men, conferring a risk of developing spermatogenic impairment about nine times greater (OR 8.92, 95% CI 2.8–28.5) than in Danish men with hg 1 (the most frequent hgr in Denmark). These data suggested that hg K+ may be under negative selection in Denmark because this haplogroup is associated with decreased reproductive fitness (Krausz et al. 2001).
The susceptibility of haplogroup K to spermatogenic failure was suggested in the Han Chinese and Latvian populations by reports of a significantly higher frequency of this hgr among azoo/oligozoospermic men than in controls (Lu et al. 2007, 2013; Yang et al. 2008b; Puzuka et al. 2011). Molecular analysis of the AZFc region suggests that DAZ and BPY2 duplications might underlie the susceptibility of haplogroups K and O1 to spermatogenic impairment in the Han Chinese population (Lu et al. 2013, 2014). On the other hand, haplogroups R1a1 and O3e seem to have a protective effect against impaired spermatogenesis in the Latvian and Han Chinese populations, respectively (Puzuka et al. 2011; Lu et al. 2013). Highly contradictory results have been published concerning the Japanese population, since both an association (Kuroki et al. 1999; Sato et al. 2013, 2014) and a lack of association (Carvalho et al. 2004) between hgr D/D2 and infertility were reported (Table 3a).

Table 3 Studies dealing with the Y chromosome background in relationship with male infertility: (a) association of Y hgrs with spermatogenic disturbances; (b) association of Y chromosome background with AZF deletion formation; (c) analysis of the role of Y hgrs in the phenotypical penetrance of partial AZFc deletions

  2.

    Another line of research on Y hgrs and male infertility deals with the relationship between Y chromosome background and AZF deletion formation. It has been hypothesized that sequence differences or different orientation of the repeated blocks may predispose to or protect against specific types of deletions. A logical example could be a higher homology between the two retroviral blocks in the flanking sequences of the AZFa region (this occurs after the polymorphic deletion of the L1PA4 element) facilitating, in theory, their recombination. To test this hypothesis, the haplotype distribution in men carrying Y chromosome microdeletions from Scotland, Germany, Italy, Spain, Holland, France, Ireland, Denmark, and Israel was compared with the haplotype distribution of infertile men without microdeletions (Paracchini et al. 2000; Carvalho et al. 2004) or of an unselected male control population (Quintana-Murci et al. 2001). These studies concluded that Y chromosome deletion formation is a stochastic event, independent of the Y chromosome background. However, more recent studies reached different conclusions. A study conducted in the Italian population showed a significantly higher frequency of haplogroup E (29.3 vs. 9.7%, p = 0.01) in North Italian patients carrying the AZFc microdeletion than in those without the microdeletion. The authors concluded that in the North Italian population, haplogroup E probably predisposes to the occurrence of b2/b4 deletions (Arredi et al. 2007). In the South Amerindian population, the Q1a3a haplogroup significantly increases the susceptibility to AZFb deletion when compared to controls, to cases without deletions, or to those with complete or partial AZFc deletions (Lardone et al. 2013). Haplogroups C and DE* have been associated with increased rates of partial deletions in the Chinese population (Yang et al. 2008a, b), and haplogroups C3, E1, and G are also particularly prone to partial deletions and duplications affecting the proximal AZFc domain in Russia/Kyrgyzstan, African, and Asian populations, respectively (Balaresque et al. 2008). The discovery of specific haplogroups with an increased propensity for partial AZFc deletions may explain differences in the frequencies of partial AZFc deletions between populations. Another AZFc-linked CNV with unknown functional effect on spermatogenesis (a 200-kb tandem repeat on Yq11.22, DYZ19, approximately 500 kb distal to palindrome P4) has been reported in association with specific Y hgrs (Lopes et al. 2013). Both gains and losses involving this region have been reported more frequently in oligo/azoospermic men than in the general population (Lopes et al. 2013). The type of copy number change of this tandem repeat appears to be associated with specific Y backgrounds: the R1 hgr with gains, and the I and J hgrs with losses, suggesting that the different structure of the chromosome may predispose to different rearrangements.

On the other hand, “protective” Y haplogroups (J, R, and O3) against the AZFc deletion have also been reported (Arredi et al. 2007; Balaresque et al. 2008; Yang et al. 2008a, b). The exact molecular bases of susceptibility to or protection against the formation of deletions remain unexplored and await further research (Table 3b).

  3.

    Finally, a possible role for the Y chromosome background was hypothesized in relationship with the phenotypical penetrance of a given AZFc deletion. For instance, in the Korean population, the gr/gr deletion seems to be associated with impaired spermatogenesis only in patients with the YAP- lineage (Choi et al. 2012). An intriguing issue concerns those Y hgrs on which partial AZFc deletions are fixed. For instance, the gr/gr deletion is fixed in haplogroups D2b1 and Q1, common in Japan and in certain areas of China, whereas the b2/b3 deletion is fixed in haplogroups N* and N1 (Repping et al. 2003, 2004; Fernandes et al. 2004; de Carvalho et al. 2006; Zhang et al. 2007; Sin et al. 2010; Bansal et al. 2016 and references therein). Since “fixed” deletions do not seem to negatively affect fertility, the presence of lineage-specific compensatory factors was proposed by several authors (Wu et al. 2007; Sin et al. 2010; Yang et al. 2008a, b, 2010; Navarro-Costa et al. 2010a) (Table 3c).

Conclusions

The pivotal role of the Y chromosome in spermatogenesis is supported by the presence of Y-linked genes specialized in male reproductive fitness. The removal of these genes en bloc in the AZF regions causes distinct pathological testis phenotypes. Twenty years after the first molecular definition of the AZF regions, Yq deletion screening has become a routine diagnostic test with a clear-cut cause–effect relationship with severely impaired spermatogenesis. However, there are still some clinical questions that prompt further investigation. For instance, there is an urgent need for more data on pregnancy outcomes and the health status of children born to AZF deletion carriers. This is especially relevant in view of the potential risk of mosaicism due to the instability of the deleted Y chromosome. In addition, the question of whether partial AZFc deletions act as a pre-mutation to complete AZFc deletion needs to be clarified due to its obvious importance for the future descendants of gr/gr deletion carriers (Zhang et al. 2007; Lu et al. 2009). If this mechanism is confirmed, couples should be made aware that besides the obligatory transmission of a genetic risk factor for impaired sperm production (gr/gr deletion) to the male offspring, there is a higher risk of transmitting a complete AZFc deletion, i.e., a clear-cut causative factor for spermatogenic impairment.

Apart from the above concerns related to the descendants of deletion carriers, Y deletion carriers themselves should also be followed up over time. In fact, it is unknown whether the “fragility” of the Y chromosome is a marker of general “genomic instability” potentially affecting the general health status of the carriers. Subjects with multiple Y rearrangements, in particular, may be at a theoretically higher risk of a more generalized chromosomal instability. Two studies report that multiple duplications of a partial AZFc-deleted structure have a more serious effect on spermatogenesis than the partial deletion alone (Krausz et al. 2009; Yang et al. 2015). Two possible reasons were proposed to explain this phenomenon: (1) the excess of AZFc NAHR substrate may increase the risk of AZFbc deletions during the division of germ cells and thus may lead to testicular mosaicism (i.e., tubules with absent spermatogenesis); (2) it may indicate a more generalized tendency to deletion load across the genome, also affecting other genes involved in spermatogenesis (and perhaps in other diseases). Data supporting this latter hypothesis come from array-CGH studies showing a significantly higher deletion load (genome-wide, with a burden on the sex chromosomes) in men with impaired sperm production versus normozoospermic controls (Tüttelmann et al. 2011; Krausz et al. 2012; Lopes et al. 2013).

Finally, with the advent of NGS, large-scale Y chromosome-linked gene mutation screenings are expected in the near future. These analyses have the potential to provide important insights into two unanswered basic questions. First, they will most likely allow the definition of the gene(s) responsible in each AZF region for the respective deletion phenotype and of their function. Second, they have the potential to shed light on the molecular basis of the Y hgrs predisposing to impaired spermatogenesis and of the predicted lineage-specific compensatory factors on the Y chromosome. It is clear that 40 years after the first description of the AZF region (Tiepolo and Zuffardi 1976), research on the Y chromosome in relationship to spermatogenic failure continues to be one of the most exciting fields of androgenetics.