Introduction

Among sexually reproducing organisms, those belonging to the same species are capable of exchanging genetic information through interbreeding. Speciation is the process by which such exchange is impeded. Interspecies hybrid sterility has been at the center of speciation studies for many years because the production of sterile hybrids restricts gene flow between groups of organisms, effectively isolating them as distinct species. Many different aspects of the speciation process are of interest to biologists in general, as speciation contributes to the origin of biological diversity. For many years, evolutionary geneticists have studied the genetic basis of speciation, or more specifically, hybrid male sterility, using Drosophila as a model.

The focus on males has been driven by the observation that in crosses between members of closely related species it is the male hybrid that turns out sterile (Haldane 1922). The first evidence of a genetic basis for hybrid male sterility is within Haldane’s rule itself as it states that the heterogametic sex is the sterile sex. This is because interspecies hybrid sterility manifests itself in chromosomally XY hybrid males and ZW hybrid females. The observation implies a major role of sex chromosomes in hybrid male sterility and a larger proportion of male sterility genes have been found on the X chromosome than autosomes (Tao et al. 2003; Masly and Presgraves 2007; Llopart 2012). Aside from the large-X effect, one would expect that in XY taxa genetic factors leading to hybrid male sterility in crosses between closely related species should somehow perturb the process of sperm development (spermatogenesis).

In Drosophila, interspecies sterile hybrid males can have different levels of testes morphological atrophies, normal testes morphology with seminal vesicles containing no sperm or even normal reproductive tracts with either non-motile or motile sperm incapable of fertilizing eggs (Lachaise et al. 1986; Zeng and Singh 1993; Haerty and Singh 2006; Gomes and Civetta 2014; Civetta and Gaudreau 2015). Cytological studies of testes cross sections from interspecies sterile hybrids have shown mostly postmeiotic problems during spermatogenesis, including asynchrony in the spermatid development and its differentiation into mature sperm cells, lack of proper spermatid individualization, accumulation of cellular debris between sperm bundles, and formation of undifferentiated interconnected spermatids (Dobzhansky 1934; Wu et al. 1992; Kulathinal and Singh 1998). At the molecular level, a series of studies have been able to identify genes of spermatogenesis, and mainly those in the late stages of the sperm developmental process (spermiogenesis), as severely misregulated in sterile hybrids (Michalak and Noor 2003, 2004; Moehring et al. 2007; Catron and Noor 2008; Sundararajan and Civetta 2011).

The descriptions of cytological postmeiotic developmental defects coupled with misregulation of postmeiotic spermiogenesis genes lend support to the sterility hypothesis, which assumes that gene misregulation causes sterility and contributes to species isolation, for two reasons: (1) the majority of genes needed during sperm development are transcribed and stored premeiotically (Oliveri and Oliveri 1965; Gould-Somero and Holland 1974; Barreau et al. 2008); (2) the phenotypic manifestation of sterility occurs after (postmeiotically) the disruption in expression of genes needed in spermiogenesis.

Here, I review the original studies in Drosophila that have identified misregulation of spermatogenesis gene expression in sterile hybrids. As an alternative to the sterility hypothesis, fast gene regulatory evolution can explain observed patterns of gene misregulation in the absence of sterility. Thus, the observation of misregulation of gene expression along the sperm developmental pathway in a sterile hybrid should not by itself be taken as a condition linked to, or causative of, sterility. While both fast-male and fast-X evolution can certainly contribute to sterility, the emphasis in this review is placed on approaches that have helped establish their role in gene misregulation and on possible future directions toward the identification of misregulated genes or gene pathways linked to hybrid male sterility (HMS).

Misregulation of Gene Expression and Sterility

The observations made in the 1990s that sterile male hybrids produced from crosses among closely related species of the Drosophila simulans clade (i.e., D. simulans, Drosophila mauritiana and Drosophila sechellia) show mostly postmeiotic developmental defects in spermatogenesis, together with the ability to perform genome-wide assays of gene expression using microarray platforms, prompted a series of gene expression studies attempting to find gene–phenotype (sterility) associations. Most profiling studies focused on the D. simulans and D. mauritiana species pair. The original study identified misexpression of several spermatogenesis genes in the sterile hybrids, with validation via qPCR for Mst84Dc, a gene showing more than fourfold underexpression in hybrids relative to both pure species (Michalak and Noor 2003). A subset of such genes was followed up in a study that used a backcross design to compare gene expression between fertile and sterile backcross hybrid males. That study found that the sterile hybrids were more variable in gene expression than fertile males and that most misregulated genes had lower expression in the sterile than in the fertile condition. The significant underexpression of these transcripts in sterile relative to fertile backcross males and the observation of a correlation among genes in their patterns of expression were used to infer that they might be targets of genetic factors causing sterility (Michalak and Noor 2004). In a separate study, a subset of three genes identified as misexpressed by Michalak and Noor (2003) was examined for gene expression in hybrids between Drosophila pseudoobscura and Drosophila persimilis. Most genes showed consistent misexpression in the sterile hybrids, leading to the proposal of a common regulatory pathway of sterility in different species groups (Noor 2005).
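To make the notion of "fourfold underexpression" from a qPCR assay concrete, the sketch below computes relative expression with the widely used 2^-ΔΔCt calculation. It is only an illustration: the studies cited above may have quantified expression differently, the function names and all Ct values are hypothetical, and the calculation assumes equal, near-perfect amplification efficiency for the target and reference genes.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample relative to a control,
    using the 2^-ddCt calculation (assumes ~100% amplification efficiency)."""
    # Normalize the target gene to the reference gene within each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Higher Ct means less transcript, hence the negative exponent.
    return 2.0 ** -(delta_sample - delta_control)


def exceeds_fold_underexpression(fold_change, fold=4.0):
    """True if expression is reduced by more than `fold` relative to control."""
    return fold_change < 1.0 / fold


# Hypothetical Ct values: hybrid testes versus one parental species.
hybrid_vs_parent = relative_expression(
    ct_target_sample=26.5, ct_ref_sample=18.0,   # hybrid
    ct_target_control=24.0, ct_ref_control=18.0  # parental species
)
more_than_fourfold = exceeds_fold_underexpression(hybrid_vs_parent)
```

In this hypothetical case the hybrid has a ΔΔCt of 2.5 cycles, which corresponds to a reduction of more than fourfold. Under the criterion described above, the same comparison would have to hold against both parental species.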

The original study by Michalak and Noor (2003) suffered from potential biases that might arise from using samples from the D. simulans species clade to hybridize Drosophila melanogaster genomic arrays. Moehring et al. (2007) overcame this limitation by using a sperm transcript array developed from the species being tested. The study compared patterns of expression using RNA extractions from whole bodies between all three species of the D. simulans clade and their interspecies sterile hybrids. The study found a larger proportion of misregulation in adults than in larvae and an enrichment of late-stage (spermiogenesis) genes among those underexpressed in sterile hybrids. The idea of a postmeiotic disruption was further supported by a study showing a contrasting pattern: early-spermatogenesis transcripts (aly and comr) were underexpressed in hybrid whole bodies but not in testes, whereas late-spermatogenesis transcripts (dj and Mst84D) were underexpressed in both whole-body and testes samples (Catron and Noor 2008). Thus, the connection between the phenotypic manifestation of sterility in the hybrids (postmeiotic) and tissue-specific misregulation of postmeiotic (spermiogenesis) gene expression reinforced the hypothesis of a direct link between spermatogenesis genes’ misregulation and sterility.

In an extension of the work by Catron and Noor, Sundararajan and Civetta (2011) surveyed two genes from each of the four major stages of spermatogenesis: germline proliferation, transition from mitosis into meiosis, progress from meiosis, and sperm maturation and individualization. The authors found that only a mitotic arrest gene (bag of marbles, bam) and a spermatocyte arrest gene (spermatocyte arrest, sa) showed significant underexpression localized to testes in sterile hybrids relative to parental species and intraspecific hybrids (i.e., hybrids between strains). These two genes had not been previously identified as misregulated in sterile hybrids when whole-body extractions of RNA were used, suggesting a tissue-specific effect and the possibility of previous false negatives (Sundararajan and Civetta 2011). Moreover, bam and sa are not postmeiotic genes. Therefore, it was possible to suspect that their underexpression in sterile hybrids could be linked to gene regulatory divergence without any effect on male fertility. Sundararajan and Civetta (2011) showed, with a slightly larger sample, an opposite pattern to the testes-specific misexpression for late spermiogenesis previously suggested (Catron and Noor 2008) and reinforced the need to use tissue-specific (e.g., testes) samples to avoid false negatives.

Two remaining limitations in most previous studies were 1) the lack of fertile hybrid controls and 2) the fact that the genome of a hybrid male between species merges genetic elements from two diverged parental species (Table 1). The hybrid nature of the genome composition in the sterile hybrids means that any kind of gene misregulation could be driven by rapid interspecies divergence in regulatory elements in the absence of sterility. The study by Michalak and Noor (2004) and follow-up studies (Michalak and Ma 2008; Ma and Michalak 2011) comparing backcross males that were either fertile or sterile elegantly addressed this issue and provided strong evidence for an association between misregulation of the Acylphosphatase (Acyp) gene and HMS.

Table 1 Summary of gene expression comparisons between fertile and sterile parental species and hybrids

Fast-Male Evolution can Drive Gene Misregulation in the Absence of Sterility

Studies that compared differences in expression between closely related species of Drosophila using different tissues have found that male reproductive tract proteins are more rapidly evolving than other tissue- and non-tissue-specific proteins (Coulthart and Singh 1988; Civetta and Singh 1995). A more recent genome-wide comparison among 12 Drosophila genomes confirmed faster coding sequence divergence and a more rapid loss of orthology among species for both testes- and accessory gland-expressed genes (Haerty et al. 2007). Male-biased expressed genes evolve more rapidly than female-biased or non-biased genes and tend to be more differentially expressed between species than other genes (Meiklejohn et al. 2003; Ranz et al. 2003; Zhang et al. 2007; Assis et al. 2012; Harrison et al. 2015). Based on these previous observations of fast turnover, both at the actual coding sequence and expression level, for genes with male expression it is possible to hypothesize that spermatogenesis genes could be misregulated in hybrids without a need to invoke a link to sterility.

To dissect the role of sterility versus fast-male divergence at regulatory factors as drivers of spermatogenesis gene expression in Drosophila hybrids, fertile backcross strains between D. simulans and D. mauritiana were created. Gene expression was measured at three genes (bam, sa, and Mst98C) over multiple fertile backcross lines and all genes were found to be significantly underexpressed in backcross fertile progeny relative to parental species (Ferguson et al. 2013). Downregulation of gene expression in fertile backcross males provides support for the fast-male hypothesis for three genes that act at very different stages during sperm development (i.e., mitosis arrest, spermatocyte arrest, and spermiogenesis, respectively). While all backcross approaches manage to remove the sterility phenotype and provide fertile controls, they suffer from the limitation that the generated hybrids are only partial genome mixtures and thus not fully comparable to interspecies hybrids where the whole genome is heterozygous (Table 1).

A second approach to dissect the relative roles of sterility and fast-male divergence takes advantage of unidirectional sterility in crosses among species. Species pairs from different Drosophila lineages produce hybrid male sterility in one direction of the cross, with the hybrid male resulting from the reciprocal cross being fertile (unidirectional sterility) (Bock 1984; Coyne and Orr 1989a). Using species pairs for which one of the species’ genome had been sequenced, sixteen D. melanogaster spermatogenesis gene orthologs were identified and tested for differences in gene expression unique to the sterile condition versus shared between hybrids (Gomes and Civetta 2014). This design allowed identifying differences in misregulation of gene expression across different evolutionary lineages of Drosophila. Some genes (e.g., matotopetli, pelota, vismay) showed evidence of sterility-specific gene misregulation, which lends support to links between sperm developmental genes and sterility, while others (e.g., always early, bam, janus B) showed similar levels of gene misregulation in both fertile and sterile hybrids (Gomes and Civetta 2014). This approach is powerful because it allows for the use of control fertile F1 hybrids, as opposed to backcross hybrids, with a genome makeup almost identical to its sterile F1 hybrid counterpart. Two limitations are the sampling of few genes rather than a genome-wide scan and the fact that the sterile and fertile hybrids differ in their sex chromosome makeup leaving open the possibility that misregulation could be driven by fast sex chromosome divergence without relationship to sterility itself (Table 1).
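The logic of the unidirectional-sterility design can be sketched as a simple decision rule: a gene misregulated only in the sterile F1 hybrid is a candidate for sterility-specific misregulation, whereas a gene misregulated in both reciprocal hybrids is more parsimoniously explained by regulatory divergence. The sketch below is a simplification under stated assumptions: the gene labels and expression values are hypothetical, and a fixed fold-change cutoff stands in for the replicate-based statistical tests that an actual study would use.

```python
def classify_misregulation(parental, fertile_hybrid, sterile_hybrid, min_fold=2.0):
    """Classify one gene from expression levels in a parental species,
    the fertile F1 hybrid, and the sterile F1 hybrid of the reciprocal cross.

    `min_fold` is an arbitrary illustrative cutoff, not a published threshold.
    """
    def misregulated(hybrid):
        ratio = hybrid / parental
        # Count both over- and underexpression relative to the parent.
        return ratio >= min_fold or ratio <= 1.0 / min_fold

    in_fertile = misregulated(fertile_hybrid)
    in_sterile = misregulated(sterile_hybrid)

    if in_sterile and not in_fertile:
        return "sterility-specific"
    if in_sterile and in_fertile:
        return "shared between hybrids"
    if in_fertile:
        return "fertile-hybrid only"
    return "not misregulated"


# Hypothetical normalized expression: (parental, fertile F1, sterile F1).
genes = {
    "gene_A": (1.0, 0.90, 0.30),
    "gene_B": (1.0, 0.40, 0.35),
    "gene_C": (1.0, 1.10, 0.95),
}
calls = {name: classify_misregulation(*values) for name, values in genes.items()}
```

Only genes in the first category would support a link to sterility under this design, and even those remain subject to the sex chromosome caveat noted above.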

Fast Evolution of Sex Chromosomes, Sterility, and Gene Misregulation

The sex chromosome (X or Z) has been shown to have a larger effect on hybrid sterility than other chromosomes, a pattern that is consistent across a wide taxonomic spectrum (see: Coyne and Orr 1989b; Moehring et al. 2006; Good et al. 2008; Kitano et al. 2009; Garrigan et al. 2014). Different explanations have been offered for the large effect of the X chromosome in driving hybrid male sterility. One possibility is that the expression of X-linked genes themselves is disrupted in sterile hybrid males. Two independent genome-wide assays, one using sperm-specific microarrays to compare gene misregulation in hybrids between species of the simulans clade (Moehring et al. 2006) and a more recent RNA sequencing survey of male reproductive gene expression in hybrids between Drosophila pseudoobscura pseudoobscura and Drosophila pseudoobscura bogotana (Gomes and Civetta 2015), revealed no evidence of any significant overrepresentation of X-linked genes among those misregulated. Alternatively, if X-linked sterility factors are primarily acting as trans-regulatory elements, they might exert their effects through their divergent amino acid composition. Thus, another possible explanation for the large X-chromosome effect on HMS is that fast evolution of sex chromosome genes coding for trans-regulatory proteins might lead to misregulation of target genes.

Rates of evolution of genes residing on the sex chromosomes are known to be enhanced by the effect of selection upon recessive variants (Charlesworth et al. 1987). This faster evolution can not only enhance genetic divergence for X-linked genes but also create X-autosome incompatibilities in hybrid genomes. Two recent papers that have conducted comparisons of genome-wide diversity among species have revealed not only higher nucleotide divergence for sex chromosomes than autosomes but also supported the role of selection as an engine for rapid change of sex chromosome-linked genes (Garrigan et al. 2014; Dean et al. 2015).

Is it possible that fast-X evolution driven by diversification of protein coding genes might itself trigger patterns of misregulation of target genes? If so, does the divergence of X-linked trans-regulatory proteins between species have an impact on hybrid fertility? An overrepresentation of trans-regulatory elements on the X chromosome, capable of causing HMS, might explain cascade effects of misregulation that could further contribute to HMS. A quick survey in FlyBase of D. melanogaster genes with molecular function linked to transcription shows a similar distribution of such genes across major chromosomes. This is not necessarily surprising, as a disproportionate effect of X-linked trans-regulatory elements need not result from a numerical overrepresentation of transcription factors located on the X chromosome. It could simply be driven by a few transcription factors of large effect, or by a particular subset of transcription factors regulating specific male reproductive gene expression. A potentially disproportionate effect of X-linked trans-regulatory gene divergence in driving target genes’ misregulation in hybrids was hinted at in a previous transcriptomic scan, where eight out of ten autosomal genes with the largest reversal in allelic expression between hybrids favored the allele matching the X-chromosome genotype (Gomes and Civetta 2015).
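One way to gauge how unusual the eight-out-of-ten observation is would be a one-sided binomial calculation. The sketch below is an illustration added here, not an analysis from the cited study. It assumes that, with no X-chromosome influence, each gene would independently favor either allele with probability 0.5. That null model is itself an assumption and ignores the fact that the ten genes were selected as the largest reversals.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Probability of observing k or more successes in n independent trials,
    each with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Eight or more of ten genes favoring the allele that matches the X genotype,
# under the assumed 50:50 null.
tail_probability = binomial_upper_tail(8, 10)
```

Under these assumptions the tail probability is 56/1024, or roughly 0.055, which is consistent with describing the pattern as a hint rather than as firm evidence.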

In Drosophila, HMS genes are enriched on the X chromosome (Tao et al. 2003; Masly and Presgraves 2007). Two of the best-known examples of major genes contributing to HMS between species are Odysseus (OdsH) and Overdrive (Ovd) (Perez et al. 1993; Phadnis and Orr 2009). These two genes code for putative DNA-binding proteins. OdsH possesses a homeobox binding motif and the protein exerts its sterilizing role by differentially binding heterochromatin (Ting et al. 1998; Bayes and Malik 2009). Ovd contains a DNA-binding domain (MADF: myb/SANT-like domain in Adf-1) and has seven fixed non-synonymous differences between the two closely related species D. p. pseudoobscura and D. p. bogotana (Phadnis and Orr 2009). It is therefore possible that these amino acid differences might result in different abilities of the protein to bind target DNA sites and regulate gene expression. It will be desirable to test whether Ovd binds upstream of any of the proteolytic and other genes found to be specifically misregulated in D. p. bogotana x D. p. pseudoobscura male sterile hybrids (Gomes and Civetta 2015). Out of the 21 sterile hybrid-specific misregulated proteases, 17 have at least one Adf-1 transcription factor binding site (a putative MADF target) somewhere between −500 and +100 bp of their transcription start site (PROMO: http://alggen.lsi.upc.es/cgi-bin/promo_v3/promo/promoinit.cgi?dirDB=TF_8.3) (Messeguer et al. 2002; Farré et al. 2003). If we consider the fact that the presence of more than a single binding site increases the probability that at least one of them is bound by the transcription factor (Lang and Juan 2010), then eight proteases become potential targets. A connection between a known HMS gene and putative misregulated targets could help discover gene pathways involved in male fertility. However, empirical validation would be needed because the presence of DNA TF binding motifs does not guarantee their active participation in protein binding.
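The two filtering steps described above (sites within −500 to +100 bp of the transcription start site, then genes carrying more than one such site) can be sketched as follows. The gene labels and site positions are hypothetical and do not reproduce the PROMO output. The probability function further assumes that sites are bound independently and with equal probability, a simplification introduced here for illustration only.

```python
def sites_in_window(site_positions, upstream=500, downstream=100):
    """Keep predicted binding sites located between -upstream and +downstream bp
    of the transcription start site (positions are relative to the TSS)."""
    return [pos for pos in site_positions if -upstream <= pos <= downstream]


def prob_at_least_one_bound(n_sites, p_single):
    """Probability that at least one of n_sites is bound, assuming each site
    is bound independently with probability p_single."""
    return 1.0 - (1.0 - p_single) ** n_sites


# Hypothetical predicted site positions (bp relative to the TSS) per gene.
predicted_sites = {
    "protease_1": [-420, -35, 60],
    "protease_2": [-250],
    "protease_3": [-900, 150],
}

in_window = {gene: sites_in_window(pos) for gene, pos in predicted_sites.items()}
with_any_site = sorted(g for g, sites in in_window.items() if len(sites) >= 1)
with_multiple_sites = sorted(g for g, sites in in_window.items() if len(sites) >= 2)
```

With an arbitrary per-site binding probability of 0.3, `prob_at_least_one_bound(1, 0.3)` gives 0.3 while `prob_at_least_one_bound(2, 0.3)` gives about 0.51. This illustrates why genes with multiple sites are prioritized as potential targets.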

Genome-Wide Misregulation and Candidate Gene Validation: Moving Forward

There has been significant progress in the field in terms of sampling of transcripts from the primary affected tissue sites (i.e., male reproductive tract) and the use of quantitative methods that are species specific (e.g., qPCR). However, there have not been many studies using a genome-wide approach, such as RNA-Seq, that can sample entire populations of transcripts. This is potentially restricted by the need to have better annotated and assembled genomes beyond the traditionally most widely used species, D. melanogaster. Genome expression scans using F1 or BC fertile hybrids as controls to tease apart gene expression changes associated with the male sterility condition can identify candidate genes and gene ontologies potentially linked to HMS (Civetta and Gomes 2015).

Further progress will require better phenotyping of sterility and functional validation of candidate genes. Often, hybrid males are classified as fertile or sterile based on sperm motility assays or their inability to father progeny. However, examples are known of hybrid males classified as “fertile” despite the fact that they produce very small amounts of sperm or “sterile” F1 hybrid males that are incapable of producing progeny despite having normal motile sperm (Gomes and Civetta 2014; Civetta and Gaudreau 2015). In cases when hybrid males are properly phenotyped as sterile, often the developmental stage at which sterility is triggered is unknown or only broadly characterized. Teasing apart developmental and cell-specific problems in sterile hybrids could go hand in hand with tissue, and cell-specific assays of expression of candidate genes identified from genome assays. These assays could in turn serve to prioritize what genes should be targeted for the creation of misexpression lines (e.g., overexpression), RNAi knockdowns, and CRISPR editing that could ultimately be used for candidate gene validation (Fig. 1).

Fig. 1

A summary of progress on genome-wide misexpression assays used to identify candidate HMS genes/gene ontologies (boxed). The dashed lines connect to future aspects of research that would allow us to validate candidate genes. Shadow lettering is used to point at specific subareas that have been particularly underexplored