Introduction

Under certain conditions, selection will favor differences between the X chromosome and the autosomes in the abundances of genes that are expressed primarily in one of the sexes (“sex-biased genes”). In a randomly mating population, dominant or partially dominant mutations that are beneficial for females but deleterious for males can spread more easily if they are X-linked than if they are autosomal, because the X chromosome spends more time in females than in males, whereas autosomes spend the same amount of time in each sex (Rice 1984). Conversely, low-frequency, recessive X-linked mutations that are beneficial for males, but deleterious for females, are masked in their heterozygous female carriers but can express their beneficial effects in hemizygous males. A similar autosomal mutation only starts to have a beneficial effect on males when it appears in homozygous individuals. Recessive or partially recessive sexually antagonistic mutations that are beneficial for males, and dominant sexually antagonistic mutations that are beneficial for females, will thus spread more easily if they are X-linked; mutations with the opposite patterns will spread more easily if they are autosomal. In each case, modifiers that decrease the expression of these genes in the harmed sex are favored by selection (Rice 1984). In principle this can lead to a difference between the X chromosome and the autosomes in the abundances of sex-biased genes (reviewed by Vicoso and Charlesworth 2006).
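The invasion logic described above can be illustrated with a deliberately simplified sketch. It assumes a rare allele, weak selection, and a single dominance coefficient h that applies to heterozygotes of both sexes on the autosomes and to heterozygous females on the X. The function names and parameters are illustrative and are not taken from Rice (1984).

```python
def invades_x(s_male, s_female, h):
    """Rare X-linked allele: average selection over where its copies reside.

    One-third of the copies are in hemizygous males (full effect s_male);
    two-thirds are in heterozygous females (effect h * s_female).
    Positive s means beneficial, negative s means deleterious.
    """
    return (s_male + 2.0 * h * s_female) / 3.0 > 0


def invades_autosome(s_male, s_female, h):
    """Rare autosomal allele: half of the copies in each sex, all heterozygous."""
    return h * (s_male + s_female) / 2.0 > 0


# Recessive male-beneficial, female-deleterious allele (h = 0.1; hypothetical values)
print(invades_x(0.02, -0.05, 0.1), invades_autosome(0.02, -0.05, 0.1))
# Dominant female-beneficial, male-deleterious allele (h = 0.9; hypothetical values)
print(invades_x(-0.05, 0.04, 0.9), invades_autosome(-0.05, 0.04, 0.9))
```

Under these assumptions, both example alleles can invade when X-linked but not when autosomal, whereas a dominant male-beneficial allele with the same magnitudes shows the reverse pattern.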

Microarray and EST datasets comparing female and male expression have allowed the identification of male- and female-biased genes in several organisms. As predicted from the above considerations, their distribution in the genome is not random, and the X chromosome often differs from the autosomes in its content of sex-biased genes (Kaiser and Ellegren 2006; Khil et al. 2004; Lercher et al. 2003; Parisi et al. 2003; Reinke et al. 2004; Wang et al. 2001). However, the patterns are highly inconsistent between species: in Drosophila melanogaster and Caenorhabditis elegans, there seems to be a deficit of male-biased genes on the X chromosome, whereas in mammals an excess of male-biased genes is observed (Khil et al. 2004; Lercher et al. 2003; Parisi et al. 2003; Reinke et al. 2004). Since Rice’s (1984) theory predicts different results for different levels of dominance of the new mutations, most of the discussion on these patterns has relied on differences in the dominance coefficients of mutations to explain the discrepancies (Vicoso and Charlesworth 2006). Why there should be systematic differences in dominance between organisms remains unclear.

While the X chromosome differs from the autosomes in its transmission mode and ploidy state, as modeled by Rice (1984), it also has other biological properties that could affect the distribution of sex-biased genes. Meiotic X inactivation, for instance, implies that genes required in certain stages of spermatogenesis cannot be located on the X chromosome in organisms where this mechanism is present, such as mammals (Khil et al. 2004) and, possibly, also Drosophila (Hense et al. 2007). In the mouse, it has been shown that genes required for late spermatogenesis are indeed rare on the X, whereas genes required for early spermatogenesis (before X-inactivation) are located on the X more often than expected with a random distribution (Khil et al. 2004). Meiotic X inactivation cannot, however, explain other peculiarities of the distribution of sex-biased genes, as a deficiency of male-biased genes on the X chromosome is not limited to testis-expressed genes in Drosophila (Sturgill et al. 2007).

In Rice’s (1984) model, genes become sex-biased when their expression in the harmed sex is decreased or abolished. Therefore, on average, sex-biased genes should have lower levels of expression than unbiased ones. Connallon and Knowles (2005) tested this by comparing microarray expression data for male-biased, female-biased, and unbiased genes. Surprisingly, they found that sex-biased genes are on average transcribed at higher rates than unbiased genes. Our own analysis of published microarray data (Zhang et al. 2007) for sex-biased and unbiased genes in D. melanogaster confirms that the first step in the evolution of male-biased genes appears to be an increase in expression in the testis (see below).

The X chromosome is often hyperactivated in males as a result of the evolution of dosage compensation in response to degeneration of the Y chromosome, so that the transcription rate on the male X is about twice as high as the transcription rate of autosomal genes (Gupta et al. 2006; Lin et al. 2007; Nguyen and Disteche 2006; Straub and Becker 2007). If there is an upper limit to the rate of transcription that can be achieved (see Discussion), X-linked genes are more likely to be close to this limit when they are expressed in males than are autosomal genes. Since many male-biased genes appear to arise through a large increase in expression in the testis (see Results), this increase might be harder to achieve on the haploid, hyperactivated X chromosome (Vicoso and Charlesworth 2006). This yields two predictions concerning the distribution of male-biased genes.

  1. Hyperactive X chromosomes should accumulate fewer male-biased genes than autosomes.

  2. This deficit should be expression-dependent, as low-expression sex-biased genes either arose from genes that were very little expressed to start with or arose by a decrease in expression levels in the harmed sex, as predicted by Rice’s (1984) model. The evolution of their sex bias should therefore not have been affected by a limit to transcription rates. Highly expressed male-biased genes, on the other hand, are likely to have arisen through a large increase in expression in the testis and, therefore, will be found more rarely on the X chromosome than on the autosomes.

These predictions are further complicated by the fact that most of what is known about dosage compensation and X chromosome hyperactivation concerns the soma (Gupta et al. 2006; Nguyen and Disteche 2006), whereas a large proportion of sex-biased genes is primarily expressed in the germline (Ellegren and Parsch 2007). In D. melanogaster, however, predictions are made particularly simple by the fact that X chromosome hyperactivation has been detected both in the germline and in the soma, although the molecular mechanisms appear to be different (Gupta et al. 2006).

In Drosophila, a deficit of male-biased genes on the X chromosome (prediction 1 above) has been consistently observed (Parisi et al. 2003; Sturgill et al. 2007). Here, we tested the second prediction by checking whether the deficit of male-biased genes observed on the D. melanogaster X chromosome is stronger for highly expressed genes than for lowly expressed genes, using microarray and EST data to measure levels of expression of male-biased genes. We focused on genes expressed in the testis and the ovary, as there are only very small numbers of somatic male-biased genes on the X chromosome, making the sample size too small to perform meaningful analyses.

Methods

Microarray Data Analysis

We used the data of Zhang et al. (2007) to compare levels of gene expression in males and females in D. melanogaster, D. simulans, and D. yakuba, and to test whether male-biased expression arises primarily through an increase in male expression or through a repression of female expression. For this purpose, we selected genes that had a male/female expression ratio > 2 in D. melanogaster but lower in D. simulans and D. yakuba. The level of male and female expression was then compared between D. melanogaster and D. simulans, to detect the nature of the differences between these two species causing the sex-biased expression; i.e., was there an increase in male expression or a decrease in female expression in D. melanogaster? Genes were considered to have acquired a male-biased expression primarily through an increase in male expression if the D. melanogaster/D. simulans ratio of male expressions was >2, whereas the female ratio was >0.5. If the D. melanogaster/D. simulans ratio of male expressions was <2, and the female ratio was <0.5, the genes were considered to have evolved primarily through a decrease in female expression. The same analysis was performed for female-biased genes, but using the female/male ratio of expressions.
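The decision rule for newly male-biased genes can be sketched as follows. This is a minimal illustration, assuming positive, linear-scale expression values; the function name, argument names, and the "unclassified" label for genes meeting neither criterion are hypothetical and are not part of the original analysis.

```python
def mode_of_male_bias(mel_male, sim_male, mel_female, sim_female):
    """Classify how a newly male-biased D. melanogaster gene acquired its bias.

    Thresholds follow the text: a D. melanogaster/D. simulans male ratio > 2
    with a female ratio > 0.5 indicates an increase in male expression;
    a male ratio < 2 with a female ratio < 0.5 indicates a decrease in
    female expression. Other combinations are left unclassified (assumption).
    """
    male_ratio = mel_male / sim_male
    female_ratio = mel_female / sim_female
    if male_ratio > 2 and female_ratio > 0.5:
        return "male increase"
    if male_ratio < 2 and female_ratio < 0.5:
        return "female decrease"
    return "unclassified"


# Hypothetical expression values, for illustration only
print(mode_of_male_bias(900, 300, 100, 110))
print(mode_of_male_bias(300, 280, 40, 120))
```

For female-biased genes the same rule would be applied with the roles of the two sexes exchanged.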

To examine patterns of male/female bias in relation to overall levels of expression, we downloaded four microarray datasets (Parisi et al. 2003) that compared expression levels in D. melanogaster testes and ovaries (dataset 5a, ID GSM2464; dataset 5b, ID GSM2465; dataset 6a, ID GSM2466; and dataset 6b, ID GSM2467) from the NCBI GEO Web site, a repository of microarray datasets (http://www.ncbi.nlm.nih.gov/geo/). The genes were ordered according to the natural log-transformed ratio of the corrected ovary-to-testis signals, as this should be representative of their sex bias. Genes with scores higher than 1 or lower than –1 were considered to be sex-biased, as this corresponds to an approximately twofold enrichment in male and female expression, respectively.
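A minimal sketch of the log-ratio classification is given below. For readability it orients the ratio as testis over ovary, so that positive scores indicate male bias; this orientation, the function name, and the argument names are assumptions, and the sign convention of the downloaded datasets should be checked before applying it.

```python
import math


def classify_sex_bias(testis_signal, ovary_signal, cutoff=1.0):
    """Classify a gene from the natural log of its testis/ovary signal ratio.

    A cutoff of 1 on the natural-log scale corresponds to a ratio of
    exp(1), roughly 2.72-fold, in either direction.
    """
    score = math.log(testis_signal / ovary_signal)
    if score > cutoff:
        return "male-biased"
    if score < -cutoff:
        return "female-biased"
    return "unbiased"


# Hypothetical corrected signals, for illustration only
print(classify_sex_bias(300.0, 100.0))
print(classify_sex_bias(100.0, 300.0))
print(classify_sex_bias(200.0, 100.0))
```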

Once the genes were classified into male-biased, female-biased, or unbiased genes, we organized them according to their overall expression in males (for the case of male-biased genes) and in females (for female-biased genes) and the average of males and females (for unbiased genes). We used the overall probe signal, after normalization for background signaling, as the measure of expression levels (this corresponds to P1S/B and P2S/B in the datasets).

EST Data Analysis

The UNIGENE database (http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=&db=unigene) is a collection of EST and cDNA libraries organized by sequence similarity, so that, for each gene, it returns the ESTs that have been detected in all the libraries in the dataset. The results can be filtered to return all the genes found in one particular species, tissue, and/or chromosome. We took advantage of this to select D. melanogaster autosomal and X-linked genes that are expressed in the testis but not in the ovary (these are referred to as male-biased genes) and in the ovary but not in the testis (female-biased genes). The genes were classified as low-, medium-, or high expression genes using the testis EST count (for male-biased genes) and the ovary EST count (for female-biased genes) as a proxy for expression.
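The EST-based classification can be sketched as below, assuming that per-gene testis and ovary EST counts have already been retrieved. The expression-class boundaries (1, 2 to 4, and more than 4 ESTs) are those given in the legend of Fig. 2, and genes detected in both tissues or in neither are treated as unbiased, as described in the Results. The function names are illustrative.

```python
def classify_est_gene(testis_ests, ovary_ests):
    """Assign sex bias from presence or absence of ESTs in testis and ovary."""
    if testis_ests > 0 and ovary_ests == 0:
        return "male-biased"
    if ovary_ests > 0 and testis_ests == 0:
        return "female-biased"
    return "unbiased"  # detected in both tissues or in neither


def est_expression_class(est_count):
    """Bin a sex-biased gene by its EST count in the relevant tissue."""
    if est_count < 1:
        raise ValueError("a sex-biased gene has at least one EST")
    if est_count == 1:
        return "low"
    if est_count <= 4:
        return "medium"
    return "high"


# Hypothetical counts: 3 testis ESTs, none in the ovary
print(classify_est_gene(3, 0), est_expression_class(3))
```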

Results

Microarray Data: Changes in Sex Bias in the D. melanogaster Lineage

Table 1 reports the results of the analysis described in the first part of the Methods section. By comparing D. melanogaster with D. simulans and the more distant species, D. yakuba, we can determine whether genes that have apparently evolved sex bias in D. melanogaster since its common ancestor with D. simulans did so primarily by sex-specific increases or decreases in gene expression. The results show clearly that the most common mode is for expression to have increased in the sex with higher levels of expression, in agreement with the proposal of Connallon and Knowles (2005). As mentioned in the Introduction, this raises the question of whether genes with high levels of expression are constrained in their ability to evolve sex-specific expression by further increases in one of the sexes.

Table 1 Changes in gene expression in the D. melanogaster lineage

Microarray Data: Sex Bias and Expression Level in D. melanogaster

Using four datasets of Parisi et al. (2003), which compare D. melanogaster ovary and testis expression, we can test whether the distribution of male-biased, female-biased, and unbiased genes is related to their expression levels (Fig. 1). To maximize our capacity to detect any deficit of male-biased genes for all levels of expression, we divided the sample of male-biased genes into three groups of equal size (low expression, medium expression, and high expression). The boundaries of these groups were used to classify the unbiased and the female-biased genes according to their expression level, so that we could compare the numbers of male-biased genes between the X chromosome and autosomes in the different expression classes.
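The binning step can be sketched as follows. This is an illustrative implementation, assuming that each gene has a single expression signal; the handling of ties at the boundaries and the function names are assumptions rather than details taken from the original analysis.

```python
def tercile_boundaries(male_biased_signals):
    """Return the two cut points that split male-biased genes into
    three groups of (nearly) equal size by expression signal."""
    ordered = sorted(male_biased_signals)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three genes to form three groups")
    return ordered[n // 3], ordered[(2 * n) // 3]


def expression_class(signal, low_cut, high_cut):
    """Assign any gene (male-biased, female-biased, or unbiased) to a class
    using the boundaries derived from the male-biased genes."""
    if signal < low_cut:
        return "low"
    if signal < high_cut:
        return "medium"
    return "high"


# Hypothetical signals for six male-biased genes
low_cut, high_cut = tercile_boundaries([10, 20, 30, 40, 50, 60])
print(low_cut, high_cut)
print([expression_class(s, low_cut, high_cut) for s in (25, 30, 55)])
```

Deriving the boundaries from the male-biased genes alone, and then reusing them for the other classes, keeps the male-biased groups equal in size, as intended in the text.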

Fig. 1 The percentage of male-biased, female-biased, and unbiased genes located on the X chromosome for three expression levels (low, medium, and high), using four comparisons of testis and ovary expression levels. (a) Dataset 5a; (b) dataset 5b; (c) dataset 6a; (d) dataset 6b. The p-values denote significant deficits or excesses of low-, medium-, and high-expression sex-biased genes on the X, compared with the number of unbiased genes for that class, and were obtained with 2 × 2 chi-square tests.

In all four datasets, the percentage of male-biased genes located on the X chromosome is lowest for highly expressed genes and highest for lowly expressed genes, and this difference is significant in three of the four cases (using a 3 × 2 chi-square test; Table 2) and after combining the data by summing chi-square values across the datasets. This contrasts with unbiased genes (Fig. 1), which show the opposite pattern: there is a larger proportion of highly expressed than lowly expressed unbiased genes on the X chromosome (although this difference is only significant for three of the datasets; see Table 2). No significant differences are detected for female-biased genes among the three levels of expression.
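A 3 × 2 test of this kind, and the pooling of statistics across datasets, can be sketched with the standard library alone. The sketch assumes an uncorrected Pearson chi-square and treats the datasets as independent, so that both the statistics and their degrees of freedom add; whether these choices match the original analysis is not stated in the text. All counts below are hypothetical.

```python
import math


def chi_square_stat(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_sf_even_df(stat, df):
    """Upper-tail p-value of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Rows: low, medium, high expression; columns: X-linked, autosomal (hypothetical)
datasets = [
    [[30, 170], [22, 178], [12, 188]],
    [[28, 172], [25, 175], [15, 185]],
]
stats = [chi_square_stat(t) for t in datasets]
print([chi_square_sf_even_df(s, 2) for s in stats])            # per-dataset, df = 2
print(chi_square_sf_even_df(sum(stats), 2 * len(datasets)))    # combined
```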

Table 2 Probability values obtained with 3 × 2 chi-square results comparing the proportion of low-, medium-, and high-expression genes located on the X chromosome for male-biased, female-biased, and unbiased genes

Furthermore, in three of the four datasets, the deficit of male-biased genes on the X chromosome (compared with the number of unbiased genes that are located on the X) is nonsignificant for low-expression genes, using 2 × 2 chi-square tests to compare the two categories (Fig. 1). In contrast, a similar comparison shows that high-expression male-biased genes are present at a significantly lower frequency on the X chromosome than are unbiased genes in three of the four datasets. In all four datasets, high-expression genes have the most highly significant deficit of male-biased genes, suggesting that the deficit of male-biased genes on the X chromosome is indeed stronger for highly expressed genes. The patterns for female-biased genes are mostly nonsignificant, apart from a significant excess of low-expression female-biased genes on the X in one of the datasets (Fig. 1a).
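The 2 × 2 comparisons within an expression class can be illustrated as follows. This is a sketch under stated assumptions: it uses the Pearson chi-square without a continuity correction (the text does not say whether one was applied), and the counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    For example, rows = male-biased / unbiased genes and
    columns = X-linked / autosomal genes within one expression class.
    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with df = 1, via the normal tail
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical high-expression class: 20 of 200 male-biased genes on the X,
# versus 160 of 1000 unbiased genes on the X
stat, p_value = chi_square_2x2(20, 180, 160, 840)
print(round(stat, 3), round(p_value, 3))
```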

EST Data

Overall, the microarray results agree with the prediction that the deficit of male-biased genes on the X chromosome should be stronger for highly expressed genes and weaker for lowly expressed genes. This analysis suffers from the drawback that truly low-expression genes may not be identified as sex-biased among the generally high levels of background noise in microarray data, especially because this dataset is meant for comparative analyses (male versus female) and may not be ideal for estimating absolute expression levels. It therefore seemed useful to see if these results held when using EST data as a proxy for expression level.

EST data from several cDNA libraries can be easily queried in the NCBI Unigene database. While it is difficult to determine which genes are male-biased, female-biased, or unbiased from EST datasets, we can select them according to the tissues in which they have been detected. In this case, we selected genes that are expressed in the testis, but not in the ovary, and classified them as male-biased genes. To examine genes with female-biased functions, we chose genes detected in the ovary but not in the testis. Genes that were expressed neither in the testis nor in the ovary, or expressed in both, were classified as unbiased, and the percentage of these genes located on the X chromosome was used as a control value. (In the microarray dataset, the testis and ovary expressions are by definition similar for unbiased genes, so that using one or the other, or their average, has little effect on the results. For the EST dataset, on the other hand, we classify genes as unbiased if they have no expression in the ovary and testis, or if they are detected in both, independently of the male-to-female ratio. It is therefore unclear what the expression value for the unbiased genes should be, and we focused instead on the expression dependence of male-biased and female-biased genes). The results are shown in Fig. 2.

Fig. 2 The percentage of male-biased and female-biased genes that are located on the X chromosome, for three different levels of expression (the testis EST count is used as a proxy for male-biased gene expression, and the ovary EST count for female-biased gene expression): low (genes that have a testis EST count of 1), medium (testis EST count of 2 to 4), and high (EST count > 4). The p-values are for the comparison among low, medium, and high expression for each class of genes, and they were obtained using 3 × 2 chi-square tests; NS, nonsignificant difference. The dotted line denotes the overall percentage of unbiased genes located on the X chromosome.

We repeated the previous analysis, using the number of ESTs detected in the testis (for male-biased genes) and in the ovary (for female-biased genes) as a proxy for expression levels. Consistent with the microarray data, the female-biased gene distribution does not appear to be restricted to any class of gene expression, with an excess of female-biased genes being located on the X chromosome for all EST count classes (Fig. 2). The male-biased gene distribution (Fig. 2), on the other hand, is heavily expression dependent, with a deficit of male-biased genes being observed only for medium- and high-expression genes, and not for low-expression genes, leading to an overall deficit of male-expressed genes on the X chromosome, as has been described previously in the literature (14% of male-biased genes are located on the X, versus 16% of unbiased genes and 21% of female-biased genes).

Discussion

Since microarray data have become widely available, much work has focused on testing Rice’s (1984) predictions for the genomic distribution of sex-biased genes. As expected, the X chromosome shows peculiar patterns of accumulation of sex-biased genes, but these are highly inconsistent among different species analyzed and, sometimes, between studies of the same groups (Ellegren and Parsch 2007).

Our main results are that, in D. melanogaster, the deficit of male-biased genes on the X chromosome is strongly dependent on their expression level; furthermore, genes that have evolved male or female bias since the common ancestor with D. simulans have mainly done so by increased levels of expression in the relevant sex. These results are consistent with the idea that, if dosage compensation mechanisms lead the X to become hyperactivated in males, any increase in expression required to make a new male-biased gene could be harder to achieve than it would be for an autosomal gene, as previously suggested by Vicoso and Charlesworth (2006).

A basic assumption of this hypothesis is that there is an upper limit to the rate of transcription, and that X-linked genes in the D. melanogaster testis reach this limit. A similar argument has been made for gene duplications in yeast, where duplicates of genes that are heavily transcribed are more likely to be retained in the genome (Kondrashov et al. 2002; Seoighe and Wolfe 1999), suggesting that in this group it is often more costly (or impossible) to increase the expression of the ancestral copy than to keep an extra copy of the gene. This is harder to assess in multicellular organisms, because genes whose products are required at high concentrations can be transcribed from a multitude of organs, from a single organ, or even from a few specialized cells in one organ; what we are interested in is the mean rate of transcription per chromosome. There are several relevant examples, however, such as the gene amplification of oncogenes in tumors (Schwab 1999) and the duplications of insecticide resistance genes (Emerson et al. 2008), which suggest that transcription limits also occur in multicellular organisms, since it seems to be easier to acquire high levels of gene product by duplications than by increased rates of transcription. Furthermore, in yeast, the correlation between higher expression levels and retention of duplicates is observed even at low levels of expression (Seoighe and Wolfe 1999). It is therefore plausible that transcription limitations affect expression in the Drosophila testis, where a relatively small number of cells produce large amounts of protein.

One puzzling observation from the microarray data is that unbiased genes also have an expression-dependent genomic distribution: overall, highly expressed genes are located on the X chromosome more often than low-expression genes (Fig. 1 and Table 2), although there is considerable variation among the four datasets; the sum of the 2 × 2 χ2 values for this comparison is 66.99 (p < 0.001). This contradicts our predictions, as unbiased genes would also be expected to be affected by an existing cap on transcription, although to a smaller extent than for male-biased genes. Some other process must therefore be involved. One possibility is that the selective pressure to evolve dosage compensation of X-linked genes in response to the degeneration of the Y chromosome (Charlesworth 1978; Ohno 1967) is greater for highly expressed genes, so that on average they evolve more effective equalization of gene expression in males and females, in the absence of sexually antagonistic fitness effects. This could cause there to be more X-linked unbiased genes in the high-expression level class.

If this were the case, our model predicts that a deficit of highly expressed genes that have only recently increased their level of expression, as opposed to the ancient process of dosage compensation, will be detected on the X chromosome. To test this possibility, we examined the dataset of Zhang et al. (2007), which compares expression levels of males and females of D. melanogaster, D. simulans, and D. yakuba (Supplementary Material). We selected genes without sex bias that have been subject to more than twofold increases in expression in the D. melanogaster branch and found that these were rarer than expected on the X chromosome (P < 0.0001; Supplementary Material), suggesting that transcriptional limitations may also be affecting unbiased genes.

Of course, processes other than the one we have proposed could lead to an expression-dependent deficit of male-biased genes on the X chromosome. For instance, mutations that increase the activity of X-linked genes with low expression levels in the testis might have smaller effects on fitness in males than mutations in highly expressed genes, consistent with the general correlation between expression level and degree of selective constraint on the protein sequence (Drummond and Wilke 2008). We examined this hypothesis by computing the expected rates of fixation on the X and autosomes of mutations with different effects on male fitness, following the approach of Charlesworth et al. (1987); see the Supplementary Material. Increasing the selection coefficient in males leads to a more pronounced accumulation of recessive mutations on the X chromosome; the effect on dominant mutations is only marginal. This suggests that if genes with high expression in the testis were associated with larger benefits for males, they would be found on the X more often than genes with low expression, opposite to what we observed. If we consider the case of mutations that are beneficial for males but deleterious for females, then increasing the deleterious effect of these mutations in females can lead to a much more pronounced accumulation of male-beneficial sexually antagonistic mutations on the autosomes. A correlation between the expression level in the testis and deleterious effects on female fitness could therefore account for the observed pattern. However, since most male-biased genes arise primarily through increases of expression in males, and the level of expression in males is unlikely to influence female fitness, it is unclear why there would be such a correlation.

It is also possible that genes that are highly expressed in the testis are under greater pressure to move from the X chromosome to the autosomes, due to their activity being impaired during late spermatogenesis as a result of meiotic X inactivation (Khil et al. 2004; Wu and Xu 2003). While this scenario is compatible with our findings, the evidence in Table 1 suggests that recently evolved male-biased gene expression largely results from increases in gene activity in males, so that it seems likely that a considerable proportion of male-biased genes in D. melanogaster is the result of changes in expression in situ, rather than movements of genes. Furthermore, the results of Betrán et al. (2002) show that the predominantly testis-specific expression of genes that have transposed to the autosomes from the X chromosome is not seen in the X chromosome ancestral genes, so that there is no necessary relation between the current expression level of a transposed male-biased gene and that of its X-linked ancestor.

Another possibility is that low-expression testis genes are not really male-biased genes but are so rarely detected in EST screens that, by chance, they were found only in the testis. However, the expression-dependent distribution of male-biased genes is not limited to the genes with the lowest expression levels, suggesting that this is not a major issue in our analysis. Furthermore, we also analyzed genes that are classified as “testis-specific” in the Unigene database, as these are consistently detected in testis libraries, and found the same pattern as for all testis-expressed genes (Supplementary Material).

Connallon and Knowles (2005) found a negative correlation between the sex ratio of expression of male-biased genes and their frequency of location on the X chromosome. They suggested that this was due to dominance effects: dominant male-biased mutations are likely to have a strong deleterious effect on heterozygous females, leading to the evolution of expression inhibitors in females and, consequently, to a strong sex bias. According to this theory, a high sex ratio reflects the fixation of more dominant mutations, which are less likely to accumulate on the X chromosome (Rice 1984), causing the observed pattern. If highly expressed male-biased genes also have highly biased expression, we could be describing the same pattern here when using the microarray data. In the case of the EST data, however, we focused on genes whose expression always seems to be inhibited in females (when testis/ovary expression levels are compared), allowing us to bypass this issue to some extent. The fact that the correlation between expression levels and frequency of location on the X chromosome remains strongly significant suggests that there truly is an effect of expression levels on the distribution of male-biased genes, even when only male-specific genes are considered.

While the data analyzed here are not sufficient to establish definitively that dosage compensation is the cause of the male-biased gene deficit observed for the D. melanogaster X chromosome, it is interesting to note that they follow the predictions of our hypothesis. It is also worth considering whether transcription limitations could provide another line of investigation for the differences among mammals, flies, and worms mentioned in the Introduction. Our hypothesis predicts a deficit of high-expression male-biased genes on the X chromosomes in all three groups, assuming that the X chromosome in males has evolved hyperactivity in response to Y chromosome degeneration, as appears to be the case (Gupta et al. 2006). Contrary to this, there is an apparent excess of male-biased genes on the mammalian X, once the effects of meiotic X inactivation are removed (Khil et al. 2004). If sex bias in mammals evolves primarily by reductions in gene activity, rather than increases in activity, in contrast to what appears to be the case in Drosophila, the observed pattern could be explained. Comparative analyses of patterns of evolution of gene expression, similar to those presented in Table 1, would shed light on this. Another possibility is that the same transcriptional limitations may be at play in mammals, but to a lesser extent than in D. melanogaster and C. elegans, if the mean rate of transcription per cell is lower in this group. There is no direct measure of overall levels of X-linked expression per cell in the testis of D. melanogaster, C. elegans, and mammals. Should this value be lower for mammals, it could provide a new line of explanation for the opposite distribution of X-linked sex-biased genes in these species. Finally, under our hypothesis mammals should also present a deficit of highly expressed female-biased genes on the X chromosome, as mammalian females inactivate one copy of the X and hyperexpress the other in order to compensate for the reduced dosage of X-linked genes (Gupta et al. 2006; Lin et al. 2007; Nguyen and Disteche 2006; Straub and Becker 2007). This, again, is open to empirical testing.