Introduction

Studies of a variety of organisms have shown that the germline mutation rate is male-biased, i.e., that more mutations are produced in males than in females (mammals [Shimmin et al. 1993a; Chang et al. 1994; Bohossian et al. 2000; Makova and Li 2002], birds [Ellegren and Fridolfsson 1997; Kahn and Quinn 1999; Bartosch-Härlid et al. 2003; Axelsson et al. 2004], fish [Ellegren and Fridolfsson 2003], and plants [Filatov and Charlesworth 2002; Whittle and Johnston 2002]). The male mutation bias was first suggested by Haldane (1935) under the assumption that if mutations are predominantly introduced by errors during cell division (DNA replication), then the germline mutation rate should correlate positively with the number of germline cell divisions, which in many animal species is far greater in spermatogenesis than in oogenesis. It has subsequently been recognized that comparisons of estimates of the male mutation bias and the sex bias in germline cell division actually provide a means for testing the relative importance of faulty replication in germline mutation (Chang et al. 1994; Chang and Li 1995; Kahn and Quinn 1999; Li et al. 2002). Specifically, if the male-to-female mutation rate ratio (αm) is significantly different from the ratio of the numbers of cell divisions in spermatogenesis and oogenesis (c), then mechanisms other than replication errors must also contribute to mutation. Alternatively, this could indicate that the mutation rate per cell generation differs between spermatogenesis and oogenesis. Analyses of these basic aspects of germline biology require appropriate estimates of αm and c.

The extent of male-biased mutation is also important in other contexts. Molecular clock rates may vary in response to life history characteristics such as generation time and sexual selection if these characteristics affect the number of germ cell divisions. For instance, when sexual selection is intense and leads to increased sperm production, the relative excess of male mutation is expected to increase (Bartosch-Härlid et al. 2003). Moreover, the relative rate of adaptive evolution in different parts of the genome is affected by αm (Kirkpatrick and Hall 2004), as are the chromosome-specific levels of nucleotide diversity (Sachidanandam et al. 2001). Furthermore, a significant sex bias in mutation rates has practical implications for paternity testing (Ibarguchi et al. 2004) and is also important in health risk assessment (Crow 1997).

Mutation rates are difficult to estimate by direct counts from one generation to another. However, Miyata et al. (1987) offered an evolutionary approach for studying αm. By comparing neutral substitution rates in chromosomes that spend different amounts of time in the male and female germlines, the relative mutation rates of the two sexes can be indirectly inferred. For instance, since the mammalian Y chromosome is only transmitted through the male germline, the rate of Y sequence evolution is solely governed by the male mutation rate. Knowing this rate from divergence data of neutral Y chromosome sequences, the female mutation rate can be obtained from divergence data of neutral X chromosome sequences by taking into account the fact that, during evolution, X chromosomes are in the female germline two-thirds of the time and in the male germline one-third of the time. Correspondingly, αm can be estimated in organisms with female heterogamety, such as birds, where the W chromosome evolves under the influence of the female mutation rate only, while the Z chromosome is in the male germline two-thirds of the time.

A potential pitfall in αm estimation using the evolutionary approach derives from the fact that substitution rates in regions traditionally thought to be free of selective constraint are not uniform along chromosomes (Lercher et al. 2001). Such variation may be due either to mutation rate heterogeneity or to negative selection on, e.g., noncoding DNA or synonymous sites. Indeed, recent observations from different organisms point to significant substitution rate heterogeneity among genomic regions (Lercher et al. 2001; Ebersberger et al. 2002; Smith et al. 2002; Waterston et al. 2002; Hardison et al. 2003), a variation that may occur at various scales, from sequence context effects of neighboring nucleotides up to regional variation at the Mb-scale level (Ellegren et al. 2003). Potentially, the effect of such variation on αm estimation could be significant. For example, consider estimates of αm in the human-chimpanzee comparison. Typical sequence divergences for the X and Y chromosomes in this case are 0.010 and 0.019, respectively (Ebersberger et al. 2002), which gives αm = 3.45 according to the formula of Miyata et al. (1987). However, if data from a region of unusually high substitution rate on the Y chromosome (say 1.5 times the average rate) are used for αm estimation, then αm = 38.
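The sensitivity of αm to the Y/X divergence ratio in this example can be checked with a short calculation. The sketch below assumes the standard form of the Miyata et al. (1987) relation for an XY system, Y/X = 3αm/(2 + αm), solved for αm; the function name `alpha_from_yx` is ours and purely illustrative.

```python
def alpha_from_yx(k_y: float, k_x: float) -> float:
    """Male-to-female mutation rate ratio from Y and X divergence.

    Assumes Y/X = 3*alpha / (2 + alpha), i.e. the X chromosome spends
    two-thirds of its time in females and one-third in males, while the
    Y chromosome is transmitted through males only.
    """
    ratio = k_y / k_x
    if ratio >= 3:
        # The relation has no finite positive solution at or above Y/X = 3.
        raise ValueError("Y/X must be below 3 for a finite alpha")
    return 2 * ratio / (3 - ratio)


# Divergences quoted in the text for the human-chimpanzee comparison.
baseline = alpha_from_yx(0.019, 0.010)
# Same X divergence, but a Y region evolving 1.5 times faster than average.
inflated = alpha_from_yx(1.5 * 0.019, 0.010)
print(round(baseline, 2), round(inflated, 2))
```

Because the denominator 3 − Y/X approaches zero as the ratio approaches 3, modest regional inflation of the Y divergence is amplified into a very large change in αm.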

In practice, we do not know to what extent this represents a real problem in molecular evolutionary analyses of male-biased mutation. A common approach in studies of this kind is to target one or a few regions shared (yet nonrecombining, i.e., “gametologous”) between the X and the Y chromosomes, or the Z and the W chromosomes, and estimate αm based on substitution rate estimates from these regions. Hence, it is considered to be an advantage to use gametologous genes to correct for direct sequence effects. However, the use of gametologues would not correct for regional effects on mutation rates. To empirically test whether substitution rate heterogeneity along the sex chromosomes affects αm estimates, we study here the male mutation bias in five birds using sequence data from 16 different Z- and W-linked gametologous introns.

Materials and Methods

Sequence Data

Sixteen introns from three gametologous gene pairs (CHD1Z/CHD1W, SPINZ/SPINW, and UBAP2Z/UBAP2W) on the avian Z and W sex chromosomes were sequenced in five species from the order Galliformes: chicken (Gallus gallus), red-legged partridge (Alectoris rufa), quail (Coturnix coturnix), turkey (Meleagris gallopavo), and black grouse (Tetrao tetrix) (Table 1). PCR amplification was achieved by the use of conserved Z-specific and W-specific exon primers (Supplementary Material). PCR reactions for all introns except UBAP2 introns 1 and 4 were performed in 20-μl volumes in a Perkin Elmer 9600 Thermal Cycler using 0.5 U AmpliTaq Gold (Applied Biosystems), 2.5–3 mM MgCl2, 0.08 mM dNTPs, 1× PCR Gold Buffer, 5 pmol of each primer, and 50 ng of template DNA. UBAP2 introns 1 and 4 were amplified with Pfu DNA polymerase (Promega) according to the manufacturer’s instructions. The error rate of Pfu DNA polymerase is approximately 1.6 × 10⁻⁶ errors per base (Lundberg et al. 1991), while the error rate of Taq DNA polymerase is approximately 1.1 × 10⁻⁴ (e.g., Barnes 1992). We assume that this difference has had no effect on our study. Prior to sequencing, PCR products were purified with ExoSAP-IT (Amersham Biosciences), followed by direct sequencing in forward and reverse directions using the original PCR primers and the DYEnamic ET DyeTerminator Kit (Amersham Biosciences). PCR products from UBAP2 introns, CHD1 introns 12 and 22, and SPINZ intron 3 were ligated into pGEM-T Easy Vectors, transformed into JM109 competent cells (Promega), and thereafter sequenced using modified M13 primers OMNI (5′ ACAGGAAACAGCTATGACCATGAT) and UNI (5′ CGACGTTGTAAAACGAGGCCAGT), as well as with internal sequencing primers to cover the full length of the introns. Pfu DNA polymerase generates blunt-end fragments during PCR amplification and these products were subsequently A-tailed, ligated into pGEM-T Easy Vector, and transformed into JM109 competent cells (Promega). Positive clones were amplified with Templiphi DNA sequencing template amplification kit (Amersham Biosciences) and sequenced in both directions using the OMNI/UNI primers. Reactions were electrophoresed on a MegaBACE 1000 sequencing instrument (Amersham Biosciences). All sequences have been deposited in GenBank (accession numbers in Supplementary Material).

Table 1 Data on intron length, sequence divergence (KZ and KW), and male-to-female mutation rate ratio (αm) based on untrimmed and trimmed data sets, respectively

Sequence Analyses

Sequences were edited in Autoassembler (Applied Biosystems) and forward and reverse sequences compared to construct consensus sequences. Regions of simple repeats, which may be difficult to align, were identified with Sputnik (Abajian 1994) and masked before further analyses. Z- and W-linked sequences from each intron were aligned separately, using Clustal W (Thompson et al. 1994) in Sequence Navigator 1.0 (Applied Biosystems). Phylogenetic reconstruction was performed with a concatenation of all Z- and all W-linked alignments using the neighbor joining method and Tamura-Nei distance method implemented in MEGA 2.1 (Kumar et al. 2001) (Fig. 1). The bootstrap analysis consisted of 1000 replicates. Individual trees constructed for each intron did not give any indication of gene conversion between gametologous sequences (data not shown). Pairwise comparisons between all five species for KZ and KW were estimated using the Tamura-Nei distance method in MEGA 2.1 (Supplementary Material).

Figure 1 Neighbor-joining tree with branch lengths (Tamura-Nei) and bootstrap supports in boldface.

Sequence divergence (KZ and KW) for each intron was estimated using the baseml program in PAML 3.13d (Yang 1997), with the Tamura-Nei (1993) model of sequence evolution. The estimation of intronic standard errors and hypothesis testing was carried out using nonparametric bootstrapping, which involved random sampling with replacement of sites from an alignment to generate alignments of the same length as the original. The bootstrapping process was repeated 1000 times for each alignment, thereby giving 1000 sets of KZ and KW estimates. Following the method of Miyata et al. (1987), the male-to-female mutation rate ratio, αm, was estimated using KZ and KW. The expected ratio of KZ to KW, taking into account the time the chromosomes spend in male and female germlines, is (1/3 + 2/3 αm):1, which gives αm = (3 × KZ/KW − 1)/2. Standard errors for KZ, KW, and hence αm were given by the standard deviation of the 1000 bootstrap replicates. The KZ/KW ratio can be obtained by matching any divergence estimates, including pairwise divergence between two species, divergence down one lineage, or total divergence down all lineages in a tree.
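The estimation and bootstrapping steps described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: divergence is computed here with a Jukes-Cantor correction on a pair of aligned sequences as a simple stand-in for the Tamura-Nei estimates obtained with baseml, and all function names are hypothetical.

```python
import math
import random
import statistics


def alpha_from_zw(k_z: float, k_w: float) -> float:
    """alpha_m = (3 * KZ/KW - 1) / 2, from KZ:KW = (1/3 + 2/3 alpha_m):1."""
    return (3 * k_z / k_w - 1) / 2


def jc_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance; a stand-in for the Tamura-Nei model."""
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    return -0.75 * math.log(1 - 4 * p / 3)


def bootstrap_alpha_se(z_pair, w_pair, replicates=1000, seed=0):
    """Standard error of alpha_m from resampling alignment sites."""
    rng = random.Random(seed)

    def resample(pair):
        # Draw sites with replacement, keeping the original alignment length.
        n = len(pair[0])
        cols = [rng.randrange(n) for _ in range(n)]
        return ("".join(pair[0][i] for i in cols),
                "".join(pair[1][i] for i in cols))

    alphas = []
    for _ in range(replicates):
        k_z = jc_distance(*resample(z_pair))
        k_w = jc_distance(*resample(w_pair))
        if k_w > 0:  # skip replicates where the ratio is undefined
            alphas.append(alpha_from_zw(k_z, k_w))
    # The standard deviation across replicates serves as the standard error.
    return statistics.stdev(alphas)
```

Because αm is a function of the ratio KZ/KW, replicates with a small KW can produce very large αm values, which is one reason short introns yield wide confidence intervals.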

Statistical Analyses

In order to study variation in substitution rates among introns we used estimates of divergence down all lineages, i.e., the total branch lengths for each intron and chromosomal class. G tests (Sokal and Rohlf 2000) were performed, comparing the observed numbers of substitutions (O(D)) with the expected (E(D)), given by the following formula:

$$ E(D)_{j} = \frac{\sum_{i} O(D)_{i}}{\sum_{i} L_{i}}\, L_{j} $$
(1)

where L is the ungapped length of each intron alignment, O(D) is given by the product of L and the intronic divergence, and summation is over all introns. Variation in estimates of the male mutation bias among gametologous introns was studied with a one-way analysis of variance (ANOVA) using the logarithm of αm for each intron to ensure normality. The error variance was estimated by dividing each intron alignment in odd-and-even numbered nucleotides, thereby giving two estimates of total branch lengths for each intron (Smith and Eyre-Walker 2003). Variations in substitution rate and in the estimates of the male mutation bias with introns and lineages as independent factors were studied in the same way with two-way ANOVAs. The two shortest introns, CHD1-13 and CHD1-21, displayed excessively high error variances, so we omitted these from the analyses of variance to attain homogeneity of variances among the introns (Sokal and Rohlf 2000). Moreover, CHD1-15, CHD1-24, and CHD1-25 showed zero branch lengths and were excluded from the two-way ANOVAs.
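As an illustration of Eq. 1 and the G test, the following sketch computes the expected numbers of substitutions and the G statistic, G = 2 Σ O ln(O/E), from per-intron observed counts and alignment lengths. The function names are ours, and the degrees of freedom are taken as the number of introns minus one, which is an assumption consistent with the G15 values reported for 16 introns.

```python
import math


def expected_substitutions(observed, lengths):
    """Eq. 1: the pooled substitution rate multiplied by each intron length."""
    pooled_rate = sum(observed) / sum(lengths)
    return [pooled_rate * length for length in lengths]


def g_statistic(observed, lengths):
    """Return (G, degrees of freedom) for observed vs. expected counts."""
    expected = expected_substitutions(observed, lengths)
    g = 2 * sum(o * math.log(o / e)
                for o, e in zip(observed, expected)
                if o > 0)  # a zero count contributes nothing to the sum
    return g, len(observed) - 1
```

The resulting G value would be compared with a chi-square distribution with the stated degrees of freedom to obtain a p value.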

Search for Conserved Elements

We made alignments of Z- and W-linked gametologues from chicken alone to search for conserved blocks, and performed two types of analyses. First, alignments were examined to see if the distribution of lengths of perfectly conserved blocks deviated from the distribution that would be expected by chance. To obtain the expected distribution of block lengths, each alignment was permuted 1000 times and the length distribution of conserved blocks was noted for each permutation. We compared the observed distribution of conserved blocks with the expected distributions with a Kolmogorov-Smirnov test. Second, we used Pipmaker (Schwartz et al. 2000), which is based on the local alignment method BLASTZ and outputs a list of conserved regions and their respective conservation level. Each intron alignment was searched for conserved blocks with a sequence similarity of at least 90%.
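The permutation step of the first analysis can be sketched as follows. The text does not specify how alignments were permuted; the sketch assumes that alignment columns are shuffled, which preserves the overall proportion of identical sites while randomizing their order. Function names are hypothetical.

```python
import random


def conserved_block_lengths(seq_a, seq_b):
    """Lengths of runs of identical, non-gap columns in a pairwise alignment."""
    lengths, run = [], 0
    for a, b in zip(seq_a, seq_b):
        if a == b and a != "-":
            run += 1
        else:
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def permuted_block_lengths(seq_a, seq_b, permutations=1000, seed=0):
    """Pool block lengths over column-shuffled copies of the alignment."""
    rng = random.Random(seed)
    columns = list(zip(seq_a, seq_b))
    pooled = []
    for _ in range(permutations):
        rng.shuffle(columns)  # randomize column order in place
        shuffled_a, shuffled_b = zip(*columns)
        pooled.extend(conserved_block_lengths(shuffled_a, shuffled_b))
    return pooled
```

The observed block lengths from `conserved_block_lengths` and the pooled lengths from `permuted_block_lengths` are the two samples that a two-sample Kolmogorov-Smirnov test would then compare.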

In order to examine sequence conservation in intron edges, we estimated sequence divergence in nonoverlapping sequence blocks at various distances from the intron-exon boundaries as well as in the bulk middle part of the introns (excluding 30 base pairs at each end). The blocks were positioned at nucleotides 1–10, 11–20, and 21–30 from the splice sites at both the 5′ ends (running 5′→3′) and the 3′ ends (running 3′→5′), respectively. Blocks comprising nucleotides 1–20 and 1–30 were also analyzed. Sequence divergence within the end blocks, as well as the middle block of the introns, was compared using a chi-square test of a 7 × 2 contingency table. The contingency table was constructed by assigning the first row as the estimated number of nucleotide changes for each block using the baseml program of PAML, and the second row as the number of bases in each block.
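For reference, the Pearson chi-square statistic for such a contingency table can be computed as in the sketch below. The table layout (one row of estimated changes and one row of base counts, with one column per block) follows the description above; the function name is ours and no values from the study are used.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a 2D table.

    `table` is a list of rows, e.g. [changes_per_block, bases_per_block].
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof
```

With seven blocks and two rows the test has six degrees of freedom.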

Relative Rate Tests

In order to investigate whether different lineages have evolved at different rates, relative rate tests were performed for all possible species pairs and chromosomal classes using concatenated alignments of all introns from the trimmed data sets. Substitutions in terminal branches from the split of any two lineages were compared, and the bootstrapped branch lengths were used for evaluating the significance of the relative rate tests (Tables 2 and 3). A similar approach was used to study variation in αm among lineages. In this case αm was estimated using terminal branch lengths from the concatenated alignments of Z and W sequences, respectively. In addition, substitution rates and αm were estimated using the whole phylogeny (Table 3) followed by relative rate tests.
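One plausible way to evaluate significance from bootstrapped branch lengths is sketched below. The text does not spell out the exact procedure, so this is an assumption: for each bootstrap replicate the difference between the two terminal branch lengths is taken, and a two-sided p value is derived from the proportion of replicates in which the difference falls on either side of zero. The function name is hypothetical.

```python
def bootstrap_relative_rate_p(branch_a, branch_b):
    """Two-sided bootstrap p value for a difference in branch length.

    `branch_a` and `branch_b` hold branch lengths for two lineages,
    paired by bootstrap replicate (assumed procedure, see lead-in).
    """
    diffs = [a - b for a, b in zip(branch_a, branch_b)]
    n = len(diffs)
    frac_not_above = sum(d <= 0 for d in diffs) / n
    frac_not_below = sum(d >= 0 for d in diffs) / n
    # Double the smaller tail and cap at 1.
    return min(1.0, 2 * min(frac_not_above, frac_not_below))
```

The same function could be applied to per-replicate αm estimates for two lineages in place of branch lengths.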

Table 2 Relative rate tests for KZ, KW, and αm, assuming a star phylogeny
Table 3 Relative rate tests for KZ, KW, and αm using the whole phylogeny (i.e., not assuming a star phylogeny)

Results

Five galliform species of birds (chicken, red-legged partridge, quail, turkey, and black grouse) were sequenced for 16 introns of the Z- and the W-linked copy of three gametologous genes (Table 1). A total of about 9000 bp Z-linked and 8200 bp W-linked sequence was thereby obtained for each species. We examined variation in substitution rates (KZ and KW; total branch lengths of the respective phylogenetic tree) and subsequently also estimates of the male mutation bias (αm) for each intron (untrimmed data set). KZ varied between 0.21 and 0.39, with statistically significant variation among introns (G15 = 51.4, p < 0.0001). KW varied between 0.11 and 0.21 and the variation among introns was significant also in this case (G15 = 56.1, p < 0.0001). Somewhat surprisingly, KW estimates fell broadly within two distinct ranges: CHD1-6, 7, 10, 11, 12, 13, 15, 17, and 21 show divergences of 0.165–0.214, while CHD1-22, 24, and 25, SPIN-3, and UBAP-1, 2, and 4 show divergences of 0.106–0.139, a pattern that remains unexplained. Estimates of αm based on each gametologous intron ranged between 1.39 and 3.29, with a mean of 2.19. There was evidence for significant variation in log αm among the introns (ANOVA: F13, 14 = 3.56, p = 0.01).

There was no significant correlation between substitution rates of gametologous introns (ρ = 0.318, p = 0.228, Spearman rank test). Moreover, the variance in αm estimates obtained from the 16 gametologous introns (var = 0.35) was not significantly lower than the variance in a permuted data set (95% confidence limit = 0.29, single sided) where introns from Z and W were randomly selected to estimate αm (p = 0.13).

A detectable fraction of the noncoding DNA of vertebrate genomes, including chicken (Chicken Genome Consortium 2004), consists of conserved blocks that most likely evolve under purifying selection (Ludwig 2002). The presence of sequences subject to selective constraint in our data set may potentially contribute to the heterogeneity in substitution rates seen among introns. Comparative genomics offers a means for identification of conserved blocks and here we can use data from gametologous Z-linked and W-linked sequences for this purpose. The avian Z and W sex chromosomes started to differentiate from an autosomal state 100–170 Mya (Handley et al. 2004). Given this long time of independent evolution, any remaining sign of conservation between paralogous Z-linked and W-linked noncoding DNA would suggest functional constraints. However, the incidence of perfectly conserved sequence blocks in alignments of the chicken Z- and W-copy of each intron was not higher than in a permuted data set (Kolmogorov-Smirnov test, p > 0.2).

It has recently been observed in mammalian genomes (Hare and Palumbi 2003; Chamary and Hurst 2004) and in the chicken genome (Chicken Genome Consortium 2004) that sequence conservation in intron ends may extend beyond the splice acceptor and donor sites. However, in our untrimmed data set of the respective alignments of Z- and W-linked sequences from all species, we were not able to detect lower divergence in blocks 10, 20, and 30 bp from exon-intron boundaries compared to the bulk interior parts of introns (χ2 test; p > 0.05). Although the analyses described above do not present evidence for a large effect of negative selection on intron sequences, we cannot exclude that some sequences have been evolving under such constraint. To be conservative in the analysis of mutation rate variation among introns, we therefore trimmed the data set by removing 20 bp in the 5′ and 3′ end of each intron and conserved blocks with a similarity of at least 90% according to Pipmaker analysis. The block lengths varied between 3 and 57 nucleotides; a total of 576 bp was in this way removed from the complete data set (17,270 bp). However, using the trimmed data set, the earlier conclusions basically remained unchanged. There was significant variation in KZ (G15 = 37.6, p = 0.0010) and in KW (G15 = 44.0, p = 0.0001) as well as in log αm among introns (ANOVA: F13, 14 = 2.61, p = 0.04) (Table 1).

Using the trimmed data set (similar results were obtained with the untrimmed data set), two-way ANOVA analyses showed significant intron and lineage effects for substitution rates among Z-linked (lineage: F4,86 = 11.0, p < 0.001; intron: F9,86 = 1.98, p = 0.05) as well as W-linked introns (lineage: F4,86 = 2.90, p = 0.03; intron: F9,86 = 4.26, p = 0.0001). Moreover, there were significant intron and lineage effects on estimates of the male mutation bias (lineage: F4,85 = 2.76, p = 0.03; intron: F9,85 = 2.80, p = 0.0006). The phylogenetic relationship among the species included in this study approximately follows a star-like phylogeny (Dimcheff et al. 2002), although our tree (Fig. 1) gives significant support for positive internal branches. Relative rate tests of divergence estimates were therefore conducted both between all pairs of lineages in the phylogeny (Table 2) and using the phylogeny shown in Fig. 1 (Table 3). In both cases, quail displayed the highest substitution rates among the five lineages for Z as well as for W-linked sequences using the trimmed data set for estimating divergences. This may be consistent with a generation time effect on the molecular clock, quail being the smallest species of the five galliform birds analyzed. Also in both approaches, the male mutation bias was significantly higher in chicken (≈3.0) than in black grouse (≈2.1) and red-legged partridge (≈1.7) and significantly higher in quail (≈2.5) than in red-legged partridge (Tables 2 and 3).

Discussion

Previous studies of male-biased mutation have revealed variable and sometimes inconsistent estimates of αm for the same group of organisms. In humans, for instance, estimates have ranged between 1.7 and 6 (Shimmin et al. 1993a; Bohossian et al. 2000; Makova and Li 2002). In birds, point estimates of αm of between 1.7 and 6.5 have been reported (Ellegren and Fridolfsson 1997; Kahn and Quinn 1999; Carmichael et al. 2000; Fridolfsson and Ellegren 2000; Bartosch-Härlid et al. 2003; Axelsson et al. 2004). There are at least three possible methodological causes of such variation. One obvious explanation is that most estimates are associated with rather large confidence intervals (note that αm is obtained from the ratio of two divergence estimates), so the wide range of the point estimates is at least in part due to stochastic variation in substitution rate estimation. It follows that, for comparative purposes, meaningful estimates of αm need to be based on divergences estimated with low variance.

Second, ancestral polymorphism affects the analysis of male-biased mutation if the evolutionary distance between the taxa for which divergence is estimated is short (Makova and Li 2002). Specifically, if the preexisting levels of polymorphism differ between the chromosomal classes being analyzed and these levels are appreciable compared to levels of divergence, then αm estimates will be biased since the coalescence time (and thus divergence) for two sequences from one chromosomal class will tend to differ from that of two sequences from another class. This was noted in studies based on Y chromosome data in human and great apes (Makova and Li 2002) and W chromosome data in terminal branches of an extensive bird phylogeny (Bartosch-Härlid et al. 2003); both Y and W are typically low in genetic diversity (Sachidanandam et al. 2001; Berlin and Ellegren 2004). In the present study we consider the effect of ancestral polymorphism negligible since the pairwise distance between all taxa is of the order of 5–10%, which is high compared to diversity levels.

Third, data presented herein demonstrate that heterogeneity in substitution rates among genomic regions on the sex chromosomes can lead to significant variation in αm estimates. It has previously been noted that the male mutation bias is insufficient to explain the variation in substitution rates seen among and within individual chromosomes (McVean and Hurst 1997; Lercher et al. 2001). We extend these observations by showing that even with the analysis of gametologous sequences shared between sex chromosomes, an approach commonly used in αm estimation, significant variation in αm estimates is obtained. With the exception of large-scale analysis of genomic sequences from human-chimpanzee (Ebersberger et al. 2002; Lu and Wu 2005) and mouse-rat (Lercher et al. 2001; Malcom et al. 2003; Makova et al. 2004), most studies of male bias have been based on data from one or just a few gametologous regions shared between the sex chromosomes (e.g., Shimmin et al. 1993a; Chang et al. 1994; Agulnik et al. 1997). It is thus possible that part of the variation in αm seen among these studies reflects mutation rate heterogeneity inherent to different genomic regions rather than underlying differences in the mutational sex bias.

Given the observed heterogeneity in estimates of the male mutation bias, is there a single meaningful evolutionary-based αm for each species (lineage)? On the one hand, large-scale approaches involving sequence data from many different genomic regions should reveal a genomic average for the male mutation bias. On the other hand, however, just as sex-specific recombination rates vary over the genome (Nachman 2002), so may sex-specific mutation rates. Clearly, this is true for hot spot sites where differences in methylation levels between sexes can lead to distinct differences in their rates of mutation (Wilkin et al. 1998; Girard et al. 2001; Trappe et al. 2001). For comparative purposes, e.g., in tests of the correlation between the male mutation bias and various life history parameters, this suggests that either of two approaches should be taken. If it is possible to retrieve large numbers of sequence data (for instance, if the genome sequence is available for one of the species), it would be preferable to include data from many genomic regions rather than doing more extensive sampling from one or just a few regions. Alternatively, for less well-characterized genomes, comparative studies should be based on data from the same genomic region(s) analyzed in all species (cf. Bartosch-Härlid et al. 2003).

One important observation in this study was that gametologous introns did not seem to perform better in αm estimation than random combinations of introns from Z and W. Furthermore, the absence of a correlation between the substitution rates of gametologous introns suggests that the pattern of mutation rate heterogeneity is not conserved in these paralogous sequences. Deterministic mutation rate variation has been observed over much shorter evolutionary distances, e.g., for orthologous sequences along the human and chimpanzee lineages (Smith et al. 2002). Whatever factors govern repeatability in patterns of regional mutation rate variation, avian sex chromosome evolution may represent too long a time of divergence (>100 million years) for such effects to remain or, at least, to be strong enough to be detectable in our analysis. Based on the assumption that gametologous noncoding sequences are subject to similar patterns of mutation, it has been argued that they are preferable for αm estimation (Shimmin et al. 1993a, b; Chang et al. 1994; Chang and Li 1995; Ellegren and Fridolfsson 1997; Kahn and Quinn 1999; Fridolfsson and Ellegren 2000). Our analysis does not support this assumption, however, which may make it easier to gather larger sets of sequence data from sex chromosomes of less well-characterized genomes.