Introduction

Several mechanisms cause mutations, including damage to DNA due to mutagens and replication errors (Griffiths et al. 1996). If mutations are caused by mistakes during DNA replication, genetic material that has been replicated many times should have a higher mutation rate than that replicated a few times. In mammals, DNA replication occurs more often in the male germ line than the female germ line, so the male germ line may be a major source of mutations relative to the female germ line, especially in species with a long pre-reproductive time, such as humans (Haldane 1947). “Male-driven evolution” is the idea that male-originating mutations drive the mutational aspect of evolution. Human genetic diseases that often arise from paternal point mutations provide evidence for male-driven evolution (Jung et al. 2003). According to Miyata and colleagues (1987), the male-to-female mutation rate ratio (α) can be calculated by comparing the mutation rates of genes on autosomes, X chromosomes, and Y chromosomes, because the chromosomes are passed through the male germ line with different frequencies.
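The reasoning behind the Miyata et al. (1987) comparison can be sketched as follows. The sketch assumes a 1:1 sex ratio and uses μ_m and μ_f, symbols introduced here for illustration only, for the male and female mutation rates. A Y chromosome is transmitted only through males, an X chromosome spends one third of its time in males and two thirds in females, and an autosome spends half of its time in each sex.

```latex
\begin{aligned}
Y &= \mu_m, \qquad
X = \tfrac{1}{3}\mu_m + \tfrac{2}{3}\mu_f, \qquad
A = \tfrac{1}{2}\mu_m + \tfrac{1}{2}\mu_f, \qquad
\alpha = \frac{\mu_m}{\mu_f}, \\[4pt]
\frac{Y}{X} &= \frac{3\alpha}{\alpha + 2}, \qquad
\frac{Y}{A} = \frac{2\alpha}{\alpha + 1}, \qquad
\frac{X}{A} = \frac{2(\alpha + 2)}{3(\alpha + 1)}.
\end{aligned}
```

Under these assumptions Y/X is bounded above by 3 and Y/A by 2, so ratios near those limits correspond to very large values of α.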

In mice, DNA in the male germ line goes through approximately two times as many replications as DNA in the female germ line, assuming a male reproductive age of 5 months (Chang et al. 1994). If replication is the main source of new mutations, α should be approximately 2. Indeed, some comparisons of X, Y, and autosomal sequences have led to a calculation of α in rodents ranging from approximately 2 (Chang and Li 1995; Chang et al. 1994; Makova et al. 2004) to 3.5 (Malcom et al. 2003). In humans, α is expected to be approximately 6.2, if the average male reproductive age is 20 (Chang et al. 1994). Human α has been estimated from comparisons of Y, X, and autosomal sequences to be approximately 5 (Chang et al. 1996; Huang et al. 1997; Makova and Li 2002; Shimmin et al. 1993). The similarity of α estimated using germ line replications and substitution rate is appealing, but there is disagreement about whether the similarity is meaningful. The average human male reproductive age is likely to be closer to 30 than 20 (Tremblay and Vezina 2000), and there may be other effects confounding the estimation of α (Hurst and Ellegren 1998).

A recent study comparing a large, closely related region of the human X and Y chromosomes containing no genes (Bohossian et al. 2000) provoked controversy by concluding that α in humans is only approximately 1.7. Even after making some corrections for ancestral nucleotide diversity, α was still much smaller than in previous studies (Bohossian et al. 2000). Bohossian and colleagues (2000) interpreted this to mean that, in humans, the high number of germ cell divisions in males may not cause as many mutations as previously thought. Possibly these long, gene-free sequences allowed a more accurate estimation of α than was possible in previous studies, or humans have a lower α than other primates (Bohossian et al. 2000). Alternatively, comparisons of sequences that are closely related, in this case sequences that have diverged after a recent translocation, may yield a lower α as a result of factors unrelated to the male mutation rate and, therefore, may not be suitable for studies of male-driven evolution (Li et al. 2002).

In another study, Makova and Li (2002) found that α calculated from other closely related primate species, in addition to humans, was lower than α from more distantly related species. They attributed the low ratio to error caused by ancestral nucleotide polymorphism. Because observed branch lengths are actually the combination of any polymorphisms retained from the ancestral lineages and new mutations occurring after two lineages split, the mutation rate for a branch is really the observed rate minus the ancestral polymorphism (Li 1977). Corrections for ancestral polymorphism have been considered by several authors (including Axelsson et al. 2004; Bartosch-Harlid et al. 2003; Bohossian et al. 2000; Makova and Li 2002), in studies of male-driven evolution. When Makova and Li (2002) corrected for ancestral nucleotide polymorphism by subtracting ancestral polymorphism from the mutation rate of autosomal sequences in closely related primates, the corrected α values were comparable to α values calculated from distantly related primates. Interestingly, Bohossian et al. (2000) also corrected their data to consider ancestral polymorphisms, but this correction did not make a great difference in their results. In addition, the standard error of Makova and Li’s results (2002) was high, as is the case with most studies of male-driven evolution. From these two studies, it is still not clear how closely related sequences are useful in studies of male-driven evolution.

If Makova and Li’s (2002) hypothesis that the low values of α seen in their study and in Bohossian et al.’s (2000) study are due to ancestral polymorphism in closely related sequences is correct, the low values of α should also be reproducible in similar studies of other mammalian lineages. Previous studies in rodents have compared X and Y genes in relatively distantly related species such as laboratory mouse and rat (Chang and Li 1995; Chang et al. 1994; Smith and Hurst 1999) (which diverged about 23 million years ago; Adkins et al. 2001), or mouse, human, and horse (Agulnik et al. 1997). To address the possible confounding effects of closely related species on male-driven evolution, we sequenced and analyzed Jarid1d (formerly Smcy; Agulnik et al. 1994a) from the Y chromosome and Jarid1c (formerly Smcx; Agulnik et al. 1994b) from the X chromosome, in species of the murine rodent genus Mus (family Muridae, subfamily Murinae).

Sequences such as Jarid1d and Jarid1c that were once homologous gene pairs, but now reside in regions of the X and Y chromosomes that do not recombine with each other, are useful for studying male-driven evolution because their mutation rates may be influenced mainly by the chromosomes they are on. In this study, intron sequences of Jarid1c and Jarid1d were compared in a phylogenetic context to determine the neutral rates of evolution in the two sexes. In our analyses, we obtained results that are similar to Makova and Li’s (2002).

Materials and Methods

Species

Some of the most commonly studied species of the genus Mus were used (see Lundrigan et al. 2002, for collecting information). The genera Mus, Mastomys, and Hylomyscus are members of the rodent family Muridae and the subfamily Murinae, the Old World mice and rats. Members of the Praomys group (Hylomyscus alleni and Mastomys hildebrandtii) were used as outgroups in this study. DNA hybridization and nuclear DNA sequence data suggest that the Praomys group is sister to Mus (Catzeflis and Denys 1992; Chevret et al. 1994; Jansa and Weksler 2004) and diverged from Mus approximately 8 million years ago (mya) (Chevret et al. 1994).

Based on molecular clock estimates, the genus Mus diverged 10 to 8 mya (Chevret et al. 2005) and the subgenera Coelomys (represented by Mus pahari) and Mus (represented by M. caroli, M. cookii, M. cervicolor, M. spretus, M. spicilegus, M. macedonicus, and the subspecies of M. musculus) diverged 8 to 6.7 mya (Chevret et al. 2005). The divergence of the lineage composed of M. caroli, M. cookii, and M. cervicolor occurred approximately 1.9 to 1.6 mya (She et al. 1990) or 4 mya (Chevret et al. 2005). The divergence of lineages giving rise to M. spretus, M. spicilegus, and M. macedonicus occurred about 1.29 to 0.93 mya (She et al. 1990) or 2 mya (Chevret et al. 2005). M. spicilegus and M. macedonicus diverged from each other 0.29 to 0.17 mya (She et al. 1990). The subspecies of M. musculus (M. musculus musculus, M. m. domesticus, M. m. castaneus) diverged about 0.9 mya (Boursot et al. 1996) or 0.35 mya (She et al. 1990).

Amplification of the Target Sequences

PCR amplification of genomic DNA and sequencing of the products were performed as described (Sandstedt and Tucker 2004). All sequences have been deposited in GenBank (Accession numbers AY260478–AY260503).

Analysis

Jarid1d and Jarid1c nucleotide sequences including introns and exons were aligned using Clustal X (Thompson et al. 1997). Intron alignments were checked by realigning groups of three taxa using the MCALIGN program (Keightley and Johnson 2004), which is particularly good at aligning intron sequences (Chamary and Hurst 2004). The complete alignments were realigned by hand using the Clustal X and MCALIGN results as a guide. These sequences were combined with a subset of the sequences from four other nuclear genes from Lundrigan et al. (2002): Sry, B2m, Zp-3, and Tcp-1, in a likelihood analysis published separately (Tucker et al. 2005). The resulting tree was used as a constraint tree for the intron analyses.

Aligned intron sites containing gaps were removed from the alignment before calculating branch lengths. Nucleotides within 20 base pairs of exon/intron boundaries were also removed. Intron distances along the constraint topology were determined using the Tajima-Nei model (1984). These distances were mapped to branches with PAUP*4.0b10 (Swofford 2002) using the unweighted least squares option. Using these distances, a Y/X ratio was calculated for all branches, for internal branches only, for external (terminal) branches only, and for external branches with longer branches removed (Fig. 1). Variances were calculated with the delta technique as in Makova and Li (2002), using a simplification of the Tajima-Nei variance that reduces to the large sample variance for the Jukes-Cantor model (Kimura and Ohta 1972; Nei and Kumar 2000 p. 39). This implementation assumes no covariance between the X and Y sequences. The variance of Y is V(Y) = Y (1 − Y) / [L (1 − 4Y / 3)²] and the variance of X is V(X) = X (1 − X) / [L (1 − 4X / 3)²], with L equal to the length of the sequence. The variance of Y/X is V(Y/X) = V(Y) / E(X)² + E(Y)² V(X) / E(X)⁴. The male:female mutation rate ratio, α, was calculated using the formula Y/X = 3α / (α + 2) (Miyata et al. 1987). The 95% confidence interval was estimated following Huang et al. (1997): the lower bound of the confidence interval for Y/X is Y/X− = Y/X − 1.96s and the upper bound is Y/X+ = Y/X + 1.96s, where s is the standard error of Y/X. The 95% confidence interval of α was calculated by substituting Y/X− and Y/X+ into the rate ratio formula.
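The calculation described in this paragraph can be illustrated with a minimal Python sketch. It is not the code used in the study. The function names are hypothetical, and allowing separate alignment lengths for the X and Y sequences, as well as returning infinity when Y/X reaches 3, are assumptions made for illustration.

```python
import math


def jc_variance(p, length):
    """Large-sample Jukes-Cantor variance: p(1 - p) / [L (1 - 4p/3)^2]."""
    return p * (1.0 - p) / (length * (1.0 - 4.0 * p / 3.0) ** 2)


def ratio_variance(y, x, length_y, length_x):
    """Delta-technique variance of Y/X, assuming no covariance between X and Y.

    Separate lengths for Y and X are an assumption of this sketch.
    """
    v_y = jc_variance(y, length_y)
    v_x = jc_variance(x, length_x)
    return v_y / x ** 2 + (y ** 2) * v_x / x ** 4


def alpha_from_ratio(r):
    """Invert Y/X = 3a / (a + 2) to a = 2R / (3 - R).

    Returning infinity for R >= 3 is a convention chosen for this sketch.
    """
    if r >= 3.0:
        return math.inf
    return 2.0 * r / (3.0 - r)


def alpha_with_ci(r, se, z=1.96):
    """Return (alpha, lower, upper) by substituting R - z*se and R + z*se."""
    lower_ratio = r - z * se
    upper_ratio = r + z * se
    return (alpha_from_ratio(r),
            alpha_from_ratio(lower_ratio),
            alpha_from_ratio(upper_ratio))


if __name__ == "__main__":
    # Y/X and its standard error as reported in Results for external branches.
    alpha, lower, upper = alpha_with_ci(1.642, 0.340)
    print(f"alpha = {alpha:.3f} (95% CI: {lower:.3f} to {upper:.3f})")
```

With the external-branch values reported in Results (Y/X = 1.642 ± 0.340), the sketch gives α of about 2.42 with an interval of roughly 0.96 to 6.68. The interval is asymmetric because α is a nonlinear function of Y/X.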

Figure 1
A higher substitution rate was found in Jarid1d than in Jarid1c from Mus. Intron distances calculated using the Tajima–Nei method were mapped to the constraint tree using unweighted least squares in PAUP*4.0b10. Jarid1d/Jarid1c distances (×100) are given above branches. Nodes are labeled with bold-face numbers.

Ancestral polymorphism can be corrected for by subtracting an estimate of ancestral polymorphism from the mutation rate of an X chromosomal or autosomal gene. Because the Y chromosome has very low polymorphism (reviewed by Charlesworth and Charlesworth 2000; Hellborg and Ellegren 2004), this subtraction should not be applied to mutation rates of Y-linked genes (Makova and Li 2002). If the correction for ancestral polymorphism is not made, the X or autosomal mutation rate will be too large, causing Y/X or Y/A to be too small. When α is calculated from these uncorrected ratios, it will also be too small.

We made several types of corrections to account for the effect of ancestral nucleotide polymorphism on closely related species, following Makova and Li (2002). One correction was to calculate α using only internal branches, which diverged a longer time ago than the external branches. In Makova and Li’s study (2002), external branches represented closely related, recently diverged species. The external mouse lineages we studied may have undergone greater divergence, so we made an additional, similar correction using evolutionary distances as the measurement of “close” relationships.

Makova and Li (2002) also made a pairwise correction using pairs of closely related species, and we calculated α using this correction as well. Branch lengths along the shortest path between the two species in each pair were added together for Jarid1c and Jarid1d, separately. An estimate of ancestral nucleotide polymorphism was subtracted from the total branch length for each Jarid1c distance. We used the average nucleotide polymorphism (π = 0.078%) from four X chromosomal genes in Mus domesticus (Nachman 1997) as an approximation of ancestral nucleotide polymorphism for Jarid1c. The corrected Jarid1c branch lengths were used in the equations above to calculate α.
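The pairwise correction can be sketched in Python as follows. This is a minimal illustration, not the procedure as implemented in the study. The function names and the example path lengths are hypothetical, distances are expressed as proportions rather than percentages, and treating a non-positive corrected X distance as an infinite ratio is an assumption of the sketch.

```python
import math

# 0.078%, the average pi from Nachman (1997) quoted in the text, as a proportion.
PI_ANCESTRAL = 0.00078


def corrected_ratio(y_path, x_path, pi=PI_ANCESTRAL):
    """Y/X after subtracting ancestral polymorphism from the X path length only.

    The Y path length is left uncorrected, as described in the text.
    """
    x_corrected = x_path - pi
    if x_corrected <= 0:
        # Assumption of this sketch: no finite ratio can be formed.
        return math.inf
    return y_path / x_corrected


def alpha_from_ratio(r):
    """Invert Y/X = 3a / (a + 2); infinity for R >= 3 is a sketch convention."""
    if r >= 3.0:
        return math.inf
    return 2.0 * r / (3.0 - r)


if __name__ == "__main__":
    # Hypothetical path lengths for one species pair (NOT data from the study).
    y_path, x_path = 0.012, 0.008
    uncorrected = y_path / x_path
    corrected = corrected_ratio(y_path, x_path)
    print("uncorrected alpha:", alpha_from_ratio(uncorrected))
    print("corrected alpha:  ", alpha_from_ratio(corrected))
```

For these made-up lengths, the correction raises α from about 2.0 to about 2.5. Because π is subtracted from a short X distance, the effect grows rapidly as the species pair becomes more closely related.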

Results

Synapomorphies for Jarid1c and Jarid1d in the same species were rare, and only one base long, so gene conversion is unlikely to affect this analysis (Slattery and O’Brien 1998). Most Jarid1d branch lengths were longer than Jarid1c branch lengths (Fig. 1). Y/X varied greatly among branches, ranging from 0 to 20.488 (Table 1). As noted by Chang et al. (1996), widely varying Y/X ratios may occur because of random fluctuations due to a small number of observed differences. Following their lead, analyses were based on groups of branches. Adding together all of the branch lengths, Y/X = 1.669 ± 0.302 and α = 2.508 (95% CI: 1.210 to 6.114). As a correction for ancestral nucleotide polymorphism, we calculated α for external and internal branches separately. Y/X for external branches = 1.642 ± 0.340 and α = 2.418 (95% CI: 0.963 to 6.678). For internal branches, Y/X = 1.726 ± 0.484 and α = 2.709 (95% CI: 0.699 to 16.436) (Table 2).

Table 1 Branch lengths and Y/X calculated for external and internal branches
Table 2 Alpha calculated for subsets of data including type of branch (external or internal) and branch length

α was also calculated for the external branches without the outgroup species, Hylomyscus alleni and Mastomys hildebrandtii. In this case Y/X = 1.642 and α = 2.418 (95% CI: 0.963 to 6.678). In addition, to examine closely related sequences by degree of relationship, α was calculated from external branches, after sequentially removing both X and Y branches when Jarid1d lengths were longer than 1.0, 1.5, 2.0, 2.5, or 3.0% (Table 2).
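The sequential removal of long branches can be sketched in Python as below. The function names are hypothetical and the branch lengths in the example are made-up placeholders, not values from Table 1. The sketch only illustrates the filtering rule stated above, in which both the X and the Y length of a branch are dropped when its Jarid1d length exceeds the threshold.

```python
def alpha_from_ratio(r):
    """Invert Y/X = 3a / (a + 2); infinity for R >= 3 is a sketch convention."""
    if r >= 3.0:
        return float("inf")
    return 2.0 * r / (3.0 - r)


def pooled_alpha(branches, max_y_percent):
    """Pool external branches whose Jarid1d (Y) length is within the threshold.

    branches: list of (y_length, x_length) pairs, both in percent.
    Returns None when no usable branches remain.
    """
    kept = [(y, x) for y, x in branches if y <= max_y_percent]
    y_sum = sum(y for y, _ in kept)
    x_sum = sum(x for _, x in kept)
    if x_sum == 0:
        return None
    return alpha_from_ratio(y_sum / x_sum)


if __name__ == "__main__":
    # Hypothetical (Y, X) external branch lengths in percent, for illustration.
    demo = [(0.5, 0.4), (1.2, 0.6), (2.8, 1.5), (3.4, 1.6)]
    for threshold in (1.0, 1.5, 2.0, 2.5, 3.0):
        print(threshold, pooled_alpha(demo, threshold))
```

Pooling the lengths before taking the ratio, rather than averaging per-branch ratios, follows the grouping of branches described above and limits the influence of branches with very few observed differences.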

Five pairs of species were chosen to sample different parts of the tree topology for pairwise corrections. The species compared were as follows: M. spicilegus/M. macedonicus, M. cookii/M. cervicolor, M. domesticus/M. m. musculus, M. spretus/M. domesticus, Hylomyscus alleni/Mastomys hildebrandtii. The Jarid1c branch length for each pair was corrected by subtracting an estimate of ancestral nucleotide polymorphism. Y/X for each species pair was calculated by dividing the total Jarid1d branch length by the corrected Jarid1c branch length. The corrected Y/X ratios ranged from 0 to 5.901, and α ranged from 0 to infinity (data not shown).
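The unbounded upper end of this range follows from the form of the Miyata et al. (1987) relation. As a short worked step, writing R for an observed Y/X ratio (a symbol introduced here for convenience):

```latex
R = \frac{3\alpha}{\alpha + 2}
\;\Longrightarrow\;
\alpha = \frac{2R}{3 - R},
\qquad
\alpha = 0 \ \text{when}\ R = 0,
\qquad
\alpha \to \infty \ \text{as}\ R \to 3^{-}.
```

Under the model R cannot exceed 3, so an observed ratio at or above 3, such as the maximum of 5.901, is not compatible with any finite α. This would account for estimates of α reaching infinity.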

Discussion

The male-to-female mutation rate in rodents was previously estimated from Y/X comparisons to be approximately 2 (Chang and Li 1995; Chang et al. 1994). Our uncorrected results are in agreement with those studies and with α predicted by the number of cell divisions in rodent germ lines (Chang et al. 1994). These results are also similar to a previous study using Jarid1d (Smcy) and Jarid1c (Smcx) in mouse, human, and horse, which found an α of 3 (Agulnik et al. 1997). That estimate may be higher than ours because it includes human and horse, both of which have longer generation times than mouse.

The results of this study support the conclusion of Makova and Li (2002) and Li et al. (2002) that analyses of closely related species produce low estimates of α. In this study, α calculated from external branch lengths was only slightly lower than α calculated from all branch lengths, and α calculated from internal branch lengths was higher than α calculated from all branch lengths. We noted that some of the branch lengths in our study were long relative to branch lengths in Makova and Li’s (2002) study of primates, which were all less than 2.5%. We addressed this by (1) removing the outgroup taxa, which had long branches, from the external branch length calculation, and (2) removing other long external branches (Table 2). When distantly related external branches were removed from the analysis, the estimate of α dropped; if outgroup taxa were removed, α was 1.524 (95% CI: 0.614 to 3.404), and if both outgroups and the most distantly related Mus species were removed, α was 0.937 (95% CI: 0.361 to 1.885). This suggests that examining only closely related species causes estimates of the extent of male-driven evolution to be too low relative to expected values. However, the 95% CIs of these estimates overlap. In Makova and Li’s (2002) study, the 95% CIs of α calculated from external and internal branches also overlap.

Correcting for ancestral nucleotide polymorphism (π) by subtracting an estimate of π from the Jarid1c branch lengths between pairs of species resulted in different estimates of α, depending on which species pairs were used. Makova and Li (2002) estimated ancestral nucleotide diversity from polymorphism data. Polymorphism data were not available for Jarid1c, so the average π for four X chromosomal genes in a random sample of 10 Mus domesticus, 0.078% (Nachman 1997), was used. This estimate of π may not be accurate for this purpose since it is an average calculated from four X chromosomal genes (range: 0–0.160) (Nachman 1997). Additionally, ancestral Mus populations may have had different levels of polymorphism from extant species. Estimates of α for species pairs varied widely depending on which species were compared, probably due to random factors related to small samples. The corrected values also varied widely, and they neither support nor refute the utility of correcting for ancestral nucleotide polymorphism in this way.

Despite evidence from plants (Filatov and Charlesworth 2002; Whittle and Johnston 2002), birds (Carmichael et al. 2000; Ellegren and Fridolfsson 1997; Garcia-Moreno and Mindell 2000; Kahn and Quinn 1999), and fish (Ellegren and Fridolfsson 2003; Zhang 2004) as well as from mammals, the precise portion of mutations that is due to male gametogenesis is not known. Different autosomes evolve at different neutral rates, which would not be expected if male-driven evolution were the only force at work (Lercher et al. 2001). Li et al. (2002) point out that autosome-specific mutation rates can be caused by replication-dependent (Birdsell 2002; Bohr et al. 1987; Wolfe et al. 1989) and replication-independent (Birdsell 2002; Kumar and Subramanian 2002; Petes 2001) mechanisms. If recombination in mammals is mutagenic (Filatov 2004; Filatov and Gerrard 2003; Hellmann et al. 2003; Lercher and Hurst 2002; Perry and Ashworth 1999; but see also Yi et al. 2004), it will lower estimates of male-driven evolution made using Y chromosomal sequences, because the Y does not recombine (Li et al. 2002). It is not clear which mutational mechanisms are responsible for which portion of the observed mutation rate. In addition, there is new evidence that some mitotically active germ cells exist within the adult female mouse ovary, so the female germ line may continue to replicate after birth like the male germ line does (Johnson et al. 2004). If this surprising result is confirmed, more mutations may come from the female germ line than previously supposed.

The meaning of neutral rate differences of Y and X chromosomal genes is not completely clear. Possibly a high rate of evolution of Y genes relative to X genes means that the X chromosome evolves slowly, for reasons unrelated to which sex it is in (McVean and Hurst 1997). The conclusion that the X chromosome has a low mutation rate seems to depend on the amount of ancestral polymorphism assumed (Ebersberger et al. 2002). Since male-driven evolution is also seen in birds, in which males are the homogametic sex (Carmichael et al. 2000; Ellegren and Fridolfsson 1997; Garcia-Moreno and Mindell 2000; Kahn and Quinn 1999), the difference in the mutation rates of the X and Y is probably not due to a slow mutation rate on the X as a consequence of homogamy. However, this issue remains controversial (Malcom et al. 2003; Smith and Hurst 1999). A low X chromosomal mutation rate was not responsible for the results of Makova and Li (2002), since they found evidence for male-driven evolution by comparing ancestrally homologous sequences from the Y and an autosome, without using X chromosomal sequences. A low mutation rate on the X chromosome may remain a factor in X/Y comparisons such as those reported here.

This study adds to the several studies supporting a male-to-female mutation rate in rodents of approximately 2 when calculated using X/Y comparisons (Chang and Li 1995; Chang et al. 1994) and shows that this result is dependent on the analysis of relatively distantly related species, as predicted by Makova and Li (2002).