Introduction

In mammals, the vomeronasal organ (VNO) of the olfactory system is a chemosensory organ specialized in the detection of pheromones, chemical signals that induce innate reproductive and social behaviors between members of the same species (Halpern 1987; Wysocki and Lepri 1991; Keverne 1999). It is widely believed that pheromone perception is initiated by the interaction of chemical molecules with G protein-coupled receptors (GPCRs) in the VNO, and these candidate pheromone receptors are encoded by two distinct and complex superfamilies named V1R and V2R (Dulac and Axel 1995; Herrada and Dulac 1997; Matsunami and Buck 1997; Ryba and Tirindelli 1997). Expression of V1R genes is restricted to Gαi2-expressing sensory neurons of the apical zone of the VNO epithelium, whereas expression of V2R genes is exclusive to Gαo-expressing neurons of the basal zone of the VNO epithelium (Berghard and Buck 1996; Jia and Halpern 1996). As in many other GPCR genes, V1Rs are characterized by an intronless coding region and a short extracellular N-terminus, while V2Rs have a large N-terminal domain (Dulac and Axel 1995; Herrada and Dulac 1997; Matsunami and Buck 1997; Ryba and Tirindelli 1997).

There are dramatic differences in the V1R repertoire among different mammalian lineages. The mouse genome contains about 300 V1R sequences classified into 12 gene families (Saito et al. 1998; Del Punta et al. 2000; Pantages and Dulac 2000; Rodriguez et al. 2002). About 53% of these are believed to be pseudogenes (Rodriguez et al. 2002). By contrast, the vast majority of V1R sequences in humans (∼200) are believed to be pseudogenes, with only 5 known intact ORFs (Lane et al. 2002; Zhang and Webb 2003). The difference in V1R repertoire is consistent with morphological differences among lineages: a functional VNO is absent in adult catarrhine monkeys and humans, whereas nonprimate mammals, including rodents, possess a functional VNO (Keverne 1999; Zhang and Webb 2003). Not surprisingly, this observation led to the suggestion that such differences in V1R repertoires might reflect lineage-specific and species-specific differences in pheromone-related social and sexual behaviors (Lane et al. 2002; Rodriguez et al. 2002).

Lane et al. (2002) analyzed three major mouse V1R clusters (located on chromosome 6D, 6.56/57, and chromosome 13, respectively) and suggested that they arose via an expansion event associated with the divergence of mice and rats. More recently, Lane et al. (2004) further analyzed two subfamilies in rats. Interestingly, both reports speculated that V1R expansion might contribute to differences in pheromone recognition and thus might have played a role in the process of rodent speciation. However, analyses of V1R genes from the mouse alone, or from only two rat subfamilies, provide a limited evolutionary perspective on this important family. We expected that a comparative analysis of this gene family in two closely related species would be a powerful way to further illustrate its molecular evolution, especially the role of adaptive evolution of these genes in rodents. Therefore, we used the recent release of the rat genome sequence (Rat Genome Sequencing Consortium 2004) to identify the V1R genes of the rat and conducted an analysis of the evolution of this gene family in mice and rats.

Methods

Data Mining and Sequence Quality Control

Previously identified V1R genes from mouse and human were used as queries to screen the rat genome sequences of the June 2003 assembly (http://genome.ucsc.edu) by using the program BLASTN or TBLASTN (Altschul et al. 1997).

Initially, we included all previously reported mouse genomic sequences in our mouse dataset. However, when we compared the mouse V1R genes previously identified from the Celera database by Rodriguez et al. (2002) with the latest assembly of the draft mouse genome (October 2003 assembly; http://genome.ucsc.edu), we found that some sequences from the new assembly contained frameshifts as a result of the insertion of one or two nucleotides, leading to interruption by stop codons. For example, the new sequence of V1RC19 differs from the old one in having an additional nucleotide (G) at position 872. Furthermore, some genes previously regarded as distinct actually represent only one genomic position in the new assembly. For instance, V1RB11 is the same as V1RB4 in the later assembly. The most plausible explanation is that these discrepancies are artifacts of the data handling procedures that have been resolved in the later assemblies. In fact, the V1RB11 gene is no longer present in the updated Celera database (Del Punta et al. 2002). We excluded 13 such problematic sequences from this study (V1RB5, V1RB6, V1RB11, V1RC1, V1RC4, V1RC13, V1RC19, V1RC22, V1RH1, V1RI8, V1RD1, V1RD2, V1RD18), although some real functional genes might also have been removed under this criterion.

Here we adopted the classification and nomenclature proposed by Rodriguez et al. (2002). Each V1R gene is denoted by one additional letter followed by a number. The letter represents the subfamily to which the gene belongs and the number designates the order of discovery of the genes in that subfamily. For instance, the notation rV1RA1 indicates that the sequence is from the rat (r) and is the first member identified as belonging to the A subfamily. The classification into subfamilies was based on the phylogenetic relationships and sequence homology among sequences using the criteria proposed by Rodriguez et al. (2002).
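The naming scheme above can be illustrated with a short parsing sketch. The function name and the regular expression are hypothetical and are not part of the original analysis; the sketch assumes names of the form optional lowercase species prefix, "V1R", one uppercase subfamily letter, and a member number.

```python
import re

# Hypothetical pattern: optional species prefix (e.g. "r" for rat),
# the literal "V1R", a subfamily letter, and a member number.
V1R_NAME = re.compile(r"^(?P<prefix>[a-z]?)V1R(?P<subfamily>[A-Z])(?P<number>\d+)$")

def parse_v1r_name(name):
    """Split a V1R gene name into (species prefix, subfamily, member number)."""
    match = V1R_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised V1R gene name: {name!r}")
    return match.group("prefix"), match.group("subfamily"), int(match.group("number"))

# rV1RA1: rat prefix, subfamily A, first member identified.
print(parse_v1r_name("rV1RA1"))
```

Names written without a prefix, such as V1RC19, return an empty prefix string; in this paper such names refer to mouse genes.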

Evolutionary Analyses

Deduced V1R amino acid sequences were initially aligned by CLUSTAL W (Thompson et al. 1994) and then the alignment was refined by manual adjustment. The nucleotide sequences were aligned according to the aligned amino acid sequences and used in subsequent tree reconstruction. The alignments are available from the authors on request. The mouse and human T2R genes, members of the bitter taste receptor gene family, were used as outgroups. The phylogenetic relationships of the V1R gene sequences were reconstructed by using the neighbor-joining method (Saitou and Nei 1987) under a variety of models (uncorrected P-distance, K2P [Kimura 1980], JC69 [Jukes-Cantor 1969], and HKY [Hasegawa et al. 1985]) with the program MEGA (Kumar et al. 2001). In addition, a more sophisticated clustering procedure, BioNJ, was applied under various models of evolution, with parameters including the proportion of invariable sites (I) and a gamma distribution (γ), using the program Phyml (Guindon and Gascuel 2003) to construct the phylogenetic trees. Since all these analyses gave similar results, only NJ trees under JC69 are presented here. The reliability of the trees obtained was evaluated by the bootstrap method (Felsenstein 1985) with 1000 pseudo-replications. Sawyer’s (1989) method was used to examine gene conversion among paralogous genes. We used the two-cluster test of Takezaki et al. (1995) to examine the molecular clock hypothesis. The linearized-tree method of Takezaki et al. (1995) was employed to estimate the relative divergence times of each V1R subfamily.
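As an illustration of the distance measures named above, the following minimal sketch computes the uncorrected P-distance between two aligned nucleotide sequences and applies the standard JC69 correction, d = −(3/4) ln(1 − 4p/3). It is not the MEGA implementation; the function names are hypothetical, and for simplicity the sketch skips gapped or ambiguous positions pair by pair rather than removing them from the whole alignment.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)

def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        raise ValueError("JC69 distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

p = p_distance("ACGTACGT", "ACGTACGA")  # one difference in eight sites
print(p, jc69_distance(p))
```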

To examine the pattern of nucleotide substitution, the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN) were estimated by the maximum likelihood method using the codeml program of the PAML package (Yang 1997). Note that those V1R genes involved in gene conversions were excluded from this analysis, because such sequences can increase the rate of false positives when testing for positive selection (Anisimova et al. 2001). Two kinds of codon-substitution models were employed in this paper. First, we used “site-specific” models to analyze each of the subfamilies or clusters whose origins were potentially associated with the divergence of the mouse and rat lineages. Here we employed the recommended models (M0, M3, M7, and M8) (Anisimova et al. 2001, 2002). M0 is the “one-ratio” model, having one ω (dN/dS) parameter for all sites in the data. M3 is the “discrete” model, having a discrete number of site categories, each with an independent ω parameter. In this case we used M3 with k = 3 site categories. M7 (k = 10) is the “beta” model, as it uses a beta distribution to model variable selection pressure among sites. M8 (k = 11) is the “beta & ω” model, an extension of M7 that allows for an extra ω parameter that can be >1. Likelihood ratio tests (LRTs) were conducted to examine whether selection pressure varied among sites and whether a portion of sites was under positive selection.
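The LRTs described above can be sketched as follows. The statistic is twice the difference in log-likelihood between the alternative and null models, compared with a chi-square distribution. The degrees of freedom used here (4 for M0 versus M3 with k = 3, 2 for M7 versus M8) are the conventional values for these nested comparisons and are stated as an assumption, since the text does not list them. The log-likelihood values in the example are hypothetical, and the function names are illustrative only.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for df = 1 or any even df (closed forms)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch handles only df = 1 or even df")

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return the LRT statistic 2*(lnL_alt - lnL_null) and its chi-square P value."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf(statistic, df)

# Hypothetical log-likelihoods for an M7 (null) versus M8 (alternative) comparison.
statistic, p_value = likelihood_ratio_test(-1000.0, -990.0, df=2)
print(statistic, p_value)
```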

Second, we employed “branch-site” models of Yang and Nielsen (2002) to test for positive selection at sites along prespecified sets of branches. We conducted this type of analysis on subfamily A because this family contained large samples of mouse and rat sequences, allowing tests for episodes of positive Darwinian selection following both gene duplication events and speciation events within the same clade.

The branch-site models assume that selection pressure (ω) varies among sites and that, at a subset of these sites, selection pressure changes in a specified set of branches called the foreground (FG) branches. All other branches are referred to as background (BG) branches. Yang and Nielsen (2002) implemented two versions, called Model A and Model B. We chose to employ Model B because it imposes fewer restrictions on the data than Model A (Yang and Nielsen 2002).

LRTs were used to test for (i) divergent selection pressure at a fraction of sites in specific branches of the gene tree and (ii) an ω parameter greater than 1 at such sites in those branches of the gene tree. The first LRT compares a null model that permits variation in ω among sites but not branches (M3; k = 2) with a model that permits variable ω among sites and between FG and BG branches (Model B). Note that this LRT only tests for divergent selection pressure in the FG branches. With regard to identifying positively selected sites, the error rate can be excessively large when a fraction of sites is evolving under strict neutrality, i.e., ω = 1 (Hayden et al. 2001; Anisimova et al. 2002; Zhang et al. 2004), a situation that this LRT does not distinguish from positive selection. To avoid false positives in such a case, we implemented a second LRT that compares a rigid null model in which the FG branches were constrained to neutral evolution (Model B; ω = 1) with a model in which positive selection is permitted in the FG branches (Model B; ω ≥ 1) (Bielawski et al. 2004). Only in cases where the second LRT was significant did we use an empirical Bayesian approach to predict which sites were most likely to have evolved under positive selection (Nielsen and Yang 1998).

Results

Identification of Rat and Mouse V1R Functional Genes

The lack of introns in V1R genes allowed us to use a BLAST-based approach to identify a large number of presumably functional V1R genes in the rat genome sequence. For the convenience of comparison with previously identified mouse sequences, we searched the rat draft genome sequences for new V1R genes using the same stringent criteria as those of Rodriguez et al. (2002). Only those ORFs longer than 810 bp or those containing seven intact transmembrane domains were considered as putative V1R genes. A total of 102 putative rat V1R functional genes were ultimately identified, including seven genes that had been reported previously (Dulac and Axel 1995) (Table 1). We also searched the latest draft mouse genome sequence and identified 62 putative V1R genes that were missed in previous reports (Saito et al. 1998; Del Punta et al. 2000; Pantages and Dulac 2000; Rodriguez et al. 2002) (Table 1). Thus, the final set of mouse putative functional genes used in this study comprised 62 newly identified genes and 125 previously reported genes. Although our findings, together with the 13 problematic sequences listed in the Methods section, indicate a real potential for both false-negative and false-positive annotation errors, the number of genes in each V1R subfamily in mouse and rat (Table 1) suggests that the newly discovered rat V1R genes may make up a large proportion of the members in most subfamilies.
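The ORF length criterion above can be sketched as a simple filter applied to each candidate genomic segment returned by a BLAST search. This is an illustrative sketch, not the pipeline used in the study: the function names are hypothetical, the ORF is assumed to start at ATG and to be measured from the start codon through the stop codon, and the alternative criterion of seven intact transmembrane domains is not modeled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_ORF_BP = 810  # length threshold quoted in the text

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_orf(seq):
    """Length in bp of the longest ATG-to-stop ORF in any of the six reading frames."""
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None
    return best

def passes_length_filter(seq, min_len=MIN_ORF_BP):
    """True if the candidate contains an ORF longer than min_len bp."""
    return longest_orf(seq) > min_len

candidate = "ATG" + "GCT" * 300 + "TAA"  # synthetic 906-bp ORF
print(longest_orf(candidate), passes_length_filter(candidate))
```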

Table 1 Numbers of genes of each V1R subfamily in mouse and rat

The mouse V1R repertoire comprised 12 subfamilies. The rat V1R repertoire consisted of 11 subfamilies, 10 of which were shared with the mouse. It lacked the H and I subfamilies found in mouse and possessed one unique subfamily (M), which comprised only one gene sequence. The new M gene sequence (rV1RM1) shows all features typical of the V1R family, including the fourteen amino acid residues conserved among 95% of mouse V1Rs and a potential N-linked glycosylation site in extracellular loop II conserved in all mouse V1Rs (Rodriguez and Mombaerts 2002). In addition, we found that a pseudogene sequence in the mouse genome has 83% similarity with rV1RM1 (data not shown). Bootstrap analysis also indicated strong support for a sister relationship between the M sequence and the clade comprising the A and B subfamilies (Fig. 1a). Taken together, these data strongly support the conclusion that the M sequence is a new component of the V1R family.

Figure 1

Neighbor-joining tree of 289 putatively functional V1R genes from the rat and mouse. After the removal of gaps, a total of 552 nucleotide sites were used in reconstructing the tree. Genetic distance was based on the Jukes–Cantor model. (a) Subtree of the A, B, and M subfamilies. The V1R genes and branches in the A subfamily were classified into six subsets. Each of them was separately specified as foreground (FG) in the branch-site model of Yang and Nielsen (2002). Subset a comprised the basal lineages of the A subfamily, exclusive of clusters 1 and 2. Subset b comprised the branches that immediately postdate the duplication event that gave rise to clusters 1 and 2 of the A subfamily. Subset c comprised the branches that immediately postdate the mouse and rat speciation event resolved within cluster 1. Subset d included all branches of cluster 2. Subset e consisted of all branches of the rat clade of cluster 1. Subset f consisted of all branches of the mouse clade of cluster 1. Percentage bootstrap values (≥50) are shown on interior branches. (b) Topology of the whole V1R gene family. The T2R genes are used as the outgroup. The bootstrap values of each branch separating the various subfamilies are indicated. A color figure is available as online Supplementary Material.

In mouse, the H and I subfamilies are distributed in a cluster on chromosome 13, which is homologous to rat chromosome 17. Surprisingly, no orthologs of the mouse H and I subfamilies were identified in our initial search of the rat genome sequences. However, when mouse V1RI genes were used as queries, a sequence segment with very high similarity was identified. The ORF of this segment contained a premature stop codon. Thus, it appears that the only rat sequence segment that is potentially orthologous to the I subfamily of mouse is likely to be a pseudogene. In the mouse genome, the H and I subfamily clusters are flanked by the histone 1 gene (NM_178198) and the Abt1 gene (NM_013924). Notably, the rat sequence segment identified above is flanked by the same two genes. Thus, it appears that at least one sequence ancestral to the I subfamily existed before the divergence of the mouse and rat lineages. When mouse H subfamily genes were used as queries, a sequence of high similarity was identified on rat chromosome 15 rather than 17. Hence, the origin of the H subfamily might predate the divergence of mouse and rat as well.

Phylogenetic Relationships of Functional V1R Family from Mice and Rats

Figure 1b shows a phylogenetic tree inferred from 289 putatively functional V1R genes of the rat and mouse. As a representative example, the subtree of subfamilies A, B, and M is presented in Fig. 1a. Subtrees of the other subfamilies are provided separately online as supplement 1. Phylogenetic analysis showed that the newly identified rat genes represent 12 evolutionary lineages with strong support. Moreover, these data reveal that mouse and rat sequences do not form two separate clusters in each subfamily. They appear intermingled with each other. This result suggests that many gene duplication events predated the separation of mice and rats. By contrast, some V1R genes formed species-specific clades. For example, within the A subfamily (Fig. 1a), the mouse V1R genes V1RA1, V1RA11, V1RA2, V1RA3, V1RA4, V1RA5, and V1RA6 were grouped together, and the rat V1R genes rV1RA3, rV1RA4, rV1RA5, rV1RA6, rV1RA7, and rV1RA8 formed another separate clade. These species-specific genes may be the result of gene conversions and/or postspeciation duplications. Gene conversions among paralogous genes were detected by Sawyer’s (1989) test for both the rat and the mouse. Among the 289 mouse and rat V1R genes, only 25 gene conversion events were detected, suggesting that gene conversion has been infrequent, has acted locally, and may not have played a major role in the evolution of V1Rs. Thus, most species-specific genes may have been formed by postspeciation duplications and might have species-specific functions or perhaps could relate to reinforcement of speciation, as has been suggested by Lane et al. (2002).

Relative Times of Expansion of Different Gene Families and Clusters

Since almost all V1R-like sequences found in humans seem to be pseudogenes (Pantages and Dulac 2000; Rodriguez et al. 2000; Giorgi et al. 2000; Lane et al. 2002), it is interesting to estimate the relative divergence times of the functional genes found in rodents. The two-cluster test of Takezaki et al. (1995) revealed that the molecular clock hypothesis could not be rejected in any case. A linearized tree for these data is shown in Fig. 2. It is clear from the topology that this family emerged between 80 million and 100 million years ago, during the proposed radiation of placental mammals (80–110 MYA; Springer et al. 2003), and that many subfamilies originated after the split of rodents and primates (about 75 MYA; Springer et al. 2003), indicating that they are rodent-specific genes. Figure 2 also reveals a wide range of divergence dates for the subfamilies, yet most clusters within different subfamilies underwent an expansion during a relatively short period of time.

Figure 2

Linearized tree for the V1R superfamily. The divergence times and nucleotide distances between different subfamilies are marked on the scale below the tree. The detailed divergence times of each cluster and subfamily are given in parentheses. The sizes of black triangles correspond to the numbers of genes within the clade.

It can be very useful to estimate dates for such events; however, estimating molecular dates for rodents is complicated by the need to employ calibration points from other lineages where there is significant rate heterogeneity (Adkins et al. 2001, 2003). As rodent V1R genes are evolving in a clock-like manner, it is possible to calibrate a local time scale of their evolution. Although in this case the relative ages obtained may be correct, the specific estimates of dates could be biased. We inferred such a local time scale by assuming a mean nucleotide substitution rate of 5 × 10⁻⁹ per site per year for rodent evolution (O’hUigin and Li 1992), which is very similar to the substitution rate estimated from the entire genomes (Rat Genome Sequencing Consortium 2004). Based on this calibration, many subfamilies underwent an expansion between 12 and 20 MYA (Fig. 2), very close to the mouse–rat split time. This estimate is consistent with that of Lane et al. (2002), which was obtained by analyzing mouse V1R gene-block duplications.
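The calibration described above amounts to dividing a branch length by the assumed rate. The sketch below illustrates the arithmetic only. It assumes that the quoted rate applies per lineage and that node heights in the linearized tree are expressed in substitutions per site from the node to the tips; neither detail is spelled out in the text, and the function names and the example heights are hypothetical.

```python
RATE_PER_SITE_PER_YEAR = 5e-9  # rate quoted in the text (assumed to be per lineage)

def node_age_mya(node_height, rate=RATE_PER_SITE_PER_YEAR):
    """Age in millions of years of a node whose height is given in substitutions per site."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return node_height / rate / 1e6

def pairwise_age_mya(distance, rate=RATE_PER_SITE_PER_YEAR):
    """Age implied by a pairwise distance, which spans two lineages (d = 2 * rate * t)."""
    return node_age_mya(distance / 2.0, rate)

# Hypothetical node heights chosen to bracket the 12-20 MYA window discussed in the text.
for height in (0.06, 0.10):
    print(height, node_age_mya(height))
```

Under these assumptions, node heights of 0.06 and 0.10 substitutions per site correspond to about 12 and 20 MYA, respectively.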

Adaptive Diversification After Gene Duplications in the V1R Family

Why did so many genes from different subfamilies undergo such a substantial expansion over a short time period? To help answer this question, we attempted to resolve the mode and tempo of selection pressure acting on the V1R gene family. First we focused on the clusters which underwent an expansion between 12 and 20 MYA (marked in Fig. 2) and separately estimated the distribution of selection pressures (ω) among sites in each cluster using the “site-specific” models. Estimates of the parameter values of the ω distribution under M3 (data not shown) and M8 (Table 2) indicate that a fraction of sites is under positive selection in each of the subfamilies tested. LRTs of M0 and M3 were significant and confirmed that selection pressure varied among sites (Table 2). Moreover, the LRTs of M7 and M8 were significant in all cases, indicating that some variation in selection pressure was due to the evolution of a subset of sites by positive Darwinian selection (Table 2).

Table 2 LRT of positive selection for the subfamilies and clusters expanded during the mouse–rat split

To more precisely resolve the mode of adaptive evolution, we performed a more detailed analysis of the genes in the A subfamily using “branch-site” models. As indicated in Fig. 1a, six nonintersecting subsets (a–f) of the A subfamily were specified as the foreground (FG) branches and the complement was specified as the background (BG) in each analysis. Parameter estimates under Model B indicated that a fraction of sites (1–9%) is evolving by positive selection in four of the six subsets of branches (Table 3). LRTs were performed to test the improvement obtained by allowing heterogeneous selection pressure between the BG and FG branches (Model B). Note that in this case the null model only allowed variable selection pressure among sites (M3; k = 2). Results (Table 3) indicated that a significant improvement was achieved by allowing heterogeneous selection pressures in branch subsets a, d, and e. Subset a consisted of the basal lineages of the A subfamily, subset d consisted of all branches in cluster 2, and subset e consisted of all branches of the rat clade of cluster 1 (Fig. 1a).

Table 3 Parameter estimates and likelihood scores for the A subfamily genes

The LRT above is a test of heterogeneous selection pressure between the FG and BG branches. We also wanted to test whether such heterogeneity was caused by positive selection in the FG branches. Hence, we conducted an LRT that compared a null hypothesis having sites in the FG branches constrained to a rigid neutral model (Model B with ω2 = 1) to an alternative hypothesis that permitted those branches to have values consistent with positive selection (Model B with ω2 ≥ 1). Table S1 contains the parameter estimates and likelihood scores under the null and alternative hypotheses. In each case (subsets a, d, and e), the result of this LRT was significant (Table S2), indicating that the estimates of ω2 in those cases were significantly larger than 1. Thus, the LRTs are consistent with positive selective pressure driving functional divergence of V1R genes.

Since the parameter estimates were consistent with positive selection, and significant LRTs were obtained, it was desirable to infer which sites are under positive selection in each subset. In this case we wanted to know if the target of positive selection had changed over evolutionary time in the A subfamily. We examined the posterior probabilities for site classes under Model B. At the P > 50% level, 10 sites were identified as potential targets of positive selection in subset a, and 11 sites in subset d (Table 4). None of the sites identified in subset d matched those in subset a, suggesting a dramatic shift in the target of positive selection. When 3D structures of this protein are available in the future, we can investigate whether these putative positively selected sites are located in the ligand binding regions.
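The posterior probabilities referred to above follow the usual empirical Bayes form, in which the maximum likelihood estimates of the model parameters are treated as known. As a sketch in generic notation (the symbols below are ours, not taken from the cited papers): let $p_k$ be the estimated proportion of sites in class $k$ and $f(x_h \mid k)$ the likelihood of the data at site $h$ given that the site belongs to class $k$.

```latex
P(k \mid x_h) \;=\; \frac{p_k \, f(x_h \mid k)}{\sum_{j} p_j \, f(x_h \mid j)}
```

A site is reported as a candidate target of positive selection when this probability, for a site class with ω > 1 in the FG branches, exceeds the chosen cutoff (here 50%).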

Table 4 Predicted positive selection sites for subsets of the V1R phylogeny inferred under Model B and two different methods of computing codon frequencies

Sampling errors are known to be high for ML parameter estimation under complex codon models (Anisimova et al. 2001, 2002). Our above results were obtained under the F3×4 model of codon frequencies, which uses the nucleotide frequencies at the three positions of the codon to compute the equilibrium codon frequencies (Goldman and Yang 1994). Because the reliability of Bayesian site identification depends on the reliability of the ML parameter estimates, it is important to assess the robustness of the results under different model assumptions. Therefore, we also estimated the parameters of Model B under a different model (F61) that uses empirical estimates of the 61 codon frequencies. Results of parameter estimation and LRTs were in good agreement under both models (data not shown), with Bayesian site identification being identical for subsets d, e, and f and only a few differences observed for subset a (Table 4).
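The F3×4 construction described above can be sketched as follows: nucleotide frequencies are tallied separately at the three codon positions, each sense codon receives the product of its three position-specific frequencies, and the products are renormalized over the 61 sense codons. This is an illustrative sketch rather than the codeml implementation; the function name and the toy input are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed

def f3x4_frequencies(codons):
    """Equilibrium frequencies of the 61 sense codons under the F3x4 construction."""
    if not codons:
        raise ValueError("at least one codon is required")
    # Count nucleotides separately at codon positions 1, 2, and 3.
    counts = [dict.fromkeys("ACGT", 0) for _ in range(3)]
    for codon in codons:
        for position, nucleotide in enumerate(codon.upper()):
            counts[position][nucleotide] += 1
    n = len(codons)
    position_freq = [{nt: c[nt] / n for nt in "ACGT"} for c in counts]
    # Product of position-specific frequencies for each sense codon.
    raw = {}
    for a, b, c in product("ACGT", repeat=3):
        codon = a + b + c
        if codon not in STOP_CODONS:
            raw[codon] = position_freq[0][a] * position_freq[1][b] * position_freq[2][c]
    total = sum(raw.values())
    return {codon: value / total for codon, value in raw.items()}

freqs = f3x4_frequencies(["ATG", "GCT", "GCA", "TTC"])  # toy input
print(len(freqs), round(sum(freqs.values()), 6))
```

By contrast, F61 would use the observed frequency of each of the 61 sense codons directly, which requires many more free parameters.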

Discussion

In this paper, we took advantage of the draft rat genome sequence to further study V1R gene family evolution in rodents. Using computational methods we identified 102 and 62 potentially functional V1R genes from the rat and mouse genomes, respectively. Combining these with previously identified rodent V1R genes (Saito et al. 1998; Del Punta et al. 2000; Pantages and Dulac 2000; Rodriguez et al. 2002), we conducted a comparative analysis in rodents. A new subfamily of V1R genes (M) that appears to be unique to the rat lineage and two subfamilies apparently unique to mouse (H and I) were identified. In addition, most of the V1R genes within different rodent subfamilies exhibit species-specific gene clusters, suggesting the presence of “species (lineage)-specific” vomeronasal pheromone receptors. Groups of these receptors may have evolved separately in different species to deal with the specific types of chemicals they encounter. Interestingly, species-specific clusters are also observed in the rodent T2R family, where it is thought that they could reflect the adaptive evolution of species-specific bitter taste receptors (Shi et al. 2003).

Relative divergence dates for rodent V1R subfamilies were variable, but no date was earlier than the divergence date of primates and rodents. Hence, V1R diversity in rodents reflects the interplay of evolutionary forces originating within the rodent lineage. Interestingly, if we assume the rate estimated by O’hUigin and Li (1992), a large number of clusters within subfamilies occur close to the divergence time of mouse and rat, with additional expansions having occurred recently within both mouse and rat (as recently as 9 MYA).

Lane et al. (2002) noted that expansion of three V1R loci sampled from the mouse genome appeared to coincide with the divergence of mouse and rat. A remarkable finding of this study was that expansion of most of the V1R gene clusters within different rodent subfamilies seemed to occur very close together in time. This suggested a gene family–wide process of expansion. To determine if positive Darwinian selection could have played a role during the evolution of these clusters, we measured selection pressure in these clusters using the dN/dS ratio (Yang and Bielawski 2000). We detected a small fraction of sites evolving under positive selection in each cluster. Although evolution by positive selection is not uncommon in sensory receptor gene families (Ngai et al. 1993; Hughes and Hughes 1993; Shi et al. 2003), our findings, and those of Lane et al. (2002), are unusual in that they raise the possibility that positive selection might have been related to a species-level divergence event.

We conducted a further analysis of selection pressure within the A subfamily because the mouse–rat divergence, as well as a gene duplication event, was clearly resolved within this subfamily. Using codon models that allowed selection pressure to vary both among sites and branches, we discovered an episodic mode of positive selection. There was a strong signal for positive selection in some parts of the A subfamily gene tree, and none in other parts. Moreover, the sites targeted by positive selection differed among the different parts of the tree. Surprisingly, no evidence for positive selection was found in the branches associated with the divergence of mouse and rat or in the branches associated with the divergence of clusters 1 and 2 within the A subfamily. Rather, the general pattern was one of adaptive evolution during the expansion of clusters (mouse cluster 1, rat cluster 1, and cluster 2) that followed the divergence of mouse and rat.

The association of the V1R family with detection of pheromones affecting social and sexual behaviors is intriguing, as genes that mediate sexual reproduction are often more divergent than nonreproductive genes (e.g., reviewed by Swanson and Vacquier 2002). Although the composition and variability of pheromonal mixtures are largely unknown among mammals, they have been widely used to detect the social and reproductive status of potential mates (Halpern 1987; Wysocki and Lepri 1991). A mixture of pheromones evolving by a rapid neutral process could create strong selection pressure for a recognition system such as V1R to quickly adapt to pheromonal changes. An analogy can be made with the rapid neutral evolution in abalone VERL that creates strong selection pressure for rapid adaptive change in lysin (Metz et al. 1998). In the case of abalone, neutral mutations accumulate differently in VERL in different populations, and lysin in different populations must adapt in different ways in order to maintain an efficient VERL–lysin interaction (Metz et al. 1998). In the case of rodents, if pheromone mixtures evolved rapidly and accumulated neutral changes in isolated populations, then recognition systems such as V1R could be subject to different selective pressures in different populations. This could lead to lineage-specific incompatibility of pheromone recognition systems. If a large and complex V1R repertoire is required to discriminate between pheromone blends, we might then expect positive selection pressure to have acted across many members of the V1R family. Interestingly, many members of the V1R family were subject to adaptive evolution, as revealed by our results.

Genes that mediate reproduction can be the direct target of sexual selection (e.g., reviewed by Swanson and Vacquier 2002), in which case the above model would not be correct. For instance, if pheromone-mediated mate selection is important in rodents, the characteristics of the pheromonal mixtures might themselves be under intense sexual selection. In this case adaptive co-evolution of the associated pheromonal recognition system, such as V1R, would be driven by the strength of sexual selection on the pheromonal mixture. If sexual selection played a prominent role in the mouse–rat speciation process, we might expect to see a strong signal for adaptive evolution in reproductive genes along the branches most closely associated with the speciation event. In this study, we were unable to detect such a signal in the A subfamily of V1R.

We have argued for the possibility that evolution of pheromone blends in isolated populations of rodents could have generated selective pressure for adaptive co-evolution of the V1R repertoire. Given enough time, such an evolutionary process could lead to reproductive isolation through divergence in pheromone-based mate selection systems. Pheromone and receptor co-evolution has been suggested to play a role in the speciation process of some insects (e.g., Roelofs and Rooney 2003). However, we observed that adaptive evolution only occurred in V1R clusters that expanded following the mouse–rat speciation event, and not along those branches associated with the speciation event itself. Based on these findings, we suggest that co-evolution of pheromone blends and a V1R-based recognition system might have played an important role in reinforcing barriers to fertilization among species of rodents, rather than being directly involved in the speciation event. Such issues cannot be resolved for rodents without further studies along two different fronts. First, studies are needed to determine the importance of pheromone blends in distinguishing between one’s own species and a closely related species of rodent. Second, the molecular evolution of the V1R family needs to be investigated within additional and more closely related species of rodents.