Introduction

The major histocompatibility complex (MHC) is a multigene family that contains genes for processing and presentation of antigens to cells of the immune system (Klein 1986). The two primary classes of MHC genes differ in structure and function, with class I genes presenting endogenous antigens and class II genes presenting exogenous antigens (Klein 1986). High levels of genetic variation (compared to presumably neutral genes) are observed at functional MHC genes, and this has been attributed to a number of forces, including pathogen-mediated selection, kin selection, mate choice, and maternal/fetal interactions (Apanius et al. 1997; Edwards and Hedrick 1998; Hedrick 1994).

MHC genes have become the classic example for balancing selection (Garrigan and Hedrick 2003; Hedrick 1994) and analysis of variation in MHC genes responsible for the presentation of antigens to the immune system has revealed many patterns that are attributed to the effects of long-term balancing selection. This includes the retention of ancestral polymorphisms (Klein et al. 1998) and an increase in nonsynonymous substitutions, a classic signal of positive selection (Hughes and Nei 1989; Hughes et al. 1990). The retention of ancestral polymorphisms is due to balancing selection countering the effects of drift (lineage sorting) over evolutionary time periods (Edwards et al. 1997; Klein et al. 1998), whereas the increase in amino acid-altering substitutions is attributed to direct selection for allelic diversity. While the long-term evolutionary effects of balancing selection on MHC variation are not in doubt, attributing contemporary selective forces to MHC genes has been more difficult (Aguilar and Garza 2006; Landry and Bernatchez 2001).

Recombination and gene conversion have also been shown to be important factors in the generation of allelic diversity at MHC loci. They affect the phylogenetic patterns of MHC genes at different evolutionary scales in different vertebrate groups. Interlocus gene conversion generates distinct phylogenetic patterns in the MHC genes of birds (Edwards et al. 1995) and has also been shown to influence genetic variation at class II genes in sticklebacks (Reusch et al. 2004). These forces may act at very localized physical scales, even within a single exon (Bergstrom et al. 1998), and generate functionally important genetic variation much more quickly than mutation alone.

Early work on the molecular evolution of MHC class IIB genes in salmonids revealed somewhat different patterns of variation than observed in other vertebrate taxa (Miller and Withler 1996). This included genus- and species-specific allelic lineages (Miller and Withler 1996; Shum et al. 2001) and a low amount of allelic variation, with low divergence observed among alleles (Kim et al. 1999; Miller and Withler 1996; Miller et al. 1997). These observations were attributed to possible reductions in effective population size (recent or historical) for the species of Oncorhynchus examined (Miller and Withler 1996) or a completely different mode of evolution at the salmonid MHC (Shum et al. 2001). However, population-level studies of MHC allelic variation in salmonids have also found the classical pattern of increased nonsynonymous substitutions (Aguilar and Garza 2006; Dorschner et al. 2000; Kim et al. 1999; Miller et al. 1997, 2001), indicating that positive selection still does operate on MHC genes in these fish. Interestingly, transspecific evolution of alleles has been observed for MHC class I genes, both between genera (Oncorhynchus and Salmo [Shum et al. 2001]) and among some species of Oncorhynchus (Garrigan and Hedrick 2001).

Here, we use patterns of allelic variation in the second exon of the MHC class IIB gene of salmonids to test a number of hypotheses regarding the evolution of this gene. We use data from 11 salmonid species to evaluate the extent of transspecific evolution among three closely related salmonid genera (Oncorhynchus, Salmo, and Salvelinus), as well as among species of Oncorhynchus. We also investigate the role that gene conversion/recombination plays in the observed phylogenetic patterns. Finally, we employ a codon-based model of selection to identify sites that have a pattern indicative of recent natural selection.

Methods

Sequences of exon 2 from the MHC class II β chain were obtained from 11 salmonid species. The dataset included newly generated data, as well as previously published sequences (Table 1). The identification and isolation of unique alleles were done via PCR of the exon 2 fragment with the primers B1AF and B1AR (Miller et al. 1997), followed by single-strand conformational polymorphism (SSCP) analysis. Four microliters of diluted DNA was used as a template in a 15-μl PCR. The reaction contained 1× PCR buffer (Applied Biosystems, Inc.), 0.5 units of Taq DNA polymerase, 1.5 mM MgCl2, 0.67 mM of each primer, and 100 nM of each dNTP. The cycling conditions consisted of an initial denaturation of 2 min at 95°C, followed by 30 cycles of 95°C for 30 s, 56°C for 30 s, and 72°C for 30 s. The reaction was followed by a 5-min extension at 72°C. PCR products were then diluted (3:5) in SSCP loading buffer (95% formamide, 3.2 mM EDTA, 0.025% bromophenol blue, 0.025% xylene cyanol). This mixture was then denatured at 100°C for 3 min and immediately cooled in an ice bath. Three microliters of the PCR:dye mixture was then loaded on a nondenaturing 6% polyacrylamide gel (0.5× TBE and 5% glycerol [v/v]). Products were run for 6–8 h at room temperature (20 W). Gels were stained with 1× SYBR Gold (Molecular Probes Inc.) and visualized on a BioRad FX molecular imager. Unique SSCP bands were then excised from the gel and placed in 50 μl deionized H2O overnight. Two microliters of this solution was then used in a PCR (same conditions except 30-μl volume) and products were precipitated with an equal volume of 20% polyethylene glycol (PEG). Precipitates were spun at 3250 rpm for 30 min and the supernatant was removed. The DNA was washed once with 75 μl of 80% ethanol, dried in a vacuum centrifuge, and then resuspended in 30 μl of H2O. These products were directly sequenced with the forward and reverse primers using the ABI BigDye (v3.1) chemistry on an ABI 377 automated DNA sequencer (Applied Biosystems, Inc.). Sequences were imported into Sequencher (Genecodes Corp.) and aligned manually.

Table 1. Salmonid species, collection locations, number of individuals sampled (N), number of alleles (K), reference, and GenBank accession numbers for the fish in this study

Phylogenetic Analysis

The hierarchical likelihood ratio test (hLRT) implemented in ModelTest (version 3.4; Posada and Crandall 1998) was used to assess the appropriate model of sequence evolution for the aligned salmonid class IIB sequences. Models were assessed at two phylogenetic levels: the entire dataset and within each genus (Oncorhynchus, Salmo, and Salvelinus). PhyML (Guindon and Gascuel 2003) was used to construct a maximum likelihood (ML) tree using the appropriate model and parameters. Node support was evaluated with 1000 bootstrap replicates. A neighbor-joining (NJ) tree was also constructed using PAUP*4 (Swofford 2003) and node support was evaluated with 1000 bootstrap replicates.

Estimation of Recombination

Tests for recombination or gene conversion were performed with the method of Sawyer (1989) as implemented in the program GENECONV (v1.81; Sawyer 1989). The global test for recombinant events was used with 10,000 permutations of the data to assess significance. Zero mismatches were allowed and p-values were corrected for multiple comparisons. The minimum number of recombinant events was also evaluated with the four-gamete method of Hudson and Kaplan (1985) as implemented in DnaSP (v4; Rozas et al. 2003).
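The logic of the four-gamete method can be illustrated with a minimal sketch. This is not the DnaSP implementation: the function names and the toy haplotypes are hypothetical, and the count of events uses a simple greedy selection of non-overlapping incompatible intervals, which is intended to mirror the minimum number of recombinant events (Rm) under the assumption of biallelic sites and no recurrent mutation.

```python
from itertools import combinations


def incompatible_pairs(haplotypes):
    """Return site pairs (i, j) at which all four two-site gametes occur."""
    n_sites = len(haplotypes[0])
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        alleles_i = {h[i] for h in haplotypes}
        alleles_j = {h[j] for h in haplotypes}
        # The four-gamete test is only defined for biallelic sites.
        if len(alleles_i) != 2 or len(alleles_j) != 2:
            continue
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


def min_recombination_events(haplotypes):
    """Greedy count of disjoint intervals that each require a recombination."""
    # Sort by right endpoint so that the shortest-ending intervals are kept.
    intervals = sorted(incompatible_pairs(haplotypes), key=lambda p: p[1])
    count = 0
    last_right = -1
    for left, right in intervals:
        # Intervals sharing only an endpoint imply distinct breakpoints.
        if left >= last_right:
            count += 1
            last_right = right
    return count


# Hypothetical toy alignment: rows are sequences, columns are variable sites.
toy = ["00", "01", "10", "11"]
print(min_recombination_events(toy))
```

In the toy alignment all four combinations of the two sites are present, so at least one recombination event (or a recurrent mutation) must have occurred between them.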

Estimation of Selection

Per-site rates of nonsynonymous (dN) and synonymous (dS) substitutions were estimated with the modified Nei and Gojobori method in MEGA3 (Kumar et al. 2004) with a Jukes-Cantor correction. Both rates were estimated for the entire available exon 2 sequence, and separately for codons thought to be involved in antigen binding and for non-antigen-binding codons (based on the human molecule [Brown et al. 1993]). Standard errors were estimated with 500 bootstrap replicates.
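For readers unfamiliar with the Jukes-Cantor correction, the sketch below shows the standard formula, d = -(3/4) ln(1 - 4p/3), applied to an observed proportion of differences p. The function name and the input value are illustrative assumptions; the counting of synonymous and nonsynonymous sites that precedes this step in MEGA3 is not reproduced here.

```python
import math


def jukes_cantor_distance(p):
    """Correct an observed proportion of differences for multiple hits.

    p is the proportion of (synonymous or nonsynonymous) sites that differ;
    the correction is undefined once p reaches 0.75 (saturation).
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical example: 10% of sites differ between two alleles.
print(round(jukes_cantor_distance(0.10), 4))
```

The corrected distance is always at least as large as the observed proportion, because some sites will have changed more than once.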

The codon-based likelihood method of Yang (1997), as implemented in PAML, was used to identify codons potentially subject to diversifying selection. This analysis was performed separately for each species. Each dataset was evaluated under two different models of codon evolution (M7, β; M8, β and ω) and the models were compared with a likelihood ratio test (LRT). The M7 and M8 models were used due to their robustness in the face of recombination (Anisimova et al. 2003). Multiple Markov chain searches were performed for each analysis with different initial values of ω (0.5, 1.0, and 2.0) to ensure convergence.
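The comparison of nested codon models can be sketched as follows. The sketch assumes the commonly used approximation in which twice the difference in log-likelihood between M8 and M7 is compared to a chi-square distribution with 2 degrees of freedom (M8 adds two parameters); the function name and the log-likelihood values are hypothetical and are not taken from this study.

```python
import math


def lrt_m7_vs_m8(lnl_m7, lnl_m8):
    """Likelihood ratio test of M8 (beta + omega) against M7 (beta).

    Returns the test statistic and its p-value under a chi-square
    distribution with 2 degrees of freedom.
    """
    # Guard against tiny negative values caused by incomplete optimization.
    stat = max(2.0 * (lnl_m8 - lnl_m7), 0.0)
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods for one species.
stat, p = lrt_m7_vs_m8(-1000.0, -995.0)
print(stat, round(p, 4))
```

A small p-value indicates that allowing a class of sites with ω > 1 fits the data significantly better than the β distribution alone.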

Results

The hLRT indicated that the best model of sequence evolution for the entire dataset was the F81 + I + Γ model (I = 0.37, Γ = 0.60). The ML and distance analyses revealed monophyletic groupings of MHC class IIB exon 2 alleles for all three salmonid genera, though bootstrap support was low (Fig. 1). The grouping of the Oncorhynchus alleles was supported by only 63% of bootstrap replicates for the ML analysis and 66% of replicates for the NJ analysis. The alleles from Salvelinus and Salmo clustered with <50% bootstrap support.

Fig. 1. Maximum likelihood (ML) phylogeny of salmonid MHC IIB alleles generated using the F81+Γ+I model of evolution (see text for details). Numbers next to nodes indicate support >50% from 1000 bootstrap replicates (ML/NJ).

Model evaluation for the Oncorhynchus sequences indicated that the JC + I + Γ (I = 0.55, Γ = 0.55) model was most appropriate. The ML and NJ trees had similar overall topologies. Both trees had elevated bootstrap support for monophyletic groupings of alleles from O. gorbuscha, O. keta, and O. nerka (Fig. 2), though support for the monophyletic grouping of O. nerka alleles was relatively low. In contrast, alleles from O. clarki and O. mykiss were scattered throughout the tree, and there was not elevated bootstrap support for clusters of alleles from either of these species (Fig. 2). Alleles from O. tshawytscha and O. kisutch formed a monophyletic group with elevated bootstrap support (Fig. 2). Closer examination of the ML and NJ subtrees that contain the O. tshawytscha and O. kisutch alleles revealed that they have qualitatively different topologies (not shown). In the ML tree, the alleles from the two species do not group by species and are paraphyletic, whereas the NJ tree possesses monophyletic clusters of alleles from both species. Elevated bootstrap support was not observed for any of the internal nodes in either tree (not shown). A Shimodaira-Hasegawa (SH; 1999) test was used to evaluate whether the ML topology was more likely than the NJ topology to represent the true relationship of O. kisutch and O. tshawytscha alleles. The SH test was performed using the Jukes-Cantor model of evolution with gamma shape parameter and proportion of invariant sites empirically estimated in PAUP*. To test significance, 10,000 bootstrap replicates were performed in PAUP*. The SH test indicated that the ML subtree was not significantly more likely than the NJ subtree (Δ –lnL = 20.64, p = 0.135) to represent the true relationship.

Fig. 2. Maximum likelihood (ML) phylogeny of Oncorhynchus MHC IIB alleles generated using the JC+Γ+I model of evolution (see text for details). Numbers next to nodes indicate bootstrap support >50% from 1000 replicates (ML/NJ). Only elevated bootstrap support values are shown for branches leading to species-specific groups. Unlabeled branch tips are MHC IIB alleles from O. mykiss. Filled diamonds indicate alleles that are shared between O. clarki and O. mykiss, while open diamonds indicate alleles isolated from O. clarki.

Phylogenetic analysis of alleles from S. salar and S. trutta using the HKY + I + Γ (I = 0.63, Γ = 0.78) model of sequence evolution did not find species-specific groupings of alleles (Fig. 3A). The ML and NJ trees for alleles from the two Salvelinus species, S. malma and S. namaycush (model of sequence evolution: TrN + I + Γ [I = 0.51, Γ = 0.58]), revealed a pattern similar to that observed in Salmo; alleles clustered irrespective of species (Fig. 3B).

Fig. 3. Maximum likelihood phylogeny of (A) Salmo and (B) Salvelinus MHC IIB alleles generated using the HKY+Γ+I (Salmo) and TrN+Γ+I (Salvelinus) models of evolution (see text for details). Elevated bootstrap support was not found for any internal nodes in either analysis. In (A), filled circles represent alleles from S. trutta and unlabeled tips represent alleles from S. salar; in (B), filled diamonds represent alleles from S. malma and unlabeled tips represent alleles from S. namaycush.

The GENECONV analysis revealed statistical significance for gene conversion events in all species except for O. gorbuscha, O. tshawytscha, and S. malma (Table 2). The Hudson four-gamete test revealed recombinant events in all species except O. gorbuscha (Table 2). Within five Oncorhynchus species, relatively large tracts (10 codons or 30 nucleotides) with no variable sites were found (Fig. 4). Such tracts were not found in other salmonid species.

Table 2. Results of the analysis for recombination/gene conversion for each of the 11 species of salmonids: statistical significance for global internal (GENECONV-I) and global outer (GENECONV-O) recombination events, as well as the minimum number of recombinant events (Rm) based on the Hudson four-gamete test

Fig. 4. Sequence alignment of representative MHC class IIB genes analyzed for codon-specific positive selection from seven species of salmonid. Codons are numbered based on the HLA-DRB*0101 allele, and take into account the amino acid insertion observed in teleosts. A + indicates sites involved in antigen binding in the human model. Positions in boldface have a p > 0.99 and underlined positions have a p > 0.95 of being under positive selection (see Results). Shaded areas indicate monomorphic tracts (>10 amino acid sites).

Nonsynonymous substitutions exceeded synonymous substitutions in all species (Table 3), although the differences were not significant in two cases. When the comparisons were performed separately for only those codons involved in antigen binding (ABS codons, based on the human molecule) and for those not involved in antigen binding, nonsynonymous substitutions still exceeded or equaled synonymous substitutions in all species for both classes of codon, although the differences were not significant in four and seven species for ABS and non-ABS codons, respectively. All members of the genus Oncorhynchus, except O. gorbuscha and O. nerka, possessed similar levels of within-species synonymous divergence (Table 3). Mean within-genus synonymous divergence was 0.024 (SE = 0.012) for Oncorhynchus, 0.034 (0.012) for Salmo, and 0.047 (0.016) for Salvelinus.

Table 3. Nonsynonymous (dN) and synonymous (dS) substitutions (per site) based on the modified Nei and Gojobori method with a Jukes-Cantor correction for multiple substitutions; standard errors are given in parentheses

The site-specific analysis for selection revealed that the β + ω (M8) model had a significantly higher likelihood than the β model (M7) for all but one species, O. gorbuscha (Table 4). All other analyses were significant at the p < 0.01 level, except for the O. keta analysis, which was significant at the p < 0.05 level. This analysis identified a number of putatively selected codon sites in each of the salmonid species analyzed and seven sites with such a signal in the majority of species analyzed (Fig. 4). Four of the sites that showed evidence of positive selection in at least half of the species corresponded to ABS codons in the human molecule (β37, β78, β81, and β85; Fig. 4). Three sites that showed evidence of positive selection (β24, six species; β53, five species; and β86, nine species) in a majority of the species examined did not correspond to ABS codons in humans.

Table 4. Results of the PAML analysis on salmonid class II β PBR sequences

Discussion

This analysis of sequences from the second exon of the salmonid class IIB gene revealed two major hallmarks of MHC evolution: transspecific evolution of alleles and an elevated ratio of nonsynonymous to synonymous (dN:dS) substitutions. Transspecific evolution was observed between species in all three genera examined: Oncorhynchus, Salmo, and Salvelinus. Recombinant alleles were found in most species, with the notable exception of O. gorbuscha. The pattern of increased dN:dS was found for all species and the site-specific method for identifying positive selection found a number of codons under selection in most salmonid species. Some of these sites under selection coincided with mammalian antigen-binding sites, whereas others did not.

A previous study (Shum et al. 2001) found a lack of retained ancestral polymorphism among salmonid genera, though only two genera were examined (Oncorhynchus and Salmo). Here a lack of ancestral polymorphism was observed among three salmonid genera, including the more closely related Oncorhynchus and Salvelinus (Crespi and Fulton 2004; McKay et al. 1996; Oakley and Phillips 1999). However, there was a lack of elevated bootstrap support for internal branches that cluster genus-specific alleles. This may be due to the short length of the gene segment examined (Cummings et al. 1995), which is an inherent problem in examining the second exon of MHC class II genes, as it is only 90 amino acids long. This apparent lack of retained ancestral polymorphism among closely related genera is contrary to what has been observed in other vertebrate groups (Fan et al. 1989; Figueroa et al. 1988; Yuhki and O'Brien 1997). Retention of MHC class IIB allelic lineages has also been observed between genera in other teleost groups, including the Cyprinidae (Graser et al. 1996; Ottova et al. 2005) and Cichlidae (Figueroa et al. 2000; Ono et al. 1993). The retention of ancestral polymorphisms within the Cyprinidae was even observed between the two subfamilies Cyprininae and Leuciscinae, whose divergence has been dated to approximately 27.7 million years ago (MYA) (Ottova et al. 2005). In the Cichlidae, ancestral polymorphism was observed between Tilapines and Haplochromines, groups that diverged approximately 8.4 MYA (Figueroa et al. 2000). Divergence times among Oncorhynchus, Salmo, and Salvelinus are within this range (Oncorhynchus-Salmo divergence, ∼15–20 MYA; McKay et al. 1996), yet retention of ancestral polymorphism between genera was not observed. This suggests that evolutionary pressures on salmonid MHC genes may differ from those in other vertebrate groups, which could explain this disparity in phylogenetic patterns (Shum et al. 2001).

Observed phylogenetic patterns within the genera Oncorhynchus, Salmo, and Salvelinus were consistent with transspecific evolution. In contrast, high bootstrap support was found for species-specific groupings of alleles from O. gorbuscha, O. keta, and O. nerka. Such clustering has been described previously for a smaller dataset and past reductions in effective population size were proposed to account for this pattern (Miller and Withler 1996). Substantial bootstrap support was also found for the branch leading to all O. kisutch and O. tshawytscha alleles, whereas alleles from O. clarki and O. mykiss (closely related sister species) were spread throughout the Oncorhynchus tree. This may indicate that the selective pressure to retain ancestral polymorphisms is greater in the O. clarki-O. mykiss lineage than in other Oncorhynchus lineages. In addition, there were a number of alleles that were shared between O. clarki and O. mykiss, which could be due to long-term balancing selection or hybridization (Allendorf and Leary 1988). Natural hybridization between these two species where they co-occur is quite common (Baumsteiger et al. 2005; Bettles et al. 2005; Young et al. 2001), and the latter scenario cannot be ruled out without the ascertainment of “pure” O. clarki or O. mykiss individuals.

The data were inconclusive regarding the existence of transspecific evolution of allelic lineages between the species O. kisutch and O. tshawytscha, as the two rooted phylogenies had different topologies. Paraphyly of alleles from the two species was observed in the ML tree, whereas the NJ tree had species-specific allelic lineages. In addition, the SH test indicated that the ML tree was not significantly more likely to accurately represent the data than the NJ tree. Unfortunately, the two hypotheses, transspecific evolution and species-specific lineages, cannot be sufficiently distinguished with the current dataset. Even though O. kisutch and O. tshawytscha are sister species, it is unlikely that recent hybridization contributes to this pattern, as natural hybrids between the two species have not been documented and no alleles were shared between the two species.

There does appear to be a phylogenetic component to the retention (or lack thereof) of ancestral polymorphism in Oncorhynchus. The species with monophyletic allelic lineages (O. gorbuscha, O. keta, and O. nerka) are all closely related (Crespi and Fulton 2004; McKay et al. 1996; Oakley and Phillips 1999) and form a monophyletic lineage within the genus. Similarly, O. kisutch and O. tshawytscha are sister taxa and possess little or no transspecific evolution of MHC class IIB alleles. It is possible that differences in life history contribute to the observed pattern of MHC evolution in these Oncorhynchus species, as they are predominantly anadromous. The species that retain more ancestral polymorphisms (O. clarki and O. mykiss) are closely related to one another, form the earliest-branching lineage in the genus, and contain populations with both anadromous and resident life history forms (contemporarily and historically).

The transspecific evolution observed in the genera Salmo and Salvelinus indicates that species within these genera have not experienced the same historical demographic and/or selective pressures that have influenced genetic variation in most Oncorhynchus species. Transspecific evolution at MHC class IIB genes has been reported previously in a phylogenetic analysis of S. salar and S. trutta alleles (Stet et al. 2002). We have also shown that retained ancestral polymorphisms occur between two species of Salvelinus. Interestingly, both species of Salmo and Salvelinus malma possess both anadromous and resident forms, much like O. clarki and O. mykiss. While it remains unclear what attributes (demographic effects, life history variation, selective sweeps) account for the lower MHC class IIB variation and species-specific allelic lineages in O. gorbuscha, O. keta, and O. nerka, the phylogenetic attributes of MHC diversity for this lineage contrast greatly with those of other salmonids.

While limitations appear to be present on the extent of transspecies evolution observed in the class IIB gene of teleost fish, the retention of ancestral polymorphism may occur over longer time periods for class I genes. Retention of ancestral polymorphism has been observed between Oncorhynchus and Salmo for class I MHC genes (Shum et al. 2001). Long-lived allelic lineages have also been observed in class I genes from Chinook salmon (Garrigan and Hedrick 2001). A similar pattern with regard to the evolution of MHC genes was found in Lake Tana barbels (Kruiswijk et al. 2005). These differences in evolutionary patterns between class I and class II genes of teleost fish may have been facilitated by the lack of physical linkage of the two gene clusters. Whereas they are found on the same chromosome in mammals, they are found on two different chromosomes in teleost fish (Bingulac-Popovic et al. 1997). However, without substantial data on variation of class I MHC genes, or other unlinked genetic markers, for a large sample of Oncorhynchus species or a greater understanding of the class II-facilitated immune response in salmonids, we cannot distinguish between the competing hypotheses of neutral demographic versus selective/functional forces differentially affecting class II MHC diversity in salmonids.

Recombination and gene conversion appear to be important mechanisms in the generation of allelic variation of MHC genes in vertebrates (Parham and Ohta 1996; Bergstrom et al. 1998). Evidence of extensive recombination and gene conversion was found in some of the salmonid species studied here (O. clarki, O. mykiss, Salmo trutta, Salvelinus namaycush) but not in others (O. gorbuscha and O. keta). However, no within-species variability was found in the 5’ portion of exon 2 in O. gorbuscha (first 41 amino acids) and O. keta (first 31 amino acids). There are also smaller invariant amino acid tracts in O. kisutch, O. nerka, and O. tshawytscha. Recurrent gene conversion could lead to a homogenization of alleles and generate such homogeneous tracts. Intra- and interlocus gene conversion has been described in other teleost species (Reusch and Langefors 2005; Reusch et al. 2004). In birds, gene conversion is thought to be a major factor in the generation of observed phylogenetic patterns (Edwards et al. 1995; Wittzell et al. 1999), and in humans highly localized gene conversion has been shown to contribute substantially to MHC β gene diversity (Bergstrom et al. 1998). While gene conversion is a ubiquitous evolutionary force in the MHC, the hypothesis of reductions in effective population size or selective/functional differences as an explanation of species-specific allelic lineages and/or monomorphic amino acid tracts within the second exon cannot be ruled out.

The distribution of codons found to exhibit patterns consistent with natural selection differed substantially among the 11 salmonid species surveyed. Some codon sites putatively under selection correspond to antigen-binding codons in the human molecule, whereas others do not. This is not surprising given the large evolutionary divergence between fish and mammals, and a similar result was reported in a study of cyprinid fish (Ottova et al. 2005). This discord between the codon sites under selection in the fish MHC and those known to be antigen-binding sites in mammals may reflect structural/functional differences in the MHC molecules of the two groups, although extensive study of the structure and antigen-binding properties of teleost MHC molecules is necessary to evaluate this hypothesis. Nevertheless, these differences do indicate that caution should be used when inferring ABS codons in nonmammalian taxa using the human molecule as a model, as some ABS codons almost certainly differ between these two major taxonomic groups.