The laboratory rat (Rattus norvegicus) is a widely used experimental animal that provides important and well-established models for many human diseases, particularly infectious and autoimmune diseases (Günther 1999). Besides its central role in the control of immune responses against pathogens, the major histocompatibility complex (MHC) is also involved in the susceptibility of an individual to develop an autoimmune disease (Thorsby 1997). While many autoimmune diseases have been linked to MHC class II genes, e.g. the HLA-DQB gene in type I diabetes, associations are also found for class I genes, the most prominent being HLA-B*2705 and ankylosing spondylitis (Hülsmeyer et al. 2004). The genetic basis of associations between the MHC and autoimmune diseases can be well demonstrated by using MHC congenic and intra-MHC recombinant congenic mouse or rat strains. Thus, it could be shown that in MOG-induced experimental autoimmune encephalomyelitis (EAE), an animal model of multiple sclerosis, the rat MHC determines the degree of susceptibility and the clinical course of the disease in a haplotype-specific manner (Weissert et al. 1998). Recently, the rat MHC, the RT1 complex, has been completely sequenced (Hurt et al. 2004). This sequence is based on a single RT1 haplotype, namely the RT1n haplotype of the BN rat strain. To study the extent of genomic diversity in the RT1-CE class I region, which corresponds genomically to the HLA-B region in humans and the H2-D/L/Q region in mice, we aligned the RT1-CE region class I genes of the RT1n haplotype with class I sequences of other RT1 haplotypes.

Rat MHC class I sequences were extracted from the DDBJ/GenBank/EMBL database and aligned using the Clustal X software (Thompson et al. 1997). Gene trees were constructed with the neighbor-joining algorithm (Saitou and Nei 1987) as implemented in PAUP, version 4.0b10 (Swofford 2002), and MEGA, version 2.1 (Kumar et al. 2001). Support of branching was assessed by bootstrapping based on 500 replications. MEGA was also used for the analysis of synonymous and nonsynonymous nucleotide substitutions.
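The idea behind the bootstrap support values can be illustrated with a minimal Python sketch. It is not the PAUP or MEGA procedure: instead of rebuilding a neighbor-joining tree for every replicate, it uses a simpler proxy and asks how often two sequences remain each other's strictly closest relatives (by uncorrected p-distance) when alignment columns are resampled with replacement. The function names and the toy alignment are hypothetical and serve illustration only.

```python
import random


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


def pair_support(alignment, a, b, replicates=500, seed=0):
    """Fraction of replicates in which a and b are strictly closer to each
    other than either is to any other sequence (a proxy for clustering,
    not a full neighbor-joining bootstrap)."""
    rng = random.Random(seed)
    others = [n for n in alignment if n not in (a, b)]
    hits = 0
    for _ in range(replicates):
        rep = resample_columns(alignment, rng)
        d_ab = p_distance(rep[a], rep[b])
        if all(d_ab < p_distance(rep[a], rep[o]) and
               d_ab < p_distance(rep[b], rep[o]) for o in others):
            hits += 1
    return hits / replicates


# Hypothetical toy alignment (not real RT1 sequences).
toy = {
    "seqA": "ACACACACAC",
    "seqB": "ACACACACAC",
    "seqC": "GTGTGTGTGT",
    "seqD": "TGTGTGTGTG",
}
support = pair_support(toy, "seqA", "seqB", replicates=500)
```

With real data, the resampled alignments would be passed to a tree-building routine and the support of a branch would be the proportion of replicate trees that contain it.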

The genomic sequence of the rat MHC (Hurt et al. 2004) revealed a complete set of class I gene sequences from a single haplotype, namely RT1n. This prompted us to compare these class I genes with previously known rat class I sequences derived from other RT1 haplotypes to analyze the extent of haplotypic diversity. Initially, a gene tree analysis was carried out showing that the RT1n class I sequences group roughly according to their genomic position, i.e., genes located in the RT1-A and RT1-CE region, in the RT1-N region, and in the RT1-M region cluster together (Fig. 1a). We were particularly interested in the RT1-CE region that contains 16 class I genes in the RT1n haplotype (Hurt et al. 2004), which are all of the nonclassical type. The RT1-CE region is located between the Bat1 and Pou5f1 genes (Günther and Walter 2001), a genomic interval that exhibits considerable diversity in the mouse (Kumanovics et al. 2002) and the rhesus macaque (Daza-Vamenta et al. 2004). Therefore, we included in the phylogenetic analysis class I sequences of other rat haplotypes that are most likely derived from the RT1-CE region and do not represent class I genes of the RT1-N or RT1-M type. As expected, these additional class I sequences cluster with RT1-A and RT1-CE, and not with RT1-N or RT1-M region genes (Fig. 1a, b). Furthermore, some of these sequences cluster significantly with certain RT1-CE genes of the RT1n haplotype, suggesting allelism of these loci. As can be seen in Fig. 1b, the cc1 cDNA clone derived from the RT1c haplotype (Leong et al. 1999) clusters with the RT1-CE1n gene and, therefore, they most likely represent alleles. Likewise, RT1-E2g (Lau et al. 2003) and RT1-CE5n as well as RT1-46l (Lambracht-Washington and Fischer Lindahl 2002) and RT1-CE9n appear to be allelic (Fig. 1b).
A significant clustering of RT1-CE16n with RT1-U1f and RT1-U2c could also be observed, raising the question of whether RT1-U1 and RT1-U2 are indeed different loci or represent alleles of the same locus, namely RT1-CE16. With respect to the sometimes confusing nomenclature of RT1 class I genes, it should be noted that a definitive nomenclature is not yet available and we follow our recently published proposal (Hurt et al. 2004), which is based on a systematic designation according to the physical mapping of class I genes.

Fig. 1

Gene tree analysis of rat class I exon 2 to exon 8 sequences. Only bootstrap values exceeding 95% are shown and are indicated by an asterisk. a The database accession numbers of the class I genes can be found in a comprehensive review by Günther and Walter (2001) except for BX511170 (RT1-A1n to RT1-A3n, RT1-CE1n to RT1-CE16n, all RT1-N and RT1-M region class I loci), AF457139 (RT1-L1l), AY397759 (RT1-L2l), AY445668 (RT1-L3l), and AF387339 (RT1-46l). b shows an enlarged view of the RT1-A and RT1-CE region sequences. Loci of other RT1 haplotypes that do not cluster with the RT1n haplotype loci are indicated by a box

In contrast to the mouse, where the evolution of the H2-D/L/Q region class I genes could be well deduced from the genomic sequence (Kumanovics et al. 2002), an evolutionary reconstruction was not possible for the rat RT1-CE region (Hurt et al. 2004), even by inclusion of certain repetitive sequences. However, inspection of the neighbor-joining tree shown in Fig. 1b provided some clues to the evolution of at least some of the RT1-CE region class I genes. The RT1-CE7n and RT1-CE11n genes and the RT1-CE1n, RT1-CE12n, and RT1-CE14n genes cluster significantly (Fig. 1b), suggesting that they arose by recent duplications from respective ancestral genes. A clustering (bootstrap value 86%) of the RT1-A3n and RT1-CE10n genes was also found (Fig. 1b) and confirms the evolutionary relatedness of the centromeric and telomeric class I regions RT1-A and RT1-CE (Lambracht-Washington et al. 2000; Hurt et al. 2004; Walter and Günther 2000).

Our finding that RT1-CE9n and RT1-46l are most likely alleles appears quite instructive. A cosmid sequence derived from the LEW rat (RT1l haplotype) has previously been reported by Lambracht-Washington and Fischer Lindahl (2002), showing that RT1-46l maps adjacent to the Bat1 gene at a distance of about 12 kb. Thus, we conclude that the RT1l haplotype lacks the RT1-CE1 to RT1-CE8 loci of the RT1n haplotype (Fig. 2). An alternative explanation might be that these loci were translocated to a different genomic region. In contrast, the loci RT1-L1l, RT1-L2l, and RT1-L3l, which form a small subfamily of class I genes in the RT1l haplotype (Lambracht-Washington et al. 2004), do not cluster with any of the RT1n-derived RT1-CE loci (Fig. 1b), indicating that the RT1-L subfamily either never occurred or was lost in the RT1n haplotype. Similarly, for the genes RT1-3.6av1, RT1-cc22c, RT1-cc23c, RT1-9.5f, RT1-9.6f, RT1-119l, RT1-EC2r21, RT1-Eu, RT1-Ku, and RT1-cc9c, no corresponding allele of the RT1n haplotype could be detected among the RT1-CE genes (Fig. 1b). It is not clear at the moment whether RT1-C113l and RT1-Clw2l represent alleles of RT1-CE13n and RT1-CE16n, respectively, as bootstrap support for the clustering was low (48% and 56%, respectively). These data indicate that rat MHC haplotypes show considerable diversity with respect to the presence and absence of certain class I loci in the class I region extending between Bat1 and Pou5f1. Such genomic plasticity has also been reported for the RT1-A region that contains the classical class I genes of the rat (Walter and Günther 2000).

Fig. 2

Map of the Bat1-Pou5f1 genomic interval that contains the RT1-CE region class I genes. The maps of the RT1n, RT1l, and RT1u haplotypes are based on data reported by Hurt et al. (2004), Lambracht-Washington and Fischer Lindahl (2002), and Walter and Günther (1997), respectively. Functional and nonfunctional class I genes (Hurt et al. 2004) are indicated by filled and hatched boxes, respectively. The position of the box above or below the line represents the orientation of the gene. Putative alleles of the RT1-CE loci (according to the analysis shown in Fig. 1) are indicated. In the RT1u haplotype no class I gene could be detected at a distance of 20 kb from the Bat1 gene (Walter and Günther 1997). The broken lines and the question marks indicate that the RT1l and RT1u maps continue, but that the remaining interval has not been characterized by genomic sequencing or detailed physical mapping.

To study the level of allelic diversity, we compared the rates of synonymous (dS) and nonsynonymous (dN) substitutions in codons of the peptide-binding region (PBR) and non-PBR codons of exons 2 and 3 (Table 1). In contrast to the class Ia locus RT1-A1, which was included as a known example of positive selection, no evidence for positive selection was found for the expressed loci RT1-CE1, RT1-CE5, and RT1-CE16, or for RT1-CE8 and RT1-CE9; the latter was expected, as these two loci represent pseudogenes. These data further indicate that the diversity of the RT1-CE region-derived class I genes lies mainly in the presence and absence of certain loci. Interestingly, a similar mode of generating class I gene diversity in the corresponding genomic interval has recently been reported for the mouse (Kumanovics et al. 2002) and the rhesus macaque (Daza-Vamenta et al. 2004), lending further support to the “birth and death” model of MHC evolution (Klein et al. 1998). This diversity has to be taken into account when these species are used as animal models for human MHC-associated diseases, as most of these class I loci are expressed, or at least appear expressible from inspection of their genomic sequences.
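To make the dN/dS comparison concrete, the following Python sketch estimates both rates for a pair of codon-aligned sequences. The text does not state which estimator was selected in MEGA; the sketch assumes the unweighted pathway method of Nei and Gojobori with a Jukes-Cantor correction. It further assumes that changes to stop codons count as nonsynonymous when sites are counted, and that the input contains no gaps or stop codons. All function names are hypothetical, and the codon positions forming the PBR must be supplied by the caller through `codon_indices`, since they are not listed here.

```python
from itertools import permutations
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (each of the nine possible
    single-base changes contributes 1/3 if it keeps the amino acid)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == CODE[codon]:
                    s += 1 / 3
    return s


def count_diffs(c1, c2):
    """Synonymous and nonsynonymous differences between two codons,
    averaged over all mutational pathways that avoid stop codons."""
    diff_pos = [i for i in range(3) if c1[i] != c2[i]]
    if not diff_pos:
        return 0.0, 0.0
    totals = []
    for order in permutations(diff_pos):
        cur, sd, nd, valid = c1, 0, 0, True
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODE[nxt] == "*":
                valid = False  # discard pathways through a stop codon
                break
            if CODE[nxt] == CODE[cur]:
                sd += 1
            else:
                nd += 1
            cur = nxt
        if valid:
            totals.append((sd, nd))
    if not totals:
        raise ValueError(f"no stop-free pathway between {c1} and {c2}")
    n = len(totals)
    return sum(t[0] for t in totals) / n, sum(t[1] for t in totals) / n


def dn_ds(seq1, seq2, codon_indices=None):
    """Return (dN, dS) per site for two codon-aligned sequences.
    codon_indices restricts the analysis, e.g. to PBR codons.
    Fails if no synonymous site exists or if p >= 0.75."""
    codons1 = [seq1[i:i + 3] for i in range(0, len(seq1), 3)]
    codons2 = [seq2[i:i + 3] for i in range(0, len(seq2), 3)]
    idx = range(len(codons1)) if codon_indices is None else codon_indices
    S = N = Sd = Nd = 0.0
    for i in idx:
        s = (syn_sites(codons1[i]) + syn_sites(codons2[i])) / 2
        S += s
        N += 3 - s
        sd, nd = count_diffs(codons1[i], codons2[i])
        Sd += sd
        Nd += nd

    def jukes_cantor(p):
        return -0.75 * log(1 - 4 * p / 3)

    return jukes_cantor(Nd / N), jukes_cantor(Sd / S)
```

Calling `dn_ds` once with the PBR codon indices and once with the remaining indices gives the two pairs of values that are compared per locus; dN clearly exceeding dS in the PBR is the pattern usually taken as evidence for positive selection. Multiplying the results by 100 gives values per 100 sites.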

Table 1 Mean number of synonymous [(dS)±SE] and nonsynonymous [(dN)±SE] substitutions per 100 sites in codons of the peptide-binding region (PBR) and remaining codons (non-PBR) of exon 2 and exon 3

In summary, we have analyzed the genetic diversity of rat class I genes that map to the genomic interval between Bat1 and Pou5f1 and have obtained evidence for extensive polymorphism, which is mainly manifested in the presence and absence of loci, and not in allelic substitutions.