Introduction

The major histocompatibility complex (MHC) comprises a group of highly polymorphic genes with a central role in the immune system, whose major function is the binding and presentation of foreign antigens to T lymphocytes. The MHC of many species, including birds, consists of large multi-gene families as a result of widespread gene duplication (Edwards et al. 1999; Zelano and Edwards 2002). The organisation of genes within the MHC is unknown for most non-model species with duplicated genes. As a result, many evolutionary studies of MHC variation have relied on diversity measures pooled across several genes (Kurtz et al. 2004; Hansson and Richardson 2005; Richardson et al. 2005; Westerdahl et al. 2005). Studying MHC evolution across pooled loci may miss the more complex processes occurring at individual loci.

In contrast to most avian MHC regions, the domestic chicken B locus, containing the BF/BL region corresponding to the classical MHC, is particularly well characterised (Guillemot et al. 1988; Kaufman et al. 1999a). It is very compact and simple, with only two classical class I α-chain (BF) and two class II β-chain (BLB) genes located within a 44-kb DNA segment (Kaufman et al. 1995, 1999b; Jacob et al. 2000; Shaw et al. 2007). No incidents of recombination have been recorded between the class I and class II B genes in experimental matings (Hala et al. 1979, 1988; Koch et al. 1983; Skjødt et al. 1985); loci therefore appear to be linked in stable B haplotypes. Importantly, one gene from each class is expressed at a much higher level than the second; consequently, such loci have been termed major and minor, respectively (Kaufman et al. 1995, 1999a; Jacob et al. 2000; Wallny et al. 2006; Shaw et al. 2007).

As in other vertebrates, levels of polymorphism for the classical class I and class II B genes are high in the domestic chicken (Briles et al. 1982; Miller et al. 2004). It is widely hypothesised that such high levels of MHC variation are maintained by parasite-driven selection pressures. In the domestic chicken, direct associations between the B haplotype and resistance to infectious diseases including Marek’s disease, Rous sarcoma virus and lymphoid leukosis virus (Collins et al. 1977; Schierman et al. 1977; Bacon et al. 1981; Briles et al. 1983) suggest that MHC polymorphism has arisen as a result of genetic benefits associated with pathogen resistance. As different domestic chicken MHC alleles confer resistance to different pathogens (Kaufman et al. 1995; Wallny et al. 2006), it is beneficial to maintain high levels of individual MHC diversity and, therefore, resistance to a wide array of pathogens.

Given the extensive knowledge of the structure and function of the domestic chicken MHC, other fowl populations are ideal models in which to examine the role of selection in avian MHC genes. Furthermore, the ‘minimal essential’ (Kaufman et al. 1999b) nature of the fowl MHC region allows the characterisation of functional diversity at the locus-specific level. Here, we examine variation in the BF and BLB genes of a population of red jungle fowl (Gallus gallus gallus), the wild ancestor from which chickens were domesticated (Nishibori et al. 2005; Liu et al. 2006). A previous study suggests that red jungle fowl exhibit significant diversity at the MHC although observations are restricted to non-classical loci (Shiina et al. 2006). In the present study, we use a captive population of red jungle fowl originating from a stock caught in Thailand (Schütz and Jensen 2001). Jungle fowl in this population appear to be genetically pure, with behaviour and morphology consistent with that of wild red jungle fowl (Schütz and Jensen 2001).

Studies aiming to investigate the processes responsible for maintaining MHC diversity within populations require reliable methods of screening variation across individuals. Several methods have been developed to screen individual genetic variation directly in the domestic chicken, including single strand conformation polymorphism (Goto et al. 2002), automated sequencing (Livant and Ewald 2005) and the use of adjacent microsatellite variation (Fulton et al. 2006). These methods have different advantages: for example, functional diversity can be inferred from sequence variation, whereas microsatellite genotyping is quick and easy to perform. For our planned studies on red jungle fowl, we require an inexpensive method that is suitable for screening large populations and that enables the full amino acid sequence of all alleles to be characterised. Consequently, we used the alternative technique of reference strand-mediated conformation analysis (RSCA), a conformation technique in which fluorescently labelled reference strands are used to form heteroduplexes with allelic variants (Argüello et al. 1998). This is an automated capillary-based system containing internal size standards, which enables accurate and reproducible genotyping to be performed. We combined RSCA with sequencing to verify and further characterise all alleles. The method has been used successfully to identify MHC alleles in several species (Ramon et al. 1998; Kennedy et al. 2002; Addie et al. 2004; Angles et al. 2005). We expand these methods to include locus-specific techniques based on knowledge of the genomic structure of the MHC of domestic chickens. This locus-specific screening method will also allow efficient and accurate ecological and population-based studies in future. Using this screening protocol, we aim to: (1) characterise jungle fowl MHC alleles at all four classical MHC loci and (2) examine evidence for selection across the four loci. Addressing aims (1) and (2) also allows us to discuss patterns of MHC evolution in the population by comparing jungle fowl MHC sequence alleles with published MHC sequence alleles from domestic chicken lines.

Materials and methods

Samples

Blood samples were taken from all 84 individuals (44 females and 40 males) present in 2006 within a population of red jungle fowl at the Swedish University of Agricultural Sciences, Skara. The population has been maintained for five generations at a size of 40–150 birds. The Skara population was founded from a captive population at Frösö zoo: eggs were collected at random, and the resultant 40 birds became the founders. The zoo population comprised around 25 individuals, of which ten were males, and was, in turn, founded from a single wild population in Thailand. Eggs for successive generations are incubated and, upon hatching, birds are vaccinated against Marek’s disease; there has been no outbreak of other disease within the population.

Blood samples (approximately 100 μl) were taken by brachial venipuncture, diluted in 800 μl of 100% ethanol in a screw-cap microfuge tube and stored at 4°C until use. Genomic DNA was extracted using a standard salt extraction method (Sunnocks and Hales 1996). Extractions were normalised to concentrations of 10 ng/μl and stored at −20°C.

Gene amplification

Alleles from both BF1 (minor) and BF2 (major) loci were amplified using the non-specific primers C71 (5′-CGAGCTCCATACCCTGCGGTAC-3′) and C75 (5′-CTCCTGCCCAGCTCAGCCTTC-3′) developed from domestic chicken lines (Shaw et al. 2007). Amplicons comprise 767 bp fragments encompassing exons 2 and 3. All polymerase chain reactions (PCRs) were performed with 40 ng of extracted DNA in 30 μl reactions containing 0.125 μM of each primer and 15 μl of 2× Extensor Hi-Fidelity PCR mastermix (ABgene) incorporating 1.25 units DNA polymerase, 2.25 mM MgCl2 and 350 μM of each dNTP. A standard hotstart PCR program was used for all amplifications and comprised an initial denaturation step of 1 min at 96°C, 30 cycles at 96°C for 1 min, 60°C for 30 s and 72°C for 1 min, followed by a final extension step at 72°C for 10 min.
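
The reaction composition described above can be checked with a little arithmetic. The following is a minimal sketch rather than part of the published protocol: the function name and the 10 μM primer working stock are assumptions made for illustration, while the template concentration (10 ng/μl) is that of the normalised extractions.

```python
def reaction_volumes(total_ul=30.0, template_ng=40.0, template_ng_per_ul=10.0,
                     primer_final_um=0.125, primer_stock_um=10.0):
    """Return per-reaction pipetting volumes in microlitres.

    primer_stock_um is a hypothetical working-stock concentration;
    the other defaults are taken from the protocol text.
    """
    template_ul = template_ng / template_ng_per_ul             # 40 ng at 10 ng/ul
    mastermix_ul = total_ul / 2.0                              # 2x mastermix
    primer_ul = primer_final_um * total_ul / primer_stock_um   # C1*V1 = C2*V2
    water_ul = total_ul - template_ul - mastermix_ul - 2 * primer_ul
    return {
        "template_ul": template_ul,
        "mastermix_ul": mastermix_ul,
        "each_primer_ul": primer_ul,
        "water_ul": water_ul,
    }


volumes = reaction_volumes()
for name, value in volumes.items():
    print(f"{name}: {value:.3f}")
```

Calling `reaction_volumes(primer_final_um=0.139)` gives the corresponding mix for the higher primer concentration used for BLB fragments.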

A 277 bp exon 2 fragment of BLB1 (minor) and BLB2 (major) loci was amplified using the non-specific primers OL284BL (5′-GTGCCCGCAGCGTTCTTC-3′) and RV280BL (5′-TCCTCTGCACCGTGAAGG-3′; Goto et al. 2002). Amplification was as for BF fragments except for an increase in primer concentration to 0.139 μM and an increase in the number of amplification cycles to 35.

Genotyping MHC loci

RSCA was performed following the method described by Argüello et al. (1998). Fluorescent-labelled reference strands (FLRs) were amplified using reverse primers labelled with 5′-FAM and cloned domestic chicken alleles as template. Eight FLRs were trialled for each of the BF and BLB genes to determine a subset that could best resolve all alleles in the population. All clones were provided by JK (Jacob et al. 2000; Shaw et al. 2007). A final set of four FLRs derived from domestic chicken clones was used for each of the BF and BLB genes. The BF FLRs comprised one BF1 allele clone, IB4m (Genbank accession number AM279337), and three BF2 allele clones: IB15M (AM282695), IB2M (AM282698) and IB4M (AM282693). The FLRs for BLB genes were derived from two BLB1 clones, IIB2m (AJ248573) and IIB4m (AJ248575), and two BLB2 clones, IIB2M (AJ248572) and IIB4M (AJ248574). No further alleles were found by using additional FLRs. Amplification of the BF and BLB FLR fragments followed the standard protocol above, except that a tenfold excess of reverse primer was used in the reaction mix (final concentration 1.25 μM).

All FLR PCR products were diluted 1:10 with sterilised water prior to hybridisation. Two microlitres of each diluted FLR fragment was mixed with 2 μl of unlabelled amplified sample fragment in 96-well plates. Plates were denatured at 95°C for 10 min, cooled to 55°C at 1°C/s, re-annealed at 55°C for 15 min and cooled for 15 min at 4°C. Hybridised samples were diluted with 8 μl water, and 2 μl of this mix was then transferred to 384-well plates containing 0.2 μl internal size standard (ABI ROX 2500 for BF fragments and ROX 500 for BLB fragments) and 4.8 μl water before running on an automated sequencer (ABI 3100). Heteroduplexes were separated using 50-cm capillaries, an injection voltage of 15 kV, 15-s injection time, a run voltage of 15 kV, run temperature of 30°C and 4% non-denaturing polymer (Applied Biosystems). Only samples that hybridised with the labelled FLR strand are detected by the laser.

FLR homoduplex and heteroduplex peaks were identified using GeneMarker software (SoftGenetics) and their motility scores taken in reference to the internal ROX size standard. Peaks with the same motility across different FLRs were designated as the same putative allele.
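
The allele-calling rule can be illustrated with a short sketch. This is not the procedure implemented in the genotyping software: the matching tolerance, the function name and the example scores are all hypothetical, and serve only to show how peaks that agree for every FLR are grouped as one putative allele.

```python
def assign_alleles(peaks, tolerance=1.0):
    """Group peaks into putative alleles.

    peaks: list of (sample_id, scores), where scores is a tuple holding one
    score per FLR in a fixed FLR order.
    Two peaks are treated as the same allele only if their scores differ by
    no more than `tolerance` for every FLR (tolerance is an assumed value).
    Returns a list of (sample_id, allele_index) calls.
    """
    alleles = []  # representative score tuple for each putative allele
    calls = []
    for sample_id, scores in peaks:
        for index, reference in enumerate(alleles):
            if all(abs(s - r) <= tolerance for s, r in zip(scores, reference)):
                calls.append((sample_id, index))
                break
        else:
            # No existing allele matches for all FLRs: record a new one.
            alleles.append(scores)
            calls.append((sample_id, len(alleles) - 1))
    return calls


# Hypothetical scores for two FLRs; s3 matches s1 for the second FLR only.
example = [("s1", (512.0, 640.5)), ("s2", (512.4, 640.1)), ("s3", (530.2, 640.3))]
print(assign_alleles(example))
```

In this toy example, the third peak is indistinguishable from the first with one FLR but is resolved by the other, which is the reason for using several reference strands.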

Cloning and sequencing RSCA alleles

Non-specific MHC alleles

Individuals were chosen for sequencing on the basis of their RSCA genotype. To enable the characterisation of every RSCA allele we cloned and sequenced at least two individuals carrying each RSCA allele. For example, samples 876 and 950 both contain the BF alleles ‘FI’ and the BLB alleles ‘ef’. In total, eight individuals were cloned and sequenced across both BF and BLB genes (Tables 1 and 2).

Table 1 Comparison between the results of class I BF RSCA and sequence alleles
Table 2 Comparison between the results of class II BLB RSCA and sequence alleles

Amplicons were cloned using a Zero Blunt TOPO PCR kit (Invitrogen). A minimum of 15 recombinant colonies were purified for each sample using standard techniques (Wizard Plus Minipreps, Promega). Cycle sequencing was performed on an ABI 3730 according to the manufacturer’s protocol (Applied Biosystems), in both directions for BF and in one direction only for BLB plasmids (as no differences between forward and reverse were recorded in any of the former).

Locus-specific amplifications

A subset of samples was amplified using locus-specific primers, enabling alleles to be assigned to a specific BF or BLB locus using the following protocol. The primers C477 (5′-GTTACGCCCCGCTTCCCGGTCACAACTAC-3′) and C69 (5′-GCGGTGCCACTGAGTGCCACCAGGG-3′) amplify a fragment of approximately 3.6 kb containing the BF1 locus alone by binding within adjacent genes (Shaw et al. 2007). Reaction mixes were identical to those for non-specific amplifications. PCR conditions comprised an initial denaturation step of 2 min at 96°C followed by 30 cycles of 96°C for 30 s, 66°C for 30 s and 68°C for 5 min and terminated by 68°C for 10 min. Nested PCRs were then performed on BF1 amplicons using the non-specific BF primers C71 and C75. The resulting PCR products, which only comprise the BF1 locus, were cloned as above. A reduced number of eight clones were sequenced from each individual.

The BLB1 locus was amplified by the primers C275 (5′-GGTTCCAGGTGCAAGGCGATGGTCTCTGTGCATACCT-3′) and C243 (5′-CCATGCCTTGCAGAAATCGCAGTG-3′), which amplify a 3.4-kb fragment spanning between the Blec and Tapasin genes (Jacob et al. 2000). Reaction mixes were identical to those for non-specific amplifications. Cycling conditions comprised an initial denaturation step of 2 min at 96°C, 11 cycles at 96°C for 1 min, 60°C for 30 s and 68°C for 2 min, followed by 21 cycles of 96°C for 1 min, 60°C for 30 s and 68°C for 2 min with a 20-s time increase per cycle. Reactions were terminated by a final extension step at 68°C for 10 min. Exon 2 of the BLB locus was then amplified from these BLB1 amplicons using nested PCRs with non-specific BLB primers. The final PCR products, therefore, comprised the BLB1 locus alone. A total of eight clones were sequenced from each individual. BLB locus designations were verified by additionally amplifying BLB2 products using the locus-specific primers C245 and C277 (Jacob et al. 2000), with reaction conditions and protocols as for BLB1 products.

DNA from cloned alleles was typed using RSCA. This enabled the assignment of RSCA alleles to specific loci in haplotypes where major and minor loci appear to be inherited in complete linkage (for example, the BLB RSCA alleles ‘a’ and ‘b’).

Sequence analyses

Sequences were edited and aligned using the program BioEdit version 7 (Hall 1999) and nucleotide diversity within gene regions was calculated in DnaSP version 4 (Rozas et al. 2003). Relative numbers of synonymous (dS) and non-synonymous (dN) substitutions were estimated in MEGA (Kumar et al. 2001), using the modified Nei and Gojobori (1986) distance model and applying a Jukes–Cantor correction for multiple substitutions at a site. The significance of deviation from neutral expectations was calculated by comparing the difference between dN and dS using t-tests with an infinite number of degrees of freedom. Analyses were conducted on all four loci, with their putative protein binding regions (PBR) and non-PBR codons analysed separately (Kaufman et al. 1992; Brown et al. 1993; Tong et al. 2006).
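
The two numerical steps in this paragraph can be sketched as follows. This is an illustration and not the code used by the analysis software: it assumes that the proportions of synonymous and non-synonymous differences have already been counted, and that standard errors for the two distances are supplied by the user (for example, from bootstrapping). A t-test with infinite degrees of freedom is treated here as a normal (Z) test.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def z_test(d_n, d_s, se_n, se_s):
    """Two-sided normal test of the null hypothesis d_n == d_s.

    se_n and se_s are user-supplied standard errors (an assumption of
    this sketch). Returns (z, p_value).
    """
    z = (d_n - d_s) / math.sqrt(se_n ** 2 + se_s ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail of N(0, 1)
    return z, p_value


# Hypothetical input values, for illustration only.
print(round(jukes_cantor(0.10), 4))
print(z_test(0.20, 0.10, 0.03, 0.04))
```

The correction matters most for divergent sequences: an observed proportion of 0.10 is adjusted only slightly upwards, whereas the adjustment grows rapidly as the proportion approaches 0.75.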

In finite populations, alleles are gained by new mutation and lost by drift. Expected allele frequencies can be predicted using an infinite allele model and deviations from expectations can reveal processes of selection. Allelic frequencies in the jungle fowl population were examined for evidence of deviation from neutral expectations using Ewens–Watterson tests (Ewens 1972; Watterson 1978). The test statistic FA is the observed frequency of homozygotes. Significance is tested against expected values under a model of neutrality for a given number of alleles (k) and population size (2n; Ewens 1972). Distributions of expected F values are available from http://allele5.biol.berkeley.edu/homozygosity/homozygosity.html.
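
As a rough illustration of the statistic, the sketch below computes F from allele counts, assuming the usual formulation in which F is the sum of squared allele frequencies (the homozygosity expected under random mating). The function names and the example counts are hypothetical; significance testing against the neutral distribution is not attempted here.

```python
def homozygosity_f(allele_counts):
    """Sum of squared allele frequencies for a list of allele counts."""
    total = sum(allele_counts)
    if total == 0:
        raise ValueError("at least one allele copy is required")
    return sum((count / total) ** 2 for count in allele_counts)


def minimum_f(k):
    """Lowest possible F for k alleles, reached when all are equally common."""
    return 1.0 / k


# Hypothetical counts for three alleles in a sample of 100 gene copies.
counts = [50, 30, 20]
print(round(homozygosity_f(counts), 4), round(minimum_f(len(counts)), 4))
```

An observed F close to the minimum of 1/k indicates a more even allele frequency distribution than is typical under neutrality, which is the direction expected under diversifying selection.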

Tajima’s D statistic was calculated for all loci using DnaSP (Rozas et al. 2003). The test is based on the difference observed between the number of variable (segregating) sites across a group of sequences and the average number of nucleotide differences between sequence pairs within the group (Tajima 1989). These two values are assumed to correlate under the neutral model of evolution, and significant deviations from this expectation can be attributed to selection.
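
For readers who wish to see the calculation, a minimal sketch of the statistic as defined by Tajima (1989) is given below. It is not the DnaSP implementation: it assumes equal-length aligned sequences without gaps or ambiguity codes, and the toy alignments are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are required")
    length = len(sequences[0])
    # S: number of segregating (variable) sites.
    s = sum(1 for i in range(length) if len({seq[i] for seq in sequences}) > 1)
    if s == 0:
        raise ValueError("no segregating sites")
    # pi: average number of pairwise nucleotide differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Two divergent groups of sequences (intermediate-frequency variants).
print(tajimas_d(["AAAA", "AAAA", "TTTT", "TTTT"]) > 0)
# Every variant is a singleton (rare variants only).
print(tajimas_d(["TAAA", "ATAA", "AATA", "AAAT"]) < 0)
```

The two toy alignments show the sign behaviour of D: variants held at intermediate frequency inflate the pairwise term and give a positive value, whereas an excess of rare variants gives a negative value.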

Amino acid sequences of all jungle fowl alleles were obtained using BioEdit. Sequence comparisons were made between the translated jungle fowl MHC sequences and domestic chicken sequences collected from Genbank (accession numbers given in Figs. 4 and 5). Only sequences of known MHC haplotype were used for comparisons. Bootstrapped (2,000 replications) phylogenetic trees were constructed from jungle fowl and domestic chicken sequences for both BF and BLB loci using MEGA. The BF amino acid tree was rooted using a sequence from Japanese quail (Genbank accession number AB005533), while the BLB tree was rooted by a cattle sequence (U77067).

Allele nomenclature

Historically, a variety of nomenclatures have been used for alleles of the chicken B region. We designated jungle fowl sequence alleles following the present standard (Miller et al. 2004); names indicate the specific locus (e.g. BF1), the initials JF (referring to jungle fowl) and arbitrary characters to differentiate alleles. RSCA alleles are designated with letters and sequence alleles with corresponding numbers, such that the BF RSCA allele ‘A’ corresponds to the nucleotide sequence allele BF2*JF1. Nucleotide sequences for all jungle fowl alleles have been deposited in Genbank under the accession numbers AM419160–AM419168 (BF sequences) and AM489767–AM489776 (BLB sequences).

Results

Jungle fowl MHC characterisation

RSCA alleles

A total of nine BF and ten BLB RSCA alleles are present in the 84 birds of the red jungle fowl population in 2006. All RSCA alleles can be distinguished by motility measures using the selected FLRs (Fig. 1). Each bird carries between two and four BF alleles and between one and four BLB alleles. BF and BLB alleles combine to give 12 MHC genotypes in the population (Fig. 2). Genotypes are found at unequal frequencies across the population, for example the genotype ABCD_abc is present in over 25% of all birds (Fig. 2).

Fig. 1

Separation of class I BF (a) and class II BLB (b) alleles by heteroduplex motility for each set of four FLRs

Fig. 2

Proportion of jungle fowl in the population with RSCA class I BF (a), class II BLB (b) and overall MHC genotypes (c) and MHC haplotypes (d). BF genotypes are shown in capital and BLB genotypes in lower case font. Minor alleles are shown in grey in panels a and b. The BLB ‘c’ allele is found in both BLB1 and BLB2 loci and is shaded to illustrate this

Examination of allelic distribution within genotypes allows the designation of six haplotypes; AB_ab, CD_c, E_dg, FI_ef, BG_hi and DH_j, where upper case letters represent BF and lower case BLB alleles.
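
The relationship between haplotypes and observed genotypes can be made explicit with a short sketch. It assumes, for illustration only, that a diploid genotype is simply the union of the alleles carried on its two haplotypes; the function name is hypothetical, and the haplotype labels are those listed above.

```python
from itertools import combinations_with_replacement

HAPLOTYPES = ["AB_ab", "CD_c", "E_dg", "FI_ef", "BG_hi", "DH_j"]


def genotype(haplotype_1, haplotype_2):
    """Combine two haplotype labels into a genotype label (BF_blb)."""
    bf_1, blb_1 = haplotype_1.split("_")
    bf_2, blb_2 = haplotype_2.split("_")
    bf = "".join(sorted(set(bf_1 + bf_2)))      # shared alleles are listed once
    blb = "".join(sorted(set(blb_1 + blb_2)))
    return f"{bf}_{blb}"


pairs = list(combinations_with_replacement(HAPLOTYPES, 2))
print(len(pairs))                      # possible haplotype pairs
print(genotype("AB_ab", "CD_c"))       # the most common genotype in Fig. 2
print(genotype("CD_c", "DH_j"))
```

Under this assumption, six haplotypes can pair in 21 ways, of which 12 genotypes are observed in the population; the common genotype ABCD_abc corresponds to the haplotype pair AB_ab and CD_c.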

Cloned sequence alleles

Nine BF and ten BLB nucleotide sequences were identified from cloned sequences (Tables 1 and 2). Sequences were verified by replication between individuals when possible. Exceptions to this are the sequences BF2*JF7, BF2*JF8, BLB2*JF9, BLB1*JF8 and BLB1*JF10. Although these rarer sequences were not verified in different individuals, they were replicated within individuals (samples 864 and 900). Several recombinant sequences were observed in jungle fowl BF sequence alleles. As most recombinant variants result from mismatched priming from partial products during amplification (Zorn and Krieg 1991), and none of the jungle fowl BF recombinant sequences could be replicated within samples, we discounted recombinant sequences as artefacts of amplification. Within samples, the numbers of RSCA and sequence alleles are the same.

Locus-specific amplification reveals there to be three BF1 alleles: RSCA alleles B, D and I corresponding to sequences 2, 4 and 6 (Table 1). BLB1-specific PCR amplified five unique BLB1 alleles: RSCA alleles b, e, g, i and j corresponding to sequences 1, 5, 4, 8 and 10 (Table 2). BLB2-specific primers amplified four unique BLB2 alleles: RSCA alleles a, d, f and h. Interestingly, both BLB1- and BLB2-specific primers amplify the sequence BLB*JF3, corresponding to RSCA allele c. This allele is reliably amplified by BLB1 primers in samples with either a ‘cj’ or ‘c’ genotype but is amplified by BLB2 primers in samples with the genotypes ‘abc’ and ‘cef’ (Table 2). BLB2 primers do not amplify any allele in samples with ‘cj’ or ‘c’ genotypes.

Red jungle fowl MHC sequence diversity and evidence of selection

There are 87 polymorphic nucleotide sites across the 767 bp jungle fowl BF sequences and 77 across the 277 bp BLB sequences. Nucleotide diversity (π) is approximately three times greater in BF2 sequences than in those from the BF1 locus and is over six times greater in codons involved with protein binding (Table 3). In contrast, there is little difference between BLB1 and BLB2 sequences (Table 3).

Table 3 Rates of exonic nucleotide diversity and non-synonymous (dN) and synonymous (dS) substitutions at jungle fowl and domestic fowl MHC loci

Translations of all jungle fowl BF and BLB exonic nucleotide sequences produce unique amino acid sequences (Fig. 3). None of the translated sequences contains a stop codon, so the alleles are unlikely to be pseudogenes. There are many variable amino acids across BF (27% of residues) and BLB (44% of residues) translated sequences (Fig. 3). Amino acid translation of BF1 sequences reveals that all three alleles can be identified by a locus-typical motif between codons 71 and 76 (Livant et al. 2004).
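
A check for premature stop codons of the kind described above can be sketched as follows. This is an illustration rather than the BioEdit procedure: it assumes the standard genetic code and that the sequence supplied begins in the correct reading frame, and the example sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA sequence; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def has_stop(dna):
    """True if the translated sequence contains a stop codon ('*')."""
    return "*" in translate(dna)


# Hypothetical sequences, for illustration only.
print(translate("ATGGCCTGGTAA"))
print(has_stop("ATGGCCTGG"), has_stop("ATGTAAGCC"))
```

Because the result depends on the reading frame, a fragment that starts part-way through a codon would need to be trimmed before such a check is applied.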

Fig. 3

Jungle fowl BF (α1 and α2 domains) and BLB (β1 domain) amino acid alignments. Numbers above the sequence relate to residue position. Positions of both α-helix (+++) and β-strands (= = =) are indicated for BF alleles. The putative protein binding regions (P) are indicated below the alignments (Kaufman et al. 1992; Hunt and Fulton 1998; Brown et al. 1993). The BF1*12 and BF2*12 domestic chicken reference sequences relate to Genbank accession numbers Z54314 and Z54326, respectively, while the BLB1*12 and BLB2*12 domestic chicken reference sequences relate to sequences AJ248577 and M29763

Calculations of synonymous (dS: those that do not result in protein change) and non-synonymous (dN: those resulting in protein change) substitutions for each of the four jungle fowl MHC loci are shown in Table 3. Overall ratios of dN to dS reflect an excess of non-synonymous mutations in these MHC genes, and are of a similar magnitude to those observed in domestic chickens. The significant excess of non-synonymous substitutions at sequences from the BF2 locus (dN/dS = 3.04, p < 0.001) is greater still across codons of the putative protein binding region (dN/dS = 5.32, p < 0.001). There is no significant deviation from neutrality (dN/dS = 1) at the BF1 locus, even when examining the codons involved in protein binding. Although selection is observed within the PBR of both BLB1 (dN/dS = 2.05, p = 0.046) and BLB2 (dN/dS = 2.12, p = 0.039) loci, no deviation from neutrality is observed when the entire 277 bp region is analysed.

Ewens–Watterson tests indicate that allele frequency distributions do not deviate significantly from those expected under neutrality at any of the jungle fowl MHC loci. However, in all cases, the allele frequencies are skewed towards a more even distribution, as expected in the presence of diversifying selection. When BF1 and BF2 loci are pooled, the Ewens–Watterson test approaches significance. The same observation can be made for pooled BLB1 and BLB2 loci (Table 4). Our results show that pooling loci has the potential to lead to false significant results and that the Ewens–Watterson test should not be performed on alleles combined across loci. Similar misleading results can be produced when performing the test at one locus in samples that are pooled across genetically diverged subpopulations (Worley et al. 2006).

Table 4 Tests of allele frequency deviations from neutrality by the Ewens–Watterson statistic, FA, for jungle fowl MHC loci

Tajima’s D statistic is expected to be positive in sequences under diversifying selection. Although positive values of D were found for all loci except BLB1 (Table 5), no locus showed significant deviation from the neutral expectation of zero.

Table 5 Tajima’s D statistic for all jungle fowl MHC loci

Phylogenetic analyses of jungle fowl and domestic chicken MHC sequences

Five jungle fowl BF alleles, three from the BF2 and two from the BF1 locus, are identical to domestic chicken sequences present on the Genbank database (data not shown). Two of the identical domestic chicken sequences are not characterised to B haplotypes but the third allele, matching jungle fowl BF1*JF2, occurs in both the domestic chicken B4 and B21 haplotypes. None of the BLB jungle fowl alleles, which are more variable than the BF alleles, are shared with domestic chicken sequences published on the public database.

Jungle fowl alleles do not cluster together in the amino acid trees of BF and BLB sequences but are scattered throughout domestic fowl sequences (Figs. 4 and 5). A BF tree shows a node with 64% bootstrap support that largely separates BF1 alleles from BF2 alleles (Fig. 4). The smaller, largely BF1 clade encompasses all jungle fowl BF1 alleles and the majority of domestic chicken BF1 sequences from Genbank. There are some exceptions to the phylogenetic division of loci, namely BF2*06, BF1*02 and BF1*06. In contrast to the BF1 gene, alleles of the BF2, BLB1 and BLB2 loci do not form monophyletic clades.

Fig. 4

Amino acid tree of jungle fowl BF exons 2 and 3 (in bold type) and domestic chicken sequences with locus and B haplotype designation from the Genbank database, bootstrapped with 2,000 replications in MEGA. Domestic chicken sequences are shown with their B haplotype designation. The majority of BF1 genes form a separate clade with 64% bootstrap support (indicated by an arrow). The tree is rooted with a Japanese quail (Coturnix japonica) sequence, accession number AB005529. Accession numbers of other included sequences are as follows: AF013491, AF013492, AF013494, AF013495, AF013496, L28959, Z54314, Z54315, Z54316, Z54318, Z54319, Z54320, Z54322, Z54323, Z54324, Z54325, Z54330, Z54359, Z54360, X12780

Fig. 5

Amino acid tree of jungle fowl BLB exon 2 alleles (in bold type) together with domestic chicken sequences with known locus designation from the Genbank database. The tree is bootstrapped with 2,000 replications in MEGA. Domestic sequences are shown with their B haplotype designation. The tree is rooted by a homologous cattle (Bos taurus) sequence (U77067). Accession numbers of other included sequences are as follows: AJ248573, AJ248574, AJ248575, AJ248576, AJ248577, AJ248578, AJ248579, AJ248580, AJ248581, AJ248582, AJ248583, AJ248584, AJ248585, AJ248586, AJ248572, M26306, M26307, U91528, U91529, U91532, U91536, U91537

Discussion

Diversity and selection within jungle fowl MHC genes

A total of nine BF and ten BLB alleles are present in the jungle fowl population, of which four BF and all ten BLB sequences have not been identified previously in domestic chickens. A similar rate of new alleles was reported in a study of the MHC class I variation within Caipira native chickens where nine novel BF sequences were characterised from a total of 23 observed sequences (Lima-Rosa et al. 2004). The remaining five jungle fowl BF sequences are shared with those derived from commercial broilers, leghorns and Brazilian Caipira domestic breeds. Lineage sharing between taxa (Figueroa et al. 1988) is widespread in the MHC regions of many species and is a key indicator of selection acting to enhance variation by maintaining alleles in a population longer than expected under neutrality (Takahata 1990). Allelic lineage sharing between jungle fowl and domestic chickens could also result from the relatively recent divergence between the two subspecies following domestication. The diversity of MHC genes in several domesticated species such as cattle, sheep and pigs is thought to be too high to represent founding by a small number of individuals that have remained genetically isolated since the original domestication event (Vilà et al. 2005). If introgression between jungle fowl and domestic chickens (either during domestication or more recently) explained all MHC allele sharing we might expect the presence of shared alleles in both class I and II sequences. The presence of shared alleles in only class I loci is difficult to explain in terms of introgression alone, suggesting selection is important in the jungle fowl MHC.

Relative frequencies of synonymous and non-synonymous mutations can reveal the effect of underlying selection. Neutrally evolving loci are expected to accumulate synonymous and non-synonymous mutations at similar rates, whereas purifying selection produces an excess of synonymous mutations. In contrast, jungle fowl BF and BLB loci show evidence of selection for diversity. This is indicated by observations that alleles differ by many nucleotides (Takahata and Nei 1990) and that dN/dS ratios are above one (2.05 and 1.37, respectively). Estimates of both nucleotide substitution and nucleotide diversity are higher in both BLB loci than in BF loci. However, both BLB loci have similar ratios of non-synonymous to synonymous mutations, and departure from neutral expectations (dN/dS = 1) is not observed across the entire length of either locus (Table 3). Significant deviation from neutrality is observed across combined BF loci, although frequencies of both non-synonymous and synonymous substitution are elevated in the BF2 locus compared to the BF1 locus. All four loci show an increase in ratios of dN/dS at codons of the PBR (Brown et al. 1993) compared to non-PBR codons, reflecting the functional importance of these regions. Indeed, the significant excess of non-synonymous substitutions across both BF loci can be attributed solely to selection acting on the BF2 locus (dN/dS = 3.04, p < 0.001). Codons involved in MHC class II peptide binding also showed significant deviation from neutrality, although to a lesser degree than that observed at the BF2 locus. There was little difference between the results presented in Table 3, which use the PBR codons designated by Brown et al. (1993), and those obtained when the PBR codons identified by Tong et al. (2006) were used. Our results demonstrate that care should be taken when using multilocus estimates of selection at the MHC.

In neutrally evolving equilibrium populations, the expectation of Tajima’s D is near zero, as variation based on segregating sites and the average number of pairwise differences is assumed to be the same. Positive Tajima’s D statistics, such as those observed in jungle fowl BF2 and BLB2 loci, are indicative of diversifying selection while a negative value, as in BLB1, is indicative of purifying selection. However, caution needs to be taken with the use of Tajima’s D in captive populations, as population bottlenecks can also lead to an increase in the value of D without input from diversifying selection.

Examination of allele frequency distributions can reveal current rather than historical evidence of selection. Allele frequency distributions of all four loci are more even than expected under neutrality, an indicator of diversifying selection, but Ewens–Watterson tests indicate no significant deviation from neutral expectation (Table 4). The Ewens–Watterson test is a conservative test of selection that has little power when allele number is low, as is the case here. This may explain the non-significance of this trend. A more even allele frequency distribution than expected at all four loci could be a result of infrequent recombination across the BF/BL region: selection resulting in a change in allele frequency at one locus will also affect the allele frequencies of all loci in linkage disequilibrium. Alternatively, our result could indicate that although historical selection has acted on the MHC, as indicated by the dN/dS tests and positive Tajima’s D values, there are no selection pressures currently acting on the population. Accounting for the possible presence of null non-amplifying alleles in individuals with fewer than eight (four BF and four BLB) MHC alleles (Table 3) does not appear to change the significance of Ewens–Watterson tests at any locus.

There does appear to be variation in the levels of diversity and evidence of selection present between the four BF and BLB loci in the jungle fowl population. Differential selection between genes from different MHC classes could have important implications for disease resistance, as genes of the class I and II regions have different roles in the vertebrate immune response (for instance, the recognition of intracellular and extracellular pathogens, respectively). While the jungle fowl population exhibits greater nucleotide diversity and mutation rates in BLB than BF loci, greater deviation from neutral expectations is found at the BF2 locus. The evolutionary importance of the BF2 locus with respect to disease resistance is supported by evidence that levels of gene expression are tenfold higher than those of the BF1 locus (Wallny et al. 2006; Shaw et al. 2007). Furthermore, resistance to one important poultry disease, Rous sarcoma virus, is associated with specific BF2 alleles and not those from BF1 (Wallny et al. 2006). In conclusion, differing levels of diversity, expression and selection support different functional roles for loci of the jungle fowl BF/BL region, with proteins from the BF2 locus likely to have a more active immunological role in the jungle fowl population.

Allelic composition of jungle fowl MHC

Although every one of the six jungle fowl MHC haplotypes contains an allele from the BF2 locus, we were not able to amplify an allele at every other locus in all haplotypes. No BF1 allele could be amplified from the haplotype E_dg and there is no amplified BLB2 allele in the haplotype DH_j. The inability to amplify alleles from all BF–BLB loci has also been observed in previous studies of the domestic chicken MHC (Kaufman et al. 1999a; Jacob et al. 2000; Lima-Rosa et al. 2004; Shaw et al. 2007). In some cases, such as the BF1 genes in the domestic B14 and B15 haplotypes, the missing locus is a pseudogene (Wallny et al. 2006; Shaw et al. 2007). In other cases, such as certain BLB alleles, intact functional gene loci were not amplified for reasons that were never clarified (Jacob et al. 2000). A more detailed analysis is required to determine causes of non-amplification for the jungle fowl alleles.

The jungle fowl BLB ‘c’ RSCA allele (sequence allele BLB*JF3) is extremely common in this population, occurring in over 60% of the individuals (Fig. 2). The allele is amplified by both BLB1 and BLB2 specific primers depending on the genotype of an individual. The simplest explanation for this finding is that there has been an inversion around the tapasin gene with the recombinational breakpoints in the conserved portions of the BLB genes, resulting in two haplotypes: one with BLB*JF3 in the BLB2 locus next to the RING3 gene and another with BLB*JF3 in the BLB1 locus next to the Blec gene. However, the reason why the apparent location of the ‘c’ allele depends on the overall BF–BLB haplotype remains a mystery. Further investigation is, therefore, required.

Comparative evolution of jungle fowl and domestic chicken MHC

Population bottlenecks during domestication may be expected to lead to a loss of genetic diversity. Comparisons of red jungle fowl and domestic chicken nucleotide diversity are difficult given that only one population of the former is currently available, and characterised domestic sequences originate from several different strains. Acknowledging these problems, we found no difference between the nucleotide diversities of jungle fowl and domestic chicken BF2, BLB1 and BLB2 loci (Table 3). However, nucleotide diversity is lower in jungle fowl BF1 sequences than in domestic chicken lines. Correspondingly, only 3% of jungle fowl BF1 nucleotide sites are polymorphic compared to 14% in one sample of domestic chickens (Wallny et al. 2006). Lower BF1 diversity in red jungle fowl compared to domestic chickens is unlikely to result from a reduced number of alleles in this population, as there are also few domestic chicken alleles available for comparative analysis.

The occurrence of red jungle fowl B alleles in haplotypes mirrors that observed in domestic chickens. Identical BF1 alleles are found in multiple haplotypes in both subspecies. For example, the jungle fowl allele ‘B’ (BF1*JF2) is present in both the AB_c and BG_hi haplotypes, while the domestic chicken BF1 alleles in B4 and B21 haplotypes are identical (Kaufman et al. 1999a; Wallny et al. 2006; Shaw et al. 2007). Alleles of BLB loci are also observed in multiple haplotypes in both jungle fowl and domestic chickens.

The significance of dN/dS ratios appears to differ between jungle fowl and domestic chickens. Although there are indications of selection in all loci tested (as dN/dS > 1), significant deviation from neutral expectations is present in both BF1 and BF2 loci of domestic fowl but in the BF2 locus alone in the jungle fowl population. Diversifying selection also appears stronger in domestic fowl BLB loci than in red jungle fowl BLB loci (Table 3). There are currently no data available to contrast rates of selection between class I and II genes from other avian species, although evidence for selection has been identified in both gene regions (for example, Edwards et al. 1995; Richardson and Westerdahl 2003; Jarvi et al. 2004; Hansson and Richardson 2005; Westerdahl et al. 2005).

Amino acid trees of both BF and BLB reveal that jungle fowl alleles are spread widely throughout those of domestic fowl (Figs. 4 and 5). Phylogeny reconstructions of MHC alleles cannot, therefore, allow the designation of ancestral and derived sequences. Figure 4 shows some support for the phylogenetic separation of BF1 and BF2 alleles as proposed by Livant et al. (2004). Although there are domestic fowl alleles which do not follow this division, most notably those from the B2 haplotype (Fig. 4), all jungle fowl alleles can be separated using this method. Many of the amino acid differences that separate the BF1 clade are located within codons involved in protein binding (Livant et al. 2004; Wallny et al. 2006). This supports the hypothesis of differential functional roles for loci with different levels of expression (Kaufman et al. 1995, 1999a; Jacob et al. 2000; Livant et al. 2004; Wallny et al. 2006; Shaw et al. 2007).

Red jungle fowl BLB alleles are less widely spread than BF alleles throughout an amino acid tree of domestic chicken sequences (Fig. 5). Although initial sequences from the B21 haplotype appeared to reveal fixed differences between the BLB1 and BLB2 loci (Juul-Madsen et al. 2000), an amino acid tree of BLB sequences shows there is no phylogenetic differentiation between the two BLB loci of either jungle fowl or domestic fowl (Fig. 5). However, smaller bootstrap-supported clades do generally contain small numbers of alleles from within a single locus. For example, the clade of five sequences containing the jungle fowl allele BLB1*JF10 contains four sequences from the BLB1 locus and one from the BLB2 locus.

Domestic chicken B haplotypes exhibit differential resistance and susceptibility to a range of diseases. For example, B21 homozygote and heterozygote domestic chickens are resistant to Marek’s disease (Bacon 1987) while B2 homozygotes, although more susceptible to Marek’s, are resistant to Rous sarcoma (Bacon et al. 1981). B4 domestic chickens are susceptible to both these diseases but are resistant to Salmonella (Joiner et al. 2005). Genes mapping close to the BL–BLB genes are also candidates for disease resistance (Shiina et al. 2007). Resolving disease resistance beyond the level of B haplotype, and thereby directly determining the exonic variants that confer resistance or susceptibility to specific diseases, requires population-wide association studies. Our RSCA-based method allows MHC variation to be genotyped at the population level relatively inexpensively and with good resolution. Direct associations with particular gene loci can be made and it is possible to immediately analyse nucleotide differences by sequencing RSCA alleles. Work of this type is, therefore, of significant importance for screening populations for resistance to pathogens such as avian flu. Disease association studies at the level of single nucleotides have contributed widely to the understanding of human diseases (for example Single et al. 2007; Bakker et al. 2006) but are impossible for MHC class I and II genes due to the extreme density of variable sites within genes.

We have used our typing method to analyse the BF/BL region of a jungle fowl population for the first time. Our method also has useful applications in wild bird populations. The ability to accurately and easily screen wild populations at individual MHC gene loci is required before the ecological and evolutionary processes acting to maintain variation at the MHC can be further examined. Such reliable MHC screening could have implications for many fields of research, including host-pathogen interactions, sexual selection and mate choice.