Introduction

The major histocompatibility complex (MHC) is well known for the large number of genes and diversity of alleles found in most vertebrates (Klein and Figueroa 1986; Klein et al. 1993). These genes are divided into three classes, of which class I and class II genes exhibit the most polymorphism (Trowsdale 1996). Class I genes encode molecules that recognise intracellular antigens and class II genes encode molecules that recognise extracellular antigens, such as those of invading parasites and other pathogens. Evidence for the MHC’s role in immune function is based, in part, on the patterns of variation concentrated in the peptide-binding region (PBR) (Hughes and Nei 1989). Variation within the PBR suggests that there has been evolutionary pressure for organisms to combat a wide range of immunological challenges.

The human MHC is better known as the human leukocyte antigens (HLA) and the HLA-DRB genes consist of four functional loci that encode molecules used in immune response (-DRB1, 3, 4, and 5). Three hundred and eighty-nine sequences assigned to the HLA-DRB1 locus can be divided into lineages based on nucleotide and amino acid sequence motifs (i.e. HLA-DRB1*01, -DRB1*02, etc.). Five pseudogene loci have also been identified (-DRB2, 6, 7, 8 and 9) and a total of 470 HLA-DRB sequences have been described (Robinson et al. 2003; Marsh et al. 2005).

The MHC appears to be polymorphic in all primate species studied thus far. Molecular studies have shown that rhesus macaque (Macaca mulatta) Mamu-DRB loci are also highly variable. To date, 134 Mamu-DRB sequences have been identified using molecular genetic techniques (Doxiadis et al. 2001; Robinson et al. 2003). Mamu-DRB nucleotide sequences have been given locus and lineage designations based on their similarity to human sequences. Like the human ortholog, the Mamu-DRB1 locus also consists of several lineages: Mamu-DRB1*1/10, 1*03, 1*04 and 1*07. Homologs of other HLA-DRB loci have also been identified in rhesus macaques. These include the -DRB3, 4, 5 and 6 loci. Interestingly, not all loci and lineages are shared between the two species. For example, the HLA-DRB1*02 and -DRB1*09 lineages and the HLA-DRB2, 7, 8 and 9 loci are not present in rhesus macaques. Likewise, rhesus macaques have numerous loci that are not orthologous to humans and, hence, are identified by workshop designations (e.g., DRB*W201).

Humans and rhesus macaques also differ in the degree of variation observed at their respective Mhc-DRB loci and among lineages. While both species exhibit variation within the DRB1*03 lineage, variation is also high in the rhesus macaque -DRB6 and -DRB*W loci. As in humans, Mamu-DRB6 loci are pseudogenes and research demonstrates that some -DRB1, -DRB3 and -DRB*W genes are also not transcribed (De Groot et al. 2004). The presence of a large number of non-functional sequences may provide the raw material for creating new functional genes through gene conversion and recombination (Beck and Trowsdale 2000).

While humans and rhesus macaques have been the primary focus of primate MHC research to date, little is known about the MHC of other Old World monkeys (OWM). Only DNA sequences in a few macaque species (M. arctoides: Kriener et al. 2000; M. fascicularis: Kriener et al. 2000; Leuchte et al. 2004; M. mulatta: Robinson et al. 2003; M. nemestrina: Zhu et al. 1991; M. silenus: Blankenburg et al. 2003), and only one sequence from a baboon (Papio hamadryas: Riess et al. 1990) and three sequences from one drill (Mandrillus leucophaeus: Gyllensten et al. 1991), have been published in the ImMunoGeneTics Information System MHC-NHP database. In fact, only three macaque species have been examined thoroughly. Therefore, little is known about the variability of loci, lineages, or even haplotypes between OWM species. In this paper, we describe Mhc-DRB diversity in another OWM, the mandrill (Mandrillus sphinx). We are currently studying the mandrill to better understand the functions and complexities of class II MHC genes in primates and to evaluate whether the patterns of Mhc-DRB diversity described for rhesus macaques are unique among OWMs.

Materials and methods

Source of DNA

Peripheral blood samples were collected from 11 wild-born mandrills housed at the Centre International de Recherches Medicales, Franceville (CIRMF) in Gabon. Samples, obtained during routine health examinations, were treated with 1×SSC (0.15 M NaCl, 0.015 M sodium citrate). Genomic DNA was extracted at the Unité de Génétique des Ecosystèmes Tropicaux (UGENET) as described by Wickings and Dixson (1992). Samples were stored at −60°C until they were transported to the Department of Biological Anthropology, Cambridge University (DEFRA-CITES: AHZ/2035A/2002/4-250403/01-03 and AHZ/2035C/2004/6-262315/01-03). Tissue samples were also collected from two captive stillborn mandrills at the Chester Zoo (United Kingdom). These samples were stored at −20°C at the Department of Biological Anthropology, Cambridge University, and genomic DNA was extracted using the QIAamp DNA Mini Kit (Qiagen, UK).

Polymerase chain reaction (PCR)

Amplification of MHC sequences was performed as described by Knapp et al. (1997), with slight modification to the cycling conditions. Thirty to 40 ng of genomic DNA, in 25 μl of 1×PCR buffer [pH 8.5, MgCl2 1.5 mM (Invitrogen, Paisley, UK)], 2.5 mM of deoxyribonucleotide triphosphates (dNTPs) (Invitrogen), 25 pmol of each of the oligonucleotide primers 5′MDRB and 3′MDRB+GC (Table 1) and 1 unit of Taq polymerase (Invitrogen) was used for amplification by PCR. Optimal cycling conditions consisted of 30 cycles of 90 s denaturation at 94°C, a 90-s annealing at 54°C and a 90-s extension at 74°C. A single cycle of 10 min at 74°C was carried out, followed by a holding temperature of 4°C. After PCR, 5 μl of the product was electrophoresed on a 1.5% agarose gel to confirm successful amplification of DNA samples.

Table 1 Nucleotide primer sequence list

Cloning and sequencing

Initially, cloning and sequencing, as well as PCR, denaturing gradient gel electrophoresis (DGGE) and direct sequencing, were used to identify Mhc-DRB sequences in two wild-born mandrills. Clones were sequenced (Macrogen, Korea) in both directions using M13 forward (−20) and M13 reverse (−20) primers (Table 1). A total of 322 and 262 clones were screened from multiple cloning experiments for mandrills 10 and 18, respectively. As DGGE with direct sequencing was quicker and less costly than cloning and sequencing, and reliably identified all sequences, it was chosen as the primary method for the present study.

Denaturing gradient gel electrophoresis

The technique of Mhc-DRB allele separation using DGGE established by Knapp et al. (1997) was used with slight modifications of parallel conditions specifically required for identification of mandrill Mhc-DRB. A perpendicular DGGE was used to determine optimal conditions for parallel DGGE separation of mandrill DRB sequences. The GC-clamped PCR product was electrophoresed in a 12.6% acrylamide (37.5:1 acrylamide to bisacrylamide) gel with an increasing gradient from 0 to 80% denaturant (100% denaturant=7 M urea and 40% formamide) for 3.5 h at 300 constant volts. Perpendicular gels were electrophoresed in a BIORAD D-GENE (BioRad, Hemel Hempstead, UK) apparatus in 1×TAE buffer at 60°C. After electrophoresis, the gel was silver-stained (Bassam et al. 1991) and photographed.

According to perpendicular gel results, 7–20 μl GC-clamped PCR products were electrophoresed on a 41–50% parallel denaturing gradient gel for 3.75 h at 300 constant volts to separate mandrill -DRB sequences. As described for the perpendicular DGGE, the 12.6% acrylamide gel was electrophoresed in a BIORAD D-GENE apparatus in 1×TAE buffer at 60°C. After electrophoresis, gels were stained using Sybr-Gold (BioGene, Cambridge, UK) for 10 min and visualised using UV illumination. Gel plugs of individual bands were excised using wide-bore pipette tips and eluted in 50 μl distilled water at room temperature overnight, as described by Knapp et al. (1997).

DRB sequencing

To generate PCR product for sequencing from DGGE, 15 μl of the eluted DNA was re-amplified in a 25-μl reaction using 10×PCR buffer (Invitrogen), 2.0 mM MgCl2 (Invitrogen), 2.5 mM of dNTPs, 12.5 pmol of each of the hemi-nested primers 5′MDRBr- and 3′DRB1B-M13 (Table 1) and 0.5 unit of Taq polymerase. Cycling conditions consisted of 30 cycles of 20 s denaturation at 94°C, 30 s annealing at 60°C and 36 s extension at 72°C, a single cycle of 10 min at 72°C and a final holding temperature of 4°C.

In preparation for sequencing, 5 μl of the PCR product was electrophoresed on a 1.5% agarose gel to confirm successful amplification of DNA samples. Samples with visible primer-dimer on agarose gels were treated with 3 μl ExoSapIt (Amersham Biosciences, UK) and incubated at 37°C for 15 min (activation), then at 80°C for 15 min (deactivation), and finally held at 4°C. All sequencing was performed by Macrogen (Korea). DNA was sequenced in both directions using M13 forward (−20) and M13 reverse-pUC (−40) primers (Table 1).

Masp-DRB nomenclature

Official locus and sequence names were requested and designated by the ImMunoGeneTics Information System MHC-NHP group (Robinson et al. 2003). New sequences have been deposited in GenBank (Accession numbers: DQ103715-47).

Sequence analysis and phylogenetic trees

Mandrill DRB sequences were compared by alignment to previously published OWM sequences, including Mamu-, Mafa- (Macaca fascicularis), Maar- (Macaca arctoides), Male- (Mandrillus leucophaeus) and Paha-DRB (Papio hamadryas). Phylogenetic trees were constructed using the neighbour-joining method with Jukes and Cantor distances in the computer program MEGA version 2.1 (Kumar et al. 1993). The analyses were based on 252 nucleotides. Bootstrap analyses using 500 replications were performed to determine the repeatability of sequence-clustering patterns. A second analysis was conducted for mandrill and other OWM Mhc-DRB6 sequences.
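As an illustration of the distance measure named above, the following is a minimal Python sketch of the Jukes and Cantor correction applied to a pair of aligned nucleotide sequences. It is not the MEGA implementation: the function names, the handling of gap and unsequenced sites (pairwise exclusion) and the toy sequences are assumptions made for this example only.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence is not A, C, G or T (for example gaps or
    unsequenced positions) are skipped; this pairwise exclusion is an
    assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor_distance(p):
    """Jukes and Cantor corrected distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    if p >= 0.75:
        raise ValueError("correction is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Hypothetical toy alignment, not real Masp-DRB data.
p = p_distance("ACGTACGT", "ACGTACGA")
d = jukes_cantor_distance(p)
```

The corrected distance is always at least as large as the observed proportion of differences, because it allows for multiple substitutions at the same site. A matrix of such pairwise distances is the input to neighbour-joining.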

Statistical analyses

Masp-DRB sequences were analysed with MEGA version 2.1 software (Kumar et al. 1993). Variable amino acid positions, presumed to represent the PBR (Brown et al. 1993), were assigned as for rhesus macaques (Slierendregt et al. 1992). Rates of non-synonymous (d N) and synonymous (d S) substitutions within the PBR and non-PBR were calculated according to Nei and Gojobori (1986), with the Jukes and Cantor (1969) correction. Significance levels were determined using a Z test (p<0.05).
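The comparison of d N and d S described above can be sketched as a Z test on their difference. The Python example below assumes that the mean d N and d S and their standard errors have already been estimated (for example by MEGA); the function name and the numerical inputs are hypothetical and are not values from this study.

```python
import math


def z_test_dn_ds(d_n, d_s, se_n, se_s):
    """Z statistic and two-sided p-value for the difference d_n - d_s.

    Z = (d_n - d_s) / sqrt(se_n**2 + se_s**2), with the p-value taken
    from the standard normal distribution. The standard errors are
    assumed to have been estimated beforehand.
    """
    variance = se_n ** 2 + se_s ** 2
    if variance <= 0:
        raise ValueError("standard errors must not both be zero")
    z = (d_n - d_s) / math.sqrt(variance)
    # Two-sided tail probability of a standard normal variate.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical inputs for illustration only.
z, p_value = z_test_dn_ds(d_n=0.30, d_s=0.10, se_n=0.03, se_s=0.04)
significant = p_value < 0.05
```

A positive Z with a small p-value indicates an excess of non-synonymous over synonymous substitutions, the pattern expected under positive selection. A Z close to zero is consistent with neutrality.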

Results

Cloning and sequencing

Multiple exon 2 Masp-DRB sequences were amplified using 5′MDRB and 3′MDRB primers. Cloning and sequencing identified 11 different sequences, 252 nucleotides in length, in two individuals (Table 2). No sequences were identical to those previously published for other primates.

Table 2 Masp-DRB typing results for two unrelated mandrills. The sequence found using only one method is indicated in bold print

Separation of Masp-DRB sequences using DGGE

The optimal denaturing gradient for mandrills (41–50%) was determined using perpendicular DGGE for one mandrill used in the cloning and sequencing experiments. As with cloning and sequencing, multiple Masp-DRB sequences were amplified using the 5′MDRB and 3′MDRB (this time with a GC-clamp) primers. A direct comparison of results from cloning and sequencing and DGGE and direct sequencing revealed the same 11 sequences, with one additional sequence obtained using DGGE (Table 2). Authenticity of this sequence was confirmed by multiple DGGE experiments.

While a 41–50% parallel denaturing gradient could be used to separate most mandrill sequences (Fig. 1), this gradient was not optimal for some sequences in other study individuals and a second gradient (40–60%) was required for four individuals (Fig. 2a,b; Table 3).

Fig. 1

Parallel DGGE parameters: 40–60% gradient, 4.5 h (+15 min longer due to low amperage), 300 V and 60°C

Fig. 2

Using DNA from the same three subjects, a shows a 3.75-h DGGE using a gradient of 41–50% and b shows a 4.5-h (+15 min longer due to low amperage) DGGE using a gradient of 40–60%

Table 3 Masp-DRB sequences for 13 mandrills

Using DGGE and direct sequencing, 22 additional sequences were identified in the remaining 11 individuals. Due to the use of hemi-nested primers for gel plug amplification, the sequences reported are 237 nucleotides in length. Although sequences derived from cloning had no ambiguities and were longer in length, the DGGE method identified the complete sequence repertoire for both subjects tested. As some bands migrated to identical places on the DGGE, assumptions of sequence identity based on migration patterns were not possible for mandrills.

Identification of mandrill DRB alleles

A total of 33 different Masp-DRB sequences were identified using cloning and sequencing and/or DGGE and direct sequencing (Table 4). Each individual had a unique DGGE banding pattern (Fig. 1). Predicted amino acid sequences and nucleotide sequences are presented in Figs. 3 and 4, respectively. Up to seven different sequences were identified for a single individual (Table 3).

Table 4 Summary of Masp-DRB sequences and typing techniques used to identify each gene
Fig. 3

Predicted amino acid sequences of mandrill DRB (exon 2). Numbering of amino acids is shown above the consensus of Masp-DRB sequences; bold typeface indicates inferred PBR sites. Identity to consensus is shown by dashes (–). Deletions are indicated by dots (.) and a stop codon is indicated by an exclamation point (!). An asterisk (*) denotes an unsequenced region

Fig. 4

Nucleotide sequence of mandrill DRB (exon 2). Codon numbering is shown above the consensus of Masp-DRB sequences. Identity to consensus is shown by dashes (–). A deletion is indicated by dots (.) and asterisks (*) denote unsequenced nucleotides

Phylogenetic analysis of the Masp-DRB sequences was performed to identify the locus and lineage of each new sequence. Trees included the new sequences and published OWM Mhc-DRB sequences (Fig. 5). Mandrill representatives of -DRB1, -DRB3, -DRB5, -DRB6 and -DRB*W loci and lineages were recognised. Five mandrill sequences clustered in the -DRB1*03 lineage and four sequences clustered in each of the -DRB5 and -DRB*W25 loci. Two mandrill sequences clustered with other -DRB3 and -DRB*W301 sequences. Other single sequences clustered with other -DRB*W sequences. Sequences assigned to the -DRB1*04 lineage clustered with an assortment of other OWM sequences (i.e. -DRB1*04, -DRB4*01, -DRB*W101). Two sequences were assigned novel designations.

Fig. 5

Phylogenetic tree of 33 Masp-DRB sequences and other selected OWM sequences. All non-mandrill sequences are located in the IMGT MHC-NHP database (Robinson et al. 2003). The tree configuration was derived from nucleotide sequences using the neighbor-joining method of the MEGA program. Numbers on the branches refer to confidence probability values

Six Masp-DRB6 sequences were also identified and a separate phylogenetic tree was created to compare these with other pseudogene sequences (Fig. 6). One Masp-DRB6 sequence clustered with the rhesus macaque Mamu-DRB6*0101, three with Mamu-DRB6*0110, and two clustered with the cynomolgus macaque Mafa-DRB6*0101.

Fig. 6

Phylogenetic tree of the Masp-DRB6 sequences and other selected OWM-DRB6 sequences. All non-mandrill sequences are located in the IMGT MHC-NHP database (Robinson et al. 2003)

Patterns of nucleotide substitution

Most nucleotide substitutions were confined to residues representing the PBR. Based on work by Brown et al. (1993), 17 positions were identified as most variable (i.e. four or more amino acid changes) and assumed to make direct contact with peptides. The variable residues were at positions 11, 13, 23, 26, 28, 30 and 37 of the β sheet and 57, 61, 67, 70, 71, 73, 74, 77, 78 and 86 of the α helix (Fig. 7). These positions are directly comparable to those described for rhesus macaques by Slierendregt et al. (1992).

Fig. 7

Amino acid variability plot for 33 Masp-DRB alleles. The horizontal axis shows the amino acid position, and the vertical axis shows the number of different amino acids at a given position. The most variable sites in the rhesus macaque MHC (Slierendregt et al. 1992) are indicated by asterisks (*)
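The quantity plotted on the vertical axis, the number of different amino acids observed at each position, can be computed with a few lines of Python. The sketch below is illustrative only: the function names, the symbols treated as non-residues and the toy sequences are assumptions, and the threshold of four follows the "four or more" criterion used in the text.

```python
def variability_by_position(sequences, first_position=1, ignore=".-*!"):
    """Count the distinct amino acids observed at each aligned position.

    `sequences` is a list of equal-length, aligned one-letter amino acid
    strings. Characters in `ignore` (deletions, unsequenced residues and
    stop codons in this sketch) are not counted as amino acids.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    counts = {}
    for index in range(length):
        residues = {s[index] for s in sequences if s[index] not in ignore}
        counts[first_position + index] = len(residues)
    return counts


def most_variable_positions(counts, threshold=4):
    """Positions with at least `threshold` different amino acids."""
    return sorted(pos for pos, n in counts.items() if n >= threshold)


# Hypothetical four-residue fragments starting at position 9,
# not real Masp-DRB sequences.
toy = ["EYST", "EYSK", "ECSA", "EQSV"]
counts = variability_by_position(toy, first_position=9)
variable = most_variable_positions(counts)
```

With real data, the alignment of predicted exon 2 amino acid sequences would replace the toy list, and the resulting counts could be plotted against position to reproduce a variability plot of this kind.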

Estimated rates of non-synonymous and synonymous substitutions within the PBR and non-PBR were also examined. The PBR of non-DRB6 sequences (Table 5) contained significantly more non-synonymous changes than synonymous changes (p<0.0001). However, the region outside the PBR contained fewer non-synonymous than synonymous substitutions, and this difference was not significant (p=0.109). Pseudogenes were examined separately, and no significant differences between frequency of non-synonymous and synonymous substitution were found (Table 6).

Table 5 Estimated rates of non-synonymous (d N) and synonymous (d S) substitutions for the peptide-binding region (PBR) and the non-peptide-binding region (Non-PBR) for DRB exon 2 of mandrill sequences (excluding -DRB6)
Table 6 Estimated rates of non-synonymous (d N) and synonymous (d S) substitutions for PBR and non-PBR regions for DRB exon 2 of mandrill -DRB6 sequences only

Discussion

Using DGGE and direct sequencing, as well as cloning and sequencing, a total of 33 Masp-DRB sequences were identified in 13 mandrills. A simple comparison of the first 30 nucleotides of exon 2 demonstrates more than 91% shared sequence similarity between mandrills and other OWMs. Two sequences were given novel DRB*W designations.

Through analysis of the peptide-binding region, significantly higher rates of non-synonymous than synonymous substitutions within the Masp-DRB were found. Hughes and Nei (1989) argue that substitutions favouring change in amino acid configuration within the PBR indicate evidence for positive selection of Mhc diversity. These findings have important implications for our understanding of Mhc-DRB evolution in primates, but new research by De Groot et al. (2004) suggests that not all sequences may be expressed. De Groot and co-workers demonstrated that despite the assumption of their functionality, certain rhesus macaque sequences (i.e. -DRB1*0309, -DRB3*0403 and -W307) were not transcribed, rendering them non-functional.

Klein’s (1987) trans-species inheritance hypothesis argues that many primate Mhc-DRB sequences share sequence similarities due to common ancestry. Thus, a comprehensive understanding of the primate MHC can only be gained by studying related species in depth. Based upon extensive work on chimpanzees (Pan troglodytes) and rhesus macaques, it is increasingly obvious that there are major differences in the frequency and variability of MHC genes among primates. According to Bontrop et al. (1999), the -DRB3, -DRB4, -DRB5 and -DRB6 loci, as well as the -DRB1*03, -DRB1*04, -DRB1*07 and -DRB1*10 lineages, are shared by humans, common chimpanzees and rhesus macaques. Some OWM and ape Mhc-DRB regions contain loci with no human equivalent (e.g. -DRB*W loci). Thus, some loci, or even lineages, may have been lost, while others may have arisen through duplication after divergence from a common ancestor. Many of these loci might be unique to some species. In the present study, two sequences are unique to mandrills (-DRB*W5101 and -DRB*W5201).

Knapp et al. (1997) identified 34 different sequences from 12 unrelated rhesus macaques. Similarly, 33 mandrill DRB sequences were identified from only 11 unrelated and two related individuals. The captive-born individuals shared five sequences with four of the wild-born individuals. Shared sequences between the two groups, as well as among the unrelated wild-born mandrills, indicate that some sequences may have high frequencies in mandrills. Analyses of the Masp-DRB sequences revealed distinct patterns of amino acid sequences (i.e. motifs) shared between humans, apes or monkeys. While shared motifs can be identified in many sequences, only those found at the amino end of the beta-1 domain are used to differentiate between loci and lineages (Gyllensten et al. 1991; O’hUigin 1995). For example, the amino acid motif of EYST occurs at positions 9–12 on -DRB1*03 alleles in humans, apes, Old and New World monkeys (Gyllensten et al. 1991; Klein and O’hUigin 1994; Bontrop et al. 1999). The presence of this motif in many different primates suggests descent either from a single ancestral allele or through convergent evolution in the case of New World monkeys (Antunes et al. 1998). In the mandrill, five sequences share the -DRB1*03 EYST motif, encoded by the same nucleotide sequence (GAG TAC TCT ACA), with the human and rhesus macaque. As no more than two -DRB1*03 sequences are found in any single individual, it is possible that these five sequences are allelic. However, this concept should be interpreted with caution, as reports have shown that macaques could have two -DRB1*03 sequences on a single haplotype (Doxiadis et al. 2003; Penedo et al. 2005). Further study of related individuals should provide important information about locus identity.

All other mandrill sequences belong to other -DRB1 lineages and -DRB3, 5 and 6 and -DRB*W loci. Although the -DRB1*01/10 and -DRB1*07 lineages are found in both humans and rhesus macaques, mandrills seem to lack these sequences. A study on cynomolgus macaques also reported the presence of the -DRB1*07 lineage (Leuchte et al. 2004). The mandrills in the present study also lack -DRB4 sequences. Despite the large number of sequences identified in the present study, representatives from missing lineages may be uncovered through further study of additional individuals.

While official sequence names are given on the basis of nucleotides found in the first part of exon 2, phylogenetic trees highlight difficulties in the nomenclature process. Three examples are apparent in Fig. 5. First, Masp-DRB*W5101 clusters with Mamu-DRB*W2101. Despite their similarities (Fig. 8), distinct nucleotides found in Mamu-DRB*W21 alleles are not present in the Masp-DRB*W5101 sequence. Second, the Masp-DRB1*0405 sequence did not cluster with other Mhc-DRB1*04 sequences, but instead grouped with a Mamu-DRB4 sequence. Examination of the first 20 amino acids of Masp-DRB1*0405 demonstrates the similarities with other -DRB1*04 sequences, but other regions of the sequence share motif similarities with Mamu-DRB4*0101. These motifs may have arisen by positive selection or by other processes such as gene conversion or recombination. Lastly, the Masp-DRB1*0404 sequence clusters with Mhc-DRB*W25 sequences and most closely with the Masp-DRB*W2504 sequence. In Fig. 8, these two sequences are compared with two similar Mamu sequences and their similarity to Mhc-DRB1*04 sequences is apparent. Interestingly, phylogenetic trees constructed with only non-PBR residues do not improve resolution of the Masp-DRB1*04 allelic lineage (data not shown). As mentioned previously, sequences are given locus and lineage designations based on their similarity to HLA sequences. The observations made in the present study suggest that an MHC nomenclature system based on the HLA (humans being a relatively recent primate species) may complicate naming of MHC sequences in non-human species, and that lineage identity does not imply similarity across an entire sequence.

Fig. 8

Selected Masp-DRB amino acid sequences and other OW locus/lineage representatives and phylogenetic clustered sequences. Bold typeface indicates variable residues for each species. Boxes enclose areas of similarities between sequences

The mandrill sequences with deletions and premature stop codons are also interesting. Two mandrill sequences contain nucleotide deletions. Masp-DRB6*0402 has a one-nucleotide deletion at position 75 and Masp-DRB*W2601 has a three-nucleotide deletion at positions 77–78. Another sequence, Masp-DRB6*0404, does not contain a deletion but has a premature stop codon at position 48. Although deletions and premature stop codons are characteristic of pseudogenes, the nucleotide sequence of Masp-DRB*W2601 does not suggest it should be grouped with other pseudogenes. According to Paz-Artal et al. (1996), three-nucleotide deletions in -DRB6 alleles typically occur at position 71 in rhesus macaques (Mamu-DRB6*0112) and at positions 77–78 in orangutans (Pongo pygmaeus: Popy-DRB6*0201 and *0202). Two-nucleotide deletions were reported for three chimpanzee -DRB6 alleles (Patr-DRB6*0106, *0107 and *0108) at position 77. Unlike the deletions reported by Paz-Artal et al. (1996), the deletion in Masp-DRB6*0402 is a single nucleotide at position 75. Although its nucleotide sequence is not similar to those of the orangutan pseudogenes, Masp-DRB*W2601 possesses a three-nucleotide deletion at positions 77–78, as they do. These findings suggest the possibility of trans-species inheritance of pseudogenes from the common ancestor of orangutans and mandrills.

According to Hughes and Nei (1989), non-synonymous substitutions, or changes that alter encoded amino acids, within the PBR should be more frequent than synonymous (non-altering) changes in functioning Mhc-DRB alleles. Concomitantly, a greater rate of non-synonymous substitution would be indicative of positive selection. Our analyses of nucleotide substitution in the mandrill sequences, based on interlocus comparisons, support Hughes and Nei’s model (Table 5). As expected, the Mhc-DRB PBR contained significantly more non-synonymous substitutions than synonymous substitutions (p<0.0001). Therefore, as in most species, positive selection may be a major force in maintaining sequence diversity in mandrill DRB genes. Future studies on DRB inheritance in mandrills will offer important further information about allele assignment that can be used for intralocus substitution analyses. In addition, studies of these loci using mRNA should clarify which sequences remain functional in the mandrill -DRB repertoire.

Considerably less is known about the MHC of OWM other than rhesus macaques. Representatives of the -DRB1, -DRB4, -DRB5, -DRB6 and -DRB*W loci have been described for some, but not all, OWM. Taken together, Mhc-DRB sequences of OWMs suggest trans-species inheritance of only some alleles. The present study of mandrill Mhc-DRB suggests that, like the rhesus macaque, other OWMs may have high levels of allelic variation with duplication, gene conversion and recombination, providing opportunities for new alleles to arise. Further study will shed light on structural similarities with other primates and clarify haplotype diversity in the mandrill.