Introduction

Invasive species frequently experience bottlenecks and increased genetic drift after they colonize a new location, as a consequence of low numbers of founding individuals (e.g. Blackburn et al. 2015; White and Perkins 2012). An introduced population is composed of a sub-sample of genotypes, originating from one or several populations in the area of origin, and some loss of genetic diversity is expected (reviewed by Dlugosch et al. 2015). Usually, reduced genetic diversity co-varies negatively with fitness traits such as disease resistance. Recent studies suggest that the success of invasive species depends both on their ability to respond to natural selection and their broad physiological or ecological plasticity. Studying the adaptive potential of genetic variants in invasive species is centrally important for recognizing the impacts of microevolutionary changes on population growth and spread of bottlenecked species. It generates a more complete understanding of the role of genetic variation for the process of invasion (reviewed by Dlugosch et al. 2015).

Wild European rabbits (Oryctolagus cuniculus cuniculus) were imported and released at many locations in Australia and although most failed to establish some stable populations did so in the mid-nineteenth century (Peacock and Abbott 2013). The primary source population at Barwon Park (Victoria, Australia) comprised only 13 individuals but rapidly increased and spread over most of the continent due to their high reproductive rate and the destruction of their predators (Williams et al. 1995). Rabbits quickly became a pest in Australia causing devastating ecological and economic damage (Cooke et al. 2013; Cooke 2012). Two different viral biocontrol agents were subsequently used to substantially reduce rabbit numbers in Australia. The Myxoma virus was spread in the 1950s followed some 40 years later (after resistance had accumulated), by rabbit haemorrhagic disease virus (RHDV) (Cooke 2014; Kerr 2012; Mutze et al. 2015; Saunders et al. 2010). After the initial spread of RHDV in 1995, RHD epidemics have been reported to reoccur on a yearly basis (Wells et al. 2015).

RHDV is a small single-stranded (+) RNA virus that has rapidly evolved in Australian rabbits and positive selection has been identified on some positions of its capsid protein (Eden et al. 2015; Kovaliski et al. 2013). Mortality rates, at least initially, were often around 95%, both in domestic and wild rabbits (Cooke and Berman 2000; Mutze et al. 1998). However, experimental challenges of domestic and wild rabbits with RHDV ten years after the initial spread confirmed that wild rabbits were evolving genetic resistance. This was especially the case in populations that experienced stronger selection pressure through RHDV as compared to populations that experienced weaker selection for RHDV resistance due to cross-protection through the non-pathogenic calicivirus, RCV-A1 (Elsworth et al. 2012).

Surprisingly, allozyme and microsatellite analyses have shown that despite strong population bottlenecks caused by founder effects or high mortality rates during epizootics, wild rabbit populations both in Europe and Australia do not seem to exhibit reduced neutral genetic variation (Queney et al. 2000; Zenger et al. 2003). Both the high reproductive rate of surviving rabbits and their rapid expansion in Australia may have significantly buffered the effects of the bottlenecks and subsequent genetic drift. Adaptive immune genes, in contrast to neutral genetic variation, are not only affected by the stochastic processes of genetic drift but underlie selective pressures as well (Piertney and Oliver 2006; Sommer 2005a). Among the best studied genetic adaptive markers in natural populations are the immune genes of the major histocompatibility complex (MHC). Classical MHC genes code for antigen-presenting cell-surface glycoproteins that have a central role in the adaptive immune system as they bind and present antigens to T cells which then initiate further immune responses. MHC class II genes are expressed only on specific antigen presenting cell types such as macrophages or B cells. They specialize in binding antigens from extracellular pathogens. In contrast, MHC class I molecules present intracellularly derived peptides coming from both own proteins and non-self peptides such as viral proteins (Klein 1986). In most vertebrate species, the MHC is highly polymorphic with both a high number of gene duplications and alleles per locus. This extraordinarily high degree of variability is thought to be maintained by balancing selection through parasites and pathogens and/or sexual selection. Different selection mechanisms have been suggested in nature with the main ones being ‘heterozygote advantage’ and ‘frequency-dependent selection’ (reviews by Sommer 2005b; Spurgin and Richardson 2010), both of which are not mutually exclusive. The ‘heterozygote advantage’ assumes that heterozygous individuals have a more variable repertoire of MHC molecules that allows them to recognize and present a wider array of different antigens, which in turn leads to better general immune protection (Doherty and Zinkernagel 1975; Hughes and Nei 1988). ‘Frequency-dependent selection’ assumes that specific MHC alleles may confer a better protection from common pathogens at low frequencies, but when these protective MHC alleles become more frequent in the host population, only non-recognized pathogens will survive and subsequently increase in numbers. Then another low-frequency MHC allele will provide protection against this new (adapted) pathogen. Over time, this would also result in high MHC variability within the host population (Takahata and Nei 1990).

Although pathogens and parasites in general are thought to be drivers of selection for MHC diversity, both selective sweeps by infectious diseases and bottlenecks can substantially reduce MHC variability and may lead to either fixation of selectively favored alleles or to random depletion of genetic diversity due to genetic drift. The latter has, for example, been observed in the Tasmanian devil (Cheng et al. 2012). There are several examples showing that small, isolated populations that have suffered severe bottlenecks have a limited MHC variability, which, in turn, may affect susceptibility to pathogens and parasites (Radwan et al. 2010), although low MHC variability does not always have such an effect (Babik et al. 2005; Ellegren et al. 1993; Weber et al. 2004). A few studies on bottlenecked populations (as evidenced by their low variability at neutral markers) found a relatively high variability at MHC genes, highlighting the fact that (depending on pathogen pressure and population size) selection may have a stronger impact on variability than genetic drift (Aguilar et al. 2004; Hambuch and Lacey 2002; van Oosterhout et al. 2006) but see (Ejsmond and Radwan 2011; Sutton et al. 2011).

Despite considerable interest in rabbit genetics (e.g. Rabbit Genome Project of the Broad Institute, Carneiro et al. 2014) no study has investigated MHC class I variability and only limited data are available on MHC class II variability in wild rabbit populations. Surridge and colleagues (2008) identified 14 rabbit MHC class II DQA alleles in single individuals from six different domestic rabbit breeds and in twelve wild rabbits from England (N = 8) and Portugal (N = 4), but the number and distance of sampling locations were not specified. Magalhães et al. (2015) studied the MHC class II DQA locus in three wild rabbit populations from France, Spain and Portugal, respectively, and found 9–13 different alleles per population (in N = 10 individuals/population). Another study, on the MHC class II DRB locus (Oppelt et al. 2010), reported extremely low allelic variability: only three alleles and six genotypes were detected in 113 European wild-type rabbits in an enclosed population. This population had experienced two bottlenecks due to myxomatosis in one year and high winter mortality in another. Because viruses mainly elicit a response of MHC class I receptors (Klein 1986; but mind exceptions, see Lin et al. 2008), high viral exposure and prevalence is expected to direct selection pressure more towards MHC class I than class II receptors. Therefore, information on MHC class I variability is essential to learn more about the adaptive potential to (biocontrol) viruses of invasive rabbits in Australia.

The aim of this study was to assess the MHC class I variability in rabbits in Australia that had experienced strong historic bottlenecks during colonization and biocontrol epizootics, and which still is under strong RHD-driven selection. For that purpose we developed a next-generation amplicon sequencing approach for the Illumina MiSeq platform. We investigated whether the MHC repertoire is associated with surviving RHDV, and if so, whether ‘heterozygote advantage’ or ‘frequency dependent selection’ is the underlying selective mechanism. Understanding the degree to which balancing selection can oppose genetic drift by maintaining high MHC diversity despite past bottlenecks is a central question in evolutionary genetics and invasion ecology. Deciphering the coevolutionary arms race between MHC class I diversity in association with RHD resistance in invasive rabbits will be of applied importance for the success of currently planned new RHDV variant biocontrol releases in Australia aimed at protecting primary production and the endemic biodiversity.

Materials and methods

Study site and sample collection

Rabbit samples were collected from a study site on Turretfield Research Centre (34°33′S, 138°50′E), 50 km north of Adelaide, South Australia. This rabbit population has been under surveillance since 1996, and during every RHD outbreak season (usually around August–September) extensive searches for dead rabbits are conducted. Details of the study site are described in Peacock and Sinclair (2009). From all fresh rabbit carcasses, liver and/or spleen tissue samples were taken and frozen for later analyses. In cases in which no liver or spleen samples were available (due to scavenging or an advanced state of decay), bone marrow was collected. When no new dead rabbits had been found for approximately three weeks, the outbreak was assumed to have ended and the search was terminated.

For comparison, we also collected a tissue sample from visually healthy rabbits in 2010 during capture-mark-recapture field trips conducted by NRM Biosecurity South Australia at the same study site. Rabbits were live-trapped over one week every other month using wire treadle cage traps (23 cm × 23 cm × 60 cm) baited with carrots. After a blood/ear tissue sample had been taken the rabbits were released into the same burrow from which they had been caught (for a detailed description of the rabbit trapping procedure and the data sampling methods see Peacock and Sinclair (2009). Animal ethics approval for the work with live rabbits and tissue export permits for analysis in Germany had been granted (permits: CEAE 09-06, PIRSA AEC 09/03). No ethics approval is required in Australia for the collection of tissue from wild animals that have died of natural causes (here RHD).

Confirmation of RHDV-infection status

To verify that a rabbit had died of RHD, the presence of viral RNA in the carcasses was confirmed by a diagnostic PCR as reported earlier (Schwensow et al. 2014). Briefly, total RNA was extracted from all carcasses using QIAzol lysis reagent (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. First-strand RHDV cDNA libraries were prepared using up to 5 µg of total RNA, Oligo-dT18 primers and the Revert AidTM H Minus First Strand cDNA Synthesis Kit (Fermentas) as per the manufacturer’s protocol and used as template in the diagnostic PCR. Forty rabbits with confirmed RHD-infection constituted the test group of ‘RHDV succumbed rabbits’ used in the present study.

Furthermore, the blood samples from all captured individuals were tested for presence of antibodies against RHDV using a competition ELISA and three isotype ELISAs (for details see Capucci et al. 1991, 1997; Cooke et al. 2000). To identify rabbits that were true survivors of an RHDV infection, i.e. individuals that became infected but survived RHDV after having lost their juvenile protection (conferred through both maternal antibodies and/or innate juvenile immunity), we used information from the capture-mark-recapture study and the serological tests. We assigned individuals to the group ‘RHD-survivors’ if they met one of the following criteria: (1) Rabbits that carried maternal antibodies when first caught as young kittens but were seronegative (and older than 12 weeks, i.e. >800 g) when caught in the next trapping session (8 weeks later) yet carried antibodies against RHDV when caught in a third trapping session (after an RHD outbreak); (2) Rabbits caught for the first time when >800 g and carrying no antibodies but caught again later in life having seroconverted; (3) Rabbits being antibody seropositive and having necrotic ears, indicating survival from severe RHD infection; (4) Rabbits >1000 g that showed IgM (indicating recent infection) or high IgA titres (which do not occur when infection occurs in young rabbits); and (5) Seropositive rabbits caught for the first time at an age of >4–5 months, but born well after a previous RHD outbreak and with an age of at least 12 weeks at the onset of the next outbreak. Following these criteria we identified 42 RHDV-survivors which constituted the second test group used in this study.

Microsatellite genotyping

Genomic rabbit DNA was extracted using the Invisorb Spin Tissue Mini Kit (Invitek) following the manufacturer’s protocol. For samples with decayed tissue material we used standard phenol–chloroform extractions (Sambrook et al. 1989) to ensure sufficient DNA quantity and quality. Microsatellite genotyping of the rabbit DNA samples was carried out at nine loci: Sat2, Sat4, Sat5, Sat8, Sat12, Sat13 (Mougel et al. 1997), Sol44 (Surridge et al. 1997), OCELAMB (van Haeringen et al. 1996), and D7UTR5 (Korstanje et al. 2003). In our microsatellite protocol we followed the maximum likelihood method described by Miller et al. (2002). We generally extracted the DNA in two independent replicates, and analysed them in parallel. If the results differed, we repeated the whole process. If the replicates were identical in the repetition, the genotype was accepted. If not, the sample was excluded. We also ran negative controls in all steps. Additionally, we generally added the same control sample (the sample had also been sequenced at all loci) to all runs. Thus, slight variations between runs could be detected and were controlled for. Based on a prior assessment of allele size distribution, primers for loci were combined in three multiplex mixes (I: Sat4, Sat12, Sat16; II: OCELAMB, Sat5, Sat8; III: Sat2, Sat13, Sol44, D7UTR5). One primer of each pair was 5′-labelled with a fluorescent dye (either HEX [OCELAMB, Sat5, Sat13, D7UTR5] or 6-FAM). For mixes I and III, PCRs were conducted in 10 µl volumes containing 5 µl 2× MP-mix (Qiagen, Hilden, Germany), 1 µl primer mix (0.6 pmol each), 1 µl DNA (50–100 ng), and 3 µl H2O. For mix-II, primer concentration was doubled for OCELAMB and the 3 µl H2O were substituted by 1 µl H2O and 2 µl 5× Q-solution (Qiagen). Cycling conditions for all mixes were: 95 °C 15 min, 30× (95 °C 30 s, 54 °C 90 s, 72 °C 60 s), 60 °C 30 min. Amplification products were separated on a Pop-7 matrix (Life Technologies, Darmstadt, Germany) using an ABI A3130xl automated sequencer and sized with a Rox-labelled Genescan ®-500 length marker. Fragment length analysis was carried out using Genescan ®Analysis software v.2.0 (all Life Technologies). We used standardized multilocus heterozygosity (MLH) as a measure of neutral heterozygosity. We calculated MLH as the proportion of heterozygous typed loci divided by the mean heterozygosity of typed loci (Coltman et al. 1999).

MHC genotyping design and Illumina library construction

MHC class I exon 3 amplification was performed using the primers OryExon3af 5′-ATGTWYGGCTGCGAGGTC-3′ and OryExon3ar 5′-CCTTCCCCATCTCCAGGTAT-3′ which target a 200 bp fragment excluding primer binding sites. PCR was performed in 25 µl volumes using 2.5 µl 10× buffer including 18 mM MgCl2 (1x final buffer concentration) 0.8 mM dNTP, 10 pmol of each primer, 1 µl BSA, 0.5 units FastStart High Fidelity taq (Sigma Aldrich) and approximately 25–100 ng DNA template. Illumina (amplicon) libraries were generated in two PCR steps. The first step amplified the MHC fragment in 25 cycles using both forward and reverse specific primers containing a barcode (4–8 bp) and the 3′ final 33 bp of one of the Illumina sequencing adapters at the 5′ end. The second PCR consisted of ten cycles with primers binding to the 33 bp of the barcode adapter and adding a sequence complementary to the P5 and P7 Illumina flowcell oligos.

To achieve optimal cluster density and recognition for amplicon libraries on the Illumina platform we used the adapters to increase the diversity of the library (within the first nucleotides). First, in-line barcodes were designed to yield a balance among the 4 nucleotides in the first 4 cycles (i.e. each nucleotide would roughly have a 25% frequency), which optimizes cluster recognition. Second, in order to increase the diversity throughout each cycle, barcodes were designed with different lengths (4–8 bp) and 39% of the amplicons were sequenced in the opposite direction by attaching the reverse (instead of the forward) MHC primer to the P5 Illumina sequencing adaptor (and mixing them equimolarly with the other batch). We designed 10 reverse and 8 forward primers associated to adapter P7 and 10 forward and 8 reverse primers combined to adapter P5, giving a total of 164 primer combinations (i.e. 10 × 10 plus 8 × 8 combinations). Each individual rabbit library was independently generated twice (using the same DNA extraction) with a different adapter combination, and in 15 individuals we sequenced the amplicons from both directions (i.e. with the forward and reverse MHC primer attached to the P5 sequencing adapter).

MHC genotyping bioinformatics analysis

Demultiplexing of samples based on the in-line barcodes was performed using Flexbar (Dodt et al. 2012). Flexbar allows for dual in-line barcoding and barcodes of different lengths. Before merging the corresponding forward and reverse reads, we assessed the read quality via FastQC quality plots (Andrews 2010). We trimmed the 3′ end of reads at the first position where the phred quality score was below 30 for the lower first quartile (positions 176 for R1 and 100 for R2, leaving an overlap of 70 bases), merged them using FastqJoin (Aronesty 2011) and removed combined reads that had more than 5% of their bases below a 30 phred quality score. Virtually no reads showed unexpected sizes and the merged high quality reads finally analyzed had a length of 200 bp. For the downstream analysis we followed a published pipeline (Sommer et al. 2013) with two modifications to accommodate the Illumina platform (Suppl. Material S1). Reads were carefully checked for artefacts and MHC genotyping of each sample was based only on alleles that had been identified in both replicates (refer to Suppl. Material S1 for further details).

MHC class I characteristics of Australian rabbits and testing for associations with surviving/dying of RHD

We tested for recombination using a) GARD (Kosakovsky Pond et al. 2006) implemented in the HyPhy package (Pond et al. 2005) on the Datamonkey webserver (Delport et al. 2010) and b) the methods implemented in RDP4 (Martin et al. 2015). We used CODEML of the PAMLX package (Xu and Yang 2013; Yang 2007) to test for sites under positive selection in our MHC sequences. Using a codon-based approach we ran four models: 1) Model M1a (neutral) which assumes two site classes, with 0 < ω0 < 1 and ω1 > 1 respectively: 2) Model M2a (selection) which adds a third site class to M1a, with ω2 > 1; 3) Model M7 (β) with beta distribution approximating d N/d S variation, and 4) Model M8 (β and ω) with a proportion of sites evolving with d N/d S > 1. M1a versus M2a and M7 versus M8 were used to form two pairs of likelihood ratio tests (LRTs) for which the twice log likelihood difference (2∆In L) was compared to the x 2 distribution to determine the better-fitting models. If the selection models (M2a and M8) were significantly better than the neutral models (M1a and M7), we used Bayesian statistics (Bayes empirical Bayes, BEB) integrated in CODEML to identify codon sites under selection (Yang et al. 2005). Additionally we tested for evidence of positive selection by using the methods SLAC (Single-Likelihood Ancestor Counting), FEL (fixed-effect likelihood), REL (Random-Effect Likelihood), FUBAR (Fast Unconstrained Bayesian AppRoximation, Murrell et al. 2013) and MEME (mixed effects model of evolution, Murrell et al. 2012) all of which are implemented in the DATAMONKEY Web server (Delport et al. 2010; Kosakovsky Pond and Frost 2005). For these we used the implemented automatic model selection tool to identify the best substitution model for our dataset and used otherwise the default settings. These methods are widely used (e.g. Lemos de Matos et al. 2014; Pinheiro et al. 2014). To visualize alternative evolutionary paths among non-recombining allele sequences we constructed a median-joining (MJ) network using the software package Network v.5.0.0.0 (Bandelt et al. 1999). The implemented algorithm searches for all equally most-parsimonious connections among haplotypes which can either be found directly or by sequential addition of so-called median vectors. These connect sampled sequences and can be envisioned as sequences either not sampled or not existing anymore in the population (Bandelt et al. 1999). The exact tests of sample differentiation were performed using Arlequin 3.5.2.2. (Excoffier and Lischer 2010). We used DnaSP v5 (Librado and Rozas 2009) to calculate the nucleotide diversity (π) and the average number of nucleotide differences (k). We used BLAST (Altschul et al. 1990) and the European Nucleotide Archive (ENA, http://www.ebi.ac.uk/ena) we aligned a consensus of our sequences to the reference rabbit genome (OryCun2.0). For each BLAST hit we retrieved the annotation and genomic position and compared the reference sequence with each of our alleles.

We then grouped the MHC alleles into MHC supertypes based on functional similarity as suggested (Doytchinova and Flower 2005), a method which is widely used in vaccine design in human medical studies, as well as in evolutionary genetic analyses in wild species (e.g. Lillie et al. 2015; Schwensow et al. 2007; Sepil et al. 2013). For this, each variable amino acid position was transcribed into five physicochemical descriptor variables: z1 (hydrophobicity), z2 (steric bulk), z3 (polarity), z4 and z5 (electronic effects) (Sandberg et al. 1998) and hierarchical clustering with Euclidean distance was applied using SPSS 2.0.0.1 (IBM, New York, USA).

We constructed generalized linear models (glms) with a binomial dependent variable (‘survival’) and logit link-function in the R software environment for statistical and graphical computing (R Development Core Team 2013) and the packages MuMIn (Barton 2015) and AICcmodavg (Mazerolle 2016) to test for associations of rabbit MHC genotypes with RHDV survival (N = 82). In all models, we included MLH as a measure for neutral genetic variability. As the group of rabbits that had died of RHD comprised individuals collected in different years we tested whether the sampling year (as a categorical variable) is likely to affect the number of MHC alleles/supertypes or the frequency of a MHC supertype. As we did not find any evidence that the sampling year affected these parameters we pooled all rabbits that had died from RHD to form our ‘susceptible’ group. Full model selection and multimodel inference (Burnham and Anderson 2002) was conducted in order to determine a minimum adequate model set and variable influence.

To investigate whether ‘heterozogosity advantage’ is the more likely selection mechanism in rabbit MHC – RHDV co-evolutionary processes in Australia we used the individual number of MHC alleles (on the amino acid level) and the number of MHC supertypes present in an individual as an approximation of MHC heterozygosity. We tested for an effect of MHC diversity on survival by analyzing the mean number of amino acid differences between all MHC alleles of an individual.

To test for evidence for ‘frequency-dependent selection’ we tested the importance of specific MHC supertypes for surviving or dying (0 = dead, 1 = alive) from RHD. Using multi-model inference we identified any influential MHC supertypes as a first step (Table 2) and subsequently fitted a model that included the most influential MHC supertype(s). We also tested whether the number of rare alleles (present in <10% of the samples) or the number of very common alleles (present in >50% of the samples) influenced survival or death.

Results

MHC class I diversity pattern, selection and supertypes

We obtained 11,816,930 reads of which 8,033,088 passed all our quality filters. Each individual was genotyped with an ultra-deep coverage (mean = 48,982 ± 24,384 SD reads per replicate). We identified 21 nucleotide sequences without stop codons, indels or reading-frame shifts. A BLAST search identified for all sequences hits with genes (or sequences) within the rabbit MHC class I region on chromosome 12, however, a locus assignment could not be performed with certainty. Despite their multi-locus origin we called them MHC-class I alleles for simplicity. In our study population, we detected 19 MHC alleles with different amino acid compositions which we named MHC class I Orcu*01 to Orcu-*19 (deposited in GenBank, accession numbers KU848243–KU848263) based on the nomenclature suggested by (Klein et al. 1993). A consensus of our sequences mapped to six regions on chromosome 12 of the rabbit genome (OryCun 2.0). One of the BLAST hits was mapped to a transcript with five suggested locations. The other five hits had clear genome associated coordinates. Four of them were classified as “class I histocompatibility antigen alpha chain precursor MHC class I antigen” in their Ensembl family descriptor. The remaining one was described as “E3 ubiquitin ligase TRIM39”. The comparison to the rabbit cDNA/vertebrate cDNA ENA database showed that all of our sequences represented analogies to MHC class I genes. Three of our alleles, Orcu*10, Orcu*11 and Orcu*12, aligned to 100% to the RLA annotated genes ENSOCUG00000011016, EU427428.1 and ENSOCUG00000010972.

Three nucleotide sequences (Orcu*01a, Orcu*01b and Orcu*01c) translated into the same amino acid sequence (Suppl. Figure 1). We identified 65 segregating nucleotide sites. The pairwise nucleotide distance between the MHC alleles ranged from 1 to 37 (mean = 19 ± 8.35 SD). The nucleotide diversity was π = 0.097 and the average number of nucleotide difference was k = 19.462. The pairwise amino acid allele distance ranged from 0 to 21 (mean = 10.69 ± 4.28 SD). No evidence for recombination between the identified MHC alleles was detected. The MJ-network had a high degree of resolution with few homoplasies present caused only by median vectors (Fig. 1). In each individual rabbit we identified between three and eight different nucleotide sequences (mean = 5.63 ± 1.27 SD), and between three and seven different sequences on the amino acid level (mean = 5.51 ± 1.20 SD) meaning that at least four different MHC loci were present. The allele frequencies at the amino acid level ranged from 0.01 to 1.00 (Fig. 2).

Fig. 1
figure 1

Median-joining network of 21 rabbit (Oryctolagus cuniculus) MHC allele sequences (200 bp). Grey dots represent MHC alleles. Length of lines between alleles corresponds to number of nucleotide differences between neighboring alleles in the network. For mutational steps >1, the numbers are given in grey at the lines. Black dots represent median-joining vectors introduced by the network constructing algorithm to meet the parsimony criteria. Orcu, Oryctolagus cuniculus

Fig. 2
figure 2

MHC supertypes (ST). The identified MHC class I alleles were grouped into six different ST based on their physiochemical characteristics. The frequency of each amino acid allele in the dataset (N = 82 individuals) is given in brackets. Orcu*01 was present in all individuals and encoded by three different nucleotide alleles. MHC supertype 6 encompassed only allele Orcu*18 which strongly differed from the remaining alleles

Using the CODEML software package, the two models that allowed for positive selection (M2a, M8) fitted our data significantly better than the ones that did not. The M2a and M8 identified four and five sites under selection, respectively (Table 1), and the identified positions were consistent between the models. There was also a high overlap with positively selected sites identified by the models MEME, SLAC and FUBAR (see Suppl. Figure 1 for details). A comparison with the corresponding human sequence (Suppl. Figure 1) showed that all of the putative positively selected sites corresponded to antigen binding sites in humans.

Table 1 Evidence for positive selection on MHC class I alleles in European rabbits introduced to Australia

The MHC amino acid alleles were grouped into six distinctive groups (Euclidean distance ≥ 10, Fig. 2) according to physicochemical similarities affecting antigen-binding, hereafter referred to as ‘MHC supertypes’ (ST). ST 6 encompassed a single allele only (Orcu*18), which differed strongly from and appeared to be basal to all other alleles (Figs. 1, 2). Each individual rabbit possessed MHC alleles belonging to 3–5 different ST.

Association of neutral diversity and MHC constitution with surviving or dying from RHD

On average, we found 3.667 (±SD 0.866, range 2–5) alleles per locus. No difference between the mean observed heterozygosity (Hobs = 0.441 ± SD 0.204) and the mean expected heterozygosity (Hexp = 0.504 ± SD 0.219) was detected. No locus showed evidence for deviance from Hardy–Weinberg equilibrium (all p > 0.05). Rabbits that ‘succumbed to RHDV’ and ‘RHDV survivors’ did not differ in their MLH (Exact test of sample differentiation, exact p value = 1, SE = 0.000) nor their MHC allele frequencies (Exact test of sample differentiation, exact p value = 0.847, SE = 0.013). All MHC ST were equally present in both test groups (ST1: dying: 88%, surviving 88%, ST2 and 3: dying: 98%, surviving: 98%, ST5: dying: 28%, surviving 29%. ST6: dying: 13%, surviving: 12%), except for ST 4 (dying: 68%, surviving: 43%).

Due to inter-correlations among the measures of MHC variability, we fitted seven a priori models to explain a potential association of the MHC class I allele composition and surviving or dying from RHD. Each of the models included MLH as measure for neutral variability and a single variable for the MHC constitution. To test whether the presence of a MHC ST influenced survival or death we used multimodel inference and found that the presence of MHC ST4 was retained in the top six models (Table 2). Therefore, only the presence of MHC ST4 was included in our seven a priori models (Table 3). Of those, model 1 which included MLH and ‘presence of ST4’ fitted our data best and obtained the lowest AICc. The presence of ST4 was associated with dying from RHD (Table 3) with an odds ratio = 2.77.

Table 2 Model selection summaries for rabbit survival (or death) after RHDV infection explained by neutral diversity (microsat MLH) and the presence of specific MHC supertypes (ST)
Table 3 The a priori model set for testing the influence of non-independent genetic parameters on survival/death after RHDV infection

Discussion

Invasive species experience genetic bottlenecks during the invasion process and their ability to retain or increase genetic variability may be crucial for their capability to successfully overcome evolutionary challenges such as maladaptation to new environments including infectious diseases. Successful invaders often seem to have this capacity and do not always show significantly decreased genetic variability compared with original native populations (Lawson Handley et al. 2011), and this is also true for the European rabbit (Queney et al. 2000; Zenger et al. 2003). While neutral genetic variation may reflect the general adaptive genetic potential of a species, investigating the variability of coding genes that are under positive selection can reveal information on the processes that maintain or increase genetic variability. The immune genes within the MHC are under pathogen driven selection. Experimental studies suggested that genetic polymorphisms within the MHC, a gene region representing only 0.1% of the genome, are major host factors influencing pathogen adaptation and virulence evolution (Kubinak et al. 2013). Thus the MHC provides excellent markers to study the mechanisms of balancing selection in an invasive, bottlenecked rabbit population as well as the evolution of increased resistance to RHDV that has been observed in experimental settings (Elsworth et al. 2012).

The neutral diversity seems lower in our study population (9 microsatellite loci, average 3.7 alleles per locus, Hobs = 0.441 ± SD 0.204, Hexp = 0.504 ± SD 0.219) than previously reported from five other Australian rabbit populations. Zenger et al. (2003) found on average 5.05 alleles per locus (at seven microsatellite loci), an observed heterozygosity Hobs = 0.66 and an expected heterozygosity Hexp = 0.67. Similar values were reported from Spain and France while the values obtained from England were slightly lower and resembled ours (Queney et al. 2000; Surridge et al. 1999, summarized and compared by Zenger et al. 2003). Although it has to be noted that different sets of microsatellites had been used between the studies the slightly reduced neutral variability in the Turretfield population may be indicative of past population bottlenecks.

We identified a relatively high MHC class I diversity with 21 different MHC class I sequences with a high nucleotide diversity and genetic distances among alleles in 82 rabbits from a single Australian rabbit population. The clear tree-like pattern of the network is indicative of MHC alleles that have been separated over evolutionarily prolonged periods; a rapid allele evolution would have caused the network to display more of a star-like structure. The fact that we found three alleles identical to the reference genome (generated from a female inbred Thorbecke rabbit) suggests that least the alleles Orcu*10, Orcu*11 and Orcu*12 predate the split between the Australian rabbits and the Thorbecke breed. It is a common feature of the MHC that alleles are maintained across evolutionary lineages by selection processes (trans-species polymorphism, Klein et al. 1998). Our MHC class I sequences originated from several loci. MHC class I-Orcu*18 was markedly different from all other alleles, and the remaining alleles clustered into two major groups, one comprising supertypes ST1 and ST2 and the other with supertypes ST3, ST4 and ST5.

To our knowledge no information on MHC class I diversity is available from other rabbit populations. Only very few studies investigated MHC class I variability in other wild vertebrate species. In 66 individuals of common frogs from seven different populations, 129 alleles were found (Teacher et al. 2009). In 108 cheetahs from across Namibia a total of nine expressed sequences (from >1 locus) were identified (Castro-Prieto et al. 2011). In 664 house sparrows from several populations, 60 sequences (from >1 locus) were identified (Claire et al. 2009). In our Australian rabbit study (N = 82), we identified 21 different sequences. Comparing the number of sequences per genotyped individual of these studies, rabbits have a relatively high MHC class I variability. A few studies reported on MHC class II variability in rabbits. In French, Spanish and Portuguese wild rabbits, 10–13 MHC class II DQA alleles were identified in 10 individuals per population (Magalhães et al. 2015). Another study found 12 DQA alleles in 12 wild rabbits (England: N = 8, Portugal: N = 4) (Surridge et al. 2008). In a fenced rabbit population, four DRB sequences from different loci were identified in 113 individuals (Oppelt et al. 2010). However, allelic variability is known to differ both between MHC gene classes and between the different loci within a class (for numbers in humans see Reche and Reinherz 2003). A comparison of allelic diversity of different MHC loci thus has to be made cautiously.

Such high functional diversity is remarkable considering that the population experienced at least three strong bottlenecks (i.e. during initial introduction and through two viral biocontrol programs). Several empirical studies suggested drift as the predominant cause of reduced MHC variation, especially in small and bottlenecked populations (summarized by Alcaide 2010). Studies using both a modelling approach and a meta-analysis predicted that the combination of selection with strong genetic drift would not only be inadequate to maintain genetic variability in a bottlenecked population but may even lead to a stronger decline in diversity in MHC genes than in selectively neutral loci (Ejsmond and Radwan 2011; Sutton et al. 2011). Thus, the question emerges as to why MHC diversity in the Australian rabbit population is relatively high?

One explanation could be that different alleles ‘survived’ or evolved in individual rabbit populations after the bottlenecks and that, on the metapopulation level, gene flow between populations maintained or restored the MHC variability. However, our study population is relatively isolated from neighboring rabbit populations and strong gene flow into our study population is not indicated by neutral marker (Schwensow et al. unpublished SNP data). Another possible answer is that evolutionary mechanisms other than mutation and subsequent selection restore functional diversity. Over long evolutionary periods (>10,000 years) gene conversion may increase MHC variability, as was demonstrated for some island birds (Spurgin et al. 2011). However, the establishment of rabbits in Australia occurred less than 200 years ago, and it has only been a few decades since the introduction of the Myxoma virus and RHDV. Moreover, we did not find evidence for recombination in our study. Our sequencing results, however, suggest gene copy number variation (CNV). The rabbits studied here had between three and eight different MHC class I alleles suggesting the presence of at least four different gene copies. This figure is comparable to the three paralogs annotated in the rabbit genome project (Alföldi et al. 2009). CNV in the MHC region has previously been detected in different domestic rabbit breeds, and a functional role in disease resistance and susceptibility has been suggested (Fontanesi et al. 2012). The adaptive role of gene duplication is becoming more recognized as it was found that selective processes work immediately on recently duplicated paralogs (Kondrashov et al. 2002; Kondrashov 2012). CNV was suggested to have replaced MHC allelic polymorphism in Rhesus macaques (Otting et al. 2005) and in the RT1-CE region in the rat (Roos and Walter 2004). Although we cannot rule out that we simply failed to amplify all loci in all individuals, we believe this to be unlikely as we checked the region targeted by our primers on the rabbit genome assembly and found that they perfectly matched the three alleles shared between Australian rabbits and the Thorbecke breed, eliminating technical errors due to primer mismatches. In our case, variability reduction by genetic drift after the bottlenecks would be counteracted by the presence of haplotypes derived from several gene copies such that high functional MHC variability still exists. Whether differences in the MHC class I expression levels influence RHDV resistance or susceptibility cannot be answered yet, however, it is at least possible that gene duplication or CNV may either maintain or even generate adaptive MHC variability in natural populations of wild rabbits. Although we could not test the expression of the sequenced MHC class I alleles, the detection of positively selected sites strongly suggests their functionality. The two models that allowed for selection or an increased rate of non-synonymous substitutions within the sequences fitted our data significantly better than the neutral models and we identified five positively selected amino acid positions that corresponded to human antigen binding sites (Bjorkman and Parham 1990; Bjorkman et al. 1987).

Only few studies have reported evidence for virus-driven selection on MHC class I in the wild. In the common frog (Rana temporaria) virus-affected populations had more similar allele frequencies than non-affected populations which was interpreted as being the result of virus-driven selection (Teacher et al. 2009). MHC class I variation in cheetahs (Actinonyx jubatus) mirrors the different selection pressures imposed by viruses in free-ranging cheetahs across Namibian rangeland also indicating virus-driven selection on MHC class I (Castro-Prieto et al. 2012). Investigating associations between MHC allele repertoires of hosts and actions of their pathogens, such as viral loads/replication or the outcome of infections, may unravel contemporary selection on MHC genes. As pointed out earlier, associations between specific pathogens and particular MHC alleles suggest frequency-dependent selection while associations with MHC heterozygosity or diversity would suggest heterozygote advantage (Spurgin and Richardson 2010). We did not find evidence for the latter as a higher number of MHC class I alleles did not influence survival from RHD. Although the high number of MHC alleles in our study population did not allow for testing allele-specific effects, grouping the alleles into presumably functionally similar MHC supertypes suggested that ST4, which was present in 55% of our rabbits, is disadvantageous with respect to surviving RHDV. This would be in line with the frequency-dependent selection hypothesis which predicts negative association of pathogens with common MHC alleles (Takahata and Nei 1990). The importance of negative frequency-dependent selection in newly arising host-pathogen interactions, such as the sudden introduction of RHDV, has also been supported by several recent studies (summarized by Burdon et al. 2013).

ST4 was built by only two alleles, MHC class I-Orcu*06 and Orcu*17. Interestingly, these two alleles differed from all other alleles at four amino acid positions. Two of these positions (positions 40 and 44, Suppl. Figure 1) are antigen binding sites in the corresponding human molecule (Bjorkman and Parham 1990; Bjorkman et al. 1987) suggesting a role in antigen binding in rabbits as well. Reasons for the relatively high frequency of ST4 in the Turretfield population (55% in total, 68% in dying rabbits, 43% in surviving rabbits) despite a negative association with surviving RHD may be that a change in the RHD virus has occurred only recently. Frequent RHDV mutations can be expected due to the very high evolutionary rate of this RNA virus (Eden et al. 2015; Kovaliski et al. 2013). A high evolutionary rate of the pathogen favors Red Queen dynamics (i.e. frequency-dependent selection) and simulations showed that when pathogens evolve quickly, this mechanism is capable of explaining both positive selection and long coalescence times of MHC alleles (Ejsmond and Radwan 2015). Alternatively, susceptible MHC class I alleles may also be maintained in a host population when they exert antagonistic effects on different pathogens, as observed for different malaria parasites in house sparrows (Loiseau et al. 2008). For example, if ST4 were protective against myxomatosis, which regularly causes disease outbreaks and mortality in our study population, maintenance of alleles Orcu*06 and Orcu*17 at relatively high prevalence would be advantageous.

Conclusion

Understanding the evolutionary mechanisms by which invasive species adapt and overcome effects of strong population bottlenecks has a central role in both invasion biology and conservation biology. Our data show that rabbits have maintained relatively high MHC diversity despite apparent bottlenecks and strong selection through viral biocontrol agents. We find evidence for positive selection in the antigen-binding sites suggesting that pathogens are evolutionarily important drivers for this diversity. An association of one MHC supertype with death after RHDV infection suggests that this virus contributes to the selection on the rabbit MHC. Gene duplication and frequency-dependent selection are possible (and likely) mechanisms that maintained MHC class I variability despite presumably high levels of genetic drift. Besides the MHC, other genes are likely to contribute to the increased genetic resistance of wild rabbits in Australia against RHD, and a genome-wide scan for further resistance loci would be an important next step to gain a better understanding of the co-evolutionary processes between RHDV and the rabbit host. This is not only important for controlling increasing rabbit numbers in Australia, it is also important for rabbit conservation in Europe.