Introduction

Studying genetic variation among individuals is a core goal of evolutionary biology and crucial to conservation genetics. Population bottlenecks and subsequent inbreeding can result in loss of genetic diversity both at the population (allelic) and individual (observed heterozygosity) levels. Loss of heterozygosity can lead to inbreeding depression via losses of heterozygote advantage and increased probability of homozygosity for deleterious recessive alleles (Allendorf and Luikart 2007). Left unmanaged, inbreeding depression has the potential to undermine population restoration goals through negative impacts on both individual and population viability (Keller and Waller 2002). Measuring (and thus mitigating) losses of heterozygosity in wild populations is frequently based on individual-level heterozygosity (multilocus heterozygosity, MLH) evaluated using neutral genetic markers such as microsatellites (Slate et al. 2004). Many studies of threatened species still rely on neutral loci, such as microsatellites, at least for initial estimates of heterozygosity (Kirk and Freeland 2011). Despite their common usage, the suitability of microsatellites to inform patterns at functional loci is debated (Väli et al. 2008; Chapman et al. 2009; Ljungqvist et al. 2010; Szulkin et al. 2010), so a growing number of studies also include diversity at functional loci when evaluating changes in heterozygosity.

Immune genes are ideal for studying functionally relevant genetic diversity as they represent the most rapidly evolving parts of the genome, due to diverse and constant selection pressure from co-evolving pathogens (Hedrick 1998; Piertney and Oliver 2006; Deakin 2012). Studies of putatively functional genetic diversity usually examine genes of the major histocompatibility complex (MHC), which encode cell-surface proteins that bind and present foreign peptides for the initiation of cell-mediated immunity (Piertney and Oliver 2006; Spurgin and Richardson 2010). Although MHC is well studied in the molecular ecology literature, there is a growing awareness that non-MHC parts of the immune system can provide useful insight into processes affecting adaptive diversity of non-model organisms (Jepson et al. 1997; Acevedo-Whitehouse and Cunningham 2006; Vinkler and Albrecht 2009; Bollmer et al. 2011). Furthermore, determining locus-specific MHC genotypes for non-model species, especially birds, can be technically challenging. The MHC is highly duplicated in many species, particularly passerine birds (Westerdahl 2007; Bollmer et al. 2010), often necessitating cloning or next-generation sequencing (Babik et al. 2009), the latter with accompanying specialised analyses (e.g. Galan et al. 2010; Meglécz et al. 2011). These technical challenges can limit our ability to determine MHC locus-specific heterozygosity for individual animals.

Recently, a family of innate-immunity genes, the toll-like receptors (TLRs), has been used to address questions of ecological relevance in wild populations (Turner et al. 2011; Grueber et al. 2012, 2013). TLRs are a family of genes encoding proteins that recognise a wide diversity of pathogens through binding of conserved pathogen-associated molecular patterns (Uematsu and Akira 2008). Through intra-cellular signalling, TLRs initiate innate and adaptive aspects of the immune response (Barreiro et al. 2009). TLRs offer several advantages over MHC for assaying immune-gene heterozygosity, one of which is that gene duplications are either rare or well-characterised (Temperley et al. 2008; Grueber et al. 2012). Individual TLR genotypes can therefore be obtained using routine Sanger sequencing of PCR amplicons (current study; Alcaide and Edwards 2011; Grueber et al. 2012).

Previous analyses of avian TLR sequence diversity found evidence that a number of codons in most TLR genes experience pervasive or episodic positive selection (Grueber et al. 2014). Where previous studies have used a phylogenetic approach, with only 1–2 samples per species, the current study focuses on within-species tests of selection, which require population-level data. Here we determine levels of TLR heterozygosity of individuals from 10 threatened New Zealand bird species (across four avian orders; “Appendix” section). We first report detailed TLR diversity statistics within these populations of New Zealand birds, and test for evidence of balancing selection on these key immunity genes. Because microsatellites are still widely used for evaluating patterns of genetic diversity in threatened populations, this study also evaluates whether microsatellite data can predict levels of TLR heterozygosity of individuals within populations. Additionally, should a relationship between microsatellites and other types of heterozygosity exist, it may be population-specific due to variation in demographic processes (Grueber et al. 2008b; Alho et al. 2009), thus our models enable us to examine the consistency of relationships between neutral and functional heterozygosity across species.

Methods

Study species and DNA samples

Background data for the study species included here, as well as details of the sites from which our samples were sourced, are provided in “Appendix” section and Fig. 1. All samples used herein had been collected for previous studies and made available for this study either as extracted DNA or as tissue samples [whole blood in ethanol or lysis buffer (Seutin et al. 1991), or feathers] (see “Appendix” section for the sources of samples used here). DNA was purified from tissue samples using a modified Chelex (Bio-Rad) extraction (Walsh et al. 1991; Casquet et al. 2012). Feather sample extractions were followed by an additional LiCl and ethanol precipitation using GenElute linearised polyacrylamide (Sigma) as a DNA carrier. Extracted DNA was stored at 4 °C (short-term, <1 year) or −20 °C (long-term).

Fig. 1 Important sites for the conservation of the species studied herein (“Appendix” section). Major cities of New Zealand are labelled with open diamonds, localities of interest with filled circles. Insets show the Northland (A), Fiordland (B) and Stewart Island (C) regions; the location of the Southern Alps (dashed area on main map) is an approximation

TLR sequencing

In birds, there are 10 known TLR genes, which bind a variety of pathogen-associated molecular patterns (Cormican et al. 2009). We used previously published primers (Alcaide and Edwards 2011; Grueber et al. 2012; Grueber and Jamieson 2013) to amplify a minimum of five TLR loci per species; all loci were amplified in at least five species (see “Results” section). For each TLR gene, we amplified an average of 850 bp targeting the extracellular leucine-rich repeat region (LRR) of each gene (Alcaide and Edwards 2011; Grueber et al. 2014). These regions were targeted because the LRR domain is associated with pathogen recognition (Werling et al. 2009), and is expected to show greatest sensitivity to pathogen-driven selection (Areal et al. 2011; Grueber et al. 2014). Amplification, clean-up and sequencing followed published protocols (Grueber and Jamieson 2013). Sanger sequencing service was provided by Macrogen Inc. (Korea) and the Genetic Analysis Service at the University of Otago (New Zealand). Sequence editing and haplotype reconstruction protocols are provided in Supplementary Methods; all SNPs (synonymous and nonsynonymous) were used in the construction of haplotypes. Haplotypes were assigned 2-digit codes and used as genotype data for calculating heterozygosity.

TLR diversity and inference of selection

Balancing selection can result in an excess of heterozygotes, so we tested our TLR data for deviation from Hardy–Weinberg equilibrium using the exact test implemented in Arlequin v3.5.1.3 (Excoffier and Lischer 2010), with 1 × 10^6 steps in the Markov chain following 1 × 10^5 burn-in iterations. Inferred haplotypes were also used to calculate polymorphism statistics using DnaSP (Librado and Rozas 2009), including number of inferred haplotypes (h), mean nucleotide differences between haplotypes (k), nucleotide diversity (π), and numbers of non-synonymous and synonymous single-nucleotide polymorphisms (SNPs).
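As an illustration of the polymorphism statistics named above, the following minimal Python sketch computes h, the number of segregating sites, k and π from a set of aligned sequences. It is not the DnaSP implementation: it assumes gap-free, equal-length sequences with one entry per sampled chromosome, takes k as the mean number of pairwise differences among sampled sequences, and uses toy sequences that are hypothetical rather than data from this study.

```python
from itertools import combinations


def polymorphism_stats(sequences):
    """Return (h, S, k, pi) for aligned sequences, one per sampled chromosome.

    h  = number of distinct haplotypes
    S  = number of segregating (variable) sites
    k  = mean number of pairwise nucleotide differences
    pi = k divided by alignment length (per-site nucleotide diversity)
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to equal length")

    h = len(set(sequences))
    # A site is segregating if its alignment column holds more than one base.
    s_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    # Average the Hamming distance over all n(n-1)/2 pairs of sequences.
    pairs = list(combinations(sequences, 2))
    total_diff = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    k = total_diff / len(pairs)
    return h, s_sites, k, k / length


# Hypothetical toy alignment (four chromosomes, eight sites).
toy = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "ACGAACGA"]
h, s_sites, k, pi = polymorphism_stats(toy)
print(h, s_sites, round(k, 3), round(pi, 4))
```

For a diploid sample, both inferred haplotypes of every individual would be passed in, so that 20 individuals contribute 40 sequences.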

We used Arlequin to evaluate summary statistics of selection for those genes that showed at least five haplotypes within a species (following Alcaide and Edwards 2011). The Ewens–Watterson test (Ewens 1972; Watterson 1978) examines whether the allele frequencies in the sample are more uniform than predicted under neutrality, interpreted as a form of balancing selection (Spurgin and Richardson 2010). Tajima’s D (Tajima 1989) can be considered analogous to the Ewens–Watterson test, but explicitly accounts for mutational events and therefore the level of divergence among alleles, rather than simply their frequencies (Garrigan and Hedrick 2003). For both tests, P values were evaluated by 1,000 permutations in Arlequin. Both of these metrics can be influenced by demographic population processes, meaning comparisons between species/populations are problematic (Nei 1987). We therefore restrict our inference to comparisons among loci within species (as they should be similarly affected by demographic processes); four species had ≥2 loci with h ≥ 5: kakariki, rock wren, robin and kokako.
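The two test statistics can be sketched as follows. This is a hedged illustration, not Arlequin's implementation: it computes only the observed statistics (the Ewens–Watterson homozygosity F and Tajima's D from the standard 1989 formulae) and omits the permutation step used to obtain P values. The function names and example inputs are hypothetical.

```python
import math
from collections import Counter


def observed_homozygosity(haplotype_calls):
    """Ewens-Watterson observed F: the sum of squared haplotype frequencies.

    Lower F means more even frequencies; the test asks whether F is lower
    than expected under neutrality for the same sample size and haplotype
    number (that comparison is not implemented here).
    """
    n = len(haplotype_calls)
    counts = Counter(haplotype_calls)
    return sum((c / n) ** 2 for c in counts.values())


def tajimas_d(n, s, k):
    """Tajima's D from n sequences, s segregating sites and mean pairwise
    difference k. The variance term is zero for n < 4, so n >= 4 is required."""
    if n < 4:
        raise ValueError("n must be at least 4")
    if s < 1:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: difference between the two estimators of theta (k and S/a1).
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical haplotype codes for eight chromosomes.
print(observed_homozygosity(["01", "01", "02", "02", "03", "03", "04", "04"]))
```

Positive D and low F both point towards an excess of intermediate-frequency variants, which is why the two statistics are treated as analogous here.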

TLRs that bind different types of pathogen-derived ligands may evolve in different ways; we thus compared levels of diversity (SNPs, h, π and k) between TLRs proposed to bind viral ligands (TLR3, TLR7, TLR21) and those proposed to bind proteinaceous ligands (TLR1LA, TLR1LB, TLR2B, TLR4, TLR5, TLR15) (Areal et al. 2011; Keestra et al. 2013). These comparisons were performed using generalised linear mixed modelling with lme4 v1.0-5 (Bates and Maechler 2009) in R v3.0.1 (R Core Team 2013). Our fixed factor was a binary predictor of “viral”/“non-viral” (1/0) and we included diversity data from each TLR locus as the response variable, fitting “species” as a random factor to account for mean differences in diversity between species. Each model used an error structure appropriate for the response variable: count data (h and number of SNPs) were fitted with a Poisson error distribution; data derived from count-based measures (k) were log transformed and fitted with a Gaussian error distribution; proportion-based data (π) were logit transformed and fitted with a Gaussian error distribution. For number of SNPs, we examined total SNPs, as well as non-synonymous and synonymous SNPs separately. Fitted values were obtained from our models using functions available in the R package arm v1.6-10 (Gelman et al. 2009); inference was based on the effect size of the slope and its associated 95 % confidence interval (1.96 × SE of the slope).
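The transformation and interval calculations described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the lme4 or arm code used for the analysis; the slope and standard error in the example are hypothetical values.

```python
import math


def logit(p):
    """Logit transform for a proportion strictly between 0 and 1 (e.g. pi)."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1 - p))


def inv_logit(x):
    """Back-transform from the logit scale to a proportion."""
    return 1 / (1 + math.exp(-x))


def wald_interval(estimate, se, z=1.96):
    """Approximate 95 % confidence interval: estimate +/- 1.96 x SE."""
    return estimate - z * se, estimate + z * se


def excludes_zero(interval):
    """True when both bounds fall on the same side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


# Hypothetical slope and SE for the viral/non-viral predictor.
ci = wald_interval(-0.5, 0.2)
print(ci, excludes_zero(ci))
```

Inference then rests on whether the interval for the "viral" slope excludes zero, rather than on a P value.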

Microsatellite data

Microsatellite data for the individuals included here were sourced from published studies and our own on-going research; in addition, we obtained microsatellite genotypes for kiwi and kokako specifically for this study. Full microsatellite genotyping protocols for kiwi and kokako are detailed in Supplementary Methods. Data from 8 to 25 microsatellite loci per species were used; heterozygosity summary statistics and references for all microsatellite data are provided in “Results” section. We note that different microsatellite loci were typed in each species; our microsatellite datasets are therefore not directly comparable between species, but can be used to evaluate relationships within species (see below).

Individual MLH was calculated from the microsatellite data using the R-package Rhh (Alho et al. 2010); these calculations were performed by analysing the microsatellite dataset for each species separately. We used the MLH metric internal relatedness (IR), as this measure is intended as a DNA-based measure of an individual’s inbreeding coefficient (Amos et al. 2001). Note that IR is a measure of homozygosity and therefore is expected to be negatively correlated with other measures of heterozygosity. For comparison, we also evaluated an alternative microsatellite MLH metric, standardised heterozygosity (SH) (Coltman et al. 1999). IR and SH were highly correlated in our dataset (r = −0.929; N = 216 individuals), and our main results from both were qualitatively similar. We therefore present only the results using IR, as this measure is commonly used in studies that measure MLH with a view to informing patterns of inbreeding (i.e. studies operating in a conservation context similar to ours).
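To make the two MLH metrics concrete, the sketch below computes IR and SH for a toy dataset in plain Python. It is not the Rhh code used in this study. It assumes the usual definitions, IR = (2H − Σf) / (2N − Σf), where H is the number of homozygous loci, N the number of typed loci and Σf the summed population frequencies of the alleles an individual carries, and SH as the proportion of heterozygous typed loci divided by the mean heterozygosity of those loci. The genotypes and locus names are hypothetical.

```python
from collections import Counter


def allele_frequencies(individuals):
    """Per-locus allele frequencies; genotypes are (allele, allele) or None."""
    counts = {}
    for genotypes in individuals:
        for locus, genotype in genotypes.items():
            if genotype is not None:
                counts.setdefault(locus, Counter()).update(genotype)
    return {
        locus: {allele: c / sum(tally.values()) for allele, c in tally.items()}
        for locus, tally in counts.items()
    }


def locus_heterozygosity(individuals):
    """Observed heterozygosity of each locus across all typed individuals."""
    het, typed = Counter(), Counter()
    for genotypes in individuals:
        for locus, genotype in genotypes.items():
            if genotype is not None:
                typed[locus] += 1
                het[locus] += genotype[0] != genotype[1]
    return {locus: het[locus] / typed[locus] for locus in typed}


def internal_relatedness(genotypes, freqs):
    """IR = (2H - sum f) / (2N - sum f); higher values mean more homozygous."""
    typed = {l: g for l, g in genotypes.items() if g is not None}
    homozygous = sum(1 for a, b in typed.values() if a == b)
    f_sum = sum(freqs[l][allele] for l, g in typed.items() for allele in g)
    return (2 * homozygous - f_sum) / (2 * len(typed) - f_sum)


def standardised_heterozygosity(genotypes, locus_het):
    """SH = proportion heterozygous / mean heterozygosity of the typed loci."""
    typed = {l: g for l, g in genotypes.items() if g is not None}
    prop_het = sum(1 for a, b in typed.values() if a != b) / len(typed)
    mean_het = sum(locus_het[l] for l in typed) / len(typed)
    return prop_het / mean_het


# Hypothetical genotypes for three birds at two loci.
birds = [
    {"L1": (1, 1), "L2": (1, 2)},
    {"L1": (1, 2), "L2": (2, 2)},
    {"L1": (1, 2), "L2": (1, 2)},
]
freqs = allele_frequencies(birds)
het = locus_heterozygosity(birds)
for bird in birds:
    print(internal_relatedness(bird, freqs), standardised_heterozygosity(bird, het))
```

In the toy data the fully heterozygous bird has the lowest IR and the highest SH, which is the negative relationship between the two metrics noted above.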

Relationship between microsatellite and TLR heterozygosities

We examined the ability of microsatellite MLH to predict TLR heterozygosity of individuals of each species using a GLMM implemented by the MCMCglmm function in the R-package MCMCglmm (Hadfield 2010). In the GLMM, the response variable was the proportion of TLR loci genotyped as heterozygous for each individual, specified in the model as the per-individual counts of heterozygous and homozygous TLR genotypes. Given that our response variable is a proportion, the model was estimated with a logit link function (specified in MCMCglmm as family = “multinomial2”). Microsatellite IR was our fixed predictor variable. To assess whether the ability of microsatellite IR to predict TLR heterozygosity was species-specific, we also fitted a random effect with 10 levels, specifying the species of each individual, with a random slope for microsatellite IR. This model allows us to estimate two processes: (1) the between-species variance associated with the slope of TLR heterozygosity on microsatellite IR, and its associated error; (2) species-specific slopes of TLR heterozygosity on microsatellite IR, and associated errors, estimated with partial pooling. The estimation of error on these values is a specific advantage of using the MCMCglmm package. Detailed MCMC specifications, convergence diagnostics and sensitivity analyses are provided in Supplementary Methods.
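The structure of the response variable and the logit link can be illustrated with a short Python sketch. It does not fit the model and is not MCMCglmm code: it only shows how per-individual heterozygous/homozygous counts are built and how a linear predictor on the logit scale maps to an expected proportion, ignoring the residual variance and random effects of the fitted GLMM. The genotypes, intercept and species deviation used here are hypothetical.

```python
import math


def het_hom_counts(tlr_genotypes):
    """Counts of heterozygous and homozygous TLR loci for one individual.

    tlr_genotypes maps locus name to a (haplotype, haplotype) pair, or to
    None when the locus was not genotyped; untyped loci are skipped.
    """
    typed = [g for g in tlr_genotypes.values() if g is not None]
    het = sum(1 for a, b in typed if a != b)
    return het, len(typed) - het


def expected_het_proportion(intercept, overall_slope, ir, species_deviation=0.0):
    """Expected proportion of heterozygous loci under a logit link.

    The species-specific slope is the overall slope plus that species'
    deviation (a random-slope term); both act on the logit scale.
    """
    eta = intercept + (overall_slope + species_deviation) * ir
    return 1 / (1 + math.exp(-eta))


# Hypothetical individual typed at three of four loci.
bird = {"TLR1LA": ("01", "02"), "TLR3": ("01", "01"), "TLR4": None, "TLR5": ("03", "01")}
print(het_hom_counts(bird))
print(expected_het_proportion(intercept=-0.5, overall_slope=-0.013, ir=0.2))
```

Supplying both counts, rather than a bare proportion, lets the model weight individuals by the number of TLR loci actually genotyped.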

Results

TLR diversity in 10 bottlenecked New Zealand species

Across all study species and genes, we obtained population-level data (total 1,225 sequences) for a total of 18,168 codons of TLR sequence (Table 1). The length of the sequenced region for each gene/species combination varied, as a range of PCR primers were used, although sequences from the same gene showed a high degree of overlap between species (compare starting positions referenced against the chicken genome, Table 1). We sequenced a mean of 20.1 individuals per species per gene (range 17–24 individuals), a mean of 6.2 genes per species (range 5 [kiwi, kakapo, hihi, rock wren] to 8 [robin, saddleback] genes) and each gene was sequenced in a mean of 6.9 species (range 5 [TLR7, TLR15] to 10 [TLR1LA] species) (Table 1).

Table 1 TLR sequencing results and diversity measures

All sequenced TLR regions were polymorphic in kiwi (N = 5 loci), kakariki (N = 5 loci) and robin (N = 8 loci); all species were polymorphic at one or more TLR loci (Table 1). TLR1LA showed the highest rate of polymorphism, being variable in 9 of 10 (90 %) species sequenced, followed by TLR4 and TLR15 (both variable in 5 of 6 [83 %] species sequenced) (Table 1). TLR21 and TLR2B showed the lowest rates of polymorphism (for both loci, 40 % of sequenced species were variable) (Table 1). We detected 219 SNPs in total over all loci and species; there were slightly more non-synonymous sequence variants (112) than synonymous variants (109) (Table 1). We observed a mean of 5.34 SNPs per polymorphic alignment, although there was considerable variation in this statistic (SD = 4.89; N = 41 species–gene alignments).

None of our alignments showed frame-shift mutation, indicating no evidence of potential pseudogenisation that has been previously observed in some passerine TLR5 alignments (Alcaide and Edwards 2011; Bainová et al. 2014). Only one alignment (hihi TLR21) appeared to show indel variation within species; the chromatogram data for two individuals appeared to show a heterozygote 3-bp indel polymorphism in two independent PCRs each. Cloning or next-generation sequencing data would be required to confirm this indel variant; TLR21 data for these two individual hihi were excluded from subsequent analyses. TLR7 has previously been observed to be duplicated in passerine birds (Cormican et al. 2009; Alcaide and Edwards 2011; Grueber et al. 2012; Hartmann et al. 2014); herein TLR7 sequence data were obtained for two passerines: saddleback and rock wren. The TLR7 sequencing chromatograms for all saddleback samples (N = 20) appeared to show heterozygosity at five nucleotide sites. This complete heterozygosity is suggestive of a duplication of TLR7, if the five “variable” sites comprise differences between the two copies of the gene, rather than heterozygosity per se; cloning would be required to confirm this hypothesis.

New Zealand rock wren is of particular interest for TLR7, as the species belongs to the ancient family Acanthisittidae, which is phylogenetically distinct from all other passerines (Ericson et al. 2002). It is therefore informative to determine whether rock wren, as a representative of this group, show a pattern of TLR7 duplication similar to other passerines, or a single copy of TLR7 as found in non-passerine bird species. In the sequencing data, rock wren showed repeatable variation among individuals in the relative chromatogram peak heights at 12 apparently heterozygous nucleotide sites at TLR7 (N = 21). These observations could be explained by a gene-duplication wherein either copy is heterozygous at these sites. In addition, two individuals exhibited one apparent tri-nucleotide position in the Sanger sequencing chromatograms of two independent PCRs per individual. These two observations support coamplification of a possible gene-duplication of TLR7 in rock wren, suggesting the duplication occurred early in the divergence of the Passeriformes, although cloning or next-generation sequencing data of the amplified products would be required to confirm this. Due to the possibility of coamplification of paralogous sequences, TLR7 data for saddleback and rock wren were excluded from subsequent analyses.

Selection on TLR sequences

Within species, three loci showed statistically significant heterozygote excess (kiwi TLR2B, takahe TLR15 and hihi TLR1LA) and one showed significant heterozygote deficit (robin TLR21) at α = 0.05 (Supplementary Table S2), although only hihi TLR1LA was significant after accounting for multiple testing (sequential Bonferroni correction, 41 tests; Holm 1979). Furthermore, slight heterozygote excess was observed at approximately half (49 %) of our 41 gene/species samples, suggesting no systematic pattern of deviation from Hardy–Weinberg equilibrium at toll-like receptor loci among the 10 species studied here (Supplementary Table S2). Therefore, with the possible exception of hihi TLR1LA, there was no strong evidence for balancing selection at TLR loci within these species.
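The sequential Bonferroni (Holm) procedure used for the multiple-testing correction can be sketched as below. This is a generic illustration of the step-down rule; the P values in the example are hypothetical and are not those in Supplementary Table S2.

```python
def holm_rejections(p_values, alpha=0.05):
    """Holm's sequential Bonferroni procedure.

    P values are visited from smallest to largest; the i-th smallest
    (i = 0, 1, ...) is compared with alpha / (m - i). Testing stops at the
    first non-significant result, and all remaining tests are retained.
    Returns a list of booleans in the original order (True = rejected).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, index in enumerate(order):
        if p_values[index] <= alpha / (m - rank):
            rejected[index] = True
        else:
            break
    return rejected


# Hypothetical example with 41 tests: the smallest P value must fall
# below 0.05 / 41 (about 0.0012) to remain significant.
example = [0.001] + [0.5] * 40
print(sum(holm_rejections(example)))
```

With 41 tests, a nominally significant result near P = 0.05 therefore does not survive correction unless it is very small.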

Most Tajima’s D and Ewens–Watterson test statistics showed no deviations from neutrality for the four species that had ≥ 2 loci with ≥ 5 haplotypes (Supplementary Table S3). These results are consistent with genetic drift as the predominant determinant of the observed haplotype frequencies in these four species. One exception was TLR5 for kokako, for which the Ewens–Watterson test statistic indicated that allele frequencies were more uniform than predicted under neutrality (Supplementary Table S3). This result suggests that a form of balancing selection is potentially operating on TLR5 in kokako, an observation that is unlikely to result from demographically induced biases (such as changes in population size or incomplete isolation), as TLR15 showed no such deviation in kokako and all loci are expected to be similarly impacted by demographic processes (Supplementary Table S3).

Comparing diversity of the three TLRs with viral ligands (TLR3, TLR7 and TLR21) to the other TLRs, which have proteinaceous ligands, we observed lower diversity among the virus-sensing loci in terms of h, π, k and number of SNPs (Supplementary Fig. S1). The 95 % confidence intervals for the regression slopes excluded zero for all diversity measures, with viral-sensing TLRs showing, on average, 2.2 fewer haplotypes, k decreased by 0.557, π decreased by 0.00048 and 2.7 fewer SNPs (1.1 fewer synonymous; 1.6 fewer non-synonymous) than TLRs that bind proteinaceous ligands, as inferred from the modelling results (Supplementary Table S3). Together, these results highlight potential differences in long-term (i.e. over evolutionary timescales) selection pressures experienced by different TLR loci.

Association of microsatellite and TLR diversities

Across all 10 study species, we recorded microsatellite genotype diversity data for a total of 146 polymorphic loci, representing a mean of 14.6 loci per species (range 8 [kokako] to 25 [kakapo] loci) (Table 2). We used internal relatedness (IR) as our metric of individual MLH, noting that IR is a measure of homozygosity, and is expected to decrease with increasing levels of genome-wide heterozygosity (i.e. IR is expected to correlate positively with inbreeding coefficient).

Table 2 Summary statistics for microsatellite data

On average across all species, MLH showed no relationship with TLR heterozygosity; the very weak negative effect was statistically non-significant, as the 95 % credible interval included zero (Fig. 2A; full model results in Supplementary Table S5). The model suggested that there were some species-specific effects (Fig. 2B): species-level slope variance was 0.023, translating to a between-species standard deviation in slopes of 0.153 (Table 3). Comparing this value to the magnitude of the overall slope (−0.013; Table 3) suggests that there may be some variation among species; however, as seen in Fig. 2B, all species display similar null relationships. Note that these values are all estimated with poor precision (Table 3).

Fig. 2 Relationship between IR, a microsatellite-based measure of individual multilocus heterozygosity (MLH), and TLR heterozygosity. A The data from each individual (N = 216); the size of each point indicates the number of TLR loci genotyped for that individual (range 1–8). The fitted line on A is the main effect indicated as “overall” in B. B Forest plot comparing the species-level slopes on the logit scale (random slopes model fitted using MCMCglmm; see “Methods” section). Points are the posterior mean; error bars are the 95 % credible interval. Species are indicated on the y-axis of B, grouped by avian order (“Pa” Passeriformes, “Ps” Psittaciformes, “Gr” Gruiformes, “Ap” Apterygiformes)

Table 3 Linear mixed model estimates for TLR heterozygosity

We did observe some variation in the precision of slope estimates (i.e. the slope errors; Fig. 2B). It is possible that variation in the number of microsatellite loci used for each species may contribute to this variation, as well as the diversity of those loci (see Table 2). This may occur if, for example, greater numbers of microsatellite loci, or loci with greater diversity, facilitate more-precise estimates of individual MLH than using less-informative microsatellite data. This hypothesis would predict a negative relationship between the amount of microsatellite data and the magnitude of species-level slope errors (width of the 95 % CI for the relationship between microsatellite MLH and TLR heterozygosity of each species). We found no such relationship with the number of loci used, mean gene diversity (expected heterozygosity) of loci, nor mean number of alleles at microsatellite loci (Supplementary Fig. S2). Therefore we had no evidence that the variation in slope errors was correlated with the amount or quality of microsatellite data used.

Discussion

Here we report population-level toll-like receptor diversity statistics for 10 threatened New Zealand bird species. Many loci were variable: SNPs were observed in all sequenced loci for three species, and all species were polymorphic at one or more TLR loci (Table 1).

A recent phylogenetic study of the evolution of TLRs across bird species found a high degree of episodic positive selection, consistent with a pathogen-mediated model of evolution (Grueber et al. 2014). At the within-population level, the current study provided little evidence for balancing selection at TLR loci within these 10 populations. Although hihi appeared to have an excess of heterozygotes at TLR1LA, there was no systematic pattern of heterozygote excess (relative to Hardy–Weinberg expectations) across species or genes, indicating no evidence of balancing selection. We note, however, that comparing the frequency of observed heterozygotes to Hardy–Weinberg expectations only permits detection of selection over a single generation; very strong selection would be required to drive a statistically significant result over this short timeframe (Spurgin and Richardson 2010; Hedrick 2012). Comparing haplotype frequencies to neutral expectations for four species also failed to produce evidence of balancing selection, with the possible exception of TLR5 for kokako. This latter result is surprising, given a recent report of multiple independent TLR5 pseudogenisation events in passerine evolution (Bainová et al. 2014). It is unclear why these differences among species occur (Bainová et al. 2014), although further investigation into specific pathogen pressures experienced by these species, with specific comparison to kokako, would be valuable. Overall, a general lack of evidence for balancing selection in these small populations supports findings from a pedigree-based study of a population of Stewart Island robin that genetic drift is a key determinant of population TLR diversity following a bottleneck (Grueber et al. 2013). These results are also similar to findings from studies of MHC, which have found that genetic drift can be a strong determinant of diversity after a severe population bottleneck (e.g. Miller and Lambert 2004).

We observed no relationship between microsatellite IR (a measure of homozygosity) and TLR heterozygosity across individuals from 10 threatened New Zealand bird species. These results suggest that microsatellite MLH is not a good indicator of inter-individual variation in heterozygosity at genic regions, such as TLR loci. These findings support claims that microsatellite MLH, estimated with a relatively small number of markers, is often not a reliable predictor of individual-level genome-wide heterozygosity (Balloux et al. 2004; Miller et al. 2014). Importantly, our results highlight the value of studying functional genomic regions, such as TLR sequences, due to the additional information these loci can provide about changes in genetic diversity in conservation contexts.

All of the species we examined are threatened and have therefore undergone population bottlenecking to varying degrees (“Appendix” section). A recent report of population-level TLR diversity in two widespread species, house finch Carpodacus mexicanus and lesser kestrel Falco naumanni, indicated much higher levels of TLR diversity than we observed in the 10 threatened species studied here (Alcaide and Edwards 2011; Table 1). To examine the association between microsatellite and TLR heterozygosity in the current study, we focused on within-population relationships (i.e. utilising individual-level statistics). It was not possible to evaluate the relationship between mean microsatellite and TLR diversity across populations (i.e. utilising population-level statistics, such as population size), for two important reasons. First, technical differences in microsatellite characterisation and genotyping protocols between species would likely result in non-homology between loci across species, as well as cross-species variation in ascertainment bias. For example, microsatellite data for some species comprised mainly loci characterised for the species themselves (e.g. takahe; Grueber et al. 2008a), while others primarily used loci characterised in related species (e.g. robin; Boessenkool et al. 2007). Thus, population-level statistics based on microsatellite data cannot be directly compared across species. Second, we were unable to amplify the same TLR loci in all species (although there was a high degree of overlap, Table 1). TLRs bind a diversity of pathogen-associated molecular patterns, so amplifying different loci across species may drive differences in mean levels of diversity. For example, we observed reduced TLR diversity in viral-sensing TLRs, compared to other TLRs (Supplementary Fig. S1), similar to findings in mammals (Areal et al. 2011). Furthermore, even when considering the same TLR locus, differences in evolutionary, ecological and life-history traits of each species may drive differences in selection on TLR diversity, complicating comparisons across species. We do not have data regarding the particular parasite or disease burdens experienced by any of the individuals included herein, which might have enabled us to partially control for these effects.

Overall, we have observed that, within populations of conservation concern, microsatellite MLH contains no signal of inter-individual variation in heterozygosity of TLRs. Thus, because of their ease of genotyping (relative to MHC immunity genes), TLRs represent a valuable addition to the conservation genetic toolkit for the study of functional genetic variation in non-model species. The population-level TLR diversity data presented here, for 10 bird species of conservation concern, will provide a valuable comparison for similar studies in common species, as well as a starting point for studies of the relationship between TLR diversity and fitness in these and related taxa.