Introduction

In the face of dramatic rates of loss of biodiversity over the past few decades (Barnosky et al. 2011), conservation genetics has become a major branch of modern biology (Ouborg et al. 2010). New trends have recently emerged, moving the field from the assessment of random genetic drift, dispersal, population structure and genetic diversity using selectively neutral markers towards the assessment of adaptively important genetic variation underlying ecologically important phenotypic traits (Kohn et al. 2006; Ouborg et al. 2010; Piertney and Webster 2010). Assessing the consequences of population fragmentation and declining population size on the fate of adaptively important genetic variation in natural populations requires characterization of candidate genes known to be under selection or important to fitness in non-model species (Piertney and Webster 2010). Two distinct approaches can be used to identify and characterize genes of interest. The first of these is a ‘top-down’ approach where genes associated with a specific trait are identified, or genomic scans are used to identify loci under selection (e.g. Beaumont and Balding 2004; Storz 2005; Excoffier et al. 2009), with subsequent examination of sequence variation at these loci (i.e. the traits of interest or genome scans come first and then specific candidate genes are identified). The second is a ‘bottom-up’ approach where genetic/genomic information from related species is used to develop primers for specific genes already of interest (Piertney and Webster 2010).

Loci involved in immunity are an obvious choice for conservation geneticists wishing to examine genetic variation underlying adaptively important traits related to fitness. Divergent positive selection may be expected to lead to the presence of genetic variation in populations with different selective regimes imposed upon them by differing parasite/pathogen loads in space or time, or balancing selection may be expected to maintain genetic polymorphisms in cases where there may be negative frequency dependent selection or overdominance (Slade and McCallum 1992; Hedrick 1999). The major histocompatibility complex (MHC) is a case in point. MHC loci are involved in the adaptive immune response in vertebrates and the locus complex is one of the most variable coding regions known (Hedrick 1999). Maintenance of this variation is usually attributed to negative frequency-dependent selection, overdominance and diversifying selection (see introduction in Sutton et al. 2011) and numerous studies have examined MHC variation in a conservation and/or population genetics based context (e.g. Jarvi et al. 2004; Aguilar et al. 2004; Aguilar and Garza 2006; Radwan et al. 2010; Sutton et al. 2011). However, MHC variation represents only one aspect of immunity, and a wider approach to studying immunogenetics in natural populations has been advocated (Acevedo-Whitehouse and Cunningham 2006). Further, being part of the adaptive immune response, MHC loci are only found in vertebrates, which themselves represent only a small fraction of existing biological diversity (May 1988).

In addition to the adaptive immune response, both vertebrates and invertebrates possess a general immune response mediated by the innate immune system (Gillespie et al. 1997). Since invertebrates lack the adaptive immune response shown by vertebrates it was previously assumed they would lack the ability to generate a specific response; however, recent studies show that the innate immune response can be highly specific in invertebrates (Kurtz and Franz 2003; Little et al. 2003; Sadd and Schmid-Hempel 2006). Specificity of the innate immune response could be generated by high genetic diversity of parasite and pathogen recognition receptors and/or immune effectors, synergistic interactions amongst immune system loci, and/or dosage effects (Schulenberg et al. 2007).

Partly due to the reasons outlined above (a wider approach to considering immunity and investigation of immunity in invertebrate systems) investigations of patterns of natural selection acting on innate immune loci have lately been of interest both in vertebrates (Ferrer-Admetlla et al. 2008; Tschirren et al. 2011) and invertebrates (see literature cited below). In humans, evidence of balancing selection at some innate immunity loci has emerged (Ferrer-Admetlla et al. 2008) although other authors have found purifying selection to be the dominant force (Mukherjee et al. 2009). In invertebrates, estimates of selection on innate immune system loci have revealed different modes which vary between loci and study species. Purifying selection has been observed, for example, in a putative gram-negative binding protein (GNBP) in Daphnia (Little et al. 2004) and in peptidoglycan recognizing proteins (PGRPs) in mosquitoes (Little and Cobbe 2005) as well as in anti-microbial peptides (AMPs) in Drosophila (Jiggins and Kim 2005). A signature of a partial selective sweep (when a beneficial mutation nearly reaches fixation in a population, particular genetic signatures can be detected; e.g. see Macpherson et al. 2008) has been found at Eater (a pattern recognition receptor involved in phagocytosis; Kocks et al. 2005) in Drosophila, but there was no evidence for positive selection at pathogen-interacting repeat regions of the gene (Juneja and Lazzaro 2010). Clark and Wang (1997) found evidence for departure from neutrality but low heterogeneity in two AMP genes in Drosophila. In contrast to GNBP, there is evidence for positive directional selection for alpha-2-macroglobulin (A2M) in Daphnia (Little et al. 2004). In Drosophila, rapid evolution and positive selection have been found in some scavenger receptor loci, but not others (Lazzaro 2005). Rapid evolution of RNAi genes has also been documented in Drosophila (Obbard et al. 2006).
Schlenke and Begun (2003) examined immunity and non-immunity related genes in Drosophila and concluded that there was an important role for directional selection in the evolution of immune genes, although there was no strong evidence for the maintenance of protein diversity. When selection of immune genes across the Drosophila genus was examined some genes showed evidence for adaptive evolution across the group, while positive selection in others was lineage specific (Morales-Hojas et al. 2009). In a recent study of Hymenoptera, direct evidence for positive selection was only documented for a single ant gene (a PGRP; Viljakainen et al. 2009). This list of studies is not intended to be exhaustive, but illustrates the contrasting patterns observed so far across loci and taxa in studies largely focused on model organisms.

Recently a ‘PAMP/PRR paradigm’ for making population genetic predictions regarding the nature of selection at innate immune system loci has been proposed by Little et al. (2004): PAMPs are pathogen-associated molecular patterns and PRRs are pattern recognition receptors produced by a host that recognise the pathogen’s PAMPs. If PAMPs are associated with functionally-constrained regions of a pathogen (i.e. conserved regions essential for the survival of the pathogen, e.g. components of the cell wall consisting of polysaccharides), evolution of PAMPs will be limited, hence purifying selection rather than directional selection may be expected at PRR loci (in other words an evolutionary ‘arms race’ will not begin). Conversely, when immune interactions between a host and a parasite involve protein–protein interactions an evolutionary ‘arms race’ may be more likely (Little et al. 2004). Thus the expectation is that when a recognition locus interacts with an essential bacterial cell wall component (like peptidoglycan) purifying selection would be expected, whereas if loci interact with bacterial proteins directional selection would be predicted (Little et al. 2004).

Bumblebees offer an ideal model system in which to bring together these issues. As social insects, they live in environments ideal for the rapid spread of infectious disease, i.e. aggregations of highly related individuals living together in close contact with mutual feeding and common stores of food (see Klaudiny et al. 2005; Baer et al. 2005). In social insects, genetic diversity is thought to be particularly important in reducing susceptibility to parasite and pathogen infection (e.g. Sherman et al. 1988; Hughes and Boomsma 2006). Indeed, this is one explanation for the existence of polyandry in some species of this group (Brown and Schmid-Hempel 2003). Pathways underlying the innate immune response have also been investigated and several genes have been shown to be up-regulated following immune system challenge in honeybees (Evans et al. 2006). Work at the molecular level in non-model social insects is also greatly aided by the complete genome sequencing of the honeybee in 2006 (Honeybee Genome Sequencing Consortium 2006); thus other bees can be considered ‘genome-enabled taxa’ (Kohn et al. 2006). The evolutionary ecology of immunity in Bombus terrestris has also been well-studied (Mallon et al. 2003; Baer and Schmid-Hempel 2006; Moret and Schmid-Hempel 2009). Many bumblebee species are undergoing declines in range and abundance while other species remain widespread and abundant (Goulson et al. 2005) and the conservation genetics of several species has been examined using neutral markers (Darvill et al. 2006; Ellis et al. 2006; Charman et al. 2010; Lozier et al. 2011).

The motivation of the current study was thus to examine the evolutionary genetics of innate immune system loci in bumblebees but to apply this within a conservation-based context of this ecologically and economically important group. We aimed to screen a range of candidate loci in an analysis of patterns of selection and examine whether the observed pattern of selection meets the expectations of the ‘PAMP/PRR paradigm’ (Little et al. 2004). An additional goal was then to measure intronic variation to find polymorphisms that whilst neutral may be linked to regions under selection and allow genetic variation to be assayed using an exon-priming intron crossing (EPIC) approach (Palumbi 1995). In this way, functional variation across a wider set of populations could be assayed in the future in natural populations of non-model species in a ‘wildlife immunogenetics’ context (Acevedo-Whitehouse and Cunningham 2006).

Experimental procedures

Study species

Samples of four ubiquitously distributed species (Bombus lapidarius (Linnaeus), Bombus hortorum (Linnaeus), Bombus pascuorum (Scopoli) and Bombus pratorum (Linnaeus)) and two declining species (Bombus humilis (Illiger) and Bombus monticola (Smith); Edwards and Telfer 2002; Edwards and Broad 2005) were collected in south-west England in the spring and summer of 2010 (Fig. 1). B. pascuorum and B. humilis represent a common and declining species pair within the same subgenus (Thoracobombus), respectively, as do B. pratorum and B. monticola (Pyrobombus). Permission for sampling declining species was obtained from Natural England as well as the managers of each sampling locality. Samples of B. humilis were taken from a stretch of the north coast of Cornwall from Park Head (Ordnance survey grid reference [OSGR] SW8470; latitude 50.49742, longitude −5.04602) to Boscastle (OSGR: SX0991; latitude 50.68668, longitude −4.70425; Fig. 1). B. humilis is sparsely linearly distributed across this area, being confined to suitable coastal habitats in this region (Saunders 2008). Samples of B. monticola were taken from a single 10 km square area within Dartmoor National Park (Fig. 1) and are thus likely to represent a single panmictic population. All other species are ubiquitous across the UK, thus samples taken from within south-west England were assumed to originate from a single panmictic population (a justifiable assumption since there is no differentiation across European continental samples of B. terrestris (Estoup et al. 1996) and no differentiation between widely separated samples (125 km) of B. pascuorum in the UK (Ellis et al. 2006)). Eight males per species were used for genetic analyses (see below; Supplementary Table 1). A worker sample was also taken for future work (all workers of declining species were sampled non-lethally (Holehouse et al. 2003)). 
Males are particularly useful for investigating genetic sequence variation in this context since they are haploid and thus eliminate the need for haplotype phase inference that is required for diploid data.

Fig. 1 Map indicating general sampling area (see Supplementary Table 1): 1 Plymouth, 2 Wembury, 3 Dartmoor, 4 Kit Hill, 5 Boscastle, 6 Tintagel, 7 Pentire Point, 8 Trevose Head, 9 Park Head

Selection of candidate loci

A peptidoglycan recognition protein (PGRP-S, You et al. 2010) was selected as an innate immune gene likely to be under purifying selection, as would be expected under PAMP/PRR predictions (peptidoglycan, also known as mucopeptide, is a fundamental component of bacterial cell walls; Perkins 1963). However, there is evidence for positive selection in PGRP-SA in ants (Viljakainen et al. 2009) although this is not the case in honey bees (for PGRP-S2 and PGRP-SA loci). In Bombus ignitus Smith, PGRP-S has been shown to be induced after injection with Bacillus thuringiensis and it is thus considered likely that PGRP-S functions in anti-Bacillus activity (You et al. 2010). In the honey bee PGRP-LC and PGRP-S2 are up-regulated following immune challenge (Evans et al. 2006). In honey bees there are only four PGRP genes compared to 13 in Drosophila and 7 in Anopheles (Evans et al. 2006). PGRP-S1, 2 and 3 are candidates for involvement in the Toll pathway, whilst PGRP-LC is involved in the Imd pathway (Evans et al. 2006). Since peptidoglycan is a major component of bacterial cell walls (Perkins 1963), the null expectation was for a lack of directional selection at PGRP.

For a candidate locus more likely to be under positive directional selection, we sequenced a putative alpha-2-macroglobulin gene region. Thioester-containing proteins (TEPs, including alpha-2-macroglobulin) exist in multi-gene families in Anopheles and Drosophila and are known to have a role in the immune response (Blandin and Levashina 2004). TEP genes have also been shown to be under positive selection in both Anopheles (Obbard et al. 2008) and Drosophila (Jiggins and Kim 2006). Alpha-2-macroglobulin (A2M) is also known to be under positive selection in Daphnia (Little et al. 2004). In Daphnia this gene consists of a “bait” fragment and a “thiol ester” fragment and it is the bait fragment that has been shown to be under positive selection (Little et al. 2004). In Bombus spp. it was possible to generate primers for a region with similarity to the A2M bait region in Daphnia (see below).

Scavenger receptors (Sr) are known to have a role in innate immunity binding a wide range of pathogens (reviewed in Gough and Gordon 2000). Expectations regarding Sr polymorphism and selection are unclear: they may recognise conserved bacterial regions as with PRRs and thus be expected to be under purifying selection, yet initial estimates of wild populations of Drosophila suggest they may show a more complex pattern than other loci (Lazzaro 2005). In fact, Lazzaro (2005) found Drosophila SrC-I and SrC-III to exhibit rapid evolution and positive selection, with SrCII more conserved than SrCI. Consequently, a Bombus scavenger receptor was also sequenced in this study as a possible gene region where positive selection might be expected to occur.

DNA extraction and PCR development

For all loci and species, DNA was extracted using a modified ammonium acetate protocol (original reference Nicholls et al. 2000, modified protocol available on request). In all cases, DNA was amplified using the QIAGEN Taq PCR core kit (West Sussex, UK).

Primers from You et al. (2010) were used to amplify and sequence PGRP-S in the study species listed above (‘You forward’ ATGACAAAGCTAATAGCGGTG; ‘You1076 reverse’ CGAGATTCCTAAGGGTTGACTTTGG; ‘You1076 forward’ CCAAAGTCAACCCTTAGGAATCTCG; ‘You reverse’ TCAGATCGACGACCAGTGAGGC). In order to ‘bridge’ internal gaps in our resulting sequences due to the use of a single internal primer site (the ‘You1076’ forward and reverse primers are the reverse complement of one another), additional sequencing primers were developed (forward GCTGACGATTCGAGGACATGT; reverse CTTGACAAATCGCTCGAGTCG). PCR was performed in 20 μl reaction volumes. Each reaction contained 1 unit of Taq polymerase, 2 μl 10X reaction buffer, 0.5 μM each primer, 0.2 mM each dNTP, 2 μl template DNA (extracts unquantified) and 13.4 μl molecular grade water. PCR conditions were: 94 °C for 3 min followed by 15 cycles (touchdown) of 94 °C 30 s, 63–55 °C 30 s, 72 °C 2 min, then 25 cycles of 94 °C 30 s, 53 °C 30 s, 72 °C 2 min, followed by a final extension of 72 °C for 10 min.
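The statement that the ‘You1076’ forward and reverse primers are reverse complements of one another can be checked directly from the sequences given above. The following is a minimal sketch in Python; the function and variable names are illustrative and are not part of the original protocol.

```python
# Complement table for unambiguous DNA bases only
# (IUPAC ambiguity codes are not handled in this sketch).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences exactly as listed in the text (5'->3').
you1076_forward = "CCAAAGTCAACCCTTAGGAATCTCG"
you1076_reverse = "CGAGATTCCTAAGGGTTGACTTTGG"

# True when both internal primers anneal to the same site on opposite strands,
# which is why reads from the two primers leave a gap at the primer site itself.
same_site = reverse_complement(you1076_forward) == you1076_reverse
print(same_site)
```

For the two primer sequences listed, the comparison evaluates to True, which is consistent with the need for additional sequencing primers to bridge the internal gap.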

The development of primers for the putative alpha-macroglobulin gene region was technically challenging. Briefly, an alignment of existing A. mellifera (honeybee) and Nasonia vitripennis (a parasitoid wasp) mRNA sequences with similarity to alpha-2-macroglobulin was made (XM392454.3 and XM001604143.1, respectively). An alignment of A. mellifera and Daphnia magna mRNA sequences was also made (D. magna, AY540091.1). Examination of the latter alignment allowed identification of a region in A. mellifera with similarity to the bait region of D. magna. The alignment between A. mellifera and N. vitripennis was then used to develop conserved primers for amplification of this region in Bombus. A single primer pair produced only weak PCR products, hence a nested PCR design was developed using two sets of conserved primers for the region of interest. Successful PCR using these primers then allowed initial putative Bombus alpha-2-macroglobulin to be sequenced. Bombus-specific A2M primers were then developed from the resulting sequences. However, PCR amplification was weak when these primers were used alone. Consequently a nested-PCR was developed using one set of conserved primers from the Apis and Nasonia alignment followed by the Bombus specific primers. PCR products were then sequenced in full by outsourced primer-walking (Macrogen Europe, Amsterdam). Further Bombus-specific primers were then developed from the primer walking results and these were used in a semi-nested PCR to develop the PCR products for sequencing across species. All primer sequences and further details are available on request from the authors. The final PCR methodology involved initial PCR primers (Fwd: GACCTATGCATTCGCAAAACTAG; Rev: TCGCTGAAATCCATCGGGC) followed by a second PCR using the same forward primer, but with an internal reverse primer (CTAGTTTTGCGAATGCATAGGTC). 
Cycle conditions for the initial PCR were 94 °C for 3 min followed by 35 cycles of 94 °C 30 s, 52 °C 30 s, 72 °C 2 min followed by a final extension of 72 °C for 10 min. Products were amplified using 1 unit of Taq polymerase, 2 μl 10X reaction buffer, 0.2 μM each primer, 0.2 mM each dNTP, 2 mM MgCl2, 2 μl template DNA (extract concentration unquantified) made up to a volume of 20 μl with molecular grade water. For the second PCR, 1 μl of a 1:10 dilution of the first PCR was amplified in a 20 μl reaction as described above, except that cycling conditions were altered to 94 °C for 3 min followed by four cycles of 94 °C 30 s, 54 °C 30 s, 72 °C 2 min; then 35 cycles of 94 °C 30 s, 53 °C 30 s, 72 °C 2 min and a final extension of 72 °C for 10 min.

PCR optimization of the scavenger receptor region was also challenging. An initial scavenger receptor-like Bombus mRNA was identified from existing EST data (Sadd et al. 2010). This mRNA sequence was aligned with existing Apis mellifera mRNA with sequence similarity to scavenger receptor C (XM394726). Primers were then designed to amplify a ~2,300 bp region and this was sequenced by outsourced primer-walking (Macrogen Europe, Amsterdam). From these sequence data it was possible to generate primers to amplify the region across Bombus species. The sequencing approach involved sequencing 5′UTR, 5′ and 3′ regions in separate PCRs. For the 3′end primer sequences were: forward CCAATTAATGCGTGTCCAGG; reverse GTTGTTACGAGGGTTTCACTC. PCR reactions contained 1 unit of Taq polymerase, 2 μl 10X buffer, 0.5 μM each primer, 0.2 mM each dNTP, 2 μl template DNA (extract concentrations unquantified) and 13.4 μl water using a QIAGEN Taq PCR core kit (West Sussex, UK). Cycle conditions were 94 °C for 3 min followed by thirty-five cycles of: 94 °C 30 s, 54 °C 30 s, 72 °C 12 min and a final extension of 72 °C for 10 min. The 5′ end was amplified using two separate protocols. One primer pair (Fwd: TATTCACCGTACTCATAACTCG; Rev: CTTCGATCGTTCTTCGCGTG) was used to amplify the extreme 5′ region in a PCR containing 1 unit of Taq polymerase, 2 μl 10X buffer, 0.5 μM each primer, 0.25 mM each dNTP, 2 mM MgCl2, 2 μl template DNA (extracts unquantified) and 12.9 μl molecular grade water using a Qiagen core kit (West Sussex, UK). Cycling conditions were 94 °C for 3 min followed by thirty-five cycles of: 94 °C 30 s, 48 °C 45 s, 72 °C 1 min 30 s and a final extension of 72 °C for 10 min. The second 5′ region (internal to this) was amplified in a semi-nested PCR design. The first PCR used primers with the sequences GCACAACGTTATCCTCGGCACA (Fwd); CTTCGATCGTTCTTCGCGTG (Rev). 
PCR products were amplified in 20 μl reactions containing 1 unit Taq polymerase, 0.2 μM each primer, 2 μl 10X buffer, 0.2 mM each dNTP, 2 mM MgCl2, 2 μl template DNA (extracts unquantified) and 14.2 μl water using a Qiagen core kit (West Sussex, UK). Cycle conditions were: 94 °C for 3 min followed by 35 cycles of: 94 °C 30 s, 54 °C 30 s, 72 °C 2 min and a final extension of 72 °C for 10 min. The internal PCR used the same forward primer, but with an internal reverse (AGCTGTGCTAAGAAAGGACGCG). The PCR mix was as above except 1 μl of a 1:10 dilution of the first PCR was used and the reaction volume was adjusted to 20 μl with 15.2 μl of water. Cycle conditions were the same, except the annealing temperature was raised to 59 °C.
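The water volumes quoted for the semi-nested scavenger receptor reactions follow from keeping the total volume at 20 μl while changing only the template volume. A short Python sketch of that arithmetic is given below; the helper name is hypothetical, and the combined volume of the remaining reagents is inferred from the stated figures rather than reported in the text.

```python
REACTION_VOLUME_UL = 20.0  # total reaction volume stated in the text


def water_volume(template_ul: float, other_reagents_ul: float) -> float:
    """Water needed to bring a reaction up to the total volume (in microlitres)."""
    return round(REACTION_VOLUME_UL - template_ul - other_reagents_ul, 1)


# First PCR: 2 ul template and 14.2 ul water are stated, so the remaining
# reagents (Taq, buffer, primers, dNTPs, MgCl2) together occupy the rest.
other_reagents_ul = round(REACTION_VOLUME_UL - 2.0 - 14.2, 1)

# Internal PCR: only the template volume changes (1 ul of diluted first-round product).
internal_water_ul = water_volume(template_ul=1.0, other_reagents_ul=other_reagents_ul)
print(other_reagents_ul, internal_water_ul)
```

Under these assumptions the inferred volume of the remaining reagents is 3.8 μl and the internal reaction requires 15.2 μl of water, matching the figure given in the text.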

Sequencing

Sequencing of initial or short PCR products during development was done ‘in-house’ using a standard cycle sequencing reaction (96 °C for 1 min, then 30 cycles of 96 °C for 10 s, 50 °C 5 s, 60 °C for 4 min) and a 3130 Genetic Analyser (Applied Biosystems [Life Technologies], Warrington UK). Sequencing reactions contained one-eighth final concentration Big-Dye terminator v3.1 ready reaction mix (Applied Biosystems [Life Technologies], Warrington, UK), 3.5 μl sequencing buffer, 8.5 μl water and 2 μl PCR product. All other sequencing and primer-walking was outsourced to Macrogen Europe, Amsterdam. Prior to sequencing, PCR products were either run out by agarose gel electrophoresis and bands excised and cleaned using a Qiaquick gel extraction kit (Qiagen, Crawley, West Sussex UK) or were cleaned using an ExoI and shrimp alkaline phosphatase protocol (available on request).

Statistical analyses

All sequence alignments were made in BioEdit 7.0.5.3 (Hall 1999) using Clustal W (Thompson et al. 1994) and then imported into DnaSP (Rozas et al. 2003) for further analysis. Coding regions and their reading frames were inferred either by alignment of Bombus sequence data with Bombus ignitus mRNA sequences (for PGRP), by alignment with existing Apis mellifera mRNA (for alpha-2-macroglobulin) or by alignment with B. terrestris EST data and then with A. mellifera mRNA to confirm reading frame and orientation (i.e. confirmation of plus strand; for scavenger receptor, Sr). Intronic sequences were then manually spliced out and coding regions translated to confirm the absence of stop codons (confirming the inferred reading frame was correct). Further analyses were made using these inferred coding regions only in DnaSP (Rozas et al. 2003). Basic measures of sequence polymorphism such as nucleotide diversity (π), total number of substitutions and number of substitutions in non-coding and coding regions were estimated. Other intra-specific polymorphisms, such as the presence of microsatellites and insertion/deletion events (indels) in intronic regions, were also recorded from sequence alignments.
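The reading-frame check described above (splice out introns, translate, confirm that no stop codons interrupt the inferred coding region) can be sketched in a few lines of Python. This is an illustration only: the analysis itself was done manually and in DnaSP, the function names are hypothetical, the toy sequence is invented, and the standard genetic code is assumed.

```python
from typing import List, Tuple

# Standard genetic code in the conventional TCAG ordering; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def splice_exons(genomic: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive) into one coding sequence."""
    return "".join(genomic[start:end] for start, end in exons)


def translate(coding: str, frame: int = 0) -> str:
    """Translate complete codons from the given frame; unknown codons become 'X'."""
    coding = coding.upper()
    codons = (coding[i:i + 3] for i in range(frame, len(coding) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def has_internal_stop(protein: str) -> bool:
    """True if a stop codon occurs anywhere before the final residue."""
    return "*" in protein[:-1]


# Invented toy example: two exons separated by a 9 bp intron.
genomic = "ATGGCC" + "GTAAGTTAG" + "AAAGGT"
exons = [(0, 6), (15, 21)]

spliced_protein = translate(splice_exons(genomic, exons))
unspliced_protein = translate(genomic)
print(spliced_protein, has_internal_stop(spliced_protein))
print(unspliced_protein, has_internal_stop(unspliced_protein))
```

In this toy case the correctly spliced sequence translates without interruption, whereas leaving the intron in place introduces a premature stop, which is the signal used to reject an incorrect splice or frame.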

Overall, a lack of intra-specific polymorphism was observed at all loci (see Results) and very few inter-specific differences were encountered. Consequently, statistical estimates of selection are limited by a lack of power. However, we stress that this does not make our results uninteresting, since observing so little variation in itself addresses our original hypothesis. Hence, we provide only estimates of the ratio of the average number of non-synonymous substitutions per non-synonymous site (Ka) to the average number of synonymous substitutions per synonymous site (Ks) (Li et al. 1985) and, for completeness, McDonald-Kreitman (MK) tables showing within-species variation and the number of fixed inter-specific differences. Under neutrality, the ratio of replacement to synonymous substitutions fixed between species should not differ from the ratio of replacement to synonymous polymorphisms within species. Jukes and Cantor-adjusted Ka and Ks as well as Ka/Ks ratios (Jukes and Cantor 1969) were calculated in DnaSP, as were the numbers of polymorphisms for the MK tables (Rozas et al. 2003). Given the lack of polymorphism observed (see Results), calculation of the G-statistics and their significance was not applicable. We made comparisons between species pairs of rare and common species within the same sub-genus and, rather than test all possible species pairs, arbitrarily selected two common species pairs for tests at each locus.
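For readers unfamiliar with the Jukes and Cantor adjustment, the sketch below shows how a Ka/Ks ratio can be obtained from observed proportions of differences. It is not the DnaSP implementation: the function names are hypothetical, the inputs are assumed to be the proportions of non-synonymous and synonymous sites that differ between two sequences (however those sites were counted), and the example values are invented. The correction applied is K = −(3/4) ln(1 − 4p/3).

```python
import math
from typing import Optional


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    if p == 0.0:
        return 0.0
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks_ratio(p_nonsyn: float, p_syn: float) -> Optional[float]:
    """Ka/Ks from observed proportions; None when Ks is zero (ratio undefined)."""
    ka = jukes_cantor(p_nonsyn)
    ks = jukes_cantor(p_syn)
    if ks == 0.0:
        return None
    return ka / ks


# Invented example: 1 % of non-synonymous sites and 10 % of synonymous sites differ.
ratio = ka_ks_ratio(0.01, 0.10)
print(ratio)
```

A ratio below 1, as in this invented example, is the pattern conventionally read as purifying selection; when no synonymous differences are observed the ratio is undefined, which is one way in which very low polymorphism limits this kind of estimate.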

Other tests of selection also exist, such as Tajima’s D (Tajima 1989), the Hudson, Kreitman and Aguadé (HKA) test (Hudson et al. 1987) and maximum likelihood models implemented in PAML (Yang 2007). In this study, these tests were not appropriate given the lack of polymorphism encountered (for further explanation see “Results” section).

Results

All sequences obtained have been deposited in GenBank with the following accession numbers: PGRP (JQ736575-736617); scavenger receptor (Sr; JQ736464-493); alpha-2-macroglobulin (A2M; JQ736494-736527). Total sequence lengths and the amount of coding sequence obtained at each locus are shown in Table 1a, b. The Sr locus failed to adequately amplify for B. hortorum despite concerted primer modification effort. Alignment of Sr sequence with B. terrestris EST and A. mellifera mRNA revealed a genomic region at the 5′ end which is apparently transcribed into mRNA (was present in mRNA of the B. terrestris EST), but contained numerous stop codons. This was assumed to be a 5′ untranslated region (UTR). Following this region, the Bombus sequences obtained here, the B. terrestris EST and A. mellifera mRNA aligned well and no stop codons were present. This 5′ UTR could not be sequenced for B. lapidarius at the Sr locus due to poor quality PCR amplification (Sr was amplified and sequenced in three overlapping fragments, see Experimental Procedures). For loci where development of Bombus primers was technically challenging, BLAST searches revealed high sequence similarity to related loci in closely related taxa. The coding region of the putative A2M region showed very high similarity to predicted B. terrestris alpha-1-macroglobulin (e-value 0, coverage 100 %, max. identity 98 %) and to A. mellifera TEP-7 mRNA (e-value 1 × 10^−162, coverage 99 %, max. identity 83 %). Coding regions of the putative Sr showed similarity to A. mellifera SrC mRNA (e-value 2 × 10^−02, coverage 57 %, max. identity 77 %). The complete PGRP genomic DNA ‘BLAST’ search unsurprisingly showed high similarity to the B. ignitus PGRP (e-value 0, coverage 100 %, max. identity 81 %).

Table 1 Summary data for all loci and species sequenced at each locus

Intron length variations due to indels were observed for PGRP in B. hortorum, B. pascuorum, B. pratorum and B. lapidarius. Two indels occurred in B. pratorum and B. lapidarius. Overall, these indels varied from a 13 bp section in B. lapidarius to a 2 bp deletion for one of the B. pratorum indels. Location of the indels was not consistent across different species. Three small indels in intronic regions of Sr were also observed in B. pascuorum (3–7 bp).

The total number of SNPs observed in non-coding regions varied from five in B. monticola (three loci sequenced) to thirty-nine in B. pascuorum (three loci sequenced). There were twenty-two SNPs in B. lapidarius (three loci), thirty-two in B. pratorum (three loci), thirty-five in B. humilis (three loci) and fifteen in B. hortorum (two loci only) (for sample sizes see Table 1a, b). There was a complete lack of non-synonymous substitutions observed within species across loci (Table 1a, b). Synonymous substitutions within coding regions were also infrequent (Table 1a, b).

Tests of selection

So few polymorphisms were observed that sophisticated tests of positive selection would be superfluous. All inter-specific comparisons of Ka/Ks ratios were below 1 (Table 2), but the number of within-species and fixed inter-specific polymorphisms was low, limiting statistical power (Montoya-Burgos 2011). Due to the complete lack of non-synonymous variation within species, the G-statistic for MK tests could not be applied in any comparison; however, the number of fixed inter-specific differences is shown alongside the number of intra-specific polymorphisms for completeness (Table 3). The number of fixed inter-specific non-synonymous differences was low in all comparisons made.
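To illustrate why sparse MK tables are uninformative, the following Python sketch computes a G-statistic of independence for a 2 × 2 MK table (fixed versus polymorphic, synonymous versus non-synonymous). It is a generic illustration and not the DnaSP procedure: the function name is hypothetical, the counts are invented and are not those of Table 3, and returning None when a row or column total is zero is a design choice of this sketch.

```python
import math
from typing import Optional


def mk_g_statistic(fixed_syn: int, fixed_nonsyn: int,
                   poly_syn: int, poly_nonsyn: int) -> Optional[float]:
    """G-statistic for a 2x2 MK table; None when any marginal total is zero."""
    table = [[fixed_syn, fixed_nonsyn], [poly_syn, poly_nonsyn]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        # Expected counts would be zero, so the statistic cannot be evaluated.
        return None
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed > 0:  # cells with zero observations contribute nothing
                expected = row_totals[i] * col_totals[j] / n
                g += observed * math.log(observed / expected)
    return 2.0 * g


# Invented counts: identical ratios between and within species give G = 0.
print(mk_g_statistic(10, 5, 10, 5))
# Invented counts with no non-synonymous differences at all: not evaluable.
print(mk_g_statistic(3, 0, 4, 0))
```

With very small or zero cell counts the statistic is either undefined or has little power, which is the situation described for the data reported here.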

Table 2 Ka/Ks ratios observed for all loci and species pairs tested
Table 3 Number of synonymous (syn) and non-synonymous (non-syn) differences within species and fixed differences between species

Discussion

Overall we detected extremely low levels of polymorphism in functional regions. The lack of diversity precluded robust tests of selection. However, if the regions sequenced were under positive selection, inter-specific comparisons would be expected to reveal an excess of non-synonymous substitutions. Studies using similar sample sizes per species to those used here have revealed positive selection in innate immune loci in the past (Little et al. 2004; Little and Cobbe 2005; Lazzaro 2005). Thus the lack of inter-specific polymorphism per se can be regarded as evidence of a lack of signal of positive directional selection at any locus sequenced. A lack of positive selection meets our expectation for PGRP, for which nearly the entire coding sequence was obtained. This locus is a pattern recognition receptor, recognizing peptidoglycan, a fundamental component of the bacterial cell wall (Perkins 1963) and showing anti-bacterial activity against Bacillus in Bombus ignitus (You et al. 2010). PGRPs are well conserved components of innate immunity (Kang et al. 1998) and under the PAMP/PRR prediction of Little et al. (2004), a pattern of purifying selection would be expected. In accordance with these expectations we here report that in all six species studied, no non-synonymous substitutions were observed within species and all Ka/Ks ratios were much less than one.

For A2M, primers were designed for a specific region with similarity to the ‘bait region’ in Daphnia (this region was tentatively suggested to be under positive selection in Daphnia by Little et al. 2004). The 552 bp coding region sequenced here showed very little intra-specific variation, and inter-specific comparisons revealed very few fixed differences. For the scavenger receptor (Sr), the initial prediction was more ambiguous. Here, between 2,600 and 2,900 bp of a putative Sr locus was sequenced across five bumblebee species. Although this contained 503 bp of exonic sequence, 203 bp were identified as part of a 5′ UTR, leaving only 298 bp of coding sequence. Within this coding region we again found low levels of both intra- and inter-specific polymorphism. Whether or not there would be a different signal at other regions within A2M and Sr is unknown. Interestingly, in Drosophila, the elevated levels of non-synonymous divergence are not confined to the regions of SrC directly involved in interactions with bacteria, but extend across other domains as well (see data for Sr-CI, Lazzaro 2005). However, even where purifying selection is the predominant force acting on a locus, codon-specific patterns of positive selection have been observed (Tschirren et al. 2011). Thus we cannot conclude that positive selection does not affect other regions of A2M and Sr that were not sequenced here, although, to reiterate, we specifically targeted our sequencing efforts for A2M to a region with similarity to that apparently under positive selection in other invertebrates. The samples obtained for this study originated from a single geographic area and thus may be experiencing similar selective pressures from a common suite of bumblebee parasites, which may explain these patterns. However, given the lack of polymorphism across species, it seems highly unlikely that divergent patterns of non-synonymous variation would have been encountered for the coding regions sequenced had other localities been included. Thus we can tentatively conclude that neither of these loci conforms to the PAMP/PRR paradigm of Little et al. (2004).

Viljakainen et al. (2009) previously assessed evolution and selection at immune loci in honey bees and ants. In common with this study, their work included PGRP loci. Interestingly, in their study a PGRP locus in ants was the only locus that showed direct evidence of positive selection. This clearly contrasts with the results obtained here for bumblebee species, and it also confounds expectations based on the pattern-recognising nature of this locus. Overall, Viljakainen et al. (2009) found evidence of elevated rates of evolution, based on rates of non-synonymous versus synonymous substitutions, at immune system loci compared with non-immune system loci. However, all ratios were well below one, and whether ratios were elevated relative to non-immune system loci because of relaxed purifying selection or because of positive selection could not be ascertained in all cases. The only other studies of selection on innate immune system loci in Hymenoptera of which we are aware have been on defensin, an antimicrobial peptide. Positive selection has been found in defensins in ants (Viljakainen and Pamilo 2008) and in Nasonia (Gao and Zhu 2010). Since the focus of this study was on receptor molecules, we did not include effector molecules (such as antimicrobial peptides) in our analyses.

It is interesting to ask whether mechanisms other than polymorphism at the nucleotide level can generate an effective or specific innate immune response in insects. This question was recently reviewed by Schulenburg et al. (2007), who suggested that, in addition to nucleotide polymorphisms, genetic diversity of the immune response can be generated by alternative splicing. The locus Dscam, which is likely to be involved in pathogen resistance, provides an astounding example of the extreme diversification that can result from alternative splicing, with the potential for more than 30,000 different isoforms in Drosophila (Graveley et al. 2004; Watson et al. 2005) and Anopheles (Dong et al. 2006). In Anopheles, challenge with different pathogens elicits the production of specific splice forms (Dong et al. 2006). Schulenburg et al. (2007) also highlight that synergistic interactions amongst immune loci, or dosage effects, may be important in generating a specific immune response. In social insects there are numerous additional behavioural responses that can be used to combat infection (reviewed in Cremer et al. 2007), although the bulk of research on behavioural traits in bees focuses on the honey bee. One interesting aspect of colony-level defence is increased genetic variation due to polyandry (Baer and Schmid-Hempel 1999; Hughes and Boomsma 2006; Viljakainen et al. 2009); however, the only bumblebee species known to be polyandrous is B. hypnorum, and the other species studied to date are all monandrous (Estoup et al. 1995; Schmid-Hempel and Schmid-Hempel 2000). Interestingly, B. hypnorum is currently expanding its range northwards (invading the southern UK in 2001; Goulson and Williams 2001; it has since spread rapidly through the country), although the underlying reasons for its invasion success are not known.

With regard to finding functional loci for use in a conservation genetics context, there are other candidates outside of the immune system. One of the most promising within the Hymenoptera is the sex-determining locus. Sex is determined in the Hymenoptera by single-locus complementary sex determination (sl-CSD; Cook 1993). Hemizygous (haploid) individuals, and diploid individuals that are homozygous at the sex-determining locus, develop into males. Diploid males arising from a matched mating with respect to sl-CSD alleles are not viable and represent a significant cost at the colony level (Whitehorn et al. 2009). It would be interesting to examine whether there is evidence for balancing selection at the molecular level for the sl-CSD locus in bumblebees, and to examine variation at this locus in populations with differing demographic histories. The underlying mechanism has been assessed in honey bees, where homozygosity at a complementary sex determining locus (CSD) influences the determination of sex from the sister-locus feminizer (Beye 2004; Hasselmann et al. 2008). Patterns of selection across the CSD locus have also been examined in the honey bee (Hasselmann and Beye 2004). However, the CSD has arisen by a lineage-specific gene duplication of feminizer in Apis (Hasselmann et al. 2008; Hasselmann et al. 2010). Consequently, genomic information from the honey bee cannot be used to locate the CSD locus in the bumblebee. The metabolic locus phosphoglucose isomerase (Pgi) has also been proposed as an adaptive marker for invertebrate conservation genetics (Wheat 2010), but in bumblebees this locus appears to be very well conserved and shows very little genetic variation (Ellis et al. in review).

Regarding neutral variation, up to thirty-nine SNPs were found across loci in B. pascuorum. Whilst a selection of these SNPs could be candidates for inclusion in a genotyping panel, e.g. using an EPIC approach (Palumbi 1995), they are clearly of limited use for investigating variation linked to regions under divergent or positive selection, given the startling lack of variation and the absence of evidence for positive selection at any of the coding regions which we were able to sequence here. Of the species studied here, other neutral estimates of diversity (microsatellite data) are available for B. humilis (but from different populations; Ellis 2005). Neutral markers have been used to investigate foraging range and nest density of B. lapidarius, B. pascuorum and B. pratorum (as well as B. terrestris; Darvill et al. 2004; Knight et al. 2005; Knight et al. 2009), but these studies do not routinely report genetic diversity estimates (although Darvill et al. 2004 report FIS and allele number for B. terrestris and B. pascuorum, and Ellis et al. 2006 report FST for B. pascuorum in southern England).

There are some limitations to the research presented here. Firstly, as noted, only partial coding sequences were obtained for A2M and Sr; and secondly, our study samples consist of only a single population of each species from one geographic area. It was not possible to address these issues within the time and financial constraints of our study, the aim of which was to make a preliminary investigation of coding and non-coding variation at a number of candidate loci underlying innate immunity in bumblebees in a conservation context. Two alternative approaches could have been taken: (i) employing 5′ and 3′ RACE (Frohman et al. 1988; Loh et al. 1989; Schaefer 1995) to sequence cDNA would have allowed a more thorough investigation of coding variation at one locus; (ii) the use of next-generation sequencing approaches would have enabled detection of genomic regions showing patterns of departure from neutrality, which could have been investigated further in the future. Recent advances in high-throughput genotyping such as RAD-sequencing (Miller et al. 2007) and restriction site tiling analysis (RSTA, Pespeni et al. 2010), using data available from transcriptome (Sadd et al. 2010) and genome sequencing efforts (Honey Bee Genome Sequencing Consortium, www.beebase.org), are likely to be particularly useful for detecting candidate loci from genomic scans (e.g. by highlighting loci showing outlying FST values; Beaumont and Balding 2004; Storz 2005; Excoffier et al. 2009). These candidate loci can then be examined in more detail to establish their function and to investigate patterns of population genetic structure further. This approach is likely to yield interesting insights in the future and may overcome some of the difficulties with ‘bottom-up’ approaches (Piertney and Webster 2010) such as those encountered here.
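The idea of highlighting loci with outlying FST values can be sketched in Python as follows. This is an illustration only, under simplifying assumptions: biallelic SNPs, equally weighted populations and no correction for sample size. The function names `wright_fst` and `top_fst_loci` and the allele frequencies are hypothetical, and simple ranking stands in for the model-based outlier tests of the methods cited above.

```python
def wright_fst(freqs):
    """FST = (HT - HS) / HT for one biallelic locus.

    freqs holds the frequency of one allele in each population;
    populations are weighted equally (a simplifying assumption).
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # locus is monomorphic across all populations
    # mean expected heterozygosity within populations
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


def top_fst_loci(freqs_by_locus, n=1):
    """Return the names of the n loci with the highest FST."""
    ranked = sorted(
        freqs_by_locus,
        key=lambda name: wright_fst(freqs_by_locus[name]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical allele frequencies in two populations for three SNPs.
snps = {"snp_a": [0.50, 0.50], "snp_b": [0.45, 0.55], "snp_c": [0.10, 0.90]}
print(top_fst_loci(snps, n=1))
```

In a real genome scan, the observed distribution of FST across loci would be compared with a neutral expectation rather than simply ranked.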