Introduction

California is the leading producer of US-grown avocados (Persea americana Miller [Lauraceae]), accounting for about 95% of domestic production. Annual crop value typically exceeds $320 million, and around 6,000 growers farm approximately 60,000 acres stretching south from San Luis Obispo to San Diego at the US–Mexico border (http://www.avocado.org/california-avocado-history). In recent years, the California avocado industry has been impacted by the introduction of a series of exotic foliar-feeding arthropod pests that are specific to avocados. Following the establishment of red-banded whitefly Tetraleurodes perseae Nakahara (Hemiptera: Aleyrodidae) in 1982 (Rose and Wooley 1984) and persea mite Oligonychus perseae Tuttle, Barker & Abatiello (Acari: Tetranychidae) in 1990 (Bender 1993), California avocado growers were dealt a major economic blow with the establishment of avocado thrips Scirtothrips perseae Nakahara (Thysanoptera: Thripidae). After its discovery in 1996, S. perseae spread very rapidly (Hoddle and Morse 2003), and annual losses attributable to this pest have been estimated at $4.45–8.51 million (Hoddle et al. 2003). Thus, the discovery in 2004 that California had acquired yet another exotic arthropod pest of avocados, avocado lace bug (ALB) Pseudacysta perseae (Heidemann) (Hemiptera: Tingidae), was disquieting.

Pseudacysta perseae was originally described as Acysta perseae from specimens collected from avocados in Florida, USA (Heidemann 1908) and was for a long time thought to have a limited distribution, primarily in peninsular Florida, and was regarded as no more than an occasional minor pest (Mead and Peña 1991). However, during the 1990s, this situation changed and damaging population outbreaks of ALB became commonplace in Florida and throughout the Caribbean as this pest began to invade new islands (Mead and Peña 1991; Peña 2003). At the outset of the present study, the geographic range of ALB was known to extend to the US states of Florida, Georgia, Louisiana and Texas; east to Bermuda; west to the gulf coast of Mexico; and south, through the Caribbean islands of Cuba, Puerto Rico, the Dominican Republic, and into Venezuela (Mead and Peña 1991; Almaguel et al. 1999; Sandoval Cabrera and Cermeli 2005).

In all areas where ALB is known to occur it is considered a pest of avocados that can be of economic importance (Peña 2003; Hoddle et al. 2005). Adults and nymphs of ALB form dense feeding colonies on the underside of leaves causing large necrotic patches, and in severe cases this feeding damage causes significant defoliation, which results in reduced fruit yields (Peña et al. 1998). Experimental evidence from Florida indicates that avocado varieties vary in their susceptibility to feeding damage (Peña et al. 1998), but in the Dominican Republic, severe damage to ‘Hass’ avocado (P. americana var. Hass) caused by ALB outbreaks has been observed (Hoddle unpublished). The ‘Hass’ cultivar accounts for around 95% of avocados grown in California (http://www.avocado.org/the-hass-avocado-a-california-native/). Thus, in September 2004, the detection of ALB attacking backyard avocado trees in the Chula Vista and National City areas south of the city of San Diego (Hoddle et al. 2005; Humeres et al. 2009a) placed this pest in very close proximity to major ‘Hass’ avocado production areas. San Diego County produces around 60% of all California avocados (http://www.avocado.org/california-avocado-history) and it was feared that outbreaks similar to those witnessed in the Caribbean, could occur in California, and would cost the California avocado industry millions of dollars in pesticide applications and reduced crop yields. In addition, ALB is also known to feed on ornamental camphor Cinnamomum camphora (L.) and red bay Persea borbonia (L.) (Mead and Peña 1991) and therefore, potentially poses a significant pest threat to other native and exotic plant species in the Lauraceae.

California avocado production was once renowned for its almost exclusive reliance on biological control for suppression of injurious pests (McMurtry 1992). However, the importance of natural enemies for control of avocado pests has been slowly eroding, with the regular arrival of exotic pests driving a subsequent increase in pesticide use for their control (Hoddle 2004). While pesticides can result in immediate and rapid decline in a pest population, they also increase the likelihood of eliminating non-target species, such as beneficial natural enemies and pollinators (Mills and Daane 2005). In California avocados, the removal of natural enemies because of insecticide applications may result in secondary flare ups, particularly of mite and lepidopteran species that are normally well controlled (Hoddle 2004). Increased application frequencies of insecticides may also promote the development of pesticide resistance in already established pests. This is of particular concern in avocado thrips because Californian populations have developed varying levels of resistance to certain pesticides (Humeres and Morse 2006). An alternative strategy for the management of new invasive pests would be to revert to classical biological control; suppressing the introduced pest population’s growth by the deliberate introduction and establishment of natural enemies from the pest’s home range, in the invaded range. Several natural enemies of ALB have been identified from Florida (Peña et al. 1998, 2009; Gagné et al. 2008; Henry et al. 2009) and Cuba (de la Torre et al. 1999), but their potential for use in classical biological control programs has not been tested. A fundamental step in the initial assessment of potential biological control agents is identifying the geographic area from where the target pest arrived. 
In theory, natural enemies from the actual donor region could be expected to be best adapted to the ALB genotype that invaded California, in comparison with candidate natural enemies from other potential donor regions. Thus, identifying the donor region(s) of California’s ALB population would help focus a search for candidate biological control agents to that particular area.

To determine the geographic origin of California’s ALB population, systematic surveys to collect ALB were conducted, by the authors (MSH and PAP) or with cooperators, throughout avocado-growing regions in the southeastern states of the US, the Caribbean, Mexico, Central America, and northeastern South America. Mitochondrial and nuclear genetic markers were used to ascertain whether these collections constituted a single species (P. perseae), and to investigate population genetic variation and structure in an attempt to pinpoint, from within this natural distribution, the geographic origin of the ALB population that invaded California.

Materials and methods

Specimen collection

Between June 2005 and January 2008, specimens of ALB were collected from avocado trees (Hass and non-Hass) in the USA, Mexico, Guatemala, the Caribbean islands, and French Guyana (for complete collecting details see Table 1). Specimens were collected into 95% ethanol and returned to the University of California Riverside (UCR), USA. Where possible, these specimens were maintained below 4°C during transit. On arrival at UCR, specimens were stored at −20°C. These specimens were supplemented in 2010 with additional collections from the Dominican Republic (Table 1).

Table 1 Collection records for specimens of avocado lace bug, Pseudacysta perseae used in this study

DNA extraction

Initially, genomic DNA was isolated from individual ALB using a ‘salting-out’ technique similar to the one we used for thrips specimens (Rugman-Jones et al. 2007). The method for ALB differed from that for thrips in that specimens were ground up, and 5 μL of Proteinase K (Qiagen, Valencia, CA) was added to the extraction at the first step. However, the majority of extractions were performed using the Gentra® Puregene® Cell Kit (Qiagen). Extractions using this kit followed the manufacturer’s supplementary protocol for extraction from a single Drosophila melanogaster (http://www.qiagen.com/literature/render.aspx?id=103451), with the exception that at the initial cell lysis step, we added 0.6 μL of Proteinase K and then incubated the samples for 3 h at 55°C prior to treating the lysates with RNase A for 1 h. In addition, the resulting DNA was eluted in 30 μL of the kit’s “DNA Hydration Solution”. Extracted DNA was stored at −20°C.

Generation and analysis of mitochondrial DNA sequences

The polymerase chain reaction (PCR) was used to amplify a section of the cytochrome c oxidase subunit I (COI) gene of mitochondrial DNA (mtDNA). Initial amplifications were performed using the primer pair LCO1490 (5′-GGTCAACAAATCATAAAGATATTGG-3′) and HCO2198 (5′-TAAACTTCAGGGTGACCAAAAAATCA-3′) (Folmer et al. 1994). PCR was performed in 25 μL reactions containing 1× Thermopol Reaction Buffer (New England BioLabs, Ipswich, MA), 200 μM each dNTP, 0.2 μM of each primer, 2 mM MgCl2, 1.2 μL BSA (NEB), 1 U Taq polymerase (NEB) and 2 μL DNA template (concentration not determined). PCR was performed on a Mastercycler® 5331 or Mastercycler® ep gradient S thermocycler (Eppendorf North America Inc., New York, NY) with an initial denaturation step of 5 min at 95°C; followed by 35 cycles of 95°C for 30 s, 43.5°C for 60 s, and 72°C for 90 s; and a final extension step of 10 min at 72°C. Successful amplification was confirmed by visualizing PCR products after electrophoresis on 1% agarose gels stained with ethidium bromide. PCR products were purified using the Wizard® PCR Preps DNA Purification System (Promega, Madison, WI) and sequenced in both directions at the Institute for Integrative Genome Biology, University of California, Riverside. Sequences were aligned manually using BioEdit version 7.0.9.0 (Hall 1999) and a set of internal ALB-specific primers was designed using Primer3 (v.0.4.0) (Rozen and Skaletsky 2000): ALB-COI-f (5′-GGGCCATTTATTGGAAATGA-3′) and ALB-COI-r (5′-GATAGGATCCCCTCCTCCTG-3′). The internal primers provided more consistent amplification and were combined in 25 μL reactions containing 1× Thermopol Reaction Buffer (NEB), 200 μM each dNTP, 0.2 μM of each primer, 0.5 mM MgCl2, 1 U Taq polymerase (NEB) and 1 μL DNA template (concentration not determined). The thermocycler profile remained the same except that the annealing temperature was increased to 59°C. All PCR products were purified and sequenced.
Primers were removed and each sequence was translated (http://www.ebi.ac.uk/Tools/emboss/transeq/index.html) to confirm the absence of nuclear pseudogenes (Song et al. 2008). Sequences were deposited in GenBank® (Benson et al. 2008).
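The translation check described above can be illustrated with a short sketch. It is not the EMBOSS Transeq tool used in this study; it simply translates a sequence with the invertebrate mitochondrial genetic code (NCBI translation table 5) and reports which reading frames are free of internal stop codons. The function names and the toy sequences in the usage note are hypothetical.

```python
# NCBI translation table 5 (invertebrate mitochondrial); bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame=0):
    """Translate a DNA string in frame 0, 1 or 2; unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def open_frames(seq):
    """Return the forward reading frames (0-2) that contain no stop codon."""
    return [frame for frame in range(3) if "*" not in translate(seq, frame)]
```

For example, translate("ATGTGAATA") gives "MWM" under this code, because TGA encodes tryptophan and ATA encodes methionine in invertebrate mitochondria. A genuine COI fragment is expected to have one fully open frame; a fragment with stop codons in every frame would be a candidate nuclear pseudogene.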

Sequences were again aligned manually using BioEdit and trimmed to match those produced by the ALB-specific primers. There were no gaps in the COI fragment, making alignment straightforward. The resulting data matrix, totaling 429 sequences, each 515 bp long, was examined for evidence of gross population structuring and the existence of cryptic species (Hebert et al. 2003, 2004), across the geographic range from which ALB specimens were collected. A haplotype network was constructed using the statistical parsimony method of Templeton et al. (1992) implemented in TCS version 1.21 (Clement et al. 2000) with the default 95% connection limit. Genetic diversity was simply measured as the average number of pairwise differences (κ), within and between regions, calculated using DnaSP Ver. 5.10.00 (Librado and Rozas 2009). Regions were defined by grouping sample locations by state (for the USA and Mexico), country (Guatemala and French Guyana) or larger geo-political area (i.e., the Caribbean islands) (Table 1). One exception to this was the inclusion of a single specimen collected at high altitude (>5,800 ft) in Michoacán, Mexico, which was included with 3 specimens collected close by in Guanajuato, Mexico. For simplicity, this region is referred to in our analyses as Guanajuato.
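As an illustration of the diversity measure, the average number of pairwise differences (κ) within a region, or between two regions, can be sketched as follows. This is the textbook definition applied to gap-free aligned sequences of equal length; DnaSP may treat missing or ambiguous data differently, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Count the sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_within(seqs):
    """Average pairwise differences over all pairs of sequences in one region."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def mean_between(seqs_x, seqs_y):
    """Average pairwise differences over all pairs drawn from two regions."""
    if not seqs_x or not seqs_y:
        raise ValueError("both regions need at least one sequence")
    total = sum(pairwise_differences(a, b) for a in seqs_x for b in seqs_y)
    return total / (len(seqs_x) * len(seqs_y))
```

A region in which every individual carries the same haplotype has a within-region value of zero, which is the pattern expected for regions fixed for a single COI haplotype.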

While variation in the COI haplotypes detected in ALB was well within the limits expected for a single species (Hebert et al. 2004), the geographic distribution of mitochondrial haplotypes was highly structured (see Results). Therefore, as an additional qualitative measure of species identity, we looked for DNA sequence uniformity in the conserved D2 domain of 28S (28S-D2) nuclear ribosomal DNA (rDNA). This was done for at least one specimen displaying each COI haplotype detected at each collecting location. The forward (5′-CGTGTTGCTTGATAGTGCAGC-3′) and reverse (5′-TTGGTCCGTGTTTCAAGACGGG-3′) primers of Campbell et al. (1993, 2000) were combined to amplify a 687 bp stretch of 28S-D2, in 25 μL reactions containing 1× Thermopol Reaction Buffer (NEB), 200 μM each dNTP, 0.2 μM each primer, 1 U Taq polymerase (NEB) and 2 μL DNA template (concentration not determined). The thermocycler was programmed for an initial denaturing step of 3 min at 94°C; followed by 32 cycles of 94°C for 45 s, 55°C for 30 s, and 72°C for 90 s; and, a final extension step of 5 min at 72°C. PCR products were purified and sequenced as detailed above, and sequences were deposited in GenBank® (Benson et al. 2008).

Screening populations for infection by maternally inherited symbionts

The population genetics of mtDNA can be greatly influenced by the co-transmission of maternally inherited symbiotic bacteria that manipulate sexual reproduction of their hosts (Hurst and Jiggins 2005). We screened sampled ALB populations for infection with the three most commonly reported endosymbiotic reproductive manipulators, Wolbachia, Cardinium, and Rickettsia (Stouthamer et al. 1999; Hunter et al. 2003; Perlman et al. 2006). Initially, the presence of each bacterium was checked in a total of 24 specimens, drawn from populations across the sampled range, using bacterial genus-specific PCR protocols. The presence of Wolbachia was checked using the wsp-81f and wsp-691r primers of Braig et al. (1998), which amplify approximately 600 bp of the wsp gene. A Wolbachia-infected Trichogramma pretiosum was used as a positive control. The presence of Cardinium was assessed using the CLOf and CLOr primers of Weeks et al. (2003), which amplify approximately 450 bp of 16S rDNA. A Cardinium-infected unisexual form of Aspidiotus nerii was used as a positive control. The presence of Rickettsia was checked using the Rb-F and Rb-R primers of Gottlieb et al. (2006), which amplify approximately 900 bp of 16S rDNA. No positive control was available for the Rickettsia PCR. All PCRs included a negative control (sterile water).

Since the majority of specimens tested, across populations and COI haplotypes, were infected with Wolbachia (see “Results”), the particular Wolbachia strain involved was determined for each of the four most populous haplotypes (Hap-A, Hap-D, Hap-E and Hap-G; see “Results”) by sequencing the five genes of the Wolbachia multi-locus sequence typing (MLST) system of Baldo et al. (2006). These haplotypes covered specimens from the vast majority of the sampled range. The wsp gene (see above), an alternative method for typing Wolbachia strains (van Meer et al. 1999), was also sequenced for these specimens plus additional specimens drawn from other areas/haplotypes.

Isolation, amplification and sizing of microsatellite loci

ALB microsatellite loci were isolated by Genetic Identification Services (Chatsworth, CA). Four cDNA libraries were produced, each enriched for a different microsatellite repeat motif: CA-, GA-, ATG- and TAGA-. This resulted in the isolation of 24 potential loci, of which 16 were found to be uninformative (monomorphic) or difficult to amplify (data not shown). The remaining 8 loci are characterized in Table 2. Each locus was amplified individually via PCR, in 20 μL reactions containing 1× GeneAmp® PCR Buffer II (Applied Biosystems, Carlsbad, CA), 200 μM each dNTP, 0.2 μM each respective primer (Table 2), 2 mM MgCl2, 0.5 U AmpliTaq Gold® DNA Polymerase (Applied Biosystems), and 1 μL DNA template (concentration not determined). PCR was performed on a Mastercycler® ep gradient S thermocycler programmed for: 10 min at 95°C; followed by 36 cycles of 94°C for 40 s, 55°C for 40 s, and 72°C for 30 s; and a final extension step of 10 min at 72°C. The resulting amplicons were sized on an ABI PRISM 3100® Genetic Analyser using GeneMapper® software version 3.7 and the internal-lane size standard GeneScan®-500 LIZ™ (Applied Biosystems).

Table 2 Characterization of eight microsatellite loci isolated from Pseudacysta perseae including PCR primer sequences

Microsatellite analyses

Individuals from 11 potential source locations were analyzed alongside the invasive California population (Table 1). For the microsatellite analyses, each potential source population contained only individuals collected from a single geographic location (i.e. no mixing of individuals from neighboring areas). Potential source populations were designated as those in which at least one resident individual shared a COI haplotype with the California population (see “Results”), and were included in our analyses only if at least 10 individuals were available for genotyping. Intrapopulation genetic diversity was assessed in terms of a range of standard parameters for each locus and across all loci. Allele frequencies and estimates of expected (HE) and observed (HO) heterozygosity were obtained using POPGENE version 1.32 (Yeh and Boyle 1997). FIS was calculated in FSTAT 2.9.3.2 (Goudet 1995, 2001), and deviations from Hardy–Weinberg equilibrium (HWE) and linkage disequilibrium (LD) between pairs of loci were examined using a Markov chain method (Guo and Thompson 1992) provided in GENEPOP 4.0.10 (Raymond and Rousset 1995; Rousset 2008). Significance was estimated using 1,000 dememorization steps, 100 batches, and 1,000 iterations per batch, and P values were adjusted using sequential Bonferroni corrections to control for multiple comparisons. Evidence for the presence of null alleles was sought using MICRO-CHECKER version 2.2.3 (van Oosterhout et al. 2004).
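To make the heterozygosity parameters concrete, the following minimal sketch computes observed and expected heterozygosity for one locus in one population, together with a simple inbreeding coefficient taken as 1 − HO/HE. This is a simplification for illustration only: it uses the uncorrected expected heterozygosity (1 − Σp²) rather than any small-sample correction, and it is not the estimator implemented in FSTAT. The function names and genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per diploid individual.
    Expected heterozygosity is the uncorrected value 1 - sum(p_i ** 2).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    h_obs = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    h_exp = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    return h_obs, h_exp


def simple_f_is(genotypes):
    """Simplified inbreeding coefficient, 1 - H_obs / H_exp (NaN if monomorphic)."""
    h_obs, h_exp = heterozygosity(genotypes)
    if h_exp == 0:
        return float("nan")
    return 1.0 - h_obs / h_exp
```

A positive value indicates fewer heterozygotes than expected under HWE (a homozygote excess), which is the direction of departure that null alleles tend to produce.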

Interpopulation genetic variation was investigated using two standard measures. Multilocus estimates of FST were calculated for all population pairs using FSTAT. Significance of the FST estimates was evaluated with 1,000 random permutations of genotypes among samples, and P values were adjusted using sequential Bonferroni corrections. This method does not assume HWE within populations. Population differentiation was also examined by calculating Nei’s unbiased genetic distances (Nei 1978) and constructing a dendrogram of relationships using the UPGMA method (unweighted pair group method with arithmetic mean), both in POPGENE. The resulting dendrogram was redrawn in FigTree v1.3.1 (Rambaut 2009). The distribution of microsatellite variation within and among populations was investigated using AMOVA (Excoffier et al. 1992) based on number of different alleles, implemented in ARLEQUIN v.3.11 (Excoffier et al. 2005). Significance was evaluated using 10,000 random permutations.
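The sequential Bonferroni adjustment referred to above can be sketched as a step-down procedure: P values are ranked from smallest to largest, the smallest is compared with α/m, the next with α/(m − 1), and so on, stopping at the first test that fails. The sketch below assumes α = 0.05 and flags which tests remain significant; the function name and the example P values are hypothetical, and the exact implementation used in the analyses is not specified in the text.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking tests that stay significant.

    Step-down procedure: the i-th smallest P value (i = 0, 1, ...) is
    compared with alpha / (m - i); testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break  # all larger P values are also non-significant
    return significant
```

With four tests and P values of 0.001, 0.04, 0.03 and 0.012, the thresholds are 0.0125, 0.0167, 0.025 and 0.05, so only the first and last tests remain significant.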

The origin of the invasive California population of ALB was investigated using the software program Structure version 2.3.3 (Pritchard et al. 2000). The Structure analysis examined the entire data set without considering geographic population information and used a Bayesian Markov chain Monte Carlo (MCMC) clustering approach to group individuals with similar multilocus genotypes into the most likely number of clusters (K) in the data set, estimating the log-likelihood of data belonging to a cluster, and calculating the probability of each individual belonging to each of the K clusters. Geographic population information was overlaid onto these clusters. If the invasive population clusters clearly with one of the potential source populations, this may reveal the origin of the invasive population. We applied an admixture ancestry model with independent loci (Pritchard et al. 2000), and explored likelihoods for models where the potential number of clusters, K, ranged from 2 to 9. Ten independent runs were performed for each potential K and each run consisted of a burn-in period of 20,000 MCMC iterations and a data collection period of 50,000 MCMC iterations. At the end of these runs, the most likely value of K was determined by calculating ΔK (Evanno et al. 2005). The “FullSearch” algorithm in the program CLUMPP (Jakobsson and Rosenberg 2007) was used to control for label switching and genuine multimodality across the 10 replicate runs at the most likely value of K. Calculation of ΔK and creation of CLUMPP input files were done online using Structure Harvester v0.56.4 (Earl 2010).
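The ΔK statistic can be illustrated with a brief sketch. It assumes that ΔK for a given K is the absolute second-order difference of the mean log-likelihoods, |L(K+1) − 2L(K) + L(K−1)|, divided by the standard deviation of L(K) across replicate runs; whether the absolute value is taken before or after averaging across runs is an assumption here, and this is not the Structure Harvester code. The function name and the log-likelihood values in the usage note are invented for illustration and are not results from this study.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    log_likelihoods: dict mapping K to a list of L(K) values, one per
    replicate run (at least two runs per K).
    """
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in log_likelihoods or k + 1 not in log_likelihoods:
            continue  # Delta K is undefined at the ends of the tested range
        spread = stdev(log_likelihoods[k])
        if spread == 0:
            raise ValueError(f"no variation among runs at K={k}")
        second_diff = abs(
            mean(log_likelihoods[k + 1])
            - 2 * mean(log_likelihoods[k])
            + mean(log_likelihoods[k - 1])
        )
        result[k] = second_diff / spread
    return result
```

For hypothetical runs such as {1: [-100, -102], 2: [-50, -52], 3: [-48, -50], 4: [-47, -49]}, the statistic peaks at K = 2, where the log-likelihood stops improving sharply. Because ΔK needs both neighbours, it cannot be evaluated at the smallest or largest K tested.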

Results

Among the 16 regions (Table 1) and 429 individuals sequenced, we found 16 polymorphic sites and 9 distinct COI haplotypes (Fig. 1; Table 3; GenBank® accessions JF838244–838277). Synonymous substitutions accounted for 13 of the 16 polymorphic sites. A single, and unique, non-synonymous substitution was found in each of the three haplotypes represented by only a single specimen (Hap-C, Hap-F, and Hap-H). The distribution of haplotypes across the 16 regions was highly structured, with only 5 of 16 regions displaying more than one haplotype (Fig. 1, diagonal element Table 3). Hap-A (Fig. 1) was the largest with around 62% of all individuals analyzed and 100% of the California individuals. Hap-A also included all individuals from Texas (USA), Guerrero, Chiapas and Tabasco, along with the majority of individuals from Nayarit, Jalisco and Michoacán (all Mexico). Of the remaining haplotypes, 7 differed from Hap-A by no more than 2 nucleotides (0.39%). One of these, Hap-G, contained all individuals from Florida (USA), Yucatan (Mexico), French Guyana (South America), and the Caribbean islands. Indeed, Hap-G was the second most common haplotype in our sample but was not detected outside of these four regions (Fig. 1). Similarly, the third most common haplotype, Hap-E, was only found in the Vera Cruz population. The most divergent haplotype, Hap-I, differed from its nearest neighbor by 8 nucleotides (1.55%) and contained only the 4 individuals collected at high altitude (around 1,800 m) close to the border between the Mexican states of Guanajuato and Michoacán. Sequences of 28S-D2 (GenBank® accessions JF838221–838243) were identical across mitochondrial haplotypes suggesting the absence of cryptic species in the samples analyzed.
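The percentage divergences quoted above follow directly from the number of differing nucleotides and the 515 bp fragment length. A minimal check, with a hypothetical function name, is:

```python
def percent_divergence(n_differences, fragment_length=515):
    """Uncorrected sequence divergence as a percentage of sites that differ."""
    if fragment_length <= 0:
        raise ValueError("fragment length must be positive")
    return 100.0 * n_differences / fragment_length


# Two differences from Hap-A, and eight differences for Hap-I.
close_haplotypes = round(percent_divergence(2), 2)
most_divergent = round(percent_divergence(8), 2)
```

These evaluate to 0.39 and 1.55, matching the values reported for the haplotypes closest to Hap-A and for Hap-I, respectively.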

Fig. 1

Network of Pseudacysta perseae mitochondrial haplotypes constructed from 429 sequences of the COI gene. Open circles represent hypothetical haplotypes not detected in our sample and lines connecting adjacent haplotypes represent a single nucleotide substitution. Connecting lines in bold represent non-synonymous substitutions

Table 3 Variation in the DNA sequence of a 515 bp stretch of the mitochondrial COI gene of Pseudacysta perseae

Of 24 specimens tested for infection with endosymbionts, 23 tested positive for Wolbachia, but none tested positive for either Cardinium or Rickettsia. Multi-locus sequence typing of the Wolbachia of a single individual from each of the four most populous mtDNA haplotypes (Hap-A, Hap-D, Hap-E and Hap-G; Fig. 1) found no nucleotide differences across five genes and 2,375 bp, indicating that the Wolbachia strain infecting different haplotypes (and consequently different populations) was identical (GenBank® accessions JN609257–609276). The sequence of the wsp gene (a further 555 bp) of these four specimens plus a further 9 drawn from other areas/haplotypes was also identical (GenBank® accessions JN859608–859620).

Across twelve populations, a total of 74 microsatellite alleles were observed in the 8 loci, with the total number of alleles at a single locus ranging from 3 (Pper-C4 and Pper-C9) to 27 (Pper-D9), and a mean (±SE) per locus of 9.25 ± 2.85 (Table 4). Of these alleles, 32 were unique to a single population, with the number of private alleles ranging from 0 (Pper-C4) to 16 (Pper-D9), and a mean per locus of 4.00 ± 1.79. Significant deviation from HWE was detected in 8 of 96 locus-by-population comparisons (Table 5). In 6 of the 8 cases, positive FIS estimates indicated that the deviation was the result of an excess of homozygotes. In turn, results from MICRO-CHECKER revealed that the presence of a null allele may have contributed to four of these departures (Table 5). MICRO-CHECKER also revealed the potential presence of null alleles in one further locus-by-population combination (Pper-B107 by CA; Table 5). Across the twelve populations, only three locus pairs (from 199 tested) displayed significant LD, and in each case the significant pair was limited to a different population (Na-LV, Pper-B106 & Pper-D9; Na-SB, Pper-B107 & Pper-D9; Mi, Pper-B107 & Pper-B106).

Table 4 Distribution of microsatellite alleles across populations of Pseudacysta perseae from the USA and Mexico. Private alleles (i.e. alleles present in only one population) are in bold
Table 5 Conformity of microsatellite loci to Hardy–Weinberg equilibrium

All pairwise multilocus estimates of FST were very high, ranging from 0.1061 to 0.8356, suggesting a high degree of population differentiation (Table 6). The high estimates of FST were most likely the result of the high numbers of private alleles detected in our data: across all loci, a mean (±SE) of 2.67 ± 0.76 private alleles per population (Table 5). Nei’s unbiased genetic distance between populations was also high, ranging from 0.0959 to 1.4665 (Table 6). For both measures, the two most similar populations were Na-LV and Na-X, and the population that was most similar to the introduced California population was Na-LV (Table 6). The UPGMA dendrogram (Fig. 2) also placed the California population nearest to the Na-LV and Na-X populations in a group also containing Na-SB and Tabasco. The high levels of population differentiation were also detected in our AMOVA, with differences among populations accounting for approximately half (51.85%) of total microsatellite variation (Table 7).

Table 6 Pairwise FST (above diagonal) and Nei’s unbiased genetic distances (below diagonal)
Fig. 2

Dendrogram of relationships between 12 populations of Pseudacysta perseae constructed using a UPGMA method and based on Nei’s unbiased genetic distance. CA California (USA); TX Texas (USA); VC-1 La Tinaja, Vera Cruz; VC-2 Jaltipan, Vera Cruz; Na-LV Las Vivosas, Nayarit; Na-SB San Blas, Nayarit; Na-X Xalisco, Nayarit; Gu Guerrero; Ta Tabasco; Ch Chiapas; Ja Jalisco; Mi Michoacan (the latter are all Mexico)

Table 7 Analysis of molecular variance (AMOVA) for 12 populations of Pseudacysta perseae

Bayesian clustering of multilocus genotypes, using ΔK (Evanno et al. 2005) to determine the most likely number of clusters (range tested K = 1–10, data not shown), revealed two major genetic clusters (Fig. 3). All individuals from California were assigned to a single cluster along with all individuals from the three Nayarit populations, with all individuals having membership coefficients close to 1.0 (mean ± SE; CA = 0.9959 ± 0.0001, Na-LV = 0.9937 ± 0.0003, Na-SB = 0.9927 ± 0.0007, Na-X = 0.9934 ± 0.0004). The majority of individuals from the Texas, Vera Cruz, Guerrero, Chiapas, and Jalisco populations were assigned to the second cluster with similar membership coefficients. Only individuals from the Tabasco and Michoacán populations appeared as an admixture of the two clusters (Fig. 3).

Fig. 3

Bayesian clustering of the multilocus genotypes of 239 Pseudacysta perseae individuals. Each individual is denoted by a narrow vertical bar and its proportional membership in each of K = 2 clusters is represented by different colors. K was determined according to Evanno et al. (2005). Geographic population IDs indicated on the x-axis: CA California (USA); TX Texas (USA); VC-1 La Tinaja, Vera Cruz; VC-2 Jaltipan, Vera Cruz; Na-LV Las Vivosas, Nayarit; Na-SB San Blas, Nayarit; Na-X Xalisco, Nayarit; Gu Guerrero; Ta Tabasco; Ch Chiapas; Ja Jalisco; Mi Michoacan (the latter are all Mexico)

Discussion

The field of molecular population genetics offers a suite of markers and analysis methods that can be used to explore the invasion routes and most probable sources of introduced species from within large occupied ranges that may include both native and invaded regions. In this study, we used a combination of relatively conserved (mtDNA) and highly polymorphic (microsatellite) markers to characterize the genetic diversity of native and introduced populations of ALB, in an attempt to determine the most likely origin of the invasive population that entered California. Both types of marker revealed high levels of genetic structure in populations of ALB sampled from different geographic regions. The invasive California population was found to be most similar to populations from Las Vivosas and Xalisco, populations that are within close proximity to each other in the coastal hills of the state of Nayarit, along the tourist-popular Mexican Riviera.

Mitochondrial diversity

Across its known range, COI sequences of ALB showed little genetic differentiation. Nine haplotypes were found among 429 sequenced individuals, but differences between these haplotypes were minor (Fig. 1). Indeed, only one haplotype differed from the others by more than 3 nucleotides. However, while nucleotide diversity was low, the geographic distribution of the nine haplotypes was highly structured, with a distinct east/west split. Sampled individuals from the California and Texas populations consisted entirely of a single haplotype; Hap-A (Fig. 1). Outside of the USA this haplotype was also the only one detected in our samples from the Mexican states of Guerrero, Chiapas and Tabasco, and was also the most prevalent haplotype in three of the four remaining states sampled along the Pacific Coast of Mexico (i.e. the states of Nayarit, Jalisco and Michoacán which are part of the tourist destination area known as the Mexican Riviera). Hap-A was also found in Vera Cruz, but here, a second haplotype which was unique to Vera Cruz (Hap-E), was slightly more common.

In contrast to the distribution of several haplotypes across the western part of the known range of ALB, a different but ubiquitous haplotype (Hap-G) was found among individuals sampled from Florida, Yucatan, French Guyana and throughout the Caribbean Islands. This “eastern” haplotype differed from the most common “western” haplotype by only one nucleotide, but Hap-G was not found in the western part of the range occupied by ALB. There are several potential explanations for this apparent geographic disparity. First, populations from the west may simply be geographically isolated from those in the east, such that admixture is absent or rare. However, the relatively close proximity of populations from the Yucatan peninsula with those from Chiapas and Tabasco in Mexico would argue against this hypothesis. A second explanation results from the number of individuals and number of locations sampled from the different regions. We only sampled a single population from Florida (Table 1) and it is possible that other haplotypes may have been detected had more samples been taken. However, this sample number issue is not a concern for the Caribbean region where we sampled multiple islands, or the Yucatan peninsula which was sampled at three different locations, and the results were consistent with those observed for Florida. A third explanation for the apparent lack of admixture, is that a degree of reproductive isolation may exist between populations from the eastern and western range. Given the very small differences in mtDNA sequences and identical nature of their 28S-D2 sequences, it seems unlikely that any isolation results from the existence of cryptic species. However, there is evidence from a wide range of arthropods that such isolation can arise as a result of infection with microbes such as Wolbachia. 
Wolbachia are maternally inherited endosymbiotic bacteria that employ a variety of mechanisms to manipulate the reproduction of their hosts in order to enhance their own transmission (Stouthamer et al. 1999). One such mechanism, cytoplasmic incompatibility (CI), occurs when Wolbachia modify the sperm of infected males in such a way that it cannot fertilize an egg unless that egg is also infected with the same strain of Wolbachia. As a result, the sperm of males infected with a particular CI-Wolbachia strain cannot fertilize the eggs of females that are not infected with the same strain. Consequently, such females suffer reduced fitness relative to infected females (which can mate with any male), and a particular CI-Wolbachia strain spreads to fixation in a population (Werren and O’Neill 1997). Thus, if females from the western and eastern range of ALB harbor different strains of CI-Wolbachia, then any female immigrant (and her associated mtDNA haplotype) not carrying the correct CI-Wolbachia strain (e.g. a female from another population) would be quickly eliminated. Wolbachia infection is very common in ALB, but our multi-locus sequence typing shows that only a single Wolbachia strain is involved throughout the range of ALB. The identical nature of the Wolbachia was also evidenced by sequences of the wsp gene. Therefore, the present distribution of ALB mitochondrial haplotypes is unlikely to have resulted from mitochondrial sweeps caused by CI resulting from infection with Wolbachia. Furthermore, we found no evidence that any of our sampled populations were infected with Cardinium (Hunter et al. 2003) or Rickettsia, two other bacteria associated with manipulating host reproduction. At present, the most plausible hypothesis to explain our findings is that the eastern populations were founded by an ALB population with a single mitochondrial haplotype, which possibly reflects a population bottleneck during initial colonization.
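
The way a CI-inducing strain can spread to fixation, as described above, can be illustrated with a simple deterministic recursion. The sketch below is not part of the analyses in this study. It assumes perfect maternal transmission, no fecundity cost of infection, random mating, and a hypothetical parameter `sh` giving the fraction of eggs that fail in crosses between uninfected females and infected males.

```python
def next_infection_frequency(p, sh=1.0):
    """One generation of CI dynamics under the simplifying assumptions above.

    p  : current frequency of infected adults (assumed equal in both sexes)
    sh : hypothetical fraction of eggs lost in uninfected-female x
         infected-male crosses (1.0 = complete incompatibility)
    """
    # Infected mothers are compatible with every male and transmit perfectly.
    infected_offspring = p
    # Uninfected mothers lose a fraction sh of eggs when mated to infected males.
    uninfected_offspring = (1.0 - p) * (1.0 - sh * p)
    return infected_offspring / (infected_offspring + uninfected_offspring)


def simulate_sweep(p0, sh=1.0, generations=50):
    """Return the infection-frequency trajectory starting from frequency p0."""
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(next_infection_frequency(trajectory[-1], sh))
    return trajectory


if __name__ == "__main__":
    for generation, freq in enumerate(simulate_sweep(0.05, sh=0.9, generations=30)):
        print(generation, round(freq, 4))
```

Because mitochondria are inherited through the same maternal line as Wolbachia, the haplotype carried by the originally infected females would be expected to rise in frequency along with the infection in such a model.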

The fact that the four most common mitochondrial haplotypes were all infected with the same Wolbachia strain is an interesting observation. Generally, it is expected that cytoplasmically inherited microbes like Wolbachia will spread together with the mitochondrial type that was originally infected with the bacterium. Here we have a situation where the same Wolbachia strain is present in many different mitochondrial haplotypes. It is unlikely that the one Wolbachia type infected a single mitochondrial haplotype and that, over time, the mitochondrial haplotypes then accumulated mutations while the Wolbachia remained identical in the 2,375 bp that were sequenced for the MLST genes (and the 555 bp of the wsp gene). The alternative explanation is that this particular Wolbachia type also commonly transmits horizontally. How this horizontal transmission takes place is unknown. Several possible routes have been hypothesized. These can be divided into two mechanisms: introduction through parasitoids and parasites, and introduction through sharing food sources. Several reports in the literature suggest that horizontal transmission between insect species can take place through plants and, interestingly, like ALB, the insects in these studies feed on plant cells using piercing and sucking mouthparts (Noda et al. 2001; Mitsuhashi et al. 2002; Sintupachee et al. 2006).

The paucity of haplotypes in the eastern range of ALB is surprising. ALB was originally described in 1908, and for many years its distribution was thought to be restricted primarily to peninsular Florida. If Florida is truly part of the native range of ALB we might expect to see greater haplotype diversity; perhaps similar to that seen in populations along the Pacific coast of Mexico. We demonstrated that the presence of only a single haplotype throughout the Eastern range is unlikely to be the result of loss of variation following a symbiont-driven mitochondrial sweep. However, we may not need to invoke a role for a symbiont in such a mitochondrial sweep. There is a small amount of evidence that certain mitochondrial mutations may be adaptive and therefore have the potential to shape the phylogeography of populations (reviewed by Ballard and Melvin 2010). ALB has only become a serious pest in the past two decades and interestingly, the most damaging outbreaks of ALB appear to be restricted to the eastern part of its range (Mead and Peña 1991; Almaguel et al. 1999; Sandoval Cabrera and Cermeli 2005), and seemingly therefore, to a single haplotype (as shown in this study). Perhaps this haplotype (Hap-G) is particularly successful in the humid environment of Florida and the Caribbean, and its spread to fixation triggered the damaging ALB outbreaks that began in the 1990s.

The host plant, avocado, may have also played a role in selecting particular haplotypes. Historically three races of avocado have been recognized based on differences in morphological and taste characteristics (West Indian, Guatemalan, and Mexican). A recent genetic study provided strong support for these classical races, revealing substantial levels of differentiation between them (Chen et al. 2009). Commercial avocado production in the eastern and western parts of the range of ALB is based on varieties that originate from hybrid breeding of different races. The Hass variety accounts for the vast majority of production in California and Mexico and is a hybrid of Mexican and Guatemalan races (Chen et al. 2009). In contrast, most avocados grown commercially in Florida are hybrids of West Indian origin or hybrids between West Indian and Guatemalan races (Crane et al. 2007). Thus, the geographic distribution of ALB mtDNA haplotypes may be heavily influenced by adaptation of ALB populations to particular avocado varieties. Chen et al. (2009) also revealed previously undetected differentiation in wild avocados from central Mexico, with those growing at high altitude being genetically distinct from low altitude populations. Thus, host-plant adaptation may also account for the relatively distant mtDNA haplotype detected in the four ALB individuals collected around 1,800 m (Hap-I, Fig. 1).

Regardless of the reason for the east/west divide in the distribution of ALB mtDNA haplotypes, we used the fact that Hap-A was the only haplotype present in the invasive California population as justification for excluding regions from our microsatellite analyses (i.e., only populations where Hap-A was detected were analyzed). Thus microsatellite variation was not explored in populations from the eastern range of ALB. Similarly, although Hap-A was by far the most prevalent haplotype in surrounding states, it was completely absent in our samples from the Mexican state of Colima and from Guatemala. Therefore, these regions were also omitted from the microsatellite analyses.

Microsatellite variation

Examination of microsatellite allele frequency data at eight loci revealed highly significant differentiation among the 12 populations studied. Pairwise estimates of FST and Nei’s unbiased genetic distances, even between closely neighboring populations, were very high, and AMOVA revealed that approximately half of the variation in our microsatellite data lay between populations. This was not surprising given the presence of many private alleles (43% of the total number of alleles), and the disjunct distribution of allele sizes (e.g., Pper-B120 and Pper-D9) and frequencies across all loci (Table 4). Together, these results suggest that admixture between ALB populations is rare, and that genetic drift and the emergence of unique alleles have played significant roles in producing highly genetically distinct populations. As with the mtDNA haplotypes, one explanation for this could be local adaptation of the ALB populations to what may be substantially genetically different avocado varieties (see above, Chen et al. 2009) or differing regional climates (i.e., Caribbean climates vs. high altitude drier conditions around Guanajuato, Mexico). These two factors, avocado variety and climate, are not mutually exclusive, as cultivars have been selected to perform well under prevailing climatic conditions. Unfortunately, in the present study we did not collect varietal information for host plants.
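
To make the scale of FST concrete, the following sketch computes a simple two-population value, (H_T - H_S) / H_T, from allele frequencies at a single locus. The allele frequencies are hypothetical, and this basic heterozygosity-based statistic is not necessarily the estimator used to obtain the values reported in this study.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one locus."""
    return 1.0 - sum(f * f for f in freqs)


def fst_two_populations(freqs_a, freqs_b):
    """(H_T - H_S) / H_T for two equally weighted populations.

    freqs_a, freqs_b : allele frequencies in the same allele order,
                       each summing to 1.
    """
    # Mean within-population heterozygosity.
    h_s = (expected_heterozygosity(freqs_a) + expected_heterozygosity(freqs_b)) / 2.0
    # Heterozygosity of the pooled (averaged) allele frequencies.
    pooled = [(a + b) / 2.0 for a, b in zip(freqs_a, freqs_b)]
    h_t = expected_heterozygosity(pooled)
    if h_t == 0.0:
        return 0.0  # both populations fixed for the same allele
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical frequencies; the third allele is private to population B.
    pop_a = [0.6, 0.4, 0.0]
    pop_b = [0.1, 0.4, 0.5]
    print(round(fst_two_populations(pop_a, pop_b), 3))
```

In this formulation, populations fixed for different alleles give a value of 1, and populations with identical frequencies give 0, which is why a high proportion of private alleles tends to push estimates upward.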

The presence of so many private alleles in our dataset may also indicate a weakness in this study. Measures of genetic diversity are sensitive to sample size and to account for the number of alleles per locus and accurately estimate heterozygosity, empirical tests (coincidentally using eight microsatellite loci) have suggested that the number of individuals sampled in a population should be >20, and ideally >30 (Pruett and Winker 2008). In contrast to the ease with which ALB were found and collected throughout the Caribbean, ALB individuals were not abundant in any of the Mexican populations sampled (Hoddle et al. 2005). As a result, the numbers of individuals available for genotyping from these Mexican populations were relatively small (n = 10–23; Table 4). Thus, there is some doubt as to whether we were able to sample the true variation in each population. Of 32 private alleles, 21 occurred in their associated population at a frequency of 0.1 or less, and it seems likely that sampling a greater number of individuals may have detected at least some of these private alleles in neighboring populations.
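
The sampling concern raised above can be quantified with a short calculation. As a rough sketch, assuming that the gene copies in a sample are drawn independently and at random from a diploid population, the chance of missing an allele of frequency q in n individuals is (1 - q)^(2n). The function name and the frequencies below are illustrative only.

```python
def prob_allele_missed(allele_freq, n_individuals, ploidy=2):
    """Probability that an allele is absent from a random sample.

    Assumes independent sampling of gene copies, so the allele must be
    missed in each of ploidy * n_individuals draws.
    """
    return (1.0 - allele_freq) ** (ploidy * n_individuals)


if __name__ == "__main__":
    # Sample sizes spanning those in this study (10-23) and the suggested 30.
    for q in (0.10, 0.05, 0.02):
        for n in (10, 23, 30):
            print(f"q={q:.2f} n={n:2d} P(missed)={prob_allele_missed(q, n):.3f}")
```

Under these assumptions, an allele present at low frequency in a neighboring population could easily go undetected at the smaller sample sizes, which is consistent with the caution expressed above about apparent private alleles.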

Allele sizes at all eight microsatellite loci were not evenly separated as would be expected under a stepwise mutation model (SMM) (Slatkin 1995). This is well exemplified by considering the locus Pper-B120 (Table 4), which in the Texas population constitutes a fixed allele (308 bp), which is unique to that population, and which is 21 bp larger than the four remaining, equally spaced alleles (281–287 bp). Even larger size gaps appear between the alleles of Pper-D9 and it is quite certain that these disparate alleles must have arisen by some mutational process other than SMM. Alternatively, mis-priming and amplification of a different locus may explain this result. However, if this were the case, then when scoring allele sizes we might expect to find individuals for which there appeared to be more than two alleles. This was not the case, so we did not investigate these gaps further, and used measures of genetic differentiation that do not assume SMM (e.g., FST).
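
The irregular spacing at Pper-B120 can be checked with a few lines of code. In this sketch, the intermediate allele sizes (283 and 285 bp) and the 2 bp repeat unit are inferred from the description of four equally spaced alleles between 281 and 287 bp; they are assumptions rather than values taken from Table 4, and the helper function is hypothetical.

```python
def non_stepwise_gaps(allele_sizes, repeat_unit):
    """Return adjacent allele pairs whose size difference is not a
    whole number of repeat units."""
    sizes = sorted(set(allele_sizes))
    return [
        (small, large)
        for small, large in zip(sizes, sizes[1:])
        if (large - small) % repeat_unit != 0
    ]


if __name__ == "__main__":
    # Assumed allele sizes (bp) for Pper-B120; 308 bp is the Texas allele.
    pper_b120 = [281, 283, 285, 287, 308]
    print(non_stepwise_gaps(pper_b120, repeat_unit=2))
```

With these assumed sizes, the only pair flagged is the 21 bp gap between 287 and 308 bp, which cannot be produced by adding or removing whole 2 bp repeats.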

Native range and origin of the California and other invasive populations

For much of its history, ALB was thought to be restricted primarily to peninsular Florida and therefore, Florida was assumed to be part of the native range of this pest. However, our results cast doubt on this assumption. We would typically expect to see the greatest amounts of intra-population genetic variation in areas that lie within the native range of an organism (Dlugosch and Parker 2008). Only a single mtDNA haplotype was present in our Florida sample. In contrast, Jalisco and Nayarit (cumulative total across three populations) each with three mtDNA haplotypes, and Vera Cruz (cumulative total across two populations) with two, were the most diverse areas studied (Fig. 1). These levels of diversity were also reflected in nuclear loci with Jalisco and Nayarit both having 32 alleles across the 8 microsatellite loci, and Vera Cruz having 30 (Table 4). Furthermore, across the 8 loci there were no fixed alleles in Jalisco or Nayarit and only one in Vera Cruz. Although our small sample sizes for some locations make inferences somewhat questionable, it would appear that these three regions (and most likely the neighboring states of Guerrero and Michoacán) form at least part of the native range of ALB. This suggestion makes sense given that this area covers part of the native range of avocado. Whether the native range of ALB truly extends into Yucatan and Florida remains unclear since these populations were not subject to microsatellite genotyping.

In contrast to these relatively diverse populations, the two US populations examined were genetically poor. California and Texas individuals all possessed a single mtDNA variant, and across the eight microsatellite loci, total alleles numbered 17 and 13 respectively. Furthermore, three loci were fixed in California and five were fixed in Texas. Despite potential problems with the numbers of individuals sampled from some of the study populations, and the resultant high estimates of FST and Nei’s genetic distances, the California population was most similar to the population from Las Vivosas, Nayarit (Table 6; Fig. 2). Bayesian clustering of multilocus genotypes also grouped California individuals with those from the three Nayarit populations (Fig. 3). Membership coefficients for this cluster for individuals from these areas were close to 1. This can be interpreted as evidence that the invasive California population is derived from Nayarit, Mexico. In contrast, individuals from Texas clustered strongly with those from Vera Cruz, Guerrero, Chiapas and Jalisco, suggesting that the Texas population may be derived from one of those populations, but that none of them is likely to have seeded the invasion of California. Only individuals from Michoacán and Tabasco showed intermediate membership coefficients, a result that in both cases may indicate admixture between a native and an introduced population. Given the geographic distribution of the clusters, under this scenario it would appear more likely that individuals had been transplanted from Nayarit to Michoacán and Tabasco rather than vice versa.

It is difficult to conclude much about the origin of the Caribbean populations. However, because they share a mtDNA haplotype, they are likely to have originated from Florida and/or the Yucatan Peninsula (or the same source as those populations if they are not native). Microsatellite data may reveal directional trends in the spread of ALB through the Caribbean islands. However, this was not a goal of the current project and these analyses were not undertaken. Microsatellite data may also shed light on the population from Colima, Mexico. The prevalent mtDNA haplotype in Colima, Hap-D, was detected in low numbers in neighboring states. Furthermore, the haplotypes from those neighboring states were absent from Colima (Fig. 1). Interestingly, Hap-D was also the only haplotype observed in the sample from Guatemala. While the numbers of individuals sequenced from Colima (n = 12) and Guatemala (n = 7) are low (and therefore the similarity between the two may be superficial), the shared haplotype may be evidence that the Guatemala population is in fact an invasive one originating from a distinct Colima population, or vice versa.

Although the probable area of origin of the California population of ALB has been identified as Nayarit, Mexico, what is unresolved is how ALB was translocated from this area to San Diego, California. Two observations during sampling efforts along the Pacific Coast of Mexico allowed the development of a possible invasion scenario. This area, known as the Mexican Riviera, is extremely popular with American tourists who visit and camp in motorized recreation vehicles, and cruise ship traffic along this coast is very heavy. Also common in this area were roadside plant stands selling potted ornamental and fruit plants, including avocado seedlings, some of which were infested with ALB. Therefore, we tentatively hypothesize that ALB was accidentally moved on an infested avocado seedling that was purchased in Nayarit, imported illegally into Southern California, and planted in a backyard urban garden in San Diego.

Following the detection of ALB in California in 2004, populations did not develop into a significant new pest problem, and ALB has not been detected in commercial avocado orchards (Humeres et al. 2009a), possibly because this pest is vulnerable to pesticides used to control avocado thrips, S. perseae (Byrne et al. 2010). However, ALB has not developed into a major pest of urban backyard avocados either, where pesticide use would be expected to be negligible. External factors that could hinder the spread of ALB include unfavorable climate and host plants. Southern California is not as humid as the coastal areas of Mexico where the source population likely originated, and winters are significantly colder too. Climate may also affect the ability of ALB to utilize a particular avocado variety. Although ALB outbreaks have been observed on Hass avocado in the Dominican Republic (Hoddle unpublished), field studies in California have shown that Hass (the predominant cultivar in California) does not promote high population growth when compared to the highly preferred Bacon cultivar (Humeres et al. 2009a, b), a variety that is relatively uncommon in California. In the Dominican Republic, a favorable climate may help mitigate any adverse host plant effects. Alternatively, if the founding population of ALB in San Diego was small (as seems likely), genetic bottlenecking, drift, and subsequent inbreeding may have resulted in either a population that is not particularly pestiferous, or one that is undergoing a pronounced lag period as it adapts to its new environment. We did not look for statistical evidence of a bottleneck, since available methods (e.g., BOTTLENECK; Cornuet and Luikart 1996) typically require more than 10 polymorphic loci and large sample sizes to reliably detect such events (Luikart et al. 1998). However, reduced allelic diversity in the invading population may be interpreted as evidence that such an event has occurred.

While initial fears that ALB would become a serious pest problem in California avocado orchards have not yet materialized, this study has identified Nayarit, Mexico, as the most probable donor region for the population that invaded California. Thus, if deemed necessary, any effort to identify natural enemies of ALB for use in a classical biological control program against ALB in California should currently focus on Nayarit. However, this study has also revealed a distinct east/west split in ALB genotypes, and highlighted a potentially more pestiferous genotype ‘lurking’ in the eastern part of the studied range (i.e. Florida, Yucatan, and the Caribbean). As such, the identification of natural enemies from this region also, and continued genetic monitoring of ALB in California, would be pertinent so that if ALB does at some point become a serious pest, we are better placed to identify and manage the threat quickly.