“Candidatus Liberibacter solanacearum” (Lso) (syn. “Candidatus Liberibacter psyllaurous”) is now commonly accepted as the likely causal agent of Zebra Chip (also known as Zebra Complex) disorder of potatoes (Liefting et al. 2008; Abad et al. 2009; Crosslin and Bester 2009; Lin et al. 2009; Munyaneza et al. 2009; Secor et al. 2009; Rehman et al. 2010). It is also associated with Psyllid Yellows disease of tomato and capsicum in New Zealand (Liefting et al. 2009), and tomato and potato in North America (Hansen et al. 2008; Brown et al. 2010). The bacterium has been definitively shown to be vectored by the tomato/potato psyllid (TPP, Bactericera cockerelli) (Munyaneza et al. 2008). Similarly, Lso has been detected in the carrot psyllid Trioza apicalis and psyllid-affected carrots in Finland (Munyaneza et al. 2010a, b).

Shortly after the discovery of Lso in potatoes in New Zealand (Liefting et al. 2008), it was discovered in potatoes from Texas (Bech 2008). The sequence of the 16s rRNA gene of Lso from the Texas potatoes differed from that of the New Zealand material by two SNPs, corresponding to 99.8% similarity. Further analysis indicated that in samples from potatoes in North America, the 16s variability consistently matched variation noted in the 16s-ISR-23s region (Wen et al. 2009).
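As a rough illustration of how a two-SNP difference translates into a similarity figure, the sketch below computes pairwise identity over aligned columns. The function name and the toy sequences are hypothetical and are not Lso sequence; the aligned length behind the 99.8% figure is not stated here, although two mismatches at that level would be consistent with roughly 1,000 compared positions.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy data (assumption): 1,000 aligned bases carrying two substitutions.
reference = "ACGT" * 250
variant_bases = list(reference)
variant_bases[100] = "G"  # reference base at index 100 is 'A'
variant_bases[501] = "T"  # reference base at index 501 is 'C'
variant = "".join(variant_bases)

print(f"{percent_identity(reference, variant):.1f}% identity")
```

Excluding gapped columns is one of several possible conventions; counting gaps as mismatches would give a lower figure for sequences of unequal length.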

Sequence variation in genes of the three related liberibacter species associated with huanglongbing (HLB) of citrus has indicated opportunities for taxonomic and epidemiological studies. In “Candidatus Liberibacter africanus” (Laf), somewhat greater differences suggested a subspecies with host plant separation (Garnier et al. 2000). SNP differences were used to indicate variation in samples of Laf from Kenya and South Africa (Magomere et al. 2009). Analysis of genetic variability in various accessions of “Candidatus Liberibacter asiaticus” (Las) indicated a geographically based range of variants, although no differences in phenotype were noted (Adkar-Purushothama et al. 2009; Bastianel et al. 2005; Ding et al. 2009; Furuya et al. 2010). However, some degree of plant host specificity has been suggested (Khairulmazmi et al. 2009). Tandem repeat analysis is another method used to differentiate strains across geographical zones (Chen et al. 2010).

In the nearly two years since the discovery of Lso, sequences from many different samples have been deposited in GenBank. Most sequences cover the 16s gene, with a few covering the 50s rplJ and rplL genes and the 16s/23s intergenic spacer region. The objective of the present study was to determine whether there was any geographic pattern in the known variability of the 16s gene and the other two gene regions of Lso, as already noted in Las and Laf.

Sequences for Lso were obtained from GenBank (NCBI), with the exception of the carrot 16s/23s intergenic spacer region sequence described below. Metadata such as geographic origin and biological source material were also noted. Available sequences for this species were downloaded and aligned within gene regions using ClustalX. The SNPs were identified visually, described as recommended by den Dunnen and Antonarakis (2000) and submitted in the prescribed manner to the dbSNP database (NCBI).
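The visual identification of SNPs could be cross-checked programmatically. The sketch below is a minimal example, assuming the alignment has already been exported as equal-length strings; it is not the procedure used here, and the function name, sequence IDs and toy alignment are hypothetical. Positions are counted from the first ungapped base of the chosen reference row, and substitutions are written in a simple "position, reference base > variant base" form.

```python
def find_snps(aligned, reference_id):
    """Map each substitution (e.g. '3G>A') to the sequence IDs that carry it.

    aligned: dict of sequence ID -> aligned sequence (equal lengths, '-' for gaps).
    Positions are 1-based and count only ungapped bases of the reference row.
    """
    ref = aligned[reference_id].upper()
    if any(len(seq) != len(ref) for seq in aligned.values()):
        raise ValueError("all sequences must have the same aligned length")
    snps = {}
    ref_pos = 0
    for col, ref_base in enumerate(ref):
        if ref_base == "-":
            continue  # insertion relative to the reference; ignored in this sketch
        ref_pos += 1
        for seq_id, seq in aligned.items():
            base = seq[col].upper()
            # Skip the reference itself, gaps, ambiguous calls and matching bases.
            if seq_id == reference_id or base in ("-", "N") or base == ref_base:
                continue
            snps.setdefault(f"{ref_pos}{ref_base}>{base}", []).append(seq_id)
    return snps


# Toy alignment (assumption): not real Lso sequence.
alignment = {
    "ref": "ACGTACGTAC",
    "sample_1": "ACGTACGTAC",
    "sample_2": "ACATACGTGC",
    "sample_3": "ACATACGTGC",
}
snps = find_snps(alignment, "ref")
for description, carriers in snps.items():
    print(description, carriers)
```

Insertions and deletions are deliberately left out; only single-base substitutions are reported.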

Carrot plant DNA had been analyzed by PCR in a previously published study which reported the detection of Lso for the first time in carrot plants (Munyaneza et al. 2010a). In the present study, the same carrot DNA samples were tested by PCR using primer pairs Lp Frag 4-(1611F)/LP Frag 4-480R (5′-GGTTGATGGGGTCATTTGAG-3′ and 5′-CACGGTACTGGTTCACTATCGGTC-3′), which amplify sequence from the 16S-23S ribosomal RNA intergenic spacer and 23S ribosomal RNA gene of “Ca. Liberibacter solanacearum” (Hansen et al. 2008). Amplification was performed in 25 μl reactions containing Green Go Taq Polymerase Buffer (Promega, Inc., Madison, WI), 10 pmol of each primer, 1 μl of DNA extracts, and 1 U of Go Taq Polymerase (Promega, Inc., Madison, WI). The PCR conditions were: an initial cycle at 94°C for 3 min, followed by 39 cycles of 94°C for 20 s, 60°C for 20 s and 72°C for 1 min, plus an additional cycle of 5 min at 72°C. The amplified DNA was electrophoresed in 1.5% agarose gels (1X TAE) containing ethidium bromide. Selected amplicons were excised and purified using GenElute Minus EtBr Spin columns (Sigma-Aldrich, Inc., St. Louis, MO), ligated into plasmid pCR2.1-TOPO according to the manufacturer’s instructions (Invitrogen, Carlsbad, CA) and transformed into Top 10 cells (Invitrogen, Carlsbad, CA). Plasmid DNA was purified using QIAprep Spin Miniprep Kit (Qiagen Sciences, Maryland). Clones were sequenced in both directions (MCLAB, San Francisco, CA). The sequence was submitted to GenBank.
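The reaction set-up and cycling programme above can be restated as a short calculation. The sketch below, with hypothetical function names, derives the final primer concentration from 10 pmol in a 25 μl reaction and the total programmed hold time of the thermal profile; ramping between temperatures is not included, so the real run time would be longer.

```python
def primer_concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """Final primer concentration in micromolar (1 pmol per microlitre equals 1 uM)."""
    return amount_pmol / volume_ul


def programmed_hold_time_s(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times in seconds, excluding temperature ramps."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Values taken from the conditions described in the text.
concentration = primer_concentration_um(amount_pmol=10, volume_ul=25)
total_seconds = programmed_hold_time_s(
    initial_s=3 * 60,           # 94 C for 3 min
    cycles=39,
    step_seconds=[20, 20, 60],  # 94 C 20 s, 60 C 20 s, 72 C 1 min
    final_s=5 * 60,             # 72 C for 5 min
)

print(f"primer concentration: {concentration} uM each")
print(f"programmed hold time: {total_seconds} s ({total_seconds / 60:.0f} min)")
```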

After alignment, the sequences showed three distinct patterns of SNPs in each of the three gene regions. The SNPs are consistent within each gene region, and few sequences showed any sequencing errors. In the case of carrots, single DNA samples tested with all primer sets showed the SNPs across all three gene regions to be inherited together. In the solanaceous crops, the 16s and 16s/23s regions are inherited together. The SNP patterns were labeled “a” and “b” for the solanaceous material, and “c” for the carrot material (Tables 1 and 2).
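Sorting aligned sequences into SNP patterns can be sketched as a grouping on the bases found at the variable columns. The example below is illustrative only: the function name, sequence IDs, toy sequences and column indices are hypothetical, and the groups are keyed by their SNP profile rather than by the “a”, “b” and “c” labels.

```python
from collections import defaultdict


def group_by_snp_profile(aligned, variable_columns):
    """Group sequence IDs by the bases they carry at the given 0-based alignment columns."""
    groups = defaultdict(list)
    for seq_id, seq in aligned.items():
        profile = "".join(seq[col].upper() for col in variable_columns)
        groups[profile].append(seq_id)
    return dict(groups)


# Toy alignment (assumption): four short sequences that fall into three patterns.
toy_alignment = {
    "potato_1": "ACGTACGTAC",
    "potato_2": "ACGTACGTAC",
    "tomato_1": "ACATACGTGC",
    "carrot_1": "TCGTACCTAC",
}
variable_columns = [0, 2, 6, 8]

patterns = group_by_snp_profile(toy_alignment, variable_columns)
for profile, members in patterns.items():
    print(profile, members)
```

A sequence with a sequencing error at a variable column would appear as its own singleton group, which makes such reads easy to spot and recheck by eye.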

Table 1 GenBank accession numbers showing geographic source and haplotype designation for each gene region
Table 2 Haplotypes and SNP differences with dbSNP ss# references. The reference sequence for the 16s and 23s genes is EU812559.1, and for 50s genes is EU834131.1 (highlighted row is in the intergenic spacer region between rplJ and rplL genes). Nucleotide numbers count from the beginning of the reference sequence

The consistent co-occurrence of SNPs across the three gene regions indicates that they represent haplotypes. There is no evidence of SNPs from different haplotypes being combined within a gene region. Our data agree with, and extend, the earlier published SNP analysis, with our haplotypes “a” and “b” corresponding to Clades 1 and 2 (Wen et al. 2009).
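The absence of mixed SNP combinations is the kind of claim that can be checked mechanically once each sample has a pattern call per gene region. The sketch below assumes such a table exists; the function name, sample names, region keys and calls are hypothetical, and the inconsistent sample is included only to show what the check would flag, since no such sample is reported here.

```python
def samples_with_mixed_calls(calls):
    """Return samples whose gene regions carry more than one haplotype label.

    calls: dict of sample -> {region name: haplotype label, or None if not sequenced}.
    """
    mixed = {}
    for sample, by_region in calls.items():
        labels = {label for label in by_region.values() if label is not None}
        if len(labels) > 1:
            mixed[sample] = by_region
    return mixed


# Toy calls (assumption): regions that were not sequenced are recorded as None.
toy_calls = {
    "potato_west": {"16s": "a", "16s/23s": "a", "50s": None},
    "potato_east": {"16s": "b", "16s/23s": "b", "50s": "b"},
    "carrot": {"16s": "c", "16s/23s": "c", "50s": "c"},
    "hypothetical_mixed": {"16s": "a", "16s/23s": "b", "50s": None},
}

flagged = samples_with_mixed_calls(toy_calls)
print(sorted(flagged))
```

Samples with only one sequenced region cannot be flagged by this check, so it tests congruence only where at least two regions are available from the same DNA sample.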

The geographic ranges of the “a” and “b” haplotypes in North and Central America appear to separate into two geographic regions (Fig. 1). The “a” type has so far been found from Honduras and Guatemala in the south, north through western Mexico to Arizona and California, with some samples found in Texas, Kansas and Nebraska. The “b” haplotype has so far been found east of the mountain divide in eastern Mexico and north through Texas to south central Washington (east of the Cascade Mountains). There is some overlap between the regions of these two haplotypes, especially in Texas, although this could have arisen through movement of infected plant material for agricultural purposes. For example, this movement is specifically noted in the metadata of the material sourced from Florida (Table 1, Fig. 1), where the tomato/potato psyllid is absent and zebra chip has not been documented, except for one case in a commercial field owned and operated by a potato grower from Texas (JEM, personal observation).

Fig. 1 Distribution of haplotypes “a” and “b” of ‘Candidatus Liberibacter solanacearum’ across North and Central America. Haplotype “c” is currently only known from Finland. Plant material in Florida originated from Texas potato seed

A somewhat similar geographic range separation of genes has been identified in the psyllid host (Jackson et al. 2008; Liu et al. 2006). As more data points are obtained, a better estimate of the native geographic ranges of these haplotypes should emerge, with a clearer separation of natural and agriculturally derived spread. As more sequences become available, these known variations will be useful in mapping the phylogenetic relationships of both psyllid host and bacterium, and particularly in biogeographical studies.

Only a few sequences from New Zealand are present in GenBank and these are all of the “a” haplotype. With the relatively recent discovery of the tomato/potato psyllid in New Zealand (Teulon et al. 2009), this haplotype is not inconsistent with introduction of the bacterium, probably with the psyllid, from the western range of the psyllid in North and/or Central America.

Although we have only a single data point of Lso in carrots, the haplotype difference would be consistent with Lso being native to northern Europe rather than an introduction; this view might need to be reviewed as new data become available. The carrot psyllid has been a pest of carrot production for a long time in northern and some parts of central Europe (Nehlin et al. 1994).

Quite striking across the haplotype designations is the congruence of SNPs linked across all three gene regions. The lack of mixing of the various SNPs between and within gene regions strongly suggests a long divergence and separation of bacterial populations, even though there is geographic, plant and insect host overlap of “a” and “b” in North America. It is tempting to suggest a relatively recent incursion of Lso into the solanaceous crop/psyllid combination based on the 1920s and 1990s reports of new potato diseases associated with tomato/potato psyllid (Linford 1927; Wen et al. 2009). Unfortunately, the early reports of tomato/potato psyllid damage to plants in California are too lacking in detail to infer any presence of Lso at that time (Compere 1915; Crawford 1917; Essig 1917). Two plant/insect host combinations offer a possible explanation for the derivation of these two haplotypes now found across common crops in similar geographic regions.

No evidence yet exists for co-infection of these haplotypes, probably because sequencing sensitive to the possibility of co-infection has not yet been attempted. Co-infection of citrus by both Laf and Las has been reported (Garnier and Bové 1996), as has co-infection by “Ca. Liberibacter americanus” and Las (Teixeira et al. 2005), suggesting that these more closely related haplotypes could also co-exist in the insect and plant hosts. At present, no biological difference is indicated for these haplotypes other than the obvious host differences between haplotype “c” and haplotypes “a” and “b”.

It is strongly recommended that future Lso trials note the haplotype used, that sequences derived from plant or psyllid sources be analysed to haplotype level, and that these sequences and descriptions of source material, geographic origin and symptoms be noted with the GenBank accession. Wider investigation of plants in the geographic regions of current ZC incidence is suggested to determine the proximal source, i.e. reservoir plant(s), of Lso.