Introduction

Phytoplasmas are a large group of plant pathogenic bacteria, lacking cell walls, belonging to Mollicutes (Brown et al. 2010). Phytoplasmas, which live in the phloem sieve elements of their host plants, are insect-transmitted, and associated with diseases in important crops worldwide (Lee et al. 2000; Christensen et al. 2005). Based on their classification by RFLP analysis of 16S rRNA-encoding gene sequences, 33 phytoplasma/16Sr groups and more than 100 subgroups have been identified (Bertaccini and Lee 2018). The 16SrII peanut witches’ broom phytoplasma group consists of diverse phytoplasma strains with a wide range of biological and ecological properties, associated with diseases in a variety of economically important plant species.

Lime trees (Citrus aurantifolia (Christm) Swing) are widely cultivated in southern Iran, where they are affected by witches’ broom disease of lime (WBDL), associated with the presence of ‘Candidatus Phytoplasma aurantifolia’. The disease was first reported in Oman during the 1980s and then spread to other countries such as Iran, United Arab Emirates, Saudi Arabia, India, and Pakistan (Bové 1986; Garnier et al. 1991; Ghosh et al. 1999; Bové et al. 2000; Alhudaib et al. 2009). Although the disease primarily affects lime, it has also been detected in bakraee (a natural citrus hybrid), grapefruit (C. paradisi Macfad.), citron (C. medica L.) and limequat (C. aurantifolia × Fortunella sp.) (Djavaheri and Rahimian 2004; Bagheri et al. 2010; Azadvar et al. 2015; Faghihi et al. 2017). Symptoms begin with small leaves and progress to the production of a large number of undersized, light green to yellow leaves (Fig. 1). In advanced stages of the disease, symptoms develop in the entire canopy and the tree eventually dies within 4 to 5 years (Bové et al. 1988).

Fig. 1

Citrus trees with witches’ broom symptoms in Iran: a grapefruit, b local acid lime, c bakraee, d Mexican lime, and e Mexican lime with severe symptoms. Symptomatic branches usually do not produce fruits; however, local acid lime fruits (a) were deformed and wrinkled

Phytoplasma differentiation has been based on 16S rRNA gene sequences and is carried out by RFLP analysis of the PCR-amplified R16F2n/R2 region of ribosomal DNA using 17 restriction endonucleases (Lee et al. 1998). Previous research indicated that the 16S rRNA gene is not sufficiently variable to differentiate WBDL phytoplasma strains (Al-Abadi et al. 2016). Therefore, for a finer characterization of ‘Ca. P. aurantifolia’ strains, several other regions of the phytoplasma genome have been investigated (Siampour et al. 2013; Al-Abadi et al. 2016; Al-Ghaithi et al. 2018). Imp is one of the three main types of immunodominant membrane proteins (IDPs) identified in phytoplasmas. Variability analyses of imp genes allowed the identification of three phylogenetic subgroups (A, B, and C) among ‘Ca. P. aurantifolia’-related strains (16SrII), with WBDL phytoplasma clustering within subgroup C. However, WBDL phytoplasma strains from Mexican lime trees throughout Oman, UAE, and Iran (Al-Abadi et al. 2016), as well as from different geographical (semi-tropical, subtropical, and desert) regions (Al-Ghaithi et al. 2018), shared 99.8 to 100% imp gene sequence similarity with each other.

Similarly, based on PCR amplification of secA and SAP11 genes, limited variation among ‘Ca. P. aurantifolia’ strains of Oman, UAE, and Iran has been reported. To the authors’ knowledge, no information is available on the genetic diversity of WBDL phytoplasma infecting Iranian citrus species (bakraee, grapefruit, and other lime biotypes). Thus, in this study, ‘Ca. P. aurantifolia’ strains were collected from different citrus hosts in Iran and characterized to determine the possible presence of genetic variability.

Materials and methods

Plant samples

Samples from 46 citrus plants, mostly over 10 years old, exhibiting WBDL symptoms and grown in commercial orchards or private gardens in Hormozgan and Kerman provinces in Iran, were collected in the summers of 2016 and 2017 (Table 1). The same procedure was followed for asymptomatic plants grown in an insect-proof greenhouse (Citrus and Subtropical Fruits Research Center, Ramsar, Iran), which were used as negative controls. Moreover, ‘Ca. P. aurantifolia’ DNA extracted from an infected lime tree (GenBank accession number MG893890) was included in the analysis as a positive control. Selected tissues were ground in liquid nitrogen and stored at − 80 °C.

Table 1 WBDL phytoplasma strains used in this study

DNA extraction and polymerase chain reaction

Total nucleic acid was extracted from leaf midrib tissues of both symptomatic and asymptomatic citrus trees (limes, bakraee, and grapefruit) using a CTAB extraction procedure (Murray and Thompson 1980) with some modifications. Briefly, 0.5 g of each sample was homogenized in 3 mL of CTAB extraction buffer (2% CTAB; 2% PVP; 100 mM Tris pH 8.0; 50 mM EDTA; 5 M NaCl; 0.2% 2-mercaptoethanol). Then, 1 mL of the extract was incubated at 65 °C for 30 min and centrifuged at 13,000 rpm for 5 min. The upper (aqueous) phase was transferred to a new sterile Eppendorf tube, mixed with chloroform-isoamyl alcohol, and centrifuged at 13,000 rpm for 10 min. The DNA was precipitated by incubation with isopropanol at − 20 °C overnight, pelleted, re-suspended in 50 μL of nucleic acid-free sterile water, and stored at − 20 °C until further use.

Multilocus sequence analysis (MLSA) was performed after amplification of four genomic loci: (i) a major portion of the ribosomal RNA-encoding locus (16S rRNA gene and 16S-23S rRNA intergenic spacer), (ii) ribosomal protein genes, including the 3′ end of rps19 and the entire rplV (rpl22) and rpsC (rps3) genes, (iii) part of a phytoplasma retroelement (group II intron reverse transcriptase/maturase gene), and (iv) the cell division protein gene (ftsH). The primer pairs P1A/P7A, 5′-AACGCTGGCGGCGCGCCTAATAC-3′/5′-CCTTCATCGGCTCTTAGTGC-3′ (Lee et al. 2004); rp(II)F2 (Martini et al. 2007)/rp(I)R1A (Lee et al. 2003), 5′-ATGGTAGGTTATAAATTAGG-3′/5′-GTTCTTTTTGGCATTAACAT-3′; IntF1/IntR, 5′-ATAACACGTTGAAGAATCGCT-3′/5′-TATACGAGTTTTATTGTGGATTC-3′ (Siampour et al. 2015); and FtshF2/FtshR2, 5′-TAAAGATATGGGAGCCCGTATTC-3′/5′-TATATCCACCAACAGAACCTCTC-3′ (the present study) were used to amplify the segments of these loci, respectively. The phytoplasma retroelement and ribosomal protein regions were amplified under previously described PCR conditions (Martini et al. 2007; Siampour et al. 2015). PCR amplification of the ftsH gene was performed in a 25-μL reaction volume containing 1 μL of each primer at 10 μM, 12.5 μL of 2× master mix (Amplicon), and 2 μL (50 ng) of DNA template. The following conditions were used: initial denaturation at 95 °C for 3 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 52 °C for 45 s, and extension at 72 °C for 1 min, with a final extension step at 72 °C for 5 min. The DNA extracted from the tissues of asymptomatic citrus trees (limes, bakraee, and grapefruit) was used as negative control. Amplicons were analyzed by electrophoresis in 1% agarose gels.
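The ftsH reaction set-up described above can be checked with simple dilution arithmetic (C1·V1 = C2·V2). The sketch below is illustrative only: the volumes and stock concentrations are taken from the text, whereas the assumption that the remaining volume is made up with water, and all variable names, are not from the original protocol.

```python
# Dilution arithmetic for the 25 uL ftsH PCR described in the text.
# Assumption: the volume not accounted for by the listed components is water.

TOTAL_UL = 25.0
components_ul = {
    "forward primer (10 uM stock)": 1.0,
    "reverse primer (10 uM stock)": 1.0,
    "2x master mix": 12.5,
    "DNA template": 2.0,
}


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Concentration after dilution into the final volume (same unit as stock)."""
    return stock * volume_ul / total_ul


water_ul = TOTAL_UL - sum(components_ul.values())
primer_final_um = final_concentration(10.0, 1.0)     # each primer, in uM
master_mix_final_x = final_concentration(2.0, 12.5)  # master mix, in "x"

print(f"water to add: {water_ul} uL")
print(f"each primer: {primer_final_um} uM final")
print(f"master mix: {master_mix_final_x}x final")
```

Under these assumptions each primer ends up at 0.4 μM and the master mix at 1×, with 8.5 μL of water completing the reaction.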

Comparative analyses of nucleotide sequences

Amplicons of all four genes were purified and sequenced on both strands using the same primers employed for their amplification with direct Sanger sequencing by Microsynth, Switzerland. All of the sequences were analyzed with Chromas Lite v2.01; positions with missing data and gaps were removed and multiple sequence alignments were performed using Muscle (Edgar 2004) integrated in the MEGA 7 software. The presence of SNPs (single nucleotide polymorphisms) and indels (insertions and deletions) was recorded, and the positions of nucleotide (nt) changes were determined. The final sequences of each gene were deposited in GenBank (Table 1) and compared with the current GenBank database entries using the BLASTn program (online at http://www.ncbi.nlm.nih.gov/BLAST).
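As an illustration of how SNPs and indels can be recorded from a multiple sequence alignment, the following minimal sketch scans aligned sequences column by column. It is not the procedure used in MEGA 7; the function name, the toy sequences, and the strain labels are hypothetical, and gaps are assumed to be written as "-".

```python
def variable_sites(alignment):
    """List variable columns of an alignment.

    alignment: dict mapping a sequence name to an aligned sequence
    (all sequences the same length, '-' marking a gap).
    Returns tuples of (1-based position, 'SNP' or 'indel', {name: base}).
    """
    names = list(alignment)
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for col in range(lengths.pop()):
        column = [alignment[name][col].upper() for name in names]
        if len(set(column)) > 1:
            kind = "indel" if "-" in column else "SNP"
            sites.append((col + 1, kind, dict(zip(names, column))))
    return sites


# Hypothetical toy alignment, not real phytoplasma sequence data.
toy = {
    "reference": "ATGC-TAGGA",
    "strain_1":  "ATGCGTAGGA",
    "strain_2":  "ATGC-TATGA",
}

for position, kind, bases in variable_sites(toy):
    print(position, kind, bases)
```

In this toy case, column 5 is reported as an indel (an inserted G in one sequence) and column 8 as a SNP (G/T).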

Virtual RFLP and phylogenetic analysis

Computer-simulated RFLP analysis of 16S rRNA-encoding gene (R16F2n/R2) was also performed using iPhyClassifier on the sequences obtained from the Iranian ‘Ca. P. aurantifolia’ strains. Similarity coefficients (F) were calculated (Zhao et al. 2013). Phylogenetic relationships among the Iranian ‘Ca. P. aurantifolia’ strains were assessed based on sequences of all four genomic loci. Phylogenetic analysis was carried out with the software MEGA7 (Kumar et al. 2016) using the UPGMA (unweighted pair group method with arithmetic mean) (Sneath and Sokal 1973) method and bootstrapping 1000 times to estimate branching stability.
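The text does not state how the similarity coefficient is computed. A minimal sketch is given below, assuming the Nei and Li definition commonly used for phytoplasma RFLP comparisons, F = 2Nxy/(Nx + Ny), where Nx and Ny are the numbers of restriction fragments in strains x and y and Nxy is the number of fragments they share. The fragment sizes are hypothetical, and the sketch covers a single pattern rather than the pooled patterns of all 17 enzymes.

```python
from collections import Counter


def similarity_coefficient(fragments_x, fragments_y):
    """F = 2*Nxy / (Nx + Ny) for two lists of fragment sizes (bp).

    Fragments are treated as shared when their sizes are equal; repeated
    sizes are counted as separate fragments.
    """
    count_x, count_y = Counter(fragments_x), Counter(fragments_y)
    shared = sum((count_x & count_y).values())
    total = len(fragments_x) + len(fragments_y)
    if total == 0:
        raise ValueError("at least one fragment is required")
    return 2 * shared / total


# Hypothetical fragment sizes, not taken from the study.
pattern_x = [520, 310, 200, 150, 68]
pattern_y = [520, 310, 350, 68]

print(round(similarity_coefficient(pattern_x, pattern_y), 2))
```

Here three fragments (520, 310, and 68 bp) are shared out of nine in total, so F = 6/9; identical patterns give F = 1.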

Results

Bands of the expected sizes were obtained in all 46 citrus tree samples with the P1A/P7A primers. PCR amplification using the IntF1/IntR, rp(II)F2/rp(I)R1A, and FtshF2/FtshR2 primers produced approximately 900, 1290, and 800 bp products, respectively. None of the asymptomatic plants generated amplification products with the primers used (data not shown).

The nucleotide sequences of the 16S rRNA gene and IS region of ‘Ca. P. aurantifolia’ citrus strains were aligned and compared with the reference strain (U15442) available from the GenBank database. Three genotypes with 99.9% sequence identity were detected (Table 2). The 16S rRNA gene and IS region showed 99.5–99.9% sequence identity between the Iranian strains and the available GenBank strains. Relative to the start position of the 16S rRNA gene (CTG) of the reference strain, insertions and deletions involving four nucleotides were detected in all Iranian strains (limes, bakraee, and grapefruit). One of these mutations (380^381insG) generated an additional recognition site for the BstUI endonuclease and removed one recognition site for HpaII at the same position, which changed the RFLP patterns. In silico restriction analyses with BstUI and HpaII (Fig. 2a, b) confirmed that the Iranian strains had RFLP patterns different from those of the 16SrII-B reference strain. In further detail, the alignment of all Iranian ‘Ca. P. aurantifolia’ 16S rRNA sequences revealed low genetic diversity; two SNPs were detected among the strains, which were present in the sequences of seven strains (DE1, DE3, DE5, DE7, DE8, JD2, and ZR3) (Table 2). A unique nucleotide substitution (G/T) was identified at position 393 (JD2), which produced a new restriction site for the RsaI enzyme. Accordingly, digestion with RsaI gave the JD2 strain a pattern that was not present in the other strains and that also distinguished it from the reference strain of the 16SrII-B subgroup (Fig. 2c). Nevertheless, the virtual RFLP patterns derived from the Iranian phytoplasma strains were very similar to that of group 16SrII, subgroup B, with similarity coefficients of 0.98 (genotype 3) and 0.99 (genotypes 1–2). On the other hand, the 16S-23S spacer region of all strains was identical (100%) and showed 99.1% similarity with the reference (WBDL) strain. A total of two variable sites were identified in the IS region when the Iranian strains were compared to the reference strain (Table 2).
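The effect of a single inserted G on BstUI and HpaII sites can be illustrated with a small in silico search for recognition sequences (BstUI: CGCG, HpaII: CCGG, RsaI: GTAC). This is only a sketch of the principle, not the iPhyClassifier procedure; the 11-nt fragment below is hypothetical and is not the actual sequence around position 380.

```python
# Recognition sequences of the three enzymes discussed in the text.
# All three are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {"BstUI": "CGCG", "HpaII": "CCGG", "RsaI": "GTAC"}


def find_sites(seq, site):
    """Return the 0-based start positions of a recognition sequence."""
    seq = seq.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


# Hypothetical fragment carrying one HpaII site (CCGG).
reference = "ATTACCGGTTA"
# Insert a G between the two C residues: CCGG becomes CGCGG.
variant = reference[:5] + "G" + reference[5:]

for label, seq in (("reference", reference), ("variant", variant)):
    hits = {enzyme: find_sites(seq, site)
            for enzyme, site in RECOGNITION_SITES.items()}
    print(label, seq, hits)
```

In this toy example the insertion creates a CGCG (BstUI) site and destroys the CCGG (HpaII) site at the same position, which is the kind of change that alters both virtual RFLP patterns.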

Table 2 Single nucleotide polymorphisms in the ribosomal RNA locus of the Iranian ‘Ca. P. aurantifolia’ strains
Fig. 2

Computer-simulated restriction fragment length polymorphism analysis of the R16F2n/R2 region of the 16S rRNA gene comparing the Iranian ‘Ca. P. aurantifolia’ strains with other phytoplasmas of group 16SrII. a BstUI, b HpaII, and c RsaI. Restriction enzymes are listed at the bottom (for more details see Table 2)

Analysis of ribosomal protein gene variability was performed on 1151-bp sequences comprising the complete rpl22-rps3 genes. For the rp genetic marker, the Iranian strains shared 100% sequence similarity with each other and 99.9% similarity with the reference strain (GenBank acc. no. EF186815). Relative to the start position of the rps3 gene (ATG), an A was inserted between the nucleotides at positions 660 (T) and 661 (A), and this insertion caused substitutions in the amino acid sequence of the Iranian WBDL phytoplasma strains (Fig. 3).

Fig. 3

Comparison of amino acid sequences of the ‘Ca. P. aurantifolia’ rps3 protein of the Iranian strains. Numbering (rps3 gene) is in accordance with the ‘Ca. P. aurantifolia’ strain LWB (GenBank acc. no. EF186815). Upper row, nucleotide sequences; lower row, amino acid sequences; an inserted A is present between nucleotides 660 and 661 (highlighted) in the Iranian citrus strains, resulting in substitutions in the amino acid sequence

In the retroelement, the most variable genetic locus analyzed (Fig. 4), four genotypes were detected among phytoplasmas infecting citrus hosts. Sequencing of the retroelement area (group II intron reverse transcriptase/maturase gene) allowed the assembly of 889-nucleotide fragments, including the partial N-terminal reverse transcriptase (RT) domain. Sequence identity among the ‘Ca. P. aurantifolia’ strains ranged between 96.1 and 100%. Thirty-six variable sites were identified in this part of the genome. One shared substitution (463C>T) was detected in genotype 2 strains (NO1, NO2, RO10, and DE5) collected from different localities. Each SNP was detected in at least two nucleotide sequences from two independent localities, indicating that the SNPs were not due to PCR or sequencing errors (Wei et al. 2008; Quaglino et al. 2009). The highest variability was observed in the sequences of genotypes 3 (CWB1 and CWB2) and 4 (KAH2), where the majority of the SNPs were non-synonymous, causing amino acid substitutions in the protein sequence (Fig. 4). For the ftsH gene, 779-bp sequences were obtained from the 46 sequenced strains. The partial ftsH gene sequence was identical among all Iranian citrus strains analyzed (data not shown).
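Whether a SNP is synonymous or non-synonymous can be decided by translating the affected codon before and after the change. The sketch below assumes the standard bacterial genetic code (in which UGA is a stop codon) and reports non-synonymous changes in the p.Trp26Cys style used in Fig. 4; the function name and the example codons are hypothetical and are not taken from the retroelement sequences.

```python
# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def describe_substitution(ref_codon, alt_codon, codon_number):
    """Return 'synonymous' or a protein-level description such as p.Trp26Cys."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    return f"p.{THREE_LETTER[ref_aa]}{codon_number}{THREE_LETTER[alt_aa]}"


# Hypothetical codons chosen only to illustrate both outcomes.
print(describe_substitution("TGG", "TGT", 26))  # Trp codon changed to a Cys codon
print(describe_substitution("CTT", "CTC", 10))  # both codons encode Leu
```

A G>T change in the third position of a TGG codon is non-synonymous (Trp to Cys), whereas a T>C change in the third position of CTT leaves leucine unchanged.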

Fig. 4

Comparison of unique sequences of the Iranian ‘Ca. P. aurantifolia’ group II intron reverse transcriptase/maturase genes. Nucleotide substitutions that changed the amino acid sequence are highlighted in gray, and the deduced amino acid sequence is shown under each box. *For amino acid sequences, substitutions are described according to https://www.hgvs.org/mutnomen/examplesAA.html (e.g., tryptophan 26 to a cysteine → p. Trp26Cys, in which “p” refers to protein)

Phylogenetic analysis of the 16S rRNA-encoding and 16S-23S rRNA spacer region sequences showed that the Iranian strains were closely related and formed a well-supported clade (Fig. 5a). The UPGMA tree revealed that only one strain (JD2) formed a clearly separate cluster, while all the other strains grouped in a second cluster and were more closely related to the reference strain (GenBank acc. no. U15442) (Fig. 5a). Furthermore, phylogenetic analysis of the Iranian citrus rpl22 and rps3 gene sequences, together with the reference strain (GenBank acc. no. EF186815), showed that the Iranian strains were closely related to the reference strain of ‘Ca. P. aurantifolia’. All 46 phytoplasma strains formed a cluster with the known phytoplasma strains in group rp-XVi (Fig. 5b). Phylogeny based on the group II intron reverse transcriptase/maturase gene distinguished four separate clusters supported by high bootstrap values, revealing much higher diversity among the Iranian strains (Fig. 6a). One group of phytoplasma strains (genotype 3) formed a clearly separate cluster with a bootstrap support of 100%; it comprised two strains from local acid lime trees, which differed from those detected in Mexican lime. Phylogenetic analysis of the 46 strains based on the ftsH gene sequences placed all strains in one group (Fig. 6b).

Fig. 5

UPGMA phylogenetic analyses using the 16S rRNA-IS (a) and rpl22-rps3 (b) genetic loci of 46 Iranian phytoplasma strains from citrus species. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches. The 16SrII-B reference strains WBDL (GenBank acc. no. U15442) and LWB (GenBank acc. no. EF186815) were used for the 16S rRNA-IS and rpl22-rps3 trees, respectively. Abbreviations of the phytoplasma strains are listed in Table 1. The circle, triangle, diamond suit, and square symbols represent local acid lime, grapefruit, Mexican lime, and bakraee strains, respectively. ‘Candidatus Phytoplasma australasia’ and peanut witches’ broom were used as the out-groups to root the 16S rRNA-IS and rpl22-rps3 trees

Fig. 6

UPGMA phylogenetic analyses using the group II intron reverse transcriptase/maturase (a) and ftsH (b) genetic loci of 46 Iranian phytoplasma strains from citrus species. Bootstrap values (> 50%) for 1000 replicates are shown on branches. Abbreviations of the phytoplasma strains are listed in Table 1. The circle, triangle, diamond suit, and genotype symbols represent local acid lime, grapefruit, Mexican lime, and bakraee strains, respectively. Bacillus cereus was used as the out-group to root the group II intron reverse transcriptase/maturase and ftsH trees

Discussion

For decades, the production of acid lime and other citrus species in Iran has been affected by witches’ broom disease. The 16SrII phytoplasma group has been detected in many important crops worldwide. Thus far, at least 21 subgroups have been identified (Bertaccini and Lee 2018), and two phytoplasmas within this group have been proposed as ‘Candidatus Phytoplasma’ species: ‘Ca. P. aurantifolia’ (16SrII-B) and ‘Ca. P. australasia’ (16SrII-D) (Zreik et al. 1995; White et al. 1998).

In this study, the variability of five genomic loci of ‘Ca. P. aurantifolia’ strains was analyzed by RFLP and sequence analyses. The targeted loci consisted of the 16S rRNA gene, the 16S-23S rRNA intergenic spacer (IS), rp genes, a retroelement, and the cell division protein gene. All Iranian strains had mutations in the 16S rRNA-IS sequence and differed from the reference strain. A total of eight variable sites were identified in the 16S rRNA-IS sequences when the Iranian strains were compared to the reference strain. According to 16S rRNA RFLP analysis, if a phytoplasma strain has a similarity coefficient between 0.97 and 1 with the strains representative of a certain ribosomal group (e.g., 16SrI) and subgroup (e.g., A), the phytoplasma under study is a variant of that subgroup (Zhao et al. 2013). Therefore, the three genotypes described in this work, which have similarity coefficients equal to or higher than 0.98, are variants of 16SrII-B. On the other hand, a previous study reported a very close relationship among the Middle East (Iran, UAE, and Oman) WBDL phytoplasma strains collected from Mexican lime trees (Al-Abadi et al. 2016), and this is further supported by the characterization of 46 strains of ‘Ca. P. aurantifolia’ from Mexican limes in Iran, which grouped together with strains from Oman and UAE (Fig. 7). In addition, DNA sequences from representative phytoplasma strains belonging to different subgroups of 16SrII obtained from GenBank were included in the phylogenetic analysis. The phylogenetic tree based on the F2nR2 region showed that all Iranian phytoplasma strains detected in citrus species clustered together with the Oman and UAE WBDL phytoplasma strains collected from Mexican lime trees (Fig. 7).

Fig. 7

Phylogenetic analyses of 16S rRNA (F2nR2 region) sequences of ‘Ca. P. aurantifolia’ associated with witches’ broom disease in Citrus spp. using UPGMA. The numbers at the nodes of the branches indicate the percentage of replicate trees in which the associated taxa clustered together in the bootstrap (> 50%) test. The sequences comprised acid lime strains from the Middle East [Iran, Oman and United Arab Emirates (UAE)], which are distinguished by SNPs, and grapefruit, bakraee and local acid lime strains from Iran. The circle, triangle, diamond suit and square symbols represent local acid lime, grapefruit, Mexican lime and bakraee strains, respectively. ‘Bacillus cereus’ served as the out-group and ‘Ca. P. aurantifolia’ strain WBDL (acc. no. U15442) was used as the reference strain in the phylogenetic tree reconstruction

Previous studies have suggested that, among closely related phytoplasma strains, ribosomal protein genes are more variable than the 16S rRNA gene and have more phylogenetically informative characters, which makes them suitable as supplemental molecular markers for finer strain differentiation. For example, among the closely related Bulgarian ‘Candidatus Phytoplasma mali’ strains, sequences of the rp gene are more variable and distinguish up to three different RFLP subtypes (subgroups rpX-A, rpX-B, and rpX-F) (Fránová et al. 2019).

However, in Poland, most of the ‘Ca. P. mali’ strains were identified as belonging to rpX-A and only one strain was affiliated to the rpX-B subgroup (Cieślińska et al. 2015). The findings of this survey indicated that the rp loci were highly conserved among the Iranian citrus strains of ‘Ca. P. aurantifolia’, which showed 100% nucleotide identity with each other and 99.9% identity with the phytoplasma reference strain.

The genome sequences of several phytoplasmas have been characterized to date, and a group of putative composite transposons called potential mobile units (PMUs) has been found in their genomes. The cell division protein gene (ftsH) is one of the PMU signature genes, and it has been reported in phytoplasma genomes in multiple copies (up to 24). These genes encode membrane-associated ATP-dependent Zn proteases of 700 amino acids (Bai et al. 2006). Previous surveys using this gene have demonstrated the existence of genetic diversity in ‘Ca. P. mali’ and ‘Ca. P. asteris’ strains (Seemüller et al. 2010; Fránová et al. 2016). However, the sequences of the ftsH gene obtained with primers FtshF2/FtshR2 from the Iranian strains were 100% identical to each other.

Group II introns are large catalytic RNAs that are widespread in many bacteria and in organelles of plants, fungi, and algae. They are genetic retroelements capable of self-splicing and inserting into DNA sites (Tourasse et al. 2005). These elements have been identified in several phytoplasma genomes (Siampour et al. 2015). In general, mobile elements play an important role in the ecological dissemination and evolution of host-adaptive strategies in bacteria (Toft and Andersson 2010). An RNA-Seq profile of “flavescence dorée” phytoplasma in grapevine confirmed that the transcriptional levels of the group II intron were higher than those of other hypothetical protein genes and of genes possibly involved in host-bacterium interactions. The authors suggested that this mobile element might be linked to the genomic plasticity that the phytoplasma needs to increase its fitness and adopt host-adaptive strategies (Abbà et al. 2014). In the Iranian phytoplasma strains, the group II intron reverse transcriptase gene had the highest degree of genetic variability among the sequenced genes; thus, it was the most efficient marker for differentiating phytoplasmas within group 16SrII, subgroup B. Phylogenetic analysis of this gene allowed the differentiation of four genotypes among the 46 Iranian strains. Genotype 1 contained strains from Mexican lime along with bakraee and grapefruit, genotype 2 consisted exclusively of Mexican lime strains, strains from local acid limes clustered in genotype 3, and a single Mexican lime strain formed genotype 4. Samples collected from Mexican lime trees in Kerman Province had higher genetic diversity than those from Hormozgan Province. One possible reason for the high genetic variability detected in ‘Ca. P. aurantifolia’ strains from Kerman Province is that citrus trees there are often grown in less intensively managed orchards. Under such conditions, it is possible that trees were more frequently exposed to vectors (Seemüller et al. 2010) and that different phytoplasma strains were transmitted among cultivated trees from nearby orchards (Fránová et al. 2019).

In total, phylogenetic analyses based on the variability of four genes showed that the Iranian strains clustered in five groups. Groups 1 and 3 comprised strains from Mexican lime found in Hormozgan Province (group 1: DA2, FA1, FA3, FA6, FA7, FA8, HA6, HA7, JI1, JI2, MIN1, MIN3, RO7, RO8, RO14, RO17, RO18, RO19, SA1, SA2, TO1, ZR1, and ZR3; group 3: RO10) and Kerman Province (group 1: DE3, DE7, DE8, JI1, JI2, KAH1, KAH3, ME1, ME2, ME3, and ME4; group 3: NO1, NO2, and DE5), whereas the bakraee (BWB1 and BWB2) and grapefruit (GWB1 and GWB4) strains clustered only in group 1 (Fig. 8). Two strains isolated from Mexican lime trees clustered into groups 2 (JD2) and 5 (KAH2); both were found at one site (Kahnooj, Kerman Province). On the other hand, the phytoplasma strains from local acid lime were placed in a separate group. Findings from this study showed that multiple gene analyses are powerful tools for understanding the genetic diversity and phylogenetic relationships within WBDL phytoplasma isolated from different citrus genotypes cultivated at different sites. Additional studies of the insect vectors and alternative plant hosts in Iran are needed to understand how the phytoplasma spreads between citrus crops.

Fig. 8

UPGMA phylogenetic analyses of sequences of ‘Ca. P. aurantifolia’ associated with witches’ broom disease in Citrus spp. based on a concatenation of the 4234 nucleotides from 16SrRNA-IS, rpl22 and rps3, ftsH, and group II intron reverse transcriptase gene sequences. The numbers at the nodes of the branches indicate the percentage of replicate trees in which the associated taxa are clustered together in the bootstrap (> 40%) test. Abbreviations of the phytoplasma strains are listed in Table 1