Plants growing in natural environments are affected by several biotic factors. Among these, viruses contribute extensively to crop losses and thereby greatly affect crop production. The family Geminiviridae comprises important plant pathogens with circular single-stranded DNA genomes [1]. Begomovirus is the largest genus in this family; its members are transmitted by whiteflies (Bemisia tabaci Genn.) and have a very wide host range among dicotyledonous plants [2, 3]. Begomoviruses are responsible for substantial economic damage to agricultural production, and severe crop losses continue to be reported from several parts of the world [4, 5].

Chilli (Capsicum annuum) belongs to the family Solanaceae, and its green and dried pods (fruit) are used as condiments worldwide to enhance the flavour of foods. Originating in South America, chilli is now one of the most important vegetable and spice crops in all tropical and subtropical countries, including India. Chilli leaf curl disease (ChiLCD) significantly limits production of chilli in the Indian subcontinent [6–8]. The viral etiology of chilli leaf curl disease was reported as early as the 1960s, but its association with begomoviruses was reported only recently [9–13]. The disease is characterised by typical begomoviral symptoms, including leaf curling, vein thickening, and stunted growth. If the disease persists later in the life cycle, flower buds abscise and anthers develop without pollen grains, which ultimately results in poor fruit set and distorted or underdeveloped fruit [11–13].

Severe leaf curl disease symptoms were observed on chilli plants during 2014 in the suburbs of Ahmedabad, India (Fig. 1A). The infected leaves exhibiting typical begomovirus-like symptoms (upward curling of leaves, shortening of internodes, etc.) were used for extraction of total genomic DNA using the method described by Dellaporta et al. [14].

Fig. 1

(A) Chilli plants exhibiting viral symptoms of leaf curling, vein clearing and shortening of internodes, collected from the Ahmedabad district of Gujarat, India. (B) Phylogenetic tree showing the relationship of DNA-A of the studied begomovirus to selected begomoviruses. Sequences were aligned using the MUSCLE alignment algorithm (MEGA6). The vertical axis is arbitrary, and the horizontal axis represents the distance expressed in percentage of nucleotide substitution × 100. The numbers at the nodes indicate the number of times the given branch was supported. The begomovirus from this study is indicated in bold. The begomovirus sequences used for comparisons and phylogenetic analysis were obtained from GenBank, NCBI. GenBank accession numbers are indicated to the right of each virus name, and abbreviations are according to Brown et al. [20]

Universal begomovirus degenerate primer pairs were used to detect the presence of begomovirus genomes in total genomic DNA from 20 collected samples [15]. The full-length viral genome was amplified using an Illustra TempliPhi Amplification Kit (GE Healthcare Life Sciences), with φ29 polymerase and total DNA as the template, as per the manufacturer’s instructions. The resulting concatemers were subjected to digestion with various restriction endonucleases (BamHI, EcoRI, HindIII, KpnI, PstI and XbaI) to release the unit-length viral genome. Digestion with BamHI and KpnI resulted in the release of a specific 2.7-kb fragment that was presumed to be the viral genome. This fragment was purified by elution from an agarose gel and ligated into pBluescript vector (pKS+) that had been linearized by digestion with the same enzymes. The ligated products were then used to transform competent DH10β E. coli cells, and recombinant clones were selected using a blue-white colony-screening assay. Recombinant clones containing the 2.7-kb putative viral genome were confirmed by an agarose gel mobility assay (by comparing their mobility with that of the empty pKS+ plasmid) and by restriction digestion. A restriction digestion map was created by digesting confirmed clones with several restriction endonucleases and analysing product mobility patterns by agarose gel electrophoresis. Two recombinant 2.7-kb clones were selected for full-length genome sequencing.
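The logic of the digestion step can be illustrated with a minimal sketch: an enzyme that cuts exactly once per circular genome unit will cut a tandem concatemer once per repeat, so every internal fragment is one genome unit long. The recognition sequences below are the standard ones for the six enzymes named above; the function names and the 21-nt toy sequence are hypothetical and serve only as an illustration, not as the actual viral genome or the authors' procedure.

```python
# Recognition sequences of the six enzymes named in the text (all palindromic).
SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
}


def count_sites_circular(genome, site):
    """Count occurrences of `site` in a circular sequence,
    including an occurrence that spans the arbitrary origin."""
    genome = genome.upper()
    wrapped = genome + genome[: len(site) - 1]
    return sum(1 for i in range(len(genome)) if wrapped.startswith(site, i))


def single_cutters(genome):
    """Names of enzymes that cut the circular genome exactly once."""
    return sorted(
        name for name, site in SITES.items()
        if count_sites_circular(genome, site) == 1
    )


def internal_fragment_lengths(genome, site, copies=4):
    """Distances between successive sites in a linear tandem concatemer.
    The end fragments are ignored because their length depends on where
    the concatemer happens to start and stop."""
    concatemer = genome.upper() * copies
    last_start = len(concatemer) - len(site)
    cuts = [i for i in range(last_start + 1) if concatemer.startswith(site, i)]
    return [b - a for a, b in zip(cuts, cuts[1:])]


# Hypothetical toy genome: one BamHI site, and one KpnI site spanning the origin.
TOY_GENOME = "CCAAATTTGGATCCAAAGGTA"

if __name__ == "__main__":
    print(single_cutters(TOY_GENOME))
    print(internal_fragment_lengths(TOY_GENOME, SITES["BamHI"]))
```

On a real clone, the same check would be run on the assembled full-length sequence; enzymes with no site leave the concatemer uncut, and enzymes with several sites yield sub-genomic fragments.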

Total genomic DNA isolated from an infected chilli plant was also examined for the presence of a betasatellite DNA molecule, using universal betasatellite primers [16]. An approximately 1.3-kb DNA fragment was amplified, confirming the presence of a betasatellite DNA molecule along with the viral genomic DNA within the same sample. The PCR-amplified product was isolated, ligated into pTZ57R/T vector (InsTAclone PCR Cloning Kit, Thermo Scientific, USA) and cloned.

Full-length sequences of two viral DNA clones (pChAB5 and pChAK2) and two associated betasatellite clones (pTAAβ3 and pTAAβ9) were determined by the dideoxynucleotide chain termination method using an ABI automated sequencer with primers specific for the cloning vectors. A primer-walking strategy was employed to obtain full-length viral genomic sequences. To obtain more information about this virus, the sequences thus obtained were compared with those of other viruses using the MUSCLE alignment algorithm in SDT v1.2 (http://www.cbio.uct.ac.za/SDT) [17, 18]. Nucleotide sequence analysis revealed that the two viral DNA clones (pChAB5 and pChAK2) were 99.7 % identical to each other and that the two betasatellite clones (pTAAβ3 and pTAAβ9) were 99.6 % identical to each other. This suggests that a single type of begomovirus and associated betasatellite molecule was present in the studied sample. Since the clones were more than 99 % identical, one clone each of the viral genomic DNA (pChAK2) and betasatellite molecule (pTAAβ9) was used for subsequent investigation and analysis. The full-length sequences of the begomoviral clone pChAK2 and the betasatellite clone pTAAβ9 comprise 2744 nucleotides (nt) (GenBank accession no. KM880103) and 1369 nt (GenBank accession no. KM880104), respectively. Virus-specific primers were designed from the full-length sequence of the studied virus (pChAK2) and previously reported chilli-infecting begomoviruses [11], and these were used for PCR on all 20 collected samples, resulting in amplification of the studied virus and Chilli leaf curl Pakistan virus. The studied virus was present in 17 samples, whereas Chilli leaf curl Pakistan virus was present in only three of the 20 collected samples. These results show that the two viruses were present in the suburbs of Ahmedabad, India, and that the studied virus was more prevalent.

Subsequently, the full-length sequences of pChAK2 and pTAAβ9 were analysed using the BLASTn search program (http://www.ncbi.nlm.nih.gov/) [19]. BLASTn analysis revealed that the pChAK2 clone shares 90 % sequence identity with chilli leaf curl virus-[India:Guntur:2009] (ChiLCV-[IN:Gun:09], GenBank accession no. HM007100), with 100 % query coverage. The recombinant clone pChAK2 possesses six ORFs (ORF finder, http://www.ncbi.nlm.nih.gov/gorf/gorf.html), two on the virion-sense strand and the remaining four on the complementary-sense strand. The two virion-sense ORFs, designated V1 and V2, encode a coat protein (29.8 kDa) and a pre-coat protein (13.5 kDa), respectively. The ORFs on the complementary strand were designated C1 (encoding the rolling-circle replication initiator protein Rep, 40.86 kDa), C2 (encoding the transcriptional activator TrAP, 15.37 kDa), C3 (encoding the replication enhancer protein REn, 15.9 kDa), and C4 (encoding an 11.59-kDa protein). The intergenic region (IR) is 279 nt long and contains the characteristic inverted repeat, which is capable of forming a stem-loop structure. The IR also contains the TATA box upstream of the C1 (Rep) ORF. The MUSCLE alignment algorithm in SDT v1.2 (http://www.cbio.uct.ac.za/SDT) was used to calculate the percent sequence identity between each pair of sequences in the dataset [17, 18]. This comparison showed that the full-length pChAK2 sequence has a maximum nucleotide sequence identity of 90.4 %, i.e., below the 91 % threshold, with chilli leaf curl virus-[India:Guntur:2009] (ChiLCV-[IN:Gun:09]; GenBank accession no. HM007100) (Supplementary Table S1, Supplementary Fig. S1).
Thus, in accordance with the taxonomic criteria laid out by the ICTV [20, 21], the chilli leaf curl virus reported here should be considered a member of a new begomoviral species, and we propose the name chilli leaf curl Ahmedabad virus-[India:Ahmedabad:2014] (ChiLCAV-[IN:Ahm:14]) for this virus.
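The demarcation reasoning above amounts to computing a pairwise identity from an alignment and comparing it with the 91 % threshold. The following is a minimal sketch of that arithmetic, assuming that identity is the fraction of matching positions among aligned columns in which neither sequence has a gap; this gap handling, the function names and the toy sequences are assumptions made for illustration and are not taken from the SDT source code.

```python
SPECIES_THRESHOLD = 91.0  # percent identity threshold quoted in the text


def pairwise_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of an alignment.
    Assumption: columns in which either sequence has a gap are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def below_species_threshold(max_identity_percent, threshold=SPECIES_THRESHOLD):
    """True if the best match is below the threshold, i.e. the isolate
    would be assigned to a new species under this criterion."""
    return max_identity_percent < threshold


if __name__ == "__main__":
    # Toy aligned pair (hypothetical): 9 ungapped columns, 8 matches.
    print(round(pairwise_identity("ACGT-ACGTA", "ACGTTACGAA"), 1))
    # Values reported in the text for the virus and the betasatellite.
    print(below_species_threshold(90.4))
    print(below_species_threshold(93.5))
```

The 91 % threshold applies to full-length begomovirus DNA-A sequences as stated in the text; betasatellites are classified under separate criteria [26], so the last call only demonstrates the comparison itself.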

A MUSCLE alignment of the genome of ChiLCAV-[IN:Ahm:14] with the genome sequences of previously reported begomoviruses (Table 1) revealed the chimeric nature of ChiLCAV-[IN:Ahm:14] [17, 18]. ORFs C1, C2 and C4 and the IR shared maximum nucleotide sequence identity with the corresponding regions of chilli leaf curl virus-[India:Guntur:2009] (ChiLCV-[IN:Gun:09], GenBank accession no. HM007100) at 93.6, 97.3 and 97.0 % (Supplementary Table S1), respectively. ORF C3 of ChiLCAV-[IN:Ahm:14] shared a maximum of 94.10 % nucleotide sequence identity with the C3 ORF of pepper leaf curl Bangladesh virus-[Pakistan:Faisalabad:2006] (PepLCBV-[PK:Fai:06]; GenBank accession no. AM691745). ORF V1 of ChiLCAV-[IN:Ahm:14] shared the highest nucleotide sequence identity (87.8 % and 88.3 %) with chilli leaf curl virus-[India:Salem:2008] (ChiLCV-[IN:Sal:08], GenBank accession no. HM007119) and tomato leaf curl Bangalore virus-[India:Bangalore 4:1997] (ToLCBaV-[IN:Ban4:97], GenBank accession no. AF165098), respectively. ORF V2 of ChiLCAV-[IN:Ahm:14] showed maximum nucleotide sequence identity (91.2 %) to ChiLCV-[IN:Sal:08].

Phylogenetic analysis was done by comparing the nucleotide sequence of the full-length viral genomic DNA with those of 21 known begomoviruses, and this showed that ChiLCAV-[IN:Ahm:14] falls in the same clade as ChiLCV-[IN:Gun:09] (Fig. 1B). Similarly, ORFs C1, C2 and C4 and the IR of ChiLCAV-[IN:Ahm:14] shared a clade with ChiLCV-[IN:Gun:09] (data not shown). However, ORFs V1 and V2 shared a clade with ToLCBaV-[IN:Ban4:97] and ChiLCV-[IN:Sal:08] (data not shown). These analyses indicate that ChiLCAV-[IN:Ahm:14] probably arose as a result of genomic recombination with other viruses. During multiple infections of a single host, viruses can exchange or rearrange their genetic components, resulting in new, distinct, recombinant progeny that might be more virulent and adaptive than their parent viruses [22, 23].

Similarity plots generated using SimPlot (version 3.5.1) with the Kimura 2-parameter distance model suggested that ChiLCAV-[IN:Ahm:14] (Fig. 2A) resulted from genetic exchange between ChiLCV-[IN:Gun:09] and ChiLCV-[IN:Sal:08]. Recombination analysis was carried out with the RDP3 program (v.4.13) with default settings [24, 25], and this confirmed the presence of recombination sites in the genome (nt 200–543 and 979–1223), putatively identifying ChiLCV-[IN:Gun:09] as the major parent (Fig. 1B) and ChiLCV-[IN:Sal:08] (Fig. 2B) as the minor parent (highest probability with MaxChi, p = 2.751 × 10⁻¹², and Chimaera, p = 5.096 × 10⁻⁴, respectively). We did not find any recombination events in the betasatellite molecules when using the RDP3 program.
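The idea behind a similarity plot can be sketched as follows: a window slides along a pairwise alignment, a Kimura 2-parameter distance d = −½ ln[(1 − 2P − Q)√(1 − 2Q)] is computed in each window (P and Q being the proportions of transitions and transversions), and a similarity value is plotted at the window midpoint. The window of 200 nt and step of 20 nt follow the Fig. 2 legend; reporting 1 − d as the similarity, skipping gapped or ambiguous columns, treating the alignment as linear rather than circular, and all function names and toy sequences are assumptions of this sketch and not a description of SimPlot's internals.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned segments.
    Columns containing anything other than A, C, G or T are skipped."""
    n = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        n += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n == 0:
        return float("nan")
    p = transitions / n
    q = transversions / n
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        return float("inf")  # sequences too divergent for the model
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def similarity_plot(query, reference, window=200, step=20):
    """Return (window midpoint, 1 - K2P distance) pairs along an alignment."""
    points = []
    for start in range(0, len(query) - window + 1, step):
        d = k2p_distance(query[start:start + window],
                         reference[start:start + window])
        points.append((start + window // 2, 1.0 - d))
    return points


if __name__ == "__main__":
    # Hypothetical toy alignment: the query matches the reference in its
    # first half and diverges in its second half, as a recombinant would.
    reference = "ACGT" * 100
    query = "ACGT" * 50 + "ACGA" * 50
    for midpoint, similarity in similarity_plot(query, reference):
        print(midpoint, round(similarity, 3))
```

Running the same scan against each candidate parent and looking for positions where the curves cross is what points to putative breakpoints, which a dedicated method such as those in RDP then tests statistically.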

Fig. 2

Identification of putative recombination sites in chilli leaf curl Ahmedabad virus (ChiLCAV) isolate pChAhm-Kpn I2. (A) Similarity plot analysis of ChiLCAV using the SimPlot program (version 3.5.1) with a sliding window of 200 nucleotides moving in 20-nucleotide steps. The dotted line shows an arbitrary 70 % reliability threshold, and the solid line shows the threshold of species demarcation [20]. (B) Schematic illustration of the linearized genome of ChiLCAV, displaying the origin of recombinant fragments, recombination breakpoints, p-values determined by the RDP programme, and putative parental viruses

BLASTn analysis of the associated betasatellite molecule pTAAhmβ9 showed a maximum of 93.5 % nucleotide sequence identity to the previously reported tomato leaf curl Bangladesh betasatellite-[India:Meerut:Chilli:2011] (GenBank accession no. JX193616), with 100 % query coverage. According to ICTV taxonomic criteria [26], the associated betasatellite molecule pTAAhmβ9 may be named tomato leaf curl Bangladesh betasatellite-[India:Ahmedabad:Chilli:2014]. The betasatellite molecule contains one ORF, betaC1 (363 nt), a satellite conserved region (SCR; 220 nt), and an adenine-rich region, which are characteristic of betasatellite genome organization.

To detect the presence of an associated genomic DNA molecule (DNA-B) in the diseased chilli sample from Ahmedabad, India, we used degenerate primers [27] as well as a new set of primers corresponding to the common region shared by DNA-A and DNA-B. However, we did not detect an associated DNA-B molecule in any of the collected samples.

Thus, the results presented here confirm that ChiLCAV-[IN:Ahm:14] is a new monopartite begomovirus, which, along with the tomato leaf curl Bangladesh betasatellite Ahmedabad isolate, forms a complete disease complex.

The phylogenetic tree, ORF comparisons and recombination results suggest that ChiLCV-[IN:Gun:09] is the parent from which ChiLCAV-[IN:Ahm:14] arose through recombination with the distantly related begomovirus ChiLCV-[IN:Sal:08]. Significantly, this finding suggests that Chilli leaf curl virus is the prevalent virus in chilli crops in the Indian subcontinent, playing a very important role in disease development [7] as well as in the emergence of chilli begomoviruses through recombination with other begomoviruses [28]. Further studies are required to explore the specific interactions between ChiLCV-[IN:Gun:09] and ChiLCAV-[IN:Ahm:14] and to investigate the role of interactions between these begomoviruses and the host chilli crop in disease development.