Chilli (Capsicum spp.), belonging to the family Solanaceae, is an economically important plant worldwide. It is cultivated as a spice, vegetable and medicinal herb. Green chilli is a rich source of vitamins A and C [1].

Geminiviruses are plant viruses characterized by single-stranded, circular DNA genomes of 2.5-3.0 kb. On the basis of genome organization, insect vector and host range, the family Geminiviridae is divided into seven genera [2], with Begomovirus being the largest genus. Members of the genus Begomovirus are transmitted by whitefly (Bemisia tabaci) vectors. In the New World (NW), most begomoviruses have bipartite genomes (DNA A and DNA B), while in the Old World both bipartite and monopartite begomoviruses have been identified [3]. Most monopartite begomoviruses are associated with a satellite molecule called a betasatellite, which is often required for wild-type symptom development in naturally infected host plants. Some begomovirus-betasatellite disease complexes are associated with a nanovirus-like component named alphasatellite [4, 5]. A novel class of DNA satellites (deltasatellites) has recently been identified in association with NW begomoviruses; these satellites depend on a limited range of begomoviruses for maintenance in plants [6]. They are approximately one quarter the size of a begomovirus genome/genomic component, are non-coding, and possess a stem-loop structure with the nonanucleotide (TAATATTAC) forming part of the loop, a putative second predicted stem-loop structure and an A-rich region [7].

Chilli leaf curl disease (ChiLCD) has become a serious problem in India, and it is a major limitation to chilli cultivation [8]. Begomoviruses such as tomato leaf curl New Delhi virus, chilli leaf curl virus (ChiLCV), tomato leaf curl Joydebpur virus, chilli leaf curl Palampur virus (ChiLCPaV), chilli leaf curl Vellanad virus, papaya leaf curl virus, pepper leaf curl Bangladesh virus (PepLCBV), chilli leaf curl Salem virus (ChiLCSV), tomato leaf curl virus (ToLCV) and chilli leaf curl Bijnour virus [9–12], and betasatellites such as chilli leaf curl betasatellite, tomato leaf curl Bangladesh betasatellite (ToLCBDB), croton yellow vein mosaic betasatellite, radish leaf curl betasatellite, tomato leaf curl Joydebpur betasatellite and tomato leaf curl Ranchi betasatellite have been reported to be associated with leaf curl disease of chilli [11, 13]. In addition, a number of begomoviruses and betasatellites have been identified in association with leaf curl disease of Capsicum in other parts of Asia and North America [14–17].

Two leaf samples from two chilli (local variety, Suraj Mukhi) plants showing typical leaf curl disease symptoms were collected from a single field of the Gonda district (27.13°N; 81.93°E) in Uttar Pradesh, India, in April 2013. Total genomic DNA was isolated from a single leaf sample using the DNeasy Plant Mini Kit (QIAGEN, Germany) following the manufacturer’s instructions. The full-length genome of the virus was amplified by rolling-circle amplification (RCA) using the TempliPhi kit (GE Healthcare, USA). The RCA product was initially subjected to restriction digestion with BamHI, HindIII, KpnI and PstI (Thermo Scientific, USA). Subsequently, it was digested with BamHI, yielding a DNA fragment of 2.7 kb, which was cloned into the pGreen0029 vector. DNA B and alphasatellite could not be detected by PCR in the leaf sample. The betasatellite was PCR-amplified, using the RCA product as the template, with the universal primer pair β01/β02 [18] and cloned into the pGEM-T Easy Vector (Promega, USA). Transformed bacterial colonies were first screened by α-complementation, and plasmid DNA was isolated. The presence of the desired DNA fragment was confirmed by restriction digestion with the respective restriction enzymes. Two colonies each for the begomovirus and the betasatellite were sequenced at Xcelris Genomics (Ahmedabad, India).

The sequences were submitted to GenBank under the accession numbers KJ957157 and KJ868822 for the genomes of the begomovirus and betasatellite, respectively. Multiple sequence alignments and phylogenetic trees were produced using MEGA software version 6.0 (Tables S1 and S2) [19]. Pairwise alignment was done using Sequence Demarcation Tool version 1.2 (SDT, Tables S3 and S4) [20].
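As a rough illustration of what a pairwise identity score represents, the sketch below computes percent identity over the alignment columns in which neither sequence has a gap. This is a simplified, assumed formulation for illustration only and is not the SDT implementation; the function name `pairwise_identity` and the toy sequences are hypothetical and are not data from this study.

```python
def pairwise_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: both inputs are rows taken from the same pairwise alignment,
    so they must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, for illustration only).
toy_a = "TAATATTACCGGAT-GC"
toy_b = "TAATATTACCGCATAGC"
print(f"{pairwise_identity(toy_a, toy_b):.2f} % identity")
```

In practice the alignment itself (and therefore the score) depends on the alignment algorithm and its parameters, which this sketch does not address.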

For recombination analysis, the full-length sequence of the begomovirus genome of the present isolate was analyzed using Recombination Detection Program (RDP) version 3.1 (Table S5) [21] with six automated methods, namely RDP, GENECONV, BootScan, Chimera, SiScan and 3Seq. Default settings were used, with a highest acceptable probability value of P = 0.05, and the recombination events were detected by all six methods.

Infectious clones of the genome of the Gonda isolate and the associated betasatellite were constructed in the pGreen0029 (pGR5.4) and pCAMBIA1391Z (pCAMβ) binary vectors, respectively, essentially as described by Pratap et al. [22]. The RCA product, partially digested with BamHI, yielded a DNA fragment of 2.7 kb. It was cloned into the pGreen0029 vector (pGR2.7I) and its nucleotide sequence was determined. A primer pair (Chi For, GGATCCGTTACTTAACCACTTACC, and Chi Rev, AAGCTTCGCATTGGTATGTGTGCGGTA) incorporating BamHI and HindIII restriction sites, respectively, was designed, and PCR was performed on the RCA DNA. The PCR fragment (2.7 kb) obtained was cloned into the pGEM-T Easy Vector (pGEM-T-2.7), released by digestion with BamHI and HindIII restriction enzymes (Fermentas, USA) and cloned into the pGreen0029 vector (pGR2.7II). The monomer cloned in pGR2.7I was released by BamHI digestion and cloned into the binary vector pGR2.7II. Thus, a complete head-to-tail dimer of the ChiLCGV genome was cloned into the pGreen0029 vector (pGR5.4). The integration and orientation of the dimeric clone of the begomovirus genome of the Gonda isolate were confirmed by restriction digestion with NdeI. The betasatellite was cloned into the pCAMBIA1391Z binary vector (pCAMβ). The infectious clones, viz. pGR5.4 (along with the helper plasmid pSoup) and pCAMβ, were mobilized into Agrobacterium tumefaciens strain LBA4404 using the freeze-thaw method [23]. Agroinfiltration of Nicotiana benthamiana plants was done as previously described [24].
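The restriction sites built into the two primers can be checked directly from the sequences given above. The following is a minimal sketch, assuming the standard recognition sequences for BamHI (GGATCC) and HindIII (AAGCTT); the helper name `five_prime_site` is hypothetical and used only for illustration.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Primer sequences as given in the text (written 5' to 3').
PRIMERS = {
    "Chi For": "GGATCCGTTACTTAACCACTTACC",
    "Chi Rev": "AAGCTTCGCATTGGTATGTGTGCGGTA",
}


def five_prime_site(primer: str):
    """Return the name of the enzyme whose site starts the primer, or None."""
    seq = primer.upper()
    for enzyme, site in SITES.items():
        if seq.startswith(site):
            return enzyme
    return None


for name, seq in PRIMERS.items():
    print(f"{name}: 5' site = {five_prime_site(seq)}, length = {len(seq)} nt")
```

Placing a different site on each primer gives the amplified monomer distinct ends, which is what allows the BamHI/HindIII fragment to be inserted in a defined orientation.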

The full-length nucleotide sequence of the begomovirus genome was 2760 bp in size. Analysis for begomovirus components showed the presence of a predicted hairpin structure with a nonanucleotide (TAATATTAC) sequence forming part of the loop. This structure is typically part of the Ori of the virion-sense strand of geminiviruses [25]. The final adenine nucleotide of the nonanucleotide sequence is conventionally designated as the first nucleotide of the sequence. ORF Finder showed the presence of six predicted genes, four in the complementary sense (C1, C2, C3 and C4) and two in the virion sense (V1 and V2), diverging from the intergenic region that contains the predicted hairpin structure. The coding capacity and the position of the genes are shown in Supplementary Table S6.
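The numbering convention described above can be sketched as a rotation of the circular sequence so that the final adenine of the nonanucleotide becomes nucleotide 1. This is an illustrative sketch only: the function name `rotate_to_origin` and the toy sequences are hypothetical, the search covers only the strand supplied, and only the first occurrence of the motif is used.

```python
NONANUCLEOTIDE = "TAATATTAC"


def rotate_to_origin(genome: str) -> str:
    """Rotate a circular sequence so it starts at the final A of TAATATTAC.

    Assumptions: `genome` is the virion-sense strand written as a linear
    string, and the first occurrence of the motif marks the origin.
    """
    seq = genome.upper()
    n = len(seq)
    # Append the first 8 nt so a motif spanning the linear ends is still found.
    circular = seq + seq[: len(NONANUCLEOTIDE) - 1]
    idx = circular.find(NONANUCLEOTIDE)
    if idx == -1:
        raise ValueError("nonanucleotide not found on this strand")
    # Offset of the last adenine within the motif (index 7 of TAATATTAC).
    start = (idx + NONANUCLEOTIDE.rindex("A")) % n
    return seq[start:] + seq[:start]


# Toy circular sequences (hypothetical, not the ChiLCGV genome).
print(rotate_to_origin("GGCCTAATATTACGGTT"))
print(rotate_to_origin("TTACGGCCTAATA"))  # motif spans the linear ends
```

Numbering every genome from the same position is what makes reported ORF coordinates and recombination breakpoints comparable between isolates.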

The complete nucleotide sequence of the betasatellite was determined to be 1374 bp in size. The sequence showed all the features typical of a betasatellite [5], including a single ORF in the complementary-sense strand, which encodes a 120-amino-acid protein, and an adenine-rich sequence. It also contains a satellite conserved region, which is conserved in all betasatellites, and a typical stem-loop structure containing the nonanucleotide (TAATATTAC), which is required for replication initiation and resembles the Ori of geminiviruses.

SDT-based pairwise alignment shows that the genome of the Gonda begomovirus isolate shares no more than 89 % nucleotide sequence identity with other begomoviruses (Fig. 1A). It shared the highest level of identity (89 %) with PepLCBV-India isolate Chhapra (JN663853), which is below the threshold value (91 %) for demarcation of begomovirus species [26]. These observations confirm that the virus identified here is a new species in the genus Begomovirus, for which the name chilli leaf curl Gonda virus (ChiLCGV) is proposed. Pairwise alignment shows that the betasatellite cloned in the present study shared a maximum identity of 96 % with ToLCBDB India isolate Jodhpur (HM007105), which is greater than the species demarcation threshold for betasatellites of 78 % sequence identity (Fig. 1B) [27]; it is therefore an isolate of ToLCBDB rather than a new betasatellite species. Phylogenetic analysis grouped ChiLCGV with other PepLCBV isolates reported from India, Bangladesh and Pakistan (Fig. 2A), whereas the betasatellite sequence of the Gonda isolate is closely related to ToLCBDB reported from India (Fig. 2B).
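The demarcation reasoning used here reduces to comparing the best pairwise identity against a fixed threshold. A minimal sketch follows, using only the threshold values and identities quoted in the text; the function name `is_new_species` is hypothetical, and treating an identity exactly equal to the threshold as the same species is an assumption of this sketch.

```python
# Species demarcation thresholds quoted in the text (percent nucleotide identity).
THRESHOLDS = {"begomovirus": 91.0, "betasatellite": 78.0}


def is_new_species(best_identity: float, kind: str) -> bool:
    """True if the highest identity to any known member is below the threshold.

    Assumption: an identity equal to the threshold counts as the same species.
    """
    return best_identity < THRESHOLDS[kind]


# Highest identities reported in the text.
print("begomovirus (89 %):", is_new_species(89.0, "begomovirus"))
print("betasatellite (96 %):", is_new_species(96.0, "betasatellite"))
```

With the reported values, the begomovirus falls below its threshold while the betasatellite lies well above its own, which matches the conclusions drawn in the paragraph above.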

Fig. 1

Sequence Demarcation Tool-based pairwise sequence comparisons. Colour-coded pairwise identity matrix generated from (A) 27 begomovirus genomes and (B) 27 betasatellites. Each coloured cell represents a percentage identity score between two sequences (one indicated horizontally to the left and the other vertically at the bottom). A coloured key indicates the correspondence between pairwise identities and the colours displayed in the matrix.

Fig. 2

(A) Phylogenetic dendrogram showing the relationship of chilli leaf curl Gonda virus (ChiLCGV, KJ957157, this study) with other isolates of chilli leaf curl virus (ChiLCV). The optimal tree with the sum of branch length (=1.10982199) is shown. (B) Dendrogram showing the relationship of the betasatellite associated with ChiLCGV (KJ868822, this study) with other betasatellites associated with ChiLCV. (C) Recombination events detected within ChiLCGV using RDP3. A genome map of ChiLCGV showing the recombinant regions (ChiLCGVa, ChiLCGVb) and the location of possible recombination breakpoints (130-1512, 1836-2314) is provided. Putative parental viruses for this recombinant and the algorithms supporting these data, with their average P-values, are also listed.

Our results indicate that ChiLCGV is a potential recombinant of PepLCBV (HM007097) and ChiLCV (HM007102) (Fig. 2C). Recombination in ChiLCGV was detected at two sites, one in the V2, CP and REn region (nt position 130-1512), and the other in the Rep-C4 region (nt position 1836-2314), with P-values of 1.52 × 10⁻⁶ to 6.67 × 10⁻³⁰ and 1.55 × 10⁻⁴ to 2.27 × 10⁻¹³, respectively (Fig. 2C). In the REn region, PepLCBV was the major parent and ChiLCV was detected as the minor parent. In the Rep-C4 region, the roles were reversed. Genetic recombination is an important process in virus evolution and appears to be frequent in begomoviruses. Inter-species recombination between different begomoviruses is a major factor behind the emergence of new begomovirus species [9, 28, 29]. ChiLCD-associated begomoviruses such as ChiLCPaV [9], ChiLCSV, ChiLCV, PepLCBV and ToLCV [11] have shown evidence of recombination. Our study provides additional evidence that inter-species recombination is involved in the evolution of begomoviruses.

The agroinoculation results are summarized in Supplementary Fig. S1 and Table S7. When the infectious clone of ChiLCGV (pGR5.4) alone was agroinfiltrated into N. benthamiana leaves, only mild symptoms appeared; however, when it was used in combination with the betasatellite, severe leaf curl with yellowing of leaves was observed 15 days post infiltration (Fig. S1). ChiLCGV therefore depends on the betasatellite for induction of severe symptoms in N. benthamiana. Consistent with this, the role of betasatellites in the development of severe leaf curl disease in chilli has recently been demonstrated [11].

In this study, ChiLCGV (Gonda begomovirus isolate) shares the highest nucleotide sequence identity (89 %) with PepLCBV. This is less than the threshold value (91 %) for species demarcation in begomoviruses according to ICTV guidelines [26]. ChiLCGV and ToLCBDB have been described and characterized for the first time from the Gonda region of India where, in combination, they represent a new begomovirus-betasatellite complex infecting Capsicum.