Introduction

Viruses in the family Geminiviridae infect a large number of plant species and cause significant economic losses in many crops worldwide [1]. The genome of geminiviruses typically consists of one or two circular single-stranded DNA (ssDNA) molecules of about 2.5-3.2 kilobases (kb) that are encapsidated within twinned (geminate) icosahedral particles of about 22 × 38 nm in size [2]. New geminiviruses are rapidly emerging as a result of a high mutation rate and frequent recombination [3]. Currently, the family is divided into nine genera, namely Becurtovirus, Begomovirus, Capulavirus, Curtovirus, Eragrovirus, Grablovirus, Mastrevirus, Topocuvirus, and Turncurtovirus, on the basis of host range, type of vector, genome organization, genome-wide pairwise sequence comparisons, and phylogenetic relationships [2].

During a survey of viral diseases of legumes in July 2018, common bean (Phaseolus vulgaris) plants showing virus-like symptoms were observed and collected in Harbin, Heilongjiang province, China. A small-RNA (sRNA) library was constructed from an equal-ratio leaf mixture of three common bean samples (DN-18, DN-19, and DN-20) showing severe stunt and leaf curling, vein banding, and chlorosis, respectively. The library was sequenced using the Illumina HiSeq-400 sequencing platform at Lianchuan Biotechnology Co., Ltd. (Hangzhou, China). A total of 17,060,132 reads with lengths between 17 and 27 nucleotides (nt) were obtained after adaptor trimming and read-quality filtering using Cutadapt 2.4 [4]. These reads were assembled using Velvet and Oases with a k-mer value of 17 [5, 6], and the resulting contigs were analyzed as described earlier [7]. A total of 22, 16, and 27 contigs had a high level of similarity to bean common mosaic virus (BCMV; genus Potyvirus), broad bean wilt virus 2 (BBWV-2; genus Fabavirus), and alfalfa mosaic virus (AMV; genus Alfamovirus), respectively. This analysis also revealed 12 contigs with a low level of similarity (less than 65% amino acid [aa] sequence identity) to beet curly top virus (BCTV; genus Curtovirus), turnip curly top virus (TCTV; genus Turncurtovirus), or sesame yellow mosaic virus (SeYMV; genus Turncurtovirus). Three sets of primers were designed on the basis of these geminiviral contigs (Supplementary Table 1). Polymerase chain reaction (PCR) was then performed using Phanta Super-Fidelity DNA Polymerase (Vazyme Biotech, Nanjing, China) and total DNA extracted from DN-18, DN-19, or DN-20 by the CTAB method [8]. 
Two primer sets (Turto_F1 plus Turto_R1 and Turto_F2 plus Turto_R2; all located in the C3 open reading frame [ORF]) successfully amplified two fragments of about 2.7 kb from sample DN-18, but not from the other two samples, whereas primers Turto_F3 (located in the C2 ORF) and Turto_R3 (located in the C1 ORF) did not give an amplicon of the predicted size (~1.1 kb) for any of the three samples. The amplicons were recovered, cloned into the pEASY-Blunt vector (Transgen, Beijing, China), and sequenced. The results showed that the two fragments were from the same new geminivirus. Therefore, the primers Turto_100F and Turto_2000R (located in V1 and C1, respectively) were designed to amplify a fragment of about 1.1 kb of this geminivirus. The 5′ and 3′ ends of this fragment overlapped with the two 2.7-kb fragments. We also designed a primer (Turto_1488R) that is back-to-back with Turto_F1 for amplifying the entire genome of this geminivirus. A band of about 3.0 kb was successfully amplified from DN-18, but not from the healthy common bean leaf sample (Fig. 1B). The amplified 3.0-kb fragment was inserted into the pEASY-Blunt vector. Plasmids from two independent colonies were sequenced using an ABI automated DNA sequencer (Sangon Bio., Shanghai, China). The resulting fragments from Sanger sequencing were assembled using the SeqMan program in Lasergene 7.1 (DNASTAR, Inc., Madison, WI, USA). Multiple sequence alignment using the MegAlign program in Lasergene 7.1 showed that this 3.0-kb fragment was identical to the genomic sequence assembled from the two overlapping fragments, suggesting that the amplified 3.0-kb fragment represents the full genome of the new geminivirus. We also performed RT-PCR to test for the presence of BCMV, BBWV-2, and AMV in DN-18. None of the three viruses was detected in this sample, which showed severe stunt and leaf curling symptoms.
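The read-length criterion applied to the sRNA library (17 to 27 nt) can be illustrated with a minimal Python sketch. This is not the Cutadapt workflow used in the study; the function names and the toy reads are hypothetical, and the length bounds are assumed to be inclusive.

```python
from collections import Counter


def filter_by_length(reads, min_len=17, max_len=27):
    """Keep reads whose length lies in [min_len, max_len].

    Assumption: both bounds are inclusive.
    """
    return [read for read in reads if min_len <= len(read) <= max_len]


def length_distribution(reads):
    """Count how many reads there are of each length."""
    return Counter(len(read) for read in reads)


# Toy reads (hypothetical, for illustration only): lengths 16, 17, 21, 27, 28.
example_reads = ["A" * 16, "C" * 17, "G" * 21, "T" * 27, "A" * 28]

kept_reads = filter_by_length(example_reads)
kept_lengths = length_distribution(kept_reads)
```

In this toy input, only the reads of 17, 21, and 27 nt pass the filter; the reads of 16 and 28 nt are discarded.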

Fig. 1
Symptoms, genome structure, and phylogeny of CBCSV. A. Symptoms of CBCSV infection in a common bean plant. B. PCR amplification of the complete genome. M, DNA marker. Lanes 1 and 2 are healthy and symptomatic common bean leaves, respectively. C. Schematic diagram of the CBCSV genome. D. Phylogenetic tree based on full genome sequences of CBCSV and selected geminiviruses. The tree was constructed using the maximum-likelihood (ML) method in MEGA X [13] with 1000 bootstrap replicates and the Kimura 2-parameter nucleotide substitution model, which was selected using the Model Selection function in MEGA X. E. Recombination events in the CBCSV genome detected by RDP 4 software [12] using default parameters. Only recombination events with a p-value < 0.01 were accepted, and the two recombination detection methods with the lowest p-values are shown.

The full genome of this geminivirus comprises 2,959 nt (GenBank accession no. MK673513) and has the highest nt sequence identity (55%) to SeYMV isolate IR/Jir/JK_10-2/14 (Table 1). The origin-of-replication sequence of this geminivirus is identical to the conserved nonanucleotide motif of the majority of geminiviruses (TAATATT/AC). Sequence analysis using Lasergene SeqBuilder 7.1.0 (DNASTAR, Inc., Madison, WI, USA) revealed three ORFs in the virion sense, namely V1 (nt 508-1272), V2 (nt 267-590), and V3 (nt 193-417), and four ORFs in the complementary sense, namely C1 (nt 2818-1715), C2 (nt 1848-1429), C3 (nt 1706-1308), and C4 (nt 2658-2401). C1 and V3 are separated by a 333-nt intergenic region (IR) containing the conserved nonanucleotide motif, whereas C3 and V1 are separated by a 35-nt IR (Fig. 1C). V1 encodes a 254-aa coat protein (CP) that shows very limited similarity to those of other geminiviruses (Table 1). The V2 protein is homologous to that of turncurtoviruses, whereas no homologous counterpart of V3 was found in the GenBank database. C1 encodes a 367-aa replication-associated protein (Rep) that has the highest aa sequence identity (65.5%) to that of beet severe curly top virus (BSCTV) (Table 1). C2, C3, and C4 are homologous to those of curtoviruses on the basis of BLASTp analysis [11]. This geminivirus is located on a separate branch adjacent to the clade of turncurtoviruses in phylogenetic trees constructed based on the full genome sequences (Fig. 1D) or the aa sequences of the Rep proteins (Supplementary Fig. 1B) of representative geminiviruses. Interestingly, this geminivirus forms a distinct clade in the CP-based phylogenetic tree (Supplementary Fig. 1A). 
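The ORF coordinates reported above can be cross-checked against the stated protein and intergenic-region lengths with a short Python sketch. The function names are hypothetical, and the calculation assumes that each coordinate range is inclusive and contains one stop codon.

```python
GENOME_LENGTH = 2959  # nt, as reported for the full genome

# (start, end) coordinates in nt, as listed in the text.
# Complementary-sense ORFs are written with start > end.
ORFS = {
    "V1": (508, 1272),
    "V2": (267, 590),
    "V3": (193, 417),
    "C1": (2818, 1715),
    "C2": (1848, 1429),
    "C3": (1706, 1308),
    "C4": (2658, 2401),
}


def orf_length(start, end):
    """Length of an ORF in nt, treating both coordinates as inclusive."""
    return abs(end - start) + 1


def protein_length(start, end):
    """Number of amino acids, assuming the ORF includes one stop codon."""
    return orf_length(start, end) // 3 - 1


def circular_gap(left_end, right_start, genome_length=GENOME_LENGTH):
    """Nucleotides strictly between two positions on a circular genome.

    The gap is measured from left_end towards increasing coordinates,
    wrapping past the last nucleotide if necessary.
    """
    return (right_start - left_end - 1) % genome_length


cp_length = protein_length(*ORFS["V1"])    # coat protein (V1)
rep_length = protein_length(*ORFS["C1"])   # Rep (C1)
large_ir = circular_gap(ORFS["C1"][0], ORFS["V3"][0])  # between C1 and V3
small_ir = circular_gap(ORFS["V1"][1], ORFS["C3"][1])  # between V1 and C3
```

Under these assumptions, the coordinates reproduce the values given in the text: 254 aa for the CP, 367 aa for Rep, a 333-nt IR between C1 and V3, and a 35-nt IR between C3 and V1.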
A recombination analysis was performed using this geminivirus and the most closely related geminiviruses identified in the NCBI database by PSI-BLAST (GenBank accession nos. KC108902, MF536416, KT388064, EU921828, KX529650, AF379637, KX867037, MH595452, MH595454, MH595453, U02311, and X97203) with the detection methods RDP, Chimaera, BootScan, 3Seq, GENECONV, MaxChi, SiScan, and LARD in RDP 4 software [12]. The results showed that the N-terminal portion of C1 (nt 2344-2829) was possibly acquired by recombination from TCTV isolate IR:Lap:L2-7:Jim:13 (GenBank accession no. MF536416), the middle part of C1 (nt 1915-2165) was possibly derived from BSCTV isolate CFH (GenBank accession no. X97203), and the region upstream of V3 (nt 29-136) was possibly acquired by recombination from BCTV isolate CTS07-043 (GenBank accession no. KX867037) (Fig. 1E). These data indicate that this geminivirus is a recombinant virus that is phylogenetically related to turncurtoviruses, although its genome structure differs slightly from that of turncurtoviruses (i.e., it contains a putative V3 gene and a 35-nt IR between C3 and V1). Because this novel geminivirus has a unique genome organization and its genomic sequence is highly divergent from those of other geminiviruses, it is difficult to assign it to one of the nine current genera of the family Geminiviridae. We propose to name this novel geminivirus “common bean curly stunt virus” (CBCSV).

Table 1 Percent nucleotide and amino acid sequence identity between CBCSV and selected geminiviruses

Northeastern China is an important region for production of common bean crops. A survey has suggested that this virus is widely distributed in common bean plants in Heilongjiang province. Therefore, special attention should be paid to the damage that it may cause.