Introduction

Tobacco vein banding mosaic virus (TVBMV) was first discovered in Taiwan in 1964 [1] and confirmed to be a distinct species of the largest plant virus genus, Potyvirus, in 1994 [2, 3]. TVBMV was once a major threat to tobacco production in some regions, including Taiwan and the northern USA, but occurred only rarely in most parts of China [4–6]. However, TVBMV has been detected frequently in tobacco plants in Shandong, Henan, Anhui, and Yunnan provinces in surveys conducted in the past few years [7]. Although TVBMV has long been known, it has not been studied extensively at the molecular level. The 3′-proximal genomic sequences of 12 TVBMV isolates were recently determined and the coat protein (CP)-encoding sequences were subjected to phylogenetic analysis, which revealed three groups of isolates in accordance with their geographical origins [7]. However, the complete genomic sequence of TVBMV has not been reported. Here, we report the first complete genomic sequence of TVBMV and compare it with those of other potyviruses.

Materials and methods

Virus isolate and host range test

The TVBMV isolate, TVBMV-YND, was obtained from a flue-cured tobacco plant (Nicotiana tabacum) in Yunnan, China. It was mechanically inoculated to the local lesion host, Chenopodium amaranticolor, and then transferred from a single lesion to N. tabacum, in which the isolate was maintained in the greenhouse at 22–27°C [7]. To determine its host range, the isolate was mechanically transmitted to a set of test plants. Infection was confirmed by indirect plate-trapped-antigen ELISA [8] and reverse transcription polymerase chain reaction (RT-PCR) [7].

Aphid transmissibility

Transmissibility of TVBMV-YND by aphids was tested with virus-free apterous adults of Aphis gossypii and Myzus persicae. The aphids were fasted for 2 h, allowed to acquire the virus from TVBMV-YND-infected N. tabacum plants for 1–2 min, and then transferred to healthy N. tabacum plants for 5 min of inoculation feeding.

Cloning of complete TVBMV-YND genome

Total plant RNA was extracted from infected tobacco leaves using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany), and first-strand cDNA was synthesized using Moloney murine leukemia virus (M-MLV) reverse transcriptase (Promega, Madison, WI, USA) according to the manufacturer’s instructions. For cDNA synthesis, the primer 5′-GGT CGA CTG CAG GAT CCA AGC (T)16-3′, complementary to the 3′-terminal poly(A) of the TVBMV-YND genome, was used [9]. Subsequently, the S primer (5′-GGN AAY AAY AGY GGN CAR CC-3′) and the primer mentioned above were used to amplify the first fragment. Primers pCI(+) (5′-GTN GGN TCN GGN AAN TCN AC-3′) and pP3new(+) (5′-CTN NTN TGY GAY AAY CRR YTN GA-3′) [10] were used sequentially in PCR, together with primers designed according to newly determined sequences, to amplify most of the viral genomic sequence. PCR reactions were carried out using the Expand™ Long Template PCR system (Roche Diagnostics GmbH, Mannheim, Germany) according to the manufacturer’s protocols. The 5′-proximal end of the genomic sequence was determined by 5′ Rapid Amplification of cDNA Ends (RACE) using the SMART RACE cDNA amplification kit (Clontech, USA) following the manufacturer’s instructions. The amplified fragments were purified using the TaKaRa agarose gel DNA purification kit (TaKaRa Biotechnology Dalian Co, Ltd) and then cloned into the pMD18-T vector (TaKaRa Biotechnology Dalian Co, Ltd). For each fragment, two clones from independent PCR reactions were sequenced. If the two sequences differed at any position, at least one more clone was sequenced to decide the base at that position and to obtain the consensus sequence.
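The three degenerate primers above use IUPAC ambiguity codes (N, Y, R), so each one stands for a pool of distinct oligonucleotides. As an illustration only, the following minimal Python sketch counts the size of each pool; the function name `degeneracy` and the dictionary layout are our own choices and are not part of the original protocol.

```python
from math import prod

# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return prod(len(IUPAC[base]) for base in bases)


# Primer sequences as given in the text (5' to 3', spaces are ignored).
primers = {
    "S": "GGN AAY AAY AGY GGN CAR CC",
    "pCI(+)": "GTN GGN TCN GGN AAN TCN AC",
    "pP3new(+)": "CTN NTN TGY GAY AAY CRR YTN GA",
}

for name, sequence in primers.items():
    print(name, degeneracy(sequence))
```

The count is simply the product of the number of bases allowed at each position, so a primer with six N positions, such as pCI(+), encodes 4^6 sequences.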

Sequence and phylogenetic analysis

Complete genomic sequences of the 26 most closely related potyviruses used for comparative analysis were retrieved from GenBank (NCBI): Bean yellow mosaic virus (BYMV, AB079886), Chilli veinal mottle virus (ChiVMV, AJ237843), Clover yellow vein virus (ClYVV, AB011819), Daphne virus Y (DVY, DQ299908), Japanese yam mosaic virus (JYMV, AB016500), Konjac mosaic virus (KoMV, AB219545), Leek yellow stripe virus (LYSV, AB194621), Lettuce mosaic virus (LMV, AJ278854), Lily mottle virus (LMoV, AJ564636), Narcissus yellow stripe virus (NYSV, AM158908), Papaya leaf distortion mosaic virus (PLDMV, AB088221), Papaya ringspot virus (PRSV, AY162218), Pepper mottle virus (PepMoV, AB126033), Peru tomato mosaic virus (PTV, AJ437280), Plum pox virus (PPV, AJ243957), Potato virus A (PVA, AJ131400), Potato virus V (PVV, AJ243766), Potato virus Y (PVY, AF237963), Scallion mosaic virus (ScaMV, AJ316084), Sweet potato feathery mottle virus (SPFMV, D86371), Thunberg fritillary mosaic virus (TFMV, AJ851866), Tobacco etch virus (TEV, M11458), Tobacco vein mottling virus (TVMV, U38621), Turnip mosaic virus (TuMV, AB105134), Wild potato mosaic virus (WPMV, AJ437279), and Yam mosaic virus (YMV, U42596). Sequence analyses and comparisons were performed using the DNAMAN program (Lynnon Biosoft, Quebec, Canada). Phylogenetic relationships were determined by the neighbor-joining (NJ), maximum parsimony (MP), and minimum evolution (ME) methods, all of which are implemented in MEGA3.1 [11]. Bootstrap analysis with 1,000 replicates was performed to evaluate the significance of the internal branches.

Results and discussion

Host range and symptoms

TVBMV-YND produced local lesions in both C. amaranticolor and C. quinoa, and systemic symptoms in N. benthamiana, N. tabacum Samsun, Lycopersicon esculentum, and Datura metel. However, infection of Pisum sativum, Vicia faba, and Amaranthus retroflexus was not detected.

Transmissibility of TVBMV-YND

Both A. gossypii and M. persicae transmitted TVBMV-YND with high efficiency: 90% (18 of 20 inoculated plants infected) and 94.4% (17 of 18), respectively.

Genomic sequence of TVBMV-YND

The genomic RNA of TVBMV-YND was 9,570 nucleotides (nt) in length, excluding the poly(A) tail at the 3′-end. The sequence was deposited in the GenBank database under accession number EF219408. The base composition of the viral genomic RNA was adenine 31.69%, cytosine 19.02%, guanine 22.84%, and uracil 26.45%, which is similar to those of other potyviruses. The first in-frame translation start codon AUG (nt 147–149) was situated in the appropriate context of CACAAUGGC [12]. The −3 and +4 nucleotides were both purines, which might be required for initiation of translation [13]. The translation termination codon (UGA) was located at positions 9,384–9,386. Hence, the open reading frame of TVBMV-YND was predicted to encode a polyprotein of 3,079 amino acids (Mr 348.6 kDa). The 5′-untranslated region (5′-UTR) of TVBMV-YND was 146 nt long. It contained several CAA repeats, a feature described for the Tobacco mosaic virus (TMV) 5′-leader sequence and associated with translation enhancement [14]. Two highly conserved potyboxes, ‘a’ (A-X-ACAACAU) and ‘b’ (UCAAGCA), exist in the 5′-UTR of many potyviruses [15]. However, no potybox ‘a’ was found in TVBMV-YND. Instead, TVBMV-YND had two potybox ‘b’ motifs located at positions 23–29 and 80–86. The 3′-UTR, whose secondary structure might be involved in potyviral genome replication [16], was 184 nt long and AU-rich (58.3%).
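The lengths quoted in this paragraph follow from the reported coordinates by simple arithmetic. The short Python sketch below reproduces that arithmetic as a consistency check; the variable names are ours, and only the numbers stated in the text are used.

```python
# Coordinates reported for TVBMV-YND (1-based nucleotide positions).
GENOME_LENGTH = 9570    # nt, excluding the poly(A) tail
START_CODON = 147       # first nt of the initiating AUG
STOP_CODON_END = 9386   # last nt of the terminating UGA

five_utr = START_CODON - 1                     # nt upstream of the AUG
orf_nt = STOP_CODON_END - START_CODON + 1      # ORF length including the stop codon
polyprotein_aa = orf_nt // 3 - 1               # codons minus the stop codon
three_utr = GENOME_LENGTH - STOP_CODON_END     # nt downstream of the stop codon

# Base composition (%) as reported; the four values should sum to 100.
base_percent = {"A": 31.69, "C": 19.02, "G": 22.84, "U": 26.45}
composition_total = round(sum(base_percent.values()), 2)

# Start codon context: the -3 and +4 positions relative to the A of AUG (+1).
context = "CACAAUGGC"
aug_index = context.index("AUG")
minus3 = context[aug_index - 3]
plus4 = context[aug_index + 3]
both_purines = minus3 in "AG" and plus4 in "AG"

print(five_utr, orf_nt, polyprotein_aa, three_utr)
print(composition_total, minus3, plus4, both_purines)
```

The ORF length is a multiple of three, and the derived 5′-UTR, polyprotein, and 3′-UTR lengths agree with the values given above.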

Conserved motifs of TVBMV-YND and other potyviruses

Comparison of the coding region of TVBMV-YND with those of other known potyviruses revealed nine putative proteinase cleavage sites (Fig. 1). The cleavage site at the C-terminus of P1 was Y/S, while that of HC-Pro was G/G in the conserved motif KXYXVG/G [17]. The consensus cleavage site for the potyviral proteinase NIa might be X-X-(V/G)-(X)2-(Q/E)/(S/G/A) [17, 18]. Most of the putative NIa cleavage sites of TVBMV-YND were VXXQ/S. As in other potyviruses, the cleavage site between NIa-VPg and NIa-Pro was VXXE/A instead of Q/A and may be cleaved less efficiently. Slower cleavage at the VPg/Pro junction is important for viral infectivity, as found with Tobacco etch virus [13, 17, 18]. However, the cleavage site for TVBMV-YND NIb/CP was Q/N, which is rare among potyviruses [17]. We checked the amino acid sequences of 1,821 potyviral isolates whose NIb/CP-encoding sequences are available in GenBank and found the Q/N cleavage site only in four of the isolates of Zantedeschia mosaic virus (accession numbers AB181352, AB181353, AB181354, and AB251346), but not in all isolates of that virus.
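For illustration, the proposed NIa consensus X-X-(V/G)-(X)2-(Q/E)/(S/G/A) can be written as a regular expression and used to list candidate sites in a protein sequence. The Python sketch below is only a rough screen under that assumption: the function name `candidate_nia_sites` is ours, the test peptide is hypothetical rather than TVBMV sequence, and a match does not imply that cleavage actually occurs (the Q/N site at NIb/CP, for example, falls outside this consensus).

```python
import re

# Consensus X-X-(V/G)-(X)2-(Q/E)/(S/G/A), written as a 7-residue window.
# A lookahead is used so that overlapping candidate sites are all reported.
NIA_SITE = re.compile(r"(?=(..[VG]..[QE][SGA]))")


def candidate_nia_sites(protein: str) -> list:
    """Return 1-based positions of the residue (Q or E) preceding each candidate cleavage."""
    # m.start() is the 0-based index of the window; the Q/E is its sixth residue.
    return [m.start() + 6 for m in NIA_SITE.finditer(protein.upper())]


# Hypothetical peptide used only to exercise the pattern.
toy_peptide = "MKAVTHQSGL"
print(candidate_nia_sites(toy_peptide))
```

For the toy peptide the single window KAVTHQS matches, so the function reports position 7, meaning a candidate cleavage between residues 7 (Q) and 8 (S).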

Fig. 1

Schematic representation of the genomic structure of Tobacco vein banding mosaic virus (TVBMV) and the predicted proteolytic cleavage sites of the TVBMV polyprotein. Numbers above each part of the polyprotein indicate the total number of amino acids of the mature protein. Numbers below indicate the position at which each mature protein begins. The conserved amino acids in the proteolytic cleavage sites are in bold. P1, the first protein; HC-Pro, helper component-proteinase; P3, the third protein; 6K1 and 6K2, 6 kDa protein 1 and 2; CI, cytoplasmic inclusion protein; NIa-VPg, nuclear inclusion protein a (viral genome-linked protein); NIa-Pro, 49 kDa proteinase; NIb, nuclear inclusion protein b; CP, coat protein

In the serine proteinase P1, the conserved motifs GMSG [19] and FIVRG [20], which represent the active sites responsible for autoproteolytic activity, were located at positions 257–260 and 280–284 of the amino acid (aa) sequence, respectively. In HC-Pro, the putative zinc finger metal-binding motif C-X7-C-X5-CC-X3-C-X2-C, which is involved in aphid transmission [21], was found at polyprotein positions 344–366. TVBMV-YND had an RITC motif rather than KITC; according to the data of this study, RITC also supports aphid transmissibility. The first amino acid of RITC, arginine (R), is positively charged, like lysine (K). The RITC motif has also been found in Bean yellow mosaic virus (BYMV), Chilli veinal mottle virus (ChiVMV), Clover yellow vein virus (ClYVV), Lily mottle virus (LMoV), and Pea seed-borne mosaic virus (PSbMV). Therefore, according to the results of this and previous studies [21–23], the so-called KITC motif should be represented as K(/R)I(/L)T(/S)C. The conserved motif FRNK, associated with symptom expression [24] and suppression of RNA silencing [25], was present in HC-Pro at aa positions 489–492. The motifs CCC and PTK, which are involved in aphid transmission and virus movement [26], were found at aa positions 600–602 and 618–620 of the polyprotein, respectively. However, the less-conserved motif IGN in the central region of HC-Pro, which is essential for genome amplification [27], was replaced by IQN (aa 558–560) in TVBMV-YND. The residues of the putative proteinase active site were C652-(X)72-H725, in which C was in the GYCY motif (aa 650–653) conserved in most potyviruses. In P3, the conserved residues EPY-(X)7-SP-(X)2-L, which have been suggested to be involved in regulating proteolytic processing of the potyviral polyprotein, as in Cowpea mosaic virus (CPMV) [28], were found at aa positions 799–813 of the putative polyprotein. In the RNA helicase CI, the NTP-binding motifs GXXGXGKS and VEPTRPL [29] were found at positions 1,251–1,258 and 1,274–1,280, respectively.
The conserved amino acids DECH (positions 1,340–1,343), LKVSATPP (1,366–1,373), LVYV (1,418–1,421), VATNIIENGVTL (1,469–1,480), and GERIQRLGRVGR (1,513–1,524), which are characteristic of helicase proteins and might be involved in ATP hydrolysis, RNA binding, or unwinding [17, 28, 30], were found in the CI of TVBMV-YND. In NIa-VPg, the conserved tyrosine residue (Y) that is required for linking VPg to potyviral RNA was found in the context MY1927GF [31]. In NIa-Pro, the motif H-(X)2-D-(X)3-GHCG, which is responsible for the proteinase activity [32], was identified at positions 2,195–2,205. In NIb, the conserved residues SIKAEL, necessary for RNA polymerase activity, and ADGSRFD, for NTP binding, were located at positions 2,464–2,469 and 2,541–2,547, respectively. The conserved amino acids FDSS at positions 2,546–2,549 of the polyprotein were located 263 aa upstream of the putative NIb/CP cleavage site [33]. The conserved motif (S/T)G-(X)3-T(X)3-N(S/T)(X)18–37GDD, which is necessary for RNA polymerase activity and NTP binding, began at position 2,604. In CP, the DAG motif that interacts with PTK of HC-Pro to regulate potyvirus transmission by aphids [34] was located at positions 2,815–2,817. The three consensus motifs found in the CP of potyviruses [33] were also found in TVBMV-YND (MVWCIENGTSP, 2,928–2,938; AFDF, 3,011–3,014; and QMKAAA, 3,031–3,036).
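Because each motif is reported together with its polyprotein coordinates, the motif length and the coordinate span can be cross-checked. The Python sketch below shows one way to do this for a few of the motifs listed above. It assumes a simplified notation in which elements are separated by hyphens and `Xn` denotes n unspecified residues (so (X)7 in the text is written X7); the helper names `motif_length` and `span` are ours.

```python
import re


def motif_length(motif: str) -> int:
    """Length in residues of a motif such as 'C-X7-C-X5-CC-X3-C-X2-C'."""
    total = 0
    for token in motif.split("-"):
        run = re.fullmatch(r"X(\d+)", token)
        # 'Xn' is a run of n unspecified residues; any other token is literal.
        total += int(run.group(1)) if run else len(token)
    return total


def span(start: int, end: int) -> int:
    """Number of residues covered by an inclusive 1-based coordinate range."""
    return end - start + 1


# (motif, first position, last position) as reported in the text.
checks = [
    ("C-X7-C-X5-CC-X3-C-X2-C", 344, 366),   # HC-Pro zinc finger motif
    ("EPY-X7-SP-X2-L", 799, 813),           # P3
    ("GXXGXGKS", 1251, 1258),               # CI NTP-binding motif
    ("H-X2-D-X3-GHCG", 2195, 2205),         # NIa-Pro
]

for motif, start, end in checks:
    print(motif, motif_length(motif), span(start, end))
```

A mismatch between the two numbers would indicate a transcription error in either the motif or its coordinates.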

Phylogenetic analysis of TVBMV-YND

To examine the relationship of TVBMV to other potyviruses, multiple aa sequence alignments and phylogenetic analyses of the complete genomic sequences were performed using the MEGA software package.

Phylogenetic analysis revealed that TVBMV-YND was most closely related to ChiVMV, a virus that has been reported in Asia and Africa and mainly infects solanaceous plants [35, 36] (Fig. 2). The 5′-UTR of TVBMV-YND was 28.0–43.3% identical to those of the 26 most closely related potyviruses analyzed (Table 1) and shared the highest identity with that of Tobacco etch virus (TEV). The 3′-UTR of TVBMV showed identities of 21.7–34.1% with those of other potyviruses, with the highest identity to that of Tobacco vein mottling virus (TVMV). The most variable region between TVBMV and other potyviruses was P1, with the highest identities of 35.5% to Lettuce mosaic virus (LMV, AJ278854) at the nt level and 19.3% to LMoV (AJ564636) at the aa level. The TVBMV P3 protein shared the highest identity with that of Plum pox virus (PPV; 43.4 and 30.8% at the nt and aa levels, respectively) (Table 1). The seven other proteins of TVBMV (HC-Pro, 6K1, CI, 6K2, VPg, NIa, and NIb) had the highest identities with those of ChiVMV (67.4, 66.0, 59.4, 63.3, 66.0, 59.9, and 66.5% identity at the aa level, respectively) (Table 1). The TVBMV CP shared the highest identities of 68.5 and 74.2% at the nt and aa levels, respectively, with that of Pepper veinal mottle virus (PVMV; complete genomic sequence not available). The aa sequence identities of the TVBMV CP with those of the other potyviruses analyzed in this study ranged from 55.8 to 71.9% (Table 1). The nt and aa identities in pair-wise comparisons with other potyviruses fell below the identity values used for potyvirus species demarcation [37]. The entire genomic sequence of TVBMV-YND had identities of 51.0–59.4% with those of the potyviruses analyzed, while the complete ORF sequences shared identities of 51.7–59.8% and 46.1–56.5% with these potyviruses at the nt and aa levels, respectively, with the highest identities to ChiVMV. The values fell into Cluster B [37], which confirmed that TVBMV is a distinct species of the genus Potyvirus.
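The identity values above are pairwise percentages computed from aligned sequences. As a generic illustration, the Python sketch below computes percent identity for two pre-aligned sequences. It is not the DNAMAN procedure used in this study: the treatment of gaps (columns with a gap in only one sequence count as mismatches, columns with gaps in both are ignored) is an assumption, and programs differ on this point, so values may not match Table 1 exactly.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    # A gap opposite a residue never compares equal, so it counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments used only to exercise the function.
aligned_a = "ACGT-A"
aligned_b = "ACCTTA"
print(round(percent_identity(aligned_a, aligned_b), 1))
```

In the toy example, four of the six alignment columns are identical, giving 66.7%.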

Fig. 2

A phylogenetic tree constructed from the polyprotein-encoding sequences of TVBMV and 26 related potyviruses. Sequences were aligned with Clustal W and the tree was constructed with the neighbor-joining algorithm, both as implemented in the MEGA3.1 package. The data set was subjected to 1,000 bootstrap replicates. Bootstrap values higher than 80 are indicated at the corresponding branches

Table 1 Percent nucleotide (left) and amino acid (right) sequence identities of TVBMV-YND with the most related potyviruses

In conclusion, this is the first report of the complete nucleotide sequence and genome structure of TVBMV. The data clearly support the status of TVBMV as a distinct species in the genus Potyvirus. Furthermore, TVBMV was found to have some unique characteristics in its genome and deduced polyprotein sequences, which provide new information about the variability of potyviral genomes. In the phylogenetic tree constructed with CP-encoding sequences in a previous study, TVBMV-YND formed a separate cluster with YN9 [7]. They shared some specific amino acids, including the unique NIb/CP cleavage site of Q/N. It would be interesting to determine the complete genomic sequence of a TVBMV isolate that belongs to the other cluster and to explore whether isolates in that cluster contain the RITC motif in HC-Pro.