Members of six species of the genus Potyvirus infect sweet potato and cause significant yield losses in sweet potato production worldwide [1]. Sweet potato latent virus (SPLV), formerly designated as sweet potato virus N, was first reported from Taiwan [2]. The virus was then found in several other countries in Asia and Africa [3–6]. SPLV has flexuous, filamentous particles that are approximately 700-750 nm long, and it induces typical cylindrical inclusions in the cytoplasm of infected cells. The experimental host range of SPLV is wider than that of sweet potato feathery mottle virus (SPFMV), and SPLV induces symptoms on some Chenopodium and Nicotiana species [2, 7]. Similar to SPFMV and sweet potato virus G (SPVG), a single infection by SPLV is symptomless in most sweet potato cultivars, but co-infection with sweet potato chlorotic stunt virus (SPCSV) causes synergistic disease [8]. SPLV is serologically related to, but distinct from, SPFMV [7]. Comparison of 3’-partial sequences showed that SPLV is a member of a distinct species in the genus Potyvirus [9]. In this study, the complete genomic sequence of SPLV was determined, and phylogenetic analysis clearly indicated that SPLV is a distinct potyvirus that is most closely related to sweet potato virus 2 (SPV2) and other viruses in the SPFMV lineage.

The SPLV isolate used in this study was originally collected in Taiwan in the 1970s from a diseased sweet potato plant (cv. Tainung 63) co-infected with SPFMV [2]. The infected sweet potato plant was transferred from Taiwan to the International Potato Center (CIP) in the 1980s. SPLV was transmitted to Nicotiana clevelandii by mechanical inoculation and maintained in this experimental host in the same way. Leaf tissue from an infected N. clevelandii plant was obtained under a USDA-APHIS permit and used to inoculate N. benthamiana seedlings. The inoculated N. benthamiana plants were tested by RT-PCR for SPFMV, sweet potato virus C (SPVC), SPVG, SPV2, SPLV and sweet potato mild mottle virus (SPMMV), and the results showed that the plants were infected only with SPLV. Total nucleic acids were extracted from an infected N. benthamiana plant by the CTAB method [10]. RT-PCR was performed using a SuperScript™ III One-Step RT-PCR System (Invitrogen, Carlsbad, CA, USA) following the manufacturer’s instructions.

The complete genomic sequence of SPLV was determined by the genome-walking strategy described for ApVY [11], in a series of sequential RT-PCR cloning steps. The 3’-terminal sequence of approximately 1.7 kb was obtained by RT-PCR using an SPLV-specific forward primer designed based on the 3’-partial sequence of SPLV-CY (EF492050) and an oligo(dT) primer. The 3’-terminal sequence was extended to 2950 bp using the degenerate forward primer PotyF1 and an SPLV-specific primer based on the sequence obtained in this study. Sequences of partial CI and HC-Pro fragments were obtained from RT-PCR products using two pairs of degenerate primers (CIFor/CIRev and HPFor/HPRev, respectively) described by Ha et al. [12]. The HC-Pro–NIb sequence of 5.0 kb was determined from two overlapping RT-PCR clones using SPLV-specific primers based on the sequences of the three fragments (HC-Pro, CI and 3’ terminus) obtained during this study. The 5’-terminal region of 3.3 kb was amplified by RT-PCR using an SPLV-specific reverse primer and a 5’-end forward primer, Poty5endF1 (5’-GTTTTCCCAGTCACGACAAAATATAAAAACTCAACA-3’), based on the 5’-end sequences of some potyviruses. To obtain an accurate 5’-end sequence, a 5’ RACE reaction was conducted using a GeneRacer® Kit (Invitrogen) according to the manufacturer’s instructions. The complete genomic sequence of SPLV was confirmed using nine overlapping fragments obtained by RT-PCR with SPLV-specific primers and another 5’ RACE assay.

Contig assembly was performed using the DNAStar 5.01 package (DNAStar Inc., Madison, WI, USA). The putative cleavage sites in the SPLV polyprotein sequence and conserved domains of the putative functional proteins were determined by comparison of its polyprotein sequence with those of representative potyviruses. Pairwise comparisons of nucleotide sequences, deduced polyprotein sequences and individual protein sequences were performed using the EMBOSS Needle Pairwise Sequence Alignment tool at http://www.ebi.ac.uk/Tools/psa/emboss_needle/nucleotide.html. Phylogenetic analysis of the complete genomic sequences of SPLV and 82 other potyviruses was conducted using MEGA 5, and neighbor-joining and maximum-likelihood trees were constructed using 1000 bootstrap replicates. Putative recombination breakpoints were identified using the RDP4 Beta 4.16 program (http://darwin.uvigo.es/rdp/rdp.html).
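The pairwise identities reported in this study come from global alignments produced by EMBOSS Needle. As an illustration only, the following minimal Python sketch shows how a percent identity can be derived from a Needleman-Wunsch global alignment. The function names and the simple scoring scheme (match +1, mismatch -1, linear gap penalty -2) are assumptions made for this example; they are not the substitution matrices or affine gap penalties used by EMBOSS Needle, so the values will not reproduce those in this study.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings.

    Illustrative scoring only (linear gap penalty), not EMBOSS Needle defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring the diagonal move
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    x, y = needleman_wunsch(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "ACT"))
    print(percent_identity("ACGT", "ACT"))
```

The denominator here is the full alignment length, gaps included, which is how the identity line of a Needle report is expressed. This quadratic-memory version is meant for short toy sequences, not for 10-kb genomes.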

The complete genomic sequence of SPLV is 10081 nucleotides (nt) long, excluding the 3’ poly(A) tail, and has been submitted to the GenBank database under accession number KC443039. Its genomic organization is typical of potyviruses, containing a single open reading frame (ORF) (Fig. 1). The putative ORF starts at the first in-frame AUG (nt 144-146) and ends with a UAG termination codon at nt 9885-9887. It encodes a polyprotein of 3247 amino acids (aa) with a calculated molecular weight of 367.8 kDa. Nine putative protease cleavage sites were identified in the SPLV polyprotein, and these are predicted to give rise to ten putative mature proteins (P1, HC-Pro, P3, 6K1, CI, 6K2, VPg, NIa-Pro, NIb, and CP). Most motifs typical of potyviruses are found in the amino acid sequences of these mature proteins: 350H-8X-D-31X-G-X-S-G394 and 414F-I/V-V-R-G418 in P1; 623F-R-N-K626 and 786C-72X-H859 in HC-Pro; 1389G-A-V-G-S-G-K-S-T1397, 1409V-L-L-V/L-E-P-T-R-P-L1418 and 1478D-E-X-H1481 in CI; 2239H-34X-D-67X-G-X-C-G-14X-H2360 in NIa-Pro; 2682C-D-A-D-G-S2687 and 2746S-G-3X-T-3X-N-T-30X-G-D-D2789 in NIb [13–15]. In the CP, the D-A-G motif, which is involved in aphid transmission [15], is replaced by 2961D-T-G2963. This substitution occurs in only five other isolates of SPLV (accession nos. HQ844128, HQ844129, HQ844131, HQ844147, X84012); the majority of the SPLV isolates reported to date have the D-A-G motif in the CP. The substitution of threonine in the second position of the D-A-G motif reduces aphid transmissibility in tobacco veinal mosaic virus [16]. The lysine in the first position of K-I-T-C, another aphid transmissibility motif located in HC-Pro [15], is replaced by glutamic acid (494E-I-T-C497) in SPLV. The effect of the changes in these two motifs on aphid transmission of this SPLV isolate is not known. A putative P3N-PIPO ORF was identified that includes the frameshift sequence 3311GAAAAAA3317 [17].
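The ORF coordinates given above can be checked against the reported polyprotein length with simple arithmetic. The short Python sketch below does this; the function name `orf_summary` and the returned field names are hypothetical, and the untranslated-region lengths it reports are derived from the stated coordinates rather than quoted from the text.

```python
def orf_summary(start_nt, stop_codon_end_nt, genome_len):
    """Derive ORF and UTR lengths from 1-based, inclusive genome coordinates.

    start_nt          : first nucleotide of the start codon
    stop_codon_end_nt : last nucleotide of the termination codon
    genome_len        : genome length excluding the poly(A) tail
    """
    orf_nt = stop_codon_end_nt - start_nt + 1
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_nt // 3
    return {
        "orf_nt": orf_nt,
        "polyprotein_aa": codons - 1,  # the stop codon encodes no residue
        "utr5_nt": start_nt - 1,
        "utr3_nt": genome_len - stop_codon_end_nt,
    }


if __name__ == "__main__":
    # Coordinates reported for SPLV: AUG at nt 144-146, UAG at nt 9885-9887
    print(orf_summary(144, 9887, 10081))
```

With the SPLV coordinates, the ORF spans 9744 nt, i.e. 3248 codons including the termination codon, which agrees with the reported polyprotein of 3247 aa.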

Fig. 1 Schematic representation of the genome organization of sweet potato latent virus. The 5’- and 3’-untranslated regions are represented by solid lines, and the open reading frame (ORF) is depicted by an open box with solid lines. The location of each putative cleavage site on the polyprotein is marked by an arrow, and the amino acid residues around each cleavage site are listed at each arrow. The name and size of each protein are given inside the corresponding mature protein, and the number above each cleavage site indicates the position of the first amino acid residue of the mature protein in the polyprotein sequence.

A BLAST search of the GenBank database shows that the polyprotein sequence of SPLV is most closely related to that of SPV2. Pairwise comparisons with 82 other members of the genus Potyvirus show that SPLV shares 51.0 % (onion yellow dwarf virus, OYDV) to 56.3 % (SPV2) identity at the whole-genome nucleotide sequence level and 39.6 % (narcissus degeneration virus) to 49.8 % (SPV2) identity at the polyprotein sequence level. These values fall below the species demarcation thresholds [18], confirming that SPLV is a member of a distinct species in the genus Potyvirus. Detailed comparisons of individual protein sequences show that SPLV has the highest degree of identity to viruses in the SPFMV lineage (SPFMV, sweet potato virus C, SPVG and SPV2) [19] for HC-Pro, P3, 6K1, CI, 6K2, NIa and NIb, reflecting the relatedness of the whole genome. The aa sequences of the P1, VPg and CP of SPLV share the highest degree of identity with those of Brugmansia suaveolens mottle virus, Moroccan watermelon mosaic virus and pepper veinal mottle virus, respectively.

Despite similarities in the genomic and aa sequences of seven central proteins (HC-Pro–NIb), SPLV differs from viruses of the SPFMV lineage, especially in the P1 region. The P1 region of SPLV does not contain the third ORF encoding the putative PISPO unique to the SPFMV lineage [19]. The SPLV P1 protein of 442 aa is shorter than those of viruses in the SPFMV lineage (618-724 aa) due to four deletions in the N-terminal and central regions. The P1 protein also lacks the WG/GW(Y) motifs and the zinc-finger-like structure unique to the SPFMV lineage and sweet potato mild mottle virus (SPMMV). The sizes of all individual mature proteins of SPLV are similar to those of most potyviruses. The sequence identities of these proteins between SPLV and the viruses of the SPFMV lineage are also similar to those among most members of the genus Potyvirus.

A phylogenetic analysis was initially conducted using the deduced polyprotein sequences of SPLV and 82 other members of the genus Potyvirus. Both neighbor-joining and maximum-likelihood trees clearly place SPLV on a distinct branch adjacent to the SPFMV lineage in the PVY group [19, 20]. The tree was then simplified to include representative members of five groups of the genus Potyvirus (Fig. 2). With the exception of P1 and CP, a similar clustering of SPLV and the SPFMV lineage is maintained in phylogenetic trees based on the other proteins and on the complete genomic sequences of all available potyviruses (data not shown). The results indicate that SPLV and the SPFMV lineage may share a common ancestor, with the separation of SPLV from the SPFMV lineage occurring before speciation of the viruses within that lineage. However, how and where these potyviruses have evolved is not clear, since they are very closely related to the viruses of subgroup 1 in the PVY group, which have evolved in the Asia-Mediterranean-Europe region. Furthermore, the distribution of SPLV is mainly restricted to Asian production areas [6, 7], suggesting that the virus might have evolved there. Analysis of the complete genomic sequences of SPLV, four viruses in the SPFMV lineage, and SPMMV using the RDP4 program did not detect any recombination breakpoints in SPLV (data not shown).

Fig. 2 Maximum-likelihood tree based on the deduced polyprotein sequences of sweet potato latent virus and representative members of the genus Potyvirus. Bootstrap analysis was performed with 1000 replicates. The scale bar represents a genetic distance of 0.2. Sugarcane streak mosaic virus, a member of the genus Poacevirus, was used as an outgroup.

Phylogenetic analysis based on the CP aa sequences of this isolate and 30 other isolates of SPLV available in GenBank (27 from China, two from Taiwan and one from Japan) showed that this Taiwan isolate is most closely related to the two other Taiwan isolates (Taiwan [accession no. X84012] and CY [EF492090]) and that all three are placed in the major cluster [7]. Comparison of the 3’ partial sequences of 1076 nt showed that there were only 14 nt differences (1.3 %) between the two Taiwan isolates. This difference is very similar to that observed among different clones from a given region (approximately 1 %). Therefore, the two Taiwan isolates might have come from the same source and represent the same isolate. Serial passage of this isolate in tobacco plants through continuous mechanical inoculation might not have had much effect on the viral population structure. This study provides the first report of the complete genomic sequence of SPLV and will assist in the development of molecular diagnostic tests and effective disease management strategies.
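The 1.3 % figure quoted above is the number of differing positions divided by the length of the compared region (14/1076). As a small illustration, the Python sketch below counts mismatches between two pre-aligned sequences of equal length and converts the count to a percentage. The function name and the toy sequences are hypothetical and stand in for the actual isolate sequences, which are not reproduced here.

```python
def percent_difference(seq_a, seq_b):
    """Return (mismatch count, percent difference) for two aligned sequences.

    Both sequences must already be aligned and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return mismatches, 100.0 * mismatches / len(seq_a)


if __name__ == "__main__":
    # Toy stand-ins: 1076 positions, 14 of which differ
    a = "A" * 1076
    b = "C" * 14 + "A" * 1062
    count, pct = percent_difference(a, b)
    print(count, round(pct, 1))
```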