Introduction

Sacred lotus (Nelumbo nucifera Gaertn.) is an aquatic herb crop of considerable agricultural, religious, ornamental, and medical importance (Wang et al. 2013a, b; Guo 2009; Liu et al. 2016). Lotus belongs to the small plant family of Nelumbonaceae (Tian et al. 2008; Kubo et al. 2009). In terms of morphological differences and molecular markers, lotuses have been divided into rhizome lotus, seed lotus, and flower lotus (Liu et al. 2016; Fu et al. 2011; Hu et al. 2012). Lotus has been cultivated widely in many countries around the world, such as China, Japan, India, Thailand, South Korea, America and Australia. China is considered to be one of the cultivation centers of lotus (Guo 2009). In 2012, rhizome lotus and seed lotus were cultivated across 660,000 and 67,000 ha of land, respectively (Liu et al. 2016). Sacred lotus is known to be infected with, or susceptible to, several plant pathogens such as Curvularia lunata (Cui and Sun 2012), Fusarium tricinctum (Li et al. 2016), Phytopythium helicoides (Yin et al. 2016), cucumber mosaic virus (CMV) and dasheen mosaic virus (DsMV) (Yu et al. 2015).

Sweet potato latent virus (SPLV) is a species of the genus Potyvirus in the family Potyviridae (Wang et al. 2013a, b; Kwak et al. 2013). It infects Nicotiana clevelandii and N. benthamiana by mechanical inoculation (Wang et al. 2013a, b). SPLV has a single-stranded, positive-sense RNA genome of approximately 10,000 nucleotides (nts) (Wang et al. 2013a, b; Kwak et al. 2013). The genome encodes a large polyprotein that is cleaved after translation into ten mature proteins by three virus-encoded proteinases. Five full-length genomic sequences of SPLV isolates from South Korea and Taiwan have been published (Wang et al. 2013a, b; Kwak et al. 2013).

With the advent of next-generation sequencing (NGS) and homology-dependent or homology-independent bioinformatics algorithms, more than fifty new plant viruses (RNA and DNA viruses) and subviruses have been identified through deep sequencing of sRNAs (Wu et al. 2015). In the present study, a novel strain of sweet potato latent virus (SPLV-lotus) was identified from lotus in Jiangsu province, China, by NGS of sRNA and viral sequence-specific amplification.

Materials and methods

Plant materials

The sRNA library was generated using samples of mixed leaves from six lotus plants of cultivar ‘Meirenhong’ that exhibited yellowing, with or without mottle symptoms, between June and September 2015 in Hanjiang and Wuzhong prefectures of Jiangsu province, China.

Construction, sequencing and analyses of the sRNA library

The sRNA libraries were constructed from 1.0 mg of total RNA using the ‘Small RNA v1.5 Sample Prep’ kit from Illumina and sequenced in a single lane on a Genome Analyser IIx. The output files were processed with Illumina’s CASAVA pipeline (version 1.8). The reads resulting from NGS were trimmed of adaptor sequences and then assembled de novo into larger contigs using Velvet 0.7.31 (Zerbino and Birney 2008) with a k-mer of 17. The resulting contigs were compared against the GenBank virus reference database using BLASTN.
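The two preprocessing ideas in this step, adaptor removal and decomposition of reads into 17-mers for de novo assembly, can be illustrated with a minimal Python sketch. This is not the CASAVA or Velvet implementation; the adaptor string, the `min_overlap` setting, and all function names are placeholders chosen for illustration, because the actual adaptor sequence and trimming parameters are not given in the text.

```python
from collections import Counter

# Placeholder 3' adaptor: the sequence actually used with the kit is not stated.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
K = 17  # k-mer length reported for the assembly


def trim_adapter(read: str, adapter: str = ADAPTER, min_overlap: int = 8) -> str:
    """Remove a full adaptor, or a partial adaptor prefix at the 3' end of the read."""
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Look for the longest adaptor prefix (>= min_overlap) that ends the read.
    longest = min(len(adapter), len(read)) - 1
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    return read


def kmers(seq: str, k: int = K) -> list:
    """Return all overlapping k-mers of seq (empty if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_counts(reads, k: int = K) -> Counter:
    """Trim each read and count k-mers, the raw material of a de Bruijn graph."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(trim_adapter(read), k))
    return counts
```

Because sRNA inserts are typically only slightly longer than 17 nt, each trimmed read contributes just a few k-mers; contigs arise only where many reads overlap along the viral genome.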

RT-PCR, sequencing and assembly

Total RNA was extracted using the TaKaRa MiniBEST Plant genomic RNA Extraction Kit (TaKaRa, Dalian), and cDNA was synthesized using a reverse transcription (RT) kit (Promega). The 5’ and 3’ terminal regions were determined using a rapid amplification of cDNA ends (RACE) kit (TaKaRa). To amplify the full-length genomic sequence from the lotus sample, seven pairs of primers (Table S1) were derived from the NGS sequence. The coat protein (CP) encoding region of the virus was determined using CP-specific primers (Table S1). The PCRs were carried out in 25 μl reaction volumes using PrimeSTAR® Max DNA Polymerase (TaKaRa), according to the manufacturer’s rapid protocol recommendations. The sequence of each fragment was determined from three independently cloned plasmids. At least four clone sequences of the CP encoding region were determined from the PCR products of each randomly selected positive lotus sample. Each clone was sequenced using an automated DNA sequencer (ABI PRISM™ 3730XL DNA Analyzer). Sequence reads were assembled with BioEdit 5.0.9.

Sequence alignment and nucleotide identities

The genomic sequences of seventy-two species of potyvirus (Table S2) retrieved from GenBank were used for genetic analyses alongside the genome sequence of SPLV-lotus obtained in this study. Additionally, hordeum mosaic virus (HordMV) (GenBank accession No. NC_005904) (French and Stenger 2005) and ryegrass mosaic virus (RGMV) (NC_001814) of the genus Rymovirus, which showed the highest identities with genomic sequences of the genus Potyvirus in BLASTN searches of GenBank, were used as outgroups. Multiple nucleotide (nt) and amino acid (aa) sequence alignments for phylogenetic analyses were performed using default options in CLUSTAL_X2. Nucleotide and amino acid sequence alignments with other SPLV sequences published in GenBank were performed using DNAMAN 6.0. The pairwise nucleotide sequence identity scores were represented as a distribution plot using SDT version 1.2 software (available from http://web.cbio.uct.ac.za/SDT).
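As an illustration of what a pairwise identity score measures, the following Python sketch computes percent identity between two sequences that have already been aligned. It is a simplified stand-in, not the DNAMAN or SDT algorithm: it assumes gapped columns are skipped, which is only one of several conventions, and the function names are hypothetical.

```python
from itertools import combinations


def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the score
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def pairwise_identities(aligned: dict) -> dict:
    """Identity for every unordered pair of names in an {name: aligned_seq} mapping."""
    return {
        (x, y): percent_identity(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }
```

The resulting dictionary of pair scores is the kind of data that a distribution plot of pairwise identities summarizes.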

Phylogenetic analysis

The maximum-likelihood (ML) and neighbor-joining (NJ) methods implemented in PhyML version 3.0 and MEGA version 7.0.18 were used to analyze phylogenetic relationships of the aligned polyprotein and coat protein sequences. For the ML tree, the best-fit model for each dataset was selected using jModeltest v0.1.1; the general time-reversible substitution model with a proportion of invariable sites and a gamma distribution (GTR + I + Γ4) was identified as the best-fit nt substitution model for the aligned polyprotein and coat protein sequences.

Mechanical transmission and field surveys

To evaluate the host range of SPLV-lotus, mechanical inoculations onto leguminous (Glycine max), solanaceous (Lycopersicon esculentum, Piper nigrum, Nicotiana benthamiana), cucurbitaceous (Cucumis sativus), chenopodiaceous (Chenopodium amaranticolor and C. quinoa) and convolvulaceous (Ipomoea batatas) plants were performed using sap from lotus leaves in which infection had been confirmed by sequencing. For field surveys, leaf samples were collected from 43 lotus plants randomly chosen in Jiangsu province of China in 2015 and 2016. The details of these samples, such as geographical origin, collection time, and symptoms, are given in Table S3. The fresh lotus leaves were stored at −80 °C for later use. Total RNA was extracted from these samples using the TaKaRa MiniBEST Plant genomic RNA Extraction Kit (TaKaRa, Dalian) and tested by RT-PCR with specific primers (Table S1).

Results

Potyviruses identified by NGS and RT-PCR

Sixty contigs generated from NGS of the sRNA library from lotus shared similarities with the genomic regions of sweet potato latent virus (SPLV), sweet potato virus 2 (SPV2), plum pox virus, wild tomato mosaic virus, celery mosaic virus, turnip mosaic virus, pokeweed mosaic virus, sweet potato virus C (SPVC), peru tomato mosaic virus, lily mottle virus, and freesia mosaic virus. These viruses belong to the genus Potyvirus of the family Potyviridae, suggesting the presence of one or more additional potyviruses in the lotus.

To confirm the presence of SPLV in the lotus tested, the primers SPLVF2–1 (5’-GAATGCTATCATACCTTGAG-3’) and SPLVR3–1 (5’-GCAACTGCGACATCACTATTG-3’) were designed according to the contig sequences from NGS (Table S1). The six lotus plants selected for NGS of the sRNA library were tested with these primers, and a product of the predicted size (854 bp) was obtained from two (YZ1 and SZ1) of the six lotus samples (Fig. S1). One of the two amplicons was then cloned and sequenced. BLASTN showed that the 854 nt sequence had the highest similarity (76%) with SPLV in GenBank. Both the NGS and RT-PCR results suggested that a probable novel potyvirus or strain of SPLV had been identified from lotus.
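The predicted product size for a primer pair can be checked in silico by locating the forward primer and the reverse complement of the reverse primer on a template. The Python sketch below uses the two primer sequences quoted above and assumes exact matching on the plus strand; the function names are hypothetical, and any template passed to it (for example, an assembled contig) is the user's own input.

```python
FWD_PRIMER = "GAATGCTATCATACCTTGAG"   # SPLVF2-1, as given in the text
REV_PRIMER = "GCAACTGCGACATCACTATTG"  # SPLVR3-1, as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_length(template: str, fwd: str = FWD_PRIMER, rev: str = REV_PRIMER):
    """Length of the product bounded by exact primer matches, or None if absent."""
    template = template.upper()
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)  # how the reverse primer appears on the plus strand
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start
```

Real primer binding tolerates mismatches, so a stricter exact-match search like this one can miss sites that would still amplify.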

Complete genome sequence of a putative new potyvirus

The complete genome sequence from the YZ1 lotus sample was determined by Sanger resequencing of seven overlapping PCR fragments. The resulting sequence of 10,081 nts, excluding the poly(A) tail, was deposited in GenBank under accession No. MH705333. This isolate encodes a polyprotein of 3247 amino acid residues with a calculated Mr of 366 kDa. The sequence shows the typical genome organization of viruses belonging to the genus Potyvirus (Fig. 1). Based on the proposed cleavage sites (Table 1), the polyprotein is processed into ten mature proteins: protein 1 (P1), helper-component proteinase (HC-Pro), protein 3 (P3), 6-kDa protein 1 (6K1), cylindrical inclusion protein (CI), 6-kDa protein 2 (6K2), viral protein genome-linked (VPg), nuclear inclusion a (NIa), nuclear inclusion b (NIb), and coat protein (CP). As in other members of the genus Potyvirus (Chung et al. 2008), an additional protein (pretty interesting Potyviridae ORF, PIPO) was also predicted to be expressed as a fusion with the N-terminal region of P3 (Figs. 1 and 2). However, the conserved G2A6 sequence in the P1 coding region of the monophyletic sweet potato feathery mottle virus (SPFMV) group (including SPFMV, SPVC, SPV2 and sweet potato virus G, but not SPLV) of the genus Potyvirus (Untiveros et al. 2016) was not found in this virus. The H-×8-D/G, GxSG, and RG motifs in the P1 coding region, the KITC, GE, GW/WG, and FRNK motifs in the HC-Pro coding region, and the R-×44-D and DAG motifs in the CP coding region were found in this virus.
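Conserved motifs of this kind can be located with regular expressions, where "x" stands for any residue and a number gives the spacer length. The following Python sketch is an illustrative assumption about how such a scan might be done, not the procedure used in the study; the motif dictionary covers only the longer motifs named above, since very short ones (RG, GE) match by chance unless the search is restricted to the relevant protein region.

```python
import re

# Motif names follow the text; the regular expressions are our own rendering.
MOTIFS = {
    "H-x8-D/G": r"H.{8}[DG]",
    "GxSG": r"G.SG",
    "KITC": r"KITC",
    "FRNK": r"FRNK",
    "DAG": r"DAG",
    "R-x44-D": r"R.{44}D",
}


def find_motifs(protein: str, motifs: dict = MOTIFS) -> dict:
    """Return 1-based start positions of every (possibly overlapping) motif match."""
    protein = protein.upper()
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead wrapper lets overlapping occurrences all be reported.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [m.start() + 1 for m in lookahead.finditer(protein)]
    return hits
```

In practice the polyprotein would first be split at the proposed cleavage sites, and each mature protein scanned only for the motifs expected in it.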

Fig. 1

Schematic genome structure of SPLV-lotus. The open reading frame and untranslated regions are shown as an open bar and a single line, respectively. Vertical lines in the open bar indicate the putative cleavage sites within the polyprotein. The positions of the first nucleotide of the ten mature protein coding regions are shown above the bar

Table 1 Difference of cleavage sites in the polyprotein of the SPLV group viruses
Fig. 2

Amino acid alignment of P3N-PIPO of the potyvirus isolates in the SPLV group, including SPLV-lotus

Taxonomic placement

The lotus virus identified here shares the highest identity with SPLV in the genus Potyvirus, with values of 76.43–76.76% and 76.28–76.60% based on the complete genome sequence and the polyprotein sequences, respectively (Table 2). These values are close to, but not below, the arbitrary threshold value for species demarcation in the family Potyviridae, which suggests that the potyvirus isolated from lotus probably represents a novel strain of SPLV. Phylogenetic analysis, based on the polyprotein of the virus obtained here and of other members of the genus Potyvirus, indicated that this lotus virus clusters with the five SPLV isolates in a separate group (Fig. 3). These results provide further support for the existence of a novel strain of SPLV in the genus Potyvirus. In addition, four proposed cleavage sites of the virus, at P1/HC-Pro, CI/6K2, 6K2/VPg and NIb/CP (Table 1), bear 1–2 aa substitutions in comparison with the five previously reported SPLV genomic sequences. Additionally, the PIPO of the virus is much shorter than that of SPLV (Fig. 2). Overall, we consider the potyvirus obtained in this study to be a distant strain of SPLV based on the analysis of sequence diversity and phylogeny, and on the different PIPO length.

Table 2 Nucleotide sequence identities of sweet potato latent virus lotus isolate to other members of SPLV group in the genus Potyvirus
Fig. 3

Maximum-likelihood tree calculated from the polyprotein sequences of the viruses in the genus Potyvirus. Numbers at each node indicate the percentage of supporting puzzling steps (or bootstrap samples) in maximum-likelihood. The homologous sequences from an isolate of hordeum mosaic virus (HordMV) (GenBank accession No. NC_005904) and ryegrass mosaic virus (RGMV) (NC_001814) in the genus Rymovirus were used as outgroups. SPLV-lotus isolate is shown with red circles

Bioassays

To investigate the host range of this virus, G. max (varieties sudou7 and Jufeng), S. lycopersicum (varieties F203 and F422), P. nigrum (varieties J4 and J6), C. sativus (varieties yunv and KP2), N. benthamiana, C. amaranticolor, C. quinoa and I. batatas were inoculated with sap from young potyvirus-positive lotus leaves. One month post-inoculation, the virus was frequently detected by RT-PCR with primers SPLVF2–1/SPLVR3–1 (Table S1) in upper leaves of G. max (both varieties, sudou7 and Jufeng), C. sativus (both varieties, yunv and KP2), N. benthamiana, C. amaranticolor, C. quinoa and I. batatas (Table 3). Interestingly, none of the infected plants showed typical symptoms. The virus was not detected in either of the two varieties of S. lycopersicum or P. nigrum.

Table 3 List of the herbaceous hosts mechanically inoculated with sweet potato latent virus lotus isolate

Field surveys

Lotus samples previously confirmed by sequencing to be infected with SPLV were used as positive controls, and RT-PCR was performed with primers SPLVF2–1/SPLVR3–1 (Table S1) to survey SPLV-lotus in Jiangsu province. Twenty-one SPLV-positive samples were detected among the 43 lotus samples (Table S3), regardless of symptoms. The detection rate was 62.7%, indicating a wide distribution of SPLV-lotus in Jiangsu Province, China.

Phylogenetic analysis and pairwise identities based on CP sequences

Twenty-one CP clone sequences from PCR products corresponding to four randomly selected positive lotus samples were determined and submitted to GenBank under accession numbers MK923654–MK923674. Phylogenetic analysis based on the CP coding region also supported the clustering of SPLV isolates from lotus into a separate group (SPLV-Lotus) distinct from the SPLV sweet potato isolates (Fig. 4a). The twenty-two CP sequences in the SPLV-Lotus group were divided into two subgroups (1 and 2) (Fig. 4a). Clear intra-isolate variability was found in samples YZ25 and YZ27; for example, two clone sequences of YZ27 (YZ27–4 and YZ27–7) in subgroup 1 shared 89.6–90.1% identity with the other five YZ27 clone sequences in subgroup 2. The pairwise identities of CP sequences between the twenty-two SPLV sequences from lotus and the five sweet potato isolates were around 76% (Fig. 4b).

Fig. 4

Phylogenetic analysis and pairwise identity of sweet potato latent virus (SPLV) based on CP coding region sequences. a maximum-likelihood tree calculated from the coat protein sequences of SPLV. Numbers at each node indicate the percentage of supporting bootstrap samples in maximum-likelihood. SPLV isolate is shown with red circles. b the distribution of pairwise identity scores of the coat protein coding regions of SPLV using the MUSCLE multiple sequence alignment program in SDT software

Discussion

NGS is a powerful tool to identify viruses in herbaceous and woody plants (Zerbino and Birney 2008; Liang et al. 2015; Al Rwahnih et al. 2018; Rott et al. 2018) and has been used to identify and characterize several novel potyvirids (Monger et al. 2010; Dombrovsky et al. 2012; Fuentes et al. 2012; Li et al. 2012; Wylie et al. 2013). In this study, we identified one novel strain of SPLV from an aquatic crop, lotus, further underscoring the benefits of NGS as a versatile tool for the detection of viruses.

Phylogenetic analysis and sequence diversity showed that the lotus virus obtained here is most closely related to SPLV in the genus Potyvirus. The species demarcation criteria of the family Potyviridae, based upon the large ORF, are generally accepted as <76% nucleotide identity and <82% amino acid identity (Wylie et al. 2017). The similarity values between the potyvirus isolated from lotus and the five reported SPLV isolates, based on the large ORF, are close to these species demarcation criteria. Additionally, four polyprotein cleavage sites differ between the virus obtained here and the five reported SPLV sweet potato isolates, whereas these cleavage sites are almost identical among the SPLV sweet potato isolates (Table 1). Moreover, the P3N-PIPO length (66 aa) of the virus obtained here is similar to that of SPFMV (67 aa), SPVC (67 aa), and SPV2 (64 aa), but considerably shorter than that of the SPLV sweet potato isolates (75–79 aa). Considering these differences from SPLV, we regard the potyvirus obtained from lotus as a distant strain of SPLV. Phylogenetic analysis and pairwise identity based on the CP coding region also supported the conclusion that SPLV isolated from lotus is a distant strain.
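The threshold comparison can be made explicit with a short Python sketch. It encodes the <76% nt and <82% aa criteria quoted above and applies the nucleotide criterion to the polyprotein-based identity range reported in the Results (76.28–76.60%). How the two criteria are combined when they disagree is our own assumption, labelled "borderline" here, and the function names are hypothetical.

```python
NT_THRESHOLD = 76.0  # percent nt identity over the large ORF (Wylie et al. 2017)
AA_THRESHOLD = 82.0  # percent aa identity over the large ORF (Wylie et al. 2017)


def classify(nt_identity: float, aa_identity=None) -> str:
    """Return 'distinct', 'same', or 'borderline' relative to the demarcation criteria."""
    nt_below = nt_identity < NT_THRESHOLD
    if aa_identity is None:
        # Only the nucleotide criterion can be evaluated.
        return "distinct" if nt_below else "same"
    aa_below = aa_identity < AA_THRESHOLD
    if nt_below and aa_below:
        return "distinct"
    if not nt_below and not aa_below:
        return "same"
    return "borderline"  # assumption: disagreeing criteria are flagged, not resolved


def margin_above_threshold(nt_identity: float) -> float:
    """Percentage points by which an nt identity exceeds the 76% threshold."""
    return round(nt_identity - NT_THRESHOLD, 2)


# Polyprotein-based nt identity range reported in the Results.
for value in (76.28, 76.60):
    print(value, classify(value), margin_above_threshold(value))
```

Both ends of the reported range fall above the nucleotide threshold by well under one percentage point, which is why the additional evidence from cleavage sites and P3N-PIPO length carries weight in the argument.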

As a perennial aquatic herb crop, lotus (Nelumbo nucifera Gaertn.) is prone to accumulating plant pathogens, especially plant viruses, similar to other perennial crops such as sugarcane, potato and grape. Previously, two viruses, CMV and DsMV (Wang et al. 2013a, b), were found in lotus. Recently, our group also found CMV (Dong et al. 2017), apple stem grooving virus (He et al. 2019), and several badnaviruses (shown in another paper) in lotus collected in Jiangsu Province. The SPLV-lotus found here further implies the need for virus-free lotus seeds for the healthy production of lotus in China.

In conclusion, our study has identified a distant strain of SPLV from lotus plants using next-generation sequencing of sRNAs. This study also shows that the novel SPLV strain is widely distributed in Jiangsu Province and can infect plants in the families Cucurbitaceae, Chenopodiaceae, Leguminosae and Convolvulaceae. Our discovery will be helpful in designing control strategies for lotus viral diseases.