Cnidium officinale is a medicinal herb that is used in Asia, mainly China, Korea, and Japan, to treat diseases such as anemia, suppurative skin disease, and gynecological disorders [12]. It has been reported that medicinal plants are infected with many viruses [15, 16], and cnidium vein yellowing virus (CnVYV: family Secoviridae) was first detected in C. officinale in Korea [23]. Additionally, this virus was also found in Rehmannia glutinosa in Japan [21]. In 2017, we collected C. officinale samples showing mosaic symptoms from the Hokkaido and Iwate prefectures in northern Japan. Total RNA was extracted from these samples using an RNeasy Plant Mini Kit (QIAGEN, Venlo, The Netherlands) and a reverse transcription (RT)-PCR assay was performed by using genus- universal primer sets [11]. RT-PCR products of the expected sizes were obtained in both the potexvirus and cucumovirus assays. Since this was the first time that a potexvirus had been detected in C. officinale, we determined the full-length sequence of this virus. Here, we report the complete nucleotide sequence of a potexvirus infecting C. officinale, and based on molecular criteria, designate it as a member of a new potexvirus species.

To determine the complete nucleotide sequence of this virus, a cDNA fragment containing the polymerase motif within ORF 1 was amplified by the above RT-PCR assay using potexvirus universal primers, ‘potex1’ and ‘potex2’ (Supplementary Table 1) [8]. The fragment obtained (fragment I) was purified using a GFX PCR DNA and Gel Band Purification Kit (GE Healthcare, Buckinghamshire, England) and cloned using a TOPO TA Cloning Kit for Sequencing (Thermo Fisher Scientific, MA, USA). DNA sequence analysis was performed by the Biotechnology Center at Akita Prefectural University, and the sequences were analyzed using GENETYX-Mac 16.0.7 software (Software Development Co., Tokyo, Japan).

Based on this nucleotide sequence, two specific primers, ‘CnV-1R’ and ‘CnV-1S’, were designed to amplify the fragments of the 5′ upstream and 3′ downstream regions, respectively. The 5′ upstream region fragment was amplified using the sense primer ‘potex5′-2’, which contained a NotI site and was designed based on the 5′-terminal conserved sequence of previously published potexviruses [7], and the antisense primer ‘CnV-1R’. The 3′ downstream region fragment was also amplified using the sense primer ‘CnV-1S’ and primer ‘M4dT’ [3]. RT-PCR was performed according to the manufacturer’s protocol (PrimeScript™ II High Fidelity One Step RT-PCR Kit; Takara, Kyoto, Japan) with the following conditions: one cycle of 45 °C for 15 min and 94 °C for 2 min, followed by 35 cycles of 98 °C for 10 s, 50 °C for 15 s, and 68 °C for 10 s per 1 kb. The amplified fragments, 968 bp (fragment II) and 2,601 bp (fragment III) (Supplementary Fig. S1), were cloned and sequenced as described above. Based on the obtained nucleotide sequence of fragment II, the specific reverse primer ‘CnV-2R’ was re-designed to amplify the fragment of the 5′ upstream region, and the amplified fragment (2,136 bp, fragment IV) was amplified in combination with the sense primer ‘WM-5F’ [22]. This RT-PCR product was also cloned and sequenced as described above. Finally, to determine precisely the 5′-terminal nucleotide sequence, 5′ RACE was performed using two specific primers, ‘CnV5′-1’ and ‘CnV5′-2’, designed from the nucleotide sequence of fragment IV. 5′ RACE was performed according to the manufacturer’s protocol (5′/3′ RACE Kit, 2nd Generation; Sigma-Aldrich, MO, USA), and the purified 5′ RACE products were also cloned and sequenced as described above. The nucleotide sequence of each fragment was aligned and analyzed using GENETYX-Mac 16.0.7 software (Software development Co., Tokyo, Japan). For phylogenetic analysis, the amino acid sequences of the polymerase and coat protein (CP) regions were aligned with sequences representing members of the genus Potexvirus, using the multiple sequence alignment tool MUSCLE (https://www.ebi.ac.uk/Tools/msa/muscle/). Phylogenetic trees were then constructed from the alignment data using the maximum-likelihood (ML) and neighbor-joining (NJ) methods, using MEGA7 software (Fig. 1) [14].

Fig. 1
figure 1

Phylogenetic analysis of the proposed cnidium virus X and other potexviruses based on published polymerase (A) and coat protein (CP) (B) nucleotide sequences. The maximum likelihood (ML) method [10] was used to construct the phylogenetic trees. The bootstrap values (ML/NJ: 1,000 replications [5]) of identical branches in each tree constructed by the two methods are shown above and below the branches. Branching of the corresponding sequence of shallot virus X (ShVX) was used as the out-group of the ML tree in the polymerase analysis. In the CP analysis, the partially branched tree constructed by the neighbor-joining method (NJ) is shown in an inset. In the ML analysis, the initial tree for the heuristic search was generated automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model [10] and then selecting the topology with a superior log likelihood value. A discrete gamma distribution was used to model evolutionary rate differences among sites (5 categories (+G, parameter = 1.2315)). In the NJ analysis [18], the evolutionary distances were computed using the Poisson correction method [24] and are in units of the number of amino acid substitutions per site. All positions containing gaps and missing data were eliminated. Evolutionary analysis was conducted using MEGA7 [14]. The following viruses were included in the analysis: actinidia virus X (AVX), allium virus X (AlVX), alstroemeria virus X (AlsVX), alternanthera mosaic virus (AltMV), Asparagus virus 3 (AV3), bamboo mosaic virus (BaMV), cactus virus X (CVX), cassava common mosaic virus (CsCMV), cassava virus X (CsVX), clover yellow mosaic virus (ClYMV), cymbidium mosaic virus (CymMV), foxtail mosaic virus (FoMV), hosta virus X (HVX), hydrangea ringspot virus (HdRSV), langenaria mild mosaic virus (LaMMoV), lettuce virus X (LeVX), lily virus X (LVX), malva mosaic virus (MalMV), mint virus X (MVX), narcissus mosaic virus (NMV), nerine virus X (NVX), opuntia virus X (OVX), papaya mosaic virus (PapMV), pepino mosaic virus (PepMV), phaius virus X (PhVX), pitaya virus X (PiVX), plantago asiatica mosaic virus (PlAMV), potato aucuba mosaic virus (PAMV), potato virus X (PVX), scallion virus X (ScaVX), schlumbergera virus X (SchVX), strawberry mild yellow edge virus (SMYEV), tamus red mosaic virus (TRMV), tulip virus X (TVX), vanilla virus X (VVX), white clover mosaic virus (WClMV), yam virus X (YVX), zygocactus virus X (ZVX), and shallot virus X (ShVX)

The RNA genome of the virus is composed of 5,964 nucleotides (nt), excluding the poly(A) tail. Four of the five ORFs overlap but ORF1 does not overlap with any other ORFs. The overlapping of ORFs 4 and 5 (Supplementary Fig. S1), like that found in the plantago asiatica mosaic virus (PlAMV) [19], mint virus X (MVX) [20], nerine virus X (NVX) [6], phaius virus X (PhVX) [11] and strawberry mild yellow edge virus (SMYEV) [9], is an unusual arrangement. An extensive database search failed to match the virus detected in C. officinale to any other potexvirus sequences. ORF1 (nt 72–3,974) presumably encodes a putative polymerase of 1,300 amino acids (aa) with a calculated Mr of 146,323. ORF 2 (triple gene block [TGB] p1: nt 4,111–4,803), ORF 3 (TGBp2: nt 4,766–5,071) and ORF 4 (TGBp3: nt 5,019–5,258) together constitute a TGB whose organization is conserved in members of the genera Potexvirus, Mandarivirus, Carlavirus, and Foveavirus. TGBp1, 2 and 3 encode presumptive polypeptides of 230, 101 and 79 aa, with a calculated Mr of 25,164, 11,035 and 8,722, respectively. ORF 5 (nt 5,080–5,787) encodes a putative CP of 234 aa with a calculated Mr of 25,006.

A comparison of the nucleotide and amino acid sequences of the virus detected in C. officinale with those of other potexviruses indicated that the polymerase (ORF1) and CP had the highest sequence similarity to those of white clover mosaic virus (WClMV) [2] and vanilla virus X (VVX), respectively. TGBp1, 2 and 3 were highly similar to the corresponding genes of SMYEV, schlumbergera virus X (SchVX) [13], and bamboo mosaic virus (BaMV) [17], respectively (Table 1). A phylogenetic tree constructed based on the amino acid sequences of the polymerase and CP also indicated that the polymerase of the virus is closely related to that of SMYEV; and the CP, to those of both yam virus X and VVX (Fig. 1).

Table 1 Nucleotide sequence identity/amino acid sequence identity (%) of cnidium virus X to other potexviruses

The polymerase and CP amino acid sequences of the virus detected in C. officinale were 32.4–54.2% and 18.5–51.1% identical, respectively, to those of a series of potexviruses (Table 1). According to the molecular criteria for species demarcation established by Adams et al. [1], 40.4–73.3% polymerase and 20.5–79.8% CP amino acid sequence identity, indicates a distinct species, while sequence identity exceeding 88.8 and 74.4%, respectively, indicates that the viruses being compared belong to the same species [1]. The Ninth Report of the International Committee on Taxonomy of Viruses indicates that members of distinct species have less than ~ 72% nucleotide sequence identity or less than 80% amino acid sequence identity between their CP or polymerase genes [4]. Since the highest scores for the polymerase and CP genes were 54.2% (WClMV) and 51.1% (VVX), respectively, these data indicate that the C. officinale virus is different from members of any other potexvirus species that have been published previously.

The complete nucleotide sequence of this newly identified potexvirus isolated from C. officinale indicates that it is closely related to WClMV and VVX but differs from all other known potexviruses and by molecular criteria must be considered a member of a new species. We propose that the newly isolated virus should be classified as a member of a new potexvirus species and designated as “cnidium virus X” (CnVX).

Nucleotide sequence accession number: The complete genome sequence of cnidium virus X (CnVX) with annotation was deposited in the DDBJ nucleotide sequence database under the accession number LC460456.