The brown planthopper (BPH), Nilaparvata lugens (Stål, 1854) (Hemiptera: Delphacidae), is a major insect pest of rice and is widely distributed in tropical and temperate countries of the Asia-Pacific region [1, 2]. In addition to causing direct physical damage, BPH acts as a vector for two important rice viruses, rice ragged stunt virus (RRSV) and rice grassy stunt virus (RGSV), leading to substantial reductions in crop yield [2, 3]. Apart from the plant viruses transmitted by BPH, several BPH-specific viruses belonging to the families Iflaviridae (Nilaparvata lugens honeydew virus, NLHV), Dicistroviridae (Himetobi P virus and Nilaparvata lugens C virus), and Spinareoviridae (Nilaparvata lugens reovirus) have been identified [4,5,6,7].

Over the past decade, the use of next-generation sequencing (NGS) and advanced bioinformatics methods has provided novel insights into viral evolution, leading to a new era of virus discovery [8, 9]. Most metagenomic methods rely on similarity searches to identify virus sequences based on inferred homology. However, some viruses cannot be recognized through detectable homology alone, and these are referred to as "dark viruses" [10, 11]. Quenyaviruses were initially discovered in Drosophila melanogaster as "dark viruses" on the basis of their small RNA profiles: they produced the characteristic pattern of virus-derived small interfering RNAs (vsiRNAs), with a modal length of 21 nucleotides (nt), but showed no identifiable similarity to sequences from known viruses or cellular organisms [10, 12]. Subsequently, similarity searches identified a series of homologous segments in approximately 20 arthropod species, including insects, crustaceans, spiders, and myriapods, which together form a completely new lineage named "quenyaviruses". Recent studies have classified quenyaviruses as part of a novel phylum within the viral kingdom Orthornavirae, and artificial-intelligence-based approaches have grouped them into larger clusters of distantly related viruses [8, 13]. The quenyavirus genome consists of five positive-sense, single-stranded RNAs, each containing a single open reading frame (ORF). The proteins encoded by segments S1-S4 bear no detectable similarity to known viral proteins, while the longest segment, S5, is believed to encode the RNA-dependent RNA polymerase (RdRP) [10].

In this study, a novel insect-specific quenyavirus was identified in a BPH and named "Nilaparvata lugens quenyavirus 1" (NLQV1). The complete nucleotide sequences of the five segments of this virus were determined and characterized.

BPH samples were initially collected in Vietnam as described in our previous work and subsequently maintained in a phytotron in our laboratory under the following conditions: 26 ± 1℃, 6 ± 5% relative humidity, and a 16:8 h (light/dark) photoperiod [1]. To identify RNA viruses in this population, a mixture of nymphs and adults (10 planthoppers in total) was used, and total RNA was extracted using TRIzol Reagent (Invitrogen, MA, USA). PolyA-selected RNA was used for transcriptome sequencing and whole-genome sequencing. The RNA library used for transcriptome sequencing was constructed and sequenced on an Illumina HiSeq 4000 platform (Illumina, San Diego, CA, USA). Reads were filtered to remove low-quality and adapter sequences using Trimmomatic 0.39 (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 TOPHRED33), and de novo assembly was carried out using Trinity (version 2.8.5) with default parameters [14], resulting in a total of 127,557 contigs. These contigs were compared to sequences in the NCBI viral RefSeq database using Diamond BLASTx. Five viral contigs were identified whose closest matches were previously reported quenyaviruses, with identity values ranging from 31.26% to 50.97% (Table 1). Furthermore, to validate the five segments and avoid false matches, BLAST searches were conducted against the entire NCBI nucleotide (NT) and non-redundant (NR) protein databases. The presence of NLQV1 was confirmed by reverse transcription PCR (RT-PCR), and the complete genome sequence of NLQV1 was obtained using a SMARTer RACE 5'/3' Kit. The five segments of NLQV1 were then verified by Sanger sequencing. The primers used in this study are listed in Supplementary Table S1.
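
The contig-screening step described above can be illustrated with a minimal sketch. It assumes that the Diamond BLASTx results were written in the standard 12-column tabular format (BLAST output format 6) and keeps the highest-scoring hit for each contig. The E-value cutoff, the contig names, and the subject identifiers are hypothetical placeholders and are not taken from this study.

```python
import csv
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    contig: str      # query contig identifier (column 1)
    subject: str     # database subject identifier (column 2)
    pident: float    # percent identity (column 3)
    evalue: float    # E-value (column 11)
    bitscore: float  # bit score (column 12)


def best_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> Dict[str, Hit]:
    """Return the highest-bitscore hit per contig from 12-column tabular output.

    max_evalue is an assumed cutoff for illustration only.
    """
    best: Dict[str, Hit] = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        hit = Hit(row[0], row[1], float(row[2]), float(row[10]), float(row[11]))
        if hit.evalue > max_evalue:
            continue  # discard weak matches
        if hit.contig not in best or hit.bitscore > best[hit.contig].bitscore:
            best[hit.contig] = hit
    return best


# Hypothetical example rows (not real results from this study).
demo = "\n".join([
    "contig_A\tsubj_1\t45.0\t300\t165\t2\t1\t900\t10\t309\t1e-50\t200.0",
    "contig_A\tsubj_2\t40.0\t280\t168\t3\t1\t840\t12\t291\t1e-30\t150.0",
    "contig_B\tsubj_3\t35.0\t100\t65\t1\t1\t300\t5\t104\t0.5\t30.0",
])

for contig, hit in best_hits(demo.splitlines()).items():
    print(contig, hit.subject, hit.pident, hit.evalue)
```

Candidate contigs retained in this way would still need to be checked against the full NT and NR databases, as described above, to rule out false matches.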

Table 1 Characteristics of the NLQV1 genome identified in a BPH

The complete genome of NLQV1 is 9,288 nt in length and comprises five segments, each with a polyA tail. Segments S1-S4 range from 1,696 to 1,829 nt in length, and S5 is 2,206 nt long, excluding the polyA tail (GenBank accession numbers: PP681291-PP681295) (Table 1, Fig. 1A). Based on predictions made using NCBI ORFfinder, each segment is monocistronic and contains a single open reading frame (Fig. 1A). Segment 1 was predicted to encode an mRNA-capping enzyme domain, and segment 5 was predicted to encode an RNA-dependent RNA polymerase (RdRP), based on the presence of conserved domains identified using InterProScan and HHpred (Fig. 1A). These characteristics are consistent with those of other quenyaviruses [10]. The five segments of NLQV1 have a moderate G+C content, ranging from 49.86% (S3) to 52.33% (S2) (Table 1), with an average G+C content of 50.99%. The lengths of the untranslated regions (UTRs) vary across the five genome segments, with the 5'-UTRs ranging from 33 to 55 nt and the 3'-UTRs ranging from 154 to 449 nt. All five segments have the sequence motif AUCUG at their 5'-terminus (Table 1, Fig. 1A), suggesting that the genome segments of other quenyaviruses might also have conserved nucleotides at their 5'-termini. To assess the abundance and coverage of NLQV1, RNA-seq reads were aligned to the reconstructed complete genome sequence of NLQV1. A total of 538,595 reads mapped perfectly to the NLQV1 genome, accounting for 2.25% of the RNA-seq reads (Fig. 1). The reads were distributed throughout the viral genome, indicating efficient replication of NLQV1 in the host insect.
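
The per-segment statistics reported above (G+C content, UTR lengths, and the shared 5'-terminal motif) are simple to compute once the segment sequences and ORF coordinates are known. The following is a minimal sketch; the function names, toy sequences, and coordinates are illustrative assumptions and do not correspond to the actual NLQV1 segments.

```python
from os.path import commonprefix
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Percentage of G and C residues in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def utr_lengths(seq_len: int, orf_start: int, orf_end: int) -> Tuple[int, int]:
    """Lengths of the 5'- and 3'-UTRs for an ORF given as 1-based,
    inclusive coordinates (stop codon included in the ORF)."""
    return orf_start - 1, seq_len - orf_end


def shared_5prime(seqs: List[str]) -> str:
    """Longest prefix shared by all sequences, reported as RNA."""
    # commonprefix compares strings character by character.
    return commonprefix([s.upper().replace("T", "U") for s in seqs])


# Toy inputs for illustration only (not NLQV1 sequences).
toy_segments = ["AUCUGAAGC", "ATCTGCCGT", "aucugGGA"]
print(gc_content("GGCCAATT"))       # 50.0
print(utr_lengths(100, 11, 70))     # (10, 30)
print(shared_5prime(toy_segments))  # AUCUG
```

The polyA tail would need to be removed before calling these functions if the statistics are to match values reported without it.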

Fig. 1

Genome structure and phylogenetic analysis of NLQV1. A Genome structure and RNA-seq read coverage of NLQV1. UTR, untranslated region. B Relationship of the quenyaviruses to other RNA viruses of the viral kingdom Orthornavirae. An unrooted phylogenetic tree was constructed based on RNA-dependent RNA polymerase sequences, showing the relationships between quenyaviruses (including NLQV1) and representatives of other virus families. C Phylogenetic relationship of NLQV1 to other quenyaviruses. A close-up view of the boxed region (dotted frame) of the tree in panel B, corresponding to the quenyaviruses, is shown. Nodes with bootstrap values >50 are indicated by blue circles, with larger circles indicating higher bootstrap values.

To investigate the taxonomic status of NLQV1, the predicted amino acid (aa) sequence of its RdRP and those of previously reported quenyaviruses, obtained from the NCBI database, were aligned using MAFFT (https://www.ebi.ac.uk/Tools/msa/mafft/, accessed on 12 April 2023) [10, 15]. Gaps were trimmed using trimAl, and the optimal aa substitution model was determined using ModelTest-NG with default parameters [16]. A maximum-likelihood (ML) phylogenetic tree was then constructed using IQ-TREE with 1000 bootstrap replicates [17, 18]. The results demonstrated that the RdRPs of quenyaviruses formed a monophyletic clade, supporting their potential inclusion in a new family (Fig. 1B), as suggested previously [8]. NLQV1 was clearly located within this clade and was closely related to Sina virus, Nete virus, and Nai virus, with high bootstrap support (Fig. 1C). Therefore, our phylogenetic analysis supports the classification of NLQV1 as a new quenyavirus.
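
To make the gap-trimming step more concrete, the sketch below removes alignment columns in which the proportion of gaps exceeds a threshold. This is a simplified illustration of the general idea and is not trimAl's algorithm; the 0.5 threshold, the function name, and the toy alignment are assumptions introduced here.

```python
from typing import Dict


def trim_gappy_columns(alignment: Dict[str, str],
                       max_gap_fraction: float = 0.5) -> Dict[str, str]:
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment maps sequence names to aligned sequences of equal length,
    with '-' as the gap character. The threshold is an assumed value.
    """
    if not alignment:
        return {}
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    n = len(seqs)
    # zip(*seqs) iterates over alignment columns.
    keep = [i for i, column in enumerate(zip(*seqs))
            if column.count("-") / n <= max_gap_fraction]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in zip(names, seqs)}


# Toy protein alignment for illustration only.
toy = {"virus_a": "MK-LV-", "virus_b": "MKALV-", "virus_c": "M--LVG"}
print(trim_gappy_columns(toy))
# {'virus_a': 'MKLV', 'virus_b': 'MKLV', 'virus_c': 'M-LV'}
```

Columns 3 and 6 of the toy alignment are gaps in two of the three sequences and are therefore removed, whereas column 2, with a single gap, is retained.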

One of the key antiviral pathways used by insects to combat viral invaders is the small interfering RNA (siRNA)-based RNA silencing pathway. In this pathway, viral RNA is cleaved, resulting in the accumulation of vsiRNAs, which is a hallmark of activation of the antiviral RNAi pathway [19]. To gain a better understanding of siRNA-based immunity against NLQV1, a small RNA (sRNA) library was constructed from a BPH using an Illumina TruSeq Small RNA Sample Preparation Kit (Illumina, San Diego, CA, USA) and sequenced on an Illumina HiSeq 2500 platform. Adapter sequences and low-quality reads were removed using Cutadapt [20]. The quality-trimmed sRNA reads were subsequently mapped to the complete genome of NLQV1 using Bowtie with zero mismatches allowed, as described previously [21]. The sRNA library contained a total of 13,471 vsiRNA reads that matched NLQV1, accounting for 0.04% of the library. The vsiRNAs derived from each of the five genome segments of NLQV1 exhibited a size distribution peaking at 21 and 22 nt and originated equally from the sense and antisense strands of the viral genome. The NLQV1-derived vsiRNAs showed a length preference similar to that of vsiRNAs from many other viruses that infect planthoppers [21,22,23,24]. These vsiRNAs were distributed across all five segments of the NLQV1 genome, although certain regions appeared to be preferentially targeted by Dicer (Fig. 2). This characteristic vsiRNA profile suggests that the host antiviral RNAi pathway plays an active role in the response to NLQV1 infection.
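
The vsiRNA size and strand profile described above can be illustrated with a toy exact-match mapper. This sketch is not Bowtie and is practical only for very small inputs. It assumes that reads have already been adapter-trimmed, and the function names and sequences are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def profile_vsirnas(reads: Iterable[str], segment: str) -> Counter:
    """Count perfectly matching reads by (length, strand).

    '+' marks reads identical to the genomic (sense) strand and '-' marks
    reads identical to its reverse complement. Reads that match both
    strands are counted once, as '+'.
    """
    sense = segment.upper().replace("U", "T")
    antisense = revcomp(sense)
    counts: Counter = Counter()
    for read in reads:
        read = read.upper().replace("U", "T")
        if read in sense:
            counts[(len(read), "+")] += 1
        elif read in antisense:
            counts[(len(read), "-")] += 1
    return counts


# Toy segment and reads for illustration only (not NLQV1 data).
toy_segment = "ACCACAACCCAACACCACAAACCCACACAACC"
toy_reads = [
    toy_segment[2:23],           # 21-nt read from the sense strand
    revcomp(toy_segment[5:27]),  # 22-nt read from the antisense strand
    "N" * 21,                    # read that does not map
]
for (length, strand), n in sorted(profile_vsirnas(toy_reads, toy_segment).items()):
    print(length, strand, n)
```

Summing the counts by length gives the size distribution, and comparing the '+' and '-' totals gives the strand ratio. Recording match positions in the same loop would yield the positional distribution along each segment.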

Fig. 2

Profiles of virus-derived small interfering RNAs (vsiRNAs) for the five segments of NLQV1. The size distribution and position of the vsiRNAs in each genome segment are shown.

In conclusion, we discovered a novel insect-specific quenyavirus, NLQV1, in a BPH and determined its complete genome sequence. Quenyaviruses form a novel viral clade within the viral kingdom Orthornavirae and have a broad distribution among arthropods. This study represents the first report of a quenyavirus in planthoppers, thereby enhancing our knowledge of quenyaviruses and contributing to a better understanding of insect-specific viruses in planthoppers.