Introduction

Prevention and management of viral pathogens in clonally propagated crops depend primarily on their precise detection. Robust and specific detection assays are always in demand, requiring constant technological advancement, protocol optimization, and adequate knowledge. Currently, bioinformatic analysis of high-throughput sequencing (HTS) reads for robust virome profiling in a targeted host has become a preferred approach for the identification of known and unknown plant viruses. The list of viruses infecting citrus crops has grown gradually over the past few decades. Some of them are globally widespread, and some are limited to particular regions or countries (Zhou et al. 2020; Licciardello et al. 2021). Kinnow mandarin (Citrus reticulata) is an important hybrid citrus crop derived from a cross between ‘Willow Leaf’ (C. deliciosa) and ‘King’ (C. nobilis). It is mostly cultivated in the northwestern part of India (Ahlawat 1989; Kokane et al. 2021a, b) and is known to be attacked by a large number of pathogens belonging to different groups such as fungi, bacteria, viruses, viroids, and phytoplasmas. Among them, viruses are the major constraint on citrus production, as they are perpetuated through vegetative propagation. Mandariviruses are a new group of viruses recorded in citrus. To date, the group includes three members, viz. Indian citrus ringspot virus (ICRSV), citrus yellow vein clearing virus (CYVCV), and citrus yellow mottle-associated virus (CiYMaV) (Wu et al. 2020). Mandariviruses have flexuous filamentous particles with a modal length of about 685 nm and a diameter of about 13–14 nm. Their positive-sense single-stranded RNA genome has a 3′ poly(A) tail and contains six open reading frames (ORFs) encoding the RNA-dependent RNA polymerase (RdRp), the triple gene block (TGB) proteins, the capsid protein (CP), and a nucleic acid binding protein (NB) (Byadgi et al. 1993; Alshami et al. 2003; Loconsole et al. 2012; Zhen et al. 2015; Meena et al. 2019).
CYVCV is an emerging threat to citrus growers and has been well documented worldwide (Catara et al. 1993; Ahlawat and Pant 2003; Chen et al. 2014; Hashemian and Aghajanzadeh 2017; Liu et al. 2020; Licciardello et al. 2021). Symptoms ranging from mild-to-moderate yellow vein clearing, chlorosis, clear irregular conspicuous ringspots, and mosaic-like patterns on the leaves are recorded during the winter and spring seasons but are mostly absent in other seasons (Ahlawat and Pant 2003; Zhou et al. 2017). CYVCV infection recorded in China in Eureka lemon and Satsuma mandarin was associated with a 20% decline in yield (Li et al. 2017; Chen et al. 2014). CYVCV has also been recorded in India, where 40% of the Kinnow leaf samples tested were positive for CYVCV by reverse transcription polymerase chain reaction (RT-PCR) (Meena et al. 2019), and ICRSV has also been recorded in Kinnow mandarin trees in India (Byadgi and Ahlawat 1995; Kokane et al. 2021a, b). ICRSV-infected plants exhibit mottling and conspicuous chlorotic to irregular chlorotic ring patterns; ringspots are mainly visible on mature leaves (Thind et al. 2000; Rustici et al. 2002; Kokane et al. 2021a, b). CiYMaV is a new virus that was recently reported in Symons sweet orange from the Punjab province of Pakistan; the leaves of infected plants exhibited mottling and yellowing symptoms, and the virus was recently included in the genus Mandarivirus (Wu et al. 2020). The same research group used full-length infectious clones of CiYMaV to establish systemic infection in several citrus species, which exhibited severe disease symptoms (Wu et al. 2023). In recent years, viral genome sequences obtained through HTS have been widely used to study the diversity and novelty of viruses and viroids in citrus crops (Loconsole et al. 2012; Matsumura et al. 2017; Cao et al. 2018; Wu et al. 2020; Licciardello et al. 2021; Bester et al. 2021). However, there are no such reports of HTS analysis for virome profiling of citrus crops in India.
Therefore, the prime objective of this study was to perform HTS of pooled symptomatic and asymptomatic leaf tissues of Kinnow mandarin to identify viruses that might have gone undetected in earlier molecular and serological assays. The study includes sequence identity and phylogenetic analyses of the HTS-identified viruses and the development of specific RT-PCR-based detection assays. A rapid and reliable duplex RT-PCR protocol was also developed and standardized for the simultaneous detection of both CYVCV and CiYMaV in infected Kinnow mandarin.

Materials and methods

Samples source, electron microscopy, nucleic acid isolation, library preparation, and sequencing

A total of 18 Kinnow mandarin leaf samples (15 symptomatic and 3 asymptomatic) were collected from 3 different Kinnow mandarin orchards at the Indian Agricultural Research Institute (IARI), Pusa campus, New Delhi. To avoid contamination, the surface of the collected leaf samples was wiped with 70% ethanol. One representative symptomatic leaf sample was subjected to a leaf-dip electron microscopy assay (Gibbs et al. 1966). Symptomatic leaf tissue of a Kinnow mandarin plant was pulverized in 0.07 M phosphate solution (pH 6.5), negatively stained with 2% uranyl acetate, and visualized under a JEOL-1011 electron microscope (EM). Digital images were captured with an Olympus CCD camera (SIS MEGAVIEW G2) installed on the EM interface. Further, the same symptomatic leaf sample was subjected to immunosorbent electron microscopy (ISEM) (Ahlawat et al. 1996) using in-house raised polyclonal antisera specific to ICRSV and CYVCV (Pant et al. 2018). Total RNA was extracted from the collected Kinnow leaf tissues (100 mg/plant) using TRIzol™ reagent (Invitrogen, Thermofisher Scientific, Wilmington, USA) as per the manufacturer’s guidelines. RNA from each sample was dissolved in 50 µl of DEPC-treated water, and its quantity and quality were determined using the Nanodrop 2000 (Thermofisher Scientific, Massachusetts, USA). A part of the isolated RNA was then stored at −80 °C for post-HTS validation. The concentration, purity, and integrity of the RNA were analyzed with the Agilent 2100 bioanalyzer (Agilent, USA) and Qubit 4 Fluorometer (Invitrogen, USA). On the basis of the quality control (QC) results and RNA integrity number (RIN) values, 13 of the 18 samples (12 symptomatic and 1 asymptomatic) passed the QC parameters with a RIN score > 7 and were subjected to HTS analysis. RNA from these 13 samples was pooled in equimolar concentration into a single sample, which was then used for HTS.
For complete removal of ribosomal RNA from the pooled RNA, a Ribozero treatment was given using the QIAseq FastSelect-rRNA Plant kit (Qiagen, Hilden, Germany) as per the manufacturer’s guidelines. cDNA synthesis was performed using 500 ng of purified RNA with the KAPA Hyper-Prep Kit for cDNA Synthesis and Amplification Module (Roche, USA), followed by purification using KAPA Pure Beads (Roche, USA). Library preparation was carried out with the KAPA RNA Hyper-Prep Library Prep Kit for Illumina (Roche, USA) following the manufacturer’s guidelines. Further, the quality, quantity, and length of the cDNA library were assessed using the Agilent 2100 bioanalyzer (Agilent, USA) and Qubit 4 Fluorometer (Invitrogen, USA). High-quality total RNA-Seq libraries were paired-end sequenced (2 × 150 bp) on a NovaSeq 6000 (Illumina, USA) available at Nucleome, Hyderabad. Bioinformatic analysis of the pooled sample was performed according to the workflow shown in Supplementary Fig. S1.

De novo assembly of sequences and genome reconstruction

The original data obtained from the HTS platform were transformed to sequenced reads by base calling. Raw data were recorded in a FASTQ file, which contains the sequenced reads and the corresponding sequencing quality information (Cock et al. 2010). The overall quality of the reads was checked with FastQC (Ewels et al. 2016). Adapters were removed using the fastp tool (Chen et al. 2018) with the cut-right parameter, which trims adapters by scanning bases from 5′ to 3′. The HISAT2 v.2.2.1 tool (Kim et al. 2019) was used to map the RNA sequence reads to a reference genome (C. reticulata; GenBank assembly: GCA_003258625.1). The unaligned sequence reads were de novo assembled using Trinity v.2.14.0 and MEGAHIT v.1.1.3 (Haas et al. 2013; Li et al. 2015). The assembled contigs were then used in the standalone BLASTn program (2.13.0) with the following parameters (evalue 0.00001, max_target_seqs 1, max_hsps 1) to identify viruses/viroids against related sequences available in databases (source: NCBI). For further examination, assembled virus genomes with greater than 75% sequence coverage of the reference genome were taken into consideration. ORF finder (https://www.ncbi.nlm.nih.gov/orffinder/) was used to identify the ORFs in the recovered virus genomes. The presence of the identified viruses in the samples was confirmed by PCR amplification and Sanger sequencing, and the sequences were submitted to GenBank.
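The 75% coverage criterion described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each BLASTn hit is available with its alignment start and end on the reference and with the reference genome length; the `Hit` structure, the function names, and the example contigs are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One contig-to-reference alignment (hypothetical layout)."""
    contig: str
    reference: str
    ref_start: int   # 1-based alignment start on the reference
    ref_end: int     # 1-based alignment end on the reference
    ref_length: int  # full length of the reference genome (nt)


def reference_coverage(hit: Hit) -> float:
    """Percent of the reference genome spanned by the aligned region."""
    aligned = abs(hit.ref_end - hit.ref_start) + 1
    return 100.0 * aligned / hit.ref_length


def select_hits(hits, min_coverage=75.0):
    """Keep hits whose reference coverage is greater than the threshold."""
    return [h for h in hits if reference_coverage(h) > min_coverage]


# Invented example values, not data from the study.
example_hits = [
    Hit("contig_A", "ref_virus_1", 1, 7400, 7500),
    Hit("contig_B", "ref_virus_1", 3000, 4200, 7500),
]
selected = select_hits(example_hits)
print([h.contig for h in selected])
```

Only the contig spanning most of the reference is retained in this toy example; whether coverage was computed over the reference or the query in the original workflow is an assumption here.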

Variant analysis, copy number variation, and Quasi-species diversity (krona plot) analysis

To identify variants, the cleaned sample reads were mapped to the recovered virus genomes. For variant analysis, bcftools (v1.10.2) was used: bcftools mpileup and bcftools call to identify variants, and bcftools filter to filter the SNVs and INDELs (%QUAL ≥ 20). The estimation of transcript abundance depends on alignment-based abundance estimation methods used for RNA-Seq. Clean sequence reads were aligned with the viral contigs using Salmon (v1.3.0), and the fragments per kilobase of transcript per million mapped reads (FPKM) method was used to calculate the normalized expression value. Quasi-species diversity analysis of the HTS sequence reads was performed with OmicsBox 1.2 (https://www.biobam.com/omicsbox) using a Kraken 2 database containing reference virus/viroid sequences, and the outcomes were generated in the form of a pie chart with the rich visualization tool in OmicsBox 1.2.
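For reference, the FPKM normalization mentioned above can be written as a short function. This is a sketch of the standard FPKM formula only; how the values were derived from the Salmon output is not described in the text, and the input numbers below are invented for illustration.

```python
def fpkm(fragments: float, transcript_length_nt: int,
         total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in nt * total mapped fragments)
    """
    if transcript_length_nt <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_nt * total_mapped_fragments)


# Hypothetical inputs: 1500 fragments on a 7500 nt contig,
# out of 4 million mapped fragments in total.
value = fpkm(fragments=1500, transcript_length_nt=7500,
             total_mapped_fragments=4_000_000)
print(round(value, 2))
```

The factor of 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.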

Sequence identity and phylogenetic analysis

For sequence identity and phylogenetic analysis, complete genome, CP, and RdRp sequences of viruses from the family Alphaflexiviridae were retrieved from NCBI GenBank, namely CYVCV, CiYMaV, ICRSV, potato virus X, donkey orchid symptomless virus, Lolium latent virus, shallot virus X, Botrytis virus X, and Sclerotinia sclerotiorum debilitation-associated RNA virus. Multiple sequence alignment was performed with the ClustalW program in BioEdit sequence alignment editor version 7.1.3.0 (Hall 1999). Phylogenetic trees were generated using the Neighbor-Joining method in Molecular Evolutionary Genetics Analysis (MEGA), and the evolutionary distances were computed using the Maximum Composite Likelihood model (Tamura et al. 2021).

Development of duplex-RT-PCR and validation with field samples

The viruses identified by HTS were validated using the RNA stored at −80 °C and freshly collected samples of Kinnow mandarin from ICAR-IARI, New Delhi. The quantity and quality of the stored RNA were checked with a Nano-Drop™ One Spectrophotometer (Thermo Scientific, Wilmington, USA) and by non-denaturing agarose gel electrophoresis. For duplex RT-PCR, target-specific primers were designed and optimized for the amplification of the RdRp (345 bp) region of CYVCV and the CP (502 bp) region of CiYMaV. The citrus housekeeping gene EF-1α- F (185 bp) was used as an internal control to check for proper cDNA preparation in all PCR reactions (Kokane et al. 2021a, b). The presence of CYVCV and CiYMaV infection was checked by duplex RT-PCR using 10 µM specific primer sets (Supplementary Table S1) in two steps. In the first step, single-stranded cDNA was synthesized using approximately 500 ng of RNA, Improm-II Reverse Transcriptase (Promega, Madison, USA), and oligo (dT) primer; this was followed by PCR amplification of the targeted genes, i.e., RdRp and CP, using target-specific primers. The PCR mixture, in a total volume of 25 μl, contained 1 μl (~ 250 ng) of cDNA template, 1.25 mM MgCl2, 0.5 mM dNTP mix, 0.25 μM of each F/R primer, 1X reaction buffer, 1 unit of DNA polymerase (DyNAzymeII) (Thermo Fisher Scientific, MA, USA), and nuclease-free water to volume. The PCR conditions were as follows: one cycle of initial denaturation at 94 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 60 °C for 45 s, and extension at 72 °C for 1 min, and one cycle of final extension at 72 °C for 10 min. The amplified product was electrophoresed in a 1% agarose gel containing ethidium bromide and visualized under UV illumination in a gel documentation system. The expected PCR amplicon was gel-purified and cloned into the pGEM-T Easy vector (Promega, Madison, USA) following standard molecular biology procedures (Sambrook and Russell 2006).
Two positive clones each of CYVCV and CiYMaV were sequenced in the forward and reverse directions using T7 and SP6 universal primers. The obtained sequences were then compared with the virus genome sequences identified by HTS and analyzed with the BLASTn feature of the NCBI GenBank nucleotide database (http://www.ncbi.nlm.nih.gov/blast) for further confirmation. For the amplification of the complete CP (978 bp) region of CiYMaV, a separate primer set was designed (Supplementary Table S1), and the remaining procedure was the same as for the duplex PCR.
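The primer concentrations stated for the PCR mixture can be related through the usual dilution relation C1·V1 = C2·V2. The sketch below assumes that the 10 µM value is the stock concentration of each primer, which the text does not state explicitly; the function name is hypothetical.

```python
def stock_volume_ul(stock_conc_um: float, final_conc_um: float,
                    reaction_volume_ul: float) -> float:
    """Volume of stock needed so that C1 * V1 = C2 * V2."""
    if stock_conc_um <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc_um > stock_conc_um:
        raise ValueError("final concentration cannot exceed the stock")
    return final_conc_um * reaction_volume_ul / stock_conc_um


# Assumed 10 uM primer stock, 0.25 uM final concentration, 25 ul reaction.
primer_volume = stock_volume_ul(10.0, 0.25, 25.0)
print(f"{primer_volume} ul of each primer per 25 ul reaction")
```

Under this assumption, each forward or reverse primer would contribute 0.625 µl to a 25 µl reaction.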

Specificity and sensitivity assessment of developed duplex RT-PCR

For specificity assessment of the developed duplex RT-PCR, both primer sets for CYVCV and CiYMaV were individually tested on field samples, along with a plant co-infected with both viruses. Sensitivity experiments were performed for both uniplex and duplex RT-PCR with an initial concentration of 100 ng RNA, using dilutions ranging from 10⁰ to 10⁻⁴ for the preparation of the cDNA template.
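The RNA input corresponding to each step of this tenfold dilution series follows directly from the 100 ng starting amount. The helper below is an illustrative sketch; its name and label format are not from the study.

```python
def dilution_series(start_ng: float, max_exponent: int = 4):
    """Return (dilution label, RNA amount in ng) for 10^0 down to 10^-max_exponent."""
    series = []
    for k in range(max_exponent + 1):
        label = "10^0" if k == 0 else f"10^-{k}"
        series.append((label, start_ng / 10 ** k))
    return series


series = dilution_series(100.0)
for label, amount in series:
    print(f"{label}: {amount:g} ng")
```

With a 100 ng start, the 10^-2 dilution corresponds to 1 ng of RNA and the 10^-4 dilution to 0.01 ng.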

Results

Electron microscopy and ISEM

Kinnow mandarin leaf samples bearing symptoms of yellow vein clearing, mottling, chlorosis, severe mosaic, and necrosis of the leaves showed flexuous filamentous particles measuring ∼650 nm in electron microscopy (Fig. 1). These particles were then tested for decoration with in-house developed polyclonal antibodies (pAbs) specific for ICRSV and CYVCV, each used separately at 1:50 dilution. All the mandarivirus particles remained undecorated in the ISEM assay with the antiserum raised against ICRSV (figure not shown). The antibody raised against CYVCV reacted positively, and a heavy antibody halo around some of the mandarivirus particles was readily visualized in negatively stained preparations; these were hence considered CYVCV particles. The mandarivirus particles that remained undecorated by the CYVCV pAbs were particles of the recently reported virus species CiYMaV of the genus Mandarivirus, as was later confirmed by HTS and RT-PCR assay (Fig. 1).

Fig. 1

Kinnow mandarin leaf showing yellow vein clearing symptom (a). Electron micrograph showing flexuous filamentous particles measuring ∼650 nm from a mandarivirus-infected Kinnow mandarin plant, decorated (right) and undecorated (left) with CYVCV antiserum (1:50 dilution) (b)

Sequencing and de novo assembly of host unaligned reads

Raw and clean read counts for the pooled sample were 21.4 M (million) and 19.7 M, respectively; detailed information is shown in Table 1. Mapping statistics against the host C. reticulata were 79.4% host-aligned reads and 20.5% host-unaligned reads (Table 1). The raw sequence file was later submitted to the SRA database under accession number SRR25212059. After removal of host (C. reticulata) aligned reads, the numbers of assembled contigs generated by Trinity and MEGAHIT for the pooled sample were 1921 and 1978, respectively; the largest contigs were 7553 and 7450 nt; the total lengths were 1.7 M and 1.3 M bases; the GC content was around 42%; and the N50 values (the length of the shortest contig in the group of longest sequences that together represent at least 50% of the nucleotides in the set of sequences) were 839 and 638 (Table 1).
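The N50 definition given above can be made concrete with a short sketch. The contig lengths used here are a toy example and are not the assembly data from this study.

```python
def n50(contig_lengths):
    """Length of the shortest contig among the longest contigs that
    together contain at least 50% of all assembled nucleotides."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


# Toy assembly: total 1500 nt; the two longest contigs (500 + 400)
# already cover at least half of it, so N50 is 400.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))
```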

Table 1 HTS statistics of pooled RNA from Kinnow mandarin leaves

Annotation and identification of viruses and viroids in pooled Kinnow sample

Assembled contigs longer than 250 nucleotides (nt) generated by Trinity and MEGAHIT were annotated using the standalone BLASTn program against related virus/viroid genome sequences retrieved from NCBI. The annotated virus contigs with sequence coverage of more than 75% of the related virus genome sequences were preferentially selected for analysis. BLASTn results showed the presence of near-complete genomes of two viruses, namely CYVCV and CiYMaV, in the pooled sample. CYVCV-associated contigs were more abundant (n = 26) (Supplementary Table S5) than CiYMaV-associated contigs (n = 12) (Supplementary Table S4) in the pooled library. The percentages of viral contigs found were 2.5% for CYVCV and 0.5% for CiYMaV. The maximum contig lengths found for CYVCV and CiYMaV were 7553 nt and 7450 nt, respectively.

Analysis of HTS-generated viral sequences

The nearly complete genome sequences of two important mandariviruses retrieved from HTS, i.e., CiYMaV (7450 nt) and CYVCV (7553 nt), were analyzed for the six ORFs encoding important proteins. The sequence analysis of CiYMaV revealed a 67 nt 5′UTR (1–67) followed by a 4896 nt RdRp gene (68–4963) coding for 1631 aa, a 678 nt TGB-25 K gene (4971–5648) coding for 225 aa, a 330 nt TGB-12 K gene (5626–5955) coding for 109 aa, a 183 nt TGB-6.4 K gene (5882–6064) coding for 60 aa, a 978 nt CP gene (6088–7065) coding for 325 aa, and a 669 nt NB gene (6765–7433) coding for 222 aa. The sequence analysis of CYVCV revealed a 73 nt 5′UTR (1–73), a 4.9 kb RdRp gene (74–5023) coding for 1649 aa, a 678 nt TGB-25 K gene (5030–5707) coding for 225 aa, a 327 nt TGB-12 K gene (5685–6011) coding for 108 aa, a 183 nt TGB-6.4 K gene (5938–6120) coding for 60 aa, a 978 nt CP gene (6143–7120) coding for 325 aa, and a 669 nt NB gene (6820–7488) coding for 222 aa, ending with a 64 nt UTR (7489–7553) (Supplementary Table S2). The genome organization of the CiYMaV and CYVCV sequences recovered in this study is shown in Fig. 2.
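The relationship between ORF coordinates, ORF length, and protein length used above can be checked with a few lines of code. The sketch uses the CiYMaV coordinates reported in the text and assumes that each ORF length includes the stop codon; the function names are illustrative.

```python
# (start, end, reported protein length in aa) for the CiYMaV ORFs in the text.
CIYMAV_ORFS = {
    "RdRp": (68, 4963, 1631),
    "TGB-25K": (4971, 5648, 225),
    "TGB-12K": (5626, 5955, 109),
    "TGB-6.4K": (5882, 6064, 60),
    "CP": (6088, 7065, 325),
    "NB": (6765, 7433, 222),
}


def orf_length(start: int, end: int) -> int:
    """Number of nucleotides in an ORF with inclusive 1-based coordinates."""
    return end - start + 1


def protein_length(start: int, end: int) -> int:
    """Amino acids encoded, assuming the ORF includes its stop codon."""
    nt = orf_length(start, end)
    if nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return nt // 3 - 1


for name, (start, end, reported_aa) in CIYMAV_ORFS.items():
    print(name, orf_length(start, end), protein_length(start, end), reported_aa)
```

For every CiYMaV ORF, the computed protein length matches the value reported in the text.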

Fig. 2

Genome organization of full genome sequence identified in HTS and analyzed by ORF finder, CiYMaV (A) and CYVCV (B)

Variant (SNVs and INDELs) and FPKM analysis

Variant analysis was performed using the HTS-identified viral sequences from the pooled sample. CiYMaV and CYVCV showed maximal SNV counts of 1036 and 873, respectively. One INDEL was recorded for CiYMaV, while three INDEL variations were recorded for CYVCV. CiYMaV and CYVCV were predicted to have FPKM values of 58.67 and 200.12, respectively.

Alignment depth, percent coverage, and krona plot analysis for viruses

After removing the host (C. reticulata) reads from the total reads of the pooled Kinnow mandarin leaf sample, the longest contigs found for CYVCV and CiYMaV were 7553 nt and 7450 nt, respectively. These contigs showed 99% and 100% query coverage with their respective reference sequences, the CYVCV China isolate (Acc. No. NC_026592.1) and the CiYMaV-Pakistan isolate (Acc. No. NC_076409.1). The Krona plot distribution of mandarivirus species in the pooled leaf sample revealed 11% for CYVCV and less than 0.5% for CiYMaV (Supplementary Fig. S2).

Development of duplex RT-PCR along with specificity and sensitivity analysis

All QC-passed individual samples, along with the pooled sample, were checked for the presence of CYVCV and CiYMaV by duplex RT-PCR with an internal control. All 13 samples (12 symptomatic and one asymptomatic) were positive for CYVCV, yielding the 345 bp amplicon, and 3 (symptomatic) of the 13 samples were positive for CiYMaV, yielding the 502 bp amplicon. EF-1α- F, used as the internal control in all samples, produced a 185 bp amplified product. The pooled sample was positive for both viruses and was hence used as a positive control in this study. A specificity test was conducted by RT-PCR to check the cross-reactivity of the primers with non-targeted mandariviruses, using three symptomatic plants and a CYVCV and CiYMaV co-infected (positive control) plant. The primers were specific for their target genes: the CYVCV-specific primers gave an amplified product of 345 bp in all three tested samples and in the CYVCV and CiYMaV co-infected positive sample, and no amplification was observed in the healthy sample. The CiYMaV-specific primers gave a 502 bp amplified product in one of the three samples and in the CYVCV and CiYMaV co-infected positive sample, and no amplification was observed in the healthy sample (Fig. 3-B). Hence, the results for both primer sets confirmed their specificity for the targeted virus genome region, and this was confirmed by Sanger sequencing of the PCR amplicons and BLASTn analysis of the sequences. The sensitivity test for uniplex and duplex RT-PCR, using serially diluted (from 10⁰ to 10⁻⁴) RNA preparations starting at 100 ng from Kinnow mandarin leaf tissue co-infected with CYVCV and CiYMaV, clearly showed sensitivity up to the 10⁻² dilution with the individual primer sets specific for CYVCV and CiYMaV and with the duplex primer sets (Fig. 3-C).

Fig. 3

A Confirmation of viruses in duplex RT-PCR, where lane M: GeneRuler™ 100 bp DNA Ladder (Thermo Scientific™); lanes 1–10: symptomatic Kinnow plant samples; lane HC: healthy plant control; lane PC: positive control (CYVCV and CiYMaV co-infected positive sample). B Specificity test using CYVCV and CiYMaV positive plant samples, where lane M: GeneRuler™ 100 bp DNA Ladder (Thermo Scientific™); lanes 1–3: symptomatic leaf samples; lane PS: CYVCV and CiYMaV co-infected positive sample; lane HC: healthy plant control. Gel picture clearly shows amplification of CYVCV in all three samples along with the CYVCV and CiYMaV co-infected positive sample (a). Gel picture shows amplification of CiYMaV in the 2nd sample along with the CYVCV and CiYMaV co-infected positive sample (b). C Sensitivity assessment of uniplex and multiplex RT-PCR. A comparison of uniplex RT-PCR sensitivity for CYVCV (a) and CiYMaV (b) with multiplex RT-PCR for both CYVCV and CiYMaV (c). RT-PCR sensitivity tests were performed using serially diluted (from 10⁰ to 10⁻⁴) nucleic acid preparations from Kinnow mandarin leaf tissue infected with CYVCV and CiYMaV, where lane M: GeneRuler™ 100 bp DNA Ladder (Thermo Scientific™); lanes 10⁰ to 10⁻⁴: serially diluted nucleic acid, where the starting concentration is 100 ng

Validation of viruses in Kinnow samples by duplex RT-PCR

A total of 63 Kinnow leaf samples were collected from 3 different Kinnow mandarin orchards of the Indian Agricultural Research Institute (IARI), Pusa campus, New Delhi. These samples were then tested with the developed duplex RT-PCR to check for the presence or absence of both mandariviruses. The results revealed that 48/63 samples were positive for CYVCV and 18/63 were positive for CiYMaV, while both viruses were absent in all the Kinnow mandarin leaf samples from plants maintained in the glass house (Table 2). Results for 10 of the 63 samples are shown in Fig. 3-A, which clearly shows the presence of CYVCV in all of these samples; among them, only 7 showed the presence of CiYMaV.

Table 2 Validation of CYVCV and CiYMaV in Kinnow mandarin samples by duplex RT-PCR

Positive predictive value (PPV) and negative predictive value (NPV) were calculated to check the prevalence of CYVCV and CiYMaV in total Kinnow mandarin samples (both symptomatic and asymptomatic) using the standard formula (PPV: total number of positive samples/total number of samples tested). PPV for CYVCV and CiYMaV was found to be 76.19 and 28.57%, respectively. NPV was calculated using the formula (NPV: total number of negative samples / total number of samples tested). NPV for CYVCV and CiYMaV was found to be 23.80 and 71.42%, respectively. Thus, the incidence of CYVCV was found to be 76.19%, while the incidence of CiYMaV was 28.57%.
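The percentages above follow from the sample counts reported for the duplex RT-PCR validation (48 and 18 positives out of 63), using the ratios exactly as defined in the text. The short sketch below reproduces them; the function and variable names are illustrative.

```python
def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, expressed as a percentage."""
    return 100.0 * part / total


TOTAL_SAMPLES = 63
POSITIVES = {"CYVCV": 48, "CiYMaV": 18}

# virus -> (percent positive, percent negative), rounded to two decimals
results = {
    virus: (round(percent(pos, TOTAL_SAMPLES), 2),
            round(percent(TOTAL_SAMPLES - pos, TOTAL_SAMPLES), 2))
    for virus, pos in POSITIVES.items()
}
for virus, (pos_pct, neg_pct) in results.items():
    print(f"{virus}: {pos_pct}% positive, {neg_pct}% negative")
```

Rounding to two decimals gives 23.81% and 71.43% for the negative proportions; the values of 23.80% and 71.42% quoted in the text correspond to truncation rather than rounding.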

Amplification of complete CP gene of CiYMaV in Kinnow field samples

Using specific primers, the complete CP gene (978 bp) of CiYMaV was efficiently amplified from two of the four infected Kinnow leaf samples (Fig. 4). The amplified product of 978 bp was further confirmed in sequencing results as it showed 99.69% nucleotide identity with 100% query coverage and 100% amino acid identity with HTS-retrieved CiYMaV sequence. The CP sequence was further checked for percent identity with CP sequences of other mandarivirus isolates retrieved from NCBI and same were used in phylogenetic analysis.

Fig. 4

Amplification of complete CP gene of CiYMaV (978 bp) and EF-1α- F (185 bp) used as internal control, where lane M: GeneRuler™ 100 bp DNA Ladder (Thermo Scientific™); lane 1–4: symptomatic leaf samples; lane HC: healthy plant control

Sequence identity and phylogenetic analyses of CiYMaV and CYVCV

The CiYMaV-Delhi isolate (Acc. No. OR251442) has a nearly full genome sequence that shared 71% to 98% identity at the nt level with query coverage ranging from 14 to 100% with other known mandarivirus isolates that infect citrus. It shared the highest nt identity of 97.88% and 100% query coverage with CiYMaV-Pakistan isolate (Acc. No. NC_076409.1) infecting citrus and the lowest nt identity of 71% and 14% query coverage was recorded with ICRSV (NC_003093.1). Further analysis of the aa sequences of CP and RdRp of CiYMaV-Delhi isolate showed the maximum identity of 96% and 99% with the CP and RdRp aa sequences of CiYMaV-Pakistan isolate, respectively. The details of the sequence identity analysis of CiYMaV-Delhi isolate with other mandarivirus isolates are shown in Table S3. Full genome sequence of CYVCV-Delhi isolate (Acc. No. OR251443) showed highest nt identity of 98.31% and 99% query coverage with CYVCV-Pali isolate (Acc. No. KT696512.1) from India followed by NCBI-designated CYVCV sequence (Acc. No. NC_026592.1) that showed 97.17% nt identity and 99% query coverage. Further analysis of the aa sequences of CYVCV-Delhi isolate again shared maximum identity of 99% and 98% with the CP and RdRp sequences of the CYVCV-Pali isolate, respectively.

In the phylogenetic analysis, full genome sequences of a total of 28 viruses from the family Alphaflexiviridae, comprising Mandarivirus (22), Allexivirus (1), Botrexvirus (1), Lolavirus (1), Platypusvirus (1), Potexvirus (1), and Sclerodarnavirus (1), were analyzed (Fig. 5). The 22 mandarivirus isolates clustered into one major group in the phylogenetic tree, which was further divided into 3 sub-groups (I, II, and III) comprising 17 isolates of CYVCV (sub-group I), 2 isolates of CiYMaV (sub-group II), and 3 isolates of ICRSV (sub-group III). The evolutionary analysis clearly revealed that the CiYMaV-Delhi isolate (Acc. No. OR251442) showed maximum similarity to the CiYMaV-Pakistan isolate (Acc. No. NC_076409.1), as both fall in sub-group II. The CYVCV-Delhi isolate (Acc. No. OR251443) clustered closely with the CYVCV-Pali isolate from India in sub-group I (Fig. 5A).

Fig. 5

Phylogenetic tree constructed using MEGAX with the Neighbor-Joining method (1000 bootstrap replicates) and distances computed using the Maximum Composite Likelihood model. A Complete genome sequences of Alphaflexiviridae family viruses [citrus yellow vein clearing virus (CYVCV), citrus yellow mottle-associated virus (CiYMaV), Indian citrus ringspot virus (ICRSV), potato virus X (PVX), donkey orchid symptomless virus (DOSV), Lolium latent virus (LoLV), shallot virus X (ShVX), Botrytis virus X (BVX), and Sclerotinia sclerotiorum debilitation-associated RNA virus (SaDaV)] used for evolutionary studies. Comparison of B coat protein (CP) sequences and C RNA-dependent RNA polymerase (RdRp) sequences of Mandarivirus with the isolates characterized in the present study

Further, when the phylogenetic analysis was done using the coat protein gene, the CYVCV and CiYMaV isolates characterized in the present study grouped in two separate clusters, i.e., cluster I and cluster III (Fig. 5B). When the analysis was done using the RdRp gene, the CYVCV-Delhi isolate from India grouped in cluster I, while the CiYMaV-Delhi isolate grouped separately in cluster II (Fig. 5C).

Discussion

Kinnow mandarin (C. reticulata) is an important variety of citrus that has provided an economic impetus to citrus growers in Northwest India, owing to its beautiful golden-orange color, profuse bearing, rich juice content, good quality, and economic returns. Being long-lived perennial trees, Kinnow mandarins are vulnerable to a large number of pests and pathogens, including viral pathogens. Mandariviruses are among them, as they limit the productivity of the trees and the quality of their valued fruits, leading to economic losses (Prabha and Baranwal 2011).

The genus Mandarivirus belongs to the family Alphaflexiviridae of the order Tymovirales, and the virus species of this genus have a linear positive-sense ssRNA (+) genome enclosed in a flexuous filamentous particle. The history of “Indian citrus ringspot virus” dates back to the late 1980s (Ahlawat 1989). The most characteristic symptoms of the disease are conspicuous yellow ringspots on mature leaves. Once the full genome sequence of ICRSV became available in the public domain, it was placed, based on phylogenetic analysis of polymerase and coat protein gene sequences, in a new plant virus genus, Mandarivirus, of a new plant virus family, Flexiviridae (Rustici et al. 2002). With the characterization of another new virus species, “Citrus yellow vein clearing virus”, the genus Mandarivirus was expanded (Loconsole et al. 2012). In India, CYVCV was first reported from Abohar in Punjab state on a citrus cultivar, Etrog Citron (Ahlawat 1997). The viral etiology of citrus yellow vein clearing was established in 2003 (Alshami et al. 2003), and the virus was tentatively named citrus yellow vein clearing virus. The complete genome sequences of CYVCV from Turkey and China showed approximately 74% similarity with ICRSV. Recently, the complete genome of CYVCV has also been reported from India (Meena et al. 2019). Mandariviruses are similar with regard to their natural host range, genome organization, particle morphology, and the symptoms they produce. In nature, they can be present either individually or simultaneously in a common natural host such as citrus (Pant 1995).

In recent years, the incidence of disease induced by two mandariviruses, viz. ICRSV and CYVCV, was recorded in India in ‘Kinnow mandarin’ trees (83.8%), ‘sweet orange’ (70%), and ‘lemon’ (20%), leading to a continuous reduction in flowering, fruit size, quality, and yield (Byadgi and Ahlawat 1995; Prabha and Baranwal 2011; Meena et al. 2019). These viruses were reported to be associated with a 20% decline in fruit yield in China (Li et al. 2017; Chen et al. 2014). The viruses representing the genus Mandarivirus are easily transmitted to their hosts by mechanical means (Liu et al. 2020). Surveillance of mandariviruses is essential to preserve healthy citrus plants. For virus indexing, several diagnostic protocols have been employed, such as biological indexing, electron microscopy (Pant et al. 2018), enzyme-linked immunosorbent assay (Ahlawat and Pant 2003), and nucleic acid-based detection techniques such as RT-PCR (Sharma et al. 2009; Meena and Baranwal 2016), RT-LAMP (Kokane et al. 2021a, b), and RT-qPCR (Kokane et al. 2021a, b). However, all these assays identify only known viruses, for which prior information on the targeted pathogen is required. Bio-indexing can also be used for virus detection, but it is a costlier and more time-consuming assay (Maliogka et al. 2018). Molecular diagnostic assays such as PCR are quick and extremely specific but may require several sets of primers and PCR runs to screen for different viral pathogens. Standardized multiplex PCR for the simultaneous identification of several viruses and viroids may speed up diagnostic experiments, but in some cases problematic interactions may cause results to vary (Maliogka et al. 2018). To overcome these problems, HTS, a well-known and efficient broad-spectrum diagnostic tool in the field of new virus discovery, can be used in combination with relevant molecular techniques (Al Rwahnih et al. 2015; Massart et al. 2014, 2019; Wu et al. 2020).
HTS has an advantage to identify novel viral pathogen and their variants along with other pathogens. In recent years, HTS has been notably used to study diversity and novelty of viruses and viroids in citrus crops (Loconsole et al. 2012; Matsumura et al. 2017; Cao et al. 2018; Wu et al. 2020; Licciardello et al. 2021; Bester et al. 2021).

In this study, we successfully employed HTS coupled with an RT-PCR assay to identify CiYMaV, a mandarivirus, in Kinnow mandarin plants for the first time in India, as a mixed infection with CYVCV. CiYMaV was earlier reported from Punjab province of Pakistan in 2018 in Symons sweet orange (Citrus sinensis L. Osbeck) (Wu et al. 2020). We also recovered the near-complete genome sequences of both viruses from the generated HTS data using bioinformatics applications. Identification of new or unknown viruses requires de novo transcript assemblers, and de novo assembly of short sequencing reads is necessary to retrieve the full compositional and functional information of the virome (Sutton et al. 2019). We reconstructed complete or near-complete genomes to identify both known and unknown viruses from the infected Kinnow mandarin plants, following the procedures described earlier by Sidharthan et al. (2020) and Wu et al. (2020). In our findings, both de novo assemblers (Trinity and MEGAHIT) identified a similar number of viruses, but Trinity assembled larger contigs that covered more of the virus sequences than those from MEGAHIT. An increase in sequencing depth provides better virus genome coverage, and when more than one assembler is used, a virus that escapes detection by one assembler can be detected by another (Massart et al. 2019). Likewise, Sidharthan et al. (2020) found that the use of multiple tissues and assemblers enabled better unraveling of the grapevine virome. In our study, the use of two assemblers (MEGAHIT and Trinity) gave better insight into the citrus virome.

CiYMaV, reported in 2018 from Pakistan, caused mottle symptoms on leaves of Symons sweet orange (Wu et al. 2020), and similar symptoms were detected on leaves of Kinnow mandarin plants in India in the present study. This indicates that its host range extends from Symons sweet orange to Kinnow mandarin. The Krona plot shows all taxonomic levels based on the NCBI taxonomy, from superkingdom to family level, and the associated abundances based on the number of identified viruses. The Krona plot distribution of mandarivirus species in pooled samples of Kinnow mandarin showed a higher percentage of CYVCV than of CiYMaV. The FPKM value for CYVCV was also higher than that for CiYMaV, indicating the dominance of CYVCV. RNA viruses are known to have high mutation rates, up to a million times greater than those of their hosts, which accelerates their evolution and can enhance their virulence (Duffy 2018) and ultimately leads to the formation of variants commonly known as quasi-species (Schneider and Roossinck 2001). SNV analysis showed that CiYMaV was more prone to mutation than CYVCV. Hence, CiYMaV is expected to have greater quasi-species diversity than CYVCV.
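
As a rough illustration of the abundance comparison above, FPKM is commonly defined as the number of fragments mapped to a sequence, normalized by the sequence length in kilobases and by the total number of mapped fragments in millions. The Python sketch below applies that standard definition only. The function name, fragment counts, and genome lengths are hypothetical placeholders and not values from this study, and the abundance estimation tool actually used may apply additional corrections.

```python
def fpkm(fragments: int, length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of sequence per million mapped fragments.

    Standard definition: fragments * 1e9 / (length_bp * total_mapped_fragments).
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length_bp and total_mapped_fragments must be positive")
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical inputs for illustration only (NOT data from this study).
total_mapped = 20_000_000  # total mapped fragments in the pooled library
viruses = {
    # name: (fragments mapped to the genome, genome length in bp)
    "virus_A": (150_000, 7_500),
    "virus_B": (30_000, 7_500),
}

abundance = {
    name: fpkm(count, length, total_mapped)
    for name, (count, length) in viruses.items()
}

# Report the viruses from most to least abundant by FPKM.
for name, value in sorted(abundance.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}\tFPKM={value:.1f}")
```

Because FPKM normalizes for both genome length and library size, two viruses with similar genome lengths in the same library can be ranked directly by their FPKM values, as in the comparison of CYVCV and CiYMaV.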

In conclusion, the ISEM assay, which used ICRSV and CYVCV pABs as the primary screen for viral pathogens, revealed both decorated and undecorated mandarivirus particles in the Kinnow mandarin leaf samples. CiYMaV was subsequently confirmed in the same sample by RT-PCR. HTS-based identification of these mandariviruses in the pooled Kinnow mandarin plant samples provided an overview of the viral diversity. Based on the symptoms observed in the infected plants, the morphology of the virions, the genomic organization, and the nucleotide sequence identity, the isolate was identified as CiYMaV, which is reported here for the first time in India, in association with CYVCV, causing disease in Kinnow mandarin. Validation results confirmed the mixed infection of both viruses in Kinnow mandarin plants. We did not find a single plant infected only with CiYMaV, which indicates that the incidence of CYVCV is higher than that of CiYMaV. ICRSV was not detected by RT-PCR in the same Kinnow mandarin samples that were tested for CYVCV and CiYMaV. The primers developed for duplex RT-PCR, along with the host internal control primers, were highly specific for their targeted regions and were sensitive up to 10⁻² dilution in both uniplex and multiplex RT-PCR. The findings of this study enhance our knowledge of the viral pathogens of Kinnow mandarin and will be helpful in the development of improved virus indexing protocols and certification programs for commercial Kinnow mandarin cultivars.