Members of the family Luteoviridae are distributed worldwide and infect monocotyledonous and dicotyledonous plants [3, 9]. Their 5600- to 5900-nt-long positive-sense single-stranded RNA genomes are encapsidated in icosahedral particles and encode five to seven ORFs [2, 12]. The family includes the genera Luteovirus, Enamovirus and Polerovirus. Whereas members of the genus Polerovirus encode both an ORF0 and an ORF4, members of the genus Luteovirus lack an ORF0, and members of the genus Enamovirus lack an ORF4 [2, 12]. Further, members of the genera Luteovirus and Polerovirus also have a small non-AUG-initiated gene, ORF3a, that is not found in members of the genus Enamovirus [12]. The complete genomes of 28 poleroviruses have been characterized so far (NCBI data). Recently, two novel cowpea-associated poleroviruses provisionally named cowpea polerovirus 1 (CPPV1) and cowpea polerovirus 2 (CPPV2) were identified using a metagenomics-based approach [11]. Specifically, a genomic fragment of 5012 nt was determined for CPPV1 (accession number KX599154) and a fragment of 3164 nt for CPPV2 (accession number KX599164).

The aim of this study was to determine the complete genome sequences of CPPV1 and CPPV2. Total RNA was extracted from cowpea plants BE167 and BE179, from which the two viruses were initially isolated [11], using an RNeasy Plant Mini Kit (QIAGEN, Valencia, CA) as described by the manufacturer. Before the 5’ and 3’ ends of the genomes were determined by RACE-PCR, a large internal part of the CPPV2 genome sequence had to be completed by sequencing RT-PCR fragments amplified with a combination of two primer sets: one set designed based on the phasey bean mild yellow virus (PBMYV) genome and another previously used to obtain the partial CPPV2 genome (Supplementary Table 1). To sequence the 5’ region of each genome, cDNAs were produced with primers PoleroNB202R (CPPV1) and Polero2NB1834R (CPPV2) using SuperScript III Reverse Transcriptase (18080-044). These cDNAs were subsequently tailed in parallel with polyA, G or C using a recombinant terminal transferase TdT enzyme mini kit (New England BioLabs, Beverly, MA, USA). The 5’RACE-PCR was performed as described previously [6]. The sequence of the 3’ end of each genome was determined using a 3’RACE-PCR procedure. Briefly, RNAs were tailed with polyA using a poly(U) polymerase mini kit (New England BioLabs) as described by the manufacturer. Amplicons generated with polyT and virus-specific primers were sequenced directly using automated Sanger sequencing (Beckman Coulter Genomics). Assembly and alignment of the nucleotide sequences and identification of ORFs were performed as reported previously [11]. Maximum-likelihood phylogenetic trees were produced using PhyML 3.1 [4] implemented in MEGA version 6.06 [14] with the JTT+G amino acid substitution model, and branch support was tested using 1000 bootstrap replicates.
Potential recombination events were detected using the RDP, GENECONV, BOOTSCAN, MAXIMUM CHI SQUARE, CHIMAERA, SISCAN and 3SEQ recombination detection methods, which are implemented in RDP4.79, with default settings [8]. KnotInFrame (http://bibiserv.techfak.uni-bielefeld.de/knotinframe) was used to predict the slippery heptamer of the -1 ribosomal frameshift site [15].

Both the 5845-nt CPPV1 genome (accession number KY364846) and the 5945-nt CPPV2 genome (accession number KY364847) have an organization that is typical of poleroviruses [1, 2, 12], with the conserved ACAAAAGA motif at the 5’ end and seven ORFs that are characteristic of poleroviruses, including ORF0, ORF4 and ORF3a (Fig. 1). The lengths of the putative proteins, in amino acids for CPPV1 and CPPV2, respectively, are as follows: P0, 246 and 260; P1, 623 and 654; P1-P2, 1024 and 1059; P3a, 45 and 46; P3, 198 and 197; P4, 190 and 189; and P3-P5, 668 and 698. As in other known poleroviruses, the -1 ribosomal frameshift that produces the P1-P2 fusion protein [5] is predicted to occur at the ‘slippery heptamer’ sequence GGGAAAC, at positions 1632 to 1638 for CPPV1 and 1684 to 1690 for CPPV2. The putative CPPV1 proteins all share less than 85% identity with their counterparts in other poleroviruses [11]. The putative CPPV2 proteins share 44% (P0), 61% (P1), 76% (P3) and 84% (P1-P2) identity with their homologues in phasey bean mild yellow virus (PBMYV, accession number KT963000), 71% (P3-P5) and 67% (P3a) with their presumed homologues in chickpea chlorotic stunt virus (CpCSV, accession number NC008249) and 56% (P4) with their homologue in suakwa aphid-borne yellows virus (SABYV, accession number KF815677).
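The heptamer GGGAAAC reported above fits the commonly cited slippery-site pattern X XXY YYZ. As an illustration only, the following minimal Python sketch scans a sequence for that pattern and reports 1-based inclusive coordinates, the convention the positions above appear to follow. The pattern definition (three identical bases, then AAA or UUU, then any base except G) is a simplifying assumption, the function name is hypothetical, and this is not the KnotInFrame algorithm, which also evaluates downstream pseudoknot structures. The test sequence is a toy string, not viral data.

```python
import re

# Simplified, ASSUMED slippery-site pattern X XXY YYZ:
#   XXX = three identical bases, YYY = AAA or UUU (TTT in DNA notation),
#   Z   = any base except G.
# A lookahead is used so that overlapping candidates are all reported.
SLIPPERY = re.compile(r"(?=(([ACGTU])\2\2(AAA|TTT|UUU)[ACTU]))")


def find_slippery_heptamers(seq):
    """Return (start, end, heptamer) tuples with 1-based inclusive coordinates."""
    seq = seq.upper()
    return [(m.start() + 1, m.start() + 7, m.group(1))
            for m in SLIPPERY.finditer(seq)]


# Toy sequence for illustration only (hypothetical, not from CPPV1 or CPPV2).
toy = "AUGCC" + "GGGAAAC" + "UUAGC"
for start, end, heptamer in find_slippery_heptamers(toy):
    print(f"{heptamer} at positions {start} to {end}")
```

A motif hit alone is not evidence of frameshifting; candidates found this way would still need to be assessed in their structural and reading-frame context.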

Fig. 1

Genome organization of CPPV1 (A) and CPPV2 (B). The ORFs that are likely to represent genes expressing characteristic polerovirus proteins (P0-P5) were identified based on comparisons with other members of the genus Polerovirus. RdRp, RNA-dependent RNA polymerase; CP, coat protein; MP, movement protein; RTD, readthrough domain. The predicted -1 slippery ribosomal frameshifting site is indicated by a filled triangle, and the leaky stop codon is indicated by an asterisk. The two unique recombination events (R1 and R2) detected within the CPPV2 sequence are indicated by red bars below the CPPV2 genome diagram

Maximum-likelihood phylogenetic trees based on the P3 protein (Fig. 2A) and the P1-P2 protein (Fig. 2B) confirmed that CPPV1 and CPPV2 cluster with the known poleroviruses. Whereas CPPV1 clusters with groundnut rosette assistor virus (GRAV) with 75% bootstrap support in the P3 tree, it does not form a strongly supported cluster with any of the other known poleroviruses in the P1-P2 tree. Conversely, CPPV2 clusters with PBMYV isolates with 100% bootstrap support in the P1-P2 tree but forms no strongly supported cluster with any known polerovirus in the P3 tree. It is noteworthy that both CPPV1 and CPPV2 are apparently most closely related to poleroviruses isolated from leguminous plants (phasey bean, groundnut and chickpea).

Fig. 2

Maximum-likelihood phylogenetic trees depicting the relationships between CPPV1, CPPV2, and established members of the family Luteoviridae based on the P3 (A) and the P1-P2 (B) proteins. The following members of the genus Polerovirus that exhibited the highest degree of similarity to CPPV1 and CPPV2 were selected for the comparison: chickpea chlorotic stunt virus (CpCSV), groundnut rosette assistor virus (GRAV), phasey bean mild yellow virus (PBMYV), beet western yellows virus (BWYV), beet mild yellowing virus (BMYV-IPP), cucurbit aphid-borne yellows virus (CABYV), melon aphid-borne yellows virus (MABYV), pepo aphid-borne yellows virus (PABYV-RSA), suakwa aphid-borne yellows virus (SABYV-TW19), cereal yellow dwarf virus (CYDV) and pepper vein yellows virus (PeVYV). One representative of the genus Luteovirus (soybean dwarf virus, SbDV) and one member of the genus Enamovirus (pea enation mosaic virus 1, PEMV-1) were used as outgroups. Branch support values are percentages estimated based on 1000 bootstrap replicates

In a dataset of 12 full polerovirus genomes, four apparently unique recombination events were detected, including two events in the CPPV2 genome. The sites of these two events (R1 and R2) are located within the CPPV2 P2-P3a (R1) and P3/P4 (R2) regions (Fig. 1B), and both appear to have involved phasey bean mild yellow virus (accession number NC028793) as the major parent (i.e., the contributor of the larger fraction of the genome) and an undescribed polerovirus as the minor parent (i.e., the contributor of the smaller fraction of the genome; Fig. 1B). The genomic region acquired from the minor parent encompasses the end of the RdRp gene and the beginning of the CP gene, a genome region that is considered a recombination hot spot in members of the family Luteoviridae [7, 10]. The widespread occurrence of recombination between members of the family Luteoviridae is considered a form of modular evolution that can play a key role in the adaptation of luteoviruses to changing environmental conditions [7, 10, 13].

According to the International Committee on Taxonomy of Viruses, a virus of the family Luteoviridae is considered a member of a new species if it has less than 90% amino acid sequence identity to any previously classified member of the family Luteoviridae in any of its predicted proteins [2]. Based on this species demarcation criterion, CPPV1 and CPPV2 are likely representatives of two new species in the genus Polerovirus.
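The demarcation criterion can be expressed as a simple threshold check. The sketch below is illustrative only: it uses the CPPV2 identity values reported above and assumes that each value is the highest identity to any classified member of the family for that protein. The function names are hypothetical and are not part of any taxonomy tool.

```python
# Amino acid identities (%) reported above for CPPV2 and its closest
# homologues. ASSUMPTION: each value is the highest identity found among
# all classified members of the family for that protein.
cppv2_best_identity = {
    "P0": 44, "P1": 61, "P1-P2": 84, "P3a": 67,
    "P3": 76, "P4": 56, "P3-P5": 71,
}

SPECIES_THRESHOLD = 90.0  # percent amino acid identity, as stated in the text


def proteins_below_threshold(identities, threshold=SPECIES_THRESHOLD):
    """List the proteins whose best identity falls below the threshold."""
    return sorted(name for name, value in identities.items() if value < threshold)


def meets_demarcation_criterion(identities, threshold=SPECIES_THRESHOLD):
    """True if at least one predicted protein is below the threshold."""
    return any(value < threshold for value in identities.values())


print(proteins_below_threshold(cppv2_best_identity))
print(meets_demarcation_criterion(cppv2_best_identity))
```

Under these assumptions, every CPPV2 protein listed falls below the 90% threshold, which is consistent with the conclusion drawn in the text.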