Introduction

Prune dwarf virus (PDV, genus Ilarvirus, family Bromoviridae) is a commonly found pathogen in stone fruits, such as sour cherry (Prunus cerasus), sweet cherry (P. avium), almond (P. dulcis), apricot (P. armeniaca), plum (P. domestica), and peach (P. persica) (Vaskova et al. 2000; Paduch-Cichał et al. 2011). PDV plant infection is recognized through various symptoms, including leaf deformity, chlorotic rings and spots on leaves, and stunting. These symptoms can vary depending on the host plants, virus isolates, and environmental factors (Fonseca et al. 2005; Predajňa et al. 2017; Kamenova et al. 2019). Except for the transmission by grafting, the transmission of PDV occurs vertically via pollen or seeds (Kelley and Cameron 1986; Silva et al. 2003). As a result, PDV infection highly affects the health and yields of susceptible plants (Uyemoto and Scott 1992).

A multipartite PDV genome consists of three positive-sense single-stranded RNA segments. RNA1 possesses a single open reading frame (ORF) that encodes the non-structural P1 protein, which is involved in virus replication (Pallas et al. 2013). Similarly, RNA2 contains a single ORF encoding the P2 protein, which is an RNA-dependent RNA polymerase and an element of the replication complex (Kozieł et al. 2017). RNA3 has two ORFs: ORF3a and ORF3b. ORF3b encodes the movement protein (MP), which is immediately translated from RNA3. In ORF3a, a downstream subgenomic promoter region enables the transcription of sgRNA4, which encodes ORF3b from which the viral coat protein (CP) is translated (Pallas et al. 2013). The MP generates tubular structures that allow cell-to-cell viral transportation via plasmodesmata (Kozieł et al. 2018); on the other hand, CP performs an important role in genome activation, which is essential for viral replication and encapsidation of viral particles (Bachman et al. 1994; Pallas et al. 2013).

Initial investigations on variations in the PDV genome in different Prunus spp. mostly focused on the CP gene (Vaskova et al. 2000; Öztürk and Çevik 2015; Zong et al. 2015; Kamenova et al. 2019; Çelik and Ertunç 2020). These studies indicated the high divergence of the nucleotide (nt) sequences of the CP gene from different hosts (Fonseca et al. 2005; Ulubaş Serçe et al. 2009; Öztürk and Çevik 2015). In spite of the variable biological and serological characteristics of PDV isolates, the molecular characterization of their CP sequences failed to clearly differentiate isolates on the basis of host or geographical origin relationship (Vaskova et al. 2000; Öztürk and Çevik 2015). Thus, molecular variability studies must be extended to other coding regions, e.g., the MP gene (Predajňa et al. 2017).

Selection analysis can be used to reveal evolutionary processes to gain insights into the molecular epidemiology and extent of virus genomic variability (Farzadfar and Pourrahim 2019; Santosa and Ertunç 2021). The genetic diversity analysis of plant virus populations can also be used to determine the best disease management options, including engineering plants that show resistance (Sokhandan-Bashir and Melcher 2012). Although PDV, which is distributed worldwide, is a common and traditional stone fruit virus, its population genetic structure and molecular evolution remain poorly understood and require further exploration. This study aimed to determine the genetic variability of PDV populations using the most recent database data and compare the population genetic structures of isolates of different geographical origins, with emphasis on the PDV population in Turkey.

Materials and methods

PDV isolates and recombination analysis

The nt and amino acid (aa) sequences of P1, P2, MP, and CP genes of PDV, which are available in the National Center for Biotechnology Information (NCBI) GenBank, including 41 Turkish isolates originating from four Prunus spp. from different regions of the country (Ulubaş Serçe et al. 2009; Öztürk and Çevik 2015; Çelik and Ertunç 2020), were analyzed in this study. The complete nt sequence of the P1 gene spans nt 39–3206 in RNA1 (reference isolate: accession No. NC_008039). The P2 gene spans nt 32–2398 in RNA2 (reference isolate: accession No. NC_008037). The MP spans nt 260–1141 in RNA3, and the CP gene at nt 1214–1870 in RNA3 (reference isolate: accession No. NC_008038). Some divergent isolates show differences in the length of their respective coding regions due to indel mutations.

A total of 51 isolates with complete RNA3 sequences, which are available in the NCBI GenBank database (accessed on October 11, 2021), were aligned and trimmed using CLUSTAL W implemented in MEGA X v.10.2.4 (Kumar et al. 2018) to create two alignments involving the MP and CP regions. A total of 26 other isolates, for which only complete or nearly complete MP sequences were available, were added to the MP region alignment to generate a final MP dataset of 77 isolates. Then, 103 other isolates, for which only complete CP sequences were available, were added to the CP region alignment to generate a final CP dataset of 154 isolates. Two other alignments were performed on 9 and 11 isolates for which the complete ORF of P1 and P2, respectively, are available in NCBI GenBank (Online Resource 1-Table 1).

The recombination signals on the P1, P2, MP, and CP sequence alignments were scanned using Recombination Detection Program (RDP v.4.56) (Martin et al. 2015); credible signals included only those supported by five or more of the RDP, GENECONV, BootScan, MaxChi, ChiMaera, SiScan, and 3Seq algorithms with a Bonferroni-corrected P value of < 0.05. Further analyses of this study excluded all recombinant isolates to conduct an accurate phylogenetic analysis.

Phylogenetic and comparative nt and aa analyses

Four phylogenetic trees based on MP and CP nt sequences were built using maximum-likelihood (ML) and neighbor-joining (NJ) algorithms in MEGA X v.10.2.4 (Kumar et al. 2018). Two ML trees were constructed to observe the phylogrouping of isolates based on the P1 and P2 coding regions. The models for nt substitution of each alignment were selected using the lowest Bayesian information criterion score. The statistical significance of isolate clusters was determined using Tamura 3-parameter (Tamura 1992), with 1000 bootstrap replicates, uniform rate among sites, and complete deletion of missing data treatment.

Analysis of aa variations in the MP and CP regions

Estimation of the percentage identities of P1, P2, MP, and CP genomic regions at the nt and aa levels was performed using Sequence Demarcation Tool v1.2 software (Muhire et al. 2014). The aa sequences of the MP region of 8, 7, and 5 isolates represented groups 1, 2, and 3, respectively. The aa sequences of the CP region of 19, 11, and 10 isolates represented groups, 1, 2, and 3. All these aa sequences were aligned in BioEdit v.7.2.5 (Hall 1999) to examine the sequence variation among isolates from different phylogroups.

Genetic diversity and polymorphism analyses

The genetic diversities of P1, P2, MP, and CP nt sequences among non-Turkish and Turkish isolates were estimated using DnaSP v.6.12.03 (Rozas et al. 2017), based on the confidence intervals of several parameters: the number of haplotypes = h, haplotype diversity = Hd, the number of variable sites = S, total number of mutations = η, average nt differences between sequences = k, nt diversity (per site) = π; the ratio of nonsynonymous to synonymous site (dN/dS) = ω.

Neutral selection analysis

Statistical tests of Tajima’s D, Fu and Li’s D*, and Fu and Li’s F* (without out-group; window length: 100 sites and step size: 25 sites) were performed using DnaSP v.6.12.03 (Rozas et al. 2017) to explain the demography of PDV populations. The differences between the number of segregating sites and the average nt differences served as the basis for Tajima’s D (Tajima 1989). The differences between the number of singletons and the total number of mutations were the basis for Fu and Li’s D*, and that for Fu and Li’s F* included the differences between the number of singletons and the average nt differences between pairs of sequences (Fu and Li 1993).

Gene flow and genetic differentiation among populations

The KS*, KST*, Z*, Snn, and FST values (Hudson et al. 1992; Hudson 2000), which determined the genetic differentiation of P1, P2, MP, and CP nt sequences among phylogroups, respectively, were calculated using DnaSP v.6.12.03 (Rozas et al. 2017). The KST* value is close to zero in the absence of genetic divergence among the tested populations (Tsompana et al. 2005). The small Z* value indicates a low genetic isolation (Hudson et al. 1992). The value of Snn ranges from 0.5 (the exact same population) to 1 (distinctly separated populations) (Hudson 2000). The FST value ranges between a minimum of 0, which is used to describe strictly the same populations, to a maximum of 1, which is used to describe entire distinct populations (Hudson et al. 1992; Tsompana et al. 2005). FST > 0.25 usually implies a high gene flow and large genetic isolation among populations (Gao et al. 2016). The calculated FST values among the three sampling regions in Turkey (AHK region: Adana, Hatay, and Kahramanmaraş; BB region: Bursa and Bilecik; AI region: Afyon and Isparta) provided insights into the genetic distances of isolates from different regions in Turkey.

Results

Phylogenetic and recombination analyses

The exact same topology was observed between the two NJ and two ML phylogenetic trees based on MP and CP alignments, that is, the PDV isolates were divided into three major lineages (Fig. 1; Online Resource 2- Figs. 1 and 2). Turkish isolates were detected in all the three phylogroups. DL, NS9, and PCH4 isolates were consistently positioned in the same groups in the trees based on the P1, P2, MP, and CP comparisons. However, given the currently limited availability of the complete sequences of P1 and P2, a general consensus for the phylogrouping of the four genomic regions could not be established (Fig. 1; Online Resource 2-Table 1). Recombinant events were detected in three different parts of P1 of KY883327 (Table 1).

Fig. 1
figure 1

Maximum-likelihood trees constructed using Tamura 3-parameter model (T92) with uniform rates and complete deletion of missing data implemented in MEGA X software and based on nucleotide sequences of prune dwarf virus P1 (A), P2 (B), MP (C), and CP (D) genomic regions. A total of 77 and 154 isolates were included in the MP and CP comparison, respectively. Bootstrap values on each branch were supported with 1000 bootstrap replications (only values > 50% are shown). The accession numbers for isolates in the compressed branches are listed in Online Resource 1-Table 1

Fig. 2
figure 2

The most significant genetic linkages between the prune dwarf virus populations of three sample collection sites in Turkey: BB region: Bursa + Bilecik provinces (11 isolates); AI region: Afyon + Isparta provinces (25 isolates); AHK region: Adana + Hatay + Kahramanmaraş provinces (4 isolates). Linkages are indicated by their fixation index (FST) values which were calculated based on coat protein region comparisons

Table 1 Putative recombination events detected in the P1 region by RDP4 analysis of PDV isolates

Analysis of nt and aa variations

The comparative analysis suggested the relatively high nt and aa identities among all studied isolates in the P1, P2, MP, and CP regions. In general, the nt identity among isolates in the MP region was higher than the aa identity, and in the CP region, the identity was higher at the aa level (Table 2).

Table 2 Nucleotide (nt) and amino acid (aa) identities in the P1, P2, movement (MP) and coat protein (CP) genomic regions among PDV phylogroups

The MP and CP genomic regions exhibited highly conserved aa sequences. The three phylogroups in the MP region revealed five major aa residue mutations at positions 14, 189, 258, 279, and 280 (RefSeq NC_008038) (Online Resource 2-Fig. 3). The CP region showed seven major aa residue changes at positions 6, 13, 24, 27, 97, 173, and 177 (Online Resource 2-Fig. 4). In addition, six additional residues were observed in the nt sequences of five almond isolates from the Portuguese population (3.19.A1.2, 3.12 N.14, 3.17.A1, 3.20.2, and 3.20.1), which translated into two additional aa residues at positions 38–39. Twelve Portuguese, one Italian almond, and one apricot isolate from Greece (B3) showed three additional nts (ACT) at positions 622–624, which were translated into one extra aa residue (T, Threonine) at position 208. These changes were not observed in the other two almond isolates (4B from Portugal and PDV-AL2 from Italy) (Online Resource 2-Fig. 4). More substitutions were observed in the aa sequences of the MP and CP genes of group 3 isolates compared with the other group isolates.

Genetic diversity and polymorphism analyses

In general, the three phylogroups and different countries showed higher nt diversities (π values) among in the CP genomic region than in the MP genomic region, except the populations from Bulgaria, Poland, and Slovakia, which exhibited higher diversities in the MP region. Only the Hd value of group 3 in the MP genetic comparison reached 1.000, which further supported the indication of a relatively low nt diversity among PDV isolates. The Turkish population had Hd and π values of ere 0.998 and 0.0806, respectively. In the CP genomic region, the Turkish population had considerably higher S, η, and k values than the populations from other countries. These results showed that Turkish isolates, which were distributed in all three phylogroups, shared a large genetic variation. MP and CP gene comparisons showed that group 2 had higher S, η, and k values than the other two phylogroups (Table 3).

Table 3 Genetic diversity and polymorphism analyses of P1, P2, MP and CP genomic regions of PDV variants from different phylogroups and countries

The dN/dS was also calculated to analyze the evolutionary forces acting on the MP and CP genes. According to the results, the MP and CP genomic regions experienced very intense negative selection pressures, with a stronger pressure observed in the MP region (ω = 0.0669) than in the CP region (ω = 0.1795). The purifying selection pressures on the MP region of the Bulgarian, Slovak, and Polish isolates (ω = 0.0701, 0747, 0.0811, respectively) were also more intense than on their counterparts in the CP genomic region (ω = 0.1314, 0.1478, 0.1487). Therefore, the MP gene may have played an important role in the adaptive evolution of global PDV isolates. As expected, the negative selection pressures on the highly variable CP region of Turkish isolates (ω = 0.2279) was weaker than those of other countries except Portugal (ω = 0.3366) (Table 3). In addition, P2 had a higher genetic divergence among its isolates and experienced a weaker purifying pressure than P1 (Table 3).

Neutral selection analysis

Neutrality tests estimated positive values for P1 and P2 and negative values for all three phylogroups in the MP and CP genomic regions. However, the calculations by Fu and Li’s D* and Fu and Li’s F* tests for all groups 1 and 2 isolates were more convincing given the support from significant values (Table 4). All three neutrality tests predicted positive values for Bulgarian and Polish populations in the MP region comparison. As for the CP gene, Fu and Li’s D* and Fu and Li’s F* tests only assigned positive values to the Polish population (Table 4).

Table 4 Results from demographic test statistics between sequences of P1, P2, MP and CP genomic regions of PDV populations

Gene flow and genetic differentiation among populations

The analysis showed statistically significant KS*, KST*, and Z* values in the MP and CP genetic comparisons of all phylogroups. The Snn values among phylogroups all reached 1.0000 except group 1/group 2 for the CP gene (0.9760). These results were in line with those of the phylogenetic analysis, which showed significant genetic variations among the three PDV lineages. In the comparison of Turkish isolates with other isolates from different phylogroups, non-significant KS*, KST*, and Z* values were obtained only in Allλ/Turkey 1 and group 1λ/Turkey 1 (Table 5). However, these results were driven only by the presence of three Turkish isolates in group 1.

Table 5 Genetic differentiation estimates for lineages of PDV, based on MP and CP gene sequence comparison

Compared with the CP region, the MP region showed higher FST values, which were greater than 0.25, among phylogroups. The analysis results also indicated that in general, the gene flow from groups 1 and 2 to group 3 was lower than that from group 1 to group 2. Furthermore, the gene flow from non-Turkish to Turkish isolates in group 3 (Turkey 3) was lower than the gene flow to Turkish isolates in groups 1 (Turkey 1) and 2 (Turkey 2) (Table 5). Pairwise comparisons with two other collection sites revealed that the isolates collected in BB region had FST values > 0.25 (AI region: Afyon and Isparta provinces; AHK region: Adana, Hatay, and Kahramanmaraş provinces) (Table 5). Figure 2 summarizes graphically the genetic linkages between isolates from the three sample collection sites in Turkey.

Discussion

The significant genetic diversity of RNA plant viruses is largely due to recombination and mutations in virus populations. The distribution of variability itself depends on two main evolutionary processes: genetic drift and selection (García-Arenal et al. 2001). Knowledge of the molecular basis for adaptations of plant RNA viruses to different hosts and geographical ranges and their population structure can provide data for the development of effective detection tools and strategies for efficient phytosanitary measures (Jones 2009; Çelik and Ertunç 2021). PDV is one of the important viruses hampering stone fruit production worldwide, including that in Turkey. Therefore, this study performed updated and detailed diversity and evolutionary analyses of P1, P2, MP, and CP genes of non-Turkish and Turkish isolates currently available in GenBank.

Our study showed the absence of recombination events in the MP and CP genomic regions, which is in agreement with the lack of recombination in the CP region reported by Kalinowska et al. (2014). These results were also in agreement with the observation of Predajňa et al. (2017), who reported that the recombination breakpoints of isolates PDV-PE247 (accession No. GU187047), PDV-PA78 (GU181403), PDV-SOF15P11 (GU181404), 1048 C (KU949338), and 235 C (KU949346) were found on the central part of RNA3 (within or close to the intergenic region) and did not affect the MP and CP genes directly.

Phylogenetic analysis of the studied PDV isolates showed their affiliation to three major groups (1, 2, and 3) and the same topology in MP- and CP-based trees. The phylogroups did not exhibit a distinct relation with hosts and geographical origins, and the isolates obtained from different Prunus spp. and countries were integrated together in the same group and subgroup. This finding indicates the long-term diversification of PDV and the spread of PDV isolates facilitated by the intensive global trade of natural PDV hosts. This phylogrouping was similar to those in other recent reports (Öztürk and Çevik 2015; Predajňa et al. 2017; Kamenova et al. 2019; Song et al. 2020), which indicated that genetically, PDV isolates are not highly variable, as shown by the results of the identity analysis of nt and aa sequences conducted in this study. Most isolates whose MP and CP sequences are available were clustered in an identical manner to the MP and CP trees, as also observed by Predajňa et al. (2017). Hence, these data raised the possibility that phylogrouping of PDV isolates can be based on either the MP or CP gene nt sequences, with the same results. However, further analysis involving P1 and P2 showed that substantially more complete sequences are needed to fully understand the PDV phylogeny.

MP aa sequences were highly conserved among the isolates, suggesting the production of synonymous substitutions by most of nt changes in the region. On the other hand, the CP aa sequences presented considerably lower identity percentages than the nt sequences. The majority of nt changes in the CP region generated nonsynonymous aa substitutions, most of which occurred on group 3 isolates, which contained seven major aa differences compared with groups 1 and 2 isolates (Online Resource, Fig. 2). Thus, nt changes became one of the bases for PDV phylogrouping. However, these changes showed no negative effect on the fitness of group 3 isolates. The aa substitution A13T in KPTTRSQSFA motif in the CP (position aa 10–19) of some isolates, as reported by Ulubaş Serçe et al. (2009) and Predajňa et al. (2017), was also observed on a number of other isolates belonging to all three phylogroups; this finding suggested that T13 and A13 residues are interchangeable in this motif, which is involved in the RNA-binding activity of CP. As mutation evidently contributed to shaping of the population structure of PDV, the information obtained here can be considered in breeding programs to develop varieties with increased resistance to PDV.

Only the Hd value of group 3 in MP and CP comparisons among phylogroups reached 1.000, further indicating that the PDV population consisted of many identical haplotypes. The isolates from Slovakia (in MP and CP) and Hungary, Italy, and Portugal (in CP) also had Hd values of 1.000, showing the high divergences of their respective isolates. In the MP and CP comparisons, the lowest S, η, k, and π values were all correlated to group 3 isolates, indicating their high genomic similarities. Among the populations from various countries, Turkish isolates had the highest S and η and the second highest k and π values (after Hungarian) in the CP comparison. Thus, the Turkish isolates, which constituted almost a third of the tested isolates and were positioned in all the three phylogroups, exhibited a high variability. The high variations were probably due to the weak purifying selection pressure on Turkish isolates, especially those in group 2 (Turkey 2) (ω = 0.3085). Turkey 2 isolates also recorded higher S, η, k, and π values than Turkey 1 and 3. The stronger negative constraint on the MP than the CP gene is likely the most important evolutionary factor that made the MP gene more conserved than the CP gene.

Neutrality tests assigned positive values to P1, P2, Bulgarian, and Polish populations, and these values indicated that the populations emerged from an excess of intermediate frequency alleles and can evolve out of other population through bottlenecks, balancing, and/or structural selection (Biswas and Akey 2006). The positive values also suggested that Bulgarian and Polish isolates emerged or were introduced a long time ago and have been endemic in their respective country, or that they were results of vegetative propagation from the same source(s). Moreover, members of the Polish population, which consisted of apricot, peach, plum, sour cherry, and sweet cherry isolates, might have infected different hosts at various times. On the other hand, the negative values for the other PDV populations belonging to different phylogroups and countries indicated their population expansion (Tsompana et al. 2005). Therefore, new haplotypes can be predicted to emerge among these expanding populations.

Gene flow and genetic differentiation analyses presented significant KS*, KST*, Z*, and Snn values and vividly distinguished the three phylogroups in the MP and CP comparisons. Based on the obtained FST values, group 1 had a higher gene flow to group 2 than group 3 and thus shared more genetic similarities with group 2 than with group 3. This finding is in agreement with phylogenetic analyses that clustered group 1 closer to group 2 than group 3. The FST values also indicated that Turkey 1 was more similar to Turkey 2 than Turkey 3. Moreover, gene flow analysis revealed that non-Turkish isolates had considerably higher differences from Turkey 3 (FST = 0.4954) than from Turkey 1 and 2 (FST = 0.0838, 0.1714, respectively).

The observation of gene flow among isolates from three different collection sites in Turkey showed that the 11 isolates from BB region, which were all from peach and formed a distinct subcluster within group 3, were genetically isolated from the other Turkish isolates obtained in either AI or AHK region, which were from apricot, sweet cherry, and sour cherry and were distributed in all three phylogroups. Meanwhile, the comparison between isolates from AI and AHK regions yielded a very small negative FST value (–0.0212) and suggested the lack of divergence between them. Therefore, the obtained data also indicated relatively strong relations between the phylogrouping of Turkish isolates with host species and geographic locations.

In summary, the detailed genetic diversity and evolutionary studies on P1, P2, MP, and CP genes have expanded our knowledge on the structure of non-Turkish and Turkey PDV populations. These results may provide a basis for studies of other genomic regions, including isolates from other parts of the world (Asia, Africa, and Oceania) from which very few isolates have been molecularly characterized to date.