Introduction

Apple mosaic virus (ApMV), first reported in Rosa spp. and Malus domestica in the USA [1], is a species in subgroup III of the genus Ilarvirus (family Bromoviridae) [2]. In addition to roses (Rosa sp.), ApMV infects a number of woody plants including blackberry (Rubus), raspberry (R. idaeus), black raspberry (R. occidentalis), red currant (Ribes rubrum), apple and crab apple trees (Malus pumila, M. silvestris), apricot, cherry, almond, plum, and peach (Prunus sp.), and mountain ash (Sorbus aucuparia) (all from the family Rosaceae); hazelnut (Corylus avellana), silver birch (Betula pendula), and chestnuts (Aesculus hippocastanum, A. × carnea, A. flava, A. parviflora) (all belonging to the order Fagales); and hop (Humulus lupulus, family Cannabaceae) [3]. One isolate is known from strawberry (Fragaria sp.) [4], and six isolates were found in lichen symbiotic Trebouxia algae [5]. The leaves of infected trees show a typical chrome-yellow line pattern, bright yellow blotches and vein clearing, rings, and an oak-leaf pattern, which appear at the beginning of summer [6]. The symptomatic leaves often drop prematurely. The symptomatology is generally not of diagnostic significance, because similar symptoms may also be produced by other Ilarviruses [6]. In hop, leaves of sensitive cultivars show a light green or yellow pattern that may become necrotic, although many insensitive cultivars show no symptoms of virus infection [3, 7]. ApMV is a mechanically and graft-transmissible pathogen that is present worldwide and persists in propagative material, which is probably the main source of infection by the virus. It is not pollen-borne, nor does it occur in seedling rootstocks. It has not been identified in naturally infected weeds, and the source of the virus remains unknown [8]. In hops, ApMV infection decreases bitter alpha acid yield by 5–34 % [7].
In pome fruits, the infection causes growth retardation and premature leaf drop; in stone fruits, it may cause growth reduction and yield losses; and in some sensitive almond cultivars, the virus induces failure of blossom and leaf failure symptoms [9].

Studies on potyviruses have shown that the extant populations of different species are only decades to centuries old, reflecting an evolutionary burst related to the intensification of agricultural practices, colonization of new areas by Europeans, and post-Columbian world trade over the last 500 years [10]. Previous sequence analyses of ApMV performed on a limited number of strains revealed significant similarity among the isolates from distinct hosts such as hop, apples, and stone-fruit plants [7, 11–13]. An unusually high sequence identity of isolates from geographically distant localities points to a hypothesis of host-conditioned modification and fixation of nucleotide substitutions in fruit hosts from the subfamily Maloideae, in contrast to hop, Prunus, and other host species of ApMV. In addition to the effects of the host, an accidental transfer of the virus from nonrelated hosts within a locality could also contribute [11]. The source of the virus inside lichen algae is completely unknown, since four isolates were from lichens growing on tree bark and two from lichens growing on the ground [5]. Adaptation to the host species is of fundamental importance, as it elevates the reproductive rate of the virus above the critical value needed for sustained transmission [14].

Molecular evolution of viruses is driven by both selection and recombination. The fixation rate of positive ssRNA viruses is in the range of 10⁻²–10⁻⁴ mutations/site/year [15]. However, the strong bottlenecks that viruses experience during migration between different hosts, or between plants of a given host via a vector [14], are unlikely to apply to Ilarviruses, because no insect vector is known to transmit them. Therefore, another genetic bottleneck, such as selection during virus movement in plants, may be more important here [16]. In Ilarviruses, the coat protein (CP) is generally required for systemic movement and may be required for cell-to-cell spread [17]. The RNA-binding and protein-binding activities of the CP are crucial for its proper function. A highly conserved arginine in the Q/K/R·P/N·T·X·R·S·R/Q·Q/N/S·W/F/Y·A binding consensus sequence and a zinc-finger motif (C·x2·C·x·h·x·H·x3·c·x2·C·x2·C·H/C) have been recognized in the N-terminal part of the CP of several Ilarviruses [18–20].

In ApMV, the consensus CP sequence has been established as having 654 nt, but isolates with insertions of 6–15 nt after nt position 141 have been described [12]. Isolates from apple, pear, and cherry trees are also known with CP genes encoding proteins 221, 222, and 223 aa long, respectively. Another length variant, with a CP 220 aa long, is known from almond, as are variants from hop, prune, and mahaleb, which are 218 aa long [11–13, 21–25]. A stringent and robust criterion for detecting adaptive evolution in a protein-coding gene is an accelerated nonsynonymous (Pi(a), amino acid replacing) rate relative to the synonymous (Pi(s), silent) rate of substitutions, with the rate ratio Pi(a)/Pi(s) > 1. As silent mutations do not change the amino acid whereas replacement mutations do, the difference in their fixation rates provides a measure of selective pressure on the protein. In this study, we analyzed CP sequences of 65 ApMV isolates, including 23 new sequences from hop and new sequences from mountain ash, hazelnut, peach, and apricot, to determine the molecular variability of ApMV and examine the correlations between amino acid substitutions and host species or geographic origin.

Materials and methods

Virus isolates

Twenty-three ApMV isolates from asymptomatic hop plants originating from different locations were obtained from the collection kept at the Hop Research Institute Co., Ltd. (Žatec, Czech Republic). Isolates from mountain ash (Sambucus nigra), hazelnut (Corylus avellana), peach (Prunus persica), and apricot (Prunus armeniaca) originated from the Czech Republic and Italy (Table 1).

Table 1 ApMV isolates and sequences used in this study

Nucleic acid isolation and reverse transcription

Infected leaves from plants growing either in a screen house or in the wild were used in this study. The nucleic acid was isolated from 100 mg of plant material using TriPure Isolation Reagent (Roche), then eluted to obtain 80 μl of analyte. Reverse transcription was performed from 750 ng of total RNA in a 50 μl mixture containing 1× First-Strand Buffer (Invitrogen), 0.5 μg random hexamers (Roche Diagnostics), 0.5 mM dNTP, 40 U RiboLock™ RNase Inhibitor (Fermentas), 4 mM DTT (Invitrogen), and 40 U MuMLV Reverse Transcriptase (Invitrogen) for 55 min at 42 °C.

Amplification and sequencing

Amplification was performed with TaKaRa Ex Taq™ polymerase (Takara) using primers 5′-GGCCATTAGCGACGATTAGTC-3′ and 5′-ATGCTTTAGTTTCCTCTCGG-3′, which amplify the complete CP gene located within nt positions 1126–1794 [9]. Cycling parameters were as follows: 35 cycles of denaturation at 94 °C for 30 s, annealing at 57 °C for 30 s, and extension at 72 °C for 45 s, followed by a final extension at 72 °C for 7 min. PCR products were purified using the MinElute Gel Extraction Kit (Qiagen), ligated into the pSC-A vector, and transformed into competent cells. Recombinant plasmids harboring a cDNA insert were sequenced in both orientations using M13 primers (Macrogen, the Netherlands).
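As a rough plausibility check on the primers and the 57 °C annealing step, the minimal sketch below computes the length, GC fraction, and a Wallace-rule melting temperature estimate for each primer, together with the length of the amplified region. The Wallace rule (2 °C per A/T, 4 °C per G/C) is a generic rule of thumb introduced here for illustration and is not a method reported in this study; the names PRIMER_1, PRIMER_2, and all function names are hypothetical, and the region length assumes that positions 1126–1794 are inclusive coordinates.

```python
# Primer sequences as given in the text (5' to 3').
PRIMER_1 = "GGCCATTAGCGACGATTAGTC"
PRIMER_2 = "ATGCTTTAGTTTCCTCTCGG"


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Wallace-rule Tm estimate in degrees C (assumption: rule of thumb only)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def span_length(start, end):
    """Length of a region given inclusive start and end positions."""
    return end - start + 1


for name, primer in (("primer 1", PRIMER_1), ("primer 2", PRIMER_2)):
    print(name, len(primer), "nt",
          "GC =", round(gc_fraction(primer), 2),
          "Tm ~", wallace_tm(primer), "C")
print("amplified region:", span_length(1126, 1794), "nt")
```

Under these assumptions, the estimate for the lower-melting primer sits just above the annealing temperature used, which is the usual relationship one would expect.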

Data analysis

The resulting sequences were analyzed using the BioEdit 7.0.9 program (Ibis Biosciences, USA) and deposited in GenBank under accession numbers HE866936 to HE866962. Additional complete CP sequences published previously were also used for the analyses (Table 1). The CP sequence of PNRSV (NC_004364) was used as an outgroup.

Multiple alignments

Multiple alignments were carried out using Clustal X version 2.0 [26], with the gap opening penalty set to 15 and the gap extension penalty set to 6.66.
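To give a feel for how the two penalties interact, the sketch below evaluates a simplified affine gap cost: one opening charge plus an extension charge for every gapped column. This is an illustrative assumption about the general form of such a cost and not a reproduction of Clustal X, which further adjusts penalties according to position and sequence context; the function name is hypothetical.

```python
GAP_OPEN = 15.0     # gap opening penalty stated in the text
GAP_EXTEND = 6.66   # gap extension penalty stated in the text


def affine_gap_cost(length, gap_open=GAP_OPEN, gap_extend=GAP_EXTEND):
    """Cost of one gap spanning `length` columns under a simple affine model."""
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0
    return gap_open + gap_extend * length


# Compare one long gap with several short gaps covering the same columns.
one_gap_of_six = affine_gap_cost(6)
two_gaps_of_three = 2 * affine_gap_cost(3)
print(round(one_gap_of_six, 2), round(two_gaps_of_three, 2))
```

In this simplified model a single contiguous gap is always cheaper than several separate gaps of the same total length, which is why insertions such as those described for some ApMV CP genes tend to be aligned as one block.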

Recombination

The possible occurrence of recombination was tested using the default conditions of the suite of programs included in RDP3 Beta41 [27].

Phylogenetic analysis

MEGA 5 was used to find the best-fitting nucleotide substitution model for our data [28]. The optimal model was selected according to the Bayesian information criterion implemented in the program, which favored the Kimura 2-parameter model with a discrete gamma distribution (+G) and five rate categories, or the same model additionally assuming that a certain fraction of sites is evolutionarily invariable (+I).
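The core of the Kimura 2-parameter model can be illustrated by its pairwise distance formula, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below is a minimal stand-alone version for two aligned sequences; it does not model the gamma rate variation (+G) or invariable sites (+I) used in the study, it is not MEGA's implementation, and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (not ApMV data): one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```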

Substitution rates and evolution

DnaSP v5 [29] was used to analyze synonymous and nonsynonymous substitutions according to the Nei–Gojobori model with the Jukes–Cantor correction, and to carry out the neutrality test.
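The Jukes–Cantor correction converts an observed proportion of differing sites p into an estimated number of substitutions per site, d = −¾ ln(1 − 4p/3), and is applied separately to the synonymous and nonsynonymous proportions before their ratio is taken. The sketch below shows only this final step; counting synonymous and nonsynonymous sites by the Nei–Gojobori procedure is not reproduced, the input proportions are hypothetical values rather than results of this study, and the function names are illustrative.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def corrected_ratio(p_nonsyn, p_syn):
    """Ratio of corrected nonsynonymous to synonymous distances."""
    d_nonsyn = jukes_cantor(p_nonsyn)
    d_syn = jukes_cantor(p_syn)
    if d_syn == 0:
        raise ValueError("synonymous distance is zero; ratio undefined")
    return d_nonsyn / d_syn


# Hypothetical proportions chosen only to illustrate the calculation.
ratio = corrected_ratio(p_nonsyn=0.01, p_syn=0.05)
print(round(ratio, 3))
```

Because the correction grows faster than p, it increases the larger (usually synonymous) distance proportionally more, so the corrected ratio is slightly lower than the raw ratio of proportions.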

Results and discussion

Sequence analysis of CP gene of ApMV isolates

The CP genes of all the analyzed ApMV isolates from hop cultivars, as well as from the non-hop hosts P. armeniaca, P. persica, S. nigra, and C. avellana, consisted of 654 nt (218 aa). No heterogeneity in gene size was noticed.

No recombination event was detected among the 65 sequences by more than two of the methods implemented in RDP3 Beta41, with the window size set at 25 nt.

Neighbor-joining (NJ) and maximum likelihood (ML) phylogenetic analyses of this largest set of sequences to date were compared to test the occurrence of significant differences, but the tree topologies were highly similar. Therefore, only those results obtained with the ML method are shown here.

The ApMV capsid protein gene is monophyletic for all the isolates known today. Phylogenetic analyses of the nucleotide sequences indicated two main clusters of isolates and two single-standing isolates, from almond (PAM-ITA) and from an unknown Chinese host (XX1-CHN), supported by bootstrap values above 50. Cluster I is more homogeneous than cluster II. It contained all hop isolates irrespective of geographic origin, isolates from various Prunus hosts, an isolate from hazelnut (CA-CZE), and one isolate from pear (PC1-CZE). Cluster II contained all the remaining isolates from apple and pear, one isolate from Prunus avium (PAV-IND), and isolates from lichen Trebouxia algae symbionts. We can speculate that the PC1-CZE and PAV-IND isolates, which fall outside the clusters expected for their hosts, represent accidental infections of pear and Prunus avium, respectively, that originated from another primary host growing in their vicinity (blackthorn in the case of PC1-CZE and apple in the case of PAV-IND) (Petrzik unpublished; [13]). Subclusters Ia, Ib, Ic, IIa, and IIb, supported by high bootstrap values, are recognizable on the tree (Fig. 1). The nature of subclusters Ib and Ic, which contained two isolates each, is supported by their clustering with another 8 and 9 hop isolates, respectively, when the 399-nt central part of the CP gene from a larger set of isolates was used for phylogenetic analysis (Supplementary Table 1). In this analysis, a new cluster III appeared, containing an isolate from strawberry and two Chinese isolates from unknown hosts (Supplementary Figure 1).

Fig. 1 Clusters of ApMV isolates based on the maximum likelihood phylogenetic tree. Bootstrap analysis was performed using 1000 replicates

Mean codon-based evolutionary diversity characteristics (Pi(s), Pi(a)) were computed for the complete CP gene in 65 sequences; for clusters I and II; and also for subclusters Ia, IIa, and IIb (Table 2). The mean Pi(a)/Pi(s) ratio was 0.178, close to the value of 0.187 computed previously for a smaller number of isolates [30]. Ratios smaller than 1 were found in all clusters and subclusters here. This suggests that purifying selection acted in all virus populations, conserving the protein sequences despite the occurrence of nucleotide polymorphisms. Only the Pi(a)/Pi(s) ratio of isolates from subcluster IIa is significantly higher than that of the other subclusters, indicating that those isolates are under different functional constraints. Subcluster IIa contains isolates from apple hosts of different geographic origins in which frame-shift mutations in the CP gene occur [12, 21]. A similar difference between groups of isolates has been observed in a comparison of nucleotide diversity for the CP gene of almond and “other hosts” isolates of the related Prune dwarf virus (Ilarvirus): a significantly higher proportion of nonsynonymous substitutions was found for the cluster of almond variants than for the “other hosts” isolates. In that case, the differences in selection pressure were explained by agricultural practice [31].
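The reading of these ratios follows the criterion stated in the Introduction: values above 1 point to positive (adaptive) selection, values near 1 to neutral evolution, and values below 1 to purifying selection. The minimal sketch below encodes that rule for the two mean ratios quoted above; the function name and the optional tolerance band around 1 are illustrative assumptions and do not replace a formal statistical test.

```python
def selection_regime(ratio, tolerance=0.0):
    """Classify a Pi(a)/Pi(s) ratio relative to the neutral expectation of 1."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    if ratio > 1 + tolerance:
        return "positive"
    if ratio < 1 - tolerance:
        return "purifying"
    return "neutral"


# Mean ratios quoted in the text: this study and the earlier, smaller data set.
for label, value in (("this study", 0.178), ("previous estimate", 0.187)):
    print(label, value, selection_regime(value))
```

A ratio that is higher in one subcluster than in the others but still below 1 remains in the purifying range under this rule; it indicates a difference in the strength of constraint rather than positive selection.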

Table 2 Estimates of mean codon-based evolutionary diversity

Detailed aa sequence comparison revealed a unique motif characteristic of each cluster (Supplementary Figure 2): Gly7 in the context MVCKYCGHT is specific for all the sequences from cluster I (and for the PAM-ITA and XX1-CHN isolates), whereas Asn7 at the same position is specific for sequences from cluster II (assuming that the corresponding motifs are also present in all Australian, Indian, and Latvian isolates, for which the 5′-end of the CP gene was not sequenced). The Gly7/Asn7 position is in immediate proximity to the zinc-finger motif and could influence its nucleic acid- and/or protein-binding activity [32] and, indirectly, the host preference. This change is also of some value for discrimination, as it lies in a putative antigenic site on the CP [12].

The hypothesis of positive selection acting on the ApMV CP gene has not been confirmed. Significant differences in substitution rates among isolates from distinct host-related clades were not observed; however, some groups of isolates (especially those with frame-shift mutations from subcluster IIa) showed much higher values of mean diversity, suggesting different functional constraints compared with the rest of the isolates (results not shown). It is highly probable that a finer classification of cluster II will become possible in the future with the growing number of sequences from new hosts.