Introduction

Southern tomato virus (STV) is a persistent virus belonging to the genus Amalgavirus (family Amalgaviridae). STV is widespread in tomato crops in different production areas of the USA, Mexico, France, Italy, China, Bangladesh and Spain [1,2,3,4,5,6]. STV has been detected in both symptomatic and asymptomatic tomato plants, with high incidence in Peninsular Spain and Great Canary Island [7, 8]. STV is transmitted vertically through seeds at rates of about 80%, whereas no horizontal transmission has been reported [1, 8]. Relationships between persistent plant viruses and their hosts have been poorly studied since most of these viruses do not induce any disease. Some persistent viruses establish mutualistic interactions with their hosts rather than the parasitic interactions typical of acute viruses. For example, white clover cryptic virus-1 modulates the nodulation process in clover plants, and curvularia thermal tolerance virus enables the fungus Curvularia protuberata to confer thermal tolerance to panic grass (Panicum sp.) so that it can grow in soils with temperatures over 50 °C [9, 10]. For STV, beneficial effects in the tomato host have been described [11], although without statistical support. A recent report showed changes in microRNA populations, which are involved in many developmental processes and stress responses, in STV-infected tomato plants even though no ultrastructural or macroscopic changes were observed [12].

STV has a small double-stranded RNA genome of 3.5 kb with two overlapping open reading frames (ORFs). The STV genome organization is similar to that of members of the family Totiviridae. The 5′ ORF encodes a putative coat protein (CP) named p42, and the 3′ ORF encodes an RNA-dependent RNA polymerase (RdRp) expressed as a fusion protein through a +1 ribosomal frameshift [13]. Phylogenetic analysis of the STV RdRp region showed that the family Amalgaviridae is evolutionarily closer to the family Partitiviridae [1]. However, the STV p42 did not show sequence similarity with the CP of viruses of the families Totiviridae or Partitiviridae, but it showed structural homology with the CP of members of the genus Tenuivirus [1, 14]. All this suggests that the amalgavirus ancestor could have emerged from a recombination event combining a partitivirus RdRp and a tenuivirus CP [14].

Analysis of the genetic diversity and evolution of plant viruses is crucial to understand their epidemiology [15], to develop accurate methods for virus detection and to implement efficient measures of disease control [16]. Studies on the genetic variability of persistent viruses are very scarce, in contrast to the abundance of information on acute viruses [17, 18]. In this work, the nucleotide diversity of the putative CP gene of STV isolates from two regions of Spain was determined and compared with that of worldwide STV isolates. The phylogeny of STV and the evolutionary forces driving its populations were also studied.

Materials and methods

Eleven STV isolates from the Valencian Community (Spain) and four isolates from Great Canary Island were collected (Supplementary Table S1). Total RNA was extracted from STV-infected plants using a phenol/chloroform method [19]. The cDNA was synthesized from RNA extracts by reverse transcription (RT) using the SuperScript IV Reverse Transcriptase Kit (Thermo Fisher) with random primers. The complete p42 gene (putative CP) was amplified by PCR using a high-fidelity polymerase kit (Bio-Rad) with the primers p42_F 5′-GTC AGA TTT CTC GTC GTT GCT T-3′ (position 68–90) and p42_R 5′-CGT GAC CGC GAG AAT GGA ATA G-3′ (position 1289–1311). RT-PCR products were purified using the QIAquick® PCR Purification Kit (Qiagen). The consensus nucleotide sequences of the amplification products were determined in both directions using an ABI 377 DNA sequencer (PerkinElmer) and deposited in GenBank (MK026630–MK026644). In addition, the nucleotide sequences of p42 from STV isolates from Mexico, China, South Korea, Dominican Republic, United Kingdom, Switzerland, Bangladesh, Spain and USA were retrieved from GenBank (Supplementary Table S1). The nucleotide sequences were aligned at the amino acid level using the program CLUSTAL W 2.0 [20]. The nucleotide substitution model that best fitted the sequence data was the Tamura-Nei model [21], which was used to assess nucleotide distances between STV isolates and to infer their phylogenetic relationships by the maximum-likelihood (ML) method with 500 bootstrap replicates [22]. The role of natural selection at the molecular level was evaluated by comparing the rate of nonsynonymous substitutions per nonsynonymous site (dN) and the rate of synonymous substitutions per synonymous site (dS) according to the Pamilo–Bianchi–Li method [23]. All these analyses were performed with MEGA X [21].
Selection across the coding region was studied by estimating the rates of dN and dS at each codon using the fixed effects likelihood method [24] implemented in the Datamonkey server (https://www.datamonkey.org/) [25]. STV population evolution was also studied using Tajima’s D [26] and Fu and Li’s D* and F* [27] neutrality tests. Recombination among STV isolates was assessed with the program GARD [25] from the Datamonkey package and with RDP v.4.97 [28].
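
As an illustration of the neutrality statistic mentioned above, the following minimal Python sketch computes Tajima’s D from the sample size, the number of segregating sites and the mean number of pairwise nucleotide differences, using the standard constants of Tajima’s formulation [26]. The function name and arguments are hypothetical and chosen for this example only; the software actually used for the neutrality tests is not stated in the text, and no significance testing is attempted here.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D for n sequences, S segregating sites and
    k = mean number of pairwise nucleotide differences.

    Hypothetical helper for illustration; not the software used in the study.
    """
    # For n < 4 the variance term is zero, so D is undefined.
    if n < 4 or seg_sites < 1:
        raise ValueError("Tajima's D needs at least 4 sequences and S >= 1")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Toy numbers, not data from this study.
    print(tajimas_d(n=10, seg_sites=16, mean_pairwise_diff=3.9))
```

A negative value indicates an excess of low-frequency variants relative to neutral expectations, which is consistent with purifying selection but also with population expansion, so the statistic is best read together with the other tests.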

Results and discussion

The CP coding sequence of STV showed low nucleotide variation, with a proportion of segregating sites of 0.0212. Nucleotide diversity of the two STV populations analyzed in Spain was 0.0007 ± 0.0004 for Great Canary Island and 0.0006 ± 0.0003 for the Valencian Community. These values were similar to the nucleotide diversity between both populations (0.0006 ± 0.0007) and about five times lower than that of the worldwide STV population (0.0032 ± 0.0007). The genetic diversity of the STV CP gene was compared with that of other phylogenetically related persistent viruses belonging to the families Amalgaviridae, Partitiviridae and Totiviridae [1], as well as with that of the most important acute viruses infecting tomato and of members of the genus Tenuivirus, given their structural homology with the STV CP (Table 1 and Supplementary Table S2). The nucleotide sequences encoding the CP of the different virus isolates were retrieved from GenBank (Supplementary Table S2), and the nucleotide substitution model that best fitted the data for each virus was determined. For viruses with too many sequences in GenBank, a manageable number of sequences from different geographical origins was randomly selected. Among the persistent viruses, blueberry latent virus, the only other member of the genus Amalgavirus with enough available sequences, also showed very low nucleotide diversity (0.0012 ± 0.0003). The six viruses of the family Partitiviridae infecting plants, fungi and protozoa had low nucleotide diversity, ranging from 0.0058 ± 0.0017 to 0.0339 ± 0.0058. However, two members of the family Totiviridae showed high nucleotide diversity (> 0.100). Among the acute viruses, nucleotide diversity was variable, ranging from 0.0090 ± 0.0019 to 0.2846 ± 0.0402. The two members of the genus Tenuivirus infecting rice plants showed nucleotide diversities of 0.0202 ± 0.0030 and 0.0338 ± 0.0038.
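
The two summary statistics reported above can be sketched in Python as follows. This is a simplified illustration, not the procedure used in the study: it counts uncorrected pairwise differences (p-distance) instead of Tamura-Nei distances, it assumes an ungapped alignment of equal-length sequences, and the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def _check_alignment(alignment):
    """Ensure at least two sequences of identical length."""
    if len(alignment) < 2:
        raise ValueError("need at least two aligned sequences")
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")


def proportion_segregating_sites(alignment):
    """Fraction of alignment columns holding more than one nucleotide."""
    _check_alignment(alignment)
    length = len(alignment[0])
    variable = sum(
        1 for i in range(length) if len({seq[i] for seq in alignment}) > 1
    )
    return variable / length


def nucleotide_diversity(alignment):
    """Mean number of differences per site over all sequence pairs
    (uncorrected p-distance; an assumption of this sketch)."""
    _check_alignment(alignment)
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


if __name__ == "__main__":
    # Toy alignment, not sequences from this study.
    toy = ["ATGCATGC", "ATGCATGC", "ATGAATGC", "ATGCATGT"]
    print(proportion_segregating_sites(toy))
    print(nucleotide_diversity(toy))
```

With a model-corrected distance such as Tamura-Nei, the diversity estimate is slightly higher than the p-distance value, although the difference is small when sequences are as similar as the STV isolates analyzed here.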

Table 1 Nucleotide diversity of the coat protein gene of several persistent and acute viruses

The phylogenetic tree of STV showed two clades or groups of STV isolates: Group A, with isolates from Spain, Mexico, UK, Dominican Republic, USA, South Korea, Bangladesh and China; and Group B, with isolates from Switzerland and China (Fig. 1 and Supplementary Table S3). The mean nucleotide distance between both groups was 0.0112 ± 0.0028, whereas the distances within Group A and Group B were 0.0012 ± 0.0004 and 0.0018 ± 0.0010, respectively. No correlation between geographic and nucleotide distances was found, and some isolates from different countries had identical nucleotide sequences, such as Florida, Mexico-1, DR (from Dominican Republic) and some Spanish isolates (Supplementary Table S3).

Fig. 1 Unrooted ML phylogenetic tree based on the nucleotide sequences encoding the putative coat protein (p42) of 28 STV isolates. STV isolate names and geographic origins are indicated. Branch lengths are proportional to genetic distances. Bootstrap values > 50 are indicated at the nodes. Boxes include identical sequences

With regard to the role of natural selection in STV evolution at the molecular level, the values of dN and dS were 0.0009 and 0.0072, respectively, and the dN/dS ratio was 0.1241, indicating negative selection against amino acid change due to functional constraints on the protein (Supplementary Table S4). The dN/dS value obtained for STV is in the range of most plant viruses. However, the dS value of STV (0.0072) was much lower than that of most plant viruses, such as citrus leaf blotch virus and potato virus V (dS values of 0.0620 and 0.0540, respectively) [17, 29]. This result suggests strong negative selection not only at the amino acid level but also at the nucleotide level, which might be due to secondary structure constraints [30] and/or codon usage bias [24]. This is supported by the fact that some isolates with identical sequences were collected in different years (Supplementary Table S1), and it explains the genetic stability observed. Analysis of the natural selection pressure across the coding region showed that codon positions 31, 36, 53, 138, 157 and 373 were under negative selection, whereas no codon was under positive selection. These sites could be involved in functional or structural domains. At the population level, Tajima’s D [26] and Fu and Li’s D* and F* neutrality tests [27] gave negative values (D = −1.52041, D* = −0.41803 and F* = −0.90146), suggesting negative natural selection, although they were not significant, which could be due to low statistical power because of the low genetic variation. Finally, no recombination events were found for STV with the program GARD [25] from the Datamonkey package or with RDP v.4.97 [28]. The low genetic variation of STV isolates could preclude the detection of recombinants.
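
The arithmetic behind the dN/dS interpretation can be checked with a short Python sketch. The rounded estimates of dN = 0.0009 and dS = 0.0072 give a ratio of 0.125, close to the reported 0.1241; the small gap presumably reflects rounding of the published dN and dS values, which is an assumption. The function names and the tolerance around 1 are hypothetical and only illustrate the usual reading of the ratio; they are not a statistical test of selection.

```python
def dn_ds_ratio(dn, ds):
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    return dn / ds


def selection_regime(omega, tolerance=0.05):
    """Rough verbal reading of dN/dS.

    The tolerance is an arbitrary illustrative choice, not a significance
    threshold; a formal test is needed to claim selection.
    """
    if omega < 1 - tolerance:
        return "negative (purifying) selection"
    if omega > 1 + tolerance:
        return "positive (diversifying) selection"
    return "approximately neutral"


if __name__ == "__main__":
    omega = dn_ds_ratio(0.0009, 0.0072)  # rounded values from the text
    print(round(omega, 4), selection_regime(omega))
```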

In conclusion, the absence of geographic structure in the STV population found in this work has also been observed in populations of other plant viruses [31, 32] and might be due to the worldwide trade in tomato seeds and/or to negative selection pressure.