Introduction

Human enterovirus A (HEV-A) is a species in the genus Enterovirus, family Picornaviridae. Other species in this genus include HEV-B, -C and -D, Human rhinovirus (HRV)-A, -B and -C, Bovine enterovirus, Porcine enterovirus B and Simian enterovirus A [1]. Most enterovirus infections are asymptomatic, but the viruses in HEV-A, particularly coxsackievirus A16 (CVA16) and human enterovirus 71 (HEV71), are often associated with hand, foot and mouth disease (HFMD) in children [2, 3]. HEV71 has also been associated with encephalitis, meningitis and poliomyelitis-like paralysis [4-6]. These viruses contain a positive-stranded RNA genome of approximately 7,500 nucleotides (nt), which consists of an open reading frame (ORF) encoding a single polyprotein. The ORF is divided into three regions: P1 encodes structural proteins, and P2 and P3 encode non-structural proteins. It is flanked by 5′- and 3′-untranslated regions (UTRs) with a poly-A tail at the 3′-terminus of the 3′-UTR (reviewed in ref. 7).

The enteroviral 3′-UTR was initially thought to be a highly conserved heteropolymeric sequence consisting of two common stem-loop domains (SLDs), X and Y, as found in poliovirus Mahoney type 1 (PV1M) [8]. However, variation in the number of SLDs has been shown to occur in members of other species. The 3′-UTRs of coxsackievirus B3 (CVB3) and other members of the HEV-B species have been shown to possess an additional stem-loop structure, SLD-Z [9], whilst the 3′-UTR of the HRVs carried only the SLD-Z [10, 11]. The inter-species heterogeneity of the 3′-UTR secondary structure prompted us to examine whether viruses of other species have unique 3′-UTR structures that can be used as important evolutionary markers. Sequence analyses of the 3′-UTR of some enteroviruses have been reported [8, 12]; however, to our knowledge, there have been no comparative studies on secondary structures.

The functional role of the enterovirus 3′-UTR is still unclear. Together with the poly-A tract, the 3′-UTR has been implicated in both translation and replication [9, 11, 13]. Destabilising point mutations affecting the 3′-UTR tertiary structure have resulted in a lethal phenotype [14, 15]. Structural and biological studies of the CVB3 SLD-Z have demonstrated a relationship to virulence in vivo [9]. Nevertheless, translation- and replication-competent picornaviruses have been generated in which the entire 3′-UTR has been deleted [13, 16].

A potentially functional SLD-Z was recently identified in the 3′-UTR of HEV71 [17]. In this study, we performed a detailed analysis of the sequence within the 3′-UTR of all serotypes belonging to HEV-A. Sequence alignment and RNA folding predictions confirmed the presence of a secondary structure comprising three SLDs. Phylogenetic and SimPlot analysis revealed the significance of this region in determining genetic distance and as a marker of recombination.

Materials and methods

Sequence and secondary structure analysis

Sequences representative of HEV-A used in this study are listed in Table 1. Alignment of the sequences was performed using the Clustal W programme [18]. The secondary structure of the 3′-UTR was predicted using the Mfold version 3.2 programme [19]. When more than one structure was predicted, the structure with the lowest free energy was selected.
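The selection rule for competing predictions can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not part of the Mfold programme: the `PredictedFold` type, the `most_stable` helper and the free-energy values in the usage example are hypothetical and are not data from this study.

```python
from typing import NamedTuple, Sequence


class PredictedFold(NamedTuple):
    """One candidate secondary structure (hypothetical container)."""
    label: str
    delta_g: float  # predicted free energy in kcal/mol; more negative = more stable


def most_stable(folds: Sequence[PredictedFold]) -> PredictedFold:
    """Return the candidate with the lowest (most negative) free energy."""
    if not folds:
        raise ValueError("no predicted structures supplied")
    return min(folds, key=lambda fold: fold.delta_g)


if __name__ == "__main__":
    # Illustrative values only, not results from this study.
    candidates = [
        PredictedFold("structure 1", -20.1),
        PredictedFold("structure 2", -23.4),
    ]
    print(most_stable(candidates).label)
```

When two candidates tie, `min` returns the first one in the input order, so ties would need a separate criterion if they matter.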

Table 1 List of viruses used in this study

Phylogenetic and SimPlot analysis

Sequences were aligned using CLC Main Workbench 6.1.1 (www.clcbio.com) with the following parameters: gap open cost = 10, gap extension cost = 1, end gap cost = as any other. Trees were generated using the neighbour-joining (NJ) algorithm with 100 bootstrap replicates. CVB3 was selected as the out-group in each analysis. To assess potential recombinatorial relationships, nucleotide sequences were analysed by using the bootscanning method implemented in SimPlot v3.5.1 [20]. A window size of 350 base pairs with steps of 10 base pairs was used for each analysis with strict consensus matching.
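The windowing scheme (350-nt window advanced in 10-nt steps) can be illustrated with a minimal Python sketch. It computes plain percent identity between two pre-aligned sequences in each window and is a simplified stand-in, not SimPlot's own computation; in particular, the bootstrap resampling performed within each window during bootscanning is not reproduced here. All function names are hypothetical.

```python
from typing import List, Tuple


def window_starts(length: int, window: int = 350, step: int = 10) -> List[int]:
    """Start positions (0-based) of every full window along the alignment."""
    if length < window:
        return []
    return list(range(0, length - window + 1, step))


def percent_identity(a: str, b: str) -> float:
    """Percent identical positions, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def similarity_scan(query: str, reference: str,
                    window: int = 350, step: int = 10) -> List[Tuple[int, float]]:
    """Return (window midpoint, percent identity) for each sliding window.

    Both sequences must come from the same alignment and have equal length.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in window_starts(len(query), window, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        results.append((start + window // 2, percent_identity(q, r)))
    return results
```

Plotting the second element against the first for several reference sequences gives a similarity plot of the kind shown in the left panels of Figs. 3 and 4; an abrupt change in which reference scores highest is what suggests a recombination crossing point.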

Results

3′-UTR sequence and secondary structure analysis of HEV-A

The formation of three stem-loop domains within the 3′-UTR of HEV71 was predicted using Mfold version 3.2 [17]. In the current study, we extended this analysis to other HEV-A serotypes. The 3′-UTR sequences from 19 HEV-A prototype strains (representing all serotypes) were obtained from GenBank (Table 1). Alignment of these sequences confirmed the finding that the distal region of the 3′-UTR is the most conserved, and the proximal region the least conserved (Fig. 1a). The newly identified enteroviruses HEV76, HEV89-92, SV43 and SV46 showed a primary sequence and secondary structure that are distinct from those of the other HEV-As (less than 40 % similarity). The 3′-UTRs of HEV76 and HEV89-92 showed greater similarity to each other, particularly in the SLD-X and -Y regions, than to the CVAs and HEV71. The same can be said for SV43 and SV46. However, in the SLD-Z region, these recently identified enteroviruses varied in sequence and predicted secondary structure (data not shown). Interestingly, despite a lack of sequence conservation, the 3′-UTR carried three stem-loop structures in all HEV-A serotypes examined.

Fig. 1

3′-UTR sequence and secondary structure of HEV-A. (a) Alignment of 3′-UTR sequences. Conserved nucleotides are shown as (.); SLD borders determined by Mfold are highlighted in grey; the stop codon is shown in a white box. (b) Mfold predictions of the 3′-UTR secondary structure of CVA16-G10 and HEV71-BrCr

The SLD-Z can be divided into two main groups based on consensus sequence and secondary structure (Fig. 1a, b). Group I encompassed a longer sequence commencing immediately after the ORF stop codon at the 3′ terminus of the 3D polymerase gene. Group II consisted of a shorter sequence from six nucleotides downstream of the stop codon. Alignment of prototype strains from HEV-A showed that CVA4, CVA14 and CVA16 carried the group I SLD-Z (Fig. 1a). Other prototypes including CVA2, CVA3, CVA5-8, CVA10 and CVA12 and HEV71 carried the group II SLD-Z. The new enteroviruses and simian viruses carried SLD-Z elements of variable size.

Within HEV71, all but two subgenogroups (B3 and C4) carried the group II SLD-Z, similar to HEV71 prototype strain BrCr [17]. Subgenogroups B3 and C4 carried the group I SLD-Z, reminiscent of those of CVA4, CVA14 and CVA16.

Phylogenetic trees were constructed using the SLD-Z sequence. Each of the trees included the corresponding region of CVB3 as the out-group. Analysis of the prototype HEV-A strains revealed a separation into three distinct groups with strong bootstrap support (Fig. 2a). In one group, the majority of CVAs and HEV71 showed a close relationship. These genotypes carried the group II SLD-Z. On the other hand, CVA4, CVA14 and CVA16, all of which carried the group I SLD-Z, formed a separate cluster. The newly discovered enteroviruses formed another group, with HEV76 and HEV89-91 in a tight cluster, whilst HEV92 demonstrated divergence. Interestingly, SV43 clustered with the group I SLD-Z, whilst SV46 clustered with the group II SLD-Z.

Fig. 2

Phylogenetic analysis based on the SLD-Z nucleotide sequence of (a) HEV-A prototypes and (b) HEV71 and CVA16 strains. Values at each node indicate bootstrap values, with each bar representing the distance scale for each tree

SLD-Z as a marker for recombination

We were particularly interested in comparing the SLD-Z elements of HEV71 and CVA16, two genotypes of HEV-A that are often co-isolated in HFMD [21-23]. A detailed phylogenetic analysis was carried out using the SLD-Z sequences of HEV71 and CVA16 strains (Fig. 2b). Of the eight strains of CVA16 tested, only two strains, the prototype strain CVA16-G10 and CVA16-FY18, carried the group I SLD-Z. Six other strains (from different outbreaks) carried the group II SLD-Z.

We then used SimPlot analysis to verify our findings. Analysis of the 3′ end of HEV71-75-Yamagata-Org (subgenogroup C4) against representative strains of all HEV71 subgenogroups and CVA16-G10 showed highest similarity to CVA16-G10 (Fig. 3a). Bootscan analysis showed up to 95 % sequence similarity at nucleotides 6,300-6,700 and >7,100 (Fig. 3b). SimPlot analysis of the full-length genome confirmed our findings and further revealed a recombination crossing point at approximately nucleotide 3,500 (Fig. 4a, b). Nucleotides 1-3,500 of HEV71-75-Yamagata-Org showed a high sequence similarity with subgenogroup C2.

Fig. 3

SimPlot analysis based on the 3D and 3′-UTR sequence. Similarity plots (left panels) and bootscan analysis (right panels) were prepared using (a, b) HEV71-75-Yamagata-Org, (c, d) HEV71-26M and (e, f) CVA16-KMM-08 as query sequences

Fig. 4

SimPlot analysis based on the full-length sequence. Similarity plots (left panels) and bootscan analysis (right panels) were prepared using (a, b) HEV71-75-Yamagata-Org, (c, d) HEV71-26M and (e, f) CVA16-KMM-08 as query sequences

When HEV71-26M-AUS-4-99 (subgenogroup B3) was used as the query sequence, we again found high sequence similarity to CVA16-G10 (Fig. 3c). The region with highest similarity (up to 95 % bootscan value) was downstream of nucleotide 6,700 (Fig. 3d). When the analysis was extended to the full-length genome, a recombination crossing point was again seen at approximately nucleotide 3,500 (Fig. 4c, d). Upstream of this region, the highest sequence similarity was observed with members of subgenogroup B4/B5.

Next, we compared the CVA16-KMM-08 strain to the various HEV71 subgenogroups and CVA16-G10. This strain clustered with others containing a group II SLD-Z (Fig. 2b). Bootscan analysis showed up to 95 % similarity with HEV71-BrCr (subgenogroup A) 5′ of nucleotide 6,500 and 3′ of nucleotide 6,600 (Fig. 3f). A full-length genome comparison revealed a recombination between CVA16-G10 and HEV71-BrCr with a crossing point at approximately nucleotide 3,500 (Fig. 4e, f).

Discussion

Unlike the IRES and cloverleaf regions of the 5′-UTR, the 3′-UTR of enteroviruses is a poorly understood region. In early reports, the existence of tRNA-like terminal structures that may play an important role in RNA synthesis was proposed [8, 24]. Other studies have shown discrepancies in elucidating the importance of the 3′-UTR in tissue culture and in animal models [9, 13, 16]. In the current study, we performed a comparative analysis of published sequences to elucidate the evolutionary significance of the HEV-A 3′-UTR.

Computer-based alignment and secondary structure predictions have proven to be useful in studying the 3′-UTRs of various picornaviruses [8-10, 24]. The Clustal W sequence alignment programme and the Mfold RNA-folding programme were used to derive a model for the secondary structure of the HEV71 3′-UTR, which consists of three predicted stem-loop structures [17]. In this study, we further showed that the 3′-UTR-associated stem-loop structures exist in all HEV-A serotypes.

The 5′-UTR sequence of HEV-A is closely related to that of HEV-B [25]. Despite a lack of 3′-UTR sequence similarity, our data demonstrated relatedness between members of these two species in the higher-order RNA structure. The HEV-A SLD-Z is reminiscent of the HEV-B SLD-Z [9], although it is shorter.

Detailed analysis of the HEV-A 3′-UTR revealed two groups of SLD-Z elements based on sequence and secondary structure. CVA4, CVA14 and CVA16 carried the longer group I SLD-Z, whilst all other serotypes carried the shorter group II stem-loop domain. Within HEV71, B3 and C4 are the only two subgenogroups carrying the group I SLD-Z. All other subgenogroups tested carried the group II SLD-Z. A distinct clustering of the two groups was shown by phylogenetic analysis.

The newly identified enteroviruses HEV76 and HEV89-92 clustered together on a separate branch, as expected from their low primary sequence similarity to other HEV-As. However, all 3′-UTR sequences tested were predicted to fold into three stem-loop domains, which further points to the importance of secondary structure conservation. Interestingly, SV43 clustered with the group I SLD-Z (longer), whilst SV46 clustered with the group II SLD-Z (shorter). The SV43 and SV46 viruses were found to carry a long and short SLD-Z, respectively (data not shown). Taken together, these data suggest that the size and secondary folding of the SLD-Z play an important part in the classification of these viruses.

Phylogenetic analysis of the 3D gene, situated upstream of the 3′-UTR of HEV-A, has been performed previously [26]. A distinct clustering of HEV71-26M (subgenogroup B3) with two other HEV71 strains, HEV71-SHZH03 and HEV71-SHZH98 (both subgenogroup C4), as well as CVA4, CVA14 and CVA16-G10, was evident. Our data extended these findings to the 3′-UTR region and suggest that both the 3D and 3′-UTR regions of HEV71 subgenogroups B3 and C4 were acquired through recombination.

Inter-typic recombination between co-circulating enteroviruses, such as HEV71 and CVA16, has been reported [27], and HEV71-SHZH03 (subgenogroup C4) has been proposed to be a product of recombination between HEV71 subgenogroup C2 and CVA16-G10-like viruses. This is consistent with our data. Based on SimPlot analysis results for the 3D and 3′-UTR regions (Fig. 3c, d) and the complete genome (Fig. 4c, d), we further propose that HEV71-26M-AUS-4-99 (subgenogroup B3) is a product of recombination between HEV71 genogroup B and CVA16-G10-like viruses. Since HEV71 subgenogroups B3 and C4 were isolated more recently than the other viruses in the same cluster, they are more likely to be the ‘recipients’ of the group I SLD-Z.

All other HEV71 subgenogroups (including the prototype strain HEV71-BrCr) and HEV-A serotypes tested carried the group II SLD-Z. Six CVA16 strains were also found to carry the group II SLD-Z. Chan and AbuBakar [27] suggested that one of these strains, CVA16 strain Tainan-5079-98 (CVA16-5079-98 in this study), was a product of recombination between CVA16-G10 and HEV71-BrCr. Phuektes et al. [26] further demonstrated the clustering of CVA16-SHZH00 (CVA16-SHZH00-1 in this study) and CVA16-5079 (CVA16-5079-98 in this study) in a branch close to EV71-BrCr, whilst CVA16-Gunnell was found in another group comprising EV71-MS (subgenogroup B2), EV71-5865 (subgenogroup B4) and CVA5. In the current study, we showed that CVA16-KMM-08 is a recombinant of CVA16-G10 and HEV71-BrCr, consistent with previously published results.

Taken together, these data suggest that ancestral strains of HEV71 carried the shorter group II SLD-Z, whilst their counterpart in the CVA16 serotype carried the longer group I SLD-Z. The emergence of newer strains of HEV71 and CVA16 carrying the reciprocal SLD-Z could have occurred through inter-typic recombination. This recombination event was not apparent by conventional screening using VP1 sequencing. Thus, SLD-Z is a useful marker to screen for recombination in HEV-A strains.

In the current study, we have confirmed the presence of three stem-loop domains in members of the species Human enterovirus A. We have further revealed two groups of SLD-Z in HEV-A based on the size of the stem-loop domain. Finally, we propose that the SLD-Z is a novel evolutionary marker for recombination in HEV-A.