Introduction

Porcine transmissible gastroenteritis virus (TGEV) (genus Alphacoronavirus, family Coronaviridae) is an enveloped virus with a single-stranded positive-sense RNA genome approximately 28.5 kb in length [1]. The genome has nine open reading frames (ORFs) that encode four structural proteins (spike [S], envelope [E], membrane [M], and nucleocapsid [N]) and five non-structural proteins (ORF1a/1b, ORF3a/3b, and ORF7) [2]. These genes are arranged in the order 5′-ORF1a-ORF1b-S-ORF3a-ORF3b-E-M-N-ORF7-3′ [1]. TGEV causes viral diarrhea and enteritis in pigs. All pigs are susceptible to infection, but piglets under 2 weeks of age are at especially high risk, with mortality rates that can reach 100% [3, 4]. TGEV was first reported in the USA in 1946 [5], and since then, cases of TGEV infection (or coinfection) have occurred in pork-producing regions of Europe (England [6], Spain [7], and Germany [8]) and Asia (China [9] and Japan [10]), resulting in significant economic losses.

Molecular and phylogenetic analysis of TGEV isolates has led to genotypic classification into two groups: the traditional group (the Purdue and Miller subgroups) and the variant group [11]. The traditional group has been identified in the USA [12, 13], Europe [6, 12], and Asia [5, 14, 15], whereas the variant group has been identified (and is prevalent) mostly in the USA [12]. However, there have been no reports of molecular and phylogenetic analysis of TGEV strains isolated in Southeast Asia since 1982 [16]. Among the TGEV genes, mutations are most common in S and ORF3, and these are strongly associated with virulence and cell tropism [2]. The S1 subunit of the S protein binds to sialic acid moieties and specific receptors on host cells [2, 17]. Four major antigenic sites (A–D) in the S1 subunit of the TGEV S protein have been mapped at its N-terminus. Of these, sites A and D are dominant with respect to neutralization of TGEV in vitro [18, 19, 20]. The ORF3 gene of TGEV encodes ORF3a and ORF3b, and in many TGEV strains [21, 22, 23], as well as other coronaviruses such as porcine respiratory coronavirus (PRCV; a respiratory variant of TGEV) [24, 25], porcine epidemic diarrhea virus [26], and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [27], both ORF3a and ORF3b often carry deletions. Previous studies have suggested that ORF3 deletions are associated with viral fitness (which is supported by identification of a naturally occurring truncated ORF3 gene) [27, 28, 29, 30, 31] or cell adaptation in vitro [13, 32, 33, 34].

Because little is known about the molecular characteristics of TGEV strains circulating in Vietnam, the aim of the present study was to perform a detailed analysis of TGEV isolated from piglets in Vietnam. We identified a novel TGEV strain (designated “VET-16”) and determined its full genome sequence. Molecular and phylogenetic analysis, together with a more detailed analysis of the S and ORF3 genes, showed that this strain has molecular features that have not been identified previously in other TGEV strains.

Materials and methods

Sample preparation and RNA extraction

In 2016, a survey of 18 farms in Hanoi, Hung Yen, Lao Cai, Tuyen Quang, and Thai Nguyen was conducted to assess the prevalence of piglet diarrhea in northern Vietnam, and TGEV was detected on one farm in Hung Yen, where mild diarrhea symptoms were observed in suckling and weaned piglets. TGEV-positive fecal samples obtained from piglets were diluted 1:2 in phosphate-buffered saline, and RNA was extracted from these samples using a Patho Gene-spin DNA/RNA Extraction Kit (LiliF Diagnostics, South Korea).

PCR amplification and sequencing

The full-length genome of TGEV was sequenced by one-step reverse transcription polymerase chain reaction (RT-PCR) using universal primers targeting TGEV (Supplementary Table S1) [30] and a HelixCript One-Step RT-PCR Kit [Hot-Taq] [UDG System] (NanoHelix, South Korea). The RT-PCR mixtures (50 μL) comprised 4 μL of RNA template, 2 μL of each F/R primer (10 pmol), 25 μL of 2× Reaction Mix (Hot-Taq containing dUTP), 2 μL of enzyme mix (Hot-Taq containing UDG), and 15 μL of nuclease-free water. The RT-PCR conditions were as follows: UDG activation at 25℃ for 5 min, cDNA synthesis at 55℃ for 50 min, and pre-denaturation at 95℃ for 15 min, followed by 40 cycles of denaturation at 95℃ for 20 s, annealing at 52–55℃ for 40 s, and extension at 72℃ for 1 min 30 s. Post-extension was performed at 72℃ for 5 min. The full genome sequence was divided into 44 fragments, with a 50- to 200-nt overlap between adjacent segments. PCR products were separated by agarose gel electrophoresis and purified by gel extraction. Finally, Sanger sequencing was performed using an Applied Biosystems 3730xl DNA Analyzer.
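As a quick arithmetic check on the protocol above, the reaction components and the thermal program can be tallied in a few lines of Python. This is only an illustrative sketch: the variable names are ours, and the run time counts programmed hold times only, ignoring ramping between temperatures, which depends on the thermocycler.

```python
# Reaction components in microlitres, as listed in the protocol above.
reaction_mix_ul = {
    "RNA template": 4,
    "forward primer (10 pmol)": 2,
    "reverse primer (10 pmol)": 2,
    "2x Reaction Mix": 25,
    "enzyme mix": 2,
    "nuclease-free water": 15,
}
total_volume_ul = sum(reaction_mix_ul.values())

# Thermal program as (step, duration in seconds) pairs.
pre_cycling = [
    ("UDG activation, 25 C", 5 * 60),
    ("cDNA synthesis, 55 C", 50 * 60),
    ("pre-denaturation, 95 C", 15 * 60),
]
cycle = [
    ("denaturation, 95 C", 20),
    ("annealing, 52-55 C", 40),
    ("extension, 72 C", 90),
]
n_cycles = 40
post_extension_s = 5 * 60

# Sum of programmed hold times (ramping time is NOT included).
hold_time_s = (
    sum(seconds for _, seconds in pre_cycling)
    + n_cycles * sum(seconds for _, seconds in cycle)
    + post_extension_s
)
hold_time_min = hold_time_s / 60

print(f"Total reaction volume: {total_volume_ul} uL")
print(f"Programmed hold time: {hold_time_min:.0f} min")
```

The components sum to the stated 50 μL, and the programmed steps add up to just under 3 h per run before ramping is taken into account.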

Molecular and phylogenetic analysis

The full genome sequences of 39 TGEV strains obtained from the GenBank database (Table 1) were aligned using BioEdit software (version 7.2.5) to analyze their molecular characteristics. For sequence similarity searches, we used the nucleotide Basic Local Alignment Search Tool (BLASTn) against the NCBI nucleotide database. For phylogenetic analysis, the full genome, S, and ORF3 sequences were analyzed using Molecular Evolutionary Genetics Analysis 11 (MEGA 11). All phylogenetic trees based on nucleotide sequences were constructed using the maximum-likelihood method with 1000 bootstrap replicates in MEGA 11.

Table 1 Sequence information for isolated TGEV strains

Recombination analysis

Recombination analysis was performed using RDP4 software (which includes RDP, BootScan, and SiScan) with default settings to identify likely parental strains and recombination breakpoints. The criteria for identifying recombination breakpoints were a P-value < 10⁻⁶ or a recombination score > 0.6.
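The decision rule stated above can be written out explicitly. The following Python sketch is purely illustrative: the function name and the example values are hypothetical, and it does not reproduce anything RDP4 does internally; it only encodes the two thresholds as an either/or criterion.

```python
# Thresholds taken from the criteria stated in the text.
P_VALUE_CUTOFF = 1e-6
SCORE_CUTOFF = 0.6


def meets_recombination_criteria(p_value: float, score: float) -> bool:
    """Return True if either threshold for a putative recombination event is met.

    Hypothetical helper: it applies the stated cut-offs to values that would be
    read from the RDP4 output; it is not part of RDP4 itself.
    """
    return p_value < P_VALUE_CUTOFF or score > SCORE_CUTOFF


# Invented example values, for illustration only (not results from this study).
print(meets_recombination_criteria(p_value=3e-9, score=0.41))  # True
print(meets_recombination_criteria(p_value=2e-3, score=0.35))  # False
```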

Results

Whole-genome sequence of the VET-16 strain

Sanger sequencing of the full-length genome of TGEV strain VET-16 revealed that it is 27,867 nucleotides (nt) in length and comprises a 314-nt 5′ untranslated region (UTR); nine ORFs, namely ORF1a (nt 315–12,368), ORF1b (nt 12,332–20,368), S (nt 20,365–24,714), ORF3a (nt 24,833–24,922), ORF3b (nt 24,948–25,151), E (nt 25,138–25,386), M (nt 25,397–26,185), N (nt 26,198–27,346), and ORF7 (nt 27,352–27,588); and a 279-nt 3′ UTR with a poly(A) tail.
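The coordinates above can be checked for internal consistency with a short script. The sketch below is illustrative only; the function names are ours, and the protein lengths rest on the assumption that each annotated range includes the stop codon.

```python
# Genome coordinates (1-based, inclusive) as reported for strain VET-16.
GENOME_LENGTH = 27_867
FIVE_PRIME_UTR_LENGTH = 314
THREE_PRIME_UTR_LENGTH = 279

ORFS = {
    "ORF1a": (315, 12_368),
    "ORF1b": (12_332, 20_368),
    "S": (20_365, 24_714),
    "ORF3a": (24_833, 24_922),
    "ORF3b": (24_948, 25_151),
    "E": (25_138, 25_386),
    "M": (25_397, 26_185),
    "N": (26_198, 27_346),
    "ORF7": (27_352, 27_588),
}


def orf_length_nt(start: int, end: int) -> int:
    """Length of an inclusive 1-based coordinate range."""
    return end - start + 1


def protein_length_aa(start: int, end: int) -> int:
    """Number of encoded residues.

    ASSUMPTION: the annotated range includes the stop codon, so one codon
    is subtracted from the total.
    """
    return orf_length_nt(start, end) // 3 - 1


for name, (start, end) in ORFS.items():
    nt = orf_length_nt(start, end)
    print(f"{name}: {nt} nt, multiple of 3: {nt % 3 == 0}, "
          f"{protein_length_aa(start, end)} aa")

# The UTR lengths should account for everything outside the ORFs.
first_start = min(start for start, _ in ORFS.values())
last_end = max(end for _, end in ORFS.values())
utrs_consistent = (
    first_start - 1 == FIVE_PRIME_UTR_LENGTH
    and GENOME_LENGTH - last_end == THREE_PRIME_UTR_LENGTH
)
print("UTR lengths consistent with genome length:", utrs_consistent)
```

Every range is a multiple of three, the S range spans 4350 nt, and the UTR lengths account for the remainder of the 27,867-nt genome.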

Comparison of the VET-16 genome with those of other strains

Sequence comparisons showed that the ORF1a/b, S, E, M, N, and ORF7 genes of VET-16 were 99.0–100% identical to those of the eight Purdue subgroup TGEV strains and 92.4–97.5% identical to those of four variant group strains. However, comparative analysis of the ORF3 gene revealed that the VET-16 strain showed much lower sequence similarity to the other 19 TGEV strains, and this was particularly evident for ORF3a (26.6–27.8% identity). It is noteworthy that the ORF3b gene of the VET-16 strain was most similar (74.6% identity) to that of the Miller M60 strain (Table 2).
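For readers unfamiliar with how such identity values arise, the following Python sketch shows one common way to compute percent identity between two sequences that have already been aligned. It is an assumption-laden illustration, not the procedure used to generate Table 2: the function name is ours, the toy sequences are invented, and the convention of counting a gap opposite a base as a mismatch is only one of several in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Convention (an assumption): columns that are gaps in both sequences are
    skipped; a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # uninformative column
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Invented toy alignment: 9 columns, 6 of which match.
toy_a = "ATGGCT--A"
toy_b = "ATGACTTTA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because a large deletion contributes many gap columns, the choice of gap convention can change the reported identity substantially for a gene such as ORF3a.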

Table 2 Nucleotide and amino acid sequence homology between the TGEV VET-16 strain and other TGEV strains

Nucleotide BLAST analysis showed that the sequence of the ORF3a gene of VET-16 was very similar to those of previous TGEV and canine coronavirus (CCoV) isolates (Table 3). Indeed, the ORF3a gene of strain VET-16 was 100% identical to that of CCoV, but recombination analysis did not indicate any mixing of VET-16 strain and CCoV sequences in the ORF3a gene.

Table 3 Nucleotide BLAST results for the ORF3a sequence of TGEV strain VET-16 (searched against the NCBI nucleotide database)

Molecular characteristics of the S gene

The S gene (4350 nt) of the VET-16 strain was found to have a 6-nt deletion at nt 1123–1128 (Fig. 1A, B). This deletion has been reported previously in the NEB72-RT, DAE, Purdue P115, WH-1, HX, and HQ2016 strains. The gene also contained four nt substitutions (resulting in three aa changes), a 9-nt (3-aa) insertion in antigenic site D (Fig. 1A, B), and a 3-nt deletion in the TM domain (Fig. 1C, D). The indels in the S gene of the VET-16 strain were confirmed by additional Sanger sequencing (Fig. 1B).

Fig. 1

Multiple sequence alignment of the spike genes of TGEV strains. (A) Mutation of nucleotide sequences at positions 1122–1151. (B) Sanger sequencing chromatograms confirming a 6-nt deletion, four nt substitutions, and a 9-nt insertion in the VET-16 strain. (C) A 3-nt deletion in the 3′-terminal region. (D) Sanger sequencing chromatograms confirming the presence of the 3-nt deletion shown in panel C. (E) Amino acid mutations in antigenic site D and in the transmembrane (TM) domain. Yellow and purple rectangles indicate nucleotide deletions and substitutions, respectively.

Amino acid 72 of the VET-16 strain is asparagine, whereas that in the Virulent Purdue and Purdue P115 strains is aspartic acid. Amino acid 219 of the VET-16 strain is serine, whereas that in the Virulent Purdue, Purdue P115, Miller M6, Miller M60, H16, and attenuated H strains is alanine. Amino acid 585 of the VET-16 strain is alanine, which is the same as that in the Purdue P115, Miller M60, H16, and attenuated H strains. The amino acid residues in the aminopeptidase N (APN) binding site of the VET-16 strain are identical to those in the DAE and NEB72-RT strains (Table 4). The 6-nt deletion at nt 1123–1128 results in the deletion of two aa residues (N375_D376del) from the S protein of the VET-16 strain (Fig. 1E). This deletion has also been observed in the NEB72-RT, DAE, Purdue P115, WH-1, HX, and HQ2016 strains. The VET-16 strain carries three aa substitutions (V378L, S379T, and D380N) and a 3-aa insertion (F383_F387insWEK) in antigenic site D (Fig. 1E). In addition, the VET-16 strain has a single aa deletion (F1413del) in the TM domain of the S protein (Fig. 1E).
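The correspondence between the 6-nt deletion and the two deleted residues follows from simple codon arithmetic, sketched below in Python. The function names are ours, and the calculation assumes that nucleotide positions are numbered from the first base of the S start codon in the reference alignment.

```python
def codon_number(nt_position: int) -> int:
    """1-based codon index for a 1-based nucleotide position in a coding sequence."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    return (nt_position - 1) // 3 + 1


def codon_phase(nt_position: int) -> int:
    """0 for the first base of a codon, 1 for the second, 2 for the third."""
    return (nt_position - 1) % 3


def describe_deletion(first_nt: int, last_nt: int):
    """Return (first codon, last codon, in-frame?, removes whole codons?)."""
    length = last_nt - first_nt + 1
    in_frame = length % 3 == 0
    whole_codons = in_frame and codon_phase(first_nt) == 0
    return codon_number(first_nt), codon_number(last_nt), in_frame, whole_codons


# ASSUMPTION: nt 1123-1128 are numbered from the first base of the S start codon.
print(describe_deletion(1123, 1128))  # (375, 376, True, True)
```

Under that numbering, the deletion begins at the first base of codon 375 and ends at the last base of codon 376, so it removes exactly two codons without shifting the reading frame.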

Table 4 Comparison of cell tropism-associated amino acid substitutions in the virulent and avirulent strains

Molecular characteristics of the ORF3 gene

The VET-16 strain has a large deletion (∆725 nt) in the ORF3 gene (Supplementary Fig. S1), which was confirmed by RT-PCR using specific primers (Large-del-F, 5′-GGATGCATAGGTTGTTTAG-3′; Large-del-R, 5′-CCACGTATTGCTATGCTTAC-3′; amplicon size: 1080 bp). Because the start codons of ORF3a and ORF3b were not deleted, truncated ORF3a and ORF3b proteins of 29 aa and 67 aa, respectively, can still be expressed. The length of ORF3b of the VET-16 strain is the same as that of the Miller M60 strain (Fig. 2A). The large deletion was verified by comparing the sizes of the DNA bands on agarose gels with that obtained from a TGEV-positive control virus; it was also confirmed by analyzing the signal peaks generated by Sanger sequencing (Fig. 2B).
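Basic properties of the two deletion-spanning primers can be estimated directly from their sequences. The Python sketch below uses the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) in ℃, which is only a rough approximation for short oligonucleotides and is not stated to be the method used in this study; the function names are ours.

```python
PRIMERS = {
    "Large-del-F": "GGATGCATAGGTTGTTTAG",
    "Large-del-R": "CCACGTATTGCTATGCTTAC",
}


def gc_count(seq: str) -> int:
    """Number of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, and T")
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    gc_percent = 100 * gc_count(seq) / len(seq)
    print(f"{name}: {len(seq)} nt, GC {gc_percent:.1f}%, "
          f"Wallace Tm ~{wallace_tm(seq)} C")
```

These rough estimates (about 54 ℃ and 58 ℃) are in the same range as the 52–55 ℃ annealing temperatures listed in the RT-PCR conditions.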

Fig. 2

Electrophoresis gel of PCR products, confirming the presence of a large deletion in ORF3 (3a and 3b) of the VET-16 strain. (A) Comparison of gene deletions in ORF3 of the VET-16 strain with those in other TGEV strains and two PRCV strains. (B) PCR of the VET-16 strain, with a large deletion in ORF3, and a TGEV-positive control strain with no deletion in ORF3. Lane M, DNA (100 bp) ladder; lane 1, VET-16 strain; lane 2, TGEV, used as a positive control; lane 3, nuclease-free water; lane 4, no-template control

Fig. 3

Phylogenetic trees based on nucleotide sequences of (A) the full genome, (B) the complete spike gene, and (C) the complete ORF3, built using the maximum-likelihood method with 1000 bootstrap replicates in MEGA 11. Canine coronavirus strain K378 was used as an outgroup.

Phylogenetic analysis

Phylogenetic trees based on the nucleotide sequences of the complete TGEV genome, as well as the S and ORF3 genes, revealed that the VET-16 strain belongs to the Purdue subgroup within the traditional TGEV group (Fig. 3). TGEV strain VET-16 is closely related to the strains HB, NEB72-RT, and Purdue P115 but distinct from the TFI strain isolated in Southeast Asia. The phylogenetic tree based on the S gene showed that the VET-16 strain is closely related to the NEB72-RT strain, which was isolated from the respiratory tract of an infected animal. Phylogenetic analysis based on the ORF3 gene indicated that the VET-16 strain was most closely related to the Purdue P115 vaccine strain. RDP, BootScan, and SiScan analysis of the VET-16 strain yielded a P-value > 10⁻⁶ and a recombination score < 0.6; therefore, no evidence of a recombination event was found.

Discussion

Analysis of the complete genome sequence of the VET-16 strain revealed that it contains crucial mutations in the spike gene. The amino acids at positions 72, 219, and 585 of the TGEV S protein are considered to be potential determinants of enteric tropism [14, 35]. A previous study suggested that aa substitutions at residue 72 (aspartic acid → asparagine) and residue 219 (alanine → serine) are associated with a loss of gut tropism [35]. Other studies have suggested that a substitution at aa 585 (serine → alanine) is a marker of attenuation [7, 13, 22]. Here, we found that the amino acids at positions 72, 219, and 585 of the S protein of VET-16 isolated from piglets in Vietnam were asparagine, serine, and alanine, respectively. We therefore predict that these substitutions may lead to attenuation of the virus due to a loss of intestinal tropism. Previous studies have shown that the APN binding site (aa 522–744) of the S protein is also associated with tissue tropism and virulence [13, 35]. Interestingly, the APN binding site of the VET-16 strain was found to have the same amino acid sequence as that of the NEB72-RT strain, which has lost intestinal tropism [14]. The S protein of the VET-16 strain contains a 2-aa deletion (N375_D376del) that is also found in the attenuated strains NEB72-RT and Purdue P115. Previous studies have shown that N375_D376del is also present in recombinant TGEV strains that show reduced replication in the enteric tract, which implies a loss of intestinal tropism [14, 36]. This may mean that the VET-16 strain is attenuated, with a reduced growth rate in enteric tissue.
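The comparison of marker residues described above amounts to a simple lookup, which can be sketched in Python as follows. This is a bookkeeping illustration only: the data structure and function name are ours, the table holds just the three positions and associations cited in this paragraph, and matching a marker is not a prediction of phenotype.

```python
# position -> (residue before substitution, substituted residue, reported association)
TROPISM_MARKERS = {
    72: ("D", "N", "loss of gut tropism"),
    219: ("A", "S", "loss of gut tropism"),
    585: ("S", "A", "marker of attenuation"),
}

# Residues reported for the VET-16 S protein at these positions.
VET16_RESIDUES = {72: "N", 219: "S", 585: "A"}


def markers_present(residues: dict) -> list:
    """List substitutions (e.g. 'D72N') whose substituted residue is present."""
    hits = []
    for position, (before, after, _association) in sorted(TROPISM_MARKERS.items()):
        if residues.get(position) == after:
            hits.append(f"{before}{position}{after}")
    return hits


for label in markers_present(VET16_RESIDUES):
    position = int(label[1:-1])
    print(label, "->", TROPISM_MARKERS[position][2])
```

For VET-16, all three substituted residues are present, which is the basis for the prediction of reduced intestinal tropism made above.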

Antigenic site D (aa 378–392) [20] is a neutralization epitope in TGEV [19, 37]. Unlike other TGEV strains, the VET-16 strain contains three aa substitutions (V378L, S379T, and D380N) and a 3-aa insertion (F383_F387insWEK) in antigenic site D. These six aa changes might alter the three-dimensional structure of the epitope, which might in turn affect antigenicity and virulence, possibly reducing viral pathogenicity.

The deletion of F1413, which results in the loss of an aromatic amino acid from the TM domain [17] that anchors the S protein to the viral membrane [38], may reduce the interfacial and hydrophobic properties of the TM peptide. Indeed, the loss of a hydrophobic or aromatic amino acid has been observed to cause a defect in viral fusion in recombinant SARS-CoV and murine coronaviruses in vitro [39, 40]. Mutations in the TM domain of the VET-16 strain may act as an attenuation factor by weakening cell-to-cell fusion; however, in vitro studies of growth kinetics are required to address this question.

The VET-16 strain also carries notable mutations in the ORF3a/b gene. PRCV, a variant of TGEV that harbors an ORF3a gene deletion, shows a loss of enteric tropism [22, 24]. A previous reverse genetics study demonstrated that deletions in the ORF3a/b gene of recombinant TGEV might be associated with a reduction in virulence and replication in pigs [36]. In comparison to the virulent Miller M60 strain, the Miller M6 strain harbors a large deletion in the ORF3b gene [13]. Taken together, these data suggest that deletions in ORF3 are associated with viral attenuation. The VET-16 strain contains a large deletion in the ORF3 gene, resulting in truncated ORF3a and ORF3b proteins. This may be why the VET-16 strain causes only mild diarrhea in piglets. Evaluation of pathogenicity in newborn piglets is required to examine this further.

In general, TGEVs cause severe diarrhea or enteritis in piglets aged less than 2 weeks; however, on the Vietnamese farm where the VET-16 strain was isolated in 2016, infected piglets showed only mild diarrhea symptoms. These mild diarrhea symptoms are likely to be associated with the molecular features of the VET-16 strain identified in this study. If future studies confirm that the VET-16 strain is indeed attenuated, it may be a potential vaccine candidate. To demonstrate an association between the unique characteristics of the VET-16 strain and reduced pathogenicity, clinical signs such as diarrhea should be evaluated in piglets inoculated with the VET-16 strain.

In conclusion, molecular characterization of the VET-16 strain isolated from piglets in Vietnam identified 10 genetic mutations in the S gene and a large deletion in the ORF3 gene. These genetic data suggest that the VET-16 strain may be attenuated and have reduced enteric tropism. The VET-16 genome sequence should also serve as a useful reference for future studies of TGEV evolution.