Introduction

Porcine epidemic diarrhea (PED) is a highly contagious, acute intestinal infectious disease caused by porcine epidemic diarrhea virus (PEDV), a member of the genus Alphacoronavirus, family Coronaviridae. PEDV causes acute watery diarrhea, dehydration, weight loss, and emaciation in pigs. It can infect pigs of different ages but is especially common in suckling piglets, with mortality rates reaching up to 100% [1, 2]. The PEDV genome is about 28 kb in length and, in addition to nonstructural proteins, encodes four structural proteins: the spike (S), membrane (M), envelope (E), and nucleocapsid (N) proteins. The S protein, a type I membrane glycoprotein located on the viral surface, is the most variable viral protein and accounts for much of the genetic diversity of PEDV. It is the main viral antigen, containing neutralizing epitopes, and it plays an important role in viral entry into host cells and in the induction of an immune response [3,4,5,6].

To adapt to environmental changes, the virus mutates frequently, resulting in many different PEDV strains that differ in their virulence and infectivity [7,8,9,10]. These strains are divided into two groups: classical strains (GI) and mutant strains (GII) [11]. Classical strains include PEDV strain CV777, which has been found in Europe, and most cell-culture-adapted mutant strains obtained through continuous in vitro passage, such as attenuated CV777 and attenuated DR13. The virulence of GI strains is generally lower than that of GII strains [12, 13]. Due to widespread vaccination of animals with a vaccine based on the classical strain CV777, PED caused by classical strains has become rare [14, 15].

Since 2010, PEDV mutant strains have dominated globally, especially in China. The morbidity and mortality of the mutant strains in piglets are as high as 100% and 80%-100%, respectively [16], and the current commercial vaccines are not effective against these strains [17, 18]. Amino acid sequence comparisons of S proteins have shown that the mutant isolates have two insertions (55TGEN58 and 136N) and one deletion (157NI158) when compared to the vaccine strain CV777 [19]. Due to their high infectivity and virulence, mutant strains have caused large-scale outbreaks in many countries. The mutant strains can be divided into three subtypes: GII-a, GII-b, and GII-c. The GII-a subtype includes strains from the USA and from other countries that have reported the circulation of US-like PEDVs (such as AH2012). The GII-b subtype is mainly composed of relatively early (from about 2010 to 2020) strains from China (such as AJ1102 and CH/SD2014). The GII-c subtype includes more-recent epidemic strains (such as SD2021 and JX2020) [11, 19]. S-INDEL strains are believed to have resulted from genetic recombination between classical (GI) and mutant (non-S-INDEL) strains [20], and S-INDEL strains are less virulent than non-S-INDEL strains [15, 21].

New PEDV variants continue to arise. To better understand the prevalence and molecular characteristics of PEDV in different regions of China, 30 complete S gene sequences were obtained from positive samples collected in six provinces in China from 2020 to 2023. The S gene sequences were analyzed using bioinformatic methods, focusing on phylogenetic analysis, amino acid variation in neutralizing epitopes, and recombination analysis. These data are expected to provide important information for the development of new effective vaccines.

Materials and methods

Design of primers for detection of PEDV and amplification of the S gene

Based on conserved regions in the S proteins of epidemic and classical strains with sequences published in the NCBI GenBank database, a pair of PEDV-specific primers and four pairs of primers for amplification of the S gene were designed using Primer Premier 5 software (Table 1). The primers were synthesized by Nanjing Genscript Biotechnology Co., Ltd.

Table 1 Primers designed to detect PEDV and amplify the PEDV S gene

Detection of PEDV and amplification of the S gene

Samples were diluted with phosphate-buffered saline and centrifuged at 12,000 rpm for 5 min at 4°C, and the supernatants were transferred to 1.5-mL RNase-free tubes. Viral RNA was extracted according to the instructions of the RNA extraction kit (FastPure Cell/Tissue Total RNA Isolation Kit V2, Vazyme, China) and then reverse transcribed into cDNA according to the instructions of the reverse transcription kit (Vazyme, China). The resulting cDNA was amplified by PCR using 2× Es Taq MasterMix (Cwbio, China) and the primers shown in Table 1, and the amplicons were cloned into the plasmid pUC19 and sequenced by Nanjing Tsingke Biotechnology Co., Ltd.

Comparison of nucleotide and amino acid sequences

Amplicon sequences from classical and mutant strains were assembled into full-length S gene sequences using the SeqMan program in the Lasergene software package, and BioAider V1.423 was then used to compare the nucleotide and amino acid sequences with those of the reference vaccine strains CV777 (GenBank accession number KT323979) and AJ1102 (GenBank accession number JX188454).
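As an illustration of the kind of comparison described above, the following minimal sketch computes percent identity between two sequences that have already been aligned. It is not the BioAider implementation, whose exact treatment of gaps is not described here. The function name `percent_identity`, the gap handling (columns that are gaps in both sequences are ignored, while a gap opposite a residue counts as a mismatch), and the toy sequences are all assumptions made for this example.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for this sketch: columns where both sequences carry a gap
    are skipped; a gap opposite a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # empty column in both sequences, not informative
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real PEDV sequence data).
ref = "ATGAAGTC-TTTA"
iso = "ATGAAGTCATTCA"
print(f"{percent_identity(ref, iso):.2f}% identity")
```

The same function works for amino acid alignments, since it only compares characters column by column.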

Phylogenetic analysis

One hundred S gene sequences of representative PEDV strains were selected from the GenBank database (Table 2) for phylogenetic analysis. The 30 S sequences obtained in this study were compared with the reference sequences and used to construct a phylogenetic tree with 1000 bootstrap replicates in MEGA7, which was then visualized using Chiplot software (https://www.chiplot.online/) [19, 22, 23].

Table 2 PEDV reference strains described in this study

Amino acid sequence alignment

To identify amino acid variations in the 30 isolates from this study, amino acid sequence comparisons of representative sequences and reference sequences were performed using BioEdit.

Recombination analysis

To identify possible recombination events, gene sequences were analyzed using RDP4 v.4.101 [24]. Potential recombinant strains, parental strains, and recombination breakpoints were identified using RDP, GENECONV, BootScan, MaxChi, Chimaera, SiScan, and 3Seq. Recombination events that were identified by at least five of the methods with a cutoff of p < 0.05 were examined further by phylogenetic analysis.
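The acceptance rule stated above (support from at least five of the seven methods at p < 0.05) can be written as a simple filter. The sketch below is an illustration only: it does not read RDP4 output files, the function names are invented for this example, and the p-values are hypothetical placeholders rather than results from this study.

```python
# The seven detection methods named in the text.
METHODS = ("RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq")


def supporting_methods(p_values: dict, alpha: float = 0.05) -> list:
    """Return the methods whose p-value is below alpha.

    A missing entry or None is treated as "no signal detected".
    """
    return [
        m for m in METHODS
        if p_values.get(m) is not None and p_values[m] < alpha
    ]


def passes_filter(p_values: dict, min_methods: int = 5, alpha: float = 0.05) -> bool:
    """Apply the 'at least five of seven methods' rule."""
    return len(supporting_methods(p_values, alpha)) >= min_methods


# Hypothetical p-values for two candidate events (illustrative only).
well_supported = {
    "RDP": 1.2e-8, "GENECONV": 3.0e-5, "BootScan": 4.1e-6,
    "MaxChi": 2.0e-3, "Chimaera": 0.08, "SiScan": 1.0e-4, "3Seq": None,
}
weakly_supported = {
    "RDP": 0.01, "GENECONV": 0.2, "BootScan": 0.03,
    "MaxChi": 0.04, "Chimaera": 0.3, "SiScan": 0.6, "3Seq": 0.02,
}

for name, event in [("well_supported", well_supported),
                    ("weakly_supported", weakly_supported)]:
    print(name, len(supporting_methods(event)), passes_filter(event))
```

Events that pass this filter would still need the follow-up phylogenetic check described above; the filter only encodes the numerical threshold.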

Results

Sequence comparisons

From 2020 to 2023, nearly 2000 clinical diarrhea samples were collected from different areas of China. The main pathogens causing piglet viral diarrhea in these samples were PEDV (39.9%), transmissible gastroenteritis virus (TGEV) (8.9%), porcine deltacoronavirus (PDCoV) (20.1%), and porcine rotavirus (PoRV) (18.6%). Thirty complete S gene sequences were obtained from the PEDV-positive samples in order to examine sequence variations in recent PEDV strains.

The nucleotide and amino acid sequences of the 30 PEDV S genes from this study were compared with those of the vaccine strains CV777 and AJ1102. The nucleotide and amino acid sequences of the 30 PEDV S genes were 93.32% and 92.52% identical, respectively, to those of CV777 and 97.05% and 97.55% identical, respectively, to those of AJ1102 (Table 3). The 30 isolates from this study were similar in sequence to other recently reported isolates.

Table 3 Nucleotide and amino acid sequence identity of the isolates from this study to the vaccine strains CV777 and AJ1102

Phylogenetic analysis

The PEDV S gene sequences of 30 isolates and 100 reference sequences were used to construct a phylogenetic tree in MEGA. As shown in Fig. 1, the PEDV S genes formed two groups, corresponding to types GI and GII, which were further subdivided into six subtypes: GI-a, GI-b, GII-a, GII-b, GII-c, and S-INDEL. Three strains belonged to the GII-a subtype, accounting for 10%; two belonged to the GII-b subtype, accounting for 6.67%; 20 belonged to the GII-c subtype, accounting for 66.67%; and five belonged to the S-INDEL subtype, accounting for 16.67%. These results show that GII-c strains have become prevalent in some areas of China, and almost all of the GII-c strains were epidemic strains isolated in recent years. This illustrates the importance of continuously monitoring the circulation of new variants.
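The subtype shares reported above follow directly from the counts of isolates per subtype. A minimal sketch of that arithmetic is given here, using the counts stated in the text (3, 2, 20, and 5 of 30 isolates); the function name `subtype_percentages` is an assumption made for this example.

```python
# Number of isolates from this study assigned to each subtype (from the text).
subtype_counts = {"GII-a": 3, "GII-b": 2, "GII-c": 20, "S-INDEL": 5}


def subtype_percentages(counts: dict) -> dict:
    """Convert per-subtype counts into percentages rounded to two decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


percentages = subtype_percentages(subtype_counts)
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Because each share is rounded independently, the rounded values need not sum to exactly 100.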

Fig. 1

Phylogenetic analysis based on the PEDV S gene. Multiple nucleotide sequences were compared using MAFFT7, and a phylogenetic tree with 100 bootstrap replicates was constructed using MEGA7 and Chiplot (https://www.chiplot.online/). Isolates from this study are indicated by a red star. GI-a, GI-b, S-INDEL, GII-a, GII-b, and GII-c isolates are indicated in purple, magenta, pink, yellow, green, and orange, respectively

Alignment of S protein amino acid sequences

Mutations in the coronavirus S gene can affect the immunogenicity of the S protein and the efficiency of binding to its receptor. Previous studies have identified several neutralizing epitopes of PEDV, including E3 (aa 55–70), E4 (aa 82–98), E5 (aa 126–141), S1A (aa 435–485), COE (aa 499–638), SE16 (aa 722–731), SS2 (aa 748–755), SS6 (aa 764–771), and 2C10 (aa 1368–1374) [25,26,27,28,29]. The amino acid sequences of 14 representative strains and 30 isolates from this study were compared using BioEdit. Compared with the GI, GII-a, and GII-b subtypes, most GII-c strains have the amino acid substitutions N139D and I289M (Fig. 2, red boxes). Compared with the GII-b subtype, most GII-a, GII-c, GI, and S-INDEL strains have the amino acid substitutions T492R, T501I, A522S, and A970S (Fig. 2, blue boxes). Five of these amino acid substitutions lead to changes in amino acid polarity and charge (Table 4).

In addition, compared with the other reference strains, the five S-INDEL subtype isolates had a unique amino acid deletion at position 139, as well as the amino acid substitutions N118G, T137S, A138S, and D141G (Fig. 2, yellow boxes). The amino acid changes at positions 137–141 are located in the neutralizing epitope E5. There was a deletion (Fig. 2, green boxes) of amino acids 59–62 (QGVN) in the neutralizing epitope E3, which was also found in the GI and S-INDEL subtype reference strains, and there was a G87S amino acid substitution (Fig. 2, purple boxes) in the neutralizing epitope E4, which was also found in the GI type reference strains.
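To make the relationship between the reported substitutions and the neutralizing epitopes easier to check, the following sketch looks up which epitope, if any, contains a given position. The epitope ranges are those listed in this section; the function names are invented for this example, and the sketch assumes that substitution positions and epitope boundaries use the same residue numbering, which is not stated explicitly here.

```python
import re

# Neutralizing epitope ranges (inclusive) as listed in the text.
EPITOPES = {
    "E3": (55, 70), "E4": (82, 98), "E5": (126, 141),
    "S1A": (435, 485), "COE": (499, 638), "SE16": (722, 731),
    "SS2": (748, 755), "SS6": (764, 771), "2C10": (1368, 1374),
}


def parse_substitution(label: str) -> tuple:
    """Split a label such as 'N139D' into (reference, position, variant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def epitopes_at(position: int) -> list:
    """Return the names of all epitopes whose range contains the position."""
    return [name for name, (start, end) in EPITOPES.items()
            if start <= position <= end]


# Substitutions reported in this section.
for label in ["N139D", "I289M", "T492R", "T501I", "A522S", "A970S",
              "N118G", "T137S", "A138S", "D141G", "G87S"]:
    _, position, _ = parse_substitution(label)
    hits = epitopes_at(position)
    print(label, "->", ", ".join(hits) if hits else "outside listed epitopes")
```

Under these assumptions, the lookup places T137S, A138S, N139D, and D141G in E5, G87S in E4, and T501I and A522S in COE, while the remaining substitutions fall outside the listed epitopes.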

Fig. 2

Analysis of amino acid mutations in the S proteins of 30 PEDV isolates. The sequences were arranged and visualized using BioEdit. The characteristic amino acid mutations of the GII-c subtype are represented by red boxes. The common amino acid mutations of the GII-a, GII-c, and GI subtypes are represented by blue boxes. The unique amino acid mutations of the five S-INDEL subtype isolates are represented by yellow boxes, those that are the same as in the reference strains of the GI and S-INDEL subtypes are represented by green boxes, and those that are the same only as in GI are represented by purple boxes

Table 4 Changes in the polarity or charge of amino acids

Recombination analysis

Recombination analysis of 14 representative reference strains and 30 isolates from this study was performed using RDP4 software. The results showed that six of the 30 isolates from this study were recombinant strains. Of the seven detection methods in RDP4, five identified potential recombination events in XJ2020 and AH-FY2023; six identified potential recombination events in GD-GZ2020-10, GD-GZ2020-13, and AH-HF2022; and seven identified a potential recombination event in XJ2023.

As shown in Fig. 3a, XJ2020 was predicted to have HK2021 as its major parent and XJ1904-34 as its minor parent, with the non-recombinant region most closely related to GII-c strain HK2021 and the recombinant region (nt 976-end) most closely related to GII-c strain XJ1904-34. As shown in Fig. 3b, GD-GZ2020-10 was predicted to have JXPY as its major parent and JXGZ as its minor parent, with the non-recombinant region most closely related to GII-b strain JXPY and the recombinant region (nt 2612–4106) most closely related to GII-c strain JXGZ. As shown in Fig. 3c, GD-GZ2020-13 was predicted to have JXXG-2 as its major parent and AH2012/12 as its minor parent, with the non-recombinant region most closely related to GII-c strain JXXG-2 and the recombinant region (nt 950–1644) most closely related to GII-b strain AH2012/12. As shown in Fig. 3d, AH-HF2022 was predicted to have SDLY2020 as its major parent and SM98 as its minor parent, with the non-recombinant region most closely related to GII-c strain SDLY2020 and the recombinant region (nt 21–714) most closely related to GI-a strain SM98. As shown in Fig. 3e, AH-FY2023 was predicted to have SD2021 as its major parent and ZJU as its minor parent, with the non-recombinant region most closely related to GII-c strain SD2021 and the recombinant region (nt 46–718) most closely related to GI-a strain ZJU. As shown in Fig. 3f, XJ2023 was predicted to have OH1414 as its major parent and SQ2014 as its minor parent, with the non-recombinant region most closely related to GII-a strain OH1414 and the recombinant region (nt 46–711) most closely related to GI-b strain SQ2014.

The above results suggest that XJ2020 is a recombinant of two GII-c strains, GD-GZ2020-10 and GD-GZ2020-13 are recombinants of GII-c and GII-b strains, AH-HF2022 and AH-FY2023 are recombinants of GII-c and GI-a strains, and XJ2023 is a recombinant of GII-a and GI-b strains.

Fig. 3

Recombination analysis of the S genes of isolates XJ2020 (a), GD-GZ2020-10 (b), GD-GZ2020-13 (c), AH-HF2022 (d), AH-FY2023 (e), and XJ2023 (f) using RDP4 v.4.101. The potential recombinant strains, parental strains, and possible recombination breakpoints were identified using RDP, GENECONV, BootScan, MaxChi, Chimaera, SiScan, and 3Seq and verified by phylogenetic analysis. In BootScan and phylogenetic analysis, the potential recombinant strains, major parent, and minor parent are indicated in red, green, and purple, respectively

Discussion

In recent years, PED has become the most important intestinal disease of pigs and has had a serious impact on the pig industry in China. In order to adapt to vaccine immunity and environmental stress, PEDV has been constantly mutating, which has led to a deterioration of the protective effect of existing vaccines [30,31,32]. Therefore, understanding the characteristics of circulating PEDV strains is of great significance for prevention of PED. In this study, it was found that PEDV was still the main cause of piglet diarrhea over the past four years. Sequence comparisons showed that the 30 isolates had a high degree of similarity to recent mutants. Phylogenetic analysis indicated that the prevalent PEDV strains mainly belonged to the GII-c and S-INDEL subtypes. Multiple subtypes were found to coexist in several provinces. For example, the S-INDEL and GII-b subtypes coexist in Anhui province, the GII-a and GII-c subtypes coexist in Guangdong province, and the GII-b and GII-c subtypes coexist in Shandong province, which may be related to pig trade between different provinces.

When compared to the vaccine strain AJ1102, the 20 GII-c strains had two characteristic mutations (N139D and I289M), which can serve as markers for differentiating GII-c and GII-b strains [33]. The mutations N139D, A522S, and A970S alter the polarity or charge of the S protein and thus have the potential to affect the virulence and adaptability of the virus and its ability to escape host immune recognition. The A522S mutation is located in the COE epitope and might therefore change the immunogenicity of the virus. The N139D mutation is located in the NTD region, which is important for PEDV binding to sialic acid on the host cell surface [34].
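Whether a substitution changes side-chain polarity or charge can be checked against a standard amino acid grouping. The sketch below uses one common textbook classification (nonpolar, polar uncharged, positively charged, negatively charged); this grouping is an assumption made for illustration and is not necessarily the scheme used for Table 4, and some residues (for example G, C, Y, and H) are assigned differently in other schemes. The function name `property_change` is also invented for this example.

```python
# One common textbook grouping of side chains (an assumption of this sketch).
SIDE_CHAIN_CLASS = {}
for residue in "AVLIMFWPG":
    SIDE_CHAIN_CLASS[residue] = "nonpolar"
for residue in "STNQYC":
    SIDE_CHAIN_CLASS[residue] = "polar uncharged"
for residue in "KRH":
    SIDE_CHAIN_CLASS[residue] = "positively charged"
for residue in "DE":
    SIDE_CHAIN_CLASS[residue] = "negatively charged"


def property_change(label: str) -> tuple:
    """Return (class before, class after, changed?) for a label like 'N139D'."""
    before = SIDE_CHAIN_CLASS[label[0]]
    after = SIDE_CHAIN_CLASS[label[-1]]
    return before, after, before != after


# Substitutions discussed in the text.
substitutions = ["N139D", "I289M", "T492R", "T501I", "A522S", "A970S"]
for label in substitutions:
    before, after, changed = property_change(label)
    status = "changed" if changed else "unchanged"
    print(f"{label}: {before} -> {after} ({status})")

changed_labels = [s for s in substitutions if property_change(s)[2]]
```

Under this grouping, N139D, A522S, and A970S change class, in agreement with the statement above, whereas I289M replaces one nonpolar residue with another.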

Genome recombination can result in changes in the pathogenicity and transmission abilities of a virus [34]. In this study, we performed recombination analysis on 30 PEDV S genes and found evidence of six recombination events. The isolates XJ2023 and AH-HF2022 are new strains that resulted from recombination of a GII-a strain with a GI-b strain and of a GII-c strain with a GI-a strain, respectively. The minor parents SQ2014 and SM98 provided partial gene fragments encoding the S protein NTDs for isolates XJ2023 and AH-HF2022, respectively. The substitution of the NTD region of the S gene may change the binding activity of PEDV to sialic acid molecules and affect viral entry into host cells [20]. These two isolates appear to have been produced by the recombination of Chinese native strains and foreign strains. We speculate that the emergence of new recombinant strains is associated with the introduction of live pigs from foreign countries. Economic globalization and the international trade of live pigs increase the risk of recombination between PEDV strains from different regions, as well as the large-scale transmission and spread of PEDV.

In summary, in this study, 30 PEDV S gene sequences were analyzed, showing that GII-c was the main epidemic subtype of PEDV in China and that subtype GII-c strains had two characteristic amino acid substitutions (N139D and I289M) in their S protein. The five S-INDEL subtype isolates had a unique amino acid deletion (139N) and four amino acid substitutions (N118G, T137S, A138S, and D141G). Several strains appeared to be recombinants, suggesting that recombination plays a significant role in the diversification and complexity of PEDV.