Introduction

Classical swine fever (CSF) is a viral infection of pigs with high economic impact world-wide. The causative agent is Classical swine fever virus (CSFV), a small enveloped RNA virus belonging to the genus Pestivirus within the family Flaviviridae. The genome of CSFV has a size of approximately 12.3 kb and comprises a single open reading frame (ORF) coding for one polyprotein which is co- and post-translationally processed into 12 mature proteins [1, 2]. The ORF is flanked by 5’ and 3’ non-translated regions (NTR). While the structural proteins (C, E1, E2, and Erns) are encoded in the 5’-region of the genome, the 3’-region encodes the non-structural proteins (Npro, p7, NS2, NS3, NS4A, NS4B, NS5A, and NS5B) [3, 4] with the exception of the autoprotease Npro which precedes the structural proteins at the 5’-end of the ORF [5]. Classical swine fever virus strains can be assigned to three genogroups with three to four sub-genogroups each [6]. This classification is based on partial sequences, namely 150 nt of the 5’-NTR and 190 nt of the E2 encoding region [7]. Most genogroups show a distinct geographical distribution pattern [8] but there is no clear correlation between a specific genogroup and virulence. Over the last decades, moderately virulent strains of genogroup 2, especially sub-genogroups 2.1 and 2.3, have predominated in Europe and several other regions world-wide [9].

Course and outcome of CSF can vary between acute, chronic, and prenatal forms of infection [10]. The two former are results of postnatal infection, with the acute form leading to either death (acute-lethal) or convalescence (acute-transient) of the infected animal, and the chronic form always being fatal [10]. Among the typical signs of acute CSF are high fever, general depression, anorexia, gastrointestinal and respiratory signs, ataxia, and hemorrhages [10]. Initial signs of chronic CSF are similar to those of the acute infection but generally milder. Later, predominantly non-specific signs are observed, including intermittent fever, chronic enteritis, and wasting. The affected animals may survive for a few months before they eventually die. The outcome of prenatal CSFV infections depends on the stage of gestation and the virulence of the CSFV strain involved [11]. While transplacental infections in early pregnancy often result in abortions and malformations, immunotolerant, persistently infected piglets can result when infections take place in the second or third month of gestation (mainly between days 50 and 70). This phenomenon is similar to bovine viral diarrhea virus infection and the induction of persistently infected calves [10]. Affected animals are reported to eventually die from the so-called “late-onset” form of CSF. Recently, it has been demonstrated that a persistence/immunotolerance phenomenon can also be induced in suckling piglets very early after farrowing [12, 13]. Both the animals displaying the postnatally acquired chronic form of CSF and the persistently infected animals constantly shed large amounts of virus and are therefore important reservoirs and sources of CSFV. However, it is important to note that persistent infection is a tolerance phenomenon, while chronic infections result from an impaired but existent immune response.
While the acute courses of the disease have been extensively studied, research into the epidemiologically important chronic infection (here defined as postnatal infection for at least 28 days with constant shedding of virus and no or little antibody response) is hampered by its rare occurrence under experimental conditions. Hence, little is known about host and viral factors favoring the chronic course. Known factors influencing course and outcome of CSFV infection include age and immune status of the host as well as the virulence of the CSFV strain involved [10]. The chronic course of disease is mainly seen after infection with moderately virulent CSFV strains and is characterized by a prolonged period (at least 28 days) of unspecific signs (intermittent fever, chronic enteritis, wasting) and constant shedding of virus [10]. Antibodies may temporarily be present at low titers; however, the host’s immune system seems to be unable to mount an effective immune response. Recently, it has been demonstrated that chronically infected animals show an up-regulation of genes that can inhibit NF-κB- and IRF3/7-mediated transcription of type I interferons [14]. Moreover, activation of natural killer and cytotoxic T-cell pathways seems to be impaired [14]. In addition, it has been shown that genes related to the human autoimmune disease systemic lupus erythematosus were up-regulated in animals suffering from chronic disease [14]. The latter findings support the assumption that chronic CSF has a strong immune-pathological aspect. Very little is known about the impact of CSFV genetics and genetic adaptations during the infection process on the manifestation of a chronic disease, and in this respect, known links between viral gene functions (e.g. inhibition of interferon responses or apoptosis inhibition) and chronicity are scarce. Given the high viral replication in chronic infection, a high degree of genetic plasticity was anticipated, at least similar to that of persistently infected cattle [15–17].

Here, we studied the possible influence of the viral genome and its quasispecies composition on the course of CSF. Based on the finding that viruses re-isolated from a chronically infected animal did not indicate an impact of the virus and its quasispecies, additional samples were selected from animal trials performed with different CSFV strains which had led to acute and chronic courses in different animals. These samples were subjected to full-genome virus deep-sequencing. The resulting sequences of the inocula and viruses found in acutely or chronically infected animals were compared at the consensus sequence and at the quasispecies level. It has to be noted that chronic disease courses cannot be reliably induced and that none of the initial trials were conducted to study chronic CSF infections. Thus, only the remaining, sometimes sub-optimal materials of a few individual animals could be investigated over time (the total of samples available to the authors). However, the samples at hand still provided the opportunity to get an orientating dataset that was, to date, missing.

Materials and Methods

Samples, RNA extraction, library preparation, and sequencing

Samples were selected from previously conducted and published animal trials [18–21] that had led to the occurrence of chronically infected animals (clinical, virological and serological responses of the included animals are summarized in Supplementary Table S1). Depending on the availability of suitable blood and leukocyte samples, acute and chronic disease courses were compared at virus level. Details of samples subjected to next-generation sequencing (NGS) are presented in Table 1. All inocula used for the animal trials were sequenced, and sample preparation and sequencing of the inocula were done as described below for the samples derived from the respective animal trials. From all samples, total RNA was extracted with Trizol Reagent (Invitrogen, Carlsbad, USA) in combination with RNeasy columns (Qiagen, Hilden, Germany) including on-column DNase I digestion, as recommended by the manufacturer. RNA was converted to double-stranded DNA using the cDNA synthesis kit (Roche, Mannheim, Germany) as described in the Genome Sequencer Rapid RNA library preparation guide (Roche).

Table 1 Summary of sample materials and obtained sequencing results

Animal trial 1: In this trial, a group of five weaner pigs (about 6 weeks of age) was intranasally and intramuscularly inoculated with 10^6 50 % tissue culture infectious doses (TCID50) of the moderately virulent CSFV “Alfort-p447” (genotype 2.3) derived from an infectious cDNA clone, as previously described [19]. One animal (#43) showed a chronic CSFV infection, and samples of days 10 and 44 post infection were subjected to NGS. In addition, sequencing was attempted from a sample of an acute-transiently infected animal taken at 10 days post infection (dpi). Despite the limited availability of sample material, this trial was used as a proof-of-principle approach. Samples were prepared for pyrosequencing, as previously described [22]. Sequencing was conducted on the Genome Sequencer FLX (Roche) with Titanium chemistry (Roche) according to the manufacturer’s instructions.

Animal trial 2: In this vaccination/challenge trial, a control group of four domestic weaner pigs (6–8 weeks of age) was oro-nasally inoculated with 3.6 × 10^5 TCID50 of the 4th passage of the moderately virulent, genotype 2.1 CSFV isolate “CSFV1047” [20], isolated in Israel in 2009 [21]. Samples taken at 10 and 41 dpi from two animals, which developed a chronic infection, were subjected to NGS. These samples were supplemented with a sample originating from a pig showing the acute-lethal course of CSF (taken 14 dpi). Unfortunately, sufficient sample material of the original inoculum was not available for all tests. Therefore, the 5th passage was also sequenced to serve as a substitute inoculum dataset. However, since it showed three consensus substitutions compared to the 4th passage, it was treated as an individual sample. Sequencing libraries were prepared using the Nextera XT kit (Illumina, San Diego, USA) according to the manufacturer’s instructions, except for T2-inoculum and T2-CSFV1047-P5, which were prepared as previously described [23]. Additionally, T2-CSFV1047-P5 was amplified over 12 cycles. Sequencing was done with the Illumina MiSeq (Illumina).

Animal trial 3: The corresponding samples were collected during a host response trial [18] with pigs of different breeds (German landrace pigs (12 weeks of age), hybrid pigs (8–10 weeks of age), European wild boar (12 weeks of age)) which were oro-nasally inoculated with 10^5.5 TCID50 of the moderately virulent, genotype 2.3 CSFV strain “Roesrath” (isolated in Germany in 2009 [24]). One wild boar became chronically infected, and samples from days 3 and 14 post infection were used for NGS. In addition to the samples from the chronically infected animal, four samples from animals with the acute-lethal disease course (three domestic pigs of different breeds and one wild boar) and two from transiently infected animals (one domestic pig, one wild boar) were sequenced. Sequencing libraries were prepared as previously described [23] and amplified after library preparation using Nextflex primer (Bioo Scientific, Austin, USA) and AccuPrime Polymerase (Invitrogen) for 15 cycles. Library T3-inoculum was initially prepared for pyrosequencing as described above (animal trial 1) and subsequently converted into an Illumina-compatible library using the Nextera XT kit, as per the manufacturer’s recommendations. Sequencing was done with the Illumina MiSeq (Illumina).

Sequence data analysis

Consensus sequences were assembled from the raw data using the Genome Sequencer software suite (v. 2.6; Roche). To this end, raw reads originating from CSFV RNA were identified by mapping (Genome Sequencer software suite v2.6) the complete datasets against all available CSFV sequences in Genbank, and mapped reads were subsequently used for de novo assembly. In case of data originating from animal trial 2, only partial datasets were used for the initial de novo assembly of complete genome sequences. Finally, to eliminate assembly errors, all raw sequence reads were mapped against the resulting consensus sequences. To compare consensus sequences of complete coding regions, these were aligned using MAFFT [25] within Geneious 8.0.5 (Biomatters Ltd, Auckland, New Zealand).
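The comparison of aligned consensus sequences can be illustrated with a minimal Python sketch. It is not the Geneious/MAFFT workflow used here; it only shows how substitutions in the notation used in this article (reference base, position, sample base) could be listed from two already aligned sequences. The function name and the handling of gaps and ambiguous bases are assumptions made for illustration.

```python
def list_substitutions(reference: str, sample: str) -> list[str]:
    """List substitutions (e.g. 'G3C') between two aligned sequences.

    Positions are 1-based and count reference bases only, so alignment
    columns where the reference has a gap receive no coordinate.
    Gaps and ambiguous bases (N) are skipped rather than reported
    (an illustrative choice, not taken from the article).
    """
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have equal length")
    substitutions = []
    ref_pos = 0
    for ref_base, sample_base in zip(reference.upper(), sample.upper()):
        if ref_base == "-":
            continue  # insertion relative to the reference
        ref_pos += 1
        if sample_base in ("-", "N") or ref_base == "N":
            continue  # deletion or ambiguous call, not a substitution
        if ref_base != sample_base:
            substitutions.append(f"{ref_base}{ref_pos}{sample_base}")
    return substitutions


# Toy sequences, not CSFV data
print(list_substitutions("ACGTAC", "ACCTAT"))
```

An empty list corresponds to the "no differences at the consensus sequence level" outcome reported for most samples.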

Population analyses

Quality trimming and duplicate read removal were performed prior to population analysis using the QUASR pipeline [26] with -l 50 and -m 30 for minimum length and minimum mean read quality, respectively, to remove PCR artifacts. For population analysis, the procedure previously described [27] was further developed. In brief, ten equalized read datasets of each sample, resulting in an average depth of 500x or 250x for T2 and T3, respectively, were mapped (Genome Sequencer software suite v3.0; Roche) along the respective inoculum consensus genome (forward and reverse) and a detailed 454AlignmentInfo.tsv file was generated using the -nft parameter. All subsequent calculations were performed in R [28]. From the data in file 454AlignmentInfo.tsv, proportions were calculated for every base at every position. For high reliability of the results, only variants with frequencies of at least 0.05 were taken into account, and only when the frequencies between the forward and reverse mapping results did not differ more than 4-fold. Again, this was done to exclude variants which were possibly introduced by PCR bias. For the calculation of Manhattan distances between viral populations, 500 bootstrap replicates were calculated from the complete table representing the frequencies of all bases at all genome positions. Finally, the mean distances of the bootstraps were calculated for each partial read dataset. For plotting, the Manhattan distances were fitted 2-dimensionally using the R function cmdscale. For the final plot, the focus of the population locations for each sample was calculated.
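The variant filter described above can be sketched as follows. This is a Python illustration, not the R code used in the study. The data layout (dictionaries keyed by position and base) and the use of the mean of the forward and reverse frequencies as the reported value are assumptions, because the text does not state which of the two frequencies was compared with the 0.05 threshold.

```python
MIN_FREQUENCY = 0.05    # minimum variant frequency, as stated in the text
MAX_STRAND_RATIO = 4.0  # maximum fold difference between the two mappings


def filter_variants(forward, reverse,
                    min_frequency=MIN_FREQUENCY,
                    max_ratio=MAX_STRAND_RATIO):
    """Keep variants supported by both forward and reverse mapping.

    forward, reverse: dict mapping (position, base) -> frequency.
    Returns a dict mapping (position, base) -> mean frequency.
    Assumption: the mean of both mappings must reach min_frequency.
    """
    kept = {}
    for key in forward.keys() & reverse.keys():
        freq_fwd, freq_rev = forward[key], reverse[key]
        if freq_fwd <= 0 or freq_rev <= 0:
            continue  # absent in one mapping: fold difference undefined
        fold = max(freq_fwd, freq_rev) / min(freq_fwd, freq_rev)
        mean_freq = (freq_fwd + freq_rev) / 2
        if mean_freq >= min_frequency and fold <= max_ratio:
            kept[key] = mean_freq
    return kept


# Hypothetical frequencies, not taken from Tables 2 or 3
forward = {(10, "A"): 0.40, (20, "C"): 0.02, (30, "G"): 0.50}
reverse = {(10, "A"): 0.44, (20, "C"): 0.03, (30, "G"): 0.10}
print(sorted(filter_variants(forward, reverse)))
```

In this toy input, only the variant at position 10 passes: the one at position 20 is below the frequency threshold and the one at position 30 differs 5-fold between the two mappings.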

Results

For 12 out of a total of 26 samples, derived from 3 different animal trials with different breeds and disease courses, sufficient raw reads were obtained for the assembly of complete genome sequences. In general, the generation of sufficient raw data for consensus genome sequence assembly was possible from the inocula and from samples derived from animals after infection of at least 10 days, more readily for chronically than from acutely infected animals. Moreover, for the samples from animal trials 2 and 3, the raw data also enabled single nucleotide variant (SNV) analyses at the population level.

Prolonged viral replication in chronically infected animals does not give rise to consensus sequence alterations

As a starting point, and to get an impression of the major changes within the viral genomes, raw sequence reads were assembled into consensus sequences comprising at least the complete coding sequences. Regardless of the disease course, comparison of these sequences uncovered no marked differences between the sequences determined for the viruses from the inocula and those derived from animal samples: none at all in animal trials 1 and 3, and four single base substitutions in one sample of animal trial 2.

From animal trial 1, two out of the available four samples yielded sufficient raw data. Besides the inoculum (T1-inoculum), the data obtained for two samples drawn from one chronically infected animal at two time points (T1-DP43C10L and T1-DP43C44S) were sufficient for the analysis of the complete coding sequences. Comparison of the available consensus sequences for these samples showed no differences for the complete genome. The sample collected from an acute-transiently infected animal did not contain enough viral RNA to warrant sequencing of the viral genome.

All samples collected from animal trial 2 contained high viral loads that enabled complete genome sequencing. Like in animal trials 1 and 3 (see next paragraph), analyses of the viral genome sequences derived from the samples drawn from two chronically infected animals (T2-DP365C10B, T2-DP365C41B, T2-DP366C10B, and T2-DP366C41B) revealed no differences at the consensus sequence level in comparison with the sequence determined for the inoculum (T2-inoculum). However, in contrast to trial 3, the comparison of the consensus sequence of the sample taken from an acute-lethally infected pig (T2-DP368AL14B) with the consensus sequence of the inoculum (T2-inoculum) unveiled 4 single base substitutions (A3245C, T3381C, C3724A and C8955T), 2 of which were synonymous and 2 non-synonymous. Notably, one synonymous and one non-synonymous substitution occurred within the genetically very stable p7-encoding region. Since the available material from the inoculum of animal trial 2 only permitted sequencing to a median depth of not more than 33, we attempted to add depth by sequencing the 5th passage of this isolate (T2-CSFV1047-P5; one cell culture passage from the initial virus). However, the comparison of the consensus sequences obtained for the 4th and 5th passages revealed three differences in the sequences, namely T1428A, T3603C, and A8934T. Of these, T1428A is non-synonymous causing the amino acid substitution S476R within the Erns protein which is a known cell culture adaptation enabling heparan sulfate binding [29]. Due to the aforementioned changes, the dataset for the inoculum remained unchanged.
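The relation between the nucleotide substitution T1428A and the amino acid substitution S476R can be checked with a short calculation. The sketch below assumes that the nucleotide position is counted from the first nucleotide of the ORF; this is not stated explicitly in the text but is what the numbering suggests, since 476 × 3 = 1428. The function name is hypothetical.

```python
def codon_coordinates(orf_position: int) -> tuple[int, int]:
    """Map a 1-based ORF nucleotide position to (codon number, position in codon).

    Assumption: positions are numbered from the first nucleotide of the ORF.
    """
    if orf_position < 1:
        raise ValueError("positions are 1-based")
    codon_number = (orf_position - 1) // 3 + 1
    position_in_codon = (orf_position - 1) % 3 + 1
    return codon_number, position_in_codon


print(codon_coordinates(1428))  # (476, 3)
```

Under this assumption, nucleotide 1428 is the third position of codon 476. A T-to-A change at the third position of the serine codon AGT would give the arginine codon AGA, which would be consistent with S476R; the actual codon is not given in the text.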

For four out of fourteen samples that were available from animal trial 3, sequencing yielded sufficient raw data for full-genome analysis. Like in the other two animal trials, comparison of the consensus sequences obtained from the sample of the chronically infected animal (T3-WB15C14L) uncovered no differences. Unlike sample T2-DP368AL14B, comparison of the consensus sequences obtained for the acute-lethally infected animals (T3-WB16AL14L, T3-DP49AL14L, and T3-DP53AL14L) also uncovered no deviations from the inoculum (T3-inoculum) sequence.

Population analyses uncover differences in virus evolution between different trials and inocula

For a more detailed insight, we performed viral population analyses. To achieve a high resolution in determining the viral relations we analyzed the population data according to an approach previously described [27] and similarly also applied elsewhere [30]. This analysis takes the frequencies of all detected single nucleotide variants (SNV) into account to calculate distances between populations. This calculation was only possible for animal trials 2 and 3, since for animal trial 1 the sequence depth was not sufficient and no additional sample material was available for additional sequencing. The results of the population-based distance and diversity analyses are summarized in Figure 1.
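How base frequencies translate into a distance between two populations can be shown with a minimal Python sketch of the Manhattan distance. It is an illustration only and omits the equalized read datasets and bootstrap replicates described in Materials and Methods. The data layout (position mapped to per-base frequencies) and the requirement that both tables cover the same positions are assumptions.

```python
BASES = "ACGT"


def manhattan_distance(pop_a, pop_b):
    """Sum of absolute base-frequency differences over all positions.

    pop_a, pop_b: dict mapping position -> dict mapping base -> frequency.
    Bases missing at a position are treated as frequency 0.
    Assumption: both tables describe the same set of genome positions.
    """
    if pop_a.keys() != pop_b.keys():
        raise ValueError("populations must cover the same positions")
    total = 0.0
    for position, freqs_a in pop_a.items():
        freqs_b = pop_b[position]
        for base in BASES:
            total += abs(freqs_a.get(base, 0.0) - freqs_b.get(base, 0.0))
    return total


# Hypothetical populations, two positions each
inoculum = {1: {"A": 1.0}, 2: {"C": 0.6, "T": 0.4}}
animal = {1: {"A": 1.0}, 2: {"C": 0.9, "T": 0.1}}
print(round(manhattan_distance(inoculum, animal), 3))
```

Because every base at every position contributes, a variant that rises to a high frequency in one population and a set of several low-frequency variants can both add to the distance.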

Fig. 1

Metrically scaled Manhattan distances between viral populations. The sizes of the circles represent the diversity of the viral populations (solid, mean variability calculated from the replicates; dashed, mean plus standard deviation). For details about calculations please refer to materials and methods. (A) Viral populations of animal trial 2 (T2). For T2-inoculum no standard deviation could be calculated since the data were not sufficient for replicate calculations based on partial datasets. (B) Viral populations of animal trial 3 (T3) plotted with the same scale as (A). The dotted rectangle shows the region that is enlarged in (C). (C) Enlarged plot of viral populations of T3

The viral population of T2-inoculum had a certain diversity, as reflected by the spot size in Figure 1A. Regarding the chronically infected animals, the viral diversity clearly increased only in animal DP366 after 41 days (T2-DP366C41B), and stayed relatively constant in the other samples. In comparison with the inoculum, the viral population drawn from the acute-lethally infected animal (DP368) had also diversified (T2-DP368AL14B). The overall topology of the plot (Figure 1A) implies that in all animals a similar portion of the initial viral population was selected, since all populations shifted in the same direction from the inoculum, although over different distances. In contrast, the population of the 5th cell culture passage (T2-CSFV1047-P5) of the original virus isolate shifted in the opposite direction, implying that a different subpopulation of the original strain had further adapted to cell culture. This is in concordance with the consensus sequence analysis. The viral populations of the samples derived from individual chronically infected animals were closely related to each other. In both cases, the viral populations found in samples drawn from the chronically infected animals at 41 dpi moved slightly back in the direction of the inoculum, i.e. they had a slightly higher similarity with the inoculum than did the samples taken at 10 dpi. The population of the acute-lethally infected animal substantially deviated from all populations derived from chronically infected animals. Nevertheless, it was more closely related to the samples after animal passage than to the inoculum and its 5th cell culture passage.

In contrast to animal trial 2, the diversity and distance analysis of the data available for trial 3 revealed lower distances between the populations, although the populations were similarly diverse as those detected in animal trial 2. Figure 1B depicts the results of this analysis for animal trial 3 at the same scale as Figure 1A for animal trial 2. In the enlarged plot (Figure 1C) of the distance and diversity analysis for trial 3, it is visible that the diversity initially present in the inoculum in trial 3 had substantially increased after the animal passages. Moreover, while the analysis of animal trial 2 unveiled that the same portion of the population of the inoculum was selected (as visualized by the concurrent shift of the spots of the animal samples) and further diversified, there is no direction visible in the graph for animal trial 3. Rather, the inoculum is located centrally among the populations drawn from the animals.

Detailed analyses of the viral populations fit distance and diversity analyses and provide functional clues to detected variants

Although the sequencing depth for the inoculum in animal trial 2 (T2-inoculum) was less than that for the other samples, the variants detected in the population appear to be reliable. This is implied by the concordance between the variable positions detected in the populations of both the inoculum (T2-inoculum) and its subsequent passage (T2-CSFV1047-P5), albeit at different frequencies (Table 2, e.g. 1428, 3603, 8934). Moreover, the presence of variants at the same positions in the populations derived from animal samples strengthens this assumption.

Table 2 Summary of variants detected in the populations of samples derived from animal trial 2, mean percentages and standard deviations

A closer look at the detected SNVs supports the implication of the distance and diversity analyses that in animal trial 2 a selection against cell culture adaptations of the inoculum population took place. Some examples of this selection are provided in the following. First, the frequencies of the three variants that were detected in the inoculum (Table 2; T2-inoculum; 1428A, 3603C, 8934T, with frequencies of roughly 0.4) were clearly reduced after the animal passages, but were enhanced to frequencies of approximately 0.8 in the subsequent cell culture passage (T2-CSFV1047-P5). Although all three were detected with clearly reduced frequencies in the animal samples, one of these (1428A) was still present in all viral populations after animal passage, whereas the other two (3603C, 8934T) were absent from the population sampled in the acute-lethally infected animal. Although variant 2317A (Table 2) was not detected in T2-inoculum, it was present in all animal-derived populations. Second, most variants detected in the populations from chronically infected animals occurred in a pairwise manner, i.e. a variant that was present at 10 dpi was also present at 41 dpi, as was the variant at position 3192 in animal DP366 (Table 2; T2-DP366C10B, T2-DP366C41B). Conversely, those variants that were absent from one of the populations were not detected in the corresponding sample either (see variants at position 3192 in animal DP365). Third, a substantial number of variants occurred exclusively in the viral population of the acute-lethally infected animal DP368 (Table 2; T2-DP368AL14B). Four of these variants (3245C, 3381C, 3724A, 8955T) were found with a constant frequency of roughly 0.75. These determine the consensus and contribute substantially to the distance of the respective population from all other populations. The exclusive differences are the basis for the clear separation of DP368 from the virus populations derived from the chronically infected animals.

In the samples collected in animal trial 3, a different situation was observed. On the one hand, no samples drawn from the same animal at different times were available, rendering the correlation of variant frequencies over time impossible. On the other hand, more variants were present in the inoculum population, albeit at lower frequencies. These variants were all maintained in the viral populations in all but one of the animal samples (Table 3; variant 1957A was absent from the population in sample T3-DP53AL14L).

The scattering of the populations in variable directions and distances from the inoculum can be explained by the occurrence of different variants which are mainly found in individual samples. Four exceptions from this observation were detected (Table 3; 472A/472C, 2655C/2655G, 7032G, 8455T). Of note, at nucleotide 472, the two different variants 472A and 472C were detected in a background of 472T in samples T3-WB16AL14L and T3-DP53AL14L, respectively.

Table 3 Summary of variants detected in the populations of samples derived from animal trial 3, mean percentages and standard deviations

In both animal trials 2 and 3, a number of non-synonymous variants were detected within the viral populations (Tables 2 and 3): four variants at four positions in trial 2 and six variants at five positions in trial 3. The density of non-synonymous variants was higher in the regions encoding proteins for direct interaction between virus and host, i.e. the surface proteins of the virus. In virus populations derived from animal trial 2, two of the four non-synonymous variants were located in non-structural protein encoding regions, namely in the p7 and in the NS2 encoding regions. The other two non-synonymous variants were located in the Erns and the E2 encoding regions, both known as antibody targets. In the populations analyzed from animals of trial 3, three out of six non-synonymous SNVs were located in genomic regions encoding non-structural proteins. Two of these were different variants at the same position in the Npro coding region. Another non-synonymous SNV in samples T3-WB16AL14L and T3-DP49AL14L was located in the NS4B encoding region. One of the other three non-synonymous SNVs was located within the Erns encoding region and the remaining two were located in the E1 encoding region.

Discussion

Chronically infected animals are among the crucial factors impacting on disease transmission and perpetuation of long-term outbreaks. While their role might be limited in outbreak scenarios under modern industrialized settings with a stamping-out policy, they gain importance in endemically infected countries and wild boar populations. In the latter, chronic and persistent infections are probably the most important drivers of long-term persistence of CSFV in a wild boar population [31] and harbor the threat of reintroducing CSFV into naïve populations.

The factors influencing the development of chronic infections are still far from being understood although chronicity seems to be linked to the moderate virulence of the CSFV strain involved [14]. One reason for the lack of data is the rare and rather unpredictable occurrence of chronically infected animals under experimental conditions and the limited availability of comparable sample sets. In the present study, we collected samples from different animal trials that had given rise to acute and chronic infections and accepted the diversity of sample matrices, viral load, and background. None of the studies was performed to investigate chronic infections and thus, remaining and sometimes suboptimal samples had to be employed.

Essentially, we tested the hypothesis that chronic infections could result from changes in the viral consensus sequence and/or a higher viral diversity and thus (partial) immune escape and continued high-level replication.

However, our results do not support this hypothesis. Even under the condition of chronic infection, CSFV is surprisingly stable at both the consensus and the quasispecies level, in contrast to other infections, e.g. persistent BVDV [17]. High stability was shown in different settings with both field-type viruses with expected viral quasispecies and viruses derived from a cDNA clone with lower quasispecies diversity. Single base substitutions were only observed for one acute-lethally infected animal, which showed two synonymous and two non-synonymous substitutions. Interestingly, two of these substitutions (one of them non-synonymous) occurred in the p7-encoding region. This contrasts with findings for similar strains under field conditions, where this region was completely stable for the same strain over years [32]. The fact that the known cell culture adaptive Erns mutation was found in the 5th passage of CSFV “CSFV1047” adds reliability to the analyses and can act as a positive control.

Subsequent population analyses also did not reveal any marked differences between acute-lethally and chronically infected animals. However, an accordant drift was observed for all animal-derived samples in trial 2, and the virus population from both one chronically and one acute-lethally infected animal showed an increased diversity. In trial 3, viral sequence diversity increased but distances were smaller. Again, no differences were seen among disease courses. Thus, also the direction and extent of genetic changes did not indicate factors favoring chronicity.

Detailed analyses of the SNVs in samples from trial 2 suggest that the observed changes (the observed drift) mainly reflect selection against cell culture adaptation. This assumption is strengthened by the fact that the further passaged virus (passage 5) shows mutations in the opposite direction and that such selection is missing in trial 3, where a virus was used that showed a field-type phenotype, both at the sequence level and in in vitro experiments [33, 34]. Furthermore, this virus showed a higher quasispecies diversity, as is expected for a successful field virus infection.

Yet, the high overall stability and rather low diversity under selective pressure and high replication is surprising as the existence of a broad quasispecies has often been discussed as a virulence and robustness factor [35]. Moreover, reduced quasispecies diversity was shown to result in virus attenuation [36]. An explanation could be that most variants were non-functional or disadvantageous for virus replication and therefore remained at low frequency. This would be in line with recent studies that showed that a majority of virus variants are non-functional [37]. Furthermore, it could be shown that highly virulent CSFV clones could be generated based on only the consensus sequence [38].

Based on the limited data set that was available for analyses, it can be stated:

  • CSFV seems exceptionally stable even under conditions of chronic infection with high and continued viral replication.

  • No marked differences in the viral genomes were observed between an acute-lethal and a chronic infection with the same CSFV strain.

  • The disease courses seem to be independent of the viral quasispecies (no predictors for chronicity).

  • Host factors and virus-host interactions need further investigation, preferably targeting a larger and more consistent data set.