1 The Time Is Right to Apply Genomic Tools for Improvement of Complex Health Traits in Pigs

In the last few decades, growth, meat quality, and feed and reproduction efficiency have been the most thoroughly studied traits in swine breeding; in recent years, however, pig performance in the face of disease challenge has become progressively more important. Hence, selection objectives in the swine breeding industry have broadened to include traits that reflect overall robustness and disease resistance (Mellencamp et al. 2008). Heritabilities of cellular immune traits associated with resistance are often very high (Flori et al. 2011), so genetic selection towards more resistant pigs is certainly a feasible method to improve both animal production and welfare, although possible trade-offs with other traits should be kept in mind (Rauw 2012; Stear et al. 2001). However, the biology behind resistance towards even a single pathogen is highly complex and dynamic, creating an opportunity to apply systems immunology or systems biology to improve disease resistance (Kidd et al. 2014). The integration of experimental and computational research would allow a better understanding of these complex biological systems (Hollung et al. 2014), and high-throughput technologies, measuring thousands of parameters at once, would provide the requisite datasets (Kidd et al. 2014). To date, substantial whole genome, transcriptome and proteome data have been collected with regard to several economically important swine diseases; metabolome and microbiome datasets are also growing. The biggest challenge lies in bringing the data together to understand the immune responses in a comprehensive way and to use such information to improve pig health practically and sustainably.

2 Systems Biology Tools and Their Use in Pig Disease Studies

A multitude of pig disease genetics studies make use of knowledge gathered by genome-wide association studies (GWAS) through examining possible associations between single nucleotide polymorphisms (SNPs), insertions, deletions or copy number variants (CNVs) and the disease of interest (Arakawa et al. 2015; Fowler et al. 2013; McKnite et al. 2014; Sharma et al. 2015). When a mutation is found to be associated with the disease trait, one can select for it, with consideration of additional effects the mutation might have on other traits. At the transcriptomic level, differential expression (DE), usually over time or between disease states, can be informative. When differences in gene expression between diseased and healthy phenotypes can be reproduced in other populations, the genes concerned can be used as biomarkers. This is called signature-based analysis (Bebek et al. 2012). However, a single marker or a set of marker genes is usually not enough to explain or predict a complex phenotype. Integrative analyses merging gene expression profiles with pathway data have been shown to be helpful in understanding immune responses (Sahadevan et al. 2014). The gene set enrichment analysis (GSEA) algorithm is a powerful method to find enriched pathways in the transcriptome (Subramanian et al. 2005). With this method, a pathway is scored according to how many of the genes representing the pathway appear among the most strongly up-regulated or down-regulated genes, and how extreme their positions in the ranked list are. A similarly well-known annotation tool is the Database for Annotation, Visualization and Integrated Discovery (DAVID) (Huang da et al. 2009), which works together with the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway databases (Kanehisa et al. 2014) for pathway visualization. InnateDB can be a powerful tool to annotate sets of genes to specifically examine the innate immune response (Lynn et al. 2008).
One or more of these tools are typically used when analyzing pig transcriptomic data from microarray or RNA-seq. These pathway-based analyses have a few drawbacks. Well-characterized pathways are easier to detect than those involving less studied genes, and may therefore be overemphasized in such annotation analyses. There is also the assumption that expression patterns of genes coding for proteins in a pathway should show a clear correlation with others in the same pathway, which is not necessarily always the case (Bebek et al. 2012).
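
The running-sum idea behind GSEA-style pathway scoring can be illustrated with a minimal sketch. The function below is a simplified stand-in and not the published GSEA implementation: it walks down a ranked gene list, steps up at pathway genes (weighted by their ranking score) and down otherwise, and returns the largest deviation from zero. Permutation-based significance testing is omitted, and the function name, the weight parameter and the toy data are assumptions made for illustration.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Simplified GSEA-style running-sum statistic (illustrative only).

    ranked_genes: gene names sorted from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order.
    gene_set: set of gene names representing one pathway.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    # Pathway genes step the running sum up in proportion to |score|^weight.
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    total = sum(hit_weights)
    if total == 0:
        raise ValueError("all pathway genes have a zero ranking score")
    running, best = 0.0, 0.0
    for w, hit in zip(hit_weights, in_set):
        running += w / total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


# Hypothetical toy ranking: six genes ordered by a DE statistic.
toy_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
toy_stats = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
up_pathway_score = enrichment_score(toy_genes, toy_stats, {"g1", "g2"})
down_pathway_score = enrichment_score(toy_genes, toy_stats, {"g5", "g6"})
```

A positive score indicates that the pathway genes are concentrated at the up-regulated end of the list, a negative score that they are concentrated at the down-regulated end.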

However, systems biology goes further than a genome or transcriptome study and the annotation of overrepresented and underrepresented pathways. It is a combination of knowledge concerning several biological system parts, e.g. DNA, RNA, proteins, cells, tissues, organs, organisms and ecologies. Rather than producing these data and solely giving a summation of the outcome in each field, systems biology attempts to understand how the parts work together to create emergent properties that are less likely to be observed (recognized) in analyses of only components of the system. A further goal, specifically for systems immunology, is predicting how, in the light of health research, genetic and regulatory interactions, as well as environmental factors, orchestrate responses to disease (Tuggle et al. 2011). In what follows, we describe network-based and multilevel data integration analyses and, although both methods are still in their infancy in pig disease studies, we provide examples and illustrate their use.

2.1 Network-Based Analyses of Porcine Immunological Responses

One emerging network-based tool is the Weighted Gene Co-expression Network Analysis (WGCNA) developed by Langfelder and Horvath (2008). WGCNA was originally developed for microarray analyses, but is also applicable to RNA-seq data (Langfelder and Horvath 2008). Whereas in DE analyses only genes that pass an arbitrarily defined statistical threshold for DE are used for analysis, in WGCNA genes with a similar expression pattern across the experiment are clustered together into modules. Thus, genes with only a small but consistent difference over time or between phenotypic groups can be clustered in a module and will be considered. A unique and useful component of the WGCNA package is that, after clustering, the calculated eigengene of a module (defined as the first principal component of that module) can be correlated with an external numerical or categorical trait. Correlation coefficients together with nominal p-values indicate the strength of a module’s relationship to the trait of interest. To understand their biological relevance to the trait, the genes in a module can then be analyzed together with GO annotation tools.
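
To make the eigengene concept concrete, the following minimal sketch computes the first principal component of a small module by power iteration and correlates it with an external trait. It is not the WGCNA package itself: the function names and toy data are hypothetical, the sign convention (aligning the eigengene with the mean standardized expression) is an assumption, genes with constant expression are not handled, and p-values are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(row):
    """Centre and scale one gene's expression profile across samples."""
    n = len(row)
    mean = sum(row) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / (n - 1))
    return [(v - mean) / sd for v in row]


def module_eigengene(expr, n_iter=200):
    """First principal component (one value per sample) of a module.

    expr: list of gene profiles, each a list of values across samples.
    """
    z = [standardize(row) for row in expr]
    n_samples = len(z[0])
    v = z[0][:]  # non-zero starting vector
    for _ in range(n_iter):
        # Power iteration on Z^T Z, computed as Z^T (Z v).
        gene_scores = [sum(r[j] * v[j] for j in range(n_samples)) for r in z]
        v = [sum(s * r[j] for s, r in zip(gene_scores, z))
             for j in range(n_samples)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # The sign of a principal component is arbitrary; align with mean expression.
    mean_profile = [sum(r[j] for r in z) / len(z) for j in range(n_samples)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-a for a in v]
    return v


# Hypothetical toy module of three genes measured in six animals.
toy_module = [
    [1.0, 2.1, 2.9, 4.2, 5.1, 5.8],
    [2.0, 4.1, 6.2, 7.9, 10.1, 12.0],
    [0.5, 1.1, 1.4, 2.1, 2.4, 3.1],
]
toy_trait = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # e.g. a numerical phenotype
eigengene = module_eigengene(toy_module)
module_trait_r = pearson(eigengene, toy_trait)
```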

With regard to disease in pigs, Kommadath et al. (2014) used WGCNA to examine the blood transcriptome in pigs infected with Salmonella enterica serovar Typhimurium (ST) and grouped in extremes for the amount of bacteria shed in feces, i.e. eight low shedders versus eight high-shedding animals (for more details, see Section 4.6). Four modules were correlated with shedding, two of which were annotated for immune functions, and many of the immune genes in these modules were up-regulated 2 days post-inoculation. Some of the genes found by this method were already known to be related to a Salmonella infection, such as SLC11A1, TLR4, CD14 and CCR1. For others, such as SIGLEC5, IGSF6 and TNFSF13B, the association with Salmonella shedding was novel (Kommadath et al. 2014). In a PRRS microarray study comparing four phenotypic groups of animals with extremely different growth rates and viremia levels after a PRRS virus (PRRSv) infection, limited information was obtained through linear modeling of blood gene DE that contrasted pigs with these extreme phenotypes. However, when using WGCNA, an interesting immune-related module was found containing cytokines, chemokines, interferon type I stimulated genes, apoptotic genes and genes regulating complement activation. The eigengene of this module for each pig’s data correlated both with weight gain (WG) after 42 days post-infection with PRRSv and the WUR10000125 (WUR) SNP genotype on Sus scrofa chromosome 4 (SSC4), which explained a large proportion of the genetic variance for viral load and, to a lesser extent, weight gain (Boddicker et al. 2012). The genes in this WGCNA module could be useful targets for further selection for PRRS resistance (Schroyen et al. 2015). For more details, see Section 4.2 entitled “Transcriptomic Analysis of Host Response to PRRSv”.

In a study not directly addressing pig health, but using the pig as a model for human health, Kogelman et al. (2014) applied WGCNA to RNA-seq data from subcutaneous adipose tissue of 36 pigs with different risk levels for obesity. The module that showed the highest correlation with obesity-related traits contained 275 genes. The most significant GO term defining this cluster was “osteoclast differentiation”; osteoclasts are derived from macrophages, an immune cell type highly up-regulated in obese individuals. Other immune-related GO terms enriched in this gene list involved natural killer cells and B cell receptor signaling pathways, shedding light on the association between obesity and immune-related complications (Kogelman et al. 2014).

Partial Correlation and Information Theory (PCIT) (Koesterke et al. 2013, 2014; Reverter and Chan 2008), together with the regulatory impact factor (RIF) and phenotypic impact factor (PIF) algorithms (Reverter et al. 2010), have also been used to examine differences in networks drawn from different biological states. With PCIT, the co-expression correlation between each gene pair in a network is calculated and changes between different phenotypic groups are noted. RIF and PIF algorithms compute differential wiring between nodes for different treatments or groups to identify novel regulators. Using PCIT, Schroyen et al. (2015) found tighter connections to genes in the immune activation pathways in the low weight gain group compared to the high weight gain group after PRRS infection, indicating that one of the most significant differences between these two phenotypic groups was an immune network response. However, when comparing WGCNA and PCIT results, the WGCNA method seems to be more sensitive, since the PCIT algorithm removes edges (gene interaction measures), which can sometimes lead to an underestimation of the importance of a hub gene, and has consequences for biological interpretations (Kadarmideen and Watson-Haigh 2012).
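
Two building blocks of this kind of analysis can be sketched briefly: the first-order partial correlation between two genes given a third, and the change in a gene pair's correlation between two phenotypic groups. This is a simplified illustration only. It does not reproduce the edge-removal rule of the PCIT algorithm or the RIF and PIF formulas, and the function names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between genes x and y after removing the effect of gene z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def correlation_change(x_a, y_a, x_b, y_b):
    """Difference in co-expression of one gene pair between groups A and B."""
    return pearson(x_a, y_a) - pearson(x_b, y_b)


# Hypothetical toy data: genes x and y are both driven by regulator z,
# with small perturbations that are unrelated to each other.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [zi + e for zi, e in zip(z, [0.1, -0.1, 0.1, -0.1, 0.1, -0.1])]
y = [zi + e for zi, e in zip(z, [0.1, 0.1, -0.1, -0.1, 0.1, 0.1])]
raw_r = pearson(x, y)                   # high: both genes follow z
partial_r = partial_correlation(x, y, z)  # near zero once z is accounted for
```

In this toy case the strong raw correlation between x and y disappears once z is conditioned on, which is the situation in which a direct edge between x and y would be considered spurious.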

2.2 Multilevel Data Integration Analyses of Pig Disease Biology Are Sparse

To date, there are only a few examples of integration of multiple -omics datasets in research on pig disease. The most common data integration strategy is the comparison of the transcriptome with the miRNAome, since correlation analysis of mRNA transcriptome and miRNAome data can reveal and explain the control of reciprocal expression patterns of predicted target mRNAs. An example is the negative correlation between expression levels of miRNAs and their predicted target genes in the swine leukocyte antigen (SLA) complex region found when comparing mRNA-seq and miRNA-seq data for liver, longissimus dorsi and abdominal fat (Endale Ahanda et al. 2012). The SLA region was chosen since this region is highly associated with immune response traits in pigs, for instance, in case of infectious diseases or after vaccination (Lunney et al. 2009), and miRNAs can play a crucial role in fine-tuning this immune response. With TargetScan, PACMIT and TargetSpy, several polymorphic miRNA target sites were found and SNPs in these 3′ untranslated regions (3′-UTR) were predicted to lead to altered miRNA regulation patterns (Endale Ahanda et al. 2012).

Bao et al. (2014) examined the buffering capacity of miRNAs in response to a Salmonella infection, i.e. the ability to lower the expression variation of target mRNAs, rather than changing their expression level. A significant buffering capacity was seen in lowly to moderately expressed target mRNAs when compared to non-target mRNAs, but this difference was not seen for highly expressed genes. In response to infection, at 2 days post-infection (dpi) in both up-regulated and down-regulated genes, an additional buffering capacity was noticed for the target mRNAs, which was not the case for the non-target mRNAs. This result was interpreted as indicating that such miRNAs cause the existing transcriptional network to rewire more tightly after infection (Bao et al. 2014). Other examples of miRNA–mRNA comparisons in pig disease studies will follow in the example sections of this chapter.
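
The notion of buffering, i.e. reduced expression variation rather than a shift in expression level, can be illustrated with a small sketch that compares the coefficient of variation (CV) of target and non-target mRNAs across animals. This is an assumed, simplified measure chosen for illustration; it is not the statistic used by Bao et al. (2014), and the gene names and values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation relative to the mean expression across animals."""
    return statistics.stdev(values) / statistics.mean(values)


def median_cv(expr_by_gene, genes):
    """Median CV over a group of genes (e.g. predicted miRNA targets)."""
    cvs = [coefficient_of_variation(expr_by_gene[g]) for g in genes]
    return statistics.median(cvs)


# Hypothetical toy expression values for four animals per gene.
toy_expr = {
    "target_1": [10.0, 10.5, 9.8, 10.2],
    "target_2": [20.0, 19.5, 20.5, 20.1],
    "nontarget_1": [10.0, 14.0, 7.0, 12.0],
    "nontarget_2": [20.0, 28.0, 15.0, 22.0],
}
target_cv = median_cv(toy_expr, ["target_1", "target_2"])
nontarget_cv = median_cv(toy_expr, ["nontarget_1", "nontarget_2"])
```

A lower median CV among targets than among non-targets of similar mean expression would be consistent with buffering; in practice, genes would need to be matched or binned by expression level before such a comparison.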

Another example of multilevel data integration analysis can be seen in the combination of GWAS and transcriptomic data. The first expression quantitative trait loci (eQTL) studies in pig were conducted to examine muscle development, carcass and meat quality traits; however, more and more eQTL studies have focused on pig health (Ernst and Steibel 2013). Ponsuksili et al. (2012) investigated the relation between SNP markers from the PorcineSNP60 BeadChip, gene expression in liver and muscle measured with an Affymetrix porcine genome array and plasma cortisol levels, cortisol being important in regulating immune function. They used the network edge orienting (NEO) R software package to predict causal interactions between the three datasets and found 26 and 70 candidate genes in liver and 2 and 25 candidates in muscle to affect or respond to plasma cortisol levels, respectively (Ponsuksili et al. 2012). Chomwisarutkun et al. (2013) used a custom-designed microarray targeting previously detected QTL regions to find candidate genes for inverted teat defects, as opposed to an earlier study which used a commercially available array. They found a number of DE genes in both epithelium and mesenchyme, almost all belonging to cell signaling pathways and encoding many members of the signaling cascades of growth factors (Chomwisarutkun et al. 2013). Reiner et al. (2014) used an Affymetrix porcine genome array and found 193 cis- and trans-eQTL, including 55 eQTL in a functional hotspot on SSC13, and they identified several candidate genes for a genetic predisposition for susceptibility to Actinobacillus pleuropneumoniae. With the increasing availability of RNA-seq data, it has now become quite easy to assess allele-specific expression in heterozygous individuals. For an example of allele-specific expression in PRRS, we refer to the study done by Koltes et al. (2015) described below (see Section 4.1).
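
One simple way to screen for allele-specific expression at a heterozygous SNP is an exact binomial test of the reference and alternate allele read counts against an expected 50:50 ratio. The sketch below is an assumed illustration of that idea and not the procedure of any study cited here; the function name is hypothetical, and issues such as reference mapping bias and overdispersion are ignored.

```python
import math


def ase_binomial_test(ref_reads, alt_reads, expected_ref_fraction=0.5):
    """Two-sided exact binomial p-value for allelic imbalance at one SNP."""
    n = ref_reads + alt_reads
    p = expected_ref_fraction
    pmf = [math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Sum the probabilities of all outcomes at least as unlikely as the one
    # observed (small tolerance to absorb floating-point ties).
    cutoff = pmf[ref_reads] * (1 + 1e-9)
    return min(1.0, sum(v for v in pmf if v <= cutoff))


# Hypothetical read counts at two heterozygous sites.
balanced_p = ase_binomial_test(10, 10)    # no evidence of imbalance
imbalanced_p = ase_binomial_test(20, 0)   # only one allele observed
```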

3 Visualization Tools Improve Our Ability to Identify and Interpret Complex Relationships

With the growth of -omics data and the complexity of data analyses, data visualization is becoming fundamental for the interpretation of high-dimensional molecular interactions. Tools to visualize GO enrichment analysis results, such as GOrilla (Eden et al. 2009), AmiGO (Carbon et al. 2009), PANTHER (Mi et al. 2013), REVIGO (Supek et al. 2011) and others, are freely available. In addition, there are also a few network expression tools available. One well-known tool to visualize large datasets is Cytoscape (Shannon et al. 2003). Cytoscape is an open source software platform that can easily be customized with plug-ins and shows data as nodes and edges in a network to which multiple levels of annotation can be added and in which genes can be selected or filtered out. Another freely available program is BioLayout Express3D (BE3D), which draws co-expression networks (Freeman et al. 2007). A Pearson’s correlation coefficient threshold decides which genes (nodes) are kept for visualization and a Markov clustering algorithm groups genes with similar expression patterns into clusters. Within BE3D are numerous user-defined variables for displaying these clusters, including the ability to label nodes with any user-inputted variable. For example, it is possible to overlay onto a gene expression-based network a visualization of the correlation of such expression with an external trait such as pathogen level or growth during infection for the pigs in the study.
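
The two steps described above, thresholding pairwise Pearson correlations to define edges and then applying Markov clustering, can be sketched in a few lines. This is a toy illustration of the general technique and not the BE3D implementation: the threshold of 0.85, the inflation value, the fixed number of iterations and all names and data are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_graph(expr, threshold=0.85):
    """Weighted adjacency matrix keeping only edges with r >= threshold."""
    names = sorted(expr)
    n = len(names)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(expr[names[i]], expr[names[j]])
            if r >= threshold:
                adj[i][j] = adj[j][i] = r
    return names, adj


def normalize_columns(m):
    """Rescale every column to sum to one (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def markov_cluster(names, adj, inflation=2.0, n_iter=50, tol=1e-6):
    """Basic Markov clustering: alternate expansion and inflation."""
    n = len(adj)
    # Self-loops keep every column non-empty and stabilise the iteration.
    m = normalize_columns([[adj[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(n_iter):
        # Expansion: one step of random-walk flow (matrix squaring).
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: strengthen strong flows, weaken weak ones.
        m = normalize_columns([[v ** inflation for v in row] for row in m])
    clusters = set()
    for row in m:
        members = frozenset(names[j] for j, v in enumerate(row) if v > tol)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy profiles: three genes with a rising pattern and three
# genes with an alternating pattern across six samples.
toy_expr = {
    "A1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "A2": [2.0, 4.0, 6.0, 8.0, 10.0, 12.5],
    "A3": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],
    "B1": [5.0, 1.0, 5.0, 1.0, 5.0, 1.0],
    "B2": [4.0, 2.0, 4.2, 2.0, 4.0, 2.1],
    "B3": [9.0, 3.0, 8.0, 3.0, 9.0, 2.0],
}
gene_names, adjacency = correlation_graph(toy_expr)
toy_clusters = markov_cluster(gene_names, adjacency)
```

Raising the correlation threshold makes the graph sparser, while a higher inflation value tends to produce more and smaller clusters.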

Kapetanovic et al. (2013) analyzed the expression profiles of pig alveolar macrophages (AM), bone marrow-derived macrophages (BMDM) and monocyte-derived macrophages (MDM) at 0 and 7 h after LPS stimulation. After stimulation, the expression profiles of AM were clearly distinct from those of BMDM and MDM, indicating a different regulation of LPS-stimulated genes in these macrophages. They also used the tool to compare expression patterns after stimulation between human, mouse and pig macrophages and showed clusters of genes with up-regulated expression patterns in human and pig that were not up-regulated in mouse macrophages or vice versa (Kapetanovic et al. 2013). It is even possible to use tissue-specific expression patterns from microarray data from many tissues obtained from healthy pigs to visualize the relationships of immune cells and their expression patterns versus other cell types (Freeman et al. 2012).

In Schroyen et al. (2016), BE3D identified clusters of genes whose expression patterns measured by RNA-seq differed between susceptible and more resistant animals in response to PRRS according to the WUR SNP, which will be described below (see Section 4.1). One cluster of 516 transcripts showed an apparent dissimilarity between the two contrasted groups and could be linked to signaling pathway differences involved in viral entry and replication.

Another example of the successful use of BE3D was described by the immune response annotation group (IRAG) (Dawson et al. 2013). IRAG was able to improve the characterization of the pig immunome by using correlation network analyses of transcriptomic data. In this massive study, genes were clustered according to their expression patterns in blood, macrophages and lymph nodes derived from a multitude of pig stimulation, infection and disease studies. A cluster of 619 probesets, representing at least 511 transcripts, was significantly enriched for human immune-related GO terms. Since only 16% of these genes had been annotated in the pig, evidence was provided for the involvement of over 500 genes in immune responses that had not previously been annotated for function in immune response processes (Dawson et al. 2013).

4 Bloodomics

More and more studies aiming to genetically improve livestock’s robustness involve whole blood to define the immune capacity or immunocompetence of an individual to different stimuli (Mach et al. 2013) and potentially identify predictive biomarkers for resistant or resilient pigs (Huang et al. 2011). The term “bloodomics” encompasses all molecular profiling -omics tools that have been applied to peripheral blood, in which the blood transcriptome plays an influential role (Mohr and Liew 2007). For the immune system, blood is a very relevant tissue, since cells of the immune system circulate between central and peripheral lymphoid organs as well as migrate to and from sites of injury via the blood (Chaussabel et al. 2010). Whereas in 2002, very few blood transcriptomic studies were executed on any animal species, by 2014, a significant number of studies based on the blood transcriptome had been published on several animal species, and in particular for cattle and pigs as livestock species (Chaussabel 2015; Schroyen and Tuggle 2015).

Whole blood studies have several advantages, such as the ease of collection and the possibility of repeated sampling of the same individual during response to a stimulus, which allows accurate within-animal comparison back to the baseline prior to infection. Examining whole blood also facilitates the development of a screening test based on genetic markers, which should be relatively easy to apply on a large scale in a commercial setting given that blood sampling is a common surveillance method in veterinary practice. Genes expressed in peripheral blood cells have been shown to reflect molecular mechanisms underlying differences in production traits, and blood can be an easily accessible source of information when monitoring physiological changes (Jegou et al. 2016). The genetic blood markers could include total and differential white blood cell counts, peripheral blood mononuclear leukocyte subsets and acute phase proteins, specific and non-specific antibodies, cytokines, as well as a set of differentially expressed genes between a healthy and diseased status. In Clapperton et al. (2009) and Flori et al. (2011), sets of porcine immune trait markers that can be used for selection, together with their heritability coefficients, are listed. However, since whole blood comprises a varying number of cell types, gene expression and protein differences from sample to sample should be interpreted with great caution. Gene expression patterns are highly dependent on the composition of the underlying cell population. Knowledge of immune cell-specific expression could help with the investigation of exactly which cells are activated (Abbas et al. 2005). Computational methods such as cell type enrichment analysis (CTEN) (Shoemaker et al. 2012) or the tissue expression module in the annotation tool DAVID, used effectively by Hulst et al. (2013), could give an idea of the cell types dominating the whole blood transcriptome/proteome response.
Including complete blood counts (CBCs) as covariates in statistical analyses can adjust for such differences across replicate blood samples. Furthermore, with such CBC data, the transcriptional response data can be deconvoluted to help identify the unique regulatory control of specific cellular responses to pathogens (Shen-Orr et al. 2010).
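
The principle behind such a deconvolution can be sketched under a simple assumed model: the whole blood expression of a gene in each sample is a sum of cell type-specific expression levels weighted by the cell type proportions from the CBC. Least squares then recovers the cell type-specific levels. The code below illustrates this linear model only and is not the method implementation of Shen-Orr et al. (2010); the function names, cell types and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def cell_type_expression(proportions, mixed_expression):
    """Least-squares estimate of one gene's expression in each cell type.

    proportions: per sample, the fraction of each cell type (from CBC data).
    mixed_expression: per sample, the gene's whole blood expression.
    """
    k = len(proportions[0])
    # Normal equations: (P^T P) beta = P^T y.
    ptp = [[sum(p[i] * p[j] for p in proportions) for j in range(k)]
           for i in range(k)]
    pty = [sum(p[i] * y for p, y in zip(proportions, mixed_expression))
           for i in range(k)]
    return solve_linear(ptp, pty)


# Hypothetical toy data: fractions of two cell types in five blood samples,
# and a gene expressed at 10 units in the first type and 2 in the second.
toy_proportions = [[0.6, 0.4], [0.5, 0.5], [0.3, 0.7], [0.7, 0.3], [0.4, 0.6]]
toy_mixed = [p[0] * 10.0 + p[1] * 2.0 for p in toy_proportions]
estimated = cell_type_expression(toy_proportions, toy_mixed)
```

Fitting this model separately in infected and control animals, and comparing the estimates per cell type, is the general route by which cell type-specific responses can be separated from shifts in blood composition.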

As with systems biology in general, one of the current hurdles with the interpretation of data from blood transcriptomic research is the organization of the data and the integration of different components such as sample information, quality of data, clinical information collected at the time of sampling and results of other cellular and molecular platforms (Chaussabel et al. 2010).

Example 1: Overview of -Omics Studies Concerning Porcine Reproductive and Respiratory Syndrome (PRRS) in Pig

Porcine reproductive and respiratory syndrome (PRRS), also known as mystery swine disease or blue ear disease, emerged in the late 1980s and 1990s and is to date one of the most economically important diseases affecting pigs worldwide (Holtkamp et al. 2013; Zimmerman 2003). The disease is caused by a single-stranded RNA virus belonging to the Arteriviridae family and, as its name reflects, affects two branches of the pig breeding industry. On the one hand, there are severe reproduction losses due to infertility, late-term abortions and mummified and stillborn fetuses. On the other hand, grower-to-finisher pigs suffer from serious pneumonia, which leads to increased pig morbidity and mortality rates (Rossow 1998). Depressed growth rates in subclinical infections are also significant, and to date production costs are estimated at $664 million a year, and that is only for the USA (Holtkamp et al. 2013). It is therefore not surprising that many efforts were made to understand PRRSv and its replicative life cycle, but the host point of view during PRRSv infection is also extensively studied. In this section, we give an overview of the different host-related -omics studies performed (Fig. 1) and, whenever present, the systems biology approaches utilized.

Fig. 1 Overview of -omics studies concerning porcine reproductive and respiratory syndrome (PRRS) in pig. For more details, see “Example 1: Overview of -omics studies concerning porcine reproductive and respiratory syndrome (PRRS) in pig”.

4.1 Linking Host Genomic Variation to Responses to PRRS

The first studies on host genetic variation associated with variation in response to PRRS used a limited set of SNPs. Galina-Pantoja et al. (2006) examined the association of phenotypes with 60 SNPs targeting host genes known to be associated with virus replication and viral entry into cells, as well as genes for receptors, macrophage and other innate immunity functions. They showed that in sows before and after infection with the virus, several of the SNPs tested were found to be associated with reproductive traits such as number of piglets born alive (Galina-Pantoja et al. 2006); these experiments were also summarized by Mellencamp et al. (2008). However, resistance is a complex and polygenic trait with substantial environmental influences; therefore, it is clear that selecting the best DNA marker or the best marker combination is complicated. Markers have to be consistent across datasets and they must have a positive effect on multiple traits and not be favorable for some and detrimental for others. Wimmers et al. (2009) used 88 markers, including 72 microsatellites and 16 biallelic markers, to find loci controlling the immune responsiveness in grower-to-finisher pigs. They screened for quantitative trait loci (QTL) by measuring complement activity, acute phase response and antibody response in animals before and after vaccination against Mycoplasma hyopneumoniae, herpesvirus I and PRRSv. In total, 21 QTLs were detected with a genome-wide significance level of 1%. These QTLs harbor several candidate genes for the traits examined (Wimmers et al. 2009). Uddin et al. (2011) used a panel of 79 microsatellites and 3 biallelic markers to search for immune-related QTLs. As innate immune traits they measured interleukin 2 (IL2), IL10, interferon gamma (IFNG), Toll-like receptor 2 (TLR2) and TLR9 levels in serum before and after vaccination with M. hyopneumoniae, PRRSv or tetanus toxoid (Uddin et al. 2011). 
The five traits were influenced by earlier described and newly found QTL on multiple chromosomes, implying multiple genes involved. Several candidate genes contributing to immune function were proposed for the three different vaccination experiments (Uddin et al. 2011).

Although such analyses do help to discover regions containing QTL of interest, denser marker sets such as the porcine 60 K SNP chip could fine map the underlying genetic basis for these immune responses, provided that substantially larger datasets are available. Serão et al. (2014) used the porcine 60 K SNP chip to perform a GWAS in a sow herd (n = 641) before and after a PRRS outbreak. They found a number of genomic regions strongly correlated with number of stillborn piglets, number and percentage of piglets born dead and sample-to-positive antibody ratios during and/or before PRRS infection. SNPs in these regions were found near genes associated with reproductive performance or immune response (Serão et al. 2014). Boddicker et al. (2012) also used this 60 K SNP chip, but focused on grower-to-finisher pigs and their genomics in relation to PRRSv infection. They found the QTL on SSC4 harboring the WUR SNP marker that has been associated with WG as well as PRRSv viremia levels, as described above (Boddicker et al. 2012). The effect of the SSC4 region and of WUR in particular was successfully validated in additional trials on animals with a different genetic background (Boddicker et al. 2013, 2014). This WUR marker maps close to several members of the guanylate binding protein (GBP) family which are known to be induced by gamma interferon. A transcriptomic approach was used to identify differential expression between pigs with alternate QTL genotypes and potentially elucidate the underlying causal mutation. Koltes et al. (2015) specifically examined the expression of all genes in the region with high linkage disequilibrium to the WUR marker and determined that GBP5 was differentially expressed between WUR genotype groups. Through deeper analysis of the RNA-seq data, they found a putative causal mutation causing differential splice variants of GBP5.

However, although these genomic analyses could lead to SNPs with large effects on phenotypes or even discover causal mutations, and the pig breeding industry could use them for selection towards better performing animals, such analyses often give little or no information about the molecular mechanisms that underlie these differences in phenotypes. In an integration of SNP association data with genome functional annotation, Waide et al. (submitted) performed GO enrichment analyses on sets of genes in close vicinity of SNPs associated with viral load and weight gain. They analyzed gene sets located within 250 kb of SNPs that were associated with these traits (−log10(p-value) > 2.5). Analyses were performed using Panther (Mi et al. 2013) on a total of 13 trials of approximately 200 animals per trial and infected with the KS06 or NVSL PRRSv strain (Waide et al. submitted). For the SNPs associated with viral load, enriched biological processes (BP) terms for the KS06 strain included natural killer cell activation, immune response and B cell-mediated immunity, although the latter was not significantly enriched after Bonferroni correction. For the NVSL strain, enriched BP terms were immune response, metabolic process and lysosomal transport. For the SNPs associated with weight gain, antigen processing and presentation via MHC class II was the most enriched BP GO term for KS06; however, after Bonferroni correction, this term was no longer significant. Hence, it is possible to find groups of genes predicted to have functional differences between pigs with extreme phenotypes while using genomic rather than transcriptomic data. Since there are a large number of GWA studies available, it might be worthwhile to apply this approach to other existing datasets.
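
The general workflow described above (collect genes near associated SNPs, test the resulting gene set for GO term enrichment, correct for multiple testing) can be sketched as follows. The 250 kb window and the −log10(p-value) cut-off of 2.5 are taken from the text; everything else is an assumption for illustration, including the use of a hypergeometric test, the measurement of distance from gene boundaries, and all names and coordinates. It does not reproduce the PANTHER analysis of Waide et al.

```python
import math


def genes_near_snps(snps, genes, window=250_000, min_neglog10_p=2.5):
    """Names of genes lying within `window` bp of any associated SNP.

    snps: tuples of (chromosome, position, -log10 p-value).
    genes: tuples of (name, chromosome, start, end).
    """
    selected = set()
    for chrom, pos, score in snps:
        if score <= min_neglog10_p:
            continue  # SNP not associated strongly enough
        for name, g_chrom, start, end in genes:
            if g_chrom == chrom and start - window <= pos <= end + window:
                selected.add(name)
    return selected


def enrichment_p_value(k, n, term_size, background_size):
    """Hypergeometric P(X >= k): k of n selected genes carry the GO term,
    out of term_size annotated genes among background_size genes."""
    upper = min(n, term_size)
    numerator = sum(math.comb(term_size, i)
                    * math.comb(background_size - term_size, n - i)
                    for i in range(k, upper + 1))
    return numerator / math.comb(background_size, n)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at one."""
    return min(1.0, p_value * n_tests)


# Hypothetical toy annotation and association results.
toy_genes = [
    ("G1", "1", 1_000_000, 1_010_000),
    ("G2", "1", 1_300_000, 1_310_000),
    ("G3", "1", 5_000_000, 5_010_000),
    ("G4", "2", 1_000_000, 1_010_000),
]
toy_snps = [("1", 1_100_000, 3.0), ("1", 5_005_000, 1.0), ("2", 1_000_500, 2.6)]
candidate_genes = genes_near_snps(toy_snps, toy_genes)
# Suppose a GO term annotates G1 and G2 only (2 of the 4 background genes).
toy_p = enrichment_p_value(k=2, n=len(candidate_genes),
                           term_size=2, background_size=4)
toy_p_adjusted = bonferroni(toy_p, n_tests=3)
```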

4.2 Transcriptomic Analysis of Host Response to PRRSv

Without doubt, the majority of research on host response to PRRSv is performed on the transcriptomic level. At the beginning of the twenty-first century, a multitude of microarray studies were performed examining host response to PRRSv, mostly in porcine alveolar macrophages (PAMs) (Genini et al. 2008; Zhou et al. 2011), lung (Bates et al. 2008; Xing et al. 2014), bronchial lymph nodes (Bates et al. 2008) and blood (Schroyen et al. 2015; Wysocki et al. 2012). Some of these studies compared non-infected with infected cells or tissues, while others focused on breed-specific (Ait-Ali et al. 2011; Xing et al. 2014) or within-breed resistance differences after infection (Boddicker et al. 2014). More recently, the first RNA-seq studies on host response to PRRS have been reported (Badaoui et al. 2014; Koltes et al. 2015; Miller et al. 2012; Sang et al. 2014; Schroyen et al. 2016). These RNA-seq studies examined blood, macrophages and tracheobronchial lymph nodes. Differentially expressed genes were often annotated as pro-inflammatory and several signaling pathways linked to the innate immune response surfaced. Overall, it has been shown that the PRRS virus triggers an atypical innate immune response, with less type I interferon α (IFNα) production compared to other viruses (Van Reeth et al. 1999), which leads to a reduced expression of interferon-induced genes and pathways. Better performing animals, which are less affected by viral infection, are believed to trigger their immune system earlier and possibly have a more effective response than the more susceptible animals, as seen by the expression profile differences (Ait-Ali et al. 2011; Schroyen et al. 2015), as well as when comparing cytokine levels in the sera (Souza et al. 2013; Van Reeth et al. 1999).
The earlier described BE3D analysis of all available Affymetrix data on porcine immune response (IR) studies identified a general cluster of genes up-regulated due to different infectious agents (Dawson et al. 2013). This cluster was also up-regulated after a PRRSv infection in both alveolar macrophages and lymph nodes, albeit at a slower pace when comparing to Salmonella spp. infection or stimulation with LPS (Dawson et al. 2013). Using all available porcine IR microarray data, including many array platforms, Badaoui et al. (2013) performed a meta-analysis using the software Pointillist. They compared multiple PRRS microarray studies including many different breeds, tissues and viral strains with many immune response experiments to find PRRS-specific expression responses (Badaoui et al. 2013). Several interferon regulatory transcription factors (IRF1, IRF3, IRF5 and IRF8) were among those found to respond to immune stimulation only in PRRS-specific experiments. In an extension of the WUR-specific transcriptomic analysis by Koltes et al. (2015), Schroyen et al. (2016) looked at the whole transcriptome in order to find differences in pathways between the different genotypes and found pathway differences as a result of the inability of the truncated GBP5 protein in susceptible pigs to restrain viral entry and replication as fast as the intact GBP5 protein in the more resistant pigs.

More recently, Loving et al. (in preparation) performed RNA-seq studies on thymus from non-infected animals and animals infected with different PRRSv strains to investigate thymic atrophy during the infection and how this is reflected in the thymic transcriptome. Thymic samples were collected from four groups of ±5 animals per group (non-infected animals and animals infected with a mild, moderate and severe strain) at 4 and 10 dpi. The number of up-regulated and down-regulated genes between the non-infected and infected animals increased with severity of strain. The transcriptome of the animals infected with the mild or moderate strain showed an inflammatory response at 4 dpi but the infection was resolved by 10 dpi, whereas for the most virulent strain, inflammation was still present at 10 dpi. The most severe PRRSv strain also caused the largest impact on thymic atrophy due to apoptosis, so that the amount and types of cells should be taken into account to fully understand the data. This experiment is therefore a further illustration of the impact of cell counts, as described above for blood transcriptomics.

Since miRNAs play an important role in influencing gene expression levels in a post-transcriptional manner, especially during an immune response (Contreras and Rao 2012), the miRNAome has also been examined with regard to PRRS infection. Several miRNAs are differentially expressed between infected and non-infected animals (Hicks et al. 2013), and there are responses unique to different PRRSv strains (Cong et al. 2014) or to different pig breeds (Li et al. 2015a). Interestingly, in two studies published in 2015, several miRNAs previously identified as influencing innate immunity or having antiviral functions were tested for their ability to reduce PRRSv in infected alveolar macrophages or MARC-145 cells. Jia et al. (2015) transfected MARC-145 cells with 10 miRNAs and, 24 h after transfection, infected them with PRRSv at a multiplicity of infection (MOI) of 0.1. Compared with the other miRNAs, a fivefold reduction of the viral titer was shown at 72 hours post-inoculation (hpi) when the cells were transfected with miR-26a. PRRSv also induced miR-26a expression in a dose-dependent manner. Li et al. (2015b) looked at 15 miRNAs in both alveolar macrophages and MARC-145 cells and found similar results at an MOI of 0.01, with a 25% and 50% reduction of viral titer at 24 and 48 hpi, respectively, when cells were transfected with miR-26a. Both groups used a luciferase reporter analysis to show that the overexpression of miR-26a affects PRRSv infection not by attacking the PRRSv genome directly but by up-regulating the innate antiviral response and activating type I interferon and interferon-induced genes (Jia et al. 2015; Li et al. 2015b).

4.3 Initial Proteomics Approaches to Understanding Host Response to PRRSv

Using iTRAQ labeling, Lu et al. (2012) examined the proteome in PAMs during PRRSv infection. A total of 160 proteins were differentially expressed between uninfected and infected samples for at least one time point from 12 up to 48 h post-inoculation of the cells with the virus (Lu et al. 2012). Among them were proteins involved in cytoskeleton networks and cell–cell communication, which is not surprising since viruses can hijack or interact with the host cytoskeletal transport machinery (Dohner and Sodeik 2005). This result was recently confirmed (and extended), as an RNA-seq analysis of blood also saw differences in network connections of genes involved in cytoskeleton rearrangement between susceptible and more resistant pigs (Schroyen et al. 2016). Other DE proteins were involved in the oxidation-reduction system or were RNA-binding or heat shock proteins, as was also reported in other proteomics studies performed on PAMs or lungs after PRRSv infection (Lu et al. 2012; Xiao et al. 2010; Zhang et al. 2009). However, the question remains how specifically these proteins are up-regulated by the PRRS virus, in contrast with the response to other viruses.

In order to find biomarker proteins in serum to detect early-onset PRRSv infection, Genini et al. (2012) used surface-enhanced laser desorption ionization time of flight mass spectrometry (SELDI-TOF MS). On the day of serum collection, no clinical signs were noted, and none of the piglets were treated. Genini et al. (2012) were able to find a set of 14 discriminatory proteins that could assign pigs to PRRSv-negative and PRRSv-positive groups with high accuracy. They used a dataset of 50 piglet serum samples (from 25 PRRS-positive and 25 PRRS-negative pigs) to discover these proteins and validated this set in an additional 70 serum samples from 35 PRRS-positive and 35 PRRS-negative pigs (Genini et al. 2012). We compared these 14 proteins with mRNA information from transcriptomic studies examining host response to PRRSv. Some of these proteins could be linked directly to DE or differentially wired (DW) genes, while others belonged to families of genes that were DE in those studies. One of the 14 proteins was the S100 calcium-binding protein A10 (S100A10), and Miller et al. (2012) identified three family members (S100A8, S100A9 and S100A12) among the top 10 up-regulated genes after PRRSv infection. This DE occurred at the mRNA level in tracheobronchial lymph nodes when animals infected with PRRSv were compared to non-infected animals (Miller et al. 2012). Other interesting proteins among those 14 biomarkers were proteasome activator family member 28 beta, ubiquitin and vacuolar protein sorting 29 (vps29). Interestingly, in Schroyen et al. (2016), proteasome activator family member 28 beta (PSME2) and ubiquitin protein ligase E3A (UBE3A) were DW between susceptible and more resistant animals. Furthermore, VPS41 had a high phenotypic impact factor, which meant that it was DE between the susceptible and more resistant animals and at the same time highly expressed (Schroyen et al. 2016).

Recently, Luo et al. (2014) were the first to examine the PRRSv host response phosphoproteome. In this large-scale study of protein phosphorylation levels in PAMs, they used a TiO2-based enrichment method combined with liquid chromatography tandem mass spectrometry (LC-MS/MS). The phosphorylation level of over 200 proteins was altered at both 12 and 36 h post-infection (Luo et al. 2014). Pathway analysis revealed that several signal transduction pathways, such as the MAPK, NF-κB and PI3K-Akt signaling pathways, were significantly activated after infection. It has been reported that the PI3K-Akt signal transduction pathway is involved in PRRSv entry (Ni et al. 2015; Zhu et al. 2013).

4.4 Mathematical Models to Help in the Integration of PRRS Data

A systems biology approach to understand the host response to PRRS would integrate these genomic, transcriptomic and proteomic results. Alternatively, mathematical host–pathogen interaction models could integrate these diverse empirical findings and contribute to enhancing our understanding of the immune responses even further (Doeschl-Wilson 2011). A useful example of mathematical modeling of host–PRRS interactions has been provided by Doeschl-Wilson and Galina-Pantoja (2010). Such modeling approaches start off as basic host–pathogen models describing the interaction between virus and host macrophages without host immune response, and increase complexity gradually by adding innate, humoral and cellular immune responses (Doeschl-Wilson and Galina-Pantoja 2010). Besides giving better insights, such models can also point towards missing system components and open up to further experimental investigations.
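The simplest tier of such a model, virus interacting with host macrophages in the absence of any immune response, can be sketched as a generic target-cell-limited system of three equations: dT/dt = -βTV, dI/dt = βTV - δI and dV/dt = pI - cV, where T, I and V are susceptible macrophages, infected macrophages and free virus. This is only an illustration of the model class. It is the standard textbook form and not the published model of Doeschl-Wilson and Galina-Pantoja (2010), and every parameter name and value below is a hypothetical choice made for the example.

```python
def simulate_basic_infection(beta=1e-6, delta=0.5, p=50.0, c=3.0,
                             target0=1e6, virus0=10.0, days=30, dt=0.001):
    """Integrate a generic virus-macrophage model with forward Euler steps.

    All parameter values are hypothetical and chosen only for illustration:
      beta  : infection rate of susceptible macrophages (per virion per day)
      delta : death rate of infected macrophages (per day)
      p     : virus production rate per infected macrophage (per day)
      c     : clearance rate of free virus (per day)
    Returns a list of (day, susceptible, infected, virus) tuples, one per day.
    """
    steps_per_day = int(round(1.0 / dt))
    target, infected, virus = target0, 0.0, virus0
    daily = [(0, target, infected, virus)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * target * virus
            d_target = -new_infections
            d_infected = new_infections - delta * infected
            d_virus = p * infected - c * virus
            target += d_target * dt
            infected += d_infected * dt
            virus += d_virus * dt
        daily.append((day, target, infected, virus))
    return daily


if __name__ == "__main__":
    for day, target, infected, virus in simulate_basic_infection()[::5]:
        print(f"day {day:2d}: T={target:12.1f} I={infected:12.1f} V={virus:14.1f}")
```

Innate, humoral and cellular responses would enter such a sketch as additional state variables and terms, which is the stepwise increase in complexity described above.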

Doeschl-Wilson et al. (2012) applied dynamical systems theory to individual pigs after a PRRSv infection. They could distinguish nine different performance-versus-pathogen-burden trajectories in pigs infected with the same dose of PRRSv, and they propose to use these trajectories as reliable categorical tolerance phenotypes in subsequent genetic studies (Doeschl-Wilson et al. 2012). While inspecting the viremia patterns in the blood over a time period from 0 dpi to 42 dpi, another categorical distinction emerged: cleared, persistent and rebound phenotypes. Islam et al. (2013) used Wood’s curves to fit these blood viremia patterns and linked the analysis of neutralizing antibody (nAb) to these patterns. In the pigs that were classified as cleared, a narrow nAb response was noted, showing an efficient immune response by which the virus used in the infectious dose is rapidly cleared. Pigs that were persistently viremic over the 42-day period displayed a broad nAb spectrum, indicating a less efficient antibody response to the original strain as well as potentially a more diverse response due to new viral quasi-species that arise from within the inoculum via selection pressure from the host immune response. It would be of great interest to link these different types of immune response to transcriptomic and/or proteomic data and identify markers for successful adaptive immunity to PRRSv.
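As an illustration of the curve-fitting step, Wood's function has the form y(t) = a·t^b·e^(-ct), which rises to a peak at t = b/c and then declines. Taking logarithms gives ln y = ln a + b·ln t - c·t, which is linear in the three parameters and can be fitted by ordinary least squares. The sketch below uses that log-linear route as an assumption made for simplicity; the fitting procedure actually used by Islam et al. (2013) is not described here, and the time points and parameter values are synthetic.

```python
import math


def wood(t, a, b, c):
    """Wood's function: rises as t**b and decays as exp(-c*t)."""
    return a * t ** b * math.exp(-c * t)


def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_wood(times, values):
    """Least-squares fit of ln y = ln a + b*ln t - c*t (needs t > 0, y > 0)."""
    rows = [(1.0, math.log(t), -t) for t in times]
    targets = [math.log(v) for v in values]
    # Normal equations (X'X) beta = X'y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * z for r, z in zip(rows, targets)) for i in range(3)]
    ln_a, b, c = solve_linear(xtx, xty)
    return math.exp(ln_a), b, c


# Synthetic sampling days and viremia values, for illustration only.
sample_days = [4.0, 7.0, 11.0, 14.0, 21.0, 28.0, 35.0, 42.0]
synthetic_viremia = [wood(t, 50.0, 1.5, 0.2) for t in sample_days]

if __name__ == "__main__":
    a, b, c = fit_wood(sample_days, synthetic_viremia)
    print(f"a={a:.3f} b={b:.3f} c={c:.3f} peak at day {b / c:.2f}")
```

A single Wood's curve describes one rise and one decline, so a rebound profile would need an extended form or departures from the fitted curve to be detected.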

4.5 Systems Biology on PRRS

In several of the studies described above, some form of “systems-wide analysis” was utilized. When RNA-seq is performed to examine expression differences between animals with different genotypes for an SNP marker related to viral load and weight gain, transcriptomics meets genomics (e.g. Koltes et al. 2015; Schroyen et al. 2016). When the gene ontology analysis of genes in the vicinity of genetic markers associated with response traits elucidates differentially expressed pathways between susceptible and more resistant animals, genomics meets transcriptomics (e.g. Waide et al. submitted). When the genes encoding proteins found with proteomics are also identified by RNA-seq analyses, or when altered phosphorylation levels are found in proteins of a specific pathway whose genes are up-regulated or down-regulated in microarray or RNA-seq experiments, proteomics meets transcriptomics (Genini et al. 2012; Lu et al. 2012; Luo et al. 2014; Miller et al. 2012; Schroyen et al. 2016).

To integrate the data from our whole blood microarray experiment described in Schroyen et al. (2015) with knowledge on protein interaction data, we re-analyzed the genes found in the immune-related module and performed a protein–protein interaction (PPI) analysis on these genes using NetworkAnalyst (Xia et al. 2014). Annotating the genes in this module first showed that the cluster is enriched for interesting annotations, including cytokines, chemokines, interferon type I stimulated genes, apoptotic genes and genes involved in complement pathways. Because all genes were allocated to the same co-expression module, their mRNA expression pattern from animal to animal was similar. By using NetworkAnalyst, knowledge about existing (human) protein–protein interactions is added on top of the mRNA information. We determined the largest zero-order interaction network between proteins encoded by the 506 genes in the immune-related module and found a set of 33 proteins, of which the topology is shown in Fig. 2a. In Schroyen et al. (2015), components of this protein network were identified, namely the inflammasome gene NLRP3, which is known to activate CASP1 and in turn leads to the activation of IL1B and IL18. However, with the PPI analysis, other connections become clear. For instance, TXNIP was found DE in PRRSv-infected lungs and bronchial lymph nodes (Bates et al. 2008) and its protein interacts with the NLRP3 protein. The pathogen-recognition RIG1 receptor or DDX58 interacts with CASP1, which in turn is linked to the interferon-stimulated IFIT3. The anti-apoptosis BCL2 family member MCL1 is linked to CASP1 through CASP3. To further explore this PPI network, the genes in the network that exhibit up-regulation or down-regulation at 4 dpi compared to 0 dpi are shown in Fig. 2b. Because the animals in this microarray experiment had been genotyped for the WUR SNP described earlier by Boddicker et al. (2012) as a marker for susceptibility, the immune-related module was also correlated with WUR genotype. Looking at this reduced protein network, the substantial differences in expression pattern between more and less susceptible animals are very clear (Fig. 3a, b). These multi-omics analyses can help us better understand biological processes such as immune responses, and they can be used to confirm or reject hypotheses made after performing a single-omics study. In any case, more information can be gained, often at low cost. In the PPI example, the introduction of a protein network on top of transcriptomic data revealed a distinct small subset of 33 correlated genes that was evidently different between animals of different WUR genotypes and was not visible when looking at the microarray dataset alone.
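The network step can be sketched in a few lines. The sketch assumes that a zero-order network keeps only interactions in which both partners are seed proteins (here, proteins encoded by module genes) and that the reported network is the largest connected component of that graph. It is not NetworkAnalyst's code. The edges among NLRP3, CASP1, IL1B, IL18, TXNIP, DDX58, IFIT3, CASP3 and MCL1 follow the connections named in the text, while SEED_X, SEED_Y and NONSEED_Z are hypothetical placeholders added to show what gets filtered out.

```python
from collections import defaultdict, deque


def zero_order_network(interactions, seeds):
    """Keep only interactions whose two partners are both seed proteins."""
    seeds = set(seeds)
    adjacency = defaultdict(set)
    for a, b in interactions:
        if a in seeds and b in seeds and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def largest_component(adjacency):
    """Return the node set of the largest connected component (BFS)."""
    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Edges named in the text, plus hypothetical placeholders for illustration.
example_interactions = [
    ("NLRP3", "CASP1"), ("CASP1", "IL1B"), ("CASP1", "IL18"),
    ("TXNIP", "NLRP3"), ("DDX58", "CASP1"), ("CASP1", "IFIT3"),
    ("MCL1", "CASP3"), ("CASP3", "CASP1"),
    ("SEED_X", "SEED_Y"),      # hypothetical small separate component
    ("CASP1", "NONSEED_Z"),    # hypothetical partner outside the module
]
example_seeds = {"NLRP3", "CASP1", "IL1B", "IL18", "TXNIP", "DDX58",
                 "IFIT3", "MCL1", "CASP3", "SEED_X", "SEED_Y"}

network = zero_order_network(example_interactions, example_seeds)
core = largest_component(network)
# Node degree is the "connectivity" used to colour nodes in a plot like Fig. 2a.
degree = {protein: len(network[protein]) for protein in core}

if __name__ == "__main__":
    for protein in sorted(core, key=lambda name: -degree[name]):
        print(protein, degree[protein])
```

Expression values per node, such as the 4 dpi versus 0 dpi change or the per-genotype averages, can then be attached to `core` as a separate mapping for colouring.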

Fig. 2

NetworkAnalyst protein–protein interactions (PPI) on immune-related module found in the whole blood PRRS microarray study of Schroyen et al. 2015. (a) Nodes are colored according to connectivity; more red means more connections. (b) Nodes are colored according to up-regulation (red) and down-regulation (green) of genes at 4 dpi compared to 0 dpi after PRRS infection

Fig. 3

NetworkAnalyst protein–protein interactions (PPI) on immune-related module found in the whole blood PRRS microarray study of Schroyen et al. 2015, split between animals with different genotypes at the WUR marker, which predicts susceptibility to PRRS (Boddicker et al. 2012). (a) Nodes are colored according to positive average expression (orange) and negative average expression (blue) after LIMMA normalization of microarray data of genes (Schroyen et al. 2015) in more susceptible animals at 4 dpi. (b) Nodes are colored according to positive average expression (orange) and negative average expression (blue) after normalization of genes in less susceptible animals at 4 dpi

Example 2: Systems Biology in Salmonella Studies in Pig

Another important pathogen in the swine industry is Salmonella. It is a foodborne pathogen that is hazardous to humans, causing severe gastroenteritis and deaths worldwide. In the USA alone, costs for human salmonellosis are estimated at more than $2.4 billion annually. Human salmonellosis can often be linked to an animal source such as poultry, eggs, pork, beef and dairy cattle (Callaway et al. 2008). Besides affecting human health, Salmonella spp. also infect and/or multiply in almost all known vertebrates, from reptiles to birds and mammals (Edwards et al. 2002), and clinical and subclinical salmonellosis in pigs has been estimated to contribute to substantial economic losses to the swine industry (Haley et al. 2012).

4.6 Network-Based Analysis of Salmonella in Pigs

Probably the two most examined Salmonella serovars concerning pig gene expression regulation are S. enterica serovar Typhimurium (ST) and S. enterica serovar Choleraesuis (SC). ST causes enterocolitis in a wide variety of vertebrates, while SC is host-adapted and predominantly affects swine (Edwards et al. 2002). In pigs, SC was the most common serovar from 1986 to 1995, but in the mid-1990s, it was replaced by ST (Foley et al. 2008). Recently, several transcriptomic studies were performed to determine differences in whole blood causing variation in outcome between low (LS) and persistently shedding (PS) pigs after inoculation with ST (Huang et al. 2011; Knetter et al. 2015; Uthe et al. 2009, 2011). In order to find biomarkers that could distinguish between LS and PS animals before infection, Kommadath et al. (2014) performed a network-based analysis. Using recently acquired RNA-seq data of blood from ST-infected pigs and WGCNA, they found day 0 modules that contained genes annotated for innate defense against bacteria (or Salmonella in particular) and that had distinct expression patterns in LS versus PS animals, with mean expression levels higher in the LS than in the PS animals. Examining the connectivity of the genes revealed that connections to hub genes within these modules were significantly stronger in LS than in PS animals. This could indicate a more tightly regulated transcriptional response of the genes in these modules in the LS animals (Kommadath et al. 2014), and it supports the hypothesis that LS animals are better prepared for an infection and quicker to respond.
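The connectivity comparison can be illustrated with the unsigned WGCNA-style definition, in which the connection strength between two genes is the absolute Pearson correlation of their expression profiles raised to a soft-threshold power, and a gene's connectivity is the sum of its connection strengths to the other genes in the module. The sketch below is a pure-Python stand-in and not the WGCNA package itself. The power of 6, the gene names and the expression values for the two groups are all hypothetical, and the settings used by Kommadath et al. (2014) are not stated here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def connectivity(expression, power=6):
    """Sum of |correlation|**power between each gene and all other genes.

    expression maps gene name -> list of values, one value per animal.
    """
    genes = list(expression)
    return {
        gene: sum(abs(pearson(expression[gene], expression[other])) ** power
                  for other in genes if other != gene)
        for gene in genes
    }


# Hypothetical module expression in five animals per group.
ls_expression = {          # tightly co-expressed (low shedders)
    "hub": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_a": [2.0, 4.0, 6.0, 8.0, 10.0],
    "gene_b": [1.1, 2.0, 2.9, 4.2, 5.0],
}
ps_expression = {          # loosely co-expressed (persistent shedders)
    "hub": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_a": [2.0, 1.0, 4.0, 3.0, 5.0],
    "gene_b": [5.0, 3.0, 4.0, 1.0, 2.0],
}

if __name__ == "__main__":
    print("hub connectivity, LS:", round(connectivity(ls_expression)["hub"], 3))
    print("hub connectivity, PS:", round(connectivity(ps_expression)["hub"], 3))
```

Raising the correlation to a power above 1 shrinks weak correlations towards zero, so a hub whose partners are only moderately correlated with it ends up with a much lower connectivity than one whose partners track it closely.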

miRNA-seq was performed on whole blood samples of the same set of LS and PS animals, and together with the mRNA-seq data used by Kommadath et al. (2014), a potential involvement of miRNAs was examined (Bao et al. 2015). In both LS and PS pigs, miR-214 and miR-331-3p were associated with ST infection. Targets for miR-214 were predicted to be SLC11A1 and LILR-like. The mRNA expression of these two genes increased at 2 dpi, while the expression of miR-214 decreased. Both these genes are involved in immune response, but no role for miRNAs in controlling them has yet been described. VAV2 plays a role in the entry process of several pathogenic microbes. It is a target gene for miR-331-3p and had a lower expression after infection, which could be the result of an observed increase in miR-331-3p expression. Results were of a similar magnitude in both LS and PS animals. For comparisons between LS and PS, no miRNAs were DE at 0 dpi, and only three were DE at 2 dpi. Bao et al. (2014), as described earlier, reported a more tightly rewired network after Salmonella infection, and it would be interesting to look at DW between LS and PS animals of target mRNAs at 0 dpi.

4.7 Salmonella and the Microbiome

The pig microbiome has been the subject of many immune-related studies, and gut microbiota are widely recognized to play a crucial role in animal health and well-being (Kim and Isaacson 2015). Bearson et al. (2013) compared the microbiome in non-infected (NI), LS and PS animals at 0, 2, 7 and 21 dpi. At 0 dpi, significant differences in microbial community structure were seen between LS and PS animals; however, neither group was significantly different from the NI group. At 2 and 7 dpi, there was no difference in the microbiome between the LS group and the NI animals, but a clear difference was shown between PS and the other two groups of animals. At 21 dpi, these differences between LS and PS groups were gone; however, microbiota profiles for both LS and PS were significantly different from the NI group at 21 dpi, suggesting a Salmonella-induced alteration in microbiota regardless of shedding status (Bearson et al. 2013).

With regard to screening for biomarkers for resistance/tolerance versus susceptibility before infection, DNA sequence analysis of day 0 microbiota samples in this study revealed an enriched presence of Ruminococcaceae in the LS animals (Bearson et al. 2013). This positive effect of Ruminococcaceae on resistance/tolerance is described in several studies focusing on intestinal microbiota compositions with regard to diarrhea, whether caused by Salmonella spp. or not (Pop et al. 2014; Suchodolski et al. 2012; Videnska et al. 2013). Members of this microbial family produce short-chain fatty acids (SCFA), with acetate, butyrate and propionate being the major SCFA produced in the colon. Specifically, butyrate can be influential in gut health due to its anti-inflammatory properties and its capacity to strengthen the colonic barrier and reduce the intestinal epithelial permeability (Hamer et al. 2008). In a study on gut microbiota in children with eczema, a negative association was reported between Ruminococcaceae and TLR2-induced IL6 and TNFα levels (West et al. 2015). Earlier, Huang et al. (2011) found that TNFα RNA in blood was elevated at 2 dpi after ST infection only in PS pigs. One interpretation of these results is that PS animals, with less Ruminococcaceae in their intestine compared to LS animals, do elevate the TNFα pathway, whereas in LS animals this is not the case. Certainly, more research is required to ascertain the generality of these proposed relationships.

5 Current Challenges and Future Directions

In the pig, there are only a handful of examples of studies approaching a systems biology analysis described thus far, but the merit of such research is becoming more and more apparent. Immunology is a highly relevant research domain for a systems-level approach because of the multitude of tissues, cells, proteins or genes interacting with one another when facing a disease challenge, with such interactions occurring at multiple scales of time. Currently, data created and analyzed by different labs and different experiments are hard to integrate in a powerful way due to different breeds used, different time points examined, and different protocols followed. To make a systems biology approach easier, consortia led by a complementary set of laboratories or institutions are being established (Benoist et al. 2012). Genetics research is far more active in consortium science, since it is easier to identify, map or sequence genes by several groups than it is to examine a complex immunological research question (Benoist et al. 2012). For pig, the PiGMaP consortium (Archibald et al. 1995) and the Swine Genome Sequencing Consortium (Schook et al. 2005) were the first consortia established. For pig diseases, and specifically to examine PRRS virus infections in pigs, the PRRS Host Genetics Consortium (PHGC) was founded (Lunney et al. 2011). Some of the research described above is part of this consortium (Boddicker et al. 2012, 2013, 2014; Koltes et al. 2015; Schroyen et al. 2015, 2016; Waide et al. submitted), and the genetic and immunological insights gained strongly demonstrate the value of collaborative efforts that increase the power of such challenge experiments.

A substantial advantage of these consortia is that variation is reduced by shared and standardized protocols and procedures, as is described for the Human Encyclopedia of DNA Elements (ENCODE) project (ENCODE project consortium 2011). By using standards, data quality is assured, data utility can be extended, and data comparison, and thus the establishment of a systems biology approach, becomes easier. The ENCODE project has been expanded from humans to classical model species, and recently the Functional Annotation of Animal Genomes (FAANG) consortium for domesticated animal species was launched (The FAANG consortium et al. 2015). As a start, this consortium will focus on chicken, pig, cattle, horse, goat and sheep, species with a high-quality reference genome and often a plenitude of (ancestor’s) phenotypic data already stored (The FAANG consortium et al. 2015). Cells and tissues relevant to pig health, including blood cells and liver, are being collected from healthy pigs in the FAANG project (www.faang.org). In addition, several groups have pathogen challenge projects that will provide data relevant to a deeper understanding of the porcine immune response and the parts of the genome that are responsible for these responses. Thus, the FAANG project will accelerate our ability to apply systems biology tools to improving pig health in the future.