Keywords

19.1 Introduction

In the present scenario, the major question is how to address the cause of crop yield and field stock infection, which is impacting the economy worldwide. Recent studies have shown that the amount of these infections may increase even more due to global warming. Several new variants of microorganisms, including viral, bacterial, and fungal pathogens, can find novel hosts and ecologic niches. Also by systems perspective, lack of understanding of the complex mechanism by which these pathogens evade the host defense machinery and adapt according to their lifestyle needs is evident. Hence, there is an absolute necessity to study the relationship between the host and pathogen in order develop suitable chemicals to reduce pathogenicity (Aderem et al. 2011). Over the past few decades, the advancement in technology has developed strategies for investigating the host-pathogen interaction on the scale of molecular levels by adapting various computational and analytical tools. With the outbreak of genome sequencing, various databases are present to show strains and variants of pathogens sequenced to date. At the same time, availability of vast data on population-level genetic variation for plant hosts offers a huge potential for the study of host-pathogen interaction.

Further to gain the insight into the pathogen virulence and how these pathogens rewire the cellular transcription and dynamics of protein networking of host systems (McDermott et al. 2011), several molecular tools, such as deep sequencing, high-throughput proteomics, and sophisticated interactome analysis, have been used (Peng et al. 2010; Niemann et al. 2011; de Chassey et al. 2008; Shapira et al. 2009; Mukhtar et al. 2011; Das and Kalpana 2009). During the course of evolution, the pathogens have developed a strong selection for the defense mechanism exerted by the host system and consequently adapt to their environment. It is very difficult to extract data through experimental observation of the host-pathogen relationship (Shi et al. 2006; Eriksson et al. 2003). In order to develop improved therapeutic agents, knowledge related to these interactions is essential. Previously most of the treatments, such as vaccines, antibiotics, and antivirals, were designed by exploiting the structural and molecular differences between the host and pathogen. However, most of the pathogens have developed resistance to antibiotics, which is again a major issue. Hence, periodic development of novel methodology based on the study of these pathogens to develop novel therapies is of utmost importance. The schematic of the PHI modeling system is depicted in Fig. 19.1.

Fig. 19.1
figure 1

Schematic modeling system for pathogen-host interaction (PHI)

19.2 Systems Biology as a Tool

The deeper understanding of the complex biological systems is very crucial in predicting the pathogen-host interactions (PHIs) (Durmuş et al. 2016). Systems biology helps to assemble a framework for models of biological systems for systematic measurements. It is an interdisciplinary field in life sciences integrating engineering, mathematical, bioengineering, medical, and computational disciplines to understand the nonlinear behavior in biological systems (Kitano 2002; Durmuş et al. 2015). Previously, reductionist approaches were used to understand the biological systems which consider only fewer molecules of interaction, whereas systems biology uses holistic approaches based on omics data, which gives the overall view of the interactions between protein, nucleotide sequences, ligands, and metabolites in PHIs. Further, noncoding RNAs and small molecules play a crucial role in understanding virus-host interactions and bacterial-host interactions (Durmuş et al. 2015; Raja et al. 2017; Likić et al. 2010).

It is very important to understand the biochemical networks of the system (viz., gene regulatory network, protein-protein interaction network, and metabolic network), which helps in deciphering the systems studies on biochemical subnetworks or cross-networks. Integrating the information from various biological levels provides complex and unanticipated global behavior of PHIs (Durmuş et al. 2015, 2016). The biochemical networks give the idea of how each component in the system behaves in the spatial and temporal ways and also how precisely the controls are excreted on them. The metabolomics approach makes it possible to precisely measure the metabolite concentration, whereas the transcriptomics and proteomics approaches provide the quantitative data of mRNA and protein levels, respectively (Karahalil 2016). Experimental approaches to assess in vivo reaction rates (fluxes) are again important parameters and are well developed to ascertain metabolic networks. The metabolic flux helps in determining the genotype-phenotype relationship (Antoniewicz 2015; Chen and Shachar-Hill 2012; Deidda et al. 2015). The omics data collected from infected cells and pathogens will be subjected to bioinformatics analysis to construct an infection-specific gene regulatory, metabolic, and protein-protein networks. The analysis of PHI omics data using computational systems biology tool unravels the infection mechanism, dynamics, and potential drug targets for the prevention of infections. Recently, web-based databases are available to accommodate the increasing data generated in PHI experiments, and also, they provide pathogen-host interactome data, which helps in focusing on specific pathogen or host system. Also novel text mining methods, which help in PHI data retrieval, are required (Durmuş et al. 2015).

19.3 Omics Technology: To Understand the Relationship between Host-Protein Interaction

During the 1920s, a botanist named Hans Winkler introduced a word genome by merging the words “GENe” and “chromosOME.” It is known that omics involves a mass or a large number of measurements per end point. Today, more than 1000 omics fields are available for describing the properties of lipids, nutrients, etc. (Karahalil 2016; Antoniewicz 2015; Chen and Shachar-Hill 2012; Deidda et al. 2015). The generation of omics data through the application of high-throughput techniques and the data management and analysis via computational biology and mathematical modeling has brought the major revolution in the field of infection biology. A deeper insight of host immune response during infectious conditions gives an idea for the development of diagnostics, therapeutics, and vaccines. Also the systems biology of the infection led to the development of personalized medicines and novel therapeutic targets. The integrative personal omics profile (iPOP) combines genomics, transcriptomics, proteomics, metabolomics, and autoantibody profiles from a single individual over a 14-month period (Sarker et al. 2013; Chen et al. 2012).

19.4 Genomics and Transcriptomics Data for PHI

In genomics, the analysis of the nucleotide sequences, genome structure, and nucleotide composition will be carried out. Further this analysis helps in understanding the genetic variation among the individual and thereby providing the structure and functional relationship, their variants and diseases or response to therapy. Understanding the genetic variations helps in elucidating the genetic basis of diseases using genome-wide association study (GWAS) associated with genome linkage analysis and case-control studies with individual gene. To obtain the insight of this genetic information known as central dogma (DNA-mRNA-proteins), high-throughput techniques, such as microarray and next-generation sequencing (NGS), are being used. Further whole-genome sequencing helps to identify the type of pathogen and its nature of virulence, antibiotic resistance, and diagnosis and the development of new vaccines. A plethora of the literatures published show the relationship between gene polymorphism and disease susceptibility. Single-nucleotide polymorphism (SNP) can be used as an important tool for the identification and characterization of pathogen variants and disease susceptibility in plants and humans (McCourt et al. 2013; Yağar et al. 2011; Karahalil et al. 2011; Mardan-Nik et al. 2016). Over the past few decades, with the development of NGS, a large amount of genomic sequencing data are available in public databases. These sequencing technologies are capable of handling huge genome dataset in a timely and cost-effective manner. The phylogenetic studies based on whole-genome sequencing have helped in understanding the evolution of the PHIs and the possible prevention of infectious diseases. Metagenomics and metatranscriptomics of pathogens revealed how pathogenic microorganisms adapt to hosts, e.g., plants (Guttman et al. 2014).

The systematic whole-genome sequencing procedure of PHI is shown in Fig. 19.2. Whereas on the other hand, to get more insights into the evolution of pathogen, molecular pathogenesis and host specificity by using comparative genomics. Further NGS gives the molecular insight for diverse pathogens on genomic and transcriptomic levels (Fig. 19.3). Usually genomics is based on static data, whereas transcriptomics gives a dynamic profile of gene expressions with time. The genotype and expression phenotype can be linked through the through mRNAs match with particular genes in the genome (Karahalil 2016). The functionality differences between tissues and cells, interaction between genes, gene regulation and regulatory sequences, and identification of diseased states can be provided using RNA profiling (Durmuş et al. 2015). Some of the genomics and transcriptomics tools are provided in Table 19.1.

Fig. 19.2
figure 2

Systematic whole-genome sequences procedure of PHI

Fig. 19.3
figure 3

Overview of next-generation sequencing technology used for sequencing PHI data

Table 19.1 Techniques used for genomics, transcriptomics, proteomics, and metabolomics and their applications

19.5 Proteomics and Metabolomics

The actual information related to metabolic and enzymatic processes can be obtained through a comprehensive study of the proteins. The characteristics of proteins and protein-protein interaction rapidly change cell proliferation and migration. Further characters, such as posttranslational modification, help to understand the dynamic proteome analysis (Wright et al. 2012; Larance and Lamond 2015). The protein structures and functional studies play a crucial role in PHIs as they can elucidate the role of the pathogens in eliciting the innate and adaptive immune responses. Pathogen-associated molecular patterns (PAMPs) are molecules or small molecular motifs within a group of pathogens (e.g., the protein flagellin, lipopeptides, lipopolysaccharide (LPS)) that are recognized by proteins, the so-called pattern recognition receptors (PRRs), such as Toll-like receptors (TLRs (Qian and Cao 2013)). In many cases, the signal transduction is stimulated by PRRs via different pathways, for example, JAK-STAT pathway, interferon gamma (IFNγ)-receptor pathway, and tumor necrosis factor-alpha (TNFα) signaling. During viral and microbial infections, the type II cytokines (IFN-γ) play a key role in innate and adaptive immunity (Prabhu et al. 2016, 2017, 2018). Transcription factor NF-κB also activated by various intra- and extracellular stimuli, such as bacterial or viral products, e.g., the TLRs signaling, and induces the expression of pro-inflammatory cytokines (interleukins, TNFα, Type I interferons) (Chen et al. 2012).

Utilizing bioinformatics as a tool for understanding the descriptive proteome analysis of the pathogen and its interaction with the host will give a better idea for designing the diagnostics and medicines. Several proteomics methods, such as mass spectrometry (MS), for protein and peptide analyses via, for instance, the matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI) techniques resulted in powerful MS instrumentations (Del Chierico et al. 2014). The detail of the techniques is mentioned in Table (19.1). Further the alteration to the environmental variations can be determined by estimation of metabolites, which are the end products of the cellular regulatory process. Because endogenous metabolites are fewer than genes, transcripts, and proteins, only fewer data can be interpreted. Hence, metabolomics has a great advantage over genomics and proteomics. The change in the metabolites reflects the biological states of organism. An in silico study, such as genome-scale metabolic models, utilizes metabolites to identify the effective target of the drugs. One important PHI is the production of toxins by the pathogen that affects the host immune system. The fungus Aspergillus fumigatus which secretes gliotoxin induces apoptosis in host system. Systems biology-based models, including genetic regulatory networks (GRNs), help in understanding the uptake of important nutrients, such as nitrogen, carbon, and iron, by pathogens from the host system and how they regulate the biochemical network (Scharf et al. 2012; Gardiner and Howlett 2005).

19.6 Mathematical Modeling Assisting PHI Interaction

In the past few decades, the synthetic and systems biology field has witnessed a major paradigm shift with the availability of whole-genome sequencing for various organisms, which gave the whole picture of metabolic network, signaling and regulatory pathways in cells. For altering the metabolism of an organism, understanding the cellular biochemical network is very much essential (Bose 2013; Chuang et al. 2010; Chae et al. 2017). With the evolution of systems-based approaches, a wide range of techniques were applied for the simulation and analysis of biochemical systems. The entire biochemical modeling can be classified into (i) constrain-based modeling, which relies on the reaction stoichiometry; (ii) kinetic modeling, which is based on comprehensive mechanistic modeling; (iii) interaction-based network (Raman and Chandra 2009). The steps involved in reconstruction of metabolic pathways are shown in Fig. 19.4.

Fig. 19.4
figure 4

Overview of the steps involved in designing metabolic modeling of organism

Compared with kinetic modeling, which requires a detailed study for evaluating its parameters, constrain-based model offers a more precise quantification of genotype-phenotype relationship and hence is widely used in metabolic engineering (Antoniewicz 2015; Çalık and Özdamar 2011; Dai and Locasale 2016). In constrain-based analysis, the organism fine-tunes itself with the change in the environment satisfying the given constrain and achieves better survival capabilities. For in silico metabolic engineering, metabolic networks are simulated using constrain-based method and ultimately represent all biochemical networks in the organism. The metabolic network reconstruction may be focused on specific pathways/central metabolic pathways to encompass the entire genome leading to a genome-scale metabolic model. The reconstruction of genome-scale metabolic models involves various steps that includes (a) draft model creation, (b) detailed model reconstruction, (c) mathematical format conversion, (d) gap identification and filling, and (e) simulation and visualization (Faust et al. 2011; Geng and Nielsen 2017; Kim et al. 2012).

In the PHI context, the pathogens are solely dependent on the host for getting the substrate, thereby maintaining the active metabolic state; hence, there is a continuous exchange of metabolites between hosts and plant pathogen (Orth et al. 2010; Kauffman et al. 2003). Also for the pathogenesis of an organism it depends on the availability of the nutrients in the host system there is a direct link between the metabolism and the virulence. Recently advanced version of bioinformatics tools for the reconstruction of metabolic network based on genomics data and constrain-based modeling, there in silico metabolic networks are very essential in understanding the physiology of pathogen for e.g. substrate availability in the host that decides the pathogenicity or the secretion of the toxins based on the host environmental conditions (Chavali et al. 2012; Eisenreich et al. 2013; Gouzy et al. 2014; Brown et al. 2008; Milenbachs et al. 1997). A constrain-based modeling of the Gram-negative bacterial pathogen, Salmonella typhimurium, showed a systematic metabolic modeling between the pathogens and the hosts (Raghunathan et al. 2009). The simulation of flux balance models for the reconstruction of genome-scale metabolic models answered the question of survival capabilities of pathogen. It has been shown that when the author used the media similar to the host cell, the model-predicting ability was superior. The author also showed that integration of transcriptome data with this flux analysis data led to a better understanding of transport mechanism. Recently, a dynamic flux balance analysis (FBA) model of a barley plant was constructed, which is capable of predicting the steady-state flux distribution of the metabolism of different organs throughout the entire plant development (Grafahrend-Belau et al. 2013).

19.7 Gene Regulatory Network Modeling in PHI

The phenotype of an organism is solely dependent on the gene expression, the gene regulation is an interconnection of regulatory circuits at molecular levels. The molecular mechanism includes controlling of transcription by transcriptional factors; RNA transporting, which is responsible for the posttranscriptional control of RNA; chromosomal remodeling; controlling of protein translation through signal transduction network; and posttranslational modifications, such as phosphorylation and acetylation (Thompson et al. 2015). Measuring the interactions between these molecular components is very difficult, but the advances made in the past two decades to precisely measure these components have enabled large-scale measurements of gene expression at steadily decreasing costs. With this data, the reconstruction of the molecular systems can be done using computational techniques, and the interaction underpinning patterns of gene expression can be easily interpreted (Vijesh et al. 2013). Interactions among the molecular components of the living systems are collectively known as gene regulatory network (GRN) models. Most of the biological models help in understanding the pathogenicity of the organisms, ODE-based modeling are based on kinetic parameters describes PHI phenomenologically and does not consider the molecular mechanism (Hecker et al. 2009).

GRNs describe the logic of mode of infection by pathogens, adaption of pathogens to their hosts, and defense mechanism of hosts against pathogens. It is very difficult to reconstruct GRNs based solely on gene expression data. Proposed reverse engineering methods include those based on Boolean networks, Bayesian networks, differential or difference equations, and graphical Gaussian models that integrate gene expression data to better curate models (Hecker et al. 2009; Chai et al. 2014). In plant system, only few literatures based on GRN are available. Varala et al. (Varala et al. 2018) applied GRN to understand the temporal transcriptional logic underlying dynamic nitrogen (N) signaling in plant. The time series transcriptome analysis showed the dynamics of nitrogen signaling by a temporal cascade of cis elements. Recently, Ikeuchi et al. (Ikeuchi et al. 2018) used enhanced yeast one-hybrid (eY1H) screen to build GRN models, systematically showing the regulations between transcription factors and promoters. Also they showed that wound/hormone secretion invokes cross talks between genes and thereby regulates the common reprogramming-associated genes via multilayered regulatory cascades.

19.8 Protein-Protein Interaction Network Modeling in PHI

In recent years, the molecular structure and function of gene and proteins and their relationships are studied thoroughly, leading to a better identification of intra- and interspecies protein-protein interaction networks. Several characteristic features of PHIs, such as adhesion, colonization, and even invasion, can be interpreted through protein interaction map/protein-protein interaction (PPI) (Zhou et al. 2014). It has been observed that the PPI data used to predict the intra-species may not be applicable for interspecies host-pathogen PPIs. Several approaches of PPIs for understanding the PHI have been proposed among species. PPIs are broadly categorized into homology-based approach, structure-based approach, domain-motif interaction-based approach, and machine learning-based approach (Shao et al. 2012). Generally the protein-protein interaction network (PIN) is mathematically represented in the form of graphs where nodes symbolize proteins and edges connect the interacting protein pairs (Colizza et al. 2005). Interestingly it was observed that the datasets available for interaction show a similar nontrivial topological structure of the networks, defining a broad connectivity distribution P(k); i.e., the probability that any given protein interacts with k other proteins. This kind of pattern gives large hubs defining the nodes which have large number of connectivity leading complex architecture supporting nontrivial correlation and hierarchical features in network topology (Yook et al. 2004; Ravasz and Barabasi 2003; Maslov and Sneppen 2002). These features are shared among many biological networks that appear to have recurrent architectural principles that might point to common organizational mechanisms (Ravasz and Barabasi 2003; Dorogovtsev and Mendes 2002). A detailed review by Zhang et al. (Zhang et al. 2010) describes the importance of protein-protein interaction in the regulation of plant developmental, physiological, and pathological processes. Zhu et al. (Zhu et al. 2016) developed a protein-protein interaction database of maize plant. The architecture of gene regulatory networks and protein-protein interactions is shown in Figs. 19.5a and b, respectively.

Fig. 19.5
figure 5

(a) Gene regulatory network using gene expression data. (b) Protein-protein interaction modeling using proteomic data

19.9 Conclusion

With the advancement in omics technology, a huge amount of data is generated on genomics, transcriptomics, proteomics, and metabolomics. These data can be easily interpreted with computational biology techniques, which help in understanding the regulations between the gene and perturbation in the external environment. Further these tools are very useful in predicting the interactions between the pathogens and the hosts. With the application of flux balance analysis, it is possible to understand the genotype-phenotype relationship between the organisms. GRN modeling and protein-protein interaction-based modeling show the regulations of molecular mechanisms between the hosts and the pathogens. Systems biology has provided a better way to understand pathogenicity and drug discovery.