1 Introduction

Our immune system protects us from deadly threats posed by pathogens. To do so, it has to detect the invasion of exogenous pathogens, watch for the pathogenic conversion of endogenous microbes, communicate the threats to the other systems in our bodies, e.g., the nervous [1–4] and digestive systems [5–7], and then coordinate these systems to evade the threats. Clearly, the immune system cannot function alone. Past studies of the immune system [8–10] focused on its molecular functions and cellular constitution, and on the physiological effects of immune-related molecules and cells. However, the immune system is only one part of a biological organism. Hence, from a systematic perspective, we should consider all systems as a whole rather than view the immune system in isolation.

Immune-related molecules (e.g., chemokines, cytokines, and interferons) and cell types (e.g., lymphocytes, monocytes, and mast cells) are commonly studied with respect to the molecular functions and cellular constitution of the immune system. After the first line of host defense (i.e., innate immunity) is activated, several cell types (e.g., macrophages, dendritic cells, and natural killer cells) are recruited to protect the host from pathogen invasion and eliminate the threats from pathogens. The recognition of pathogen-associated molecular patterns (PAMPs) and/or damage-associated molecular patterns (DAMPs) by pattern recognition receptors (PRRs) (e.g., toll-like receptors and C-type lectin receptors) [10, 11] can be viewed as the starting point of a series of complex downstream mechanisms. The PRRs initiate downstream pathways that promote the activation of other parts of the innate immune system and the clearance of pathogens (e.g., production and secretion of cytokines, chemokines, and chemotactic cues to recruit more leukocytes). Meanwhile, if it is the first exposure of the host to the pathogen, macrophages and dendritic cells present antigens to induce the synthesis of antibodies specific to the presented antigens (Fig. 1A). If it is not the first exposure, existing immunological memory cells proliferate and induce the synthesis of antibodies (Fig. 1A). In short, the interplay between T cells, B cells, macrophages, dendritic cells, etc. has been elaborated in detail at the physiological level. For the treatment of infectious diseases, however, current drug targets are key molecules rather than cells. Therefore, investigating the systematic offensive and defensive mechanisms at the molecular level is particularly important from a drug discovery and design perspective.

Compared with the host immune system, pathogenic mechanisms, not to mention interspecies protein-protein interactions (PPIs) between the host and pathogen, have attracted less attention. Yet the battle is a two-sided affair: the interplay between the host and pathogen shapes the whole infection process, from the first exposure to the pathogen to the final outcome of the infection [12]. Therefore, about a decade ago, the traditional approach of treating the host and pathogen separately shifted to a more holistic viewpoint that includes both players in the infection process. This transition resulted from (i) the realization that pathogen-host interactions (PHIs) are indispensable in infectious diseases and (ii) the advent of omics biotechnology to quantify genes, transcripts, and proteins at the whole-cell/organism level [13], which permitted a comprehensive interrogation of both the pathogen and host at the whole-genome, transcriptome, and proteome levels. Despite tremendous advances in understanding pathogenic mechanisms and the subsequent triumphs in drug development [14], the remaining issues of infectious diseases (e.g., drug resistance) have become more troublesome. The dynamic and complex interactions between the host and pathogen may partially explain why certain drugs are often not effective in vivo [15]. Hence, to investigate infection processes from a systematic perspective, in this study we constructed dynamic pathogen-host PPI networks (PH-PPINs) of innate and adaptive immunity.

Fig. 1.
Study design and the flowchart of PH-PPIN construction. (A) The first and second exposure induced the innate and adaptive immune responses, respectively, and the two-sided temporal gene expression profiles were recorded by microarray experiments (the rectangles are the observation windows of microarray experiments). (B) The flowchart delineates the procedures used in this study. Selected proteins of interest based on the microarray data formed a protein pool. The PPI candidates collected from the database mining and ortholog information were further pruned into the innate and adaptive dynamic PH-PPINs by the dynamic interaction model, system order detection method, and microarray data. Finally, the interaction variation scores were used to evaluate the significance of proteins in the interaction difference network, which was derived from the two constructed PH-PPINs.

To obtain systematic molecular interaction networks for targeted therapy, we utilized the C. albicans-zebrafish infection model [16]. We measured the temporal gene expression profiles of C. albicans and zebrafish during the infection process, constructed the interspecies PPIs using a dynamic interaction model, and identified the crosstalk network biomarkers with the proposed interaction variation scores (Fig. 1B). Given the success of the C. albicans-zebrafish infection model [16] and its amenability to genetic manipulation [17], the zebrafish is a novel and promising model organism for studying immunity. Furthermore, the zebrafish and human immune systems are remarkably similar, and more than 75 % of human genes implicated in diseases have counterparts in zebrafish [18]. This provides a strong connection between zebrafish and humans with respect to pathogenic mechanisms as well as immune responses, which is important for biomedical applications. The immune system of zebrafish, like that of other vertebrates, can be divided into two subsystems, i.e., innate and adaptive immunity [11]. The first dataset (GSE32119, [19]) we used to construct a dynamic PH-PPIN measured the two-sided gene expression profiles during the first 18 hours after zebrafish were first exposed to C. albicans to induce primary responses. The second dataset (GSE51603, [20]) measured the two-sided gene expression profiles during the first 42 hours after zebrafish were exposed to C. albicans a second time to induce secondary responses. To extract the interaction information from the time-course microarray data, two dynamic PH-PPINs were constructed for innate and adaptive immunity in the infection process. By evaluating interaction variations based on the corresponding interaction variation scores, critical proteins and crosstalk network biomarkers with larger interaction variations in the infection process were identified. These crosstalk network biomarkers suggest the strategies taken by the host and pathogen during the infection process. Thus, they could be potential drug targets when battling infectious diseases [13].

2 Material and Methods

2.1 Overview of Microarray Data

In this study, we used two microarray datasets. The first comprised the two-sided temporal gene expression profiles of the host (zebrafish) and pathogen (C. albicans) in the period after the first exposure (GSE32119, [19]) and recorded the pathogen-host interaction information of innate immunity; the second comprised the two-sided temporal expression profiles of the host and pathogen in the period after the secondary exposure (GSE51603, [20]) and recorded the pathogen-host interaction information of adaptive immunity. For the first dataset, an experiment was performed to simultaneously profile genome-wide gene expression of innate immunity in both C. albicans and zebrafish during the infection process. C. albicans (SC5314 strain) was intraperitoneally injected into adult AB strain zebrafish. The second dataset measured genome-wide gene expression of adaptive immunity in both C. albicans and zebrafish after the second exposure to C. albicans, fourteen days after the first exposure. Then, a two-step homogenization/mRNA extraction procedure was performed using the whole zebrafish infected with C. albicans. This approach provides separate pools of gene transcripts from the host and the pathogen, so that gene expression profiles can be estimated individually for each organism using sequence-targeted probes derived from the individual genomes. Agilent in situ oligonucleotide microarrays, which cover 6,202 and 26,206 genes for C. albicans and zebrafish, respectively, were used to profile temporal gene expression; the first dataset consisted of three replicates of each organism measured at 9 time points (0.5, 1, 2, 4, 6, 8, 12, 16, and 18 h post-injection), and the second dataset consisted of two replicates of each organism measured at 8 time points (2, 6, 12, 18, 24, 30, 36, and 42 h post-re-injection). Both datasets were prepared under similar experimental conditions.

2.2 Protein Pool Selection and Database Integration

Two steps must be completed before a dynamic protein-protein interaction (PPI) network can be constructed with a dynamic interaction model. The first is to assemble a protein pool from which the nodes of the resultant networks are selected, and the second is to obtain all possible PPIs among the proteins in the pool by integrating interaction information from database mining. Here, our protein pool consisted of the union of the genes differentially expressed within the first and second datasets and the genes differentially expressed between the two datasets. To select the genes differentially expressed within each dataset, we computed the p-value of the ANOVA test of whether the average expression levels differed over time (for the first dataset, the null hypothesis \(\mu _1=\cdots =\mu _9\) states that the average expression levels are the same at all 9 time points; for the second dataset, the null hypothesis \(\mu _1=\cdots =\mu _8\) states that they are the same at all 8 time points), and included in the protein pool those proteins with a Bonferroni-corrected p-value \(<0.05\). In addition, the genes in the top 5 % of the expression difference between the first and second datasets were selected for the protein pool. Next, to identify all possible interactions between the proteins in the pool, interaction information for zebrafish-zebrafish, C. albicans-C. albicans, and zebrafish-C. albicans pairs is needed. However, little information is available for these three kinds of interactions, and it is infeasible to consider every possible pairwise interaction between the proteins in the pool. To overcome this issue, interaction information from human and yeast was used because of their similarity to our study subjects (zebrafish and C. albicans) and data availability. To infer the possible interactions of the study subjects, the ortholog information in the Inparanoid database [21] was used to convert the interactions of human and yeast [19, 22] into interactions of zebrafish and C. albicans. It should be noted that the interactions inferred by the ortholog-based method were derived under different experimental conditions. Consequently, they do not accurately reflect the actual biological conditions of the pathogen-host interactions during the C. albicans infection process; that is, the complete set of inferred interactions contains false positives, which need to be identified and removed using real microarray data. Therefore, the false positive interactions were deleted from the candidate PPIs, and realistic pathogen-host crosstalk PPI networks in innate and adaptive immunity were constructed using the two-sided microarray data and the dynamic PPI model described in the following section.
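The ortholog-based conversion described above can be illustrated with a minimal sketch. It assumes the ortholog information is available as a mapping from each source-species (human or yeast) protein to a set of target-species (zebrafish or C. albicans) proteins; the function name `transfer_interactions`, the exclusion of self-pairs, and the toy identifiers are hypothetical and are not taken from Inparanoid or from the study.

```python
from itertools import product


def transfer_interactions(source_ppis, orthologs):
    """Map source-species PPIs onto target-species candidate PPIs.

    source_ppis: iterable of (protein_a, protein_b) pairs in the source species.
    orthologs:   dict mapping a source protein to a set of target orthologs.
    Returns a set of unordered target-species candidate pairs.
    """
    candidates = set()
    for a, b in source_ppis:
        # A source protein may have zero, one, or several orthologs, so
        # every combination of orthologs becomes a candidate interaction.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            if ta != tb:  # assumption: self-pairs are not kept as candidates
                candidates.add(frozenset((ta, tb)))
    return candidates


# Hypothetical toy data for illustration only.
source_ppis = [("H1", "H2"), ("H2", "H3"), ("H4", "H5")]
orthologs = {"H1": {"z1"}, "H2": {"z2a", "z2b"}, "H3": {"z3"}}
candidates = transfer_interactions(source_ppis, orthologs)
```

In this toy case the pair (H4, H5) yields no candidate because neither protein has a known ortholog, while H2, which has two orthologs, contributes two candidates per source interaction. This one-to-many expansion is one reason the candidate set is expected to contain false positives.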

2.3 Pathogen-Host Protein-Protein Interaction Network (PH-PPIN) Construction

To construct the interspecies PPI network from the protein pool and candidate PPIs, a dynamic protein-protein interaction model was fitted to each protein in turn to determine the realistic PH-PPIN. Let N and M denote the total numbers of host and pathogen proteins, respectively. For a host target protein i in the PH-PPIN, the dynamic interaction model is as follows [26]:

$$\begin{aligned} p_i^{\left( h \right) }\left[ {k + 1} \right] = \sigma _i^{\left( h \right) }p_i^{\left( h \right) }\left[ k \right] + \sum \limits _{n = 1}^N {\alpha _{in}^{\left( h \right) }p_n^{\left( h \right) }\left[ k \right] } + \sum \limits _{m = 1}^M {{\gamma _{im}}p_m^{\left( p \right) }\left[ k \right] } + \beta _i^{\left( h \right) } + \epsilon _i^{\left( h \right) }\left[ {k + 1} \right] \end{aligned}$$
(1)

where \(p_i^{(h)}[k]\) denotes the protein level of the host target protein i at time k, \(\epsilon _i^{(h)}[k]\) denotes the environmental noise at time k, \(\sigma _i^{(h)}\) denotes the self-regulation ability of the host target protein i, \(\alpha _{in}^{(h)}\) denotes the interaction strength between the host protein n and the host target protein i, \(\gamma _{im}\) denotes the interaction strength between the pathogen protein m and the host target protein i, and \(\beta _i^{(h)}\) denotes the basal level of the host target protein i. Similarly, the dynamic interaction model of a pathogen target protein j can be written as follows:

$$\begin{aligned} p_j^{\left( p \right) }\left[ {k + 1} \right] = \sigma _j^{\left( p \right) }p_j^{\left( p \right) }\left[ k \right] + \sum \limits _{m = 1}^M {\alpha _{jm}^{\left( p \right) }p_m^{\left( p \right) }\left[ k \right] } + \sum \limits _{n = 1}^N {{\gamma _{jn}}p_n^{\left( h \right) }\left[ k \right] } + \beta _j^{\left( p \right) } + \epsilon _j^{\left( p \right) }\left[ {k + 1} \right] \end{aligned}$$
(2)

The biological significance of this formulation is that the protein level of the host (pathogen) target protein i (j) in the future (at time \(k+1\)) is determined by its current protein level (at time k) with self-regulation ability \(\sigma _i^{(h)}\) (\(\sigma _j^{(p)}\)), the interaction strength between the host (pathogen) target protein i (j) and the proteins of the same species \(\alpha _{in}^{(h)}\) (\(\alpha _{jm}^{(p)}\)) and the other species \(\gamma _{im}\) (\(\gamma _{jn}\)), the basal level \(\beta _i^{(h)}\) (\(\beta _j^{(p)}\)), and the environmental noise \(\epsilon _i^{(h)}\) (\(\epsilon _j^{(p)}\)) in the future. Due to the unavailability of proteomic data, the expression levels measured by the two-sided microarray experiments were used to represent the protein levels in the dynamic interaction model. The dynamic interaction model for the host target protein i can be further rewritten into a concise form as follows:

$$\begin{aligned} \mathbf{{p}}_i^{\left( h \right) } = {\varPhi _i}{} \mathbf{{\theta }}_i^{\left( h \right) } + \mathbf{{\epsilon }}_i^{\left( h \right) } \end{aligned}$$
(3)

where \( \mathbf {p}_i^{(h)} = \begin{bmatrix} p_i^{(h)}[1]&\cdots&p_i^{(h)}[K] \end{bmatrix}^T\), \(\mathbf {\theta }_i^{(h)} =\begin{bmatrix} \alpha _{i1}^{(h)}&\cdots&\alpha _{iN}^{(h)}&\gamma _{i1}&\cdots&\gamma _{iM}&\sigma _i^{(h)}&\beta _i^{(h)} \end{bmatrix}^T\), \(\mathbf {\epsilon }_i^{(h)}=\begin{bmatrix} \epsilon _i^{(h)}[1]&\cdots&\epsilon _i^{(h)}[K] \end{bmatrix}^T\), and

$${\varPhi _i} = \left[ {\begin{array}{*{20}{c}} {p_1^{\left( h \right) }\left[ 0 \right] }&{} \cdots &{}{p_N^{\left( h \right) }\left[ 0 \right] }&{}{p_1^{\left( p \right) }\left[ 0 \right] }&{} \cdots &{}{p_M^{\left( p \right) }\left[ 0 \right] }&{}{p_i^{\left( h \right) }\left[ 0 \right] }&{}1\\ \vdots &{} \ddots &{} \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots &{} \vdots \\ {p_1^{\left( h \right) }\left[ {K - 1} \right] }&{} \cdots &{}{p_N^{\left( h \right) }\left[ {K - 1} \right] }&{}{p_1^{\left( p \right) }\left[ {K - 1} \right] }&{} \cdots &{}{p_M^{\left( p \right) }\left[ {K - 1} \right] }&{}{p_i^{\left( h \right) }\left[ {K - 1} \right] }&{}1 \end{array}} \right] \; .$$
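As an illustration of how the vectors and the regression matrix in Eq. (3) line up in time, the following sketch assembles them from two expression matrices. It assumes the levels are stored as arrays with one row per protein and one column per time index 0..K; the function name `build_regression` and the toy numbers are hypothetical. The sketch mirrors the column layout of the regression matrix literally, so the target's own level appears both in the host block and as the self-regulation column; in practice the host and pathogen blocks would presumably be restricted to the candidate partners of protein i.

```python
import numpy as np


def build_regression(host, pathogen, i):
    """Assemble p_i and Phi_i of Eq. (3) for host target protein i (0-based).

    host:     (N, K + 1) array of host levels at time indices 0..K.
    pathogen: (M, K + 1) array of pathogen levels at time indices 0..K.
    """
    past_host = host[:, :-1].T          # (K, N): levels at times 0..K-1
    past_pathogen = pathogen[:, :-1].T  # (K, M): levels at times 0..K-1
    self_lag = host[i, :-1][:, None]    # (K, 1): column multiplied by sigma_i
    ones = np.ones((past_host.shape[0], 1))  # column multiplied by beta_i
    phi = np.hstack([past_host, past_pathogen, self_lag, ones])
    p_i = host[i, 1:]                   # (K,): levels at times 1..K
    return p_i, phi


# Hypothetical toy data: N = 2 host proteins, M = 1 pathogen protein, K = 3.
host = np.arange(8.0).reshape(2, 4)
pathogen = np.arange(100.0, 104.0).reshape(1, 4)
p_i, phi = build_regression(host, pathogen, i=0)
```

Each row of `phi` holds the levels at one time index, and the matching entry of `p_i` holds the target level one step later, which is the one-step-ahead structure of Eq. (1).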

The dynamic model for the pathogen can be rewritten into a similar form. The only unknown, the parameter vector \(\theta _i^{(h)}\), can then be estimated using parameter estimation methods such as least-squares estimation. Because large-scale measurements of host and pathogen protein levels were lacking, we used the temporal gene expression profiles as a substitute for protein activities to identify \(\theta _i^{(h)}\) in the model. Furthermore, to ensure the model was not unnecessarily complex, the Akaike information criterion (AIC) was introduced to detect the true model order (the number of interactions). The model order with the minimum AIC was used as the criterion to delete false positive interactions from the candidate PH-PPINs. Hence, the final dynamic PH-PPINs comprise the dynamic interaction model of each protein at its minimum AIC value, with the false positive PPIs removed. Finally, after the parameters had been identified for each protein in the protein pool, the identified interaction parameters \(\left( \alpha _{in}^{(h)},\alpha _{jm}^{(p)},\gamma _{im},\text { and }\gamma _{jn}\right) \) formed the final dynamic PH-PPIN.
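The combination of least-squares estimation and AIC-based order detection can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the text does not specify the AIC formula or the search strategy, so the sketch assumes the common form log(residual variance) + 2 × (number of parameters) / (number of samples) and a greedy forward selection over candidate partners. The self-regulation and basal-level terms are always kept. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def aic(residuals, n_params):
    # Assumed AIC form: log residual variance plus a complexity penalty.
    n = residuals.size
    return float(np.log(residuals @ residuals / n) + 2.0 * n_params / n)


def least_squares(y, columns):
    # Ordinary least squares for y = Phi @ theta + noise.
    phi = np.column_stack(columns)
    theta, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return theta, y - phi @ theta


def select_interactions(y, self_lag, partners):
    """Greedy forward selection of interaction partners for one target.

    y:        (K,) target levels at times 1..K.
    self_lag: (K,) target levels at times 0..K-1 (self-regulation term).
    partners: (K, C) levels of C candidate partners at times 0..K-1.
    Returns (chosen partner indices, theta), where theta lists the partner
    coefficients in the order chosen, followed by sigma and beta.
    """
    n, n_cand = partners.shape
    base = [self_lag, np.ones(n)]  # sigma and beta are always estimated
    chosen, remaining = [], list(range(n_cand))
    theta, res = least_squares(y, base)
    best = (aic(res, 2), [], theta)
    while remaining and len(chosen) + 3 < n:
        # Try each remaining candidate and keep the best-fitting one.
        fits = {
            c: least_squares(y, [partners[:, j] for j in chosen + [c]] + base)
            for c in remaining
        }
        c = min(fits, key=lambda j: fits[j][1] @ fits[j][1])
        chosen.append(c)
        remaining.remove(c)
        theta, res = fits[c]
        score = aic(res, len(chosen) + 2)
        if score < best[0]:  # keep the model order with the minimum AIC
            best = (score, list(chosen), theta)
    return best[1], best[2]


# Synthetic data for illustration only (not a real time course).
rng = np.random.default_rng(0)
K, C = 40, 5
partners = rng.normal(size=(K, C))
self_lag = rng.normal(size=K)
y = (0.5 * self_lag + 2.0 * partners[:, 1] - 1.5 * partners[:, 3]
     + 0.3 + 0.1 * rng.normal(size=K))
chosen, theta = select_interactions(y, self_lag, partners)
```

Candidates that are not in `chosen` correspond to interactions pruned as false positives. With only 8 or 9 time points per dataset, as in the microarray data described above, far fewer parameters can be estimated per protein than in this synthetic example, which is why limiting the model order matters.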

2.4 Interaction Variation Score (IVS) Calculation

To identify the network biomarkers in the PH-PPINs, IVSs were calculated to relate each protein to the transition of the pathogen-host interactions from innate to adaptive immunity. The proteins in the PH-PPINs with the largest PPI variations from innate to adaptive immunity can be considered crosstalk network biomarkers of the entire infection process and potential drug targets. Therefore, we investigated these crosstalk network biomarkers as follows. The IVS measures the variation of the interaction strength under a biological condition transition. According to the dynamic interaction models, the constructed PH-PPIN under a specific condition (innate or adaptive) can be written as follows:

$$\begin{aligned} \left[ {\small \begin{array}{*{20}{c}} {p_1^{\left( h \right) }\left[ {k + 1} \right] }\\ \vdots \\ {p_N^{\left( h \right) }\left[ {k + 1} \right] }\\ {p_1^{\left( p \right) }\left[ {k + 1} \right] }\\ \vdots \\ {p_M^{\left( p \right) }\left[ {k + 1} \right] } \end{array}} \right] = \left[ {\small \begin{array}{*{20}{c}} {\sigma _1^{\left( h \right) }}&{} \cdots &{}{\alpha _{1N}^{\left( h \right) }}&{}{{\gamma _{11}}}&{} \cdots &{}{{\gamma _{1M}}}\\ \vdots &{} \ddots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ {\alpha _{N1}^{\left( h \right) }}&{} \cdots &{}{\sigma _N^{\left( h \right) }}&{}{{\gamma _{N1}}}&{} \cdots &{}{{\gamma _{NM}}}\\ {{\gamma _{11}}}&{} \cdots &{}{{\gamma _{1N}}}&{}{\sigma _1^{\left( p \right) }}&{} \cdots &{}{\alpha _{1M}^{\left( p \right) }}\\ \vdots &{} \ddots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ {{\gamma _{M1}}}&{} \cdots &{}{{\gamma _{MN}}}&{}{\alpha _{M1}^{\left( p \right) }}&{} \cdots &{}{\sigma _M^{\left( p \right) }} \end{array}} \right] \left[ {\small \begin{array}{*{20}{c}} {p_1^{\left( h \right) }\left[ k \right] }\\ \vdots \\ {p_N^{\left( h \right) }\left[ k \right] }\\ {p_1^{\left( p \right) }\left[ k \right] }\\ \vdots \\ {p_M^{\left( p \right) }\left[ k \right] } \end{array}} \right] + \left[ {\small \begin{array}{*{20}{c}} {\beta _1^{\left( h \right) }}\\ \vdots \\ {\beta _N^{\left( h \right) }}\\ {\beta _1^{\left( p \right) }}\\ \vdots \\ {\beta _M^{\left( p \right) }} \end{array}} \right] + \left[ {\small \begin{array}{*{20}{c}} {\epsilon _1^{\left( h \right) }\left[ {k + 1} \right] }\\ \vdots \\ {\epsilon _N^{\left( h \right) }\left[ {k + 1} \right] }\\ {\epsilon _1^{\left( p \right) }\left[ {k + 1} \right] }\\ \vdots \\ {\epsilon _M^{\left( p \right) }\left[ {k + 1} \right] } \end{array}} \right] \end{aligned}$$
(4)

where the notations are the same as in the dynamic interaction models. The above equation can be written in a more concise form:

$$\begin{aligned} \mathbf{{p}}\left[ {k + 1} \right] = A\mathbf{{p}}\left[ k \right] + \mathbf{{\beta }} + \mathbf{{\epsilon }}\left[ {k + 1} \right] \end{aligned}$$
(5)

where A is a systematic interaction matrix of the PH-PPIN constructed under a specific condition. The interaction difference of two PH-PPINs between innate and adaptive immunity can be expressed in the following interaction difference matrix form:

$$\begin{aligned} {D_{\mathrm{{adaptive}} - \mathrm{{innate}}}} = {A_{\mathrm{{adaptive}}}} - {A_{\mathrm{{innate}}}}\; . \end{aligned}$$
(6)

If the variation of the interaction strength of a protein is larger during a biological condition transition (innate\(\rightarrow \)adaptive immunity in this study), this may imply the protein has a more important role in the transition from innate to adaptive immunity. Therefore, the IVS used to evaluate the interaction variability of a protein in the transition from innate to adaptive immunity can be defined as follows:

$$\begin{aligned} {IVS_p} = \frac{{\sum \nolimits _{q = 1}^Q {\left| {{d_{pq}}} \right| } }}{{{\text {Degree of protein }}p}} \end{aligned}$$
(7)

where \(d_{pq}\) is the pq-entry of \(D_{\text {adaptive}-\text {innate}}\) and Q is the number of columns of \(D_{\text {adaptive}-\text {innate}}\) (i.e., \(Q=N+M\), the total number of host and pathogen proteins). The degree of protein p is the number of non-zero elements in the pth row of the interaction difference matrix \(D_{\text {adaptive}-\text {innate}}\). \(IVS_p\) is therefore the average interaction variation of protein p in the transition from innate to adaptive immunity. Proteins with larger IVSs are considered significant proteins that play an important role in the transition from innate to adaptive immunity in the infection process.
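Eqs. (6) and (7) can be illustrated with a small numerical sketch. The two 3×3 interaction matrices below are hypothetical toy values, and assigning an IVS of zero to a protein whose row of the difference matrix is entirely zero is an assumption, since Eq. (7) is undefined when the degree is zero.

```python
import numpy as np


def interaction_variation_scores(a_innate, a_adaptive):
    """Return the IVS of every protein from two interaction matrices."""
    # Eq. (6): interaction difference matrix.
    d = np.asarray(a_adaptive, dtype=float) - np.asarray(a_innate, dtype=float)
    # Degree of protein p: number of non-zero entries in row p of D.
    degree = np.count_nonzero(d, axis=1)
    total = np.abs(d).sum(axis=1)
    # Eq. (7); rows with degree zero get a score of 0 (assumption).
    return np.divide(total, degree, out=np.zeros_like(total), where=degree > 0)


# Hypothetical toy matrices for three proteins.
a_innate = [[0.9, 0.4, 0.0],
            [0.4, 0.8, -0.2],
            [0.0, -0.2, 0.7]]
a_adaptive = [[0.9, 0.1, 0.5],
              [0.1, 0.8, -0.2],
              [0.5, -0.2, 0.7]]
ivs = interaction_variation_scores(a_innate, a_adaptive)
ranking = np.argsort(-ivs).tolist()  # proteins ordered by decreasing IVS
```

Here protein 0 has two changed interactions (|−0.3| and |0.5|), giving an IVS of 0.8/2 = 0.4, while proteins 1 and 2 each have one changed interaction, giving 0.3 and 0.5. Because the score is normalized by the degree, a protein with one strongly altered interaction can rank above a protein with several moderately altered ones.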

3 Results

3.1 The Pathogen-Host Protein-Protein Interaction Networks (PH-PPINs) of Innate and Adaptive Immunity

In this study, we aimed to investigate the systematic offensive and defensive mechanisms of the pathogen and host at the molecular level. In particular, we aimed to understand the roles of pathogen-host interactions (PHIs) in innate and adaptive immunity from a systems biology perspective. The outcomes of interactions between the host and pathogen were recorded in the two-sided temporal gene expression profiles of C. albicans and zebrafish, which were measured simultaneously during the primary and secondary response periods of the infection process. During the two periods (the rectangles in Fig. 1A), the observed variations in the gene expression levels were mainly due to innate and adaptive immunity, respectively. We selected 1620 proteins of interest for the protein pool, including those that were differentially expressed within a dataset and those in the top 5 % of the expression level difference between the two datasets. The comparison of their temporal profiles (Fig. 2A) implied that their expression patterns changed: the activation of a group of pathogen genes was delayed and the repression of a group of host genes was advanced. These changes in the gene expression patterns implied that the PHIs in the two periods should vary correspondingly. To determine the variations of the underlying PHIs, 26060 PPI candidates inferred from database mining and ortholog information were pruned using the dynamic interaction models, the model order detection method, and the two-sided microarray data (Fig. 1B), and the innate and adaptive dynamic PH-PPINs were then formed (Fig. 2B). The two constructed PH-PPINs represent the underlying mechanisms used to explain the observed changes in the gene expression patterns in the infection process.

Fig. 2.
Temporal gene expression profiles of the proteins of interest and the constructed innate and adaptive dynamic PH-PPINs. (A) The horizontal axis indicates the sampling time points in the microarray experiments. The vertical axis shows the genes clustered according to their expression patterns in innate immunity. (B) The innate and adaptive PH-PPINs consist of PPIs in three domains: pathogen-pathogen, pathogen-host, and host-host.

Fig. 3.
The interaction difference network between innate and adaptive immunity. The IDN between innate and adaptive immunity consists of interactions in the three domains: pathogen-pathogen, pathogen-host, and host-host domains. The round and square nodes indicate the pathogen and host proteins, respectively.

The resultant PH-PPINs consisted of 1512 proteins (1431 C. albicans proteins; 81 zebrafish proteins) and 5721 PPIs (5510 intracellular interactions inside C. albicans; 145 interspecies interactions; 66 intracellular interactions inside zebrafish) for innate immunity, and 1578 proteins (1480 C. albicans proteins; 98 zebrafish proteins) and 3755 PPIs (3577 intracellular interactions inside C. albicans; 96 interspecies interactions; 82 intracellular interactions inside zebrafish) for adaptive immunity. Considering the variation in the nodes and edges of the pathogen, although most of the pathogenic nodes are shared between innate and adaptive immunity, the number of edges changed from 5510 to 3577, and only 1203 edges are shared. This implies that the pathogen may use almost the same set of proteins (85 %) but with different links to interact with the host and to regulate functions within the pathogen itself in response to various challenges from innate and adaptive immunity. In contrast, the host may use a different strategy, since the distribution of its node and edge numbers differed from that of the pathogen. In the zebrafish, there are three more significantly enriched functions (angiogenesis, coagulation, and circadian clock) in the adaptive PH-PPIN compared with the innate PH-PPIN (metabolic processes, immune responses, and apoptosis). In addition, in C. albicans, there are two more significantly enriched functions (circadian clock and filament growth) compared with the innate PH-PPIN (response to stimulus, redox status, and budding). The new functions in the adaptive PH-PPIN compared with the innate PH-PPIN indicate changes in the response strategies of the host and pathogen. To efficiently identify and evaluate the significance of proteins in the innate and adaptive dynamic PH-PPINs, we differentiated the two PH-PPINs into an interaction difference network (IDN) (Fig. 3), i.e., the matrix D in Eq. (6), and then used interaction variation scores (IVSs) to evaluate the interaction variations of proteins in the IDN.
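The comparison of edges between the two networks amounts to set operations on undirected edge lists, as sketched below. The function name `compare_networks` and the placeholder edges are hypothetical; the `reported` counts are the interaction numbers stated in the text, included only to check that the per-domain counts add up to the stated totals.

```python
def compare_networks(innate_edges, adaptive_edges):
    """Split undirected edges into shared, lost, and gained sets."""
    innate = {frozenset(e) for e in innate_edges}
    adaptive = {frozenset(e) for e in adaptive_edges}
    return {
        "shared": innate & adaptive,   # present in both networks
        "lost": innate - adaptive,     # only in the innate network
        "gained": adaptive - innate,   # only in the adaptive network
    }


# Hypothetical placeholder edges (P = pathogen protein, h = host protein).
innate_edges = [("P1", "P2"), ("P2", "P3"), ("h1", "P3")]
adaptive_edges = [("P2", "P1"), ("h1", "P3"), ("h2", "P4")]
diff = compare_networks(innate_edges, adaptive_edges)

# Interaction counts as reported in the text.
reported = {
    "innate": {"pathogen": 5510, "interspecies": 145, "host": 66, "total": 5721},
    "adaptive": {"pathogen": 3577, "interspecies": 96, "host": 82, "total": 3755},
}
consistent = {
    name: c["pathogen"] + c["interspecies"] + c["host"] == c["total"]
    for name, c in reported.items()
}
```

Storing each edge as a `frozenset` makes the comparison independent of the order in which the two proteins are listed, so ("P1", "P2") and ("P2", "P1") count as the same interaction.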

3.2 Identifying Crosstalk Network Biomarkers in the IDN Using IVS

Cell signaling depends on dynamic PPINs [23]. Hence, interaction variation in the PPINs indicates a change in cell signaling and corresponding consequences for cellular functions. To illustrate the variation of PPINs, we used node color, edge color, and edge line style, as shown in Fig. 3, to indicate the existence of proteins and their interactions and the variation of interactions from the innate to the adaptive dynamic PH-PPIN. Further, to focus on the proteins with significant variations, the IVS in Eq. (7) was used to evaluate the average interaction variation of a protein, that is, the ratio of the total interaction variation of a protein to the number of links possessed by the protein. Hence, the IVS quantifies the extent of the interaction variations, which may signify the importance of the proteins in the transition from innate to adaptive immunity, i.e., in the IDN between the innate and adaptive PH-PPINs (Fig. 3). In the following, we focus on the proteins with the ten highest IVSs in each of the three domains, that is, the host-host (zebrafish-zebrafish), pathogen-pathogen (C. albicans-C. albicans), and pathogen-host (C. albicans-zebrafish) domains, and determine the crosstalk network biomarkers in these domains.

The Crosstalk Network Biomarkers in the Host-Host Domain. In the host-host domain of the IDN, the ten proteins with the highest IVSs show close relationships with innate and adaptive immune responses. When the ten proteins and their first neighbors are extracted from the IDN, they form five components in the host-host domain (Fig. 4A). The largest component consists of f2, LOC798231, LOC793315, ace2, gnai1, and their first neighbors (the left part of Fig. 4A). gnat2, a host G-protein that formed one end of the interspecies interaction, has connections with chemokine-related proteins (ccl-c5a and si:dkey-269d20.3) and chemotaxis-related proteins (ENSDARP00000105159 and ENSDARP0000111107). The angiogenesis- and coagulation-related proteins (agt, ace2, f2, and ENSDARP00000098661) are connected to these chemokine-related proteins. This component also contains two other proteins, i.e., a serine proteinase inhibitor (serpinc1) and a prokineticin (ENSDARP00000109666). The roles of angiogenesis, coagulation, and chemokines in innate and adaptive immunity are manifested in this component. The second component mainly consists of complement components (c7b, c8g, c8a, c8b, and c9) and vitronectins (vtna and vtnb). In addition to the well-known roles of the complement system in immunity, vitronectins have recently attracted much attention in the field of immunity [24]. cd36 and the apolipoproteins (apob1, apoba, and apobb) form the third component (the lower right part of Fig. 4A). CD36 plays a pivotal role in macrophage foam-cell formation and atherogenesis, which is reduced by apolipoproteins. Although the last two components are much less documented, the roles of versican (vcanb) and tank in inflammation have been reported [25].

Fig. 4.
The crosstalk network biomarkers in the host-host and pathogen-pathogen domains. (A) Chemokines, the complement system, and angiogenesis and coagulation are the three major crosstalk network biomarkers in the host-host domain of the IDN owing to the higher IVSs of their members. (B) Redox status and pathogen expansion are the two major crosstalk network biomarkers in the pathogen-pathogen domain of the IDN owing to the higher IVSs of their members. The shadowed nodes represent the proteins with the ten highest IVSs in their domains.

Fig. 5.
The crosstalk network biomarkers in the pathogen-host domain. Redox status, circadian rhythm, gene transcriptions, and chemokines are the four major crosstalk network biomarkers in the pathogen-host domain. The shadowed nodes represent the proteins with the ten highest IVSs in the pathogen-host domain. The round and square nodes indicate the pathogen and host proteins, respectively.

The Crosstalk Network Biomarkers in the Pathogen-Pathogen Domain. In the pathogen-pathogen domain, the ten proteins with the highest IVSs and their first neighbors form a single component (see Fig. 4B). In this component, the importance of redox status in the innate and adaptive immune responses is re-emphasized [26]. ERG1, CAL0005908, MET10, and GCV3 are all related to the redox status of C. albicans. In addition, CAL0005225, ERG1, and SDS24 are related to the expansion of C. albicans due to their functions in budding, filament growth, and the cell cycle, respectively. In particular, MET10 is related to the response to stress from the host and environment. Another major function in this component is transferase activity. MET2 is a homoserine acetyltransferase that can transform homoserine, a toxin for C. albicans, and is important for C. albicans survival. ARG3 facilitates the production of citrulline, which can induce pseudohyphal morphogenesis. The morphological transformation of C. albicans has been proven to be important in its pathogenesis. The hydrolase CAF16 exerts its influence on RNA polymerase II although the specific targeted genes are still unknown.

The Crosstalk Network Biomarkers in the Pathogen-Host Domain. In the pathogen-host domain, we also selected ten proteins from both the host and pathogen. These interspecies proteins form crosstalks that are more complicated than those in the pathogen-pathogen domain (Fig. 5). A possible mechanism for the correlation between redox status in the host and pathogen is shown in the pathogen-host domain, i.e., the interaction between thioredoxin (txn) and ribonucleotide reductase 1 (RNR1). In addition to its role in redox status, RNR1 also influences the iron utilization, filament growth, and cell cycle of C. albicans. This implies that the effect of redox status on the pathogen is multifaceted. Compared with chemokine-related functions in the host-host domain, the role of chemokine-related functions in the PHIs of the pathogen-host domain is more interesting. CAG1, one of the proteins through which chemokine-related functions affect the pathogen, is related to the hyphal growth, mating, and biofilm formation of the pathogen, which are all important in pathogenesis. Whereas redox status and chemokines also appear in the pathogen-pathogen and host-host domains, respectively, gene transcription and the circadian clock appear only in the pathogen-host domain. Interactions among TAF60, gtf2a2, and polr2e emerged in adaptive immunity. TAF60, a transcription factor, is responsible for the drug responses in the pathogen, and gtf2a2 and polr2e are related to gene transcription in the host. Their interactions indicate a possible mechanism by which the PHIs affect gene expression levels. In addition to gene transcription, the circadian clock has an interesting function in the host and pathogen. The circadian clock-related proteins of the host (cry2a, cry2b, and per2) and pathogen (HRR25) form a sub-network in the pathogen-host domain. The circadian rhythms in the host and pathogen are correlated, and numerous functions of the pathogen (yeast-hyphal switch, gene transcription, pathogenesis, etc.) are affected through HRR25.

In summary, we found that the proteins with the most variable interactions in the IDN are related to chemokines, angiogenesis, coagulation, redox status, pathogen expansion, gene transcription, and circadian clock functions, i.e., the so-called crosstalk network biomarkers. These crosstalk network biomarkers change considerably in the transition from innate to adaptive immunity in the infection process and are therefore potential targets for treatment and vaccination. To further evaluate the plausibility of the crosstalk network biomarkers, we selected angiogenesis, coagulation, redox status, and the circadian clock due to their systemic influence and investigated the interplay between these biomarkers based on the IDN.

4 Discussion

We presented pathogen-host protein-protein interaction networks (PH-PPINs), which were generated by dynamic interaction models and two-sided microarray data during the innate and adaptive response periods. The dramatic change in the number of PHIs from innate to adaptive immunity (145 interactions in innate immunity, 96 interactions in adaptive immunity, and 36 shared interactions), together with the appearance of almost the same nodes in both innate and adaptive immunity, suggests that the pathogen's strategy is to use almost the same subset of proteins, but with different interactions, to respond to the two different defense mechanisms of the host (i.e., innate and adaptive immunity). On the other hand, although the strategies used by the host were quite subtle, the temporal gene expression profiles show that the expression patterns of the host genes change from innate to adaptive immunity (Fig. 2A). Once the lack of PPI information for zebrafish is remedied, the resultant PH-PPIN can provide further insight into the response strategies of the host and pathogen. For now, we were able to quantitatively investigate the interaction variations from innate to adaptive immunity in the infection process by following the clues provided by the variations in the number of interactions and in the interaction strengths identified in the dynamic interaction models.

To focus our investigation on a smaller and more meaningful range, we utilized the IVS to evaluate the average interaction variation of a protein in the transition from innate to adaptive immunity in the infection process. The IVS rules out the possibility of a large score being caused by many small interaction variations, a weakness of the carcinogenesis relevance value (CRV) of [28]. Hence, the IVS can better focus on proteins with large interaction variations. Further, we visualized the interaction difference matrix from innate to adaptive immunity as an IDN (Fig. 3B), which can be divided into three domains according to the types of interactions involved. For each of the three domains, we focused on the proteins with the ten highest IVSs to investigate their interaction variations and determined the crosstalk network biomarkers. Not surprisingly, several immune-related and pathogenic crosstalk network biomarkers emerged: chemokines, cytokines, the complement system, pathogen expansion, and redox status. In addition, three further crosstalk network biomarkers (circadian clock, angiogenesis, and coagulation) were identified owing to the larger interaction variations of their components. Although these functions are not totally new in immunity research, the crosstalk among these crosstalk network biomarkers is a novel contribution of this study. In particular, the influences of the circadian clock, redox status, angiogenesis, and coagulation are systemic. The samples and sampling time points of the microarray data provided us with an opportunity to gain insight into the mechanisms by which these systemic crosstalk network biomarkers interact. The whole fish body samples provided a holistic view of the systemic variations of the transcriptomes from the innate to the adaptive immune response. The observation windows of the microarray experiments (Fig. 1A) revealed the involvement of the circadian clock in innate and adaptive immunity, which might have been concealed had there not been enough sampling points over several days. Thus, we identified several significant proteins and crosstalk network biomarkers in the three domains based on their larger interaction variations from innate to adaptive immunity and then explored them by taking a closer look at the IDN.
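
The contrast between an averaged score and a summed score can be illustrated with a minimal sketch. The exact IVS formula is not restated in this section, so the code assumes an illustrative form: the mean absolute change in interaction strength over the interactions of a protein that actually changed. The function names, the toy matrices, and the sum-based comparison score are all hypothetical and are not taken from the original analysis.

```python
import numpy as np

def interaction_variation_scores(innate, adaptive):
    """Return (average, total) absolute interaction change per protein (row).

    Assumed form, for illustration only: the average is taken over the
    interactions whose strength differs between the two stages.
    """
    diff = np.asarray(adaptive, dtype=float) - np.asarray(innate, dtype=float)
    abs_diff = np.abs(diff)
    total = abs_diff.sum(axis=1)            # sum-based score
    changed = (abs_diff > 0).sum(axis=1)    # number of changed interactions
    # Proteins with no changed interaction keep a score of 0.
    average = np.divide(total, changed, out=np.zeros_like(total), where=changed > 0)
    return average, total

def top_k(scores, names, k=10):
    """Names of the k proteins with the highest scores."""
    order = np.argsort(scores)[::-1][:k]
    return [names[i] for i in order]

# Toy data: protein "A" has one large change, "B" has four small ones.
names = ["A", "B"]
innate = np.zeros((2, 4))
adaptive = np.array([[2.0, 0.0, 0.0, 0.0],
                     [0.6, 0.6, 0.6, 0.6]])

average, total = interaction_variation_scores(innate, adaptive)
print(top_k(average, names, k=1))  # ranking by the averaged score
print(top_k(total, names, k=1))    # ranking by the summed score
```

In this toy case the summed score favors the protein with many small variations, whereas the averaged score favors the protein with a single large variation, which is the behavior the text attributes to the IVS.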

In summary, our findings underscore the importance of the circadian clock crosstalk network biomarker for the type of immune response generated by an organism [27] and further show how the circadian clock, redox status, angiogenesis, and coagulation crosstalk network biomarkers are tightly coupled with pathogenesis and the host immune system. This provides an opportunity to design new and efficient therapeutic guidelines regarding both drug targets and the time window for treatment.