
13.1 Introduction

Complex diseases comprise a large class of common diseases that originate from interactions among multiple factors such as gene mutations, environmental effects, and personal lifestyle choices [1]. The morbidity, mortality, and recurrence rates of these diseases are currently growing rapidly throughout the world. With the initiation and development of P4 (predictive, preventive, personalized, and participatory) medicine and precision medicine, medical paradigms are constantly shifting. Many traditional methods that focus only on single genes or proteins and that view living organisms as simple systems cannot provide a clear understanding of the essential mechanisms of complex diseases such as cancer, diabetes, and cardiovascular and neuronal diseases [2]. Therefore, systematic theories and approaches need to be developed and translated into clinical practice.

Systems biology is one of the most effective and powerful weapons for fighting complex diseases [3], where it emphasizes the dynamic interactions among biological molecules at different omics levels as well as elucidating their in-depth behaviors or mechanisms from a systematic perspective. These interactions connect biological components to generate complex interacting modules or networks, which have great significance for case studies and clinical applications [4]. More importantly, most of these biological networks tend to have meaningful structural characteristics, which are of great value for discovering potential rules or patterns of occurrence and progression for complex diseases [5, 6].

Among the principles of systems biology, network analysis is now becoming the main approach for investigating biological processes and functions in the field of biomedical informatics [7]. Various holistic concepts such as network biomarkers [8] and network medicine [9], which break the shackles of reductionist viewpoints, offer new methods for exploring the complexities of human diseases as well as helping to address biomedical problems at the systems level. Due to the popularization of next-generation sequencing (NGS), increasing numbers of studies are combining static networks with large-scale dynamic expression data [10], thereby elucidating the changes in diseases at different time points. All of these innovations facilitate the diagnosis and treatment of complex diseases, as well as building a strong bridge between fundamental research and clinical sciences.

13.2 Networks and Graphs

A network is a description and abstraction of real things and their relationships, which can be represented as a graph model with two essential components: a set of vertexes (or nodes), V = {v_1, v_2, …, v_N}, and a set of edges (or lines), E = {e_1, e_2, …, e_M}, between pairs of vertexes. There are many instances of networks such as social networks, traffic networks, and financial networks, but we focus on biological networks.

13.2.1 Classification of Networks

13.2.1.1 Directed and Undirected Networks

Networks can be divided into two types according to whether their edges have a direction: directed networks and undirected networks. In a directed network, an edge (i, j) indicates that a relationship exists from vertex i to vertex j, but not vice versa. Thus, i and j are known as the starting point and ending point, respectively. In an undirected network, two vertexes are bidirectionally reachable when an edge links them. Among the different types of biological networks, a protein-protein interaction (PPI) network is typically an undirected network, whereas microRNA-mRNA regulatory networks are recognized more commonly as directed networks.

13.2.1.2 Weighted or Valued Networks

Network data often contain extra information regarding the extent or strength of each relationship. For example, in gene co-expression networks, correlation coefficients are usually calculated in order to quantify the extent of the interactions among genes [11]. This extra information is referred to as a weight or value in network science. Thus, the two basic network types mentioned above can be extended to four, as shown in Fig. 13.1a–d.
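As a minimal illustration, the four network types can be held in a single adjacency structure. The Python sketch below is not taken from any cited tool; the function name build_adjacency, the default weight of 1 for unweighted edges, and the vertex labels are assumptions made for the example.

```python
def build_adjacency(edges, directed=False):
    """Return {vertex: {neighbor: weight}} from (u, v) or (u, v, weight) tuples."""
    adj = {}
    for edge in edges:
        u, v = edge[0], edge[1]
        # Assumption: an unweighted edge is stored with weight 1.
        w = edge[2] if len(edge) == 3 else 1
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})  # make sure the end vertex is also a key
        if not directed:
            adj[v][u] = w      # undirected edges are stored in both directions
    return adj


# Unweighted undirected network (e.g., a toy PPI network).
ppi = build_adjacency([("P1", "P2"), ("P2", "P3")])

# Weighted directed network (e.g., a toy regulatory edge with a strength value).
regulation = build_adjacency([("miR-a", "gene1", 0.8)], directed=True)
```

Storing weights on every edge keeps one data structure for all four cases, at the cost of carrying a redundant weight for unweighted networks.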

Fig. 13.1

Fundamental types of networks. (a) Unweighted undirected network. (b) Unweighted directed network. (c) Weighted undirected network. (d) Weighted directed network. (e) Bipartite network

13.2.1.3 Bipartite Networks

Given a network N = {V, E}, if the vertex set V can be divided into two independent subsets V_1 and V_2 (V_1 ∪ V_2 = V, V_1 ∩ V_2 = ∅), and all edges connect pairs of vertexes belonging to different subsets, then the network can be represented as a bipartite network (or bipartite graph; see Fig. 13.1e). Bipartite networks are used widely in biological research. For instance, a human gene-disease network is bipartite, where one set of vertexes comprises diseases and the other set comprises genes that are closely related to the linked diseases.
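Whether a given undirected network admits such a division can be checked by two-coloring its vertexes with a breadth-first search. The following Python sketch is one possible way to do this; the function name bipartition and the toy gene and disease labels are hypothetical.

```python
from collections import deque


def bipartition(adj):
    """Return (V1, V2) if the undirected graph is bipartite, otherwise None.

    adj maps each vertex to the set of its neighbors (stored symmetrically).
    """
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]  # neighbors get the opposite subset
                    queue.append(v)
                elif color[v] == color[u]:
                    return None              # an edge inside one subset
    v1 = {v for v, c in color.items() if c == 0}
    v2 = {v for v, c in color.items() if c == 1}
    return v1, v2


gene_disease = {
    "geneA": {"disease1", "disease2"},
    "geneB": {"disease2"},
    "disease1": {"geneA"},
    "disease2": {"geneA", "geneB"},
}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}

parts = bipartition(gene_disease)
not_bipartite = bipartition(triangle)
```

In this toy example the gene-disease network splits into a gene subset and a disease subset, whereas the triangle cannot be split because it contains an odd cycle.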

13.2.2 Centrality of Vertexes

13.2.2.1 Degree Centrality

In undirected networks, the degree of a vertex i (termed k(i)) equals the number of edges that are incident on it. In directed networks, the vertex degree k(i) is partitioned into two parts, the in-degree k_I(i) and the out-degree k_O(i) (mathematically, k(i) = k_I(i) + k_O(i)), which equal the number of vertexes with edges pointing to vertex i and the number of vertexes that vertex i points to, respectively. Degree centrality is the most common property used to measure the importance of a vertex in a network, and it equals the ratio of the actual degree of a given vertex to its theoretical maximum degree. Under this metric, vertexes with larger degrees are regarded as more critical in the network. For example, old genes with significant biological functions often have large degrees, and they lie at the heart of a PPI network [12].
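The definitions above can be sketched in a few lines of Python. The example assumes a simple undirected network without self-loops, so that the theoretical maximum degree is N − 1; the function names and the star-shaped toy network are illustrative only.

```python
def degree_centrality(adj):
    """Degree of each vertex divided by the maximum possible degree, N - 1.

    Assumes a simple undirected network stored as {vertex: set_of_neighbors}.
    """
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def in_out_degree(directed_adj):
    """Return (in_degree, out_degree) for {vertex: set_of_successors}.

    Assumes every vertex appears as a key, even if it has no outgoing edge.
    """
    out_deg = {v: len(nbrs) for v, nbrs in directed_adj.items()}
    in_deg = {v: 0 for v in directed_adj}
    for nbrs in directed_adj.values():
        for v in nbrs:
            in_deg[v] += 1
    return in_deg, out_deg


star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
centrality = degree_centrality(star)

regulatory = {"miR-x": {"gene1", "gene2"}, "gene1": set(), "gene2": set()}
in_deg, out_deg = in_out_degree(regulatory)
```

In the star network the hub reaches the maximum centrality of 1, whereas each leaf has a centrality of 1/3.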

13.2.2.2 Closeness Centrality

This metric indicates how close the given vertex is to all of the other vertexes in the network. In general, the vertex with the highest closeness centrality is located at the optimum position for viewing the information flow. The vertex closeness centrality can be calculated for both nondirectional and directional relations. If we consider an undirected network as an example, the closeness centrality of vertex i (CC(i)) in an undirected network with N vertexes is

$$ CC(i)=\frac{N}{\sum_{j=1}^Nd\left(i,j\right)} $$
(13.1)

where d(i, j) represents the distance from vertex i to j.
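A minimal Python sketch of Eq. 13.1 is given below. It assumes an unweighted, connected, undirected network in which d(i, j) is the number of edges on a shortest path; the helper names are hypothetical. Note that some software packages use N − 1 rather than N in the numerator, so values may differ from those reported by such tools.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj, i):
    """CC(i) = N / sum_j d(i, j), following Eq. 13.1."""
    dist = bfs_distances(adj, i)
    if len(dist) < len(adj):
        raise ValueError("network is not connected; some d(i, j) are undefined")
    total = sum(dist.values())  # d(i, i) = 0 contributes nothing
    if total == 0:
        raise ValueError("closeness is undefined for a single-vertex network")
    return len(adj) / total


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
cc_middle = closeness_centrality(path, "b")  # 3 / (1 + 1)
cc_end = closeness_centrality(path, "a")     # 3 / (1 + 2)
```

The middle vertex of the three-vertex path obtains the larger value, which matches the intuition that it is closest to all other vertexes.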

13.2.2.3 Betweenness Centrality

Interactions between two nonadjacent vertexes in a network can be affected by the actions of other vertexes, especially those that lie between them. Some vertexes are important because the shortest paths along which information flows from vertexes on one side of the network to those on the other must pass through them. Betweenness centrality is the metric used to describe the importance of a given vertex based on the number of shortest paths that pass through it. Vertexes with higher betweenness centrality may have a greater capacity to control the flow of information. In an undirected network, the betweenness centrality of a given vertex i (BC(i)) is

$$ BC(i)={\sum}_{a\ne i\ne b}\frac{m_{ab}^i}{n_{ab}} $$
(13.2)

where n_ab is the number of shortest paths linking vertexes a and b, and m^i_ab is the number of shortest paths linking vertexes a and b that contain vertex i.
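Eq. 13.2 can be evaluated directly by counting shortest paths with breadth-first searches. The Python sketch below is a straightforward (not optimized) illustration for unweighted undirected networks; it assumes that each unordered pair (a, b) is counted once, and the function names are hypothetical. It relies on the fact that a shortest path from a to b passes through i exactly when d(a, i) + d(i, b) = d(a, b).

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adj, source):
    """Return (dist, sigma): distances and numbers of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma


def betweenness_centrality(adj, i):
    """BC(i) = sum over pairs a, b (both different from i) of m^i_ab / n_ab."""
    dist_i, sigma_i = shortest_path_counts(adj, i)
    total = 0.0
    for a, b in combinations([v for v in adj if v != i], 2):
        dist_a, sigma_a = shortest_path_counts(adj, a)
        if b not in dist_a or i not in dist_a or b not in dist_i:
            continue  # a and b (or i) are not connected
        if dist_a[i] + dist_i[b] == dist_a[b]:
            # m^i_ab = (paths a -> i) * (paths i -> b); n_ab = paths a -> b
            total += sigma_a[i] * sigma_i[b] / sigma_a[b]
    return total


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}

bc_path_middle = betweenness_centrality(path, "b")
bc_square = betweenness_centrality(square, "b")
```

In the four-vertex cycle, two shortest paths link a and c, and only one of them contains b, so b receives a betweenness of 0.5. For large networks, a dedicated algorithm such as that of Brandes would be far more efficient than this pairwise enumeration.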

13.2.3 Topological Properties of Networks

13.2.3.1 Degree Distribution

The degree distribution P(k) of a network is equivalent to the fraction of vertexes in the network with degree k. If the network is directional, the distribution should be refined as an in-degree distribution or out-degree distribution. Vertexes in different networks tend to follow different degree distributions, such as the normal distribution (or Gaussian distribution), binomial distribution, and long tail distribution (or scale-free distribution) (see Fig. 13.2a–c).
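For an undirected network, P(k) can be obtained by counting how many vertexes have each degree. The following Python sketch illustrates this; the function name and the toy network are assumptions made for the example.

```python
from collections import Counter


def degree_distribution(adj):
    """Return {k: P(k)}, the fraction of vertexes with degree k."""
    n = len(adj)
    counts = Counter(len(nbrs) for nbrs in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}


star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
pk = degree_distribution(star)
```

In this toy network, three of the four vertexes have degree 1 and one vertex has degree 3, so P(1) = 0.75 and P(3) = 0.25.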

Fig. 13.2

Schematic diagrams of four common degree distributions. (a) Normal distribution. (b) Binomial distribution. (c) Long tail distribution. (d) Power-law distribution

13.2.3.2 Power-Law Distribution

In 1999, Barabasi and his colleagues showed that the in-degree and out-degree distributions of the World Wide Web follow a power law [13]. In network science, networks with this property are often known as scale-free networks. Mathematically, P(k) ~ k^(−γ), where γ is a parameter with a value that usually lies in the range 2 < γ < 3 (see Fig. 13.2d). In fact, many important biological networks are also scale-free. For example, the human PPI network has an approximately scale-free characteristic [14] with a reported degree exponent of 1.49 [12], which indicates that proteins (or genes) with large degrees (i.e., hubs) are few in number but may greatly affect the whole network.
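Because log P(k) is linear in log k under a power law, a rough estimate of γ can be read from the slope of a straight line fitted on log-log axes. The Python sketch below illustrates this idea on synthetic data; it is not the procedure used in the cited studies, the function name is hypothetical, and maximum-likelihood estimators are generally considered more reliable than a least-squares fit of this kind.

```python
import math


def fit_power_law_exponent(pk):
    """Estimate gamma in P(k) ~ k^(-gamma) by least squares on log-log axes.

    pk maps each degree k to P(k); entries with k = 0 or P(k) = 0 are skipped.
    """
    points = [(math.log(k), math.log(p)) for k, p in pk.items() if k > 0 and p > 0]
    if len(points) < 2:
        raise ValueError("need at least two points with k > 0 and P(k) > 0")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx  # the slope is -gamma


# Synthetic, unnormalized distribution with gamma = 2.5 (normalization only
# shifts the line vertically and does not change the slope).
synthetic = {k: k ** -2.5 for k in range(1, 101)}
gamma = fit_power_law_exponent(synthetic)
```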

13.3 Biomolecular Networks and Their Clinical Applications

Biological molecules interact to promote the activity and evolution of living organisms. These interactions contribute to various types of biological networks, where they may influence the significance and complexity of biological processes in many ways.

13.3.1 Protein-Protein Interaction (PPI) Networks

Proteins are the direct products of functional genes, and they are large biological molecules that mediate the functions of living organisms. Accumulating evidence indicates that PPIs are closely associated with biological processes [15, 16], where they play pivotal roles in a large number of cellular behaviors and their abnormal activities may lead to the development of numerous diseases [17].

Protein interactions (the “interactome”) have generally been identified using multiple biological experiments or computational approaches. Due to the development of biological and computational techniques, the volume of PPI data increases every year, and many publicly available databases have been created to store these data, thereby providing valuable information for interactome research.

Table 13.1 lists six manually curated PPI databases. The data in these databases have been verified by experiments or published studies. To fully exploit these data and analyze them at a higher level, Wu et al. integrated their interactions and constructed a PPI network analysis platform (called PINA) for investigating the underlying latent information [18]. The platform was then enhanced in version 2.0 by including interactome modules identified by the global PPI network for six model organisms [19].

Table 13.1 Protein-protein interaction databases

It is widely acknowledged that the functions of biomolecules are affected by their structures [26, 27] and network structures with special characteristics can also indicate the possible mechanisms of interactions among given biomolecules [28]. Many studies have shown that PPI networks tend to be scale-free [29] and proteins/genes with large degrees or centralities (or hubs) may play important roles in relevant biological processes. In addition, PPI networks have been reported as modular [30], so some studies have addressed the substructural analysis of PPI networks. For example, Luo et al. [31] separated five modules related to the initiation of early-onset colorectal cancer using a PPI network based on gene expression data and cluster analysis and then screened five hub genes as key indicators or candidate therapeutic targets for this disease. Gene ontology and pathway enrichment analyses demonstrated the validity of their results. Zanzoni and Brun [32] designed a computational approach that considers both PPI network and stage-based proteomics profiles to identify dysregulated cellular functions during the progression of different cancers. They extracted several functional modules using the OCG algorithm [33] and annotated them based on gene ontology terms and pathway signals. Combined with actual proteomics datasets obtained at different stages of cancers, they selected modules with increasing, decreasing, or stage-specific importance during cancer progression. This study showed that protein modules are functional in different biological processes and that the interactions among them are usually as important as the proteins themselves. To some extent, PPI networks can provide a comprehensive understanding of molecular interactivity rather than single proteins, thereby presenting more opportunities for elucidating the potential mechanisms under different conditions. This could allow great breakthroughs in the diagnosis and treatment of complex diseases.

13.3.2 Gene Co-expression Networks

PPI networks represent the interactions among proteins/genes from a static perspective. However, these interactions might not be exactly the same under different conditions due to the specificity of samples or groups with different backgrounds. In recent decades, due to the rapid development of experimental technologies, the volume of expression profile data generated by high-throughput screening has increased greatly, thereby providing unprecedented opportunities for integrative analyses of clinical diseases based on static networks and dynamic expression information.

Gene co-expression networks combine the similarity of expression among coordinated genes with the topological properties of networks, which can provide a systematic view of the dynamic changes of molecular activities and cellular functions during the evolution of biological processes. Using gene co-expression networks to analyze complex biological phenomena is simple and efficient [34]. Importantly, they are beneficial for building condition- or disease-specific networks, which are useful for elucidating the underlying mechanisms related to the progression of specific diseases [35].

Rotival and Petretto [36] reviewed some well-known computational methods for co-expression network analysis, which can be divided into two categories according to their guiding principles. The first category is based on potential underlying factors whose influence may lead to changes in gene expression. These methods first select the principal factors and their induced genes from a pool of candidate factors based on principal components analysis or nonnegative matrix factorization algorithms [37], before extracting functional modules based on the factor-gene pairs. The second category is largely dependent on graph-based modeling, where vertexes or edges with similar features are clustered into the same modules. As shown in Fig. 13.3, co-expression between genes is usually measured by correlation analysis, such as Pearson’s correlation coefficient (PCC), Spearman’s rank correlation, or Kendall correlation tests, and the functional modules are finally inferred for further research.

Fig. 13.3

Pipeline for graph-based gene co-expression network analysis
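The first steps of the graph-based pipeline (pairwise correlation followed by edge selection) can be sketched as follows. This Python example is only an illustration: the hard threshold of 0.8 on the absolute PCC, the function names, and the toy expression values are assumptions and are not taken from the cited methods, which may use other similarity measures or soft thresholds.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(var_x * var_y)


def coexpression_network(expression, threshold=0.8):
    """Return {(gene1, gene2): PCC} for gene pairs with |PCC| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


# Toy expression profiles across four samples (hypothetical values).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}
edges = coexpression_network(expression)
```

Here geneA, geneB, and geneC become mutually connected (geneC through strong negative correlation), whereas geneD remains isolated. The resulting weighted edges could then be passed to a clustering step to infer modules.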

In the era of biomedical informatics, co-expression network analysis greatly improves the speed and accuracy of disease-associated gene discovery. Zhang et al. [38] constructed a co-expression network using the CODENSE algorithm [39], focused mainly on modules containing the key gene ZAP70, and confirmed that five crucial genes can be used as prognostic markers for chronic lymphocytic leukemia. Yang et al. [40] built gene co-expression networks for four different types of cancers and found that prognostic genes did not tend to occupy hub positions in cancer-specific co-expression networks, but instead were often enriched in modules conserved among different cancer networks. This may be an important insight that could facilitate the identification of cancer prognostic genes in clinics.

13.3.3 MicroRNA-mRNA Regulatory Networks

MicroRNAs (miRNAs) are small noncoding RNAs that comprise approximately 22–24 nucleotides. miRNAs silence gene expression at the posttranscriptional level by base-pairing with their target mRNAs [41]. According to previous studies, miRNAs are involved in a variety of important biological processes, such as cell proliferation, development, apoptosis, and immune responses [42, 43, 44]. In addition, the aberrant expression of miRNAs may cause many serious diseases [45, 46, 47].

The relationships between miRNAs and their targets can be abstracted as a bipartite network (or bipartite graph), which is called a miRNA-mRNA regulatory network. The pairs in the network comprise miRNA-mRNA regulations, which can be determined using experimental and computational methods. Table 13.2 lists several useful databases that store miRNA-mRNA pairs.

Table 13.2 miRNA-mRNA regulatory pair databases

Some well-known tools are also available for miRNA-target prediction. For example, TargetScan [55] infers miRNA targets by matching the seed region of each input miRNA. miRanda [56] is an optimized method that relies only on sequence complementarity and user-specified rules to enhance the accuracy of predicted results. In general, the miRNA-mRNA pairs identified by low-throughput experiments such as real-time PCR are more convincing than those determined using high-throughput techniques such as microarrays or NGS, while the pairs predicted by computational algorithms often have a high false-positive rate. Thus, it is necessary to clean the data before constructing the final network.

miRNAs function in the development of many diseases, and many studies have attempted to discover disease-associated miRNAs based on miRNA-mRNA regulatory networks. One of the most popular approaches is based on the theory that miRNAs may be functionally synergistic so they can co-regulate the expression of their target genes. Bandyopadhyay et al. [57] found that the miRNAs included in a module may have a combinatorial effect on their targets, where those located next to the module appeared to have similar dysregulatory patterns. Based on this observation, several computational frameworks or programs have been developed to identify abnormal miRNAs or miRNA regulatory modules in human diseases [58, 59].

Instead of the synergistic functions of miRNAs, Zhang et al. [5] focused on the substructures of miRNA-mRNA regulatory networks and found evidence that miRNAs can regulate genes independently. They defined a novel bioinformatics model using the NOD (novel out-degree) parameter to quantify the independent regulatory power and employed it to detect key miRNAs in prostate cancer [5, 60], gastric cancer [61], and sepsis [62]. The model was expanded later by considering the biological functions of miRNA targets [6]. Unlike some machine learning-based methods that are highly reliant on the training data, the improved model identified crucial miRNAs without any prior knowledge, and its application to biomarker discovery for pediatric acute myeloid leukemia demonstrated its great predictive power.

Another typical application of miRNA-mRNA networks in clinical research is the approach proposed by Zhao et al. [63], who utilized a network as a bridge to infer cancer-related miRNAs from dysfunctional genes and their enriched pathways. The method is flexible because it can identify cancer-related miRNAs without requiring miRNA expression profiles. All of the studies mentioned above demonstrate the importance of miRNA-mRNA regulatory networks, especially in the field of disease-associated miRNA discovery.

13.3.4 Competing Endogenous RNA (CeRNA) Regulatory Networks

It has been widely reported that miRNAs may repress a large proportion of transcripts and they can act as oncogenes [64] or tumor suppressor genes [65] in many diseases such as cancers. Recent studies have demonstrated that the transcriptome has a large number of components, including protein-coding RNAs (or mRNAs), pseudogene transcripts, and long noncoding RNAs (lncRNAs), which “talk” with each other using miRNA response elements (MREs) as “letters” and compete for binding to the limited pool of shared miRNAs, thereby influencing the regulatory effects of miRNAs on their targets [66]. Salmena et al. [67] formally proposed the ceRNA concept to represent the group of RNAs with these abilities.

The activities of competing endogenous RNAs (ceRNAs) form a large-scale regulatory network at the posttranscriptional level, and thus the traditional paradigm of “miRNA→RNA” has gradually been replaced by “RNA→miRNA→RNA.” As shown in Fig. 13.4, in this new model, miRNAs are often recognized as mediators, where different ceRNAs bind them competitively to promote changes in the expression of the target genes (or mRNAs) mediated by miRNAs.

Fig. 13.4

Schematic diagram of two regulatory paradigms. (a) “miRNA→RNA” paradigm and miRNA regulatory network. (b) “RNA→miRNA→RNA” paradigm and ceRNA regulatory network

In ceRNA regulatory networks, miRNAs can target a large number of co-expressed transcripts, and the expression of one targeted transcript can be affected by changes in the concentration of other transcripts [68]. Multiple RNA transcripts may share one miRNA via MREs in their 3’ untranslated regions. Su et al. [69] found that overexpressed ceRNAs may increase the concentration of specific MREs to change the distribution of miRNAs, thereby leading to increases in the expression levels of their targets.
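A natural first step when assembling a ceRNA network is to find pairs of transcripts that share miRNAs. The Python sketch below shows only this step under simplifying assumptions: the input is a hypothetical mapping from each miRNA to its target transcripts, the function name and the min_shared parameter are invented for the example, and all labels are placeholders. Published ceRNA analyses typically add statistical tests and expression-based evidence, which are omitted here.

```python
from itertools import combinations


def shared_mirna_pairs(targets_by_mirna, min_shared=1):
    """Return {(transcript1, transcript2): shared_miRNAs} for candidate pairs.

    targets_by_mirna maps each miRNA to the set of transcripts it targets.
    """
    mirnas_by_transcript = {}
    for mirna, transcripts in targets_by_mirna.items():
        for transcript in transcripts:
            mirnas_by_transcript.setdefault(transcript, set()).add(mirna)

    pairs = {}
    for t1, t2 in combinations(sorted(mirnas_by_transcript), 2):
        shared = mirnas_by_transcript[t1] & mirnas_by_transcript[t2]
        if len(shared) >= min_shared:
            pairs[(t1, t2)] = shared
    return pairs


# Placeholder miRNA-target relationships (not real data).
targets_by_mirna = {
    "miR-1": {"mRNA_1", "lncRNA_1"},
    "miR-2": {"mRNA_1", "lncRNA_1", "pseudogene_1"},
    "miR-3": {"mRNA_2"},
}
candidates = shared_mirna_pairs(targets_by_mirna, min_shared=2)
```

With min_shared set to 2, only the pair formed by lncRNA_1 and mRNA_1 is retained, because these two transcripts share both miR-1 and miR-2.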

In recent years, studies have demonstrated that the initiation and progression of cancer are closely related to the dysregulation of ceRNA networks. For example, Sumazin et al. [66] discovered a miRNA-mediated network with more than 248,000 interactions, and they showed that the network regulated various established genes and oncogenic pathways with close relationships to the initiation and development of glioblastoma. Tay et al. [70] confirmed that the ceRNA regulatory network was functional for protein-coding RNAs, and tests based on the tumor suppressor gene PTEN showed that the expression patterns of protein-coding RNA transcripts were consistent with PTEN. Overall, it was concluded that ceRNAs and their networks may play crucial roles in disease development processes.

Understanding the competition mechanisms of ceRNAs may provide great insights into the pathogenesis of specific diseases. For instance, Zhou et al. [71] constructed a breast cancer-specific ceRNA regulatory network by combining miRNA-mRNA relationships with miRNA and mRNA expression datasets from patients with breast cancer, where they found that the network also tended to follow a power-law. Moreover, functional analysis indicated that the hub genes and dense clusters were strongly linked to cancer hallmarks, which proved valuable for risk assessments in breast cancer. Thus, ceRNA regulatory network-based analyses may inspire new approaches to both fundamental and clinical studies of complex diseases.

13.3.5 Others

Due to the complexity of disease progression, other biological networks such as drug-target interaction networks [72], metabolic networks [73], and epigenetic networks [74] may also have important functions during the occurrence and development of diseases. Due to space limitations, these networks are not discussed here, and readers are referred to the cited references for further details.

13.4 Network Biomarkers in Complex Diseases

Biological markers, also known as biomarkers, are unique molecules that can indicate changes or potential changes in biological conditions from normality to abnormality in living organisms [75]. Clinically, biomarkers with high sensitivity and specificity could serve as powerful indicators for disease diagnosis and prognosis. Instead of using single biomarkers, network-based biomarkers are now becoming more popular because they can help to investigate the overall behaviors of biological molecules and they may reflect the system-level states of diseases.

13.4.1 Single Molecular Biomarkers and Network Biomarkers

Many studies have shown that single biological molecules can be effective biomarkers for both the diagnosis and prognosis of human diseases. For instance, the protein prostate-specific antigen is widely used for the early detection of prostate cancer [76]. The BRCA1 and BRCA2 genes can also be useful markers for breast cancer [77]. In addition, some noncoding RNAs such as miRNAs may also have diagnostic or prognostic roles in many complex diseases [78, 79].

The traditional methods used to detect candidate biomarkers rely mainly on biological experiments. Most begin by identifying differentially expressed or deregulated molecules based on large-scale expression profiling data, before validating the selected candidates in low-throughput experiments [80]. Considering the limited availability of samples and time-consuming pipelines, several computational approaches have been developed to improve the efficiency of biomarker signature discovery [81].

Single molecules may be dysfunctional in many cellular processes, but they are still not sufficiently powerful to explore the underlying mechanisms of certain diseases due to the diversity and complexity of disease development. In fact, complex diseases are usually due to interactions among multiple factors rather than the breakdown of single molecules. Moreover, single biomarkers identified in samples from patients with similar diseases by different methods tend to exhibit high heterogeneity [82]. Complex diseases should be considered more as disorders in a system; therefore, the concept of network biomarkers has been proposed, and novel strategies have been developed for explaining genetic or epigenetic changes across diseases.

There are two main types of network biomarkers: static network biomarkers (SNBs) and dynamic network biomarkers (DNBs). As shown in Fig. 13.5, the former integrate the interactions, annotations, and pathway signals of molecules by focusing only on the static nature of networks, whereas the latter consider the states of a disease at different time points, which is useful for monitoring the progression of diseases [83].

Fig. 13.5

Three different types of biomarkers: single molecular biomarkers, static network biomarkers (SNBs), and dynamic network biomarkers (DNBs)

13.4.2 Static Network Biomarkers (SNBs)

Complex diseases are always caused by system-level disorders in living organisms. Thus, network biomarkers are more useful for explaining the pathogenesis of diseases than single molecular markers. Improvements in experimental techniques and theories of informatics mean that more interactions among biological molecules have been elucidated as well as their annotations and signal transduction pathways, thereby providing static information for exploring diseases within a systems biology framework and helping to translate theoretical analyses into clinical research.

As a solid bridge between the genotype and phenotype, proteins are vital biological molecules with significant roles in the occurrence and evolution of diseases. Thus, many studies have focused on protein-based network biomarkers, and they are valuable for validating mechanistic hypotheses related to the progression of diseases. The main pipeline is shown in Fig. 13.6. First, disease-associated proteins/genes are selected by analyzing experimental data or other publications, which are then mapped onto the reference PPI network where the knowledge-based PPIs are integrated. Thus, a disease-specific PPI network is constructed. Second, subnetworks of candidate biomarkers are scored and identified from the disease-specific network according to their actual expression levels, existing knowledge, or the topological properties of the network. Finally, in vitro experiments or machine learning methods such as support vector machines (SVMs) [84] or artificial neural networks [85] can be used to validate the results and to perform further research.

Fig. 13.6

Pipeline for protein-based network biomarker discovery
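The mapping step of this pipeline can be illustrated with a small Python sketch. It assumes that the reference PPI network is stored as a symmetric adjacency mapping and that the disease-specific network is simply the subnetwork induced by the disease-associated genes; ranking by degree is only one of the possible scoring schemes mentioned above. The function names and gene labels are hypothetical.

```python
def disease_specific_network(reference_ppi, disease_genes):
    """Induce a subnetwork of the reference PPI network on the disease genes.

    reference_ppi maps each protein/gene to the set of its interaction partners.
    Disease genes that are absent from the reference network are dropped.
    """
    genes = set(disease_genes)
    return {
        g: {partner for partner in reference_ppi[g] if partner in genes}
        for g in genes
        if g in reference_ppi
    }


def rank_by_degree(network):
    """Order genes by decreasing degree (ties broken alphabetically)."""
    return sorted(network, key=lambda g: (-len(network[g]), g))


# Toy reference network with edges A-B, A-C, A-D, B-C, and D-E.
reference_ppi = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
disease_net = disease_specific_network(reference_ppi, ["A", "B", "C", "E", "X"])
ranking = rank_by_degree(disease_net)
```

In this toy example, gene X is discarded because it is not in the reference network, and gene E is retained as an isolated vertex because its only partner, D, is not disease-associated. Candidate subnetworks would subsequently be scored and validated as described in the pipeline.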

To highlight the carcinogenic mechanisms of lung cancer, Wang and Chen [86] constructed a biomarker network based on microarray data and PPIs. Using the network, they identified 40 proteins that had significant associations with lung carcinogenesis, and they found that three-quarters of the total (30/40) had annotations related to cell growth. The biomarker network had the potential to diagnose smokers with signs of lung cancer, and it could also indicate effective therapeutic targets for fighting cancer.

In addition to disease diagnosis, biomarker networks are capable of distinguishing metastatic and non-metastatic tumors. Chuang et al. [87] combined breast cancer metastatic and non-metastatic data with a PPI network using the “subnetwork activity matrix” and greedy algorithm to prioritize high-ranked subnetworks as candidate biomarkers. They found that genes in these biomarker networks were enriched for the hallmarks of cancer, and the results of SVM classification showed that these network biomarkers were highly accurate in separating metastatic and non-metastatic breast tumors, which may have significant utility for tumor progression investigations.

In addition to protein-based biomarker networks, noncoding RNAs are essential during the disease development process. It is obvious that interactions among these RNAs and their targets or regulators can form functional or even biomarker networks. Lu et al. [88] built miRNA biomarker networks containing miRNA targets and relevant transcription factors and applied them to the diagnosis of gastric cancer. Cui et al. [89] identified three lncRNA co-expression modules connected with prostate cancer, one of which may be recognized as a module biomarker for prostate cancer diagnosis.

13.4.3 Dynamic Network Biomarkers (DNBs)

Traditional molecular biomarkers and network biomarkers can only distinguish between diseases in two stable states. This static information limits their capacity to detect certain pre-disease states. However, pre-disease states may reflect crucial signs of disease progression, and they could be key indicators for early diagnosis and the prevention of diseases.

The novel DNB concept was proposed to overcome these limitations and to elucidate more dynamic changes in diseases. Based on complex network theory and nonlinear dynamical theory, DNBs can evaluate the stages of diseases at different time points and represent molecules and their relations in a three-dimensional image, as well as facilitating the discovery of stage-specific or personalized biomarkers in the era of biomedical informatics [83].

Chen et al. [90, 91] partitioned the process of disease development into three stages: normal, pre-disease, and disease. The normal stage is stable, and it represents the state of health or early disease. In this stage, changes are usually gradual. The pre-disease stage indicates the state immediately before critical changes have been reached. Molecules in living systems undergo dramatic transitions during this stage until another stable stage (the state of disease or advanced disease) occurs. The pre-disease stage is crucial because it may provide latent signals of disease progression, which could be pivotal markers for the early diagnosis of disease.

To quantify signals and detect DNBs during system-level transitions, a composite index (CI) is defined as follows [90]:

$$ CI=\frac{\mathrm{SD}_d\times \mathrm{PCC}_d}{\mathrm{PCC}_o} $$
(13.3)

where SD_d is the average standard deviation (SD) of the DNB molecules (molecules in the DNB), PCC_d represents the average PCC among DNB molecules as absolute values, and PCC_o represents the average PCC between DNB molecules and other molecules as absolute values. In fact, the DNB comprises a group of molecules in the system, which can provide significant information about the changes at critical points of the pre-disease stage. These molecules are functional compared with other non-DNB molecules in the same system. The expression of these molecules is identified mainly using experimental data, especially those obtained from high-throughput omic experiments.
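A minimal Python sketch of the composite index in Eq. 13.3 is shown below. It assumes that expression values are given per molecule across the samples of one time point or stage, that the sample standard deviation is used, and that at least two DNB molecules and one non-DNB molecule are present; the function names and the toy values are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def abs_pcc(x, y):
    """Absolute value of Pearson's correlation coefficient of two profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov / math.sqrt(var_x * var_y))


def composite_index(expression, dnb_members):
    """CI = SD_d * PCC_d / PCC_o for a candidate group of DNB molecules."""
    dnb = sorted(g for g in expression if g in dnb_members)
    others = sorted(g for g in expression if g not in dnb_members)
    if len(dnb) < 2 or not others:
        raise ValueError("need at least two DNB molecules and one other molecule")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sd_d = mean(stdev(expression[g]) for g in dnb)
    pcc_d = mean(abs_pcc(expression[a], expression[b]) for a, b in combinations(dnb, 2))
    pcc_o = mean(abs_pcc(expression[a], expression[b]) for a in dnb for b in others)
    return sd_d * pcc_d / pcc_o  # fails if PCC_o is exactly zero


# Toy expression values across four samples (hypothetical).
expression = {
    "g1": [0.0, 2.0, 0.0, 2.0],
    "g2": [0.0, 4.0, 0.0, 4.0],
    "o1": [1.0, 2.0, 3.0, 4.0],
}
ci = composite_index(expression, {"g1", "g2"})
```

For these toy values, SD_d = √3, PCC_d = 1, and PCC_o = 1/√5, so CI = √15 ≈ 3.87. In practice, the index would be tracked across time points, and a sharp increase would be interpreted as a warning signal of the pre-disease stage.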

The theory of DNBs has been employed to detect early-warning signs for both type 1 and type 2 diabetes, especially for recognizing the key points at which the state reverses. In a study of type 1 diabetes [92], two DNBs were built to predict sudden changes during the progressive disease deterioration. Previous studies and functional analyses demonstrated that these two DNBs are highly relevant to type 1 diabetes and they may be useful for its early diagnosis. Based on this study, tissue-specific DNBs were constructed for type 2 diabetes mellitus, and two significant states were identified that had strong associations with severe inflammation and insulin resistance [90]. The genes in the DNBs were shown to be dysfunctional at the point of disease deterioration according to a cross-tissue analysis. Importantly, they were mostly located upstream of the signaling pathways, and they acted as leaders during the transcriptional processes. These results demonstrate that DNBs can be predictors of the occurrence of disease, as well as transducers that may facilitate a better understanding of the molecular mechanisms of disease development.

13.4.4 Evolution of Network Biomarkers During Disease Progression

Network biomarkers are system-level molecular modules that are helpful for investigating the evolutionary mechanism of disease progression. Wong et al. [93] constructed two PPI-based network biomarkers for the early and late stages of bladder cancer. First, they downloaded microarray data for the two stages of bladder cancer and for normal samples from the Gene Expression Omnibus repository, before constructing two different networks for the two stages of bladder cancer using statistical methods. Second, proteins/genes with significant carcinogenesis relevance values were extracted together with their network structures. The activities of these proteins tended to differ markedly between normal and disease samples, and these changes may be essential in bladder cancer carcinogenesis. The results obtained by pathway enrichment analysis showed that proteins in the biomarker network for early-stage bladder cancer were significantly more enriched in pathways related to ordinary cancer mechanisms such as the cell cycle, pathways in cancer, and the Wnt signaling pathway, and these proteins may also be functional in other cancers such as prostate cancer, chronic myeloid leukemia, and small cell lung cancer. By contrast, the ribosome and spliceosome pathways were the top two pathways targeted by the biomarker network for late-stage bladder cancer. Thus, during the evolution of bladder cancer, proteins and their interactions change gradually, but ultimately there is a shift in the enriched pathways from universal to specific types.
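The text does not state which statistical test underlies the pathway enrichment analysis in [93]. One commonly used option is the over-representation (hypergeometric) test, which is sketched below purely as an assumption; the function name and the numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(background, pathway, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of genes in the background set
    pathway:    number of background genes annotated to the pathway
    selected:   number of genes in the network biomarker
    overlap:    number of biomarker genes that belong to the pathway
    """
    total = comb(background, selected)
    upper = min(pathway, selected)
    # Sum the probabilities of observing `overlap` or more pathway genes.
    tail = sum(
        comb(pathway, i) * comb(background - pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20 background genes, 5 in the pathway,
# 5 biomarker genes, 4 of which fall in the pathway.
p_value = enrichment_p_value(background=20, pathway=5, selected=5, overlap=4)
print(p_value)
```

A small p-value indicates that the biomarker network contains more genes from the pathway than would be expected by chance; in practice, the p-values for all tested pathways would also need correction for multiple testing.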

Meaningful evolutionary patterns were also discovered in a study of hepatocellular carcinoma (HCC) [94], where Wong et al. analyzed the evolution of network biomarkers from the early to late stages of HCC using a framework analogous to that employed for bladder cancer research. However, NGS datasets were used in this study. They found that the common pathways enriched for network biomarkers in both the early and late stages of HCC were associated with the ordinary mechanisms of cancers, where the spliceosome pathway was prominent in the late stages of both hepatocellular and bladder cancer.

Both of these studies provide new insights into disease-targeted therapies at different stages or time points, and they merit further clinical research.

13.5 Network Medicine in Clinical Practice

The Human Genome Project shifted genome-wide studies from isolated genes or proteins to the networks of interconnections among them. The traditional methods for disease diagnosis and drug discovery are symptom-based or molecule-based. However, the occurrence of diseases is rarely a consequence of the disorder of single molecules, and different diseases are likely to share similar symptoms. Thus, the concept of network medicine, which emphasizes treating disease progression at the systems level, may provide new directions for disease analysis and therapeutics.

13.5.1 Paradigm of Network Medicine

The pattern for disease classification and drug discovery has changed greatly due to the continuous deepening of biomedical ideas and techniques. During the early period, diseases were often simply classified based on knowledge of clinical symptoms. However, this method is inaccurate, and it may miss opportunities for disease prevention due to its low sensitivity and specificity. Clearly, symptoms may be totally absent during the early stage of a disease, and most ordinary symptoms are not specific to a certain disease [95].

The emergence and development of genomic research has provided various types of molecular data, which facilitate investigations of the underlying mechanisms of disease progression. Therefore, the disease analysis paradigm has gradually shifted from studies of outward manifestations to internal mechanisms. For example, complex diseases can be caused by multiple changes in biomolecules, such as DNA methylation, single nucleotide polymorphisms, and DNA copy number variations. Similar disease symptoms may be apparent, but the treatments will be quite different according to the differences in pathogenesis. Therefore, molecule-based methods are more beneficial for the personalized and precise treatment of diseases [96].

Recently, many analyses have shown that complex diseases are multigenic, resulting from the synergistic actions of genetic and environmental factors. Simple molecule-based methods focus only on biological molecules that act as key players in the system. However, these single components are not sufficient to create system-level disruption. Instead, network medicine treats disease diagnosis and therapeutics from a global perspective by linking the potential factors that are relevant to disease occurrence and development to form an organic network, thereby identifying reasonable therapeutic strategies at specific time points according to both the static and dynamic properties of the network. The pathogenic behavior of complex interactions among molecules can be uncovered at various omics levels using this systemic approach, and effective drugs may be obtained to reach the goal of precision medicine [97].

13.5.2 Foundations and Resources

Network medicine is based on a series of hypotheses that are widely acknowledged by researchers. However, the theory continues to improve due to the development of systems biology and network science. The main focus is on linking network structures and disease occurrence. Thus, the topological structures of biological networks might potentially reflect the roles of specific molecules during disease initiation and progression. In particular, evidence has shown that essential proteins/genes often lie at the heart of a PPI network, whereas nonessential disease proteins are not found in these central locations. This is quite similar to a social network where important people or leaders are usually hub nodes who can control the information flow. In addition, proteins appear to cooperate with each other, especially those involved in the same diseases. Many studies have shown that proteins often participate in biological processes in the form of modules, which highlights the existence of synergistic mechanisms. Moreover, cells that exist in a microenvironment of diseases with similar phenotypes tend to have common disease-associated components. This may help to explain why comorbidities usually occur. Finally, the causal molecular pathways are parsimonious, and they often form the shortest paths between known components of diseases [95].
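Two of these hypotheses, namely that important proteins tend to be hub nodes and that causal pathways tend to follow the shortest paths between known disease components, can be illustrated on a small graph. The sketch below uses a hypothetical toy PPI network with placeholder node names; it is only meant to show the two topological quantities involved, not any specific published method.

```python
from collections import deque


def degrees(graph):
    """Number of interaction partners of each node in an undirected graph."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path as a list of nodes."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None  # target is not reachable from source


# Hypothetical toy PPI network stored as an adjacency dict of sets.
toy_ppi = {
    "A": {"B", "C", "D", "E"},
    "B": {"A", "F"},
    "C": {"A"},
    "D": {"A", "G"},
    "E": {"A"},
    "F": {"B", "G"},
    "G": {"D", "F"},
}

node_degrees = degrees(toy_ppi)
hub = max(node_degrees, key=node_degrees.get)   # node with the most partners
path = shortest_path(toy_ppi, "C", "G")         # candidate pathway between two disease nodes
print(hub, path)
```

Here node A is the hub, and the shortest path between the two hypothetical disease components C and G passes through A and D, which would make these nodes candidates for the connecting pathway under the parsimony hypothesis.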

The essential resources for network medicine study are suitable data or datasets. It is obvious that sufficient data can drive research to become more precise and specific due to differences between omics levels, disease stages, and even individuals or groups. Chen and Butte [96] summarized eight publicly available data sources for network medicine, which offer great opportunities for disease analysis and drug discovery. Furthermore, databases such as HMDD [98] and DriverDBv2 [99] aim to represent the relationships between biomolecules and diseases, thereby providing great insights into the pathogenic nature of diseases. Details of these databases are listed in Table 13.3.

Table 13.3 Publicly available data sources for network medicine study

Bioinformatics approaches perform well at mining functional molecules or molecular modules for disease diagnosis and treatment. The most remarkable achievement is the discovery and application of network biomarkers. As described in Sect. 13.4, network biomarkers indicate dysfunctional modules during disease progression, and they facilitate the development of macroscopic explanations of disease initiation. They are indispensable components of network medicine because they can provide important signals that are sensitive and specific for both disease research and drug design.

13.5.3 Research Significance and Practical Challenges

Network medicine combines systematic thinking with clinical sciences, and by utilizing network theory as a mediator, it is poised to promote the understanding of disease pathogenesis and to forecast disease development trajectories or tendencies. It focuses on predicting the key players in disease progression, with the aim of providing better therapeutic strategies for patients [100].

The development of ideas is always accompanied by opportunities and challenges, and network medicine is no exception. The volume of data available for network medicine studies is huge, but the portion with practical value may be limited. Furthermore, the structure of the data is inconsistent, especially for clinical data, which are rooted in different schemas and ontologies [96]. Thus, common criteria should be established for data representation; otherwise, data integration and further analysis may be hindered. Due to the complexity of biological mechanisms, networks should also be more specific. It has been reported that response networks for the same drug tend to exhibit distinct heterogeneity in different cell lines. The components and activities of real living organisms are more complex than those in computational models because the modeled networks or functions are generally not condition-specific and they fail to consider the effects of the external environment. Thus, effective methods and tools should be developed for constructing models across different omics levels in the era of big data, as well as for aiding the discovery of personalized therapeutics for different populations with different diseases under the guidance of precision medicine.

13.6 Conclusions

Analyzing complex biological problems within a network framework facilitates deeper investigations of the behaviors of biomolecules and their interconnections. The application of network biomarkers and network medicine may accelerate the understanding of disease pathogenesis, as well as promoting the translation of fundamental research into clinical practice in the era of biomedical informatics. However, the human system is far more complex than the networks used to emulate it, where even the size or shape of cells may affect their biological functions. Thus, cross-scale analyses and dynamic simulations are urgently needed in the future.