G-Network Modelling Based Abnormal Pathway Detection in Gene Regulatory Networks

Kim, Haseong; Atalay, Rengul; Gelenbe, Erol

doi:10.1007/978-1-4471-2155-8_32


Abstract

Studies of gene regulatory networks that center on gene expression can provide insight into the dynamics of pathway activities that depend on changes in environmental conditions. We therefore propose a new pathway analysis approach, based on G-network theory, to detect pathways that behave differently in abnormal conditions. In this approach, the parameters of the gene regulatory network model are estimated from normal and abnormal samples using optimization techniques with corresponding constraints. We show that in a “p53 network” application, the proposed method effectively detects anomalously activated or inactivated pathways related to the MDM2, ATM/ATR and RB1 genes, which could not be observed in previous analyses of normal and abnormal gene regulatory network behaviour.


1 Introduction

One of the fundamental problems of biology is to understand complex gene regulatory networks (GRNs), and various mathematical and statistical models have been introduced for the inference of GRNs [1]. Based on such networks, over-represented biological processes or pathways of a group of genes are identified by mapping them onto gene ontology (GO) terms or regulatory structures [2]. These pathway analyses provide annotations of, and functional insight into, a group of genes that is usually selected by conventional statistical tests such as the t-test. However, analyses derived from differentially expressed genes (DEGs) are limited in detecting defective pathways, since they only observe the expression level of each gene itself rather than the flows of expression signals between neighboring genes.

Here we aim to detect the abnormal pathways of GRNs by modelling them using G-networks [3], a probabilistic queueing model of a system with special agents such as positive and negative customers, signals and triggers. In contrast to conventional queueing networks, the negative customers of G-networks can describe the inhibitory effects in GRNs [4, 5]. G-networks have a product-form solution, which enables us to handle the dynamics of complex GRNs without heavy computation. The parameters of the modelled GRN are inferred from normal samples with assumed transition probabilities of gene expression signals. Then the transition probabilities for samples under abnormal conditions are estimated by minimizing the difference between the observed and predicted steady-state probabilities, subject to constraints. Finally, permutation tests are performed to determine the statistical significance of the estimated transition probabilities.

2 G-Networks for Gene Regulatory Networks

Following [4] consider the notion of a “packet” that contains the gene expression signals, and a network node that represents a gene consisting of a queue where its packets are stored and a server where the packets’ fates are determined. Let $\lambda^{+}_{i}$ and $\lambda^{-}_{i}$ be the positive and negative packet input rates to the ith node, respectively. $\mu_{i}$ is the packet firing rate (service rate) of the ith node. Furthermore we define ${\mathbf x}=\{x_1,...,x_n\}$ a non-negative integer n-vector with ${\mathbf x}^{+}_{i}=\{x_1, ..., x_i+1, ..., x_n\},\;{{\mathbf x}}^{-}_{i}=\{x_1, ..., x_i-1, ..., x_n\},$ and ${{\mathbf x}}^{+-}_{ij}=\{x_1, ..., x_i+1, x_j-1, ..., x_n\}.$ Let $p_{ij}^+$ and $p_{ij}^-$ be the transition probabilities for packet motion from the ith node to the jth node as a positive and a negative packet, respectively. Note that a negative packet has the effect of disappearing after it destroys one packet of the target node, or it disappears also if it does not find a positive packet to destroy. Lastly, $d_i$ denotes the probability that a packet leaves the system so that $\sum_{j=1}^n (p_{ij}^{+} + p_{ij}^{-}) + d_{i}=1.$

Consider now a random process ${{\mathbf X}}(t)=\{X_1(t),...,X_n(t)\}$ where $X_i(t)$ is an integer-valued random variable representing the number of packets in the ith node at time $t \geq 0.$ If $Pr({\mathbf x},t)$ is the probability that ${\mathbf X}(t)$ takes the value ${\mathbf x}$ at time t, then the G-network equations are:

$$ \begin{aligned} Pr({\mathbf x},t+\Delta t) & = \sum^n_{i=1} \bigg[ (\lambda^{+}_{i} \Delta t + o(\Delta t)) Pr({\mathbf x}^-_i,t)I(x_i>0) + (\lambda^{-}_{i} \Delta t + o(\Delta t)) Pr({\mathbf x}^+_i, t) \\ & \quad+ \sum^n_{j=1}\Big\{(p^+_{ij}\mu_i \Delta t + o(\Delta t)) Pr({\mathbf x}^{+-}_{ij}, t) I(x_j>0) \\ & \quad+ (p^-_{ij} \mu_i \Delta t + o(\Delta t))Pr({\mathbf x}^{++}_{ij}, t) + (p^-_{ij} \mu_i \Delta t + o(\Delta t))Pr({\mathbf x}^{+}_{i}, t) I(x_j=0) \\ & \quad + \sum^n_{l=1} \big( (p_{ijl} \mu_i \Delta t + o(\Delta t)) Pr({\mathbf x}^{++-}_{ijl},t) + (p_{jil} \mu_j \Delta t + o(\Delta t)) Pr({\mathbf x}^{++-}_{ijl}, t) \big) I(x_l > 0) \Big \} \\ & \quad+ (d_i \mu_i \Delta t + o(\Delta t)) Pr({\mathbf x}^+_i, t) + (1- (\lambda^{+}_i + \lambda^{-}_i + \mu_i) \Delta t + o(\Delta t)) Pr({\mathbf x},t) \bigg] \\ \end{aligned} $$

(1)

where I(C) is 1 if C is true and 0 otherwise, and $o(\Delta t)/\Delta t \rightarrow 0$ as $\Delta t \rightarrow 0.$ The complete equilibrium solution of (1) was given in [4]. Let $q_i$ be the steady-state probability that the ith gene is activated:

$$ q_i= \min \left[1, {\frac{\lambda_{i}^+ + \Lambda_i^+}{\mu_i + \lambda_{i}^- + \Lambda_i^-}} \right] $$

(2)

with

$$ \Lambda_i^+ = \sum^n_{j=1} q_j \mu_j p^+_{ji} + \sum^n_{j, l=1, l \neq j} q_j q_l \mu_j p_{jli}\; \hbox{and}\; \Lambda_i^- = \sum^n_{j=1} q_j \mu_j p^-_{ji} + \sum^n_{j, l=1, l \neq j} q_l \mu_l p_{lij} $$

Then the steady-state joint probability that the ith node holds $x_i$ packets, for each of the n nodes, is:

$$ \lim_{t \rightarrow \infty} Pr (X_1=x_1,...,X_i=x_i,...,X_n=x_n; t) = \prod_{i=1}^n q_i^{x_i}( 1- q_i) $$

(3)
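As an illustration of (2) and (3), the following is a minimal sketch that solves the fixed point of (2) by simple iteration and then evaluates the product form (3). It is not the authors' implementation: it assumes only the pairwise terms of $\Lambda_i^+$ and $\Lambda_i^-$ (the three-index terms are dropped), and the function names `solve_q` and `product_form` and the two-gene toy network are hypothetical.

```python
from math import prod


def solve_q(lam_pos, lam_neg, mu, p_pos, p_neg, iters=500, tol=1e-12):
    """Iterate equation (2), keeping only the pairwise terms (an assumption).

    p_pos[j][i] / p_neg[j][i]: probability that a packet leaving node j
    arrives at node i as a positive / negative packet.
    """
    n = len(mu)
    q = [0.5] * n  # arbitrary starting point
    for _ in range(iters):
        new_q = []
        for i in range(n):
            big_pos = sum(q[j] * mu[j] * p_pos[j][i] for j in range(n))
            big_neg = sum(q[j] * mu[j] * p_neg[j][i] for j in range(n))
            ratio = (lam_pos[i] + big_pos) / (mu[i] + lam_neg[i] + big_neg)
            new_q.append(min(1.0, ratio))
        if max(abs(a - b) for a, b in zip(q, new_q)) < tol:
            return new_q
        q = new_q
    return q


def product_form(q, x):
    """Equation (3): joint steady-state probability of the packet counts x."""
    return prod(qi ** xi * (1.0 - qi) for qi, xi in zip(q, x))


# Toy network: gene 0 activates gene 1, gene 1 inhibits gene 0.
q = solve_q(
    lam_pos=[0.0062, 0.0062],
    lam_neg=[0.002, 0.002],
    mu=[0.01, 0.01],
    p_pos=[[0.0, 0.5], [0.0, 0.0]],
    p_neg=[[0.0, 0.0], [0.5, 0.0]],
)
print(q, product_form(q, [1, 2]))
```

Plain iteration is used here only because it is short; it is not guaranteed to converge for every network, so a damped update or a general root finder may be preferable in practice.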

3 Abnormal Edge Detection

The packets in the G-network represent latent objects containing the gene expression signal, and we assume that the number of packets is proportional to the mRNA expression level, which is the quantity actually observed. We also assume that the mRNA levels are observations of the steady state. If we denote by $a_i$ the average mRNA level (average queue length) of the ith gene, then (3) gives $a_i=q_i/(1-q_i),$ and therefore the steady-state probability that there is at least one mRNA of the ith gene is $q_i={\frac{a_i}{a_i + 1}}.$
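The mapping between the average level $a_i$ and the activation probability $q_i$ can be written as a pair of helper functions. This is only a small sketch; the function names are hypothetical.

```python
def activation_probability(mean_level):
    """q = a / (a + 1) for an average packet (mRNA) level a >= 0."""
    if mean_level < 0:
        raise ValueError("mean_level must be non-negative")
    return mean_level / (mean_level + 1.0)


def mean_level(q):
    """Inverse map a = q / (1 - q), valid for 0 <= q < 1."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must lie in [0, 1)")
    return q / (1.0 - q)


print(activation_probability(3.0))  # 0.75
```

For example, an average level of 3 corresponds to $q_i = 0.75$.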

To determine the G-network parameters under normal conditions we use (2), which involves four sets of unknown parameters: the transition probabilities $p_{ji} = \{p_{ji}^+, p_{ji}^-, p_{jli}, p_{lij} \},$ and the rates $\lambda_i^{+}, \;\lambda_i^{-},$ and $\mu_i.$ We initially set $p_{ji}=1/(n_j^{out}+1)$ where $n_j^{out}$ is the out-degree of gene j. The input rates are initialised to $\lambda_i^+ = 0.0062\,\text{sec}^{-1}$ and $\lambda_i^- = 0.002\,\text{sec}^{-1}$ [6], and the packet output rate is set to $\mu_i=c \cdot n_i^{out}$ where c is a scaling constant. From (2) we have $q_i=f_i(\lambda^+_i, \lambda^-_i, \mu_i |{\mathbf q}, p_{ji})$ where ${\mathbf q}=(q_1,..,q_n).$ Then c can be found by minimizing the following squared error given the initial values of $\lambda^+_i$ and $\lambda^-_i$:

$$ \tilde{c} = \arg\min_{c} \sum_i (q_i-f_i( c | {\mathbf q}, p_{ji}, \lambda^+_i, \lambda^-_i))^2 $$

(4)
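A minimal sketch of the minimization in (4) follows. The text does not state which optimizer is used, so this example substitutes a coarse logarithmic grid search followed by golden-section refinement. It also assumes that only the pairwise terms of (2) are present, and the names `predicted_q`, `sse` and `estimate_c`, the search bounds, and the toy network are hypothetical.

```python
import math

LAM_POS, LAM_NEG = 0.0062, 0.002  # initial input rates (sec^-1) from the text


def predicted_q(c, q, n_out, p_pos, p_neg):
    """f_i(c | q, p, lambda+, lambda-) from (2), pairwise terms only."""
    n = len(q)
    mu = [c * k for k in n_out]  # mu_i = c * out-degree of gene i
    out = []
    for i in range(n):
        big_pos = sum(q[j] * mu[j] * p_pos[j][i] for j in range(n))
        big_neg = sum(q[j] * mu[j] * p_neg[j][i] for j in range(n))
        out.append(min(1.0, (LAM_POS + big_pos) / (mu[i] + LAM_NEG + big_neg)))
    return out


def sse(c, q, n_out, p_pos, p_neg):
    """Objective of (4): squared error between observed and predicted q."""
    f = predicted_q(c, q, n_out, p_pos, p_neg)
    return sum((qi - fi) ** 2 for qi, fi in zip(q, f))


def estimate_c(q, n_out, p_pos, p_neg, lo=1e-5, hi=1.0, grid=200, iters=80):
    """Coarse log-grid search, then golden-section refinement around the best point."""
    cs = [lo * (hi / lo) ** (k / (grid - 1)) for k in range(grid)]
    errs = [sse(c, q, n_out, p_pos, p_neg) for c in cs]
    best = min(range(grid), key=lambda k: errs[k])
    a, b = cs[max(best - 1, 0)], cs[min(best + 1, grid - 1)]
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c1 = b - inv_phi * (b - a)
        c2 = a + inv_phi * (b - a)
        if sse(c1, q, n_out, p_pos, p_neg) < sse(c2, q, n_out, p_pos, p_neg):
            b = c2
        else:
            a = c1
    return 0.5 * (a + b)


# Toy network: gene 0 activates gene 1, gene 1 inhibits gene 0.
n_out = [1, 1]
p_init = [1.0 / (k + 1) for k in n_out]  # p_ji = 1 / (n_j^out + 1)
p_pos = [[0.0, p_init[0]], [0.0, 0.0]]
p_neg = [[0.0, 0.0], [p_init[1], 0.0]]
q_obs = [0.40, 0.68]  # illustrative values, not data from the paper
print(estimate_c(q_obs, n_out, p_pos, p_neg))
```

The grid step guards against the objective having more than one local minimum when the observed $q_i$ cannot all be matched by a single value of c.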

Once each $\mu_i$ is determined, we can find the optimal positive input rate $\lambda^+_i$ which minimizes $(q_i-f_i( \lambda^+_i | {\mathbf q}, p_{ji}, \tilde{\mu}_i, \lambda^-_i))^2$ for each gene, given the initial value of $\lambda^-_i$ and the constraint $0 \le \tilde{\lambda}^+_i \le \mu_i + \lambda^-_i + \Lambda^-_i-\Lambda^+_i,$ which keeps the ratio in (2) at or below 1. Then we determine $\tilde{\lambda}^-_i$ which produces exactly the same values of $q_i.$
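Because $\lambda^+_i$ enters (2) linearly when the ratio is at most 1, the one-dimensional fit described above can be sketched in closed form: solve (2) for $\lambda^+_i$ and clip the result to the constraint interval. This is an illustrative reading of the step, not necessarily how the authors implemented it; the function name and the numbers in the example are hypothetical.

```python
def fit_lambda_pos(q_i, mu_i, lam_neg_i, big_pos_i, big_neg_i):
    """Constrained least-squares fit of lambda_i^+ for one gene.

    big_pos_i and big_neg_i stand for Lambda_i^+ and Lambda_i^- in (2),
    computed beforehand from the other genes.
    """
    denom = mu_i + lam_neg_i + big_neg_i
    upper = denom - big_pos_i  # upper bound of the constraint interval
    if upper < 0:
        raise ValueError("constraint interval is empty")
    unconstrained = q_i * denom - big_pos_i  # solves q_i = (lam + big_pos) / denom
    return min(max(unconstrained, 0.0), upper)


print(fit_lambda_pos(q_i=0.5, mu_i=0.01, lam_neg_i=0.002,
                     big_pos_i=0.001, big_neg_i=0.003))
```

Clipping is sufficient here because the squared error is convex in $\lambda^+_i$ on the constraint interval, so the constrained minimizer is the projection of the unconstrained one.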

3.1 Transition Probabilities in Abnormal Conditions

In an abnormal condition, let $q^{\prime}_i$ be the steady-state probability that the ith gene is activated and $p^{\prime}_{ji}$ be the packet transition probability from the jth gene to the ith gene in the same condition. If there are k unknown $p^{\prime}_{ji}$ for the ith gene, then we denote them by a vector ${{\mathbf p}}_{ki}.$ For the detection of the abnormally behaving pathways, ${{\mathbf p}}_{ki}$ needs to be estimated given the input ($\tilde{\lambda}_i^+$ and $\tilde{\lambda}_i^-$) and output ($\tilde{\mu}_i$) rates found in normal conditions. ${{\mathbf p}}_{ki}$ can be determined by minimizing the following squared error with two constraints, $0 \le \tilde{p}^{\prime}_{ji}$ and $\sum_{i}{\tilde{p}^{\prime}_{ji}} \le 1$:

$$ \tilde{{\mathbf p}}^{\prime}_{ki} = \arg\min_{{\mathbf p}^{(h)}_{ki}} (q^{\prime}_i-f_i({\mathbf p}^{(h)}_{ki} | {\mathbf q}^{\prime}_j, \tilde{\lambda}^+_i, \tilde{\lambda}^-_i , \tilde{\mu}_i))^2 $$

(5)

where ${\mathbf p}^{(h)}_{ki}$ is the hth hypothesis in the constrained parameter space. Our algorithm searches for the optimal solution iteratively from different initial starting values, to reduce the possibility of remaining in a local minimum.
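The multi-start search for (5) can be sketched as follows for a single target gene. The text does not name the local optimizer, so this example substitutes a simple random hill climb restarted from several random points. It assumes pairwise terms only, and it represents the sum constraint by a per-edge upper bound `upper[j]` (the probability mass still available to source gene j); all names and numbers are hypothetical.

```python
import random


def predict(p, q_src, mu_src, signs, lam_pos, lam_neg, mu_i):
    """Predicted q'_i from (2); signs[j] > 0 marks an activating edge."""
    num = lam_pos + sum(q * m * pj for q, m, pj, s in zip(q_src, mu_src, p, signs) if s > 0)
    den = mu_i + lam_neg + sum(q * m * pj for q, m, pj, s in zip(q_src, mu_src, p, signs) if s < 0)
    return min(1.0, num / den)


def estimate_p(q_target, q_src, mu_src, signs, upper,
               lam_pos, lam_neg, mu_i, starts=20, steps=2000, seed=0):
    """Minimize the squared error of (5) with random restarts."""
    rng = random.Random(seed)

    def err(p):
        return (q_target - predict(p, q_src, mu_src, signs, lam_pos, lam_neg, mu_i)) ** 2

    best_p, best_err = None, float("inf")
    for _ in range(starts):
        p = [rng.uniform(0.0, u) for u in upper]  # random feasible start
        e = err(p)
        step = 0.25
        for _ in range(steps):
            k = rng.randrange(len(p))
            cand = list(p)
            moved = p[k] + rng.uniform(-step, step) * upper[k]
            cand[k] = min(max(moved, 0.0), upper[k])  # stay inside the bounds
            cand_err = err(cand)
            if cand_err < e:
                p, e = cand, cand_err
            else:
                step = max(step * 0.995, 1e-6)  # shrink the step after a failure
        if e < best_err:
            best_p, best_err = p, e
    return best_p, best_err


p_hat, e_hat = estimate_p(
    q_target=0.9, q_src=[0.6], mu_src=[0.01], signs=[+1], upper=[1.0],
    lam_pos=0.0062, lam_neg=0.002, mu_i=0.01,
)
print(p_hat, e_hat)
```

With several incoming edges and a single observed $q^{\prime}_i,$ many parameter vectors can give the same error, which is one reason the restarts and the constraints matter.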

3.2 Permutation Test for the Estimated Transition Probabilities

When the estimated $\tilde{p}^{\prime}_{ij}$ differs from its initially assumed value $p_{ij},$ it is necessary to determine whether the difference is statistically significant. The null hypothesis of this test is $\tilde{p}^{\prime}_{ij} = p_{ij}.$ To proceed with the test, the set of samples is shuffled at random and divided into normal and abnormal groups with the same sample sizes as the original groups. Then the proposed method is applied in the same way as to the original data. Let M be the number of permutations and $\tilde{p}^{(m)}_{ij}$ be the estimated transition probability of the mth permutation. Then we can compute the empirical p-value of $\tilde{p}^{\prime}_{ij}$ as follows:

$$ p\text{-value of}\;\tilde{p}^{\prime}_{ij}=\left\{ \begin{array}{ll} {\frac{1}{M}}\sum_{m=1}^M I(\tilde{p}^{\prime}_{ij} \leq \tilde{p}^{(m)}_{ij}) & \text{if } \tilde{p}^{\prime}_{ij} > p_{ij}\\ {\frac{1}{M}}\sum_{m=1}^M I(\tilde{p}^{\prime}_{ij} > \tilde{p}^{(m)}_{ij}) & \text{if } \tilde{p}^{\prime}_{ij} \leq p_{ij} \end{array} \right. $$

where I(C) is the indicator function. Thus if the p-value is less than $\alpha_2$ then the null hypothesis is rejected. In our study, $\alpha_2 = 0.1.$
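The permutation procedure and the empirical p-value above can be sketched as follows. The helper names are hypothetical, and the permutation estimates in the usage example are made-up numbers that stand in for the output of re-running the estimation on shuffled data.

```python
import random


def permuted_groups(samples, n_normal, rng):
    """Shuffle all samples and split them into groups of the original sizes."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return shuffled[:n_normal], shuffled[n_normal:]


def empirical_p_value(p_est, p_init, perm_estimates):
    """Empirical p-value of an estimated transition probability.

    p_est: estimate from the original grouping,
    p_init: initially assumed value,
    perm_estimates: estimates from the M permuted groupings.
    """
    m = len(perm_estimates)
    if m == 0:
        raise ValueError("at least one permutation is required")
    if p_est > p_init:
        hits = sum(1 for v in perm_estimates if p_est <= v)
    else:
        hits = sum(1 for v in perm_estimates if p_est > v)
    return hits / m


rng = random.Random(0)
normal, abnormal = permuted_groups(range(20), n_normal=10, rng=rng)
perm = [0.1, 0.2, 0.9, 0.85, 0.3, 0.4, 0.5, 0.6, 0.7, 0.75]  # illustrative only
print(empirical_p_value(0.8, 0.5, perm))  # 0.2
```

In this toy case two of the ten permutation estimates are at least 0.8, so the p-value is 0.2 and the null hypothesis would not be rejected at $\alpha_2 = 0.1.$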

4 The p53 Pathway

In order to evaluate our approach using experimental data, we selected the p53 pathway, a well-studied system in human cells whose most important feature is tumor suppression when DNA is damaged. The regulatory structure of the p53 pathway with 30 genes was constructed on the basis of the KEGG database, and we also downloaded two microarray mRNA expression datasets from GEO. The first dataset (GSE12941) consists of 10 non-tumor liver tissue samples and 10 hepatocellular carcinoma (HCC) samples. The second (GSE6222) is a dataset for the study of liver cancer progression in HCC; from this dataset, we use 2 normal and 10 HCC samples. Before applying the proposed method, the data were normalized and scaled to mean 3 and variance 1, so that the average number of mRNAs of a gene in a single cell, in the absence of interactions, is assumed to be approximately 3. The gene input and output rates are assumed to be $0.0062\,\text{sec}^{-1}$ and $0.002\,\text{sec}^{-1}$ from [6], so that $0.0062/0.002 \approx 3.$

Figure 1 shows the average expression levels of genes in each dataset, and the corresponding p-values of the t-tests used to detect DEGs. The two datasets share nine significant DEGs at the 0.05 significance level, while GSE12941 has four more DEGs. This similarity can be confirmed by observing their expression patterns in Fig. 1. However, interpreting the DEGs is yet another challenge, even when their regulatory structure is known. Figure 2 shows the results from our proposed method. Despite the apparent lack of significance of p53 and MDM2 in the t-test, the p53-MDM2 feedback loop was clearly activated in cancer samples in both datasets. In [7], the p53-MDM2 feedback loop appears to produce oscillatory expression patterns. Thus in terms of the system dynamics, the activation of both pathways between p53 and MDM2 in our result might be more appropriate than the activation of only one pathway from MDM2 to p53. One of the significant pathways in both datasets is TP53-IGFBP3-IGF1 [8]. Our method also properly detects two pathways, ATM-CHEK2-TP53 and ATR-CHEK1-TP53, as expected from [9], in both datasets; these cannot be detected merely by observing the p-values of the DEG test.

5 Discussion

We have proposed a new approach for detecting abnormal pathways in GRNs based on G-network modelling. This method provides an effective way to describe the flows of gene expression signals, including negative or inhibitory effects on gene expression. Using experimental data, we show that one advantage of our approach is that it can detect abnormal information flows in the dynamics of gene pathways. Thanks to existing G-network theory, the model uses a computationally tractable steady-state analysis and therefore does not require a large number of samples from time-dependent data. Moreover, the analytic solution provided by G-network theory offers the possibility that our approach may be extended to very large-scale GRN systems.

In order to exploit this analytical tool, our work shows that a successful application of this method requires that the model be started with a reliable prior network structure based on real experimental data, or carefully calibrated GRN information. Though our initial experimental evaluation appears quite positive, further experimental studies will be needed to validate the proposed approach and apply it to attain biologically meaningful and clinically useful results.

References

1. Opgen-Rhein, R., Strimmer, K.: Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process. BMC Bioinformatics 8(suppl 2), S3 (2007)
2. Beissbarth, T., Speed, T.: GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 20(9), 1464–1465 (2004)
3. Gelenbe, E.: G-networks with triggered customer movement. J. Appl. Probab. 30(3), 742–748 (1993)
4. Gelenbe, E.: Steady-state solution of probabilistic gene regulatory networks. Phys. Rev. E 76, 031903 (2007)
5. Kim, H., Gelenbe, E.: Anomaly detection in gene expression via stochastic models of gene regulatory networks. BMC Genomics 10(suppl 3), S26 (2009)
6. Thattai, M., van Oudenaarden, A.: Intrinsic noise in gene regulatory networks. Proceedings of the National Academy of Sciences 98(15), 8614–8619 (2001)
7. Wilkinson, D.J.: Stochastic modelling for quantitative description of heterogeneous biological systems. Nature Rev. Genetics 10(2), 122–133 (2009)
8. Schedlich, L., Graham, L.: Role of insulin-like growth factor binding protein-3 in breast cancer cell growth. Microsc. Res. Tech. 59(1), 12–22 (2002)
9. Brown, C., Lain, S., Verma, C., Fersht, A., Lane, D.: Awakening guardian angels: drugging the p53 pathway. Nature Rev. Cancer 9(12), 862–873 (2009)


Acknowledgments

We would like to thank Omer Abdelrahman and Zerrin Isik for helpful discussions.

Author information

Authors and Affiliations

Department of Electrical and Electronic Engineering, Imperial College, London, UK
Haseong Kim & Erol Gelenbe
Department of Molecular Biology and Genetics, Bilkent University, Ankara, Turkey
Rengul Atalay


Corresponding author

Correspondence to Erol Gelenbe.

Editor information

Editors and Affiliations

Dept of Electrical and Electronics Eng'g, Imperial College, London, SW7 2BT, United Kingdom
Erol Gelenbe
Imperial College, London, United Kingdom
Ricardo Lent
University of East London, London, United Kingdom
Georgia Sakellari


Cite this paper

Kim, H., Atalay, R., Gelenbe, E. (2011). G-Network Modelling Based Abnormal Pathway Detection in Gene Regulatory Networks. In: Gelenbe, E., Lent, R., Sakellari, G. (eds) Computer and Information Sciences II. Springer, London. https://doi.org/10.1007/978-1-4471-2155-8_32


DOI: https://doi.org/10.1007/978-1-4471-2155-8_32
Published: 29 September 2011
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2154-1
Online ISBN: 978-1-4471-2155-8
eBook Packages: Engineering (R0)
