
1 Introduction

In systems biology [18], reaction networks are used to represent biological systems. They enable formal analyses [9], simulations with several semantics [7], parameter estimation and identification [1], etc. As networks grow ever larger, we need to be able to simplify them, in order to keep the analyses as simple as possible or to obtain fast simulations (in particular in the context of real-time control [29]). Indeed, the reactions of many metabolic reaction networks are often motivated by simplifications of concrete chemical reactions, see e.g. [21], but these simplifications are always done in an informal manner without any semantic guarantees. An exception is Michaelis-Menten’s simplification rule for enzymatic reactions, which is properly justified under the quasi-steady-state assumption [27].

One usual approach is to simplify the ordinary differential equation (ODE) systems that describe the deterministic semantics of reaction networks, but not the reaction networks themselves. In [17], the authors presented a method based on the structure of enzyme-catalysed reactions to compute a simplified ODE system at steady-state. In [6], the authors used dependency analysis of rule-based models to obtain a simplified ODE system. Many other simplification methods use the distinction between slow and fast reactions, as for instance methods based on invariant manifolds [11], quasi-steady state [3, 27], quasi-equilibrium approximation [12] or tropicalization [28]. Other methods reduce the number of parameters, for instance by using Lie symmetries [19]. However, most of those methods require the parameter values, or at least their magnitudes, and those data are often unknown. Moreover, it is useful to preserve the reaction network and not just its ODE system, and transforming an ODE system back into a reaction network is a difficult issue, since it is not always possible, or not possible in a unique manner [8].

Another approach is to consider reaction networks as programs [5, 16, 25], and to apply simplification rules directly to such programs, similarly to what is done in compiler construction [24, 26]. This means simplifying the reaction network directly rather than the corresponding ODE system, possibly ignoring the kinetics altogether. Such structural simplification methods are usually based on a small-step semantics, saying how chemical solutions may evolve non-deterministically. They are often contextual, i.e. the simplification rules remain correct when the network is plugged into a bigger context. In our own previous work [20], we proposed to simplify reaction networks while preserving the reachability of final components, called attractors. However, those methods do not fit well with the deterministic semantics, even though the simplification rules obtained seem sensible for biological systems. Earlier structural simplification methods were presented in [10], where subgraph epimorphisms are used to reduce reaction networks. Similar work has been done on Petri nets [2, 23], preserving their usual properties (liveness, deadlock, termination, etc.). In [4], Cardelli presented morphisms that preserve the deterministic semantics, but did not give simplification rules for them.

In this article, we aim at finding a new approach for simplifying reaction networks that preserves the deterministic semantics, i.e. the evolution of concentrations of molecular species over time. The approach should be structural in that it applies to reaction networks directly without computing the ODE system. It should be contextual, so that we can easily simplify modules or subnetworks in a larger context while preserving the overall dynamics. Therefore, we propose a collection of simplification rules that eliminate intermediate molecules while preserving the dynamics of all others. Some simplification rules are based on partial equilibrium conditions on the intermediate molecules (but a general steady-state is not assumed). Such conditions were already assumed to justify Michaelis-Menten’s exact simplification for enzymatic reactions [22], which is widely accepted. There the intermediate complex needs to be at equilibrium; when it is only close to equilibrium, a small error is made, which can be estimated. A network obtained by applying a simplification rule has the same deterministic semantics as the original one, in all contexts that preserve the equilibrium conditions on intermediate molecules. For applying a simplification rule, the corresponding ODE system is not needed, and the kinetic parameters may be unknown. We illustrate the usefulness of the simplification by applying it to biological examples, where it drastically reduces the size of reaction networks.

Outline. We first illustrate the basic ideas and motivations with an example in Sect. 2. We recall the formal definitions of reaction networks with their deterministic semantics in Sect. 3. In Sect. 4, we contribute a contextual equivalence relation for reaction networks, and in Sect. 5 a set of simplification axioms that we prove correct with respect to this equivalence relation. In Sect. 6, we illustrate on a biological example how much reaction networks can be simplified in practice. We finally conclude and discuss future work in Sect. 7.

2 Preliminary Example

We first present a preliminary example, to illustrate our simplification.

Fig. 1. Reaction graphs of the \( Gene \) network on the left, and its two simplifications. Molecules are represented by circles, and reactions by squares. In the kinetic expressions near the reactions, the \(k_i\) are parameters while \(x_{A}\) is a variable representing the concentration of a molecule A. \(x_{ G }^0\) denotes the initial concentration of \( G \). A dashed arrow means that the molecule acts as a modulator in the reaction, while a dotted arrow means that the molecule can be modified by the context.

Consider the reaction network \( Gene \) in Fig. 1 on the left. It has four species: a gene \( G \), an inhibitor \( Inh \), some \( mRNA \), and a protein \( P \). The reaction \(r_1\) describes a transcription, the production of \( mRNA \) in presence of a gene \( G \). This gene is required to apply the reaction, but its amount is not modified by it. This reaction also has a modulator, \( Inh \), indicated by a dashed arrow. A modulator influences the rate of a reaction, but is not required to apply it. Here, \( Inh \) slows down the reaction \(r_1\). The reaction \(r_2\) is the translation of \( mRNA \) into the protein \( P \), while the reaction \(r_3\) (resp. \(r_4\)) describes the degradation of \( mRNA \) (resp. \( P \)). Aside from the first one, every reaction has simple mass-action kinetics.

In order to simplify the network, we first need to specify how the environment interacts with the network: this is indicated by pending dotted arrows in Fig. 1. We consider here that \( G \) and \( mRNA \) are internal molecules, that is, they cannot be modified by the context. Then, the context can be any set of reactions that does not contain \( G \) and \( mRNA \). It can for instance transform \( P \) into another protein, or produce something else in presence of \( Inh \), etc.

In this network, we are especially interested in the protein \( P \), and on the contrary we want to eliminate the intermediate \( mRNA \). To do that, we will assume that \( mRNA \) is at equilibrium, i.e. its concentration, \(x_{ mRNA }\), is constant over time.

Let us simplify our network. First, notice that the gene \( G \) is not modified by any reaction. It is used in reaction \(r_1\), but only as an activator, i.e. on both sides of the reaction. Moreover, \( G \) is an internal molecule that cannot be modified by the context. Therefore its concentration, \(x_{ G }\), is constant over time: \(x_{ G }=x_{ G }^0\). We thus replace \(x_{ G }\) by \(x_{ G }^0\) in the kinetic expression of reaction \(r_1\), and remove \( G \) completely from the network. The new network is pictured in Fig. 1 (middle).

Now, consider the intermediate \( mRNA \). It is an internal molecule, and its (complete) ordinary differential equation is:

$$\begin{aligned} \dfrac{dx_{ mRNA }}{dt} = \dfrac{k_1x_{ G }^0}{k_0+x_{ Inh }} - k_{-1}x_{ mRNA } \end{aligned}$$

Since we assumed that \( mRNA \) is at equilibrium, i.e. \( \dfrac{dx_{ mRNA }}{dt} = 0\), we deduce:

$$\begin{aligned} x_{ mRNA } = \dfrac{k_1x_{ G }^0}{k_{-1}(k_0+x_{ Inh })} \end{aligned}$$

Therefore we remove \( mRNA \) from the network, and replace the variable \(x_{ mRNA }\) in the kinetics of reaction \(r_2\), by the expression computed above. We obtain the simplified network in Fig. 1 (right) where \(r_1\), \(r_2\) and \(r_3\) are merged into the new reaction \(r_{123}\).

As we will see in this paper, the simplification rules used above preserve the deterministic semantics of reaction networks, in every context. So the simplified network is contextually equilibrium-equivalent to the first one. Note that we cannot simplify the network any further, since both \( Inh \) and \( P \) can be modified by the context.
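The example can be checked numerically in the simplest setting, where no context is plugged in and \( Inh \) therefore stays constant. The following Python sketch is only an illustration under stated assumptions: the parameter names `K2` (translation, \(r_2\)) and `K4` (degradation of \( P \), \(r_4\)) are hypothetical, since the text only names \(k_0\), \(k_1\) and \(k_{-1}\), and all numeric values are arbitrary. It integrates the middle and right networks of Fig. 1 with a hand-written Runge-Kutta scheme and compares the trajectories of \( P \).

```python
# Numerical sanity check for the Gene example (pure-Python RK4, no external libraries).
# Assumed names: K2 (translation, r2) and K4 (degradation of P, r4) are hypothetical;
# all numeric values are arbitrary illustration values, not taken from the paper.

K0, K1, K_MINUS1, K2, K4 = 2.0, 3.0, 0.5, 1.5, 0.8
X_G0 = 1.0    # initial (and constant) concentration of the gene G
X_INH = 0.7   # Inh is kept constant here, i.e. no context modifies it


def mrna_equilibrium():
    """Equilibrium value of x_mRNA derived in the text."""
    return K1 * X_G0 / (K_MINUS1 * (K0 + X_INH))


def full_rhs(state):
    """Middle network of Fig. 1: state = (x_mRNA, x_P)."""
    x_mrna, x_p = state
    d_mrna = K1 * X_G0 / (K0 + X_INH) - K_MINUS1 * x_mrna
    d_p = K2 * x_mrna - K4 * x_p
    return (d_mrna, d_p)


def simple_rhs(state):
    """Right network of Fig. 1: state = (x_P,), with x_mRNA substituted."""
    (x_p,) = state
    return (K2 * mrna_equilibrium() - K4 * x_p,)


def rk4(rhs, state, dt, steps):
    """Classical fourth-order Runge-Kutta; returns the list of visited states."""
    trajectory = [state]
    for _ in range(steps):
        a = rhs(state)
        b = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, a)))
        c = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, b)))
        d = rhs(tuple(s + dt * k for s, k in zip(state, c)))
        state = tuple(
            s + dt / 6.0 * (ka + 2 * kb + 2 * kc + kd)
            for s, ka, kb, kc, kd in zip(state, a, b, c, d)
        )
        trajectory.append(state)
    return trajectory


# mRNA starts at its equilibrium value, as required by the equilibrium condition.
full = rk4(full_rhs, (mrna_equilibrium(), 0.0), dt=0.01, steps=1000)
simple = rk4(simple_rhs, (0.0,), dt=0.01, steps=1000)
max_gap = max(abs(f[1] - s[0]) for f, s in zip(full, simple))
```

When \( mRNA \) starts at its equilibrium value, `max_gap` stays at the level of floating-point noise. Starting it elsewhere violates the equilibrium condition, and the two trajectories of \( P \) then differ during the transient.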

3 Reaction Networks

We introduce reaction networks and define their deterministic semantics in terms of ordinary differential equations.

Let \(\textit{Spec}\) be a set of molecular species ranged over by A, B, C. We define a (chemical) solution \(s\in \textit{Sol}:\textit{Spec}\rightarrow \mathbb {N}_0\) as a function from molecular species to natural numbers. Given natural numbers \(n_1,\ldots ,n_k\), we denote by \(n_1 A_1 +\ldots +n_k A_k\) the solution that contains \(n_i\) molecules of species \(A_i\) for all \(1\le i\le k\) and 0 molecules of all other species.

A kinetic reaction \((s_1 \rightarrow s_2 \;; e)\) is a pair composed of a reaction \(s_1 \rightarrow s_2\) and a kinetic expression e. The reaction transforms the solution \(s_1\), called reactants, into the solution \(s_2\), called products. The molecules present in the same amount in both reactants and products are called activators. They are not modified by the reaction, but are required to apply it. Kinetic expressions are symbolic functions defined from concentration variables, \(\textit{Vars}_{\textit{Spec}}=\{x_{A}\mid A\in \textit{Spec}\}\), symbols of initial concentrations, \(\textit{Const}_i=\{x_{A}^0\mid A\in \textit{Spec}\}\), and symbols of kinetic parameters, \(\textit{Const}_k=\{k_0,k_1,\ldots \}\):

$$\begin{aligned} e,f,\ldots {:}{:}{=}x_{} \mid x_{}^0 \mid k \mid e+f \mid e-f \mid e\times f \mid e/f \mid -e \mid (e) \end{aligned}$$

where \(x_{}\in \textit{Vars}_{\textit{Spec}}\), \(x_{}^0\in \textit{Const}_i\) and \(k\in \textit{Const}_k\). As usual, we also simply write \(ef\) for \(e\times f\).

The (chemical) concentration of a chemical species is a function from time to non-negative numbers \(\mathbb {R}_+\rightarrow \mathbb {R}_+\). Kinetic expressions are interpreted as actual kinetic functions by means of an assignment \(\alpha \) that maps concentration variables to concentrations (\(\alpha _c\)), initial concentrations to non-negative real values (\(\alpha _0\)) and kinetic parameters to non-negative real values (\(\alpha _k\)):

$$\begin{aligned} \alpha _c: \textit{Vars}_{\textit{Spec}}\rightarrow (\mathbb {R}_+\rightarrow \mathbb {R}_+) \qquad \alpha _0: \textit{Const}_i\rightarrow \mathbb {R}_+\qquad \alpha _k: \textit{Const}_k\rightarrow \mathbb {R}_+\end{aligned}$$

We only consider assignments \(\alpha \) consistent on initial concentrations, that is for any species A, \(\alpha _c(x_{A})(0) = \alpha _0(x_{A}^0)\). Given an assignment \(\alpha \), the interpretation \([e]_{\alpha }\) of a kinetic expression e is thus defined as follows:

$$\begin{aligned}{}[x_{}]_{\alpha }(t) = \alpha _c(x_{})(t)\quad [x_{}^0]_{\alpha }(t) = \alpha _0(x_{}^0)\quad [k]_{\alpha }(t)= \alpha _k(k)\quad [(e)]_{\alpha }(t) = [e]_{\alpha }(t) \end{aligned}$$
$$\begin{aligned}{}[-e]_{\alpha } = -[e]_{\alpha } \quad [e \textit{ op } f]_{\alpha }(t)= [e]_{\alpha }(t) \textit{ op } [f]_{\alpha }(t) \text { where } \textit{op}\in \{+,-,\times ,/\} \end{aligned}$$
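The grammar and its interpretation translate almost literally into a small abstract syntax tree with a recursive evaluator. The Python sketch below is an illustration only: the class names `Var`, `Init`, `Param`, `Neg`, `BinOp` and the function `interpret` are hypothetical, and the assignment \(\alpha \) is represented by three plain dictionaries. Parenthesised expressions \((e)\) need no node of their own, since the tree structure already records the grouping.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:      # concentration variable x_A
    species: str


@dataclass(frozen=True)
class Init:     # initial-concentration symbol x_A^0
    species: str


@dataclass(frozen=True)
class Param:    # kinetic parameter k
    name: str


@dataclass(frozen=True)
class Neg:      # unary minus -e
    arg: "Expr"


@dataclass(frozen=True)
class BinOp:    # e op f with op in {+, -, *, /}
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Init, Param, Neg, BinOp]

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def interpret(e, alpha_c, alpha_0, alpha_k, t):
    """Value [e]_alpha(t) of the kinetic expression e at time t."""
    if isinstance(e, Var):
        return alpha_c[e.species](t)      # alpha_c maps species to functions of time
    if isinstance(e, Init):
        return alpha_0[e.species]
    if isinstance(e, Param):
        return alpha_k[e.name]
    if isinstance(e, Neg):
        return -interpret(e.arg, alpha_c, alpha_0, alpha_k, t)
    if isinstance(e, BinOp):
        left = interpret(e.left, alpha_c, alpha_0, alpha_k, t)
        right = interpret(e.right, alpha_c, alpha_0, alpha_k, t)
        return OPS[e.op](left, right)
    raise TypeError(f"not a kinetic expression: {e!r}")


# Kinetic expression k1 * x_G^0 / (k0 + x_Inh) from the Gene example.
rate_r1 = BinOp(
    "/",
    BinOp("*", Param("k1"), Init("G")),
    BinOp("+", Param("k0"), Var("Inh")),
)
value = interpret(
    rate_r1,
    alpha_c={"Inh": lambda t: 1.0},   # constant inhibitor concentration (arbitrary)
    alpha_0={"G": 1.0},
    alpha_k={"k0": 2.0, "k1": 3.0},
    t=0.0,
)
```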

Given a set of kinetic reactions, we only consider assignments \(\alpha \) such that for any kinetic expression e occurring in this network, its interpretation \([e]_{\alpha }: \mathbb {R}_+\rightarrow \mathbb {R}_+\) is a continuously differentiable function from time to non negative real numbers, standing for the actual reaction rate. Kinetic reactions also have to respect the following coherence property: the actual rate given by any assignment \(\alpha \) is equal to zero if and only if one of the reactants is not present: \(\forall \alpha .\) \([e]_{\alpha }(t)=0\) iff \(\exists A \in s_1. [x_{A}]_{\alpha }(t)=0\). Note that a kinetic expression can contain concentration variables of molecules that are not present in the reactants of the reaction; such molecules, called modulators, are not required to apply the reaction, but modify its rate.

Definition 1

A reaction network is a pair \(\langle I, R \rangle \), composed of a set of internal molecules \(I\), which specifies that some molecules can not interact with the context, and a set of kinetic reactions R.

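From any network \(N = \langle I, R \rangle \) and from its kinetic expressions, we can infer a system of ordinary differential equations \(\textit{ODE}(N)\), in which every kinetic reaction contributes its rate weighted by the net stoichiometric change of the species:

$$\begin{aligned} \textit{ODE}(N) = \left\{ \dfrac{dx_{A}}{dt} = \sum _{(s_1 \rightarrow s_2 \;; e) \in R} (s_2(A) - s_1(A))\, e \;\Big |\; A \in \textit{Spec}\right\} \end{aligned}$$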

Given any assignment \(\alpha _0\) of the initial concentrations and any assignment \(\alpha _k\) of the kinetic parameters, by the Cauchy-Lipschitz theorem, the system \(\textit{ODE}(N)\) has a unique differentiable solution \(\alpha _c\), defined on a maximal interval including 0. Moreover, we only consider solutions \(\alpha _c\) defined on (at least) \([0, +\infty [\). Otherwise, we say that N has no valid solution for these assignments.
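As an illustration, the right-hand side of such an ODE system can be computed directly from the list of kinetic reactions, under the usual convention that each reaction contributes its rate multiplied by the net stoichiometric change of a species. In the Python sketch below, the function name `ode_rhs`, the representation of solutions as dictionaries and of kinetic expressions as Python functions, and the parameter names `k2` and `k4` are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Solution = Dict[str, int]                     # species -> number of molecules
Rate = Callable[[Dict[str, float]], float]    # concentrations -> reaction rate
KineticReaction = Tuple[Solution, Solution, Rate]


def ode_rhs(reactions: List[KineticReaction],
            concentrations: Dict[str, float]) -> Dict[str, float]:
    """Return dx_A/dt for every species, given the current concentrations."""
    derivs = {a: 0.0 for a in concentrations}
    for reactants, products, rate in reactions:
        v = rate(concentrations)
        for a in set(reactants) | set(products):
            net = products.get(a, 0) - reactants.get(a, 0)   # s2(A) - s1(A)
            derivs[a] = derivs.get(a, 0.0) + net * v
    return derivs


# Gene network of Sect. 2 with arbitrary parameter values (k2, k4 are assumed names).
k = {"k0": 2.0, "k1": 3.0, "k-1": 0.5, "k2": 1.5, "k4": 0.8}
gene = [
    ({"G": 1}, {"G": 1, "mRNA": 1}, lambda x: k["k1"] * x["G"] / (k["k0"] + x["Inh"])),
    ({"mRNA": 1}, {"mRNA": 1, "P": 1}, lambda x: k["k2"] * x["mRNA"]),
    ({"mRNA": 1}, {}, lambda x: k["k-1"] * x["mRNA"]),
    ({"P": 1}, {}, lambda x: k["k4"] * x["P"]),
]
derivs = ode_rhs(gene, {"G": 1.0, "Inh": 1.0, "mRNA": 2.0, "P": 1.0})
```

Activators such as \( G \) in \(r_1\) have net stoichiometry zero and therefore never change, which is the observation exploited in Sect. 2.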

An equilibrium condition \(e\) is defined similarly to kinetic expressions and interpreted as a function from time to non-negative numbers. It is satisfied by an assignment \(\alpha \) iff \(\alpha _c\) satisfies \(\dfrac{de}{dt} = 0\) given the initial concentration and parameter assignments \(\alpha _0\) and \(\alpha _k\). An equilibrium condition can for instance impose the equilibrium of a particular molecule (for instance \(e= x_{A}\)), a solution (\(e= \sum _{A \in s} s(A)x_{A}\)), or a reaction (\(e= f\) for the reaction \((r \;; f)\)). We denote by \(\mathbf E \) a set of equilibrium conditions. Given a network N and equilibrium conditions \(\mathbf E \), the deterministic dynamics of N that satisfies \(\mathbf E \) is defined as

$$\begin{aligned} sol(N,\mathbf E ){}&= \{ \alpha \mid \alpha _c \text { satisfies } \mathbf E \text { and is a valid solution of } \textit{ODE}(N)\\&\text { for initial concentrations }\alpha _0\text { and parameter assignments } \alpha _k\} \end{aligned}$$

Since we are particularly interested in the molecules that are not at equilibrium, we say that two assignments \(\alpha \) and \(\alpha '\) are equal modulo equilibrium conditions, denoted \(\alpha =_\mathbf{E } \alpha '\), if they are equal on those molecules.

4 Contextual Equilibrium-Equivalence

We present here a notion of weak equilibrium-equivalence between reaction networks, then the definition of contexts, and finally the contextual equilibrium-equivalence.

Definition 2

(Weak Equilibrium-Equivalence). Two networks N and M are weakly equilibrium-equivalent for \(\mathbf E \), denoted \(N \sim ^\mathbf{E } M\), if they have the same solutions modulo equilibrium conditions, i.e. \(sol(N,\mathbf E )=_\mathbf{E } sol(M,\mathbf E )\).

A context \(\mathcal {C}\) is itself a reaction network. Given a set of internal molecules \(I\), we say that a context \(\mathcal {C}\) is compatible with \(I\) if \(\forall A \in I\), A has no occurrence in \(\mathcal {C}\). We denote by \(\textit{Context}(I)\) the set of compatible contexts with \(I\). Given a network \(N = \langle I, R \rangle \) and a compatible context \(\mathcal {C}= \langle I', R' \rangle \in \textit{Context}(I)\), we denote by \(\mathcal {C}[N] = \langle I\cup I', R \cup R' \rangle \) the network placed into the context.
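The compatibility test and the plugging operation \(\mathcal {C}[N]\) are purely structural, as the following Python sketch illustrates. It is a simplified model under stated assumptions: kinetics are abstracted to the set of modulators occurring in them, multiplicities are ignored, reactions are kept in tuples rather than sets, and an occurrence in the context's own internal set also counts as an occurrence. The names `Reaction`, `Network`, `is_compatible` and `plug` are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Tuple


class Reaction(NamedTuple):
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    modulators: FrozenSet[str]   # species occurring only in the kinetic expression


class Network(NamedTuple):
    internal: FrozenSet[str]
    reactions: Tuple[Reaction, ...]


def occurring_species(net: Network) -> FrozenSet[str]:
    """All species mentioned by the network, in reactions or as internal molecules."""
    species = set(net.internal)
    for r in net.reactions:
        species |= r.reactants | r.products | r.modulators
    return frozenset(species)


def is_compatible(context: Network, internal: FrozenSet[str]) -> bool:
    """A context is compatible with I if no molecule of I occurs in it."""
    return occurring_species(context).isdisjoint(internal)


def plug(context: Network, net: Network) -> Network:
    """Build C[N] = <I u I', R u R'> for a compatible context C = <I', R'>."""
    if not is_compatible(context, net.internal):
        raise ValueError("context mentions an internal molecule of the network")
    return Network(net.internal | context.internal,
                   net.reactions + context.reactions)


fs = frozenset
gene = Network(
    internal=fs({"G", "mRNA"}),
    reactions=(
        Reaction(fs({"G"}), fs({"G", "mRNA"}), fs({"Inh"})),   # r1
        Reaction(fs({"mRNA"}), fs({"mRNA", "P"}), fs()),       # r2
        Reaction(fs({"mRNA"}), fs(), fs()),                    # r3
        Reaction(fs({"P"}), fs(), fs()),                       # r4
    ),
)
good_context = Network(fs(), (Reaction(fs({"P"}), fs({"Q"}), fs()),))   # P -> Q
bad_context = Network(fs(), (Reaction(fs({"mRNA"}), fs(), fs()),))      # touches mRNA
plugged = plug(good_context, gene)
```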

Definition 3

(Contextual Equilibrium-Equivalence). Let \(\mathbf E \) be a set of equilibrium conditions. The reaction networks \(N = \langle I, R \rangle \) and \(M = \langle I', R' \rangle \) are contextually equilibrium-equivalent for \(\mathbf E \), denoted \(N \equiv ^\mathbf{E } M\), if they are weakly equilibrium-equivalent in any compatible context, i.e. \(\forall \mathcal {C}\in \textit{Context}(I\cup I'). \,\mathcal {C}[N] \sim ^\mathbf{E } \mathcal {C}[M]\).

5 Simplification Axioms

In this section, we present some simplification axioms that transform a network into a contextually equilibrium-equivalent network. The soundness proofs of those axioms are given in the annex. These simplification axioms reduce the size of a reaction network, either by completely removing a molecule from the set of reactions, by decreasing the number of reactions, or by simplifying a reaction.

We first present 2 simple simplification axioms, followed by 4 instances of a more general axiom, based on the presence of an intermediate molecule. Finally, we present this general axiom. Notice that the axioms are quite similar to the ones we presented for the attractor equivalence with a qualitative and observational semantics in [20].

The first 2 simplification axioms are given in Fig. 2. The first one, (useless), deletes a reaction that does not impact the network dynamics. The axiom (activator) removes an internal molecule A that is only used as an activator in the reactions (i.e. A is always present in the same amount on both sides of each reaction). This is for instance the case for the gene \( G \) in the \( Gene \) network in Sect. 2.
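The effect of (activator) as described in the text can be sketched as a small rewriting function. This Python illustration is not the formal axiom of Fig. 2: kinetic expressions are encoded as nested tuples, the names `substitute_init` and `apply_activator` are hypothetical, and dropping A from the set of internal molecules is an assumption of the sketch.

```python
def substitute_init(expr, species):
    """Replace the variable x_A by the initial-concentration symbol x_A^0."""
    kind = expr[0]
    if kind == "var" and expr[1] == species:
        return ("init", species)
    if kind == "op":                      # ("op", symbol, left, right)
        return ("op", expr[1],
                substitute_init(expr[2], species),
                substitute_init(expr[3], species))
    if kind == "neg":                     # ("neg", argument)
        return ("neg", substitute_init(expr[1], species))
    return expr                           # other variables, init symbols, parameters


def apply_activator(internal, reactions, species):
    """Remove an internal molecule that occurs only as an activator.

    reactions is a list of (reactants, products, expr), where reactants and
    products map species to multiplicities.
    """
    if species not in internal:
        raise ValueError(f"{species} is not internal")
    for reactants, products, _ in reactions:
        if reactants.get(species, 0) != products.get(species, 0):
            raise ValueError(f"{species} is modified by a reaction")
    simplified = []
    for reactants, products, expr in reactions:
        simplified.append((
            {a: n for a, n in reactants.items() if a != species},
            {a: n for a, n in products.items() if a != species},
            substitute_init(expr, species),
        ))
    return internal - {species}, simplified


# Reaction r1 of the Gene network: G -> G + mRNA with kinetics k1 * x_G / (k0 + x_Inh).
r1_rate = ("op", "/",
           ("op", "*", ("param", "k1"), ("var", "G")),
           ("op", "+", ("param", "k0"), ("var", "Inh")))
new_internal, new_reactions = apply_activator(
    {"G", "mRNA"}, [({"G": 1}, {"G": 1, "mRNA": 1}, r1_rate)], "G"
)
```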

Fig. 2. Simple simplification axioms.

Fig. 3. Instances of the intermediate molecule axiom.

The next four axioms in Fig. 3 are instances of the more general axiom (intermediate). These axioms aim at eliminating an internal and intermediate molecule which is at equilibrium.

In the first one, (inter), the intermediate molecule A is only used in two reactions, once as the unique product and once as the unique reactant. Since A is at equilibrium, the kinetic expressions of these reactions have to be equal, i.e. \(e=k_2 x_{A}\). The axiom removes A and merges both reactions into one, keeping only the kinetic expression e. The parameter \(k_2\) is eliminated.

The second axiom, (Michaelis-Menten), simplifies a three-step enzyme-catalyzed transformation. A substrate S binds to an enzyme E to form the complex C. Then the complex either transforms back to \(S +E\), or produces the product P while releasing \(E\). Assuming that the enzyme E and the complex C are at equilibrium, the axiom merges the reactions into a unique one that directly transforms S into P. The equilibrium of C imposes that the simplified reaction has a Michaelis-Menten kinetics of the form \(V\dfrac{x_{S}}{x_{S} + K}\) [22].
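For readers who want to see where this form comes from, the textbook derivation can be sketched as follows. The parameter names \(k_1\), \(k_{-1}\), \(k_2\) for binding, unbinding and catalysis, and the symbol \(E_T\) for the conserved total amount of enzyme \(x_{E} + x_{C}\), are assumptions of this sketch; the precise conditions of the axiom are those stated in Fig. 3. With mass-action kinetics, the equilibrium of C gives

$$\begin{aligned} 0 = \dfrac{dx_{C}}{dt} = k_1 x_{S} x_{E} - (k_{-1} + k_2)\, x_{C} \qquad \text {with } x_{E} = E_T - x_{C} \end{aligned}$$

Solving for \(x_{C}\) and substituting into the rate \(k_2 x_{C}\) of the reaction that produces P yields

$$\begin{aligned} x_{C} = \dfrac{E_T\, x_{S}}{x_{S} + K} \qquad k_2 x_{C} = V \dfrac{x_{S}}{x_{S} + K} \qquad \text {where } K = \dfrac{k_{-1} + k_2}{k_1} \text { and } V = k_2 E_T \end{aligned}$$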

The last two, (cascade \(_1\)) and (cascade \(_2\)), concern a cascade of reactions, where the intermediate molecule A, at equilibrium, is produced in presence of some activators \(s\), and then is either degraded or used to produce some \(s'\). The axioms eliminate A, so the simplified networks directly produce \(s'\) in presence of \(s\). The simplified kinetic expressions are obtained by computing the value of \(x_{A}\) at equilibrium, and by replacing it in the third kinetic reaction.

We finally present in Fig. 4 the general axiom (intermediate). In this axiom, we consider an intermediate internal molecule A, at equilibrium. It simplifies a model with one reaction that can produce A, a (non-empty) set of reactions that have only A as reactant and whose kinetic expressions are linear in \(x_{A}\), and possibly a set of reactions with A as activator. The axiom then eliminates A and merges the reactions two by two. The linearity of these kinetic expressions is necessary to easily compute the expression of \(x_{A}\) at equilibrium.

Fig. 4. General intermediate molecule axiom.

6 Simplification of the Tet-On Reaction Network

We present here the simplification of the Tet-On system [13–15] using our axioms. The initial \(\textit{Tet-On}_{detailed} \) reaction network, depicted in Fig. 5 (left), has 10 reactions and 11 parameters. We simplify it into the contextually equilibrium-equivalent \(\textit{Tet-On}_{simple} \) network, depicted in Fig. 5 (right), with only 2 reactions and 3 parameters.

Fig. 5. Reaction graphs of the detailed (left) and simplified (right) \(\textit{Tet-On} \) networks. Molecules are represented by circles, and reactions by squares. In the kinetic expressions near the reactions, the \(k_i\) are parameters while \(x_{A}\) is a variable representing the concentration of a molecule A. A dashed arrow means that the molecule acts as a modulator in the reaction, while a dotted arrow means that the molecule can be modified by the context. In the right network, the parameters are \(V = {x_{{ P _{ TRE3G }}}^0V_1k_4k_6}/{k_3(k_5+k_6)}\) and \(K = {k_1k_{-2}K_1}/{x_{ rtTA }^0k_{in}k_2}\).

The Tet-On system [13–15] describes how the production of activated green fluorescent proteins (\( GFP _a \)) in a cell can be stimulated by the presence of doxycycline (\( Dox \)) outside the cell. The detailed network is \(\textit{Tet-On}_{detailed} = \langle I, R \rangle \) where every molecule is internal except for \( Dox \) (i.e. \(I= \textit{Spec}\backslash \{ Dox \}\)), and R is the set of reactions from Fig. 6, inspired by the \(\textit{Tet-On} \) model from [15].

Fig. 6. Reactions of the detailed \(\textit{Tet-On}_{detailed} \) network.

In the network, the doxycycline \( Dox \) moves into the cell and becomes \( Dox _i\) by reaction (1). We assume here that the amount of \( Dox \) is controlled by the environment (for instance by a microfluidics device [30]), and therefore the network cannot modify its concentration. Then \( Dox _i\) is either degraded by reaction (2), or binds to the artificial transcription factor \( rtTA \) by reaction (3). The complex \( rtTADox \) either dissociates (4), or activates the transcription of the gene \({ P _{ TRE3G }}\), producing \( mRNA \) (5). \( mRNA \) either degrades (6) or is translated into \( GFP \) (7). Finally, \( GFP \) needs to be activated into \( GFP _a \) (9) in order to become fluorescent and thus observable under a microscope. Both \( GFP \) and \( GFP _a \) can also be degraded (8, 10).

We are particularly interested in \( GFP _a \), since it is the only experimentally observable molecule. Therefore we assume that all other molecules are at equilibrium, i.e. \(\mathbf E = \{x_{X} \mid X \in \textit{Spec}\backslash \{ GFP _a \}\}\). The simplification follows the axioms from Figs. 2, 3 and 4, which proves that the two networks are contextually equilibrium-equivalent for \(\mathbf E \). Note that in the following simplification, for the sake of readability, some kinetic expressions were slightly rewritten into equivalent expressions.

Let us first remark that the gene \({ P _{ TRE3G }}\) is only used as an activator, in reaction (5). So we apply the axiom (activator), removing \({ P _{ TRE3G }}\) from this reaction, while replacing \(x_{{ P _{ TRE3G }}}\) by \(x_{{ P _{ TRE3G }}}^0\) in its kinetic expression. Next, \( rtTADox \) is an internal molecule at equilibrium, present in three reactions: one that produces it (3), one that consumes it (4), and one that uses it as an activator (5). We therefore use the axiom (intermediate) on it, followed directly by (useless), and merge the three reactions into:

(11)

\( rtTA \) is only used as an activator, so we apply (activator) and simplify (11) into:

(12)

We then apply axiom (cascade \(_1\)) on \( GFP \), replacing the reactions (7), (8) and (9) by:

(13)

We also apply (cascade \(_2\)) on \( Dox _i\), and replace reactions (1), (2), and (12) by:

(14)

Finally we use the axiom (intermediate) followed by (useless) on \( mRNA \), and merge the reactions (6), (13) and (14) into:

(15)

Defining two new parameters \(V = x_{{ P _{ TRE3G }}}^0V_1k_4k_6/(k_3(k_5+k_6))\) and \(K = k_1k_{-2}K_1/(x_{ rtTA }^0k_{in}k_2)\), we eventually obtain the following reaction network:

Notice that, aside from the kinetics, the simplified network is equal to the one we obtained with our qualitative simplification in [20].

7 Conclusion

We presented a new structural simplification of reaction networks that preserves the deterministic semantics. The simplification is contextual, and is based on equilibrium conditions on intermediate molecules. We showed the usefulness of the simplification by applying it to two biological networks.

We are currently implementing the simplification algorithm, with a more complete set of axioms and compatibility with the SBML format. These axioms include variants of the axioms presented here, for instance with different equilibrium conditions, but also other types of axioms, using for instance symmetries in the network. We plan to apply the simplification more systematically to biological systems. It would also be interesting to compare in depth the power of our structural simplification rules to that of the King-Altman method on ODE systems [17]. On the theoretical side, as future work, we want to investigate an approximated equivalence, with approximated equilibrium conditions, and to compute the maximal error of a simplification. A similar simplification method with a stochastic semantics will also be considered.