1 Introduction

We start by introducing the cartoon mechanisms of two enzymatic signaling pathways as depicted in research articles. The important RAS signaling pathway in Fig. 1 includes an extracellular ligand and a transmembrane receptor, which trigger a cascade of protein-protein interactions and enzymatic reactions that are then integrated into key biological responses controlling cell proliferation, differentiation or death. When this pathway is altered, it can lead to unhealthy cell proliferation [41]. Figure 2 presents a more precise description of the last part of the enzymatic cascade.

Fig. 1

The RAS signaling pathway, starting in the membrane of the cell

Fig. 2

Part of the RAS signaling pathway inside the cell, possibly with retroactivity

Figure 3 depicts an osmolarity regulation network in bacteria, which is implemented in part by the EnvZ/OmpR two-component system [49]. The sensor kinase EnvZ (denoted by E in the diagram) autophosphorylates on a histidine residue (Ep) and catalyzes the transfer of the phosphate group to the aspartate residue of the response regulator OmpR (O), which then acts as an effector. In this mechanism, when EnvZ is bound to ATP (ET), it also catalyzes hydrolysis of the phosphorylated OmpR-P (Op), which is a transcription factor that regulates the expression of various protein pores. This unusual design keeps the limit concentration of OmpR-P at a value that is independent of the positive initial concentrations.

Fig. 3

EnvZ-OmpR bacterial model

When we first look at these biological mechanisms, it does not seem evident that algebra and geometry can be used to analyze them. But we will argue in this chapter that this is indeed the case and that we can contribute with these mathematical tools to the understanding of questions in Systems Biology.

In particular, in the realm of biochemical reaction networks, that is, chemical reaction networks in biochemistry, the usual mass-action kinetics modeling of the evolution of the concentrations of the different chemical species along time (such as RAS, RAF, MEK, ERK, E, O, etc. above) yields an autonomous system of polynomial ordinary differential equations \( \frac {dx} {dt} = f_\kappa (x)\) in the unknown vector of concentrations x of the species as functions of time, for each choice of the (real positive) reaction rate constants κ (see Definition 1). In fact, these equations are associated to a labeled directed graph G of reactions. The monomial terms come from the labels of the nodes of G by complexes in the given species, while the coefficients depend on the (positive) reaction rate constants κ that label the edges of G and on the total production of each reaction (which is the difference of the labels of the target and source nodes). The real polynomials \(f_\kappa(x)\) carry a combinatorial structure inherited from G and we will also think of κ as parameters and consider the family of differential systems parametrized by them. Chemical Reaction Network Theory (CRNT) was initiated by Horn and Jackson and subsequently developed by Feinberg and his students and collaborators [22], and it has seen great development over the last years, when new combinatorial and algebro-geometric techniques have been introduced. We refer the reader to the survey article [16] for basic definitions, results and further references, and we review here some advances developed after that article was published.

In Sect. 4 we recall the notion of MESSI systems we introduced in [42]. Many post-translational modification networks are MESSI networks, for example: the motifs in [23], sequential distributive multisite networks [52], sequential processive multisite phosphorylation networks [12], phosphorylation cascades or the bacterial EnvZ/OmpR network from [49] in Fig. 3. Our work is inspired by and extends some results in several previous articles [24, 28, 29, 31, 39, 43, 48, 51]. MESSI is an acronym for Modifications of type Enzyme-Substrate or Swap with Intermediates (see Definition 2). Networks with an underlying MESSI structure include many post-translational modification networks, as well as all linear systems arising from mass-action kinetics (a.k.a. Laplacian dynamics [38]). We summarize some results and algorithms based on this structure to predict conservation relations, persistence, the capacity for multistationarity, and the description of regions of multistationarity. Once the network has the capacity for multistationarity, the next main question is how to predict parameters or, if possible, regions in parameter space which give rise to multistationary systems; these are called multistationarity regions. In Sect. 5 we comment on several recent approaches to study multistationarity in chemical reaction networks. Section 6 mentions the mostly unexplored question of the a priori determination of the occurrence of oscillations in chemical reaction networks, in particular, in enzymatic networks. We end the paper with two main open questions.

2 Basics of Mass-Action Kinetics

In this section we set the basic terminology and the mathematical concepts mentioned in the introduction. In particular, we discuss the notion of multistationarity.

Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions. Before giving the precise Definition 1, we instantiate mass-action kinetics in a biological example of a simple two-component mechanism. It relies on phosphotransfer reactions. Upon receiving a signal, the hybrid histidine kinase HK, which has two phosphorylatable domains, can self-phosphorylate. We denote the phosphorylation state of each site by p, if the site is phosphorylated, and 0, if it is not; the four possible forms are HK00, HKp0, HK0p, HKpp. The response regulator protein is denoted by RR when it is unphosphorylated and by RRp when it is phosphorylated. Given a vector of reaction rate constants \(k=(k_1, \dots , k_6)\in \mathbb {R}^6_{>0}\), the (directed) graph of reactions equals:

where each of the ten nodes corresponds to a complex on the six chemical species, which we number in the following order: HK00, HKp0, HK0p, HKpp, RR, RRp. Mass-action kinetics specifies how the respective concentrations \(x_1, \dots, x_6\) of these six species evolve with time. The basic principle in this modeling is derived from the idea that the rate of an elementary reaction is proportional to the probability of collision of the reactants, which under an independence assumption equals the product of their concentrations. We derive the following autonomous polynomial dynamical system \(\frac {dx_i}{dt} = f_i(x), \, i=1, \dots , 6\):

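It is straightforward to check that the following linear dependencies hold and generate all the linear dependencies among \(f_1, \dots, f_6\):

$$\displaystyle \begin{aligned} f_1 + f_2 + f_3 + f_4 \, = \, 0, \qquad f_5 + f_6 \, = \, 0, \end{aligned} $$

from which we deduce two linear conservation relations:

$$\displaystyle \begin{aligned} x_1 + x_2 + x_3 + x_4 \, = \, T_1, \qquad x_5 + x_6 \, = \, T_2. \end{aligned} $$

Thus, trajectories lie in a 4-plane in 6-space. The total conservation constants \(T_1, T_2\) are determined by the initial conditions \((x_1(0), \dots, x_6(0))\).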
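The structure of this example can be checked mechanically. The following is a minimal sketch in Python, assuming the six reactions listed in `REACTIONS`. This list is an assumption made for illustration: it is one set of phosphorylation and phosphotransfer steps consistent with ten complexes and six rate constants, and the helper name `rhs` is hypothetical. The code evaluates the mass-action right-hand side at a sample point and confirms that the total amounts of HK and of RR do not change.

```python
from fractions import Fraction

SPECIES = ["HK00", "HKp0", "HK0p", "HKpp", "RR", "RRp"]

# Assumed reactions (source complex, target complex); the n-th entry has rate constant k_n.
REACTIONS = [
    ({"HK00": 1}, {"HKp0": 1}),
    ({"HKp0": 1}, {"HK0p": 1}),
    ({"HK0p": 1}, {"HKpp": 1}),
    ({"HK0p": 1, "RR": 1}, {"HK00": 1, "RRp": 1}),
    ({"HKpp": 1, "RR": 1}, {"HKp0": 1, "RRp": 1}),
    ({"RRp": 1}, {"RR": 1}),
]


def rhs(x, k):
    """Mass-action right-hand side (f_1, ..., f_6) at concentrations x, rate constants k."""
    conc = dict(zip(SPECIES, x))
    f = {name: Fraction(0) for name in SPECIES}
    for (source, target), rate in zip(REACTIONS, k):
        # Reaction rate: rate constant times the product of reactant concentrations.
        flux = Fraction(rate)
        for name, mult in source.items():
            flux *= Fraction(conc[name]) ** mult
        for name, mult in source.items():
            f[name] -= mult * flux
        for name, mult in target.items():
            f[name] += mult * flux
    return [f[name] for name in SPECIES]


# Distinct complexes appearing as nodes of the assumed reaction graph.
complexes = {tuple(sorted(c.items())) for reaction in REACTIONS for c in reaction}

f = rhs(x=[1, 2, 3, 4, 5, 6], k=[1, 2, 3, 4, 5, 6])
hk_total_rate = sum(f[:4])  # f_1 + f_2 + f_3 + f_4
rr_total_rate = sum(f[4:])  # f_5 + f_6
print(len(complexes), hk_total_rate, rr_total_rate)
```

Exact rational arithmetic is used so that the two sums vanish identically rather than up to rounding.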

Given a numbering of the species as above, we usually identify a complex on these species with a nonnegative integer vector. For example, the complex \(y = X_3 + X_5\) is identified with the vector \(e_3+e_5=(0,0,1,0,1,0) \in \mathbb {Z}_{\ge 0}^6\). The general definition is as follows.

Definition of Chemical Reaction Networks and Mass-Action Kinetics

Definition 1

A chemical reaction network (on a finite set of s species, which we assume ordered) is a finite labeled directed graph \(G = (V, E, (\kappa_{ij})_{(i,j) \in E}, (y_i)_{i=1,\dots,m})\), whose vertices V are labeled by complexes \(y_1, \dots , y_m \in \mathbb Z_{\ge 0}^{s}\) and whose edges (i, j) ∈ E are labeled by positive real numbers \(i \stackrel {\kappa _{ij}}{\rightarrow } j\). We will also say that G is a network.

Mass-action kinetics specified by the network G gives the following autonomous system of ordinary differential equations in the concentrations \(x = (x_1, x_2, \dots, x_s)\) of the species as functions of time:

$$\displaystyle \begin{aligned} \frac {dx} {dt} \quad = \quad \sum_{(i,j) \in {E}} \kappa_{ij} \, {x^{y_i}} \, (y_j -y_i) \quad = \quad f_\kappa (x). \end{aligned} $$
(1)

Here, \(\frac {dx} {dt} \) and \(y_j - y_i\) are column vectors.

Note that the coordinates \(f_1, \dots, f_s\) of \(f_\kappa\) are polynomials in \(\mathbb R[x_1, \dots ,x_s]\) (to ease the notation we omit the dependence of \(f_i\) on κ). Many systems occurring in population dynamics, for example the oscillatory Lotka-Volterra equations, can be viewed as arising from a chemical reaction network as in (1), but not, for instance, the “chaotic” Lorenz equations. A simple characterization of autonomous dynamical systems arising from chemical reaction networks under mass-action kinetics has been given by Hárs and Tóth. We refer to the book [20], which also contains an introduction to the stochastic modeling of chemical kinetics.
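Equation (1) translates almost literally into code. The sketch below is one possible implementation, with a hypothetical function name `mass_action_rhs`; the Lotka-Volterra network X → 2X, X + Y → 2Y, Y → 0 and its numerical rate constants are assumptions chosen only to illustrate the remark above.

```python
def mass_action_rhs(complexes, edges, x):
    """Evaluate f_kappa(x) = sum over edges (i, j) of kappa_ij * x^{y_i} * (y_j - y_i)."""
    s = len(x)
    f = [0.0] * s
    for (i, j), kappa in edges.items():
        # Monomial x^{y_i} attached to the source complex of the edge.
        monomial = 1.0
        for conc, exponent in zip(x, complexes[i]):
            monomial *= conc ** exponent
        # Add kappa * x^{y_i} * (y_j - y_i), coordinate by coordinate.
        for n in range(s):
            f[n] += kappa * monomial * (complexes[j][n] - complexes[i][n])
    return f


# Species ordered as (X, Y). Complexes: X, 2X, X + Y, 2Y, Y, 0.
complexes = [(1, 0), (2, 0), (1, 1), (0, 2), (0, 1), (0, 0)]
# Edges (source index, target index) -> rate constant (illustrative values).
edges = {(0, 1): 1.5, (2, 3): 0.5, (4, 5): 2.0}

f = mass_action_rhs(complexes, edges, x=(3.0, 4.0))
print(f)
```

With these choices the two coordinates are 1.5·x − 0.5·x·y and 0.5·x·y − 2·y, which is the classical Lotka-Volterra form.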

Another direct consequence of the form of the equations in (1) is that for any trajectory x(t), the vector \(\frac {d x}{dt}\) lies for all t (in any interval I containing 0 where it is defined) in the so-called stoichiometric subspace S, which is the linear subspace generated by the differences \(\{y_j - y_i \, | \, (i, j) \in E\}\). Using the shape of the polynomials \(f_i\) it can be seen that the positive orthant \(\mathbb R_{>0}^s\) and its closure \(\mathbb R_{\ge 0}^s\) are forward-invariant for the dynamics. Then, any trajectory x(t) starting at a nonnegative point x(0) lies for all \(t \in I \cap \mathbb {R}_{>0}\) in the closed polyhedron \((x(0) +S) \cap \mathbb {R}_{\geq 0}^s\), which is called a stoichiometric compatibility class, or for short, an S-class.

Denote by q the codimension of S. Given a basis \(\ell_1, \dots, \ell_q\) of the linear forms vanishing on S, let \(T_i = \ell_i(x(0))\), \(i = 1, \dots, q\). The equations \(\ell_1(x) = T_1, \dots, \ell_q(x) = T_q\) of \(x(0) + S = S_T\) give linear conservation relations and, as above, the constant coefficient \(T_i\) of such a linear equation is called a total conservation constant.
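As a small illustration of these definitions, the following sketch computes a basis of linear forms vanishing on S, together with the corresponding total conservation constants, for the toy network A + B ⇌ C. The network, the helper `nullspace` and the initial condition are assumptions chosen for illustration. The basis returned is only one of many possible choices, and its coefficients need not be nonnegative.

```python
from fractions import Fraction


def nullspace(rows, ncols):
    """Basis of {v : row . v = 0 for every row}, by exact Gauss-Jordan elimination."""
    m = [[Fraction(a) for a in row] for row in rows]
    pivots = []
    r = 0
    for c in range(ncols):
        pivot_row = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot_row is None:
            continue
        m[r], m[pivot_row] = m[pivot_row], m[r]
        pivot_value = m[r][c]
        m[r] = [a / pivot_value for a in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        # One basis vector per free column of the reduced matrix.
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -m[row_index][free]
        basis.append(v)
    return basis


# Toy network A + B <-> C, species ordered as (A, B, C).
reaction_vectors = [(-1, -1, 1), (1, 1, -1)]  # y_j - y_i for each edge
forms = nullspace(reaction_vectors, 3)        # linear forms vanishing on S
q = len(forms)                                # codimension of S

x0 = (2, 3, 1)                                # illustrative initial condition
totals = [sum(c * v for c, v in zip(form, x0)) for form in forms]
print(q, forms, totals)
```

Any other basis of the same space, for instance the forms x_A + x_C and x_B + x_C, defines the same S-classes with different values of the constants.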

The Steady State Variety and the Notion of Multistationarity

The steady state variety \(V_\kappa(f)\) of the kinetic system (1) is the set of nonnegative real zeros of \(f_1, \dots, f_s\):

$$\displaystyle \begin{aligned} V_\kappa(f) \, = \{ x \in \mathbb{R}^s_{\ge 0}\, : \, f_1(x) = \dots =f_s(x)=0 \}. \end{aligned} $$
(2)

An element of \(V_\kappa(f)\) is called a steady state of the system and corresponds to a constant trajectory in the nonnegative orthant. We say that system (1) exhibits multistationarity if there exist at least two positive steady states with the same total conservation constants, that is, in the same S-class. This is an important property for chemical reaction networks modeling biological processes, since the occurrence of multistationarity allows for different responses of the cell under the same total conservation constants, depending on the initial conditions.

In fact, our point of view will be the following. The underlying reaction network \((V, E, (y_i)_{i=1,\dots,m})\) defines a family of autonomous polynomial dynamical systems depending on the positive parameters \(\kappa \in \mathbb R_{>0}^{\# E}\). We say that it has the capacity for multistationarity if there is a choice of reaction rate constants \(\kappa = (\kappa_{ij})_{(i,j) \in E}\) and total conservation constants \(T = (T_1, \dots, T_q)\) for which the intersection of the steady state variety \(V_\kappa(f)\) with the positive points of the linear variety \(S_T\) consists of more than one point (that is: there exist parameters κ and T such that there are at least two points in the positive orthant lying in the intersection of the steady state variety \(V_\kappa(f)\) with the S-class defined by T).
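A minimal sketch of multistationarity, using a toy one-species network that is not taken from the text: 0 ⇌ X, 2X ⇌ 3X. Here S is the whole line, so there are no conservation relations and the positive half-line is the only S-class. The rate constants are assumptions chosen so that the mass-action polynomial is −(x − 1)(x − 2)(x − 3), and the root finder `positive_steady_states` is a hypothetical helper based on a grid scan followed by bisection.

```python
def f(x, k_in=6.0, k_out=11.0, k_auto=6.0, k_back=1.0):
    """Mass-action rate of change for 0 -> X, X -> 0, 2X -> 3X, 3X -> 2X."""
    return k_in - k_out * x + k_auto * x**2 - k_back * x**3


def df(x, k_out=11.0, k_auto=6.0, k_back=1.0):
    """Derivative of f, which is the 1 x 1 Jacobian of the system."""
    return -k_out + 2 * k_auto * x - 3 * k_back * x**2


def positive_steady_states(func, lo=1e-6, hi=10.0, steps=1000):
    """Locate sign changes of func on a grid over [lo, hi] and refine them by bisection."""
    roots = []
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    for a, b in zip(grid, grid[1:]):
        if func(a) == 0.0:
            roots.append(a)
        elif func(a) * func(b) < 0:
            for _ in range(60):
                mid = 0.5 * (a + b)
                if func(a) * func(mid) <= 0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
    return roots


steady_states = positive_steady_states(f)
# A steady state of a one-dimensional system is stable when the derivative is negative.
stable = [df(x) < 0 for x in steady_states]
print(steady_states, stable)
```

For these constants the scan finds three positive steady states, the outer two stable and the middle one unstable, so this choice of κ is a multistationarity parameter for the toy network.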

There are many results to decide the capacity for multistationarity of a given chemical reaction network, starting with [14]. Most of them have been summarized in Theorem 1.4 of [39]. In fact, these results give in general necessary and sufficient conditions for the stronger condition that the map \(f_\kappa\) is injective on the positive points of all S-classes. There are several implementations of different algorithms, starting with the pioneering algorithm implemented by Feinberg and his group in the Chemical Reaction Network Toolbox. The link to the corresponding webpage together with links to other algorithms can be found at https://reaction-networks.net/wiki/Mathematics_of_Reaction_Networks#. We recall some of the tools to address this question in Sects. 4 and 5.

In Fig. 4, there is a range of values of T for which there are three positive steady states on the corresponding translate \(S_T\) of S (i.e., in an S-class) for a fixed value κ of positive rate constants. So, the chemical reaction network has the capacity for multistationarity and κ is a choice of multistationarity parameter.

Fig. 4

The green curve represents the steady state variety \(V_{\kappa ^*}(f)\). The subspace \(S = \{\ell = 0\}\) is a line. The number of points of intersection of the translates \(S_T = \{\ell = T\}\) of S with \(V_{\kappa ^*}(f)\) in the positive orthant depends on the total conservation constant T

We feature two kinds of multistationarity pictures from the literature. One way to find the special values rendering these figures is by measurements in experiments or by exhaustive (and lucky) simulations of the trajectories taking sample values in the space of parameters and initial conditions. Instead, one can try to develop algebro-geometric tools to analyze the mathematical models arising from biochemical reaction networks, with the goal of making predictions from the structure of the networks.

Figure 5 corresponds to a 2-site sequential phosphorylation and dephosphorylation that we describe in Sect. 3 below. This network has 15 parameters: 12 reaction rate constants and 3 total conservation constants. In the picture, all the reaction rate constants and two of the total conservation constants have been specialized and only the total conservation constant Etot of one enzyme is varying. This is considered to be the input variable (or stimulus) and it is represented on the x-axis. The number of chemical species is equal to 9, but only the steady state concentration s of one of the phosphorylated substrates is represented, which is considered the response of the system. It happens that in this case any positive value of s is one coordinate of a positive steady state and different steady states in the same S-class have different s-coordinates. The steady state s-coordinate is represented on the y-axis. For small or big values of Etot, only one value of s is possible, so this is a monostationary regime. In the middle zone, there are three steady states, two stable and one unstable, so this is the bistable regime (stability of steady states is determined by the negativity of the real part of the eigenvalues of the Jacobian). This figure corresponds to a very particular two-dimensional “slice” of points originally in 24 = 15 + 9 variables, where 14 variables have been specialized and 8 variables are not shown.

Fig. 5

Only one parameter is allowed to vary

Figure 6 represents a two dimensional “slice”, but in parameter space, of another mechanism that we do not specify, but in which only two of the parameters (a, b) are allowed to vary. For each of the values of (a, b) outside the line segments separating the regions, there are either one or three positive steady states, which could be stable or unstable. In fact, in most biochemical networks these curves separating the regions are far from being line segments; they are high order algebraic hypersurfaces that separate different semialgebraic regions where the qualitative dynamics is the same, in a high dimensional parameter space. Moreover, regions with interesting behaviour could be small.

Fig. 6

Only two of the parameters are allowed to vary

The separating hypersurfaces related to the question of multistationarity are described by the union of the discriminant associated to the equations describing \(V_\kappa(f)\) and \(S_T\) with respect to the x variables (which vanishes whenever there is a point where the intersection of the steady state variety and the S-class is non-transversal), and the union for any i ∈{1, …, s} of the resultant describing the fact that there is a common point with \(x_i = 0\). In each chamber (connected component) of the complement of the union of these algebraic varieties, the number of real roots is the same and moreover, for each of the real roots it holds that the sign of each of the coordinates does not change as the parameters are moved, and thus the number of real roots with a fixed sign (for instance, positive roots) is constant along the chamber. We refer the reader to the book [26] for the notions of discriminant and resultant, which are in general not linear. These polynomials in the parameters can be computed effectively, in theory, via different computational algebraic geometry methods of elimination of variables, but standard computations are not feasible when there are many variables. Even if one can compute these equations, it is a very complicated task to then describe all the possible chambers in the complement of their zero locus, or at least to find one representative in each chamber. There are implementations by M. Safey El Din, which work very well in small examples using his package RAGlib [47].
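To make the role of the discriminant concrete, here is a minimal sketch for a toy one-parameter family that is not taken from the text: the cubic f(x) = −x³ + 6x² − 11x + k_in, where k_in plays the role of the only free parameter. The function names are hypothetical, and the formula used is the classical discriminant of a cubic. In this family every real root is positive as soon as k_in > 0 (all coefficients of f(−x) are then positive), and the condition that there is a root with x = 0 is simply k_in = 0.

```python
def cubic_discriminant(a, b, c, d):
    """Discriminant of a*x^3 + b*x^2 + c*x + d with respect to x."""
    return (b * b * c * c - 4 * a * c**3 - 4 * b**3 * d
            - 27 * a * a * d * d + 18 * a * b * c * d)


def steady_state_discriminant(k_in):
    """Discriminant in x of f(x) = -x^3 + 6x^2 - 11x + k_in, a polynomial in k_in."""
    return cubic_discriminant(-1, 6, -11, k_in)


def number_of_real_steady_states(k_in):
    """Number of distinct real roots of f, read off from the sign of the discriminant."""
    disc = steady_state_discriminant(k_in)
    if disc > 0:
        return 3
    if disc < 0:
        return 1
    return None  # k_in lies on the discriminant locus: f has a multiple root


for k_in in (5, 6, 7):
    print(k_in, steady_state_discriminant(k_in), number_of_real_steady_states(k_in))
```

The discriminant is a quadratic polynomial in k_in whose two roots bound the interval of parameters with three steady states; outside this interval there is only one. In realistic networks the analogous polynomials have many variables and high degree, which is the computational difficulty described above.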

3 Two Important Families of Enzymatic Networks

In this section, we introduce common enzymatic mechanisms that will help us exemplify and clarify the concepts we will introduce in Sect. 4.

Sequential Phosphorylations

The multisite n-phosphorylation system describes the phosphorylation of a protein with n sites (where a phosphate group can be attached or removed) by a pair of enzymes (a kinase and a phosphatase) in a sequential and distributive mechanism. The Nobel Prize in Physiology or Medicine was awarded in 1992 to Edmond Fischer and Edwin Krebs “for their discoveries concerning reversible protein phosphorylation as a biological regulatory mechanism.” The kinase and the phosphatase speed up the transformation of other proteins without being incorporated in the final products of the process, which is crucial in the regulation of metabolism in the body. Multisite phosphorylation plays important regulatory roles in cell cycle regulation and inflammation pathways, and is implicated in multiple disorders, including Alzheimer's disease. Because of the important role played by these systems in signal transduction networks inside the cell, there is a body of work on the mathematics of phosphorylation systems (which belong to the more general class of post-translational modification systems). We refer the reader to the papers [33, 43, 50] and the references therein.

We now describe the special case of a sequential phosphorylation/dephosphorylation with n = 2 sites, which is also known as the dual futile cycle. There are nine species: three substrates (the unphosphorylated substrate S0, and the substrates with one and two phosphorylated sites, S1 and S2), two enzymes (the kinase E and the phosphatase F), and four intermediate species (ES0, ES1, FS2 and FS1). We give the twelve rate constants their usual names in the literature [52].

We number the species and their concentrations as follows: first the substrates S0, S1, S2; then the intermediate species ES0, ES1, FS2, FS1; then the kinase E; and finally the phosphatase F. The associated system of ODE’s defined in (1) equals in this case:

There are 3 independent linear conservation laws, for instance:

where Stot, Etot, Ftot are positive real numbers for any choice of initial condition in the positive orthant. As we pointed out in Sect. 2, there are 12 + 3 = 15 parameters. The n-site sequential mechanism is similar, with 3n + 3 variables, 6n reaction rate constants and always 3 total conservation constants, so a total of 6n + 3 parameters.
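These counts can be checked mechanically. The sketch below builds the n-site sequential distributive mechanism, assuming that every step has the form substrate + enzyme ⇌ intermediate → product + enzyme, and computes the number of species, the number of reactions and the number of independent linear conservation laws (the number of species minus the rank of the reaction vectors). All function names are hypothetical, and the species names follow the n = 2 case described above.

```python
from fractions import Fraction


def n_site_network(n):
    """Species and reactions (source, target) of the assumed n-site mechanism."""
    species = [f"S{i}" for i in range(n + 1)] + ["E", "F"]
    species += [f"ES{i}" for i in range(n)] + [f"FS{i}" for i in range(1, n + 1)]
    reactions = []
    for i in range(n):
        low, high, es, fs = f"S{i}", f"S{i + 1}", f"ES{i}", f"FS{i + 1}"
        reactions += [
            ((low, "E"), (es,)),    # kinase binds
            ((es,), (low, "E")),    # kinase unbinds
            ((es,), (high, "E")),   # phosphorylation
            ((high, "F"), (fs,)),   # phosphatase binds
            ((fs,), (high, "F")),   # phosphatase unbinds
            ((fs,), (low, "F")),    # dephosphorylation
        ]
    return species, reactions


def reaction_vectors(species, reactions):
    """Vectors y_j - y_i, one per reaction."""
    index = {name: pos for pos, name in enumerate(species)}
    vectors = []
    for source, target in reactions:
        v = [0] * len(species)
        for name in source:
            v[index[name]] -= 1
        for name in target:
            v[index[name]] += 1
        vectors.append(v)
    return vectors


def rank(rows):
    """Rank of an integer matrix by exact Gaussian elimination."""
    m = [[Fraction(a) for a in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def summary(n):
    """(number of species, number of reactions, number of independent conservation laws)."""
    species, reactions = n_site_network(n)
    vectors = reaction_vectors(species, reactions)
    return len(species), len(reactions), len(species) - rank(vectors)


print(summary(2))
```

Calling `summary(n)` for several values of n is a quick way to compare the output with the formulas 3n + 3, 6n and 3 stated above.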

Phosphorylation Cascades

We have already encountered a coarse diagram of an enzymatic cascade in Fig. 2. MAP kinase cascades are important signal transduction systems in molecular biology for which there is also a body of mathematical work, see for instance [35, 41] and the references therein. These cascades correspond to a network of enzymatic reactions arranged in layers, where usually in each of them there is a futile cycle of sequential phosphorylations and such that the fully phosphorylated substrate serves as an enzyme for the next layer.

The simplest case of a cascade with the capacity for multistationarity [23] consists of a cascade with two layers and a single phosphorylation/dephosphorylation at each layer, with one phosphatase. It corresponds to a labeled digraph, with 9 variables and 18 parameters, where each single phosphorylation follows the same mechanism as in our previous example, with an intermediate species. The nine species are the substrates in the first layer, the substrates in the second layer, four intermediate complexes, a kinase and the same phosphatase to dephosphorylate the substrates in both layers. The forward enzyme in the second layer is the phosphorylated substrate from the first layer.

This mechanism is usually depicted as follows, hiding the reaction rate constants and the intermediate species:

In this case, there are 4 linearly independent conservation relations. Denoting with small letters the concentration of each of the species, these conservation relations can be chosen as follows, as predicted in Theorem 3.2 in [42] (see (4) below):

where Stot, Ptot, Etot, Ftot are positive real numbers for any choice of initial condition in the positive orthant.

We can also consider cascades with any number n of layers. In this case, the number of variables, the number of reaction rate constants and the number of independent linear conservation relations (as well as the number of linear conservation constants) grow linearly with n.

4 MESSI Systems

In this section we recall the notion of MESSI networks from [42], to describe a common structure underlying the four examples above in their different variants as well as many “popular” biological networks, that consist of Modifications of type Enzyme-Substrate or Swap with Intermediates. The occurrence of this structure allows us to prove general results for quite different mechanisms. The basic ingredient of a MESSI structure is a partition of the set of species, which reflects the different chemical behaviors. This grouping of the chemical species into disjoint subsets is in accordance with the intuitive partition of the species according to their function that biochemists have. We will denote the disjoint union of sets with the symbol \(\bigsqcup \).

Definition of a MESSI System

Definition 2

A MESSI network is a chemical reaction network satisfying the following properties. First of all, there exists a partition of the set \(\mathcal {S}\) of species

$$\displaystyle \begin{aligned} {\mathcal S} \, = \, \mathcal{S}^{(0)}\bigsqcup \mathcal{S}^{(1)} \bigsqcup \mathcal{S}^{(2)} \bigsqcup \dots \bigsqcup \mathcal{S}^{(m)}, \end{aligned} $$
(3)

where m ≥ 1, \(\mathcal {S}^{(0)}\) is the subset of intermediate species and could be empty, and all \(\mathcal {S}^{(i)}\) with i ≥ 1 are nonempty subsets, formed by what we call core species. We require that the complexes and reactions satisfy the following conditions. An intermediate species can only be part of a monomolecular complex consisting only of this species (called an intermediate complex). Non-intermediate complexes are called core complexes and consist of either one chemical species or two chemical species belonging to different subsets of the partition. Denote by y → y′ the existence of an edge from complex y to complex y′ or a directed path of reactions from y to y′ through intermediate complexes. We require that for any intermediate complex \(y_0\), there exist core complexes y, y′ such that y → \(y_0\) → y′. If there are two monomolecular core complexes y → y′, then both should consist of a species in the same \(\mathcal {S}^{(\alpha )}\). We further ask that if there is a reaction between a monomolecular and a bimolecular complex, the monomolecular complex is an intermediate, and that if y, y′ are bimolecular core complexes such that y → y′, then there exist two different core subsets \(\mathcal {S}^{(\alpha )}, \mathcal {S}^{(\beta )}\) in the partition, such that both y and y′ consist of a species in each of them.

When endowed with mass-action kinetics, a MESSI network gives rise to a MESSI system of polynomial autonomous ODE’s.

All the Networks We Mentioned Are MESSI

All the networks we mentioned in the text (plus many other common biochemical networks) can be endowed with the structure of a MESSI system. We gave different colors to the different subsets in a possible partition of the species.

For instance, in the cascade depicted in Fig. 2 in the Introduction, the intermediate species (complexes) are not displayed, but we presented with different colors a possible partition of the core species that defines a MESSI structure. In the network depicted in Fig. 3 the partition into a subset of intermediate species (in black), and two subsets of core species (in red and blue) also defines a MESSI structure.

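In the two-component system in Sect. 2, we could take \(\mathcal {S}^{(0)} = \emptyset \), \(\mathcal {S}^{(1)} = \{HK_{00}, HK_{p0}, HK_{0p}, HK_{pp}\}\), and \(\mathcal {S}^{(2)} = \{RR, RR_p\}\).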

In the example of the sequential phosphorylation in Sect. 3, we could take \(\mathcal {S}^{(0)} = \{ES_0, ES_1, FS_2, FS_1\}\), \(\mathcal {S}^{(1)} = \{S_0, S_1, S_2\}\); \(\mathcal {S}^{(2)} = \{E\}\), and \(\mathcal {S}^{(3)} = \{F\}\). It can be checked that all conditions are satisfied. Note that if we consider the coarser partition with the same set of intermediate species \(\mathcal {S}^{(0)}\), the same set \(\mathcal {S}^{(1)}\) of core species, and just one other set {E, F} of core species, we also have a MESSI structure. In fact, there is in general a poset of possible partitions (and in other examples there could be non-comparable partitions).

On the other hand, in the example of the cascade in Sect. 3, we can partition the set of nine species as follows to define a MESSI structure in the 2-layer cascade: \(\mathcal {S}^{(0)}\) consists of the four intermediate species \(\{ES_0, FS_1, S_1P_0, FP_1\}\), plus the core subsets \(\mathcal {S}^{(1)} = \{S_0, S_1\}\), \(\mathcal {S}^{(2)} = \{P_0, P_1\}\), \(\mathcal {S}^{(3)} = \{E\}\), and \(\mathcal {S}^{(4)} = \{F\}\).

Conservation Laws

The first general result about MESSI systems is that we can describe enough (explicit) linear conservation relations with positive coefficients. Given a partition (3) of the set \(\mathcal {S}\) of variables into one intermediate subset and m ≥ 1 nonempty core subsets defining a MESSI structure in a given network G, note that the associated autonomous polynomial dynamical system defined in (1) is linear in the variables of each \(\mathcal {S}^{(i)}\) union the subset \(\mathrm{Int}_i\) consisting of those intermediate species y′ for which there exists a core complex y containing one species of \(\mathcal {S}^{(i)}\) such that y → y′ (for any fixed i = 1, …, m). The union of these subsets \(\mathrm{Int}_i\) equals \(\mathcal {S}^{(0)}\), but they are in general not disjoint, because if in the notation above y also contains a species in another \(\mathcal {S}^{(j)}\), then y′ also belongs to \(\mathrm{Int}_j\). These intersections account for several important properties of the systems.

Theorem 3.2 in [42] asserts that given a partition of \(\mathcal {S} =\{x_1, \dots , x_s\}\) defining a MESSI structure as in (3), the following linear forms \(\ell_1, \dots, \ell_m\) vanish on the stoichiometric subspace S:

$$\displaystyle \begin{aligned} \ell_i(x) = \sum_{x_j \in \mathcal{S}^{(i)}} \, x_j \, + \, \sum_{x_j \in \mathrm{Int}_i} x_j, \, \, i=1, \dots, m. \end{aligned} $$
(4)

We refer the reader to Section 3 in [42] for conditions ensuring that these are a basis of conservation relations (and examples where this is not the case). We conclude that all MESSI systems are conservative. Thus, all S-classes are compact, and all trajectories are bounded and defined for any positive time. In fact, given a MESSI network, if x is a trajectory of the associated mass-action kinetics dynamical system (\(\dot {x}(t) = f(x(t))\) for all t in an open interval containing \(\mathbb {R}_{\ge 0}\)) with \(x(0) \in \mathbb {R}^s_{>0}\), let \((T_1, \dots, T_m) = (\ell_1(x(0)), \dots, \ell_m(x(0)))\). Then, for any t ≥ 0 it holds that \(\ell_i(x(t)) = T_i\) for any i. Moreover, all the coefficients of the linear form \(\ell =\sum _{i=1}^m \ell _i\) are positive and \(\ell (x(t))= \sum _{i=1}^m T_i >0\).
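Formula (4) can be evaluated directly from a partition. The following sketch does this for the two-layer cascade of Sect. 3, under two assumptions made for illustration: each phosphorylation or dephosphorylation has the form substrate + enzyme ⇌ intermediate → product + enzyme, and the core subsets are {S0, S1}, {P0, P1}, {E} and {F}. The helper names are hypothetical. The code computes the sets Int_i by following paths through intermediate complexes and then checks that each resulting linear form is constant along every reaction.

```python
def cx(*names):
    """A complex, stored as a sorted tuple of species names."""
    return tuple(sorted(names))


INTERMEDIATES = {"ES0", "FS1", "S1P0", "FP1"}
# Assumed core subsets, each named after the total amount it would conserve.
PARTITION = {"S": {"S0", "S1"}, "P": {"P0", "P1"}, "E": {"E"}, "F": {"F"}}


def enzymatic_step(substrate, enzyme, intermediate, product):
    """Reactions substrate + enzyme <-> intermediate -> product + enzyme."""
    return [
        (cx(substrate, enzyme), cx(intermediate)),
        (cx(intermediate), cx(substrate, enzyme)),
        (cx(intermediate), cx(product, enzyme)),
    ]


REACTIONS = (
    enzymatic_step("S0", "E", "ES0", "S1")
    + enzymatic_step("S1", "F", "FS1", "S0")
    + enzymatic_step("P0", "S1", "S1P0", "P1")
    + enzymatic_step("P1", "F", "FP1", "P0")
)


def is_intermediate(complex_):
    return len(complex_) == 1 and complex_[0] in INTERMEDIATES


def int_sets(reactions, partition):
    """Int_i: intermediates reached, through intermediates only, from a core complex meeting S^(i)."""
    successors = {}
    for source, target in reactions:
        successors.setdefault(source, set()).add(target)
    result = {name: set() for name in partition}
    for start in successors:
        if is_intermediate(start):
            continue
        stack, reached = list(successors[start]), set()
        while stack:
            y = stack.pop()
            if is_intermediate(y) and y not in reached:
                reached.add(y)
                stack.extend(successors.get(y, ()))
        for name, subset in partition.items():
            if subset.intersection(start):
                result[name] |= {y[0] for y in reached}
    return result


def conservation_supports(reactions, partition):
    """Species appearing (with coefficient 1) in each linear form of (4)."""
    ints = int_sets(reactions, partition)
    return {name: partition[name] | ints[name] for name in partition}


def is_conserved(support, reactions):
    """True when the form with this support takes equal values on source and target."""
    return all(
        sum(sp in support for sp in target) == sum(sp in support for sp in source)
        for source, target in reactions
    )


supports = conservation_supports(REACTIONS, PARTITION)
for name, support in supports.items():
    print(name, sorted(support), is_conserved(support, REACTIONS))
```

Note that an intermediate such as S1P0 appears in two of the supports, which reflects the fact that the sets Int_i are in general not disjoint.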

The Associated Digraphs

In order to state some other general results for MESSI networks, we introduce three digraphs \(G_1\), \(G_2\), \(G_E\) associated with a given MESSI network G with a vector of rate constants κ. We refer the reader to Section 3 in [42] for complete definitions, explanations and examples.

We eliminate all intermediate species to define \(G_1\), which naturally inherits a MESSI structure: the species of \(G_1\) are the core species of G, its complexes are the core complexes of G and there is an edge between two core complexes y, y′ precisely when y → y′ in G. The rate constants of \(G_1\) are rational functions τ(κ) with nonzero denominator over all positive κ, chosen in such a way that, when \(G_1\) is endowed with mass-action kinetics and gives rise to a system of the form \(\dot {x'} = f^1(x')\), the steady state variety \(V_{\tau(\kappa)}(f^1)\) of the system defined by \(G_1\) is a projection of the steady state variety \(V_\kappa(f)\) of the original system. They have been explicitly defined in display (15) of the Supplementary Material in [24]; see also displays (5.3) and (5.8) in [6]. To define the digraph \(G_2\), we first consider for any i = 1, …, m the linear network obtained by “hiding” in the rate constants the concentration of all species \(x_j \notin \mathcal {S}^{(i)}\). For instance, an edge \(X_j + X_k \to X_{j_1} +X_{k_1}\) with \(X_j, X_{j_1} \in \mathcal {S}^{(i_1)}\), \(X_k, X_{k_1} \in \mathcal {S}^{(i_2)}\), with rate constant c, gives rise to the following two edges in \(G_2\): the edge \(X_j \to X_{j_1}\) with rate constant \(c\,x_k\), and the edge \(X_k \to X_{k_1}\) with rate constant \(c\,x_j\). Note that we get this way a multidigraph \(MG_2\) with possibly repeated edges and loops. We then denote by \(G_2\) the digraph derived from \(MG_2\) after collapsing multiple edges into a single edge, with label equal to the sum of the labels of the different edges. The nodes in each connected component of \(G_2\) correspond to the species in one of the subsets \(\mathcal {S}^{(i)}\) of the partition if and only if this partition is minimal (in the poset of partitions of \(\mathcal {S}\) defining a MESSI structure on G).
The digraph \(G_2\) is linear (each node is labeled with a monomolecular complex with a single species) and again, if we formally associate to it mass-action kinetics, its steady state variety coincides with that of \(G_1\). Finally, we denote by \(G_2^\circ \) the multidigraph obtained from \(G_2\) after deleting all loops. On the other hand, the nodes of the digraph \(G_E\) are the subsets \(\mathcal {S}^{(1)}, \dots , \mathcal {S}^{(m)}\) and there is an edge from \(\mathcal {S}^{(i_1)}\) to \(\mathcal {S}^{(i_2)}\) whenever some edge of \(G_2^\circ \) between species of \(\mathcal {S}^{(i_2)}\) carries a label containing as a factor the concentration of a species in \(\mathcal {S}^{(i_1)}\).

The graphs \(G_1, G_2^\circ \) and \(G_E\) associated to two of the networks in the previous sections are depicted in Figs. 7 and 8.

Fig. 7

The graphs \(G_1\), \(G_2^\circ \) and \(G_E\) for the phosphorylation cascade in Sect. 3

Fig. 8

The graphs \(G_1\), \(G_2^\circ \) and \(G_E\) for the EnvZ/OmpR two-component network in the Introduction

Persistence

A chemical reaction system (1) is persistent if any trajectory starting from a point with positive coordinates stays at a positive distance from any point in the boundary, or informally, if no species which is present can tend to be eliminated in the course of the reaction. A steady state lying in the boundary of the nonnegative orthant (that is, with some coordinates equal to zero) is called relevant if it lies in the intersection of the boundary of the nonnegative orthant with a stoichiometric compatibility class through a point in \(\mathbb {R}^s_{>0}\). As MESSI systems are conservative, Theorem 2 in [1] proves that a MESSI system is persistent when there are no relevant boundary steady states.

Given a MESSI network G, we identify the following hypotheses:

  (A) The associated digraph \(G_2\) is weakly reversible.

  (B) The associated digraph \(G_E\) has no directed cycles.

Hypothesis (A) means that for any pair of nodes in the same connected component, there is a directed path from one to the other. For instance, hypothesis (A) is satisfied in the two examples considered in Figs. 7 and 8. Hypothesis (B) is also satisfied for the cascade network, but not for the EnvZ/OmpR two-component network. Although these hypotheses may sound restrictive, a wide range of signaling pathways satisfy both.
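Both hypotheses are purely graph-theoretic, so they can be checked by a simple reachability computation. The following Python sketch assumes that the graphs are given as lists of directed edges; the function names and the small input graphs are hypothetical and only meant as an illustration. It uses the fact that a digraph is weakly reversible exactly when, for every edge u → v, there is a directed path from v back to u.

```python
def reachable(adj, start):
    """Set of nodes reachable from `start` by a directed path (including start)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    return adj


def is_weakly_reversible(edges):
    """Hypothesis (A): every edge u -> v can be closed by a path from v to u."""
    adj = adjacency(edges)
    return all(u in reachable(adj, v) for u, v in edges)


def has_no_directed_cycle(edges):
    """Hypothesis (B): no edge u -> v can be closed by a path from v to u."""
    adj = adjacency(edges)
    return all(u not in reachable(adj, v) for u, v in edges)


# Hypothetical inputs: G_2 with edges S0 <-> S1, and G_E with edges 1 -> 3, 2 -> 3.
g2_edges = [("S0", "S1"), ("S1", "S0")]
ge_edges = [(1, 3), (2, 3)]
hyp_a = is_weakly_reversible(g2_edges)
hyp_b = has_no_directed_cycle(ge_edges)
```

Isolated nodes carry no edges and therefore do not affect either test.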

Theorem 3.15 in [42] asserts that a MESSI network G which satisfies hypotheses (A) and (B) does not have relevant boundary steady states, and is thus persistent. Moreover, as MESSI systems are conservative, a version of Brouwer’s fixed point theorem ensures the existence of a non-negative steady state in each S-class. So, the absence of relevant boundary steady states implies the existence of a positive steady state in each S-class.

Explicit Parametrization of \(V_\kappa (f) \cap \mathbb {R}^s_{>0}\)

We describe a large class of MESSI networks for which the positive steady state variety \(V_\kappa (f) \cap \mathbb {R}^s_{>0}\) admits a rational parametrization. This is a very uncommon property for general algebraic varieties.

Explicit Rational Parametrizations

We want to describe the intersection \(V_\kappa (f) \cap S_T\) in the positive orthant. The steady state variety is defined in principle by s polynomial equations. Assume that the dimension of S (and thus of \(S_T\) for any T) equals s − q, so that \(S_T\) can be defined by q linear equations. This implies that there are (at most) s − q linearly independent polynomials among \(f_1, \dots , f_s\). A finite number of common solutions is therefore expected, but this might not be true.

One way to simplify the computation of the intersection is the following. As the \(S_T\) are linear varieties, they can be parametrized by s − q parameters. One could then parametrize \(S_T\) by solving for q variables in terms of the other ones and substitute this into the equations of the steady state variety. This reduces the number of variables from s to s − q, but the polynomials \(f_1, \dots , f_s\) are special, with a monomial structure that comes from G, and in general this substitution would destroy their sparsity.

Denote \(V_{>0,\kappa }(f) = V_\kappa (f) \cap \mathbb {R}^s_{>0}\). One could instead try to parametrize \(V_{>0,\kappa }(f)\), but general algebraic varieties do not have rational parametrizations. However, rational parametrizations do exist for the positive points of the steady state variety in certain enzymatic biochemical networks, as proved by Thomson and Gunawardena in [51]. We extended this result to many other networks of biological interest which are MESSI. Theorem 4.1 in [42] proves the existence of an explicit and algorithmically constructible rational parametrization of \(V_{>0,\kappa }(f)\) for any MESSI network G satisfying hypotheses (A) and (B) above. Moreover, if the partition is minimal with m subsets of core species, we have that \(\dim V_{>0,\kappa }(f) = m = s- \dim S\).

Moreover, we identify conditions that ensure that this parametrization is monomial, or equivalently, that \(V_{>0,\kappa }(f)\) can be cut out by binomial equations (that is, polynomials with two terms) and, in this case, we give explicit binomials in Theorem 4.8 in [42] for what we call s-toric MESSI systems. Again, the conditions seem to be very restrictive, but there are plenty of interesting signaling pathways that satisfy them, for instance the n-site phosphorylation networks and many enzymatic cascades, such as the ones we presented in Sect. 3. In the case of the n-sequential phosphorylation network (which has 3n + 3 variables) we can parametrize the positive steady state variety with 3 parameters for any value of n. To compute the intersection \(V_\kappa (f) \cap S_T\) (which equals \(V_{>0,\kappa }(f) \cap S_T\) due to the absence of relevant boundary steady states, as we pointed out before), we can write 3 of the variables in terms of the remaining 3n variables using the 3 conservation relations and substitute them into 3n linearly independent \(f_i\) (which exist in this case). Alternatively, we can substitute the parametrization into the conservation relations and thus get only 3 equations in 3 variables. This is what makes the n-site network amenable to computations even if in principle the number of variables tends to infinity with n. Note that if we instead plug a parametrization of \(S_T\) into the equations of the steady state variety, we get a system that, besides losing sparsity, consists of 3n equations in 3n variables.
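For n = 1 the count above gives 3n + 3 = 6 variables and 3 parameters, and the monomial parametrization can be written by hand. The following Python sketch checks it with exact rational arithmetic. The network S0 + E ⇄ ES0 → S1 + E, S1 + F ⇄ FS1 → S0 + F, the names of the rate constants and the choice of (s0, e, f) as free parameters are assumptions made for this illustration; this is not the general construction of Theorem 4.8 in [42].

```python
from fractions import Fraction as Fr


def rhs(x, k):
    """Mass-action right-hand side for the 1-site cycle
    S0 + E <-> ES0 -> S1 + E,  S1 + F <-> FS1 -> S0 + F."""
    s0, s1, e, f, y, u = x            # y = [ES0], u = [FS1]
    kon, koff, kcat, lon, loff, lcat = k
    v1, v2, v3 = kon * s0 * e, koff * y, kcat * y
    w1, w2, w3 = lon * s1 * f, loff * u, lcat * u
    return (
        -v1 + v2 + w3,   # dS0/dt
        v3 - w1 + w2,    # dS1/dt
        -v1 + v2 + v3,   # dE/dt
        -w1 + w2 + w3,   # dF/dt
        v1 - v2 - v3,    # dES0/dt
        w1 - w2 - w3,    # dFS1/dt
    )


def steady_state(s0, e, f, k):
    """Monomial parametrization of the positive steady states by (s0, e, f)."""
    kon, koff, kcat, lon, loff, lcat = k
    K = kon / (koff + kcat)
    L = lon / (loff + lcat)
    y = K * s0 * e            # from dES0/dt = 0
    u = kcat * y / lcat       # from kcat*y = lcat*u
    s1 = u / (L * f)          # from dFS1/dt = 0
    return (s0, s1, e, f, y, u)


# Arbitrary positive rational rate constants and parameters (illustrative values).
k = tuple(Fr(v) for v in (2, 1, 3, 5, 1, 4))
x = steady_state(Fr(1, 2), Fr(3), Fr(7), k)
residual = rhs(x, k)
```

Every entry of `residual` is exactly zero, so each positive choice of (s0, e, f) gives a positive steady state; substituting this parametrization into the 3 conservation relations leaves 3 equations in 3 unknowns.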

Recognizing the existence of a MESSI structure on a given network, checking the hypotheses in all our results and finding the rational parametrization are algorithmic and only depend on the structure and not on the particular parameters.

Deciding Multistationarity

The important biological mechanism of n sequential phospho-dephosphorylations has the capacity for multistationarity for n = 2, that is, there can be up to 3 positive steady states in \(V_\kappa (f) \cap S_T\) (for particular choices of the rate constants κ and positive linear conservation constants T). This system was first studied by L. Wang and E. Sontag in [52]. They proved that the number of positive steady states is at most 2n − 1 and identified parameters for which there are n + 1 positive steady states for n even (and n for n odd). Note that n + 1 = 2n − 1 for n = 2. It has been proved in [36] that the upper bound 2n − 1 is attained for n = 3, 4, and it is probable that 2n − 1 is a sharp upper bound, but this has not been proven yet for n ≥ 5. See also [33,34,35] for a discussion of other dynamical features (stability and oscillations).

In fact, the steady states of most popular MESSI systems (including all those recalled above) present an s-toric structure, and we gave in this case a characterization of the capacity for multistationarity, which led to an algorithm based on tools from oriented matroid theory. The main ideas in this approach, which go back to [14] and several other papers, including articles in other applied areas, are collected and clarified in the paper [39]. We give below a simple version of the multistationarity results in Section 5 in [42], which is valid for other biochemical reaction networks for which the positive steady states can be defined by binomials in a parametric way and satisfying certain conditions (that we can ensure from the structure of the network, see e.g. Proposition 5.6 in [42]). In particular, these binomials are of the form \(p_\kappa = a(\kappa )\,x^\alpha - b(\kappa )\,x^\beta \), with \(\alpha , \beta \in \mathbb {Z}_{\ge 0}^s\), and a, b polynomial functions on the vector of rate constants \(\kappa \in \mathbb {R}^r_{>0}\) taking positive values over \(\mathbb {R}^r_{>0}\).

Given such a binomial \(p_\kappa \), consider the vector \(v_{p_\kappa }= \alpha - \beta \in \mathbb {Z}^s\) (note that \(v_{p_\kappa } =- v_{-p_\kappa }\), so indeed \(v_{p_{\kappa }}\) are integer vectors defined up to sign). Also, given a matrix M of size \(m_1 \times m_2\) and rank \(m_1\), a subset J of indices of cardinality \(m_1\) determines a maximal minor of M, which we denote by \(M_J\).

Deciding Mono/Multistationarity

Let G be a chemical reaction network. Denote by \(S^\bot \) a matrix whose rows span the orthogonal complement of the stoichiometric subspace S, with \(\mathrm {rank}(S^\bot ) = d\). Assume that \(V_{>0,\kappa }(f)\) is cut out by s − d binomials \(p_{j,\kappa }\), j = 1, …, s − d, with exponent vectors \(v_{p_{j,\kappa }}\) which form the columns of a matrix B. Assume moreover that rank(B) = s − d. Then, the following statements are equivalent:

  1. Monostationarity: There is at most one positive solution in \(V_{>0,\kappa }(f) \cap S_T\), for any S-class \(S_T\) intersecting the positive orthant and any \(\kappa \in \mathbb {R}^r_{>0}\).

  2. For all subsets J ⊆{1, …, s} of cardinality d, the product

    $$\displaystyle \begin{aligned}(-1)^{\sum_{j \in J} j} \, \det(S^\bot_{J})\, \det(B_{\{1, \dots,s\} \setminus J})\end{aligned}$$

    either is zero or has the same sign as all other nonzero products, and at least one such product is nonzero.

The previous result can be turned into an algorithm to decide if a network has the capacity for multistationarity, together with an algorithm to produce vectors of rate constants κ for which multistationarity occurs (in case the network is not monostationary).
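The sign condition in the second statement can be tested by brute force on small networks. The Python sketch below is only an illustration under stated assumptions: the exponent vectors are stored as the rows of the array `b_rows` (that is, the transpose of B), so that the complement of J selects coordinates; determinants are computed by cofactor expansion, which is only reasonable for small matrices; and the example data are a hand-computed choice for the 1-site phosphorylation cycle with variables ordered as (s0, s1, e, f, y, u), not data taken from [42].

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def signed_products(s_perp, b_rows):
    """Products (-1)^(sum of J) * det(S_perp restricted to J) * det(B restricted to J^c).

    s_perp: d x s integer matrix whose rows span the orthogonal complement of S.
    b_rows: (s - d) x s integer matrix whose rows are the exponent vectors.
    """
    d, s = len(s_perp), len(s_perp[0])
    products = []
    for J in combinations(range(s), d):
        Jc = [j for j in range(s) if j not in J]
        sign = (-1) ** sum(j + 1 for j in J)      # 1-based indices, as in the text
        minor_s = det([[row[j] for j in J] for row in s_perp])
        minor_b = det([[row[j] for j in Jc] for row in b_rows])
        products.append(sign * minor_s * minor_b)
    return products


def is_monostationary(s_perp, b_rows):
    nonzero = [p for p in signed_products(s_perp, b_rows) if p != 0]
    return bool(nonzero) and (all(p > 0 for p in nonzero) or all(p < 0 for p in nonzero))


# Conservation relations: s0 + s1 + y + u, e + y, f + u.
s_perp = [[1, 1, 0, 0, 1, 1], [0, 0, 1, 0, 1, 0], [0, 0, 0, 1, 0, 1]]
# Binomials: y - K*s0*e, kcat*y - lcat*u, u - L*s1*f.
b_rows = [[-1, 0, -1, 0, 1, 0], [0, 0, 0, 0, 1, -1], [0, -1, 0, -1, 0, 1]]
mono = is_monostationary(s_perp, b_rows)
```

For this example all nonzero products share the same sign, in agreement with the fact that a single phosphorylation cycle is monostationary. The enumeration runs over all subsets J of size d, so it is not meant for large networks.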

5 Other Approaches to the Question of Multistationarity

The reader might have noticed that, within a reasonable length for a survey, we cannot properly define and explain all concepts. This section is therefore only a pointer to some recent papers addressing the question of multistationarity, besides the articles and tools we have mentioned before. We also refer the reader to the recent survey [13] and the references therein.

Craciun, Helton and Williams applied in [15] the homotopy invariance of degree to determine the number of equilibria of biochemical reaction networks and how this number depends on parameters in the model. Conradi, Feliu, Mincheva and Wiuf give in [8] necessary and sufficient conditions for the multistationarity of networks having a positive rational parametrization, in terms of the reaction rate constants, also based on degree theory. This approach is very interesting since it can describe open multistationarity regions in rate constant space. However, it does not identify the particular stoichiometric compatibility classes for which there is multistationarity, and the same is true of the methods based on signs, such as the result on mono/multistationarity described above. The reason is that all these approaches are related (in more explicit or hidden ways) to properties of a Jacobian, for instance the Jacobian with respect to the x variables of an appropriate choice of the polynomials \(f_1, \dots , f_s\) together with linear functions \(\ell _1 - T_1, \dots , \ell _q - T_q\) giving equations for \(S_T\), and so the linear conservation constants \(T_1, \dots , T_q\) do not appear. In [18] we considered extensions and simplifications of this approach via critical functions, for networks with special structure, in particular for special MESSI networks which are commonly used in modeling enzymatic pathways. We also proposed a method based on the existence of triangular forms, relying on techniques from computational algebra.

Sadeghimanesh and Feliu provide in [46] a new determinant criterion to decide whether a network is multistationary, when the network obtained by removing intermediates has a binomial steady state ideal. In this case, they characterize the multistationarity structure of the network, i.e. which subsets of complexes are responsible for multistationarity. In particular, they compute the multistationarity structure of the n-site sequential distributive phosphorylation cycle for any n.

Together with Bihan and Giaroli, we incorporated in [6] a new tool from real algebraic geometry based on the article [7] by Bihan, Santos, and Spaenlehauer. The basic idea is the following. Given a sparse polynomial system, that is, with exponents in a specified finite set of integer points A, if it is possible to find p decorated simplices in a regular subdivision of A, then it is possible to scale the coefficients of the given system in an explicit way to get at least p nondegenerate positive real roots. This gives a lower bound on the number of positive roots. The hypothesis of regularity of the subdivision means that it is induced by a lifting of the points in A, by projecting the domains of linearity of the lower convex hull of the lifted points. This is what gives the necessary compatibility to find a common open set in the space of coefficients where the p positive solutions can be jointly continued. A simplex is said to be decorated in the following sense. Let \(\{a_0, \dots , a_d\} \subset A\) denote the set of vertices of a simplex of maximal dimension d. Given (Laurent) polynomials \(g_1, \dots , g_d\) with support A, consider their subsums of monomials corresponding only to these exponents. One gets a system of d polynomials in d variables with d + 1 monomials, of the form:

$$\displaystyle \begin{aligned} \sum_{j=0}^d \, c^i_j \,x^{a_j} =0, \, i=1, \dots, d.\end{aligned}$$

This system has at most one positive root and it does have a (nondegenerate) positive root exactly when the following linear system does:

$$\displaystyle \begin{aligned} c^i_{0} + \sum_{j=1}^d \, c^i_j \,x_j =0, \, i=1, \dots, d.\end{aligned}$$

This condition is equivalent to an alternation of signs of the maximal minors of the d × (d + 1) real matrix with coefficients \(c^i_j\). The simplex is said to be decorated by a choice of coefficients of the input polynomials when this is the case. It is interesting to note that, unlike the case of complex roots with nonzero coordinates, it is not always true that the best lower bound obtained in this way from a regular subdivision matches the maximum number of positive real roots. A simple example is the following. Assume A = {(0, 0), (1, 0), (1, 2), (2, 1)}, the vertices of a quadrilateral of Euclidean area 2 in the plane. A sparse polynomial system (\(g_1 = g_2 = 0\)) with this support can have 2 ⋅ 2 = 4 isolated complex solutions with nonzero coordinates by Kouchnirenko’s theorem and at most 3 positive solutions (and this number can be attained, see [5] and the references therein). But it is clear that the support can only have three regular subdivisions: either nothing is subdivided or we get one of the two subdivisions depicted in Fig. 9, so the maximum lower bound p that one can obtain is 2. Nevertheless, this is up to now the only systematic way to find conditions jointly on all the parameters that ensure the existence of several positive steady states, since for instance degree considerations are eventually based on parity considerations. The main advantage of this approach is that it allows us to describe multistationarity regions in the space of all parameters, both reaction rate constants and linear conservation constants. Note, however, that our conditions are only sufficient.

Fig. 9

The two proper subdivisions of a circuit
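The sign condition that characterizes a decorated simplex can be checked directly on the d × (d + 1) coefficient matrix. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the matrix is given as a list of integer rows, and the test relies on Cramer's rule, which gives the solution of the linear system as the quotients of the signed minors obtained by deleting one column at a time.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def signed_minors(c):
    """(-1)^j times the minor of c obtained by deleting column j, for j = 0, ..., d."""
    d = len(c)
    return [(-1) ** j * det([row[:j] + row[j + 1:] for row in c]) for j in range(d + 1)]


def is_decorated(c):
    """True when c_0 + sum_j c_j x_j = 0 has a nondegenerate positive solution."""
    m = signed_minors(c)
    return all(v > 0 for v in m) or all(v < 0 for v in m)


def positive_root(c):
    """Solution x of the linear system, via Cramer's rule (assumes is_decorated(c))."""
    m = signed_minors(c)
    return [Fraction(m[j], m[0]) for j in range(1, len(m))]


# Illustrative coefficients: -1 + x1 = 0, -1 + x2 = 0 has the positive root (1, 1).
good = [[-1, 1, 0], [-1, 0, 1]]
# Illustrative coefficients: 1 + x1 = 0, 1 + x2 = 0 has no positive root.
bad = [[1, 1, 0], [1, 0, 1]]
```

Here `is_decorated(good)` holds and `is_decorated(bad)` does not. In applications the entries would be the coefficients \(c^i_j\) of the polynomials restricted to the vertices of one simplex of the subdivision.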

We refer the reader to Section 3 in [27] for a simple example explaining the technical results in [6]. These tools allowed us to find in that article precise multistationarity regions in enzyme cascades with any number n of layers of Goldbeter-Koshland loops (with a single phosphorylation/dephosphorylation in each layer), which are multistationary as soon as the first two phosphatases are the same. Interestingly, the number of variables is of the order of 4n and the dimension of the stoichiometric subspace S is of the order of 2n, so it is cut out by roughly 2n linear equations and parametrized by a similar number of variables. So, even taking advantage of the parametrizations of the steady state variety and of a translate \(S_T\) of S, we need to deal with a number of equations of the order of 2n in 2n variables. When the two layers with the same phosphatase are the last ones, it is possible to find particular multistationarity reaction rate constants for the cascade following the approach in [4]. Other papers based on extrapolating multistationarity from that of simpler subnetworks are for instance [9, 37].

In ongoing work with Giaroli, Pérez Millán and Rickster [17], we are able to use this setting to give a precise region in the space of all parameters for which the n-sequential phospho/dephosphorylation mechanism can have n + 1 positive steady states for n even (and n for n odd), assuming that only \(\frac {1}{4}\) of the intermediate complexes are part of the reactions. In another recent work, Conradi, Iosif, and Kahle [10] also use tools from polyhedral geometry. They show that for reaction networks whose positive steady states can be cut out by binomials, multistationarity is scale invariant in the space of linear conservation constants (that is, if there is multistationarity for some value of the linear conservation constants, then there is multistationarity on the entire ray containing this value, possibly for different reaction rate constants). They consider the chamber decomposition in linear conservation constant space, which allows them to show that for values of these constants in one of the five chambers the 2-site sequential phosphorylation network cannot be multistationary.

Other approaches use numeric or symbolic methods to detect points in different chambers of the complement of the discriminant and the resultants that we mentioned before, see for instance [30, 32]. The general mathematical problem is the search of positive roots of sparse polynomial systems; see for instance [21] where these techniques have been applied to a geometric problem.

Stability and Convergence

The important question of deciding the stability of a given steady state \(x^*\) of a chemical reaction network with fixed rate constants κ can be formalized via the Routh-Hurwitz criterion: stability corresponds to the satisfiability of certain polynomial inequalities, which are built from minors of the Jacobian matrix at the point \(x^*\), since a suitable pattern of signs ensures that all eigenvalues of the Jacobian have negative real part. However, this is a difficult question if the point \(x^*\) is given implicitly and if one tries to trace these inequalities as the parameters vary. So, a complete analysis is available only in a few cases (see for instance [33]).
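To make the polynomial inequalities concrete, the following Python sketch computes the Hurwitz determinants of a polynomial from its coefficients; in the setting above, one would apply it to the characteristic polynomial of the Jacobian at the steady state. The function names and the two cubic test polynomials are assumptions chosen for illustration. The sketch uses the standard form of the criterion: for a real polynomial with positive leading coefficient, all roots have negative real part exactly when all leading principal minors of its Hurwitz matrix are positive.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def hurwitz_minors(a):
    """Leading principal minors of the Hurwitz matrix of a[0]*z^n + ... + a[n]."""
    n = len(a) - 1

    def coef(k):
        return a[k] if 0 <= k <= n else 0

    # Entry (i, j), with 1-based indices, is the coefficient a_{2j - i}.
    h = [[coef(2 * (j + 1) - (i + 1)) for j in range(n)] for i in range(n)]
    return [det([row[:k] for row in h[:k]]) for k in range(1, n + 1)]


def is_hurwitz_stable(a):
    """All roots in the open left half-plane (assumes a[0] > 0)."""
    return all(m > 0 for m in hurwitz_minors(a))


# (z + 1)(z + 2)(z + 3) = z^3 + 6 z^2 + 11 z + 6: all roots are negative.
stable = [1, 6, 11, 6]
# (z - 1)(z + 2)(z + 3) = z^3 + 4 z^2 + z - 6: one positive root.
unstable = [1, 4, 1, -6]
```

For the first polynomial the minors are 6, 60 and 360, all positive; for the second the last minor is negative, which detects the positive root. When the coefficients depend on parameters, these minors become the polynomial inequalities mentioned above.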

Another important question is to ensure convergence of the trajectories. Note that if a trajectory defined on the whole positive real line converges for t → +∞ to a point p, then p is a steady state. A first question is to decide global convergence in the presence of a single steady state in each S-class. We refer the reader to the results (and the references) in [19] for diverse architectures of processive multisite phosphorylation networks, which are based on previous work by Angeli, De Leenheer and Sontag [2].

6 Oscillations

Another important biological feature is the possible occurrence of oscillations. Oscillations have been observed experimentally in signaling networks formed by phosphorylation and dephosphorylation, which seems to be the main mechanism in the 24-hour period in eukaryotic circadian clocks (see for instance [11, 44] and the references therein). Despite the many articles studying sequential phospho/dephosphorylation networks, it is not currently known whether in the 2-site sequential mechanism there could be trajectories which oscillate.

Instead, Suwanmajo and Krishnan showed recently in [50] that oscillations occur intrinsically in the dual-site phosphorylation and dephosphorylation network in which the mechanism for phosphorylation is processive while the one for dephosphorylation is distributive (or vice versa), arising from a Hopf bifurcation. We also refer to the interesting paper [45], where the authors propose a systematic analysis of the long-term dynamics of phosphorylation systems. They describe bistability and oscillations when the network has nonzero levels of reaction processivity. Processivity means that the intermediate complex does not dissociate into substrate plus enzyme after a single phospho/dephosphorylation, but only after two or more. Conradi, Mincheva, and Shiu showed in [11] for the mixed mechanism in [50] that in the three-dimensional space of linear conservation constants, the border between the existence of a stable or an unstable steady state is defined by the vanishing of a single Hurwitz determinant, and that it consists generically of simple Hopf bifurcations. Besides the Routh-Hurwitz criterion, their analysis relies on an algebraic Hopf-bifurcation criterion due to Yang and a monomial parametrization of the positive steady state variety. It would be very interesting to extend this kind of analysis to other mechanisms, in particular, to other phosphorylation networks.

Rendall and Hell studied in [34, 35] the existence of parameters for which Hopf bifurcations occur and generate periodic orbits in the case of (MAP kinase) cascades. They also explain how geometric singular perturbation theory allows one to generalize results from simple models to more complex ones. Banaji also presents in [3] some results on how oscillation is inherited by chemical reaction networks (CRNs) when they are built in natural ways from smaller oscillatory networks, showing a particularly nice result for fully open networks (where for any species X, there are reactions 0 → X and X → 0), also based on regular and singular perturbation theory. We also mention the pioneering work of Karin Gatermann introducing algebraic and combinatorial techniques for the search of Hopf bifurcations [25].

7 Mathematical Challenges

In this section we enumerate some of the main open questions in this area. They involve difficult mathematical questions and, moreover, systems of biological interest usually have a large number of variables and parameters.

  1. Give general precise bounds for the number of positive solutions of (parametric families of) sparse polynomial systems and apply them to find the number of positive steady states: (a) develop tools to obtain better lower bounds for the number of positive steady states; (b) develop tools to get good upper bounds for the number of positive steady states. Moreover, find regions in parameter space with the predicted number of positive steady states, or at least where lower/upper bounds apply.

  2. Predict or preclude oscillations from structure: how do (sustained) oscillations arise in phosphorylation networks? Can we find “atoms of oscillation”? Moreover, describe “regions of oscillation” in parameter space.

Conclusion

We can use algebro-geometric notions and methods to analyze systems biology models. Algebraic and combinatorial methods allow us to predict (some) qualitative dynamic behaviours of our models from the structure of the network, without simulations and without measuring all the parameters a priori. We do have several promising results, but in many cases they tend to be too complex to be understood or computed. Answers to the above questions would require developing a combination of tools from dynamical systems, real algebraic geometry, computational and numerical algebraic geometry, differential algebra, and biochemistry!