
1 Introduction

Despite the growing number of therapeutic options available to clinicians, gaps remain in our fundamental understanding of many biological processes. Acquiring this additional knowledge requires that we focus on the molecular players that operate in intercellular and intracellular environments. Revealing the complex networks and dynamics that control cellular, tissue- and host-level behavior may enable us to improve existing treatments and design new drug targets.

Many intercellular signals are initiated by signaling proteins such as cytokines and hormones. When cytokines bind to receptors of a target cell, they trigger a cellular response by signal transduction pathways: multistep sequences of intracellular signaling events and communication between molecules. Most of these molecules are proteins. Enzymes such as kinases and phosphatases, for example, catalyze (respectively) the addition/removal of a phosphate group to/from a substrate, and thus perform a crucial role in relaying information [1]. Phosphorylation (the addition of a phosphate group) can be associated with protein activation, and information can be communicated downstream, engaging multiple signaling cascades by successive chemical reactions. While some reactions are linear, with the output proportional to the input [2], many are complex, involving feedback loops or pathway redundancies. Often the output of these pathways is activation or inhibition of regulatory proteins called transcription factors, which modify gene transcription and the cellular state.

To turn a gene on, an activated transcription factor translocates from the cytoplasm into the nucleus, binds to the enhancer or promoter region of DNA, and RNA polymerase transcribes the DNA template to synthesize RNA. Then messenger RNA (mRNA) leaves the nucleus and enters the cytoplasm where ribosomes translate mRNA into protein [1]. Conversely, transcription factors may turn a gene off by repressing the recruitment of RNA polymerase. These possible responses thus regulate protein synthesis. In addition to the subcellular processes that changes in protein synthesis stimulate, proteins may be released by the cell and act as signaling molecules in other pathways.

Gene regulatory pathways are crucial to the normal functioning of cells, with many diseases caused by dysfunction of one or more pathways. For example, signaling pathways such as NF-κB, MAP Kinase, and Wnt/β-catenin are involved in a host of cellular processes and functions, including cancer. Due to their complexity, a systems approach is needed to understand normal and aberrant pathway function. Only by building theoretical models that describe how cells signal and validating/updating them using experimental data can we develop new drug therapies that target specific diseases.

The remainder of the chapter is organized as follows. In Subheading 2, we review methods used to model signal transduction pathways, and introduce an exemplary enzyme kinetics model. We then describe the biology of Wnt signaling, with reference to relevant models, and introduce two models of the Wnt signaling pathway that we focus on throughout the chapter to demonstrate various techniques. In Subheading 3, we detail methods that can be used to analyze a particular model and discuss the insight that each approach can generate. In Subheading 4, we introduce techniques that can be used to compare models, including some new methods for systems medicine. We conclude in Subheading 5 with a discussion of the different techniques, and ideas for their further application in systems medicine.

2 Mathematical Modeling

Signaling pathways are complex and may be difficult to understand by linear logic alone. Theoretical models can be used to gain insight into the dynamics of multiple biochemical interactions. Constructing a mathematical model is a nontrivial task that requires sufficient understanding of the system to determine not only the type of model that should be used to address a particular question but also the limitations of the model. After reviewing some of the modeling approaches that are used to study signaling pathways, we focus on ordinary differential equation (ODE) models. We introduce basic principles that can be used to construct ODE models and illustrate them by reference to enzyme kinetics and two models of the Wnt pathway.

2.1 Modeling Approaches for Systems Medicine

Many processes associated with systems medicine in general, and signaling pathways in particular, can be modeled. These include: gene/protein abundances; gene/protein interactions; abundances of cellular species; the effects of cytokines, chemicals, drugs, or other interventions on system or tissue-level phenomena. Modeling strategies for systems medicine can be classified as either deterministic or stochastic; we describe stochastic approaches briefly here, since the methods introduced in later sections are generally only applicable to deterministic systems.

Deterministic approaches describe systems for which, given full details of the model (parameter values and initial conditions), its time evolution can be determined exactly. This means that if a system is restarted multiple times from the same initial state it will always return to the same future states. Ordinary and partial differential equations (PDEs) are two examples [3]. PDEs with two or more independent variables (e.g., space and time) are more flexible than ODEs, but their simulation and analysis can be computationally expensive. Deterministic methods provide accurate descriptions of population-level behavior if the population sizes are large enough that the effects of random fluctuations can be neglected.

Stochastic approaches describe systems whose temporal evolution has unpredictable elements due to randomness somewhere in the system. They are popular for modeling biological systems where randomness and heterogeneity abound, and should be used when population sizes are small enough that fluctuations cannot be ignored. In most cases, population averages will be recovered from a stochastic model when the abundances become large enough. One can, for example, construct stochastic models of protein dynamics with stochastic differential equations [4] (i.e., ODEs with noise terms—often Gaussian—added). Such models can be used to study the dynamics of species that fluctuate about a well-defined mean value.
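The claim that population averages are recovered at large abundances can be illustrated with a minimal sketch. It uses Gillespie's stochastic simulation algorithm (a jump-process method, not the stochastic differential equation formulation of [4]) for a hypothetical species that degrades at per-molecule rate `k`; the species, the rate, and the copy numbers are assumptions chosen purely for illustration.

```python
import math
import random

def gillespie_decay(n0, k, t_end, rng):
    """Exact stochastic simulation of first-order decay at per-molecule rate k.

    Returns the number of molecules remaining at time t_end.
    """
    t, n = 0.0, n0
    while n > 0:
        # Waiting time to the next decay event is exponential with rate k*n.
        t += rng.expovariate(k * n)
        if t > t_end:
            break
        n -= 1
    return n

rng = random.Random(0)          # fixed seed so the run is reproducible
K, T_END = 1.0, 1.0             # illustrative values only
expected_fraction = math.exp(-K * T_END)   # deterministic (ODE) prediction

# Surviving fraction for increasingly large initial copy numbers.
fractions = {n0: gillespie_decay(n0, K, T_END, rng) / n0
             for n0 in (10, 1000, 100000)}

for n0, frac in fractions.items():
    print(n0, round(frac, 4), "deterministic:", round(expected_fraction, 4))
```

For small copy numbers the surviving fraction can differ noticeably from the deterministic value, whereas for the largest population the two should agree closely, since the relative size of the fluctuations shrinks as the population grows.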

Stochastic modeling can also be developed via agent-based approaches [5, 6]. Here, individual agents act according to a set of rules. For example, within a given pathway, a protein could be phosphorylated or dephosphorylated with probabilities that depend on its environment. Such a framework treats protein species very differently to differential equation methods: each protein is viewed as an autonomous agent and population dynamics emerge in a “bottom up” manner. Whilst such methods may appeal to our intuition about protein heterogeneity, the approach is limited since analyses are often computationally expensive. As such, agent-based models should be used when population-averaged models fail to capture the behavior that the modeler seeks to describe.

Cellular automata are a subset of agent-based models that impose spatial structure on the system by constraining the agents to lie on a grid, in two or three dimensions [7, 8]. The agents are updated via rules which may be deterministic or stochastic. Each grid point may be occupied by a finite number of cells (typically only one) and the model can accommodate multiple cell types. Cellular automata can account for spatial relationships between different cell types and have the advantage of being easy to interpret biologically. A challenge associated with these models is that the update rules may not translate clearly into biological hypotheses. Additionally, as for other agent-based models, simulation of cellular automata can be computationally expensive. Fitting such models to data is at the limits of what is currently feasible since, despite significant advances in cellular imaging technology, obtaining cell data of sufficient resolution and quality to fit to a model is rare.

The above overview of modeling approaches is not exhaustive: in limited space, we make no mention of Boolean, semi-quantitative, hybrid, or branching processes. Instead, we continue by explaining how to develop ODE models for signaling pathways.

2.2 Formulating Mathematical Models of Signaling Pathways

In this section our focus is on using ODEs to develop dynamic models of signaling pathways. Two basic principles are integral to the development of such models:

  • The Principle of Mass Balance states that the rate of change of a species is equal to the difference between the rate at which the species is added to the system and the rate at which it is removed;

  • The Law of Mass Action states that a reaction proceeds at a rate proportional to the product of the concentrations of its reactants.

If, for example, substrate A is irreversibly phosphorylated by enzyme B, to produce C then we write

$$ A+B\overset{r_1}{\to }C+B, $$
(1)

where \( r_1 \) is the rate constant for phosphorylation. We construct ODEs that describe the dynamics of A, B, and C by appealing to the Principle of Mass Balance and the Law of Mass Action:

$$ \frac{dA}{dt}=-{r}_1 AB,\kern1em \frac{dB}{dt}=-{r}_1 AB+{r}_1 AB\equiv 0,\kern1em \frac{dC}{dt}={r}_1 AB. $$
(2)

By inspecting the above ODEs, it is straightforward to deduce that the following quantities are preserved:

$$ A+C={A}_0+{C}_0,\kern1em \mathrm{and}\kern1em B={B}_0, $$

where \( A\left(t=0\right)={A}_0 \), \( B\left(t=0\right)={B}_0 \) and \( C\left(t=0\right)={C}_0 \) are prescribed as initial conditions. We can exploit these Conservation Laws to simplify the governing equations: in this case, we can eliminate both B and C and our model reduces to give

$$ \frac{dA}{dt}=-{r}_1{B}_0A,\kern1em \mathrm{with}\kern1em A\left(t=0\right)={A}_0\kern1em \Rightarrow \kern1em A(t)={A}_0{e}^{-{r}_1{B}_0t}. $$

Thus, substrate levels decay exponentially, at rate \( {r}_1{B}_0 \).
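As a minimal numerical sketch of this example, the code below integrates Eq. 2 with a hand-written fourth-order Runge-Kutta scheme and compares the result with the closed-form solution for A(t). The parameter values, the step count, and the helper names (`rhs`, `rk4_step`, `simulate`) are illustrative assumptions rather than part of the model.

```python
import math

def rhs(state, r1):
    """Right-hand side of Eq. 2 for the scheme A + B -> C + B."""
    a, b, c = state
    flux = r1 * a * b            # Law of Mass Action: rate proportional to A*B
    return (-flux, 0.0, flux)    # B is consumed and regenerated, so dB/dt = 0

def rk4_step(f, state, dt, *args):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(state, *args)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *args)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *args)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), *args)
    return tuple(s + dt * (p + 2 * q + 2 * r + w) / 6.0
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

def simulate(a0, b0, c0, r1, t_end, n_steps=1000):
    """Integrate Eq. 2 from t = 0 to t = t_end and return (A, B, C)."""
    state = (a0, b0, c0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(rhs, state, dt, r1)
    return state

A0, B0, C0, R1, T_END = 2.0, 0.5, 0.0, 1.5, 3.0   # illustrative values only
a, b, c = simulate(A0, B0, C0, R1, T_END)
a_exact = A0 * math.exp(-R1 * B0 * T_END)          # closed-form solution

print("numerical A:", a, "analytic A:", a_exact)
print("A + C:", a + c, "B:", b)                    # conserved quantities
```

The numerical and analytic values of A should agree to within the integration error, and A + C and B should stay at their initial values, which is a quick way to catch sign errors when transcribing a reaction scheme into ODEs.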

2.2.1 Case Study I: The Enzyme Kinetics Model

We now consider a biochemical reaction that is catalyzed by an enzyme. In more detail, the enzyme E binds reversibly with the substrate S to form a complex C. While complexed with the substrate, the enzyme converts it into a product P and the enzyme is recovered. We represent these reactions as follows:

$$ E+S\underset{k_{-1}}{\overset{k_1}{\rightleftharpoons }}C\overset{k_2}{\to }E+P. $$

By applying the Law of Mass Action to this reaction scheme and appealing to the Principle of Mass Balance, we deduce that the following system of ODEs describes the time-evolution of S, E, C, and P:

$$ \frac{dS}{dt}=-{k}_1ES+{k}_{-1}C, $$
(3)
$$ \frac{dE}{dt}=-{k}_1ES+\left({k}_{-1}+{k}_2\right)C, $$
(4)
$$ \frac{dC}{dt}={k}_1ES-\left({k}_{-1}+{k}_2\right)C, $$
(5)
$$ \frac{dP}{dt}={k}_2C. $$
(6)

If we assume further that \( S\left(t=0\right)={S}_0 \), \( E\left(t=0\right)={E}_0 \), \( C\left(t=0\right)=0 \), and \( P\left(t=0\right)=0 \), and take suitable combinations of the governing ODEs, then we deduce

$$ \frac{d}{dt}\left(E+C\right)=0\kern1em \mathrm{and}\kern1em \frac{d}{dt}\left(S+C+P\right)=0\kern1em \Rightarrow \kern1em E+C={E}_0\kern1em \mathrm{and}\kern1em S+C+P={S}_0. $$

We can exploit these conservation laws to eliminate E and P and obtain the following reduced model:

$$ \frac{dS}{dt}=-{k}_1S\left({E}_0-C\right)+{k}_{-1}C, $$
(7)
$$ \frac{dC}{dt}={k}_1S\left({E}_0-C\right)-\left({k}_{-1}+{k}_2\right)C, $$
(8)
$$ \mathrm{with}\kern1em S\left(t=0\right)={S}_0\kern1em \mathrm{and}\kern1em C\left(t=0\right)=0. $$
(9)
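The reduction from Eqs. 3–6 to Eqs. 7–9 can be checked numerically. The sketch below integrates both systems with the same fixed-step Runge-Kutta scheme, then confirms that the conservation laws hold and that the two systems give the same S and C. The rate constants, initial levels, and function names are assumptions chosen only for illustration.

```python
def rk4_integrate(f, state, t_end, n_steps, *args):
    """Integrate dstate/dt = f(state, *args) with classical RK4."""
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = f(state, *args)
        k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)), *args)
        k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)), *args)
        k4 = f(tuple(x + dt * k for x, k in zip(state, k3)), *args)
        state = tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                      for x, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def full_rhs(state, k1, km1, k2):
    """Eqs. 3-6 with state = (S, E, C, P)."""
    s, e, c, p = state
    bind = k1 * e * s
    return (-bind + km1 * c,
            -bind + (km1 + k2) * c,
            bind - (km1 + k2) * c,
            k2 * c)

def reduced_rhs(state, k1, km1, k2, e0):
    """Eqs. 7-8 with state = (S, C); E has been replaced by E0 - C."""
    s, c = state
    bind = k1 * s * (e0 - c)
    return (-bind + km1 * c, bind - (km1 + k2) * c)

K1, KM1, K2 = 1.0, 0.5, 0.3      # illustrative rate constants
S0, E0 = 5.0, 1.0                # illustrative initial levels
T_END, N_STEPS = 10.0, 5000

full = rk4_integrate(full_rhs, (S0, E0, 0.0, 0.0), T_END, N_STEPS, K1, KM1, K2)
reduced = rk4_integrate(reduced_rhs, (S0, 0.0), T_END, N_STEPS, K1, KM1, K2, E0)

s, e, c, p = full
print("E + C     =", e + c, "(expected", E0, ")")
print("S + C + P =", s + c + p, "(expected", S0, ")")
print("S, C from the full model   :", s, c)
print("S, C from the reduced model:", reduced[0], reduced[1])
```

Working with the reduced system halves the number of equations; E and P can be recovered afterwards from \( E={E}_0-C \) and \( P={S}_0-S-C \).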

2.3 Modeling Wnt Signaling

Wnt signaling is implicated in many biological processes. The pathway is activated when Wnt ligands bind to specific receptors on the cell surface, resulting in the stabilization and nuclear accumulation of the transcriptional co-activator β-catenin. Canonical Wnt signaling encompasses cellular responses to external Wnt stimuli mediated by β-catenin. Noncanonical signaling describes cellular signaling and responses to Wnt not mediated by β-catenin. The canonical Wnt pathway plays a key role in essential cellular processes ranging from proliferation and cell specification during development to adult stem cell maintenance and wound repair [9]. Dysfunction of Wnt signaling is implicated in many pathological conditions, including degenerative diseases and cancer [10–12]. Despite further molecular advances [13–15], certain details of the dynamics of the pathway are still not well understood.

The basic steps that constitute canonical Wnt signaling are as follows (although these are not undisputed; discussed below): Wnt binds to cell-surface receptors Frizzled and LRP5/6 [11] that transduce a signal via a multistep process involving Dishevelled (Dsh) to the so-called destruction complex (DC). The DC contains forms of Axin, adenomatous polyposis coli (APC), glycogen synthase kinase 3 (GSK-3), and casein kinase 1α (CK1α). In the absence of a Wnt signal, the DC actively degrades β-catenin—which is being continually synthesized in the cell—by binding and phosphorylating the protein and thus marking it for proteasomal degradation. Following Wnt stimulation, degradation of β-catenin is inhibited through phosphorylation of DC member proteins. This leads to accumulation in the cytoplasm of free β-catenin, which is able to translocate to the nucleus where it can form a complex with T-cell factor (TCF) and lymphoid-enhancing factor (LEF) proteins and, thereby, influence the transcription of target genes associated with processes such as self-renewal and proliferation [16, 17].

In addition to these core mechanisms, evidence for other important processes has been found, some of which may challenge the Wnt signaling paradigm. Spatial localization within the cell has been found to be important not only for β-catenin but also for Dsh and DC member proteins including Axin, APC, and GSK-3 [18–23]. There is also evidence of competitive binding of β-catenin to cell membrane proteins such as E-cadherin [24] and intricate cross-talk with the Hippo pathway, this being mediated by Yap and Taz which promote translocation of cytoplasmic β-catenin to the nucleus via phosphorylation and then compete with TCF for β-catenin in the nucleus [25]. This spatial organization of Wnt pathway members may be key to understanding the pathway, as some modeling suggests [26, 27]. Equally, an alternative description for the degradation of β-catenin exists: in this picture, β-catenin can be actively degraded while still bound to the DC, rather than being released marked for degradation [28]. Discriminating between competing hypotheses is needed in order to fully elucidate canonical Wnt signaling: mathematical modeling is a natural framework within which to achieve this.

The first quantitative model of Wnt/β-catenin signaling was developed in 2003 [29], based on data from Xenopus extracts. Formulated as a system of ODEs, the model describes known interactions between core components of the canonical pathway, these being Wnt, Dishevelled, GSK3β, APC, Axin, β-catenin, and TCF. The DC is assumed to act only in the well-mixed cytoplasm and, hence, only cytoplasmic levels of pathway components are considered. Since its publication, the Lee model has been extended in many ways (for recent reviews of mathematical models of Wnt signaling, see [16, 30]). The effect of mutations in APC was investigated by Cho et al. [31], the action of Wnt inhibitors was studied by Kogan et al. [32], and the impact of Wnt-ERK cross-talk considered by Kim et al. [33]. The effect of competition for β-catenin with adhesion proteins was investigated by van Leeuwen et al. [34], while Schmitz showed how shuttling of core proteins between cytoplasm and nucleus could influence pathway dynamics [35, 36]. More recently, a new shuttling model was constructed that accounts not only for exchange of pathway proteins between the nucleus and cytoplasm, but also degradation of β-catenin while it is bound to active destruction complex (DC) and activation of the DC by dephosphorylation of its components [27]. Table 1 summarizes the key features of some of these models and Fig. 1 illustrates the localization and known interactions between key proteins involved in Wnt signaling.

Table 1 Comparison of features across different models of Wnt signaling
Fig. 1 Reaction scheme that incorporates many different Wnt signaling models and additional molecular players (e.g., Yap/Taz). Solid arrows denote direct reactions; long-dashed arrows denote species that act as catalysts in degradation reactions; and dotted arrows denote alternative paths for the direct activation of Y. Note that active/inactive forms of Y are equivalent to active/inactive forms of ANG. Species names are defined in Table 2.

We now present the Lee model [29] and the Schmitz model [36], using the notation presented in Table 2. These models, together with the enzyme kinetics model introduced above, will be revisited throughout the chapter to illustrate how the techniques discussed in Subheadings 3 and 4 are applied to specific models.

Table 2 Definition of notation for the variables used by the Lee and Schmitz models

2.3.1 Case Study II: The Lee Model

In its original form, the Lee model comprises 15 time-dependent ODEs for protein species and complexes that participate in the canonical Wnt pathway, the reaction rates being based on mass action kinetics [29]. The model targets the assembly of the destruction complex from the constituent parts of APC, Axin, and GSK3β. It does not distinguish between nuclear and cytoplasmic compartments; instead, it assumes that all species are uniformly distributed throughout the cell. A schematic diagram of the reactions described in the Lee model is given in Fig. 2. Using the variable names defined in Table 2 and primes to denote differentiation with respect to time, the ODEs that specify this model are:

$$ {D}_i^{\prime }=-{\alpha}_1{D}_i+{\alpha}_2{D}_a, $$
(10)
$$ {D}_a^{\prime }={\alpha}_1{D}_i-{\alpha}_2{D}_a, $$
(11)
$$ {Y}_a^{\prime }={\alpha}_3{Y}_i-{\alpha}_4{Y}_a-{\alpha}_{10}X{Y}_a+{\alpha}_{11}{C}_{XY}+{\alpha}_{13}{C}_{XYp}, $$
(12)
$$ {Y}_i^{\prime }={\alpha}_6G{C}_{NA}-{\alpha}_5{D}_a{Y}_i-{\alpha}_3{Y}_i+{\alpha}_4{Y}_a-{\alpha}_7{Y}_i, $$
(13)
$$ {G}^{\prime }={\alpha}_5{D}_a{Y}_i-{\alpha}_6G{C}_{NA}+{\alpha}_7{Y}_i, $$
(14)
$$ {C}_{NA}^{\prime }={\alpha}_5{D}_a{Y}_i-{\alpha}_6G{C}_{NA}+{\alpha}_7{Y}_i+{\alpha}_8NA-{\alpha}_9{C}_{NA}, $$
(15)
$$ {A}^{\prime }=-{\alpha}_8NA+{\alpha}_9{C}_{NA}-{\alpha}_{21}XA+{\alpha}_{22}{C}_{XA}, $$
(16)
$$ {C}_{XY}^{\prime }={\alpha}_{10}X{Y}_a-{\alpha}_{11}{C}_{XY}-{\alpha}_{12}{C}_{XY}, $$
(17)
$$ {C}_{XYp}^{\prime }={\alpha}_{12}{C}_{XY}-{\alpha}_{13}{C}_{XYp}, $$
(18)
$$ {X}_p^{\prime }={\alpha}_{13}{C}_{XYp}-{\alpha}_{14}{X}_p, $$
(19)
$$ \begin{array}{l}{X}^{\prime }=-{\alpha}_{10}X{Y}_a+{\alpha}_{11}{C}_{XY}+{\alpha}_{15}-{\alpha}_{16}X-{\alpha}_{19}XT\\ {}\kern1.44em +{\alpha}_{20}{C}_{XT}-{\alpha}_{21}XA+{\alpha}_{22}{C}_{XA},\end{array} $$
(20)
$$ {N}^{\prime }=-{\alpha}_8NA+{\alpha}_9{C}_{NA}+{\alpha}_{17}-{\alpha}_{18}N, $$
(21)
$$ {T}^{\prime }=-{\alpha}_{19}XT+{\alpha}_{20}{C}_{XT}, $$
(22)
$$ {C}_{XT}^{\prime }={\alpha}_{19}XT-{\alpha}_{20}{C}_{XT}, $$
(23)
$$ {C}_{XA}^{\prime }={\alpha}_{21}XA-{\alpha}_{22}{C}_{XA}. $$
(24)

To facilitate comparison with the Schmitz model (see below), the nonnegative rate constants \( {\alpha}_k \), \( k\in \left\{1,2,\dots, 22\right\} \), have been redefined from those used in [29]. Wnt dependence is incorporated via the parameter \( {\alpha}_1={\alpha}_1(W) \) that controls the activation of Dsh.

Fig. 2 Schematic of the Lee model [29], which describes the activation of the destruction complex and its effect on β-catenin in a single cellular compartment (cytoplasm and nucleus combined). Notation of the model species is given in Table 2. Solid arrows represent reactions and dashed arrows represent catalytic processes.

Inspection of Eqs. 10–24 reveals that there are four conservation laws:

$$ \begin{array}{c}{D}_0={D}_i+{D}_a,\\ {}{G}_0=G+{Y}_i+{Y}_a+{C}_{XY}+{C}_{XYp},\\ {}{A}_0=A+{Y}_i+{Y}_a+{C}_{XY}+{C}_{XYp}+{C}_{XA}+{C}_{NA},\\ {}{T}_0\kern0.36em =T+{C}_{XT},\end{array} $$

where the constants \( {D}_0,{G}_0,{A}_0 \), and \( {T}_0 \) denote the (assumed constant) levels of Dishevelled, GSK3β, APC, and TCF initially present in the system. These conservation laws are consistent with experimental observations which suggest that levels of these proteins do not fluctuate during Wnt signaling (i.e., they are produced and degraded at the same rates). They can be used to eliminate four variables and, in so doing, to reduce the model from 15 to 11 ODEs. Further simplifications are achieved by assuming that all binding processes, except those for the binding of GSK3β to APC/Axin, reach equilibrium rapidly and that all species involving Axin are present at low levels. Under these assumptions, and after some algebra, the following expressions for \( {D}_i,G,A,T,{X}_p,{C}_{XT},{C}_{XA},{C}_{XYp} \), and \( {C}_{NA} \) are obtained:

$$ \begin{array}{l}{D}_i={D}_0-{D}_a,\kern1em G={G}_0,\kern1em A=\frac{A_0}{1+\frac{\alpha_{21}}{\alpha_{22}}X},\kern1em T=\frac{T_0}{1+\frac{\alpha_{19}}{\alpha_{20}}X},\kern1em {X}_p=\frac{\alpha_{12}}{\alpha_{14}}{C}_{XY},\\ {}{C}_{XT}=\frac{\alpha_{19}}{\alpha_{20}}\;\frac{T_0X}{1+\frac{\alpha_{19}}{\alpha_{20}}X},\kern1em {C}_{XA}=\frac{\alpha_{21}}{\alpha_{22}}\;\frac{A_0X}{1+\frac{\alpha_{21}}{\alpha_{22}}X},\kern1em {C}_{XYp}=\frac{\alpha_{12}}{\alpha_{13}}{C}_{XY},\kern1em {C}_{NA}=\frac{\alpha_8}{\alpha_9}\;\frac{A_0N}{1+\frac{\alpha_{21}}{\alpha_{22}}X},\end{array} $$

and a reduced system of 7 ODEs for the remaining species is eventually recovered (equations not presented since they are rather involved and less instructive than Eqs. 10–24). In [29] and [37], this model reduction is performed in an ad hoc manner; it would be instructive to repeat it by first nondimensionalizing the governing equations (see Subheading 3.2) and using asymptotic analysis to perform the model reduction (see Subheading 3.3).
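The four conservation laws of the Lee model can be checked numerically without solving the ODEs: for any state and any parameter values, the time derivatives of the species in each conserved group must sum to zero. The sketch below transcribes Eqs. 10–24 and evaluates these sums at a randomly chosen point. The short species keys (e.g., `CNA` for \( {C}_{NA} \)), the random values, and the function name `lee_rhs` are assumptions made for illustration and carry no biological meaning.

```python
import random

NAMES = ("Di", "Da", "Ya", "Yi", "G", "CNA", "A", "CXY", "CXYp",
         "Xp", "X", "N", "T", "CXT", "CXA")

def lee_rhs(v, a):
    """Eqs. 10-24. v maps species name -> level; a[k] is alpha_k (a[0] unused)."""
    d = {}
    d["Di"] = -a[1] * v["Di"] + a[2] * v["Da"]
    d["Da"] = a[1] * v["Di"] - a[2] * v["Da"]
    d["Ya"] = (a[3] * v["Yi"] - a[4] * v["Ya"] - a[10] * v["X"] * v["Ya"]
               + a[11] * v["CXY"] + a[13] * v["CXYp"])
    d["Yi"] = (a[6] * v["G"] * v["CNA"] - a[5] * v["Da"] * v["Yi"]
               - a[3] * v["Yi"] + a[4] * v["Ya"] - a[7] * v["Yi"])
    d["G"] = (a[5] * v["Da"] * v["Yi"] - a[6] * v["G"] * v["CNA"]
              + a[7] * v["Yi"])
    # Eq. 15 shares its first three terms with Eq. 14.
    d["CNA"] = d["G"] + a[8] * v["N"] * v["A"] - a[9] * v["CNA"]
    d["A"] = (-a[8] * v["N"] * v["A"] + a[9] * v["CNA"]
              - a[21] * v["X"] * v["A"] + a[22] * v["CXA"])
    d["CXY"] = a[10] * v["X"] * v["Ya"] - a[11] * v["CXY"] - a[12] * v["CXY"]
    d["CXYp"] = a[12] * v["CXY"] - a[13] * v["CXYp"]
    d["Xp"] = a[13] * v["CXYp"] - a[14] * v["Xp"]
    d["X"] = (-a[10] * v["X"] * v["Ya"] + a[11] * v["CXY"] + a[15]
              - a[16] * v["X"] - a[19] * v["X"] * v["T"] + a[20] * v["CXT"]
              - a[21] * v["X"] * v["A"] + a[22] * v["CXA"])
    d["N"] = -a[8] * v["N"] * v["A"] + a[9] * v["CNA"] + a[17] - a[18] * v["N"]
    d["T"] = -a[19] * v["X"] * v["T"] + a[20] * v["CXT"]
    d["CXT"] = a[19] * v["X"] * v["T"] - a[20] * v["CXT"]
    d["CXA"] = a[21] * v["X"] * v["A"] - a[22] * v["CXA"]
    return d

# Species appearing in each conserved total.
CONSERVED = {
    "D0": ("Di", "Da"),
    "G0": ("G", "Yi", "Ya", "CXY", "CXYp"),
    "A0": ("A", "Yi", "Ya", "CXY", "CXYp", "CXA", "CNA"),
    "T0": ("T", "CXT"),
}

rng = random.Random(1)   # arbitrary state and parameters for the check
state = {name: rng.uniform(0.1, 2.0) for name in NAMES}
alpha = [0.0] + [rng.uniform(0.1, 2.0) for _ in range(22)]

deriv = lee_rhs(state, alpha)
residuals = {label: sum(deriv[s] for s in members)
             for label, members in CONSERVED.items()}
print(residuals)   # each entry should vanish up to floating-point rounding
```

A check of this kind is inexpensive and tends to expose transcription errors (a dropped term or a flipped sign) before any simulation is attempted.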

2.3.2 Case Study III: The Schmitz Model

Like the Lee model, the Schmitz model [36] focuses on the canonical Wnt pathway. Key differences between the Lee and Schmitz models are that the latter distinguishes between the cytoplasm and nucleus and accounts for exchange of β-catenin and DC between these compartments (see Table 2 and Fig. 3 for further description). In each compartment, DC binding to β-catenin leads to its phosphorylation, and phosphorylated β-catenin is degraded. We use subscript n to denote species residing in the nucleus with the exception of TCF (\( T \)) and the β-catenin-TCF complex (\( {C}_{XT} \)); since these species are localized in the nucleus and to facilitate comparison with the Lee model, the subscript is omitted. Using notation that is modified from that used in [36], the ODEs that define the Schmitz model are:

$$ {X}^{\prime }={\delta}_0+\left({\delta}_2{X}_n-{\delta}_1X\right)+\left({\delta}_6{C}_{XY}-{\delta}_5X{Y}_a\right), $$
(25)
$$ {X}_n^{\prime }=\left({\delta}_1X-{\delta}_2{X}_n\right)+\left({\delta}_9{C}_{XYn}-{\delta}_8{X}_n{Y}_{an}\right)+\left({\delta}_{12}{C}_{XT}-{\delta}_{11}{X}_nT\right), $$
(26)
$$ {X}_p^{\prime }={\delta}_7{C}_{XY}-{\delta}_{13}{X}_p, $$
(27)
$$ {X}_{pn}^{\prime }={\delta}_{10}{C}_{XYn}-{\delta}_{14}{X}_{pn}, $$
(28)
$$ {Y}_a^{\prime }=\left({\delta}_4{Y}_{an}-{\delta}_3{Y}_a\right)+\left({\delta}_6{C}_{XY}-{\delta}_5X{Y}_a\right)+{\delta}_7{C}_{XY}+\left({\delta}_{16}{Y}_i-{\delta}_{15}{Y}_a\right), $$
(29)
$$ {Y}_i^{\prime }={\delta}_{15}{Y}_a-{\delta}_{16}{Y}_i, $$
(30)
$$ {Y}_{an}^{\prime }=\left({\delta}_3{Y}_a-{\delta}_4{Y}_{an}\right)+\left({\delta}_9{C}_{XYn}-{\delta}_8{X}_n{Y}_{an}\right)+{\delta}_{10}{C}_{XYn}, $$
(31)
$$ {C}_{XY}^{\prime }=\left({\delta}_5X{Y}_a-{\delta}_6{C}_{XY}\right)-{\delta}_7{C}_{XY}, $$
(32)
$$ {C}_{XYn}^{\prime }=\left({\delta}_8{X}_n{Y}_{an}-{\delta}_9{C}_{XYn}\right)-{\delta}_{10}{C}_{XYn}, $$
(33)
$$ {T}^{\prime }={\delta}_{12}{C}_{XT}-{\delta}_{11}{X}_nT, $$
(34)
$$ {C}_{XT}^{\prime }={\delta}_{11}{X}_nT-{\delta}_{12}{C}_{XT}, $$
(35)

where \( {\delta}_k \) (\( k=0,1,\dots, 16 \)) are nonnegative rate constants and \( {\delta}_{15}={\delta}_{15}(W) \) so that Wnt acts to inactivate the destruction complex in the cytoplasm.

Fig. 3 Schematic of the Schmitz model [36], which describes the interaction between β-catenin and the destruction complex in two cellular compartments: cytoplasm and nucleus. Notation of the model species is given in Table 2.

By taking appropriate combinations of Eqs. 25–35, it is straightforward to show that there are two conservation laws:

$$ {Y}_i+{Y}_a+{Y}_{an}+{C}_{XY}+{C}_{XYn}={Y}_{\mathrm{TOT}}\kern1em \mathrm{and}\kern1em T+{C}_{XT}={T}_{\mathrm{TOT}}, $$
(36)

where the constants \( {Y}_{\mathrm{TOT}} \) and \( {T}_{\mathrm{TOT}} \) denote, respectively, the total number of molecules of DC and TCF in the system, as determined from the initial conditions. These identities may be used to reduce the order of the Schmitz model from 11 to 9. As explained below, further systematic simplifications may be possible following model nondimensionalization and parameter estimation.
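The two conservation laws in Eq. 36 can be verified in the same spirit, by summing the right-hand sides of the relevant equations at an arbitrary state. In the sketch below, the \( {\delta}_7{C}_{XY} \) term in Eq. 32 is treated as a loss from \( {C}_{XY} \) (the complex is consumed when β-catenin is phosphorylated and released), which is what Eq. 36 requires. The species keys, random values, and function name `schmitz_rhs` are illustrative assumptions.

```python
import random

NAMES = ("X", "Xn", "Xp", "Xpn", "Ya", "Yi", "Yan", "CXY", "CXYn", "T", "CXT")

def schmitz_rhs(v, d):
    """Eqs. 25-35 with d[k] = delta_k for k = 0, ..., 16."""
    # Net fluxes that appear in more than one equation.
    shuttle_x = d[1] * v["X"] - d[2] * v["Xn"]      # beta-catenin into nucleus
    shuttle_y = d[3] * v["Ya"] - d[4] * v["Yan"]    # DC into nucleus
    bind_c = d[5] * v["X"] * v["Ya"] - d[6] * v["CXY"]      # cytoplasmic binding
    bind_n = d[8] * v["Xn"] * v["Yan"] - d[9] * v["CXYn"]   # nuclear binding
    bind_t = d[11] * v["Xn"] * v["T"] - d[12] * v["CXT"]    # TCF binding

    r = {}
    r["X"] = d[0] - shuttle_x - bind_c
    r["Xn"] = shuttle_x - bind_n - bind_t
    r["Xp"] = d[7] * v["CXY"] - d[13] * v["Xp"]
    r["Xpn"] = d[10] * v["CXYn"] - d[14] * v["Xpn"]
    r["Ya"] = (-shuttle_y - bind_c + d[7] * v["CXY"]
               + d[16] * v["Yi"] - d[15] * v["Ya"])
    r["Yi"] = d[15] * v["Ya"] - d[16] * v["Yi"]
    r["Yan"] = shuttle_y - bind_n + d[10] * v["CXYn"]
    r["CXY"] = bind_c - d[7] * v["CXY"]       # loss term, see note above
    r["CXYn"] = bind_n - d[10] * v["CXYn"]
    r["T"] = -bind_t
    r["CXT"] = bind_t
    return r

rng = random.Random(2)   # arbitrary state and parameters for the check
state = {name: rng.uniform(0.1, 2.0) for name in NAMES}
delta = [rng.uniform(0.1, 2.0) for _ in range(17)]

rates = schmitz_rhs(state, delta)
y_total_rate = sum(rates[s] for s in ("Yi", "Ya", "Yan", "CXY", "CXYn"))
t_total_rate = rates["T"] + rates["CXT"]
print(y_total_rate, t_total_rate)   # both should vanish up to rounding
```

Grouping the shared fluxes into named terms mirrors the bracketing used in Eqs. 25–35 and makes it easier to see which terms cancel in each conserved total.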

3 Techniques for the Analysis of a Specific Model

Once model construction is complete, the modeler aims to extract from it new insight. This can be done in a number of ways: if no data are available, standard mathematical techniques can be used to increase understanding of the behavior of the model; however, if data are available, then it may be possible to estimate model parameters. In this section we describe a number of techniques, some standard and others less so, that can be used to analyze models. We demonstrate these methods by reference to the models of enzyme kinetics and Wnt signaling introduced in Subheading 2.

3.1 Steady State Analysis

Broadly speaking, the behavior of an ODE model can be categorized as either transient or steady state. The latter describes the behavior at large timescales (\( t\to \infty \)). For systems that reach single valued (i.e., not oscillating) steady states, we refer to the long time values that system variables take as the fixed points. Much theory exists for the analysis of fixed points, which can be helpful in characterizing model behavior and predicting the effects of perturbations [38]. We continue by calculating the steady states for the enzyme kinetics model and the Schmitz model (similar analysis can be performed for the Lee model but the resulting expressions are rather involved and therefore omitted).

3.1.1 Case Study I: The Enzyme Kinetics Model (Steady State)

Setting \( \frac{d}{dt}=0 \) in Eqs. 3–6, we deduce that our model for enzyme kinetics evolves to the following unique, steady state solution:

$$ S=0,\kern1em E={E}_0,\kern1em C=0\kern1em \mathrm{and}\kern0.48em P={S}_0. $$

Thus, as expected, the reaction proceeds until all of the substrate S has been converted to product P.
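The intermediate steps can be written out as follows; this is a short check that uses only Eqs. 3–6 and the two conservation laws, and assumes \( {k}_1,{k}_2,{E}_0>0 \):

$$ \frac{dP}{dt}=0\Rightarrow {k}_2C=0\Rightarrow C=0,\kern1em E={E}_0-C={E}_0, $$

$$ \frac{dS}{dt}=0\Rightarrow {k}_1{E}_0S=0\Rightarrow S=0,\kern1em P={S}_0-S-C={S}_0. $$

With \( C=0 \) and \( S=0 \), Eqs. 4 and 5 are satisfied automatically, so no further conditions arise.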

3.1.2 Case Study III: The Schmitz Model (Steady State)

Setting \( \frac{d}{dt}=0 \) in Eqs. 25–35 and manipulating the resulting algebraic equations supplies the following expressions for \( {Y}_{an},{Y}_i,{X}_p,{X}_{pn},{C}_{XY},{C}_{XYn},T \), and \( {C}_{XT} \) in terms of \( X \), \( {X}_n \), and \( {Y}_a \):

$$ \begin{array}{l}{Y}_{an}=\frac{\delta_3}{\delta_4}{Y}_a,\kern1em {Y}_i=\frac{\delta_{15}}{\delta_{16}}{Y}_a,\\ {}{X}_p=\frac{\delta_7}{\delta_{13}}\;\frac{\delta_5}{\delta_6+{\delta}_7}X{Y}_a,\kern1em {X}_{pn}=\frac{\delta_{10}}{\delta_{14}}\;\frac{\delta_3}{\delta_4}\;\frac{\delta_8}{\delta_9+{\delta}_{10}}{X}_n{Y}_a,\end{array} $$
$$ {C}_{XY}=\frac{\delta_5}{\delta_6+{\delta}_7}X{Y}_a,\kern0.36em {C}_{XYn}=\frac{\delta_3}{\delta_4}\;\frac{\delta_8}{\delta_9+{\delta}_{10}}{X}_n{Y}_a, $$
$$ T={\left(1+\frac{\delta_{11}}{\delta_{12}}{X}_n\right)}^{-1}{T}_{\mathrm{TOT}},\kern0.36em {C}_{XT}=\frac{\delta_{11}}{\delta_{12}}{\left(1+\frac{\delta_{11}}{\delta_{12}}{X}_n\right)}^{-1}{X}_n{T}_{\mathrm{TOT}}, $$

wherein \( {Y}_a={Y}_a\left(X,{X}_n\right) \) satisfies

$$ {Y}_{\mathrm{TOT}}={Y}_a\left(1+\frac{\delta_3}{\delta_4}+\frac{\delta_{15}}{\delta_{16}}+\frac{\delta_5}{\delta_6+{\delta}_7}X+\frac{\delta_3}{\delta_4}\;\frac{\delta_8}{\delta_9+{\delta}_{10}}{X}_n\right), $$

while \( {X}_n \) depends linearly on \( X \) via

$$ \begin{array}{l}\left(1+\frac{\delta_3}{\delta_4}+\frac{\delta_{15}}{\delta_{16}}\right)=\frac{\delta_5}{\delta_6+{\delta}_7}\left(\frac{\delta_7}{\delta_0}{Y}_{\mathrm{TOT}}-1\right)\;X\\ {}\kern4.44em +\frac{\delta_8}{\delta_9+{\delta}_{10}}\frac{\delta_3}{\delta_4}\left(\frac{\delta_{10}}{\delta_0}{Y}_{\mathrm{TOT}}-1\right){X}_n,\end{array} $$
(37)

and X solves a quadratic of the form

$$ 0=A{X}^2+BX+C $$
(38)

where the constant coefficients \( A, B, \) and \( C \) are functions of the model parameters. For physically realistic solutions, we require \( X,{X}_n>0 \). Therefore, we conclude that this model has at most two steady states and at most one of them may be stable.
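One route to Eq. 37 is sketched here as a consistency check; it is not necessarily the derivation used in [36]. Summing the equations for all unphosphorylated forms of β-catenin (Eqs. 25, 26, 32, 33, and 35, with \( {\delta}_7{C}_{XY} \) entering Eq. 32 as a loss) shows that, at steady state, synthesis balances phosphorylation:

$$ \frac{d}{dt}\left(X+{X}_n+{C}_{XY}+{C}_{XYn}+{C}_{XT}\right)={\delta}_0-{\delta}_7{C}_{XY}-{\delta}_{10}{C}_{XYn}=0. $$

Substituting the steady state expressions for \( {C}_{XY} \) and \( {C}_{XYn} \) gives

$$ {\delta}_0={Y}_a\left({\delta}_7\frac{\delta_5}{\delta_6+{\delta}_7}X+{\delta}_{10}\frac{\delta_3}{\delta_4}\;\frac{\delta_8}{\delta_9+{\delta}_{10}}{X}_n\right). $$

Eliminating \( {Y}_a \) with the expression for \( {Y}_{\mathrm{TOT}} \) given above, dividing by \( {\delta}_0 \), and collecting the terms in \( X \) and \( {X}_n \) then recovers Eq. 37.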

As models increase in complexity, the algebra usually prohibits the construction of analytical expressions for the steady state solutions. In the following sections we present other methods that can be used to generate insight in such situations.

3.2 Nondimensionalization

When a mathematical model is first developed, the independent and dependent variables typically represent physical quantities (e.g., protein levels) which are measured in dimensional units (e.g., protein levels may be measured as the number of molecules per unit volume or the number of molecules per cell). The model may also contain parameters which relate to physical processes (e.g., reaction rates, Michaelis–Menten constants) and are also dimensional (e.g., rates may be measured per second, per hour, or per day). Nondimensionalization involves recasting the model in terms of dimensionless (or unit-less) variables. This process is instructive for several reasons. First, the number of model parameters is typically reduced. Second, the resulting dimensionless parameter groupings can provide useful information about the system’s behavior. Third, if estimates of these parameters can be obtained and compared, it is possible to identify the physical processes that dominate on a particular timescale and, thereby, to justify simplifying the governing equations. We illustrate these concepts by nondimensionalizing the enzyme kinetics and Schmitz models.

3.2.1 Case Study I: The Enzyme Kinetics Model (Nondimensionalization)

We introduce the dimensionless variables τ, s, e, c, and p where

$$ t=T\tau, \kern1em S={S}_0s,\kern1em E={E}_0e,\kern1em C={E}_0c,\kern1em P={S}_0p, $$

and the timescale \( T \) is specified below. It is natural to scale the complex C with \( {E}_0 \) since the amount of complex that forms is limited by the amount of enzyme present. If the enzyme is working effectively (i.e., serving as an efficient catalyst), then the amount of product created will be comparable to the amount of substrate. Therefore, we scale P with \( {S}_0 \) rather than \( {E}_0 \).

There are several possible choices for the timescale \( T \). Consider Eqs. 3 and 4. Initially, when \( C=0 \), the maximum rate of uptake of S (per unit of substrate) is \( {k}_1{E}_0 \) and, similarly, the initial rate of uptake of E (per unit of enzyme) is \( {k}_1{S}_0 \). The associated timescales are \( {T}_1=1/\left({k}_1{E}_0\right) \) and \( {T}_2=1/\left({k}_1{S}_0\right) \). Since enzyme levels are typically much smaller than substrate levels (i.e., \( {E}_0/{S}_0=\varepsilon \ll 1 \)), it is clear that \( {T}_2/{T}_1={E}_0/{S}_0\ll 1 \). We conclude that \( {T}_1 \) represents a long timescale, associated with substrate depletion, while \( {T}_2 \) represents a short timescale, associated with the initial rapid uptake of enzyme.

Rescaling on the longer timescale, so that \( t=\tau {T}_1=\tau /\left({k}_1{E}_0\right) \), Eqs. 7–9 transform to give

$$ \frac{ds}{d\tau }=-s\left(1-c\right)+{\kappa}_ec, $$
(39)
$$ \varepsilon \frac{dc}{d\tau }=s\left(1-c\right)-{\kappa}_mc, $$
(40)
$$ s\left(\tau =0\right)=1,\kern1em c\left(\tau =0\right)=0, $$
(41)
$$ \mathrm{where}\kern1em \varepsilon =\frac{E_0}{S_0},\kern1em {\kappa}_e=\frac{k_{-1}}{k_1{S}_0}\kern1em \mathrm{and}\kern1em {\kappa}_m=\frac{k_{-1}+{k}_2}{k_1{S}_0}. $$
(42)

Comparing Eqs. 7–9 and 39–42, we note that nondimensionalization has reduced the number of model parameters from five (\( {k}_1,{k}_{-1},{k}_2,{E}_0,{S}_0 \)) to three (\( \varepsilon, {\kappa}_e,{\kappa}_m \)). We remark further that, in Eq. 40, the initial conditions give \( dc(0)/d\tau = 1/\varepsilon \). Thus, if ε ≪ 1, then c will initially increase very rapidly on the timescale τ.
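As a quick numerical check of these scalings, the following Python sketch computes the dimensionless groups of Eq. 42 and the two candidate timescales from a set of dimensional values. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
def nondimensionalize(k1, k_minus1, k2, E0, S0):
    """Return (epsilon, kappa_e, kappa_m) as defined in Eq. 42."""
    epsilon = E0 / S0
    kappa_e = k_minus1 / (k1 * S0)
    kappa_m = (k_minus1 + k2) / (k1 * S0)
    return epsilon, kappa_e, kappa_m


def timescales(k1, E0, S0):
    """Return the long (T1) and short (T2) timescales."""
    return 1.0 / (k1 * E0), 1.0 / (k1 * S0)


# Hypothetical rate constants and initial amounts (arbitrary units).
params = dict(k1=1.0, k_minus1=0.5, k2=0.5, E0=0.01, S0=1.0)

eps, kappa_e, kappa_m = nondimensionalize(**params)
T1, T2 = timescales(params["k1"], params["E0"], params["S0"])

# At tau = 0 (s = 1, c = 0), Eq. 40 gives dc/dtau = 1/epsilon.
s_init, c_init = 1.0, 0.0
initial_slope = (s_init * (1.0 - c_init) - kappa_m * c_init) / eps

print(eps, kappa_e, kappa_m, T2 / T1, initial_slope)
```

With these assumed values the ratio of timescales equals ε, and the initial slope of c is large, which is the signature of the fast initial transient discussed above.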

3.2.2 Case Study III: The Schmitz Model (Nondimensionalization)

The procedure for nondimensionalizing the Schmitz model is identical to that used for the enzyme kinetics model. As the dimension of the system increases, and more processes are included, the number of ways to rescale the independent and dependent variables increases rapidly. In such situations, it is important to consider which variables are expected to vary and over what timescale: the answers to these questions should help to identify appropriate scalings.

When studying Wnt signaling, inactivation of the DC plays a key role in the system dynamics. Therefore, when we nondimensionalize the Schmitz model, time is rescaled so that \( t=\tau /{\delta}_{15} \) (\( {\delta}_{15}^{-1} \) is the timescale for inactivation of the DC). Variables relating to free β-catenin (i.e., \( X,{X}_n,{X}_p,{X}_{pn} \)) are all rescaled with \( \tilde{B}={\delta}_0/{\delta}_{15}, \) the amount of β-catenin produced during the typical timescale \( \tilde{t}={\delta}_{15}^{-1} \). With this scaling, \( {\delta}_0 \) no longer appears as a separate production rate: the production term in the dimensionless equation for x is normalized to unity (see Eq. 43 below). When choosing the scalings for variables involving DC and TCF, we aim to preserve conservation laws. Accordingly, guided by Eq. 36, we scale \( {Y}_a,{Y}_i,{Y}_{an},{C}_{XY} \), and \( {C}_{XYn} \) with \( {Y}_{\mathrm{TOT}} \), the total amount of DC in the system. Similarly, we scale T and \( {C}_{XT} \) with the total amount of TCF in the system, \( {T}_{\mathrm{TOT}} \) (written \( {T}_0 \) in the scalings below). Summarizing, we have

$$ \begin{array}{c}\left(X,{X}_n,{X}_p,{X}_{pn}\right)=\tilde{B}\left(x,{x}_n,{x}_p,{x}_{pn}\right),\\ {}\left({Y}_a,{Y}_i,{Y}_{an},{C}_{XY},{C}_{XYn}\right)={Y}_{\mathrm{TOT}}\left({y}_a,{y}_i,{y}_{an},{c}_{xy},{c}_{xyn}\right),\\ {}\left(T,{C}_{XT}\right)={T}_0\;\left(\theta, {c}_{x\theta}\right),\\ {}\end{array} $$

where \( x\left(\tau \right),{x}_n\left(\tau \right),\dots, {c}_{x\theta}\left(\tau \right) \) are dimensionless variables. Under these scalings, the Schmitz model gives the following nondimensional system:

$$ {x}^{\prime }=1+\left({\tilde{\delta}}_2{x}_n-{\tilde{\delta}}_1x\right)+\left({\tilde{\delta}}_6{c}_{xy}-{\tilde{\delta}}_5x{y}_a\right), $$
(43)
$$ {x}_n^{\prime }=\left({\tilde{\delta}}_1x-{\tilde{\delta}}_2{x}_n\right)+\left({\tilde{\delta}}_9{c}_{xyn}-{\tilde{\delta}}_8{x}_n{y}_{an}\right)+\left({\tilde{\delta}}_{12}{c}_{x\theta }-{\tilde{\delta}}_{11}{x}_n\theta \right), $$
(44)
$$ {x}_p^{\prime }={\tilde{\delta}}_7{c}_{xy}-{\tilde{\delta}}_{13}{x}_p, $$
(45)
$$ {x}_{pn}^{\prime }={\tilde{\delta}}_{10}{c}_{xyn}-{\tilde{\delta}}_{14}{x}_{pn}, $$
(46)
$$ \frac{1}{\omega }{y}_a^{\prime }=\frac{1}{\omega}\left({\tilde{\delta}}_4{y}_{an}-{\tilde{\delta}}_3{y}_a\right)+\left({\tilde{\delta}}_6{c}_{xy}-{\tilde{\delta}}_5x{y}_a\right)+{\tilde{\delta}}_7{c}_{xy}+\frac{1}{\omega}\left({\tilde{\delta}}_{16}{y}_i-{y}_a\right), $$
(47)
$$ \frac{1}{\omega }{y}_i^{\prime }=\frac{1}{\omega}\left({y}_a-{\tilde{\delta}}_{16}{y}_i\right), $$
(48)
$$ \frac{1}{\omega }{y}_{an}^{\prime }=\frac{1}{\omega}\left({\tilde{\delta}}_3{y}_a-{\tilde{\delta}}_4{y}_{an}\right)+\left({\tilde{\delta}}_9{c}_{xyn}-{\tilde{\delta}}_8{x}_n{y}_{an}\right)+{\tilde{\delta}}_{10}{c}_{xyn}, $$
(49)
$$ \frac{1}{\omega }{c}_{xy}^{\prime }=\left({\tilde{\delta}}_5x{y}_a-{\tilde{\delta}}_6{c}_{xy}\right)-{\tilde{\delta}}_7{c}_{xy}, $$
(50)
$$ \frac{1}{\omega }{c}_{xyn}^{\prime }=\left({\tilde{\delta}}_8{x}_n{y}_{an}-{\tilde{\delta}}_9{c}_{xyn}\right)-{\tilde{\delta}}_{10}{c}_{xyn}, $$
(51)
$$ \frac{1}{\nu }{\theta}^{\prime }=\left({\tilde{\delta}}_{12}{c}_{x\theta }-{\tilde{\delta}}_{11}{x}_n\theta \right), $$
(52)
$$ \frac{1}{\nu }{c}_{x\theta}^{\prime }=\left({\tilde{\delta}}_{11}{x}_n\theta -{\tilde{\delta}}_{12}{c}_{x\theta}\right), $$
(53)

where primes denote differentiation with respect to τ and \( {\tilde{\delta}}_i\left(i=1,2,\dots, 16\right) \) are the following dimensionless parameters:

$$ {\tilde{\delta}}_1=\frac{\delta_1}{\delta_{15}},\kern1em {\tilde{\delta}}_2=\frac{\delta_2}{\delta_{15}},\kern1em {\tilde{\delta}}_3=\frac{\delta_3}{\delta_{15}},\kern1em {\tilde{\delta}}_4=\frac{\delta_4}{\delta_{15}},\kern1em {\tilde{\delta}}_5=\frac{\delta_5{Y}_{\mathrm{TOT}}}{\delta_{15}}, $$
(54)
$$ {\tilde{\delta}}_6=\frac{\delta_6{Y}_{\mathrm{TOT}}}{\delta_0},\kern1em {\tilde{\delta}}_7=\frac{\delta_7{Y}_{\mathrm{TOT}}}{\delta_0},\kern1em {\tilde{\delta}}_8=\frac{\delta_8{Y}_{\mathrm{TOT}}}{\delta_{15}},\kern1em {\tilde{\delta}}_9=\frac{\delta_9{Y}_{\mathrm{TOT}}}{\delta_0},\kern1em {\tilde{\delta}}_{10}=\frac{\delta_{10}{Y}_{\mathrm{TOT}}}{\delta_0}, $$
(55)
$$ {\tilde{\delta}}_{11}=\frac{\delta_{11}{T}_0}{\delta_{15}},\kern1em {\tilde{\delta}}_{12}=\frac{\delta_{12}{T}_0}{\delta_0},\kern1em {\tilde{\delta}}_{13}=\frac{\delta_{13}}{\delta_{15}},\kern1em {\tilde{\delta}}_{14}=\frac{\delta_{14}}{\delta_{15}},\kern1em {\tilde{\delta}}_{16}=\frac{\delta_{16}}{\delta_{15}}, $$
(56)
$$ \omega =\frac{\left({\delta}_0/{\delta}_{15}\right)}{Y_{\mathrm{TOT}}}\kern1em \mathrm{and}\kern1em \nu =\frac{\left({\delta}_0/{\delta}_{15}\right)}{T_0}. $$
(57)

3.3 Asymptotic Analysis

In applied mathematics, if the (dimensionless) governing equations contain a small parameter, it is common to assume that there is an asymptotic expansion for the solution, as a power series in the small parameter. As we demonstrate below, this technique can be used systematically to simplify a mathematical model and, in so doing, provide useful information about the dynamics of its components.

3.3.1 Case Study I: The Enzyme Kinetics Model (Asymptotics)

A key assumption of the enzyme kinetics model is that initial enzyme levels are much smaller than substrate levels. This assumption is represented in the dimensionless model equations via the small parameter \( \varepsilon ={E}_0/{S}_0\ll 1 \). We exploit this small parameter by seeking a solution to Eqs. 39–41 of the form

$$ s\left(\tau \right)\sim {s}_0\left(\tau \right)+\varepsilon {s}_1\left(\tau \right),\kern1em c\left(\tau \right)\sim {c}_0\left(\tau \right)+\varepsilon {c}_1\left(\tau \right). $$
(58)

Substituting Eq. 58 into the governing equations and equating terms of \( O\left({\varepsilon}^n\right) \) to zero, we deduce that, at leading order, \( {s}_0 \) and \( {c}_0 \) satisfy

$$ \frac{d{s}_0}{d\tau }={\kappa}_e{c}_0-{s}_0\left(1-{c}_0\right), $$
(59)
$$ 0={s}_0\left(1-{c}_0\right)-{\kappa}_m{c}_0, $$
(60)
$$ {s}_0(0)=1,\kern1em {c}_0(0)=0. $$
(61)

Thus, at leading order, the ODE for c reduces to an algebraic relation that gives \( {c}_0 \) in terms of \( {s}_0 \). Substituting this relation into Eq. 59 yields a separable ODE for \( {s}_0 \), with the implicit solution

$$ {\kappa}_m \log {s}_0\left(\tau \right)+{s}_0\left(\tau \right)=A-\kappa \tau, \kern1em {c}_0=\frac{s_0}{\kappa_m+{s}_0}, $$
(62)

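where A is a constant of integration and \( \kappa ={\kappa}_m-{\kappa}_e={k}_2/\left({k}_1{S}_0\right) \). A problem arises when we attempt to impose the initial conditions: it is not possible to satisfy both simultaneously, since \( {s}_0(0)=1 \) forces \( {c}_0(0)=1/\left(1+{\kappa}_m\right)\ne 0 \). This is because the leading order problem is of lower order than the original one.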

In order to resolve this problem, we use matched asymptotic expansions. We recall that c varies rapidly near τ = 0 and, hence, examine the system dynamics near τ = 0 by switching to the short timescale \( T=\tau /\varepsilon. \) In terms of T, the model becomes

$$ \frac{d\tilde{s}}{dT}=\varepsilon \left({\kappa}_e\tilde{c}-\tilde{s}\left(1-\tilde{c}\right)\right), $$
(63)
$$ \frac{d\tilde{c}}{dT}=\tilde{s}\left(1-\tilde{c}\right)-{\kappa}_m\tilde{c}, $$
(64)
$$ \tilde{s}(0)=1,\kern1em \tilde{c}(0)=0. $$
(65)

where \( \tilde{s}(T)=s\left(\tau \right) \), \( \tilde{c}(T)=c\left(\tau \right) \). As before, we seek asymptotic expansions for \( \tilde{s} \) and \( \tilde{c} \) in terms of ε ≪ 1, of the form specified at Eq. 58. In this way, we obtain the following leading order solutions for \( {\tilde{s}}_0(T) \) and \( {\tilde{c}}_0(T) \):

$$ {\tilde{s}}_0(T)=1,\kern1em {\tilde{c}}_0(T)=\frac{1-{e}^{-\left(1+{\kappa}_m\right)T}}{1+{\kappa}_m}. $$
(66)

The above approximate solution is accurate near τ = 0 but not for τ = O(1), whereas Eq. 62 is accurate for τ = O(1) but not for τ ≪ 1. The method of matched asymptotics involves choosing the constant of integration A to match Eqs. 62 and 66 [39]. By imposing the matching conditions

$$ \underset{\tau \to 0}{ \lim}\left({s}_0\left(\tau \right),{c}_0\left(\tau \right)\right)=\underset{T\to \infty }{ \lim}\left({\tilde{s}}_0(T),{\tilde{c}}_0(T)\right), $$

we deduce that A = 1.
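The quality of the matched approximation can be checked numerically. The sketch below is a minimal illustration under assumed parameter values (ε = 0.01, \( {\kappa}_e=0.5 \), \( {\kappa}_m=1 \)): it integrates Eqs. 39–41 with a classical Runge–Kutta scheme and compares c with a leading-order composite approximation, formed as the outer solution plus the inner solution minus their common limit \( 1/\left(1+{\kappa}_m\right) \). It uses A = 1 and \( \kappa ={\kappa}_m-{\kappa}_e \), which follows from substituting Eq. 60 into Eq. 59. The additive composite is a standard construction that we assume here; it is not stated in the text, and all function names are illustrative.

```python
import math


def rhs(s, c, eps, kappa_e, kappa_m):
    """Right-hand sides of Eqs. 39-40, solved for ds/dtau and dc/dtau."""
    ds = -s * (1.0 - c) + kappa_e * c
    dc = (s * (1.0 - c) - kappa_m * c) / eps
    return ds, dc


def rk4_step(s, c, h, *p):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(s, c, *p)
    k2 = rhs(s + 0.5 * h * k1[0], c + 0.5 * h * k1[1], *p)
    k3 = rhs(s + 0.5 * h * k2[0], c + 0.5 * h * k2[1], *p)
    k4 = rhs(s + h * k3[0], c + h * k3[1], *p)
    s += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    c += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return s, c


def outer_s0(tau, kappa_e, kappa_m):
    """Solve kappa_m*log(s0) + s0 = 1 - kappa*tau for s0 by bisection."""
    target = 1.0 - (kappa_m - kappa_e) * tau
    lo, hi = 1e-12, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kappa_m * math.log(mid) + mid < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def composite_c(tau, eps, kappa_e, kappa_m):
    """Outer + inner - common limit: a leading-order composite for c."""
    s0 = outer_s0(tau, kappa_e, kappa_m)
    outer = s0 / (kappa_m + s0)
    inner = (1.0 - math.exp(-(1.0 + kappa_m) * tau / eps)) / (1.0 + kappa_m)
    return outer + inner - 1.0 / (1.0 + kappa_m)


eps, kappa_e, kappa_m = 0.01, 0.5, 1.0   # hypothetical values
h, n_steps = 1e-3, 2000                   # integrate up to tau = 2
s, c, max_err = 1.0, 0.0, 0.0
for i in range(1, n_steps + 1):
    s, c = rk4_step(s, c, h, eps, kappa_e, kappa_m)
    max_err = max(max_err, abs(c - composite_c(i * h, eps, kappa_e, kappa_m)))

print(f"max |c_numeric - c_composite| = {max_err:.4f}")
```

The step size is chosen well below ε so that the explicit scheme resolves the initial layer; a stiff solver would be a more economical choice for smaller ε.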

In practice, similar asymptotic analyses can be used to study ODE models of signaling pathways. As we have seen, such models may involve large numbers of variables and parameters, and estimates for many parameters may be lacking. In such cases, progress can be made by using order of magnitude estimates for certain processes. For example, in [29], the authors assume that all binding reactions are rapid, apart from the binding of GSK3β to APC/Axin. Under this fast kinetics assumption, the ODEs for the relevant species reduce to algebraic equations, in the same way that, for the enzyme kinetics model, on the longer timescale the ODE for the complex c reduces to an algebraic relation (see Eq. 60).

To the best of our knowledge, the Schmitz model has yet to be subjected to such asymptotic analysis. Referring to Eqs. 43–53, and by analogy with the asymptotic analysis of the enzyme kinetics model presented above, we note that the dynamics of the system will be strongly influenced by the ratios ω and ν. For example, if typical levels of β-catenin are much greater than levels of TCF and DC, then both ratios defined in Eq. 57 are large, and we could construct approximate solutions to the Schmitz model in the limit for which ω ≫ 1 and ν ≫ 1. Such an analysis of the Lee model was performed by Mirams et al. [40]. Since the details are rather involved, we summarize the key points below and refer the interested reader to [40] for further details.

3.3.2 Case Study II: The Lee Model (Asymptotics)

Numerical simulations of the Lee model generated using parameter estimates reported in [29] (see Fig. 5) suggest that the processes involved in the Wnt signaling pathway act over at least two different timescales. Lee et al.’s parameter estimates indicate that the basal rate at which β-catenin is degraded is much smaller than the rate at which the DC becomes inactive. This discrepancy is exploited to define a small parameter, \( \eta ={\alpha}_{16}/{\alpha}_{15} \), which is the ratio of the rate at which β-catenin undergoes natural decay to the rate at which the DC becomes inactive. The dimensionless parameters are then rescaled by multiplying them by appropriate powers of η so that they are O(1). By retaining terms of leading order, the following reduced model is obtained:

$$ \frac{d{D}_a}{dt}={\overline{\alpha}}_1W\left(1-{D}_a\right)-{\overline{\alpha}}_2{D}_a, $$
(67)
$$ \frac{d{Y}_i}{dt}=-\left({\overline{\alpha}}_5{D}_a+{\overline{\alpha}}_3+{\overline{\alpha}}_7\right){Y}_i+{Y}_a+\frac{{\overline{\alpha}}_6N}{1+\eta {\overline{K}}_1X}, $$
(68)
$$ \eta \frac{d{C}_{XY}}{dt}={\overline{\alpha}}_{10}X{Y}_a-{\overline{\alpha}}_{11}{C}_{XY}, $$
(69)
$$ \frac{dN}{dt}=\left(\left({\overline{\alpha}}_5{D}_a+{\overline{\alpha}}_7\right){Y}_i-\left(\frac{{\overline{\alpha}}_6}{\left(1+\eta {\overline{K}}_1X\right)}+{\overline{\alpha}}_{18}\right)N+1\right)\frac{1}{1+{\overline{K}}_2}, $$
(70)
$$ \frac{d{Y}_a}{dt}=\frac{{\overline{\alpha}}_3{Y}_i-{Y}_a-\frac{d{C}_{XY}}{dt}}{1+{\overline{K}}_3X}, $$
(71)
$$ \frac{1}{\eta}\frac{dX}{dt}={\overline{\alpha}}_{15}-{\overline{\alpha}}_{10}X{Y}_a-{\overline{\alpha}}_{16}X. $$
(72)

We remark that Eq. 67 decouples and if a constant Wnt stimulus is applied (W(t) = W, constant), then

$$ {D}_a\to \frac{{\overline{\alpha}}_1W}{{\overline{\alpha}}_1W+{\overline{\alpha}}_2}. $$

We note further that the time derivatives in Eqs. 67–72 are premultiplied by three different powers of η. This suggests that model processes act on three distinct timescales, a prediction that is consistent with the rapid fluctuations and slow increases depicted in Fig. 5.

As we have already seen for the enzyme kinetics model (Eq. 58), it is possible to analyze the reduced Lee model on different timescales; here we have short, medium, and long timescales for which \( t=O\left(\eta \right) \), \( O(1) \), and \( O\left({\eta}^{-1}\right) \), respectively. In each case, asymptotic expansions in powers of the small parameter η are sought and used to simplify the governing equations. The results of this analysis can be summarized as follows (see [40] for details).

  1. Short timescale (\( t=O\left(\eta \right) \)): all model variables except \( {Y}_i \) and \( {C}_{XY} \) are constant, at leading order. The dominant reaction is phosphorylation of β-catenin by active destruction complex.

  2. Intermediate timescale (\( t=O(1) \)): the dominant reaction is found to involve inactivation of the destruction complex.

  3. Long timescale (\( t=O\left({\eta}^{-1}\right) \)): the dynamics are dominated by degradation of free β-catenin.

Pathway components acting on the short, intermediate, and long timescales are highlighted in Fig. 4, while Fig. 5 shows good agreement between the approximate solutions and those of the full model.

Fig. 4
Series of schematics showing which components of the Lee model of Wnt signaling are active on the short (top), medium (middle), and long (bottom) timescales. The active components on each timescale are highlighted with bold borders. Figure reproduced from [40], with permission

Fig. 5
Series of figures showing how the Lee model responds to a Wnt stimulus (W = 1) that is applied at t = 0 to a pathway initially in equilibrium (W = 0). Also shown is the asymptotic solution obtained by matching the short, medium, and long time approximations to the Lee model. There is good agreement between the approximate and numerical solutions at all timescales. Key: numerical simulations of the (dimensionless) Lee model, Eqs. 10–24 (solid line); short, medium, and long time approximations are represented by dash-dotted, dotted, and dashed lines, respectively. Figure reproduced from [40], with permission

3.4 Parameter Analyses

The selection of model parameters, their physical meaning, and their numerical values are especially important; parameter analysis examines the response of the system to changes in parameters. Many methods for estimating parameters depend on time course data. These data generally give a quantitative measure of a variable, such as mRNA or protein concentration, at different time points. Testing a model against experimental data is a good way to validate or invalidate it; however, gathering enough experimental data to determine all parameter values is often too expensive, and overfitting (i.e., describing noise instead of the underlying relationship) is a risk, as demonstrated for Wnt signaling later in this section. Following parameter estimation (using optimization) or parameter inference (using statistics), a good way to test a model is by performing parameter sensitivity analysis: this evaluates qualitative or quantitative relationships between parameters and their effect on the system outcome [41].

3.4.1 Parameter Estimation and Wnt Data

Ultimately, every model should be tested against data, a process that can either invalidate the model or provide evidence in its favor, if it provides a good fit under acceptable conditions. The aim is to estimate parameters that drive the model close to the data; this can be done using minimization techniques. Effectively, one defines an objective function that measures the difference between the observations (data) and the model simulated for particular values of the parameters κ, and then seeks the parameter values that minimize this function, usually iteratively [42–44].
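To make the idea of an objective function concrete, the following Python sketch fits a single decay rate to a short time course by minimizing the sum of squared differences between model and data. The model, the data values, and the golden-section search are illustrative assumptions; they are not the methods or data of [42–44], where more general optimizers are discussed.

```python
import math

# Hypothetical time-course data for a species decaying as exp(-kappa * t).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
observed = [1.00, 0.73, 0.56, 0.40, 0.31, 0.22]


def simulate(kappa):
    """Model prediction at each observation time."""
    return [math.exp(-kappa * t) for t in times]


def objective(kappa):
    """Sum of squared differences between model and data."""
    return sum((m - d) ** 2 for m, d in zip(simulate(kappa), observed))


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Iteratively shrink [lo, hi] around a minimum of a unimodal f."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


kappa_hat = golden_section_minimize(objective, 0.0, 2.0)
print(f"estimated kappa = {kappa_hat:.3f}, error = {objective(kappa_hat):.5f}")
```

For models with many parameters, a one-dimensional search of this kind would be replaced by a multivariate optimizer, and the objective may have several local minima.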

Since the publication of the Lee model [29], where estimates of the parameters controlling Wnt signaling were based on data from Xenopus extracts, few studies have quantitatively studied the dynamics of the Wnt pathway. This knowledge gap means that it remains difficult to test the models that have arisen in recent years. This problem is not uncommon in systems medicine. We also remark that the Xenopus data gathered by Lee et al. may be markedly different from those for mammalian Wnt signaling. In [13], dynamic changes in β-catenin levels were investigated in Xenopus extracts. The authors demonstrated that absolute levels of β-catenin did not dictate the Wnt signaling outcome: rather, the β-catenin fold-change was the crucial variable. They used the Lee model to test their experimental results and, via sensitivity analysis, found that the model was consistent with their experimental findings.

Quantification of Wnt signaling in mammalian cell lines was undertaken by Hernández et al. [14] and Tan et al. [15]. Discrepancies with data from Xenopus extracts (such as higher Axin levels and lower APC levels in mammalian cells) highlight the need for caution in data gathering and for further quantification of the pathway. Since these measurements were made at steady state, they do not yet permit elucidation of transient Wnt signaling. More recent measurements of cytoplasmic and nuclear β-catenin in response to a Wnt stimulus provide a valuable first look at the dynamics of the pathway [45].

The above studies provide preliminary insight into the Wnt pathway but much remains to be done. The data are not yet of sufficient quality to discriminate between most models (which typically contain many molecular species). Caution must also be taken when applying data, for example, when data generated from nonmammalian systems are used in a model that addresses clinical outcomes. For systems medicine to have the greatest impact, modeling (with prediction) and experimentation (to test predictions) must proceed iteratively.

3.4.2 Parameter Inference

There are often cases where it is either infeasible or impossible experimentally to determine values for parameters that describe a given model. In such cases, we may be able to estimate (some of) the parameters using statistical inference. In general the aim is to identify the values of the parameters, \( \theta \) (ideally including corresponding confidence regions), for which a model best explains the data.

A reliable way of doing so is to focus on the likelihood \( L\left(\theta \right) \), which is defined as the probability of observing the data (x) given parameters (\( \theta \)):

$$ L\left(\theta \right):=P\left(x\Big|\theta \right). $$

Varying \( \theta \) to identify the value for which this probability is maximized gives the maximum-likelihood estimate. There is a rich literature on this topic and how confidence of the estimates can be assessed [46].

Likelihood estimates center around the available data. In many circumstances we may have additional information, for example based on biophysical arguments, about which parameter values can be ruled out. Incorporating such prior information is hard in a pure likelihood framework, but lies at the heart of Bayesian inference [47]. Here inferences are based on the posterior distribution over model parameters. The posterior distribution can be described starting from Bayes rule:

$$ P\left(\theta \Big|x\right)\propto P\left(x\Big|\theta \right)\pi \left(\theta \right). $$
(73)

\( P\left(\theta \Big|x\right) \), the probability of \( \theta \) given x, is called the posterior probability, \( P\left(x\Big|\theta \right) \) is the likelihood function, and \( \pi \left(\theta \right) \) is the prior probability (knowledge about parameters before we begin fitting to data) [48]. As well as the full (joint) posterior distribution, one may also analyze the marginal posterior distributions which are the individual distributions over each parameter.
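The relationship between likelihood, prior, and posterior in Eq. 73 can be illustrated on a one-parameter grid. The sketch below assumes a toy decay model, Gaussian measurement noise with known standard deviation, and a uniform prior that rules out values above 0.5; these are hypothetical choices for illustration and do not correspond to any model in this chapter.

```python
import math

# Hypothetical data: noise-free samples of x(t) = exp(-0.3 * t).
times = [1.0, 2.0, 3.0, 4.0, 5.0]
data = [math.exp(-0.3 * t) for t in times]
SIGMA = 0.05  # assumed measurement standard deviation


def log_likelihood(theta):
    """Gaussian log-likelihood of the data, up to an additive constant."""
    residuals = [math.exp(-theta * t) - x for t, x in zip(times, data)]
    return -sum(r * r for r in residuals) / (2.0 * SIGMA ** 2)


def prior(theta):
    """Uniform prior on [0, 0.5]; larger values are ruled out a priori."""
    return 2.0 if 0.0 <= theta <= 0.5 else 0.0


grid = [i / 1000.0 for i in range(1001)]          # theta in [0, 1]
likelihood = [math.exp(log_likelihood(th)) for th in grid]

# Maximum-likelihood estimate: the grid point with the largest likelihood.
theta_mle = max(zip(likelihood, grid))[1]

# Posterior on the grid: likelihood times prior, normalized to sum to one.
unnormalized = [lik * prior(th) for lik, th in zip(likelihood, grid)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

posterior_mean = sum(th * p for th, p in zip(grid, posterior))
print(f"MLE = {theta_mle:.3f}, posterior mean = {posterior_mean:.3f}")
```

Grid evaluation is only practical for one or two parameters; for higher-dimensional problems the posterior is usually explored by sampling.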

In certain cases, such as for large, complex systems, computing the likelihood is impractical. In such cases approximate Bayesian computation (ABC) should be considered [49]. Instead of the likelihood, a distance function is used to compare the actual data with data simulated by a model, denoted \( {x}_m \). If the underlying model is given by \( f=f\left({x}_m\Big|\theta \right) \), then we express the ABC posterior function by

$$ {P}_{\mathrm{ABC}}\left(\theta \Big|x\right)\propto 1\left(\varDelta \left(x,{x}_m\right)\le \varepsilon \right)f\left({x}_m\Big|\theta \right)\pi \left(\theta \right) $$
(74)

where \( \varDelta \left(a,b\right) \) denotes a distance measure between a and b, and ε is the tolerance level that determines how well real and simulated data should agree.

By evaluating the posterior function, ABC allows the modeler to identify parameter regions that are of interest, and ignore those that are not. Furthermore, the posterior distribution gives information about joint distributions in parameter space and can reveal multivariate dependencies between parameters.
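The acceptance rule in Eq. 74 can be sketched with the simplest (rejection) form of ABC. The code below is a minimal illustration, assuming a deterministic toy model, a uniform prior on [0, 1], a Euclidean distance, and a fixed tolerance; it is not the sequential scheme implemented in ABC-SysBio, and the model and data are hypothetical.

```python
import math
import random

times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]


def simulate(theta):
    """Deterministic toy model x_m(t) = exp(-theta * t)."""
    return [math.exp(-theta * t) for t in times]


def distance(a, b):
    """Euclidean distance between two time courses."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def abc_rejection(data, n_draws, tolerance, rng):
    """Keep prior draws whose simulated data lie within the tolerance."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 1.0)          # draw from the prior U[0, 1]
        if distance(data, simulate(theta)) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
data = simulate(0.3)                            # hypothetical "observations"
samples = abc_rejection(data, n_draws=5000, tolerance=0.1, rng=rng)

mean = sum(samples) / len(samples)
print(f"accepted {len(samples)} of 5000 draws; posterior mean ~ {mean:.3f}")
```

The accepted draws approximate the ABC posterior; reducing the tolerance tightens the approximation at the cost of fewer accepted samples, which is the trade-off that sequential ABC schemes are designed to manage.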

ABC for parameter inference has been implemented in the software package ABC-SysBio with support for parallelization [50]. For the examples given below, we used the CUDA implementation of ABC-SysBio with a Euclidean distance measure between model and data [51, 52]. Proceeding to analyze the Lee and Schmitz models, we do not try to infer all of the model parameters, since this is not possible with the data available, but instead study a 3D subset of parameter space. We choose free parameters that have direct (or strong) influence on the dynamics of β-catenin, since this is the species for which we have experimental measurements. The data used for fitting are published in [45]: they describe how the level of β-catenin changes over time in the cytoplasm and nucleus, following application of a Wnt stimulus to the system. These data, alongside the results of the parameter inference, are shown in Fig. 6.

Fig. 6
Data published in [45] were used to fit the Lee and Schmitz models using approximate Bayesian computation for parameter inference. β-catenin concentration units were normalized based on their initial values. From the inference, we can see that the Lee model provides a better fit to the data

For the Lee model, we study the β-catenin-DC binding rate (\( {\alpha}_{10} \)), which has a prior of [0, 100], the β-catenin degradation rate that is independent of the DC (\( {\alpha}_{16} \)), and the binding rate of β-catenin to TCF (\( {\alpha}_{19} \)). The latter two parameters both have priors of [0, 1]. The marginal posterior distributions for these three parameters (Fig. 7) show that the β-catenin-DC binding parameter takes values over the lower half of its prior range, whereas the other two parameters can take any values spanning the prior range. This suggests that for this model the parameter that has the greatest impact on outcome is the β-catenin-DC binding rate; however, we note the larger prior range over this parameter.

Fig. 7
Posterior distributions and sensitivity analysis for the Lee and Schmitz models. Histograms of marginal posteriors for each free parameter in the two models are shown. The marginal posterior is the probability distribution for a single parameter, given data describing β-catenin dynamics in cytoplasmic and nuclear compartments [45]. Principal component (PC) analysis allows us to assess the sensitivity of the parameters to small perturbations: the last PC (PC3) contains the most sensitive parameter combinations. We see that for each model, two parameters dominate PC3 and, thus, are most sensitive in this system

For the Schmitz model, we study the β-catenin production rate (\( {\delta}_0 \)), the β-catenin shuttling rate (\( {\delta}_1 \)), and the binding rate of β-catenin to TCF (\( {\delta}_{11} \)). The prior used for each parameter is [0, 1] and we see from Fig. 7 that the marginal posterior distributions are relatively stiff: each parameter is constrained to lie within a narrow range relative to its prior. In order to fit the data, the rates of β-catenin shuttling and binding to TCF must be low, while the rate of β-catenin production must be high.

3.4.3 Sensitivity Analysis

Sensitivity analysis investigates how a model responds to perturbations around a set of parameter values and characterizes its robustness: a robust system is one for which perturbations of the parameters or initial conditions do not change the outcome. However, many trade-offs between sensitivity and robustness exist [53–55].

Local sensitivity analysis determines how parameter perturbations affect the output of a system. Estimated or inferred parameters can be used as a baseline for parameter sensitivity. If the output of \( dx/dt=f\left(x,\kappa \right) \) is approximated by a first-order Taylor series in a neighborhood of reference input values, then the local sensitivity coefficient \( {s}_{i,j} \) is the partial derivative of the ith state with respect to the jth parameter:

$$ {s}_{i,j}(t)=\frac{\partial {x}_i(t)}{\partial {\kappa}_j}, $$
(75)

The elements \( {s}_{i,j} \) define a sensitivity matrix \( S=\partial x/\partial \kappa \). This local method provides information about the sensitivity in a given parameter region but not the global sensitivity landscape. Local sensitivity analysis can reveal parameters that are sensitive or robust to perturbations in the region of interest.
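In practice, the coefficients in Eq. 75 are often estimated numerically. The sketch below approximates them by central finite differences for a toy one-state model with constant production and linear decay; the model, the baseline parameter values, and the function names are hypothetical and serve only to illustrate the definition.

```python
def f(x, kappa):
    """Toy model: constant production kappa[0], linear decay kappa[1]."""
    return kappa[0] - kappa[1] * x


def solve(kappa, t_end, n_steps=1000, x0=0.0):
    """Integrate dx/dt = f(x, kappa) with the classical Runge-Kutta method."""
    h, x = t_end / n_steps, x0
    for _ in range(n_steps):
        k1 = f(x, kappa)
        k2 = f(x + 0.5 * h * k1, kappa)
        k3 = f(x + 0.5 * h * k2, kappa)
        k4 = f(x + h * k3, kappa)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x


def sensitivity(kappa, j, t_end, rel_step=1e-4):
    """Central-difference estimate of dx(t_end)/dkappa_j."""
    dk = rel_step * kappa[j]
    up, down = list(kappa), list(kappa)
    up[j] += dk
    down[j] -= dk
    return (solve(up, t_end) - solve(down, t_end)) / (2.0 * dk)


kappa_ref = [2.0, 0.5]   # hypothetical baseline parameter values
t = 3.0
row = [sensitivity(kappa_ref, j, t) for j in range(len(kappa_ref))]
print("sensitivities of x(t=3):", [round(v, 4) for v in row])
```

For this toy model the state increases with the production rate and decreases with the decay rate, so the two coefficients have opposite signs. Repeating the calculation over several states and time points would fill in the sensitivity matrix S.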

Principal component analysis (PCA) offers another way to investigate system sensitivity. This technique can be readily applied to the posterior distribution obtained following Bayesian inference. The principal components are constructed by evaluating the eigenvalues and eigenvectors of the covariance matrix of the parameters: the first principal component (given by the largest eigenvalue) corresponds to the direction in which the posterior is widest; the last principal component (given by the smallest eigenvalue) corresponds to the direction in which the posterior is narrowest [49, 56]. The last few principal components represent the most sensitive (or “stiff”) parameter combinations [57].
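The construction can be illustrated for two parameters, where the eigen-decomposition of the covariance matrix has a closed form. The sample below is a hypothetical stand-in for a posterior that is broad along one parameter combination and tight along another; it is not derived from the Lee or Schmitz models.

```python
import math

# Hypothetical posterior sample over two parameters: broad along
# theta1 - theta2 (sloppy) and tight along theta1 + theta2 (stiff).
N = 101
samples = []
for i in range(N):
    u = i / (N - 1)
    jitter = 0.01 if i % 2 == 0 else -0.01
    samples.append((u, 1.0 - u + jitter))


def covariance(points):
    """Sample covariance matrix entries (a, b, d) of 2D points."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    a = sum((p[0] - m1) ** 2 for p in points) / (n - 1)
    d = sum((p[1] - m2) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - m1) * (p[1] - m2) for p in points) / (n - 1)
    return a, b, d


def principal_components(a, b, d):
    """Eigenpairs of [[a, b], [b, d]], largest eigenvalue first (b != 0)."""
    mean, half_gap = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    pairs = []
    for lam in (mean + half_gap, mean - half_gap):
        vx, vy = b, lam - a              # solves (a - lam) vx + b vy = 0
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


(pc1_var, pc1), (pc2_var, pc2) = principal_components(*covariance(samples))
print("PC1 (widest):", pc1, "variance", pc1_var)
print("PC2 (narrowest, stiffest):", pc2, "variance", pc2_var)
```

Here the last component weights both parameters equally, indicating that their sum is tightly constrained while their difference is not. With three or more parameters, a numerical eigen-solver would replace the closed-form expression.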

In Fig. 7, sensitivity analysis via PCA for the Lee and Schmitz models is shown. The principal components (PC) are ordered 1–3, thus PC3 is the last component and contains the most sensitive parameter combinations. For both models, PC3 is dominated by two parameters: the rates of β-catenin binding to the destruction complex (DC) or to TCF for the Lee model (\( {\alpha}_{10},{\alpha}_{19} \)); and the rates of β-catenin production or binding to TCF for the Schmitz model (\( {\delta}_0,{\delta}_{11} \)). These results suggest that the Lee model is more robust to changes in the β-catenin degradation rate (\( {\alpha}_{16} \)), and that the Schmitz model is more robust to changes in the β-catenin shuttling rate (\( {\delta}_1 \)).

4 Techniques for the Comparison and Discrimination of Models

Given a set of models that describe similar biological phenomena, a challenge is to determine which model best describes the system, given the evidence available. In this section we describe two methods that enable comparison and discrimination between models. The first employs ABC, introduced above, and has already gained a strong foothold in systems medicine [50, 58–60]. The second is model discrimination with the use of algebraic matroids; as far as we know this is a recent addition to the modeler’s toolkit and holds great potential for advances in systems medicine.

4.1 Model Selection via ABC

Returning now to the Lee and Schmitz models, we consider how to choose between models using ABC model selection. We have already demonstrated how methods for parameter inference, such as ABC, can yield the posterior distributions over the parameters of a model (given data) and discussed briefly how this can be interpreted. For two or more models (\( {M}_i \), i = 1, …, n), some measure of the evidence for each model is needed [61],

$$ P\left({M}_i\Big|x\right)\propto P\left(x\Big|{M}_i\right)\pi \left({M}_i\right), $$
(76)

where (as previously) x represents the data and π the prior probability.

The ABC approach may be extended to perform parameter inference and model selection simultaneously using a joint space approach [49]. For a set of n models, \( M=\left[{M}_1,\dots, {M}_n\right] \), this is done by assigning to each model (and the parameters therein) a prior distribution and a perturbation kernel that designates weights for model transition. The algorithm accepts N particles at the tolerance \( {\varepsilon}_F \); these particles approximate the joint posterior distribution \( P\left(\alpha, M\Big|\widehat{\mathbf{x}}\right) \) and, upon marginalizing over parameters, the marginal posterior distribution \( P\left(M\Big|\widehat{\mathbf{x}}\right) \), which provides a measure for model selection. Bayesian model selection, like other approaches including the likelihood ratio test or the Akaike information criterion (AIC), also penalizes over-parameterization.
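The joint space idea can be sketched with a rejection sampler in which the model index is drawn alongside the parameter. The code below is a simplified illustration with two hypothetical one-parameter toy models and uniform priors; it omits the perturbation kernels and decreasing tolerance schedule of the sequential algorithm described above.

```python
import math
import random

times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

# Two hypothetical candidate models, each with one parameter theta.
models = {
    "exponential": lambda theta: [math.exp(-theta * t) for t in times],
    "linear": lambda theta: [max(0.0, 1.0 - theta * t) for t in times],
}


def distance(a, b):
    """Euclidean distance between two time courses."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def abc_model_selection(data, n_draws, tolerance, rng):
    """Rejection ABC on the joint (model, parameter) space."""
    names = sorted(models)
    counts = {name: 0 for name in names}
    for _ in range(n_draws):
        name = rng.choice(names)               # uniform prior over models
        theta = rng.uniform(0.0, 1.0)          # prior U[0, 1] on theta
        if distance(data, models[name](theta)) <= tolerance:
            counts[name] += 1
    total = sum(counts.values())
    # Marginalize over theta: fraction of accepted particles per model.
    return {name: counts[name] / total for name in names}


rng = random.Random(1)
data = models["exponential"](0.3)              # hypothetical "observations"
posterior = abc_model_selection(data, n_draws=10000, tolerance=0.1, rng=rng)
print(posterior)
```

Because the data were generated by the exponential model, essentially all accepted particles belong to it at this tolerance. With real data the model posterior also depends on the prior ranges, so these should be reported alongside the result.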

The AIC for model \( {M}_i \), with i ∈ {1, …, n}, is defined as

$$ {\mathrm{AIC}}_i=-2 \log L\left({\theta}_i^{\ast };x,{M}_i\right)+2{k}_i, $$
(77)

where L is the likelihood, and \( {\theta}_i^{\ast } \) and \( {k}_i \) are (respectively) the maximum likelihood parameter estimate and the number of parameters in model \( {M}_i \). This criterion, probably the best known model selection tool, makes explicit the penalty for an increased number of parameters. However, as the amount of data increases, the AIC introduces bias and tends to favor models that are over-parameterized. Therefore the Bayesian information criterion (BIC),

$$ {\mathrm{BIC}}_i=-2 \log L\left({\theta}_i^{\ast };x,{M}_i\right)+{k}_i \log n, $$
(78)

may be preferred, as it remains unbiased for large samples (in Eq. 78, n denotes the number of data points rather than the number of models). The BIC is effectively an approximation to the model probability (Eq. 76); the penalty term, explicit in the AIC and BIC definitions, is implicit in Eq. 76, where it enters via the priors for each model.
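Both criteria are straightforward to evaluate once the maximized log-likelihood is known. The sketch below implements Eqs. 77 and 78 and compares two candidate models; the log-likelihood values, parameter counts, and number of observations are hypothetical.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion, Eq. 77."""
    return -2.0 * log_likelihood + 2.0 * k


def bic(log_likelihood, k, n):
    """Bayesian information criterion, Eq. 78, for n data points."""
    return -2.0 * log_likelihood + k * math.log(n)


# Hypothetical maximized log-likelihoods for two candidate models
# fitted to the same n = 20 observations.
candidates = {
    "simple (k=3)": {"log_likelihood": -12.0, "k": 3},
    "complex (k=8)": {"log_likelihood": -9.0, "k": 8},
}
n_obs = 20

scores = {
    name: (aic(m["log_likelihood"], m["k"]),
           bic(m["log_likelihood"], m["k"], n_obs))
    for name, m in candidates.items()
}
for name, (a, b) in scores.items():
    print(f"{name}: AIC = {a:.2f}, BIC = {b:.2f}")
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

In this example the more complex model fits slightly better but is penalized for its extra parameters, so both criteria prefer the simpler model; the BIC penalty grows with the number of observations, whereas the AIC penalty does not.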

Model selection chooses, from among a set of candidate models, the model that best explains observed data. Two things need to be kept in mind: (1) one model will always be chosen as the best but this does not mean that the model is necessarily a good one; ideally model selection should go hand-in-hand with model checking (and topological sensitivity analysis [62]). (2) Model selection depends on the data available for testing the different models; since different data may favor different models, careful experimental design should precede model selection. With these issues in mind we have the pragmatic choice about which statistical model selection framework to employ. Fully Bayesian, even in an ABC context, is more expensive than identifying the maximum likelihood parameter set and applying AIC or BIC.

Fig. 8 shows the results of ABC model selection for the Lee and Schmitz models, with the probability of each model given for successive iterations (populations). We see that initially both models are equally probable, but subsequently the probability of selecting the Schmitz model drops to close to zero, and we conclude that the Lee model is favored given these data and parameter combinations.

Fig. 8
Model selection via ABC for the Lee and Schmitz models. The results show that, over successive populations, evidence in favor of the Lee model grows until there is a high probability that this model will be selected, given the data published in [45]

4.2 Model Discrimination Using Parameter-Free (Algebraic) Approaches

When parameter values are unknown or cannot be estimated from data, one may still be able to discriminate between competing models. We present two approaches: one that requires no data (only qualitative insight into whether the system can have multiple responses), and another that requires either highly resolved single cell data or multiple replicates of steady state measurements.

4.2.1 Precluding/Asserting Behaviors via Chemical Reaction Network Theory

Chemical reaction network theory (CRNT) studies the structure of a model (which can also be described as a network) constructed from chemical reactions without relying on specific parameter values. The aim here is to use such theory to preclude (and sometimes assert) possible qualitative behaviors in the positive orthant, i.e., \( {\mathbb{R}}_{>0} \). Cases where multiple positive states are stable (i.e., biologically accessible) are of particular biological importance for cellular decision making, for example, differentiation into one of two or more specialized cell lineages.

The field of CRNT initially focused on a structural property of a model called deficiency, which could preclude multiple steady states [63, 64]. Then theorems were proved for precluding/asserting multiple equilibria by studying the cycles in the graph of a network, or the sign of the determinant of the Jacobian; some of these approaches can provide conditions on the parameters for behaviors such as bistability and oscillations [65–70]. An excellent and comprehensive survey of techniques for multistationarity was written by Joshi and Shiu [71]. One main tool for precluding multistationarity of a model is testing whether it is injective (a model, including conservation relations, is injective if \( F\left(x,\kappa \right)=F\left(\tilde{x},\kappa \right)\Rightarrow x=\tilde{x} \)). Here we demonstrate the application of multistationarity tests (developed for chemical reaction networks) to Wnt signaling models.
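
To make the notion of deficiency concrete, the following Python sketch computes it from the standard definition: the number of complexes, minus the number of linkage classes, minus the rank of the stoichiometric matrix. The network used is a toy enzyme mechanism (E + S ⇌ ES → E + P) chosen as an assumption for illustration; it is not one of the Wnt models, and the helper names are hypothetical.

```python
import numpy as np

# Species order used for the complex vectors below.
species = ["E", "S", "ES", "P"]

# Each complex is a vector of species counts (toy network, assumed).
complexes = {
    "E+S": [1, 1, 0, 0],
    "ES": [0, 0, 1, 0],
    "E+P": [1, 0, 0, 1],
}

# Reactions as (reactant complex, product complex) pairs.
reactions = [("E+S", "ES"), ("ES", "E+S"), ("ES", "E+P")]


def count_linkage_classes(nodes, edges):
    """Connected components of the (undirected) graph of complexes."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def deficiency(complexes, reactions):
    n_complexes = len(complexes)
    n_linkage = count_linkage_classes(list(complexes), reactions)
    # Columns of the stoichiometric matrix: product minus reactant vectors.
    stoich = np.array(
        [np.subtract(complexes[p], complexes[r]) for r, p in reactions]
    ).T
    rank = int(np.linalg.matrix_rank(stoich))
    return n_complexes - n_linkage - rank


delta = deficiency(complexes, reactions)
```

For this toy network there are three complexes, one linkage class, and a stoichiometric matrix of rank two, so the deficiency is zero. Dedicated software such as the CRNT Toolbox [72] performs this and far more extensive analyses on realistic models.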

We begin with the Lee model. First we test injectivity, noting that while injectivity precludes multistationarity, failure of injectivity does not imply multistationarity. We use the algorithms in the CRNT Toolbox [72] to determine whether the system can ever admit multiple positive steady states (multistationarity). The Lee model fails injectivity, but cannot admit multiple positive steady states for any values of the system parameters and/or total concentration amounts (algorithms within [72]). Conversely, the Schmitz model has the capacity for multiple steady states; however, as calculated earlier, only one can ever be stable. Therefore, in this example, since each model can have only one stable steady state, it is difficult to discriminate between them using qualitative data alone. Clearly, if data suggested that two stable states could exist, and all of the data had the same initial conditions, then one could rule both models out.

4.2.2 Model Discrimination Using Coplanarity via Algebraic Geometry

When data clearly support a specific behavior (monostable, bistable, or oscillatory), qualitative approaches such as those mentioned above may be a good first step for classifying models, especially if the data are not sufficient to estimate parameters. However, if steady state data are available, then steady state invariants may help to determine whether a model is compatible with the given data, using a statistical, parameter-free model discrimination method.

Since data are often not available for all model species, variables must be eliminated. A systematic technique from algebraic geometry proceeds by computing a Gröbner basis for the model variety (studying the model at steady state) and eliminating unobservable variables. The resulting steady state invariant enables us to focus on part of the system and to test whether the data suggest that the relationships between species still hold. Notions of dependence and independence between model variables can also be studied using algebraic matroids and were recently applied to steady state model discrimination [27].

For smaller models, the steady states can be determined explicitly. For example, for the Schmitz model, the steady state values can be expressed in terms of \( X \) and \( X_n \): all other variables can be eliminated by exploiting conservation laws and using variable substitution (see Eqs. 37–38). Either by hand, by computing the matroid, or by using Gröbner bases, the polynomial relationship/algebraic dependence between \( X \) and \( X_n \) in the Schmitz model gives the following invariant:

$$ \begin{array}{l}\mathcal{I}={\delta}_0{\delta}_3{\delta}_4{\delta}_6\left({\delta}_8+{\delta}_9\right){X}^2+\left({\delta}_0{\delta}_2{\delta}_7{\delta}_9\left({\delta}_5+{\delta}_6\right)-{\delta}_1{\delta}_3{\delta}_4{\delta}_6\left({\delta}_8+{\delta}_9\right)\right)X{X}_n\\ {}\kern1.32em -{\delta}_1{\delta}_2{\delta}_7{\delta}_9\left({\delta}_5+{\delta}_6\right){X}_n^2,\end{array} $$

which vanishes at steady state (i.e., \( \mathcal{I}=0 \)). Effectively, we aim to test whether the data are coplanar with our model, via the steady state invariant transformation. Model compatibility is determined by computing the coplanarity error (\( \varDelta \)) via the singular value decomposition of the matrix

$$ \left(\begin{array}{ccc}\vdots & \vdots & \vdots \\ {}{\overset{\frown }{X}}^2& {\overset{\frown }{X}}_n^2& \overset{\frown }{X}{\overset{\frown }{X}}_n\\ {}\vdots & \vdots & \vdots \end{array}\right)\left(\begin{array}{c}{\tilde{h}}_1\\ {}{\tilde{h}}_2\\ {}{\tilde{h}}_3\end{array}\right)=0, $$

where \( \overset{\frown }{X} \) denotes the observed value of species \( X \). The null hypothesis (that the model is compatible with the data) can be rejected when the coplanarity error (normalized smallest singular value) exceeds a statistical bound, which is determined by the Gaussian measurement noise in the data and the invariant structure [73]. This method was recently applied to β-catenin localization data (cytoplasmic, \( X \); and nuclear, \( X_n \)) published in [27, 45]. The Schmitz model could be ruled out if data were perturbed by less than \( {10}^{-5} \) by measurement error/noise; for higher levels of noise, the model is compatible.
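
The core computation can be sketched in a few lines of Python. The example below builds the data matrix with one row per observation and columns \( {\overset{\frown }{X}}^2 \), \( {\overset{\frown }{X}}_n^2 \), \( \overset{\frown }{X}{\overset{\frown }{X}}_n \), and returns its smallest singular value divided by the Frobenius norm of the matrix. This particular normalization is an assumption made for illustration, the function name is hypothetical, the data are synthetic, and the statistical bound of [73] is not implemented here.

```python
import numpy as np


def coplanarity_error(x, x_n):
    """Normalized smallest singular value of the transformed data matrix."""
    x = np.asarray(x, dtype=float)
    x_n = np.asarray(x_n, dtype=float)
    # One row per observation; columns follow the matrix in the text.
    data = np.column_stack([x**2, x_n**2, x * x_n])
    singular_values = np.linalg.svd(data, compute_uv=False)
    # Assumed normalization: divide by the Frobenius norm of the data matrix.
    return singular_values[-1] / np.linalg.norm(data)


rng = np.random.default_rng(1)

# Synthetic observations with X_n proportional to X, so that a homogeneous
# quadratic relation between X and X_n holds exactly.
x_true = np.linspace(1.0, 5.0, 20)
x_n_true = 2.0 * x_true
error_exact = coplanarity_error(x_true, x_n_true)

# The same observations perturbed by 5% multiplicative Gaussian noise.
x_noisy = x_true * (1.0 + 0.05 * rng.normal(size=x_true.size))
x_n_noisy = x_n_true * (1.0 + 0.05 * rng.normal(size=x_true.size))
error_noisy = coplanarity_error(x_noisy, x_n_noisy)
```

For the noise-free synthetic data the error is zero up to floating-point precision, whereas the perturbed data give a strictly larger value. Deciding whether such a value is compatible with a model requires the noise-dependent bound described in [73].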

5 Discussion

Paradoxically, technological advances sometimes create new challenges for clinicians. For example, as the number and variety of treatments for cancer increase, it can be difficult to identify the combination of treatments that will most benefit a given patient (if a unique, optimal treatment even exists). The situation is further complicated when we consider the different types of data that can be used as a basis for diagnosis and treatment planning; it is often impossible to integrate the available data by linear thinking alone. Systems medicine aims to address these challenges by developing mathematical and computational tools that integrate different types of information in order to generate objective decisions for patient treatment. In this chapter we have focused on ODE models, a class of models widely used in systems medicine, particularly to study signaling pathways. We have reviewed a variety of techniques that can be used to develop and analyze ODE models, using models of enzyme kinetics and the Wnt signaling pathway as test cases.

Many of the techniques that we have presented are already well established (such as model development, nondimensionalization, identification of steady state solutions, asymptotic analysis, and parameter sensitivity analysis); however, others are less well known (such as ABC, CRNT, and matroid-informed coplanarity). In addition to the benefit that these methods bring to the field, model development for systems medicine—in its increasing sophistication—is helping to stimulate further development and application of mathematical and statistical techniques.

Many of the challenges in systems medicine arise because most biological processes, including the actions of whole pathways, do not act in isolation. For example, at the subcellular level, pathway cross-talk can have a significant effect on cell function. In particular, there is growing evidence of cross-talk between Wnt and E-cadherin [74], Wnt and Erk [33], and Wnt and the Hippo pathway [75]. Even simplistic models of such pathway cross-talk quickly become large and demand sophisticated techniques for their analysis. The situation becomes even more complex when we consider the impact of signaling pathways at the multicellular and tissue scales. The impact of Wnt signaling at the multicellular and tissue levels has been studied theoretically, most prominently in models of intestinal crypts [76–79]. These models (for example) introduce spatial dependence by imposing a graded Wnt distribution along the crypt axis [78] or provide comparison of a continuum model with a cell-based model that incorporates heterogeneity and noise [79]. In [74], a multiscale model of interactions between the pathways affecting β-catenin and E-cadherin is developed and used to study the role of epithelial–mesenchymal transitions in cancer growth and metastasis, whereas in [80] a simple rule-based model for cross-talk between the Wnt and delta-notch pathways is embedded within discrete epithelial cell agents and used to study cell fate specification within the intestinal crypt. In addition to these theoretical studies (ever growing in complexity), more sophisticated data collection is urgently needed as a basis for hypothesis testing and model (in)validation.

We end by proposing two grand challenges, whose solutions will bear much fruit in systems medicine. The first is to incorporate multiple levels of information—from biochemical reactions within a single cell to tissue-level processes—into cohesive models. The second is to incorporate data which is resolved in space and time into a theoretical framework. There are, of course, many other important challenges, and work in these areas should provide many exciting opportunities for theoreticians in systems medicine for years to come.