
1 Introduction

Heterogeneity is a key property of biological systems at all scales, from the molecular level to the population level [1,2,3]. Many systems have evolved ways of either minimising or exploiting this noise [4, 5]; in particular, some critical cellular systems minimise or take advantage of it [6, 7], a notable example being persistence in bacteria [8, 9]. We can classify heterogeneity as arising from three main sources: genetic (nature), environmental (nurture) and stochastic (chance). In this chapter we focus on the latter, distinguishing between environmental heterogeneity, often called extrinsic noise, and intrinsic noise, which arises from random thermal fluctuations [1, 9].

Intrinsic noise was perhaps first observed in a cell biology setting by Spudich and Koshland [10]. They noticed that individual bacteria from an isogenic population maintained different swimming patterns throughout their entire lives. Although they called this ‘non-genetic individuality’, they believed it arose from random fluctuations of low copy-number molecules, that is, from intrinsic noise. Such noise affects DNA, RNA and protein molecules in many ways, most notably via the Brownian motion of molecules and the randomness associated with their reactions. It ensures that the process of gene expression, one of the most fundamental processes of a cell, proceeds differently each time, even in the absence of the other two sources of heterogeneity [9, 11].

Some biological systems have evolved to make use of intrinsic noise: a good example is persister-type bacteria, which can withstand antibiotic treatments even though they do not have genetic mutations for resistance [12]. These effects can arise through the phenomenon of stochastic switching, where cells randomly transition from one state to another [13, 14]. Another well-known example of exploiting stochasticity is the bacteriophage Lambda decision circuit [15]. Stochasticity also plays a crucial role in causing genetic mutations [16, 17]; these are essential for creating the heritable heterogeneity upon which natural selection can act, thus allowing evolution to occur [18].

On the other hand, intrinsic noise can interfere with the precise regulation of molecular numbers [19], and cells have to compensate for this by adopting mechanisms that enable robustness of certain key properties [4, 5], such as feedback loops [9, 11]. Furthermore, mutations may also have negative effects, as certain mutations in, for example, stem and somatic cells are thought to be the cause of cancer [20,21,22].

2 Temporal Modelling in Computational Cell Biology

There has been a long and successful history in computational cell biology of using rate kinetic ordinary differential equations to model chemical kinetics within a living cell. For instance, these techniques have been applied on the plasma membrane, in the cytosol and in the nucleus of eukaryotic cells to understand cell processes ranging from gene regulation to transport between cellular compartments. Modifications via delay differential equations were considered as early as in [23], in order to represent the fact that the complex regulatory processes of transcription and translation are not immediate but are in fact examples of delayed processes.

Thus, in a purely temporal, homogeneous setting and when there are large numbers of molecules present, chemical reactions are modelled by ordinary differential equations that are based on the Law of Mass Action and that estimate reaction rates on the basis of average values of the reactant density. Any set of m chemical reactions can be characterised by two sets of quantities: the stoichiometric vectors (update rules for each reaction) ν_1, …, ν_m and the propensity functions a_1(X(t)), …, a_m(X(t)). The propensity functions represent the relative probabilities of each of the m reactions occurring; each is formed by multiplying the rate constant by the product of the concentrations of the reactants on the left-hand side of the corresponding reaction. Here X(t) is the vector of concentrations at time t of the N species involved in the reactions. The ODE that describes this chemical system is

$$ {X}^{\prime}(t)=\sum_{j=1}^m{\nu}_j{a}_j\left(X(t)\right). $$

This formulation may have many different timescales (stiffness) but there are a wide variety of numerical methods that can deal effectively with such systems.
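To make this concrete, the following minimal Python sketch (our illustration, not code from the chapter) integrates the mass-action ODE above for a hypothetical reversible dimerisation A + A ⇌ B; the rate constants k1 and k2 and the initial concentrations are arbitrary choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

k1, k2 = 1.0e-3, 1.0e-2          # hypothetical rate constants
# Stoichiometric vectors as columns: reaction 1 is A + A -> B,
# reaction 2 is B -> A + A; rows correspond to species (A, B).
nu = np.array([[-2.0,  2.0],
               [ 1.0, -1.0]])

def propensities(x):
    A, B = x
    return np.array([k1 * A * A, k2 * B])   # mass-action propensities

def rhs(t, x):
    # X'(t) = sum_j nu_j a_j(X(t))
    return nu @ propensities(x)

sol = solve_ivp(rhs, (0.0, 50.0), [1000.0, 0.0], rtol=1e-8)
print(sol.y[:, -1])   # concentrations of A and B at the final time
```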

It was the pioneering work of Gillespie [24] and Kurtz [25] that challenged this deterministic view of cellular kinetics. They argued that when the cellular environment contains small to moderate numbers of proteins, the Law of Mass Action is not an adequate description of the underlying chemical kinetics, as it only describes the average behaviour. In this regard, the fundamental underlying principle is that of intrinsic noise: the inherent uncertainty in knowing when a reaction occurs and what that reaction is. The variance associated with this uncertainty increases as the number of proteins in the cellular environment becomes small. Gillespie [24] and Kurtz [25] showed how to model intrinsic noise through the concepts of nonlinear discrete Markov processes and Poisson processes, respectively. These two approaches model the same processes and, although their formulations differ, are now lumped together under the title of the Stochastic Simulation Algorithm (SSA). The essential observation underlying the SSA is that the waiting time between reactions is exponentially distributed and that the likelihood of each reaction occurring in this interval is given by the relative sizes of the propensity functions.

More formally, the SSA is an exact procedure that describes the evolution of a discrete nonlinear Markov process. It accounts for the inherent stochasticity of the m reactions within a system and only assigns integer numbers of molecules to the state vector. At each step, the SSA samples two random numbers from the uniform distribution U[0,1] in order to evaluate an exponential waiting time, τ, until the next reaction and an integer j between 1 and m indicating which reaction occurs. The state vector is updated at the new time point by the addition of the jth stoichiometric vector to the previous value of the state vector, that is,

$$ X\left(t+\tau \right)=X(t)+{\nu}_j. $$
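In code, the direct method reads roughly as follows; this is a minimal sketch (the function names are our own), reusing the stoichiometry and propensities of the dimerisation example above.

```python
import numpy as np

def ssa(x0, nu, propensities, t_end, seed=0):
    """Gillespie's direct method: nu holds one stoichiometric column per
    reaction, and propensities(x) returns the m current reaction rates."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, np.array(x0, dtype=np.int64)
    history = [(t, x.copy())]
    while t < t_end:
        a = propensities(x)
        a0 = a.sum()
        if a0 == 0.0:                        # no reaction can fire
            break
        tau = rng.exponential(1.0 / a0)      # exponential waiting time
        j = rng.choice(len(a), p=a / a0)     # which reaction fires
        t, x = t + tau, x + nu[:, j]
        history.append((t, x.copy()))
    return history

# e.g. traj = ssa([1000, 0], np.array([[-2, 2], [1, -1]]), propensities, 50.0)
```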

Several more efficient, but more complex, variants of the SSA have been developed [26, 27]. However, despite these increases in computational speed, the SSA has an inherent limitation: it must simulate every single reaction, so in cases where there are many reactions or the molecular populations become large, it is computationally intensive.

In a slightly different vein, the SSA describes the evolution of a nonlinear discrete Markov process, and as such this stochastic process has a probability density function whose evolution is described by the Chemical Master Equation (CME). The CME is a discrete parabolic partial differential equation in which there is one equation for each configuration of the ‘state space’. When the state space is enumerated, the CME becomes a linear ODE and the probability density function takes the form

$$ p(t)={e}^{At}p(0). $$

Here A is the state-space matrix. Thus the solution of the CME reduces to computing the action of a matrix exponential on an initial probability vector. As there is one equation for each possible configuration of the state space, this can be very computationally challenging, although recently developed methods can cope with some of these computational costs [28, 30,31,32], making this a feasible technique.
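As a small illustration of the formula p(t) = e^{At} p(0), the Python sketch below (ours; the truncation at n_max molecules is an assumption imposed to make the state space finite) builds the state-space matrix for a hypothetical birth–death process, 0 → X at rate k and X → 0 at rate d per molecule.

```python
import numpy as np
from scipy.linalg import expm

k, d, n_max = 10.0, 1.0, 100          # hypothetical rates, truncated state space
A = np.zeros((n_max + 1, n_max + 1))  # column n holds the outflow from state n
for n in range(n_max + 1):
    if n < n_max:
        A[n + 1, n] += k              # birth: n -> n + 1
        A[n, n] -= k
    if n > 0:
        A[n - 1, n] += d * n          # death: n -> n - 1
        A[n, n] -= d * n

p0 = np.zeros(n_max + 1)
p0[0] = 1.0                           # start with zero molecules
p = expm(A * 5.0) @ p0                # p(t) = e^{At} p(0) at t = 5
print(p @ np.arange(n_max + 1))       # mean copy number, close to k/d = 10
```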

The main limiting feature of the SSA is that the time step can become very small, especially if there are large numbers of molecules or widely varying rate constants. Thus tau-leap methods have been proposed, in which the numbers of likely reactions are sampled from either Poisson [33] or Binomial [34] distributions. In these approaches a much larger time step can be used at the loss of a relatively small amount of accuracy. The tau-leap method [33] allows steps that are much larger than those of the SSA by estimating the total number of occurrences of each type of reaction over the step. Thus if there are m reactions, we take m Poisson random number samples based on the sizes of the propensity functions evaluated at the beginning of the step. The algorithm can thus be written as

$$ X\left(t+\tau \right)=X(t)+\sum_{j=1}^m{\nu}_j\,P\left(\tau\,{a}_j\left(X(t)\right)\right). $$

Here P(λ) denotes an independent Poisson random variable with mean λ. Note that we now have complete flexibility in choosing τ, and this can be done adaptively in such a way as to control the local error at each step. This approach has the key advantage that individual reactions need not be simulated. However, the main drawback is a loss of accuracy, relative to the SSA, that grows with the step size. Several schemes have been devised to optimise the time step [35, 36], while some implementations combine the SSA with tau-leaping via a threshold on the number of molecules in the system at any given time. However, molecular species that are depleted in a reaction can go negative if the time step is too large, and schemes have been developed that allow the choice of larger time steps whilst avoiding negative populations [34, 36,37,38,39,40]. One important way in which this can be done is to sample from the Binomial distribution rather than the Poisson distribution [34]. Another approach is to consider the order of accuracy as a function of the step size. In this regard, the (weak) order of accuracy of the tau-leap method can be shown to be one [41, 43], meaning that the error decreases proportionally to the time step. Higher-order methods allow larger time steps to achieve the same error, thus decreasing computational time [43,44,45,46,47].
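A fixed-step Poisson tau-leap is only a few lines; the sketch below is our own simplification, and the crude clamp against negative populations stands in for the binomial and step-size-control strategies cited above.

```python
import numpy as np

def tau_leap(x0, nu, propensities, t_end, tau, seed=0):
    """Fixed-step Poisson tau-leaping: one Poisson draw per reaction per
    step, with propensities frozen at the start of the step."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, np.array(x0, dtype=np.int64)
    while t < t_end:
        a = propensities(x)
        fires = rng.poisson(a * tau)        # number of firings of each reaction
        x = np.maximum(x + nu @ fires, 0)   # crude guard against negativity
        t += tau
    return x
```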

Although it is not uncommon for chemical systems to be rather complicated, a difficult situation arises when a system has reactions that operate at very different timescales, for instance, slow and fast. Although standard tau-leap and higher-order tau-leap methods are able to simulate these systems, their time steps must be reduced, which can dramatically slow down simulation time when the separation of the scales is significant and the fast reactions occur frequently.

In such cases, there are two options: either to use special methods for these ‘stiff’ systems or to use multiscale methods. The former are often based on deterministic methods for stiff ordinary differential equation systems and expand the range of time steps for which the method is stable, thus opening the door for stiff systems to be simulated in time periods similar to non-stiff ones [48,49,50]. Multiscale methods, on the other hand, partition the reactions into fast and slow types, and simulate the fast reactions using an approximate method and the slow reactions using an accurate stochastic method. The partitioning assumes that the slow reactions are effectively constant over the timescale of the fast reactions and that the fast reactions have relaxed to their asymptotic values between each slow reaction. The two sets of reactions are then simulated iteratively to take the coupling into account. This can also be generalised to three regimes: fast, medium and slow [51]. This approach allows very significant reductions in computational time, as the fast reactions, which would otherwise take up most of the computational effort, can be simulated very quickly with continuous methods. However, it also introduces errors associated with the coupling between the two scales, as well as possible errors from the approximate simulation of the fast reactions. An interesting approach runs short bursts of a single SSA for the fast reactions, which is used to infer parameters for a differential equation approximation of the slow reactions [52, 53].

There is in fact an intermediate regime that can still capture the inherent stochastic effects but reduce the computational complexity associated with the SSA. This intermediate framework is called the Chemical Langevin Equation (CLE). It is described by an Itô stochastic differential equation (SDE) driven by a set of Wiener processes that describes the fluctuations in concentrations of the molecular species. Various numerical methods can then be applied to this equation—the simplest method being the Euler–Maruyama method [54].

The CLE attempts to preserve the correct dynamics for the first two moments of the SSA and takes the form

$$ dX=\sum_{j=1}^m{\nu}_j{a}_j\left(X(t)\right) dt+B\left(X(t)\right) dW(t). $$

Here W(t) = (W_1(t), …, W_N(t))^T is a vector of N independent Wiener processes whose increments ΔW_j = W_j(t + h) − W_j(t) are N(0, h), and where

$$ B(X)=\sqrt{C},\kern1em C=\left({\nu}_1,\dots, {\nu}_m\right)\mathrm{Diag}\left({a}_1(X),\dots, {a}_m(X)\right){\left({\nu}_1,\dots, {\nu}_m\right)}^{\mathrm{T}}. $$

Here h is the time discretisation step. Effective methods designed for the numerical solution of SDEs [54,55,56] can be used to simulate the chemical kinetics in this intermediate regime. The authors of [57] have shown how to construct the CLE so that it minimises the number of Wiener processes. Furthermore, adaptive multiscale methods have been developed that attempt to move back and forth between the deterministic and stochastic regimes as the numbers of molecules change [51].
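The following Python sketch (our illustration) applies the Euler–Maruyama method to the CLE, forming B(X) as a matrix square root of C; in production code a Cholesky-type factorisation or the reduced-noise constructions of [57] would be preferable.

```python
import numpy as np
from scipy.linalg import sqrtm

def cle_em(x0, nu, propensities, t_end, h, seed=0):
    """Euler-Maruyama for the CLE with B(X) = sqrt(nu Diag(a) nu^T).
    sqrtm may return tiny imaginary parts when C is only semi-definite,
    so we keep the real part; propensities are clamped at zero because
    the square root requires a >= 0."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, np.array(x0, dtype=float)
    while t < t_end:
        a = np.maximum(propensities(x), 0.0)
        C = nu @ np.diag(a) @ nu.T
        B = np.real(sqrtm(C))
        dW = rng.normal(0.0, np.sqrt(h), size=len(x))   # Wiener increments
        x = x + (nu @ a) * h + B @ dW
        t += h
    return x
```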

These temporal approaches are applied under the hypothesis of homogeneous, well-mixed systems. It is well known, for example, that diffusion on the cell membrane is not only highly anomalous but also that the diffusion rate of proteins on live cell membranes is between one and two orders of magnitude slower than in reconstituted artificial membranes of the same composition [58]. Furthermore, diffusion depends on the dimensions of the medium, so diffusion on the highly disordered cell membrane is not a perfectly mixed process. The assumptions underlying the classical theory of chemical kinetics therefore fail, requiring either new approaches to modelling chemistry on a spatially crowded membrane [59] or methods based on detailed spatial simulations.

However, rather than abandoning temporal models entirely, it is possible to capture important spatial aspects and incorporate them into temporal models. This can be done in a number of ways. For example, compartmental models have been developed that couple together the plasma membrane, cytosol and nucleus—see, for example, [60], in which an SSA implementation of Ras nanoclusters on the plasma membrane is coupled with an ODE model for the MAPK pathway in the cytosol. Diffusion and translocation can be captured through the use of distributed delays, which can then be incorporated into mathematical frameworks through delay differential equations or delay variants of the Stochastic Simulation Algorithm (see [61], for example). The authors of [62] have explored a number of spatial scenarios, run detailed spatial simulations to capture diffusion and translocation processes and then incorporated this information into purely temporal models through distributed delays—see Sect. 5 for more details. Another way of capturing spatial information for use in purely temporal models is via anomalous diffusion, where spatial crowding and molecular binding affect the chemical kinetics. In this setting the mean squared displacement of a diffusing molecule is no longer linear but sublinear in time t, taking the form

$$ E\left[{X}^2(t)\right]=2\kern0.1em D\kern0.1em {t}^{\alpha },\kern1em \alpha \in \left(0,1\right]. $$

Here, α is called the anomalous diffusion parameter. If the value of α can be estimated, either experimentally or from detailed Monte Carlo simulations, then the SSA can be modified so that the waiting time between reactions is no longer exponentially distributed but instead has a heavy tail [59].
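As one possible illustration, Mittag-Leffler distributed waiting times of order α (a standard heavy-tailed choice in continuous-time random walk models, reducing to the exponential when α = 1) can be drawn with a known mixture formula of the type used by Fulger, Germano and Scalas; the sketch below is our own and the scale parameter is an assumption.

```python
import numpy as np

def ml_wait(alpha, scale, rng):
    """One Mittag-Leffler waiting time of order alpha in (0, 1]; the
    bracketed base equals sin(alpha*pi*(1 - v)) / sin(alpha*pi*v) > 0,
    and for alpha = 1 the draw collapses to an exponential."""
    u, v = rng.random(), rng.random()
    factor = (np.sin(alpha * np.pi) / np.tan(alpha * np.pi * v)
              - np.cos(alpha * np.pi)) ** (1.0 / alpha)
    return -scale * np.log(u) * factor

# In an 'anomalous SSA' one could replace the exponential draw
# rng.exponential(1/a0) by ml_wait(alpha, 1/a0, rng) as a heuristic.
```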

3 Spatial Models in Computational Cell Biology

One of the fundamental goals of integrative Computational Biology is to understand complex spatio-temporal processes within cells. However, such a task can become exceedingly difficult due to the intrinsic multiscale nature of these processes. For example, in order to fully understand cell signal transduction, a careful description of membrane-bound receptor activation processes must be accounted for. However, the plasma membrane is a highly complex structure, compartmentalised on multiple length scales and timescales by lipid–lipid and lipid–protein interactions and by interactions with the cytoskeleton. As a result, detailed simulations accounting for all processes may be computationally prohibitive, or simply infeasible.

Of the class of spatial methods, the approach with the lowest computational cost consists of solving reaction–diffusion partial differential equations representing the concentration of each molecular species in the system. However, this approach is only valid when (1) all molecular species in the system have large molecular concentrations, and (2) noise is not amplified throughout the system. If at least one of these conditions fails to hold, we must rely on spatial stochastic simulation, which can be discrete or continuous in form. In the discrete spatial setting, either lattice or off-lattice particle-based methods are appropriate.

As a particular example of lattice-based methods, we consider the plasma membrane. The plasma membrane is a complex and crowded environment that has many roles, including signalling, cell–cell communication, cell feeding and excretion and protection of the interior of the cell. It is heterogeneous—the cytoskeletal structure just inside the plasma membrane can corral and compartmentalise membrane proteins, and chemically inert objects can form barriers to protein diffusion. Trying to capture such complexity using higher-level mathematical frameworks such as partial differential equations is extremely challenging, so instead a stochastic spatial model using Monte Carlo simulation is appropriate. In such an approach the plasma membrane is mapped to a two-dimensional lattice, usually but not necessarily regular. The size of each computational cell or ‘voxel’ depends on the biological questions being addressed, but to account for volume-exclusion effects the voxel is usually chosen so that at most one protein per voxel is allowed. A protein undergoes a random walk: at each time step a protein is selected at random, and a movement direction (north, south, east or west, in the case of a rectangular lattice) is randomly determined. The distance moved depends on the diffusion rate of each species. Chemical reactions can be simulated by checking the chemical reaction rules and then replacing that protein and/or creating a new protein at that location whenever a collision (volume-exclusion event) occurs [63, 64]. Although only relatively small sections of the membrane on short timescales can be simulated with this approach, due to its very slow computational performance, we note that diffusion can be considered as a unimolecular reaction. Thus, if we order the voxel elements within the lattice into a vector, we can consider this method in the SSA framework and apply the same tools that have been developed in the purely temporal setting. This allows us to make use of vectorisation to speed up performance. Furthermore, there is a spatial CME associated with this approach [65], and again techniques used in the purely temporal case can be applied, although at the cost of very significantly increased computational complexity.
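A toy version of this lattice scheme is sketched below (our illustration, with periodic boundaries and no chemistry; the collision test marks the point where reaction rules of the kind described in [63, 64] would be applied).

```python
import numpy as np

rng = np.random.default_rng(0)
L, n_particles, n_steps = 64, 400, 10_000
occ = np.zeros((L, L), dtype=bool)              # at most one protein per voxel
start = rng.choice(L * L, size=n_particles, replace=False)
xy = np.column_stack(np.unravel_index(start, (L, L)))
occ[xy[:, 0], xy[:, 1]] = True
moves = np.array([(1, 0), (-1, 0), (0, 1), (0, -1)])  # four lattice directions

for _ in range(n_steps):
    i = rng.integers(n_particles)               # pick a protein at random
    new = (xy[i] + moves[rng.integers(4)]) % L  # periodic boundary conditions
    if not occ[new[0], new[1]]:                 # volume exclusion
        occ[xy[i, 0], xy[i, 1]] = False
        occ[new[0], new[1]] = True
        xy[i] = new
    # else: a collision, i.e. where bimolecular reaction rules would fire
```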

In off-lattice methods, all particles in the system have explicit spatial coordinates at all times. At each time step, molecules move, in a random-walk fashion, to new positions. In many cases, reaction zones whose sizes depend on the particular diffusion rates are drawn around each particle. If one or more molecules happen to be inside such a zone, appropriate chemical reactions can take place with a certain probability; if a reaction is indeed performed, the reactant particles are flagged to avoid repetition of chemical events. Notably, in off-lattice methods the domains and/or compartments can still be discretised, to aid the localisation of particles within the simulation domain. In this vein, [66] have considered how to combine tau-leaping and compartmentalisation in a spatial setting.

A less computationally intensive alternative, albeit still costly in many scenarios, is to consider molecular interactions in the mesoscopic realm. Here, the discretisation of the Reaction-Diffusion Master Equation (RDME) results in reactive neighbouring subvolumes within which several particles can coexist, with well-mixedness assumed in each subvolume. There are a number of algorithms extending discrete stochastic simulators to approximate solutions of the RDME by introducing diffusion steps as first-order reactions, with a reaction rate constant proportional to the diffusion coefficient. In [67, 68] the authors provide the specific outline for extending discrete stochastic simulators to the RDME regime, while the algorithms in [69, 70] provide clever extensions of the ‘next reaction method’ [26], commonly known as the ‘next subvolume method’. A review on the construction of such methods is given in [71].

The Next Subvolume Method [69, 70, 72, 73] is a generalisation of the SSA [24] in which the simulation domain is divided into uniform subvolumes that are small enough to be considered homogenised by diffusion over the timescale of the reactions. At each step, the state of the system is updated either by performing an appropriate reaction within a single subvolume or by allowing a molecule to jump to a randomly selected neighbouring subvolume. Diffusion is thus modelled as a unary reaction whose rate is proportional to the diffusion coefficient divided by the square of the side length of the subvolume. In this way, diffusion becomes just another possible event within the algorithm, with a regular propensity function, and follows the same update procedure as any chemical reaction. The expected time for the next event in a subvolume is calculated in a similar way to the SSA, from the reaction and diffusion propensities of all molecules contained in that subvolume at that particular time. However, times for subsequent events are only recalculated for those subvolumes involved in the current time step, and they are then re-ordered in an event queue. Similar to accelerated approaches for simulating exact trajectories from the CME, there exist methods to coarse-grain, and therefore accelerate, computations for the RDME [66]. Separately, [74] split the time integration of the RDME into a macroscopic diffusion part (for species with large numbers of molecules) and a stochastic mesoscopic reaction/diffusion part (for species with small numbers), obtaining the mesoscopic diffusion coefficients from appropriate Finite Element discretisations.
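To illustrate how diffusion enters this framework as a first-order reaction, the sketch below (ours; parameters are arbitrary) simulates pure diffusion of a single species along a one-dimensional chain of subvolumes with the direct SSA, using the jump propensity d = D/h² per molecule per direction.

```python
import numpy as np

rng = np.random.default_rng(0)
D, h, K = 1.0, 0.1, 20                # diffusion coefficient, voxel size, voxels
d = D / h**2                          # jump rate per molecule per direction
n = np.zeros(K, dtype=np.int64)
n[K // 2] = 100                       # initial pulse in the middle voxel
t, t_end = 0.0, 0.05
idx = np.arange(K)
while t < t_end:
    a_left = d * n * (idx > 0)        # jump propensities (reflecting ends)
    a_right = d * n * (idx < K - 1)
    a = np.concatenate([a_left, a_right])
    a0 = a.sum()
    if a0 == 0.0:
        break
    t += rng.exponential(1.0 / a0)
    j = rng.choice(2 * K, p=a / a0)   # which molecule jumps, and in which direction
    i, step = (j, -1) if j < K else (j - K, 1)
    n[i] -= 1
    n[i + step] += 1
print(n)                              # the pulse has spread diffusively
```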

In addressing these spatial issues, we are led to consider the role of anomalous diffusion. Anomalous diffusion refers to processes where the mean squared displacement (MSD) of a particle is no longer linear in time [75]. It can be viewed in two ways: as a mechanism to localise molecules and control signalling [76], or as a macroscopic result of underlying microscopic events. From the experimental perspective, various techniques have been used to study such processes, including Single Particle Tracking [77], Fluorescence Recovery After Photobleaching [78] and Fluorescence Correlation Spectroscopy [79]. However, quantifying the degree and nature of anomalous diffusion has proven more difficult than anticipated, due to experimental limitations [76, 80]. In spite of these discussions and recent technical developments, the nature of anomalous subdiffusion is still not well understood, from either the experimental or the simulation perspective.

There are many reasons for these discrepancies, including the fact that often only short tracks are recorded and that the MSD relationship is not necessarily a robust metric for inferring complex spatial information. A more robust metric would be a probability density function evolving in space and time, but constructing one would require a very time-consuming experimental study. From the simulation perspective, a number of studies have reported crowding-dependent values of α [64, 81]. Yet a prediction of percolation theory is that, for immobile obstacles, α depends only on the embedding space dimension and not on the obstacle density [82]. From the theoretical perspective, in the case of immobile obstacles we expect to observe Brownian diffusion for small initial time periods (when no obstacles have yet been encountered), an anomalous regime at intermediate times and Brownian diffusion again in the asymptotic regime [83, 84]. It is these crossovers that can create confusion when interpreting diffusive characteristics.

It remains to be seen whether anomalous diffusion scenarios can only be captured by explicit spatial models, or whether equivalent dynamics could be obtained using off-lattice simulations of single molecules devoid of explicit obstacles, thus capturing the essential dynamics while significantly reducing computational time. Another possibility is to derive an SSA, and its associated CME, that works in a purely temporal anomalous diffusion setting; we could then attempt to replicate the deterministic and stochastic regimes in this time-anomalous setting. This leads to time-fractional or space-fractional representations. For example, the molecular concentration dynamics C(x, t) in a typical subdiffusive setting can be represented by the time-fractional differential equation

$$ \frac{\partial }{\partial t}C\left(x,t\right)={K}_{\alpha }\,{D}_t^{1-\alpha }\,{\nabla}^2 C\left(x,t\right)+f\left(x,t\right), $$

where x and t are the space and time variables, respectively, and the anomalous exponent α is the fractional order of the time derivative operator. Here \( {D}_t^{1-\alpha }\,g(t) \) is the Riemann–Liouville fractional derivative operator, which reduces to the identity operator when α = 1, while K_α is the fractional equivalent of the classical diffusion coefficient, with dimension [K_α] = [l]^2[t]^{−α}.
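A minimal explicit discretisation of this equation can be built from Grünwald–Letnikov weights for the Riemann–Liouville operator, in the spirit of Yuste–Acedo-type schemes. The Python sketch below is our own illustration with f = 0 and arbitrary parameters; the explicit scheme has a stability restriction on the time step that we do not check here.

```python
import numpy as np

alpha, K_alpha = 0.8, 1.0                  # anomalous exponent, fractional coefficient
nx, dx, nt, dt = 101, 0.1, 400, 1e-4
C = np.zeros((nt + 1, nx))
C[0, nx // 2] = 1.0 / dx                   # initial pulse (discrete delta)

# Grunwald-Letnikov weights w_k = (-1)^k binom(1 - alpha, k), via recursion
w = np.empty(nt + 1)
w[0] = 1.0
for k in range(1, nt + 1):
    w[k] = w[k - 1] * (k - 2 + alpha) / k

def lap(u):                                # second difference, fixed ends
    out = np.zeros_like(u)
    out[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
    return out

for n in range(nt):
    # D_t^{1-alpha} at t_n approximated by dt^{alpha-1} * sum_k w_k g(t_{n-k})
    memory = sum(w[k] * lap(C[n - k]) for k in range(n + 1))
    C[n + 1] = C[n] + K_alpha * dt**alpha * memory
print(C[-1].sum() * dx)                    # total mass, approximately conserved
```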

4 Modelling and Simulating Stochastic Ion Channel Dynamics

Ion channels are multiconformational proteins that form a pore in the membrane of excitable cells. They open and close due to conformational changes in the proteins as a result of variations in membrane potential, and thus regulate the movement of ions across the lipid bilayer [85]. The dynamics of these proteins are fundamental to the generation of an action potential (AP) in excitable cells [86]. Single-channel recordings have demonstrated that the conformational changes the protein undergoes as it opens and closes occur at random [87]. This internal stochasticity causes fluctuations in individual ionic conductances [86] and has important effects on the electrical dynamics of the cell [88,89,90,91].

In neuronal cells, stochastic ion channel behaviour can modify a number of electrical properties of the cell, including the firing threshold [92] and spike timings [93]. In cardiac myocytes this intrinsic randomness leads to variability in the duration of successive APs [89, 90], termed beat-to-beat variability, which is thought to be an indicator of arrhythmias [94]. It can even cause alterations to the AP morphology under pathological conditions, resulting in early after-depolarisations (EADs) [89].

While a discrete-state modelling and simulation approach is often seen as the ‘gold standard’, it becomes increasingly computationally costly as the number of channels increases beyond a few hundred [95]. This has led to the increasing popularity of using stochastic differential equations to describe ion channel dynamics [89, 96, 97]. Fox and Lu [98] were the first to take such an approach, which they applied to the Hodgkin–Huxley model, and their method has since been extensively used to model neuronal cells [96, 97], cardiac myocytes [89, 90] and pancreatic beta cells [91]. Their approach reduces the dynamics of the whole channel to the collective dynamics of a series of gates that can each be either open or closed. The proportion of each type of gate in the open state is described by a Stochastic Differential Equation (SDE), and the proportion of open channels is given by the product of the proportions of open gates.

However, a number of studies have demonstrated discrepancies between the SDE and discrete-state Markov chain models [95, 99,100,101]. Goldwyn et al. [101] demonstrated that constructing the SDE model in terms of the kinetic dynamics of the channel, rather than the individual gating variables, preserves the stochastic behaviour of the discrete-state Markov chain model; see the recent review [104] for further discussion.

In order to illustrate these ideas we give more details on the form of the Chemical Langevin equation for ion channel dynamics. We first consider an ion channel transitioning just between the closed and open states. Let the proportion of channels in the open state be y, let N be the total number of ion channels and W be a Wiener process. Then the form of the CLE is

$$ dy=\left(a-\left(a+b\right)y\right) dt+\frac{1}{\sqrt{N}}\sqrt{a+\left(b-a\right)y}\, dW. $$

However, the complicating factor is that the parameters a and b are themselves functions of the transmembrane voltage. In the more general setting, when there are a number of transitions between the various ion channel states, the formulation is given by

$$ dy(t)= vD\left(y(t)\right) dt+\frac{1}{\sqrt{N}}\sum_{p=1}^{d/2}{b}^p\left(y(t)\right){dW}_p+{k}_t. $$

For the moment take k_t = 0. Here the columns of v are the state changes resulting from each transition, and D is a diagonal matrix whose entries are the chances of each transition occurring (namely the propensity functions). The b^p(y(t)) are the columns of a matrix B satisfying BB^T = vD(y(t))v^T. In the case of, say, p unimolecular transitions, which is the setting for ion channels, this simplifies to B = ES, where E is a p × d/2 matrix and S is a d/2 × d/2 diagonal matrix. In particular, E is a matrix with 1s on the diagonal, −1s in certain lower triangular positions and 0s elsewhere, while S is a diagonal matrix whose entries are square roots of linear combinations of certain pairs of components of y [57].

For example, in the case of the sodium and potassium ion channels that form key elements of the Hodgkin–Huxley model [103], the transitions are given by

$$ \begin{array}{l}{n}_0\kern1.5em \underset{b_n}{\overset{4{a}_n}{\rightleftharpoons }}\kern1.5em {n}_1\kern1.5em \underset{2{b}_n}{\overset{3{a}_n}{\rightleftharpoons }}\kern1.5em {n}_2\kern1.5em \underset{3{b}_n}{\overset{2{a}_n}{\rightleftharpoons }}\kern1.5em {n}_3\kern1.5em \underset{4{b}_n}{\overset{a_n}{\rightleftharpoons }}{n}_4\\ {}\\ {}{m}_0{h}_0\kern1.5em \underset{b_m}{\overset{3{a}_m}{\rightleftharpoons }}\kern1.5em {m}_1{h}_0\kern1.5em \underset{2{b}_m}{\overset{2{a}_m}{\rightleftharpoons }}\kern1.5em {m}_2{h}_0\kern1.5em \underset{3{b}_m}{\overset{a_m}{\rightleftharpoons }}\kern1.5em {m}_3{h}_0\\ {}{b}_h\uparrow \downarrow {a}_h\kern2.5em {b}_h\uparrow \downarrow {a}_h\kern3em {b}_h\uparrow \downarrow {a}_h\kern3.5em {b}_h\uparrow \downarrow {a}_h\\ {}{m}_0{h}_1\kern1.5em \underset{b_m}{\overset{3{a}_m}{\rightleftharpoons }}\kern1.5em {m}_1{h}_1\kern1.5em \underset{2{b}_m}{\overset{2{a}_m}{\rightleftharpoons }}\kern1.5em {m}_2{h}_1\kern1.5em \underset{3{b}_m}{\overset{a_m}{\rightleftharpoons }}\kern1.5em {m}_3{h}_1.\end{array} $$

In the case of sodium, for example, E and S are given by

$$ E=\left(\begin{array}{cccc}\hfill 1\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill \\ {}\hfill -1\hfill & \hfill 1\hfill & \hfill 0\hfill & \hfill 0\hfill \\ {}\hfill 0\hfill & \hfill -1\hfill & \hfill 1\hfill & \hfill 0\hfill \\ {}\hfill 0\hfill & \hfill 0\hfill & \hfill -1\hfill & \hfill 1\hfill \\ {}\hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill -1\hfill \end{array}\right)\kern1em \operatorname{diag}(S)=\left(\begin{array}{l}\sqrt{a_n{y}_{n_3}+4{b}_n{y}_{n_4}}\\ {}\sqrt{2{a}_n{y}_{n_2}+3{b}_n{y}_{n_3}}\\ {}\sqrt{3{a}_n{y}_{n_1}+2{b}_n{y}_{n_2}}\\ {}\sqrt{4{a}_n{y}_{n_0}+{b}_n{y}_{n_1}}\end{array}\right). $$

In passing, we note that, due to the structure of the ion channels, there is an explicit formulation of the underlying probability density function in terms of multinomials [104]; however, as the transition rates are nonlinear functions of the voltage, this does not greatly help in simulating the ion channel dynamics.

Although this approach gives the correct dynamics, the solution of the SDE must remain non-negative for a path to have any biological relevance, yet it has been shown that the solution can become negative [105]. Furthermore, since the noise term in the SDE model involves the square root of a function of the state variable, numerical solutions can become imaginary [105]. Alterations to the numerical scheme can be made to force the solution to remain positive; for example, the Wiener increment can be continually resampled [89]. However, such alterations can bias the results [105,106,107]. Another approach is to replace the variable in the noise term with its equilibrium value [96, 101], so that the square root term is independent of the state of the system. However, such an approach can still result in the proportion of channels becoming negative [101]. In [105] a hybrid simulation method for the Hodgkin–Huxley model is developed that attempts to improve the computational efficiency of the discrete-state Markov chain model whilst ensuring individual simulations remain non-negative by switching between regimes.

Fig. 1 Example of the reflection process on a three-dimensional bounded domain

We argue that boundary conditions are not naturally incorporated into the standard CLE, and that they can be through the use of reflected SDEs [108], in which k_t in the general formulation is a reflecting process that only comes into play at the boundaries of the domain. The basic idea is to evaluate the non-reflected process at the next time step using the Euler–Maruyama method. If the result lies in the desired region, set y at the next time step to this value; otherwise, project it orthogonally onto the boundary of the domain D. The process thus reflects y(t) into the interior of the domain in the direction of the inward-pointing unit normal. It can be shown that this approach converges with strong order ½ − ε for all ε > 0. This is visualised for the three-state model (see Fig. 1). Thus in our general formulation we can interpret k_t as a reflecting process that determines the behaviour at the boundary.
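For the two-state channel CLE above, the projection step amounts to clipping y back into [0, 1]; the Python sketch below is our illustration, with constant a and b as a simplifying assumption (in reality they are voltage dependent).

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, N = 2.0, 1.0, 100          # opening/closing rates, number of channels
y, h, t_end = 0.5, 1e-3, 10.0
for _ in range(int(t_end / h)):
    drift = a - (a + b) * y
    diff = max(a + (b - a) * y, 0.0)          # keep the variance non-negative
    y += drift * h + np.sqrt(diff / N) * rng.normal(0.0, np.sqrt(h))
    y = min(max(y, 0.0), 1.0)                 # orthogonal projection onto [0, 1]
print(y)   # fluctuates about the deterministic steady state a / (a + b)
```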

Finally, very recently Schnoerr et al. [109] showed in a very nice paper that, by extending the domain of the CLE to complex space, the CLE's accuracy for unimolecular systems is restored. This comes at the cost of having to perform simulations in complex arithmetic and taking care with the use of pre-defined functions.

5 The Role of Delays in Modelling Biochemical Reaction Systems and Model Reduction

The origins of delay differential equations (DDEs) most likely date back to the second half of the eighteenth century. According to Schmidt [110], some of the early work on DDEs (and, more generally, the so-called functional differential equations) originates from famous mathematicians such as Laplace, Poisson and Lacroix (see references in [110]). Two of the earliest examples of DDEs in the mathematical medicine and mathematical biology literature, from the early twentieth century, are by Lotka, who studied a model of malaria epidemics with incubation delays (in particular in [111]), and by Volterra, who investigated delayed predator–prey models [112]. Ever since, DDEs have become an integral part of mathematical modelling of biological, biomedical and physiological processes. Examples of delay models can be found in areas such as population dynamics and epidemiology [113, 114], gene regulation [115], cell signalling [116], viral dynamics [117], tumour growth [118], drug therapy [118, 119], immune response [120] and respiratory systems [121].

The use of delays and of systems' histories was driven by the desire for more realistic and, consequently, more accurate mathematical models. Indeed, introducing delays is often essential for reconciling models with observations and experimental data. Moreover, delays allow a more phenomenological, rather than mechanistic, modelling approach: complex processes are lumped together, and the underlying mechanisms and inherent intermediate steps are not explicitly accounted for, yet the time that such processes require is included in the form of a constant delay or a delay distribution.

With growing interest in the stochastic dynamics of chemical reaction networks, delays have also been introduced into stochastic simulation algorithms (SSAs). Several so-called delay SSAs (DSSAs) have been proposed [61, 122,123,124,125], and the Delay Chemical Master Equation (DCME) was introduced as a generalisation of the Chemical Master Equation for reactions with associated constant delays [61, 126] or delay distributions [126]. However, contrary to CMEs, for which either closed solutions have been presented or a large number of numerical and computational strategies have been suggested [104, 127,128,129,130,131,132], DCMEs have been largely disregarded. This is because DCMEs no longer represent Markovian processes, since transitions between states depend on both the current state and the process history. This leads to terms in a DCME that, for a delay reaction with stoichiometric vector ν and constant delay τ, look like

$$ \sum_{x_i\in I(x)}\left(a\left({x}_i\right)P\left(x-\nu,t;{x}_i,t-\tau \right)-a\left({x}_i\right)P\left(x,t;{x}_i,t-\tau \right)\right), $$

where the sum is taken over all previous states x i prior to the current state x. The joint probabilities are usually unknown and these terms can only be simplified if (a) the coupling of the system states at times t and t − τ is weak, in which case we obtain a reasonably good approximation; or (b) the triggering of the delayed reaction is fully independent of the occurrences of other reactions and of the state x i at the time of triggering. The latter implies in particular that none of the reactions in the system, including the delayed reaction, change the number of reactants of the delayed reaction, nor its kinetic function [126]. In this case, the propensity function of the delayed reaction is constant, i.e. a(x i ) = c for all x i , and the above sum simplifies to

$$ c\sum_{x_i\in I(x)}\left(P\left(x-\nu,t\mid {x}_i,t-\tau \right)P\left({x}_i,t-\tau\right)-P\left(x,t\mid {x}_i,t-\tau \right)P\left({x}_i,t-\tau\right)\right)=c\left(P\left(x-\nu,t\right)-P\left(x,t\right)\right). $$

If the delay is given in the form of a distribution instead of a constant, then its cumulative distribution function appears as just another factor. The DCME remains a homogeneous system of linear first-order ODEs, except that it now includes a time-dependent coefficient, and can then be solved numerically using the available tools for CMEs.

Moreover, the analytic solution for CMEs of monomolecular reaction systems was shown to be the convolution of multinomial and product Poisson distributions [104]. For a simple delayed, unidirectional reaction scheme, a multinomial distribution was recently derived as its solution using purely probabilistic arguments [126]. This suggests that such a distribution may also be the solution to a more general class of monomolecular, delayed reaction systems.

As already mentioned above, delays can also be used for model reduction. A number of reduction techniques have been proposed in the past, including the classical equilibrium approximation by Michaelis and Menten, the quasi-steady state approximation (QSSA) by Briggs and Haldane, several variations of the QSSA [133, 134], methods based on the linear noise approximation [135, 136], and the finite-state projection method [130]. However, they all approximate the true solution and/or make certain assumptions, for instance a timescale separation, which, if not met, can lead to inaccurate results. The method presented in [137, 138] uses delayed reactions between species of interest to replace a number of intermediate species and the reactions between them. The individual delays are obtained in analytic form as first-passage time distributions. These can then be placed into DSSA implementations (see the sketch below) for generating sample trajectories of the abridged system's dynamics. This method is fully accurate for all bidirectional, unimolecular reaction chains, including degradation, synthesis and bypass reactions, and allows for large computational savings. This holds true in particular if, over the same simulation time span, the number of reactions in the unabridged system is considerably larger than the number of DSSA steps in the abridged system.
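A minimal DSSA of this flavour is sketched below (Python; the split of each reaction into an immediate part nu_init and a delayed completion part nu_done is our notational choice, and published variants [61, 122,123,124,125] differ in how they treat consuming reactions). Non-delayed reactions carry their full stoichiometry in nu_init and zeros in nu_done.

```python
import heapq
import numpy as np

def delay_ssa(x0, nu_init, nu_done, delays, propensities, t_end, seed=0):
    """When reaction j fires, nu_init[:, j] is applied immediately and
    nu_done[:, j] after delays[j] time units (delays[j] = 0: no delay).
    Waiting times are resampled after each event, which is valid here
    because the propensities are re-evaluated at the new state."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, np.array(x0, dtype=np.int64)
    pending = []                                  # (completion time, reaction)
    while t < t_end:
        a = propensities(x)
        a0 = a.sum()
        tau = rng.exponential(1.0 / a0) if a0 > 0 else np.inf
        if pending and pending[0][0] < t + tau:
            t, j = heapq.heappop(pending)         # complete a delayed reaction
            x = x + nu_done[:, j]
        elif np.isfinite(tau):
            t += tau
            j = rng.choice(len(a), p=a / a0)      # initiate a reaction
            x = x + nu_init[:, j]
            if delays[j] > 0:
                heapq.heappush(pending, (t + delays[j], j))
        else:
            break                                 # nothing left to happen
    return x
```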

Lastly, it has become evident that spatial aspects play a crucial role in biological and biomedical processes. Even in relatively simple biochemical systems, the observed behaviour can vary considerably from the often assumed, well-mixed scenario, where spatial dependencies, geometries and structures are not taken into account. Thus, it is important to consider these spatial aspects in modelling approaches, both for our understanding and for accurate predictions of such processes. While detailed spatial models are more realistic, they are also much more computationally demanding—if not prohibitively expensive. Alternatively, we can try to incorporate the effects of such spatial features into temporal models, without any explicit spatial representation. As has been shown recently, delay distributions may provide an appropriate tool [62]. The proposed methodology is similar to the model reduction with delays described above: the first step consists of obtaining proper delay distributions. These can stem from diffusion profiles and can be obtained directly from particle simulations, in vitro experiments or corresponding PDE solutions. Such tailored distributions are then used, along with their associated reactions, in a modified version of the DSSA. When applied to a variety of simple scenarios of molecular translocation processes, large computational savings were achieved.

6 Conclusions

There is still considerable ongoing research to refine both the SSA and approximate methods. However, there are four areas where significant improvements can be made relatively easily. Firstly, graphics processing units (GPUs) are now very powerful for some applications, and parallelised stochastic methods allow us to harness this power to run many thousands of simulations simultaneously. Secondly, multiscale problems are common in many real applications and are often large in nature; adaptive multiscale approaches, in which processes on many different scales are integrated into one model, could, for example, play an important role in personalised medicine. Thirdly, we can attempt to reduce the number of simulations needed to attain a given accuracy using ideas from the multi-level theory developed by Mike Giles—this is a form of variance reduction. Fourthly, non-spatial methods can only be accurately applied to macroscopically homogeneous systems, an assumption that does not hold in many (or even most) cases of interest; it is therefore important to develop methods that take appropriate account of heterogeneous environments. Finally, we note that one area we have not covered in this chapter is parameter inference and model selection. This involves using statistical methods to find model parameters from experimental data and to discriminate between the models that best fit these data [139,140,141]. This is typically not an easy task, as the data may be noisy, missing or sparse; Bayesian approaches offer a way of addressing these issues [142].