1 Introduction

Stochastic reaction networks are now commonly utilized to model various types of systems in the biological sciences. These mathematical models are often continuous-time Markov chains and are used when the counts of at least some of the underlying “species,” which are most commonly different molecule types, are low. In this low copy-number case, the state of the model is a vector giving the integer counts of the different species and transitions are governed by the different possible “reactions” that can take place. These models are typically simulated via the Gillespie algorithm (Gillespie 1976, 1977) or the next reaction method (Gibson and Bruck 2000; Anderson 2007). See Anderson and Kurtz (2015), and references therein, for more on this type of model.

One potential drawback to the standard model is that it assumes a homogeneous environment. There are multiple ways to generalize, however. One common generalization is to split the environment into different fixed pieces (often called “voxels”) and then allow for transitions between adjacent voxels (Isaacson 2013; Popovic et al. 2011). Thinking of the size of the voxels going to zero leads naturally to a model with continuous space in which the state of the system is given by the type, position, velocity, etc., of each particle in the system. A reaction can then only take place when the necessary constituent molecules are near each other (with the precise mechanism for defining when they are “near enough” left to the modeler). One of the first examples of such a continuous space model was introduced by Doi (1976). More generally, there is a whole class of continuous space models known as reaction-diffusion models. For a brief overview of such models, see Erban and Othmer (2014). For a comparison of two specific such models, with an approachable introduction, see Agbanusi and Isaacson (2014); for a more general approach, see the introduction of del Razo et al. (2023).

A different approach to generalize from the homogeneous case is to imagine some fixed collection of compartments and model the dynamics within each compartment in the usual way (as a continuous-time Markov chain as described in the first paragraph above) while also allowing for interactions between adjacent compartments. This is the approach taken in McKane and Newman (2004) in an ecological context (their “patches” are our “compartments”). However, ideally one might like to also account for situations like in biological tissue, where reactions take place in cells that are not static but, for example, can appear, divide, possibly merge, or even be destroyed. That is the approach presented in a recent paper by Duso and Zechner, where they developed a Markov model for stochastic reaction networks within interacting compartments (Duso and Zechner 2020). In particular, their model consists of two basic components:

  1. a stochastic model of a chemical reaction network;

  2. a dynamic model of compartments, or cells, which themselves undergo basic transitions such as (i) arrivals, (ii) departures, (iii) mergers, and (iv) divisions. In the context of Duso and Zechner (2020), these four transition types are referred to as inflows, exits, coagulations, and fragmentations, respectively.

Each compartment, or cell, contains a copy of the (evolving) chemical reaction network. When two cells merge, their contents are combined. When a cell divides, its contents are randomly split among the two new daughter cells. Beyond the framework itself, their paper focuses on the framework’s practical use, using moment closure methods to derive estimates for various population statistics which are then validated by simulation. They also derive stationary distributions for some special cases.

In the present paper, we attempt to lay the groundwork for exploration of mathematical questions about the Markov chain model developed in Duso and Zechner (2020). We focus on the special case where the compartments can only enter, leave, merge, and divide, all according to mass action kinetics and unaffected by their contents. Questions pertaining to recurrence, transience, and explosivity are all considered. We show that in most, but not all, parameter regimes the overall qualitative behavior of the model (i.e., recurrence, transience, or explosivity) is the same as that of one of the associated stochastic reaction networks. We also analyze myriad examples that, taken together, demonstrate some of the non-intuitive (and interesting) possible behaviors of the model. Moreover, we derive the stationary distribution for the model in the case where the chemistry inside the compartments is well understood in the sense that a formula for the distribution is known for all time (e.g., the DR models of Anderson et al. (2020)) and the compartments themselves are not allowed to interact (but are not totally static, being allowed to enter and leave the system). Two special cases of this stationary distribution are provided as illustration, both of which generalize formulas from an example in Duso and Zechner (2020).

Before moving on, we warn the reader that in the field of epidemiology, the term “compartment model” has a different meaning. There, the compartments are what we would call species. For example, one would speak of an SIR model as dividing individuals into a susceptible compartment, an infected compartment, and a recovered compartment. See, e.g., Brauer (2008).

A standard knowledge of continuous-time Markov chains is assumed. See, for example, Norris (1997) for a detailed introduction to the topic. For notational convenience, we will use the following shorthand: for any vectors \(v,w \in \mathbb R^d_{\ge 0}\) and any vectors \(x,y\in \mathbb Z^d_{\ge 0}\) we denote

$$\begin{aligned} v^w = \prod _{i=1}^d (v_i)^{w_i} \quad \text {and} \quad x!= \prod _{i=1}^d(x_i)!\quad \text {and} \quad \left( {\begin{array}{c}x\\ y\end{array}}\right) =\prod _{i=1}^d\left( {\begin{array}{c}x_i\\ y_i\end{array}}\right) , \end{aligned}$$

with the conventions that \(0^0=1\) and that \(\left( {\begin{array}{c}x\\ y\end{array}}\right) =0\) for \(y<0\) or \(y>x\). Moreover, we will always use d to represent the number of species in the model. Finally, for \(x \in \mathbb Z^d_{\ge 0}\) we define \(e_x: \mathbb Z^d_{\ge 0} \rightarrow \mathbb Z\) to be the function taking the value of one at x and zero otherwise.
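These componentwise conventions are easy to mis-read, so here is a minimal Python sketch (an illustration of the notation only, not code from the paper) of the three operations:

```python
from math import comb, factorial, prod

def vec_pow(v, w):
    """v^w = prod_i (v_i)^(w_i); Python's 0**0 == 1 matches the convention."""
    return prod(vi ** wi for vi, wi in zip(v, w))

def vec_factorial(x):
    """x! = prod_i (x_i)!"""
    return prod(factorial(xi) for xi in x)

def vec_binom(x, y):
    """binom(x, y) = prod_i binom(x_i, y_i), zero if any y_i < 0 or y_i > x_i."""
    return prod(comb(xi, yi) if 0 <= yi <= xi else 0 for xi, yi in zip(x, y))
```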

The remainder of the paper is outlined as follows. In Sect. 2, we fully specify the model. Further, we give two different mathematical representations that are both useful and prove some first basic properties. In the brief Sect. 3, we prove that the full model is explosive if and only if the associated reaction network is. In Sect. 4, we give conditions for when the full model is recurrent, positive recurrent, or transient. Finally, in Sect. 5, we provide the stationary distribution for a special class of models.

2 The Reaction Network Within Interacting Compartments (RNIC) Model

As discussed in the introduction, the full model we consider here consists of two sub-models: (i) a stochastic reaction network and (ii) a dynamic model of compartments, or cells, each of which contains an evolving copy of the stochastic reaction network. We first describe these sub-models individually and then specify how they are combined to make the full model.

2.1 Stochastic Reaction Networks

Suppose we have a finite set \(\mathcal S\), whose elements we shall call species, and a directed graph whose vertices are unique linear combinations of species with non-negative integer coefficients. The edges of the graph are called reactions; let \(\mathcal R\) denote the set of reactions. The linear combinations which appear as vertices in the graph are called complexes; the set of complexes will be denoted \(\mathcal C\). A chemical reaction network (or just reaction network; CRN for short) is the tuple \(\mathcal I= (\mathcal S,\mathcal C,\mathcal R)\), where \(\mathcal S\), \(\mathcal C\) and \(\mathcal R\) are as above. See Fig. 1 for an example reaction network.

Fig. 1

The CRN with species A and B and reactions \(A+B\rightarrow 0\), \(0\rightarrow B\), \(B\rightarrow 0\), \(2B\rightarrow 0\), and \(A+2B\rightarrow A\). Note that 0 here denotes the linear combination \(0A+0B\)

When talking about specific reaction networks, the species will usually be represented by capital Latin letters. When talking generally, there will be d species \(S_1,\dots ,S_d\). In this case we will identify \(\mathbb Z^d\) with the space of linear combinations of species with integer coefficients. That is, we naturally identify \(\nu \in \mathcal C\) with the vector in \(\mathbb Z^d\) whose ith element is the coefficient of \(S_i\) in \(\nu \). We will speak of reactions \(\nu \rightarrow \nu '\in \mathcal R\), or sometimes, when we wish to enumerate the reactions as \(\{\nu _r \rightarrow \nu _r'\}\), we will simply write \(r\in \mathcal R\).

There are multiple ways to associate a mathematical model to a given reaction network, including the use of a deterministic ODE (Shinar and Feinberg 2010), a diffusion process (Leite and Williams 2019; Anderson et al. 2019), and a continuous-time Markov chain (Anderson and Kurtz 2015). The only one of concern to us here is the continuous-time Markov chain model with stochastic mass-action kinetics, in which the state of the system is a vector giving the number of each species present and transitions are determined by the reactions. To fully specify the model, positive (or sometimes, merely non-negative) numbers, called rate constants, are assigned to each reaction. If the reaction \(\nu \rightarrow \nu '\) has rate constant \(\kappa \), then in state x that particular reaction occurs with rate \(\kappa \left( {\begin{array}{c}x\\ \nu \end{array}}\right) \) and when it occurs the chain transitions to state \(x+\nu '-\nu \). So the reactions will happen with rate proportional to the number of ways the chemicals can combine to allow them to happen, and \(\kappa \) is the constant of proportionality. If \(\mathcal K\) is a set of rate constants, one for each reaction, we denote by \(\mathcal I_{\mathcal K} = (\mathcal S,\mathcal C,\mathcal R,\mathcal K)\) the corresponding stochastic mass-action system. If we let \(\kappa _{\nu \rightarrow \nu '}\) be the rate constant for the reaction \(\nu \rightarrow \nu '\), then the Markov chain transitions from state \(x\in \mathbb Z^d_{\ge 0}\) to state \(y\in \mathbb Z^d_{\ge 0}\) with rate

$$\begin{aligned} q(x,y) = \mathop {\sum _{\nu \rightarrow \nu '\in \mathcal R}}_{\nu '-\nu =y-x}\kappa _{\nu \rightarrow \nu '} \left( {\begin{array}{c}x\\ \nu \end{array}}\right) =\mathop {\sum _{\nu \rightarrow \nu '\in \mathcal R}}_{\nu '-\nu =y-x}\kappa _{\nu \rightarrow \nu '}\prod _{j=1}^d\left( {\begin{array}{c}x_j\\ \nu _j\end{array}}\right) \end{aligned}$$
(1)

where the sum is over those reactions for which \(\nu '-\nu =y-x\). For \(r=\nu _r\rightarrow \nu '_r\in \mathcal R\), we denote the rate of the reaction r in state \(x\in \mathbb Z_{\ge 0}^d\) by \(\lambda _r(x)\):

$$\begin{aligned} \lambda _r(x)=\kappa _{r} \left( {\begin{array}{c}x\\ \nu _r\end{array}}\right) \end{aligned}$$
(2)

Note that \(\lambda _{\nu \rightarrow \nu '}(x)=0\) if \(x_i < \nu _i\) for some i, since \(\left( {\begin{array}{c}m\\ k\end{array}}\right) =0\) for \(k>m\). Note also that not all authors take the same conventions as we do here. In fact, the convention we use here pertaining to our rate constants is more in line with the biology literature (Wilkinson 2018). In the mathematical literature it is more common to use a falling factorial \(\lambda _{\nu \rightarrow \nu '}(x)=\kappa _{\nu \rightarrow \nu '}\prod _j (x_j)(x_j-1)\cdots (x_j-\nu _j+1) = \kappa _{\nu \rightarrow \nu '} \frac{x!}{(x-\nu )!}\), at the cost that their rate constant \(\kappa \) is no longer the constant of proportionality when the reaction takes multiple inputs (Anderson and Kurtz 2015). This choice plays no fundamental role in our results, but makes certain expressions cleaner in the present context.
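The two conventions can be compared numerically (a sketch for illustration, with our own function names, not code from the paper); they differ exactly by the factor \(\nu !\) coming from the multinomial coefficients:

```python
from math import comb, prod

def rate_binomial(kappa, nu, x):
    """lambda(x) = kappa * prod_j binom(x_j, nu_j): the convention used here."""
    return kappa * prod(comb(xj, nj) if nj <= xj else 0 for nj, xj in zip(nu, x))

def rate_falling_factorial(kappa, nu, x):
    """lambda(x) = kappa * prod_j x_j (x_j - 1) ... (x_j - nu_j + 1):
    the convention more common in the mathematical literature."""
    out = kappa
    for nj, xj in zip(nu, x):
        for k in range(nj):
            out *= (xj - k)
    return out
```

For a reaction with a single input molecule of each species (e.g. \(A+B\rightarrow 0\)) the two agree; for \(2B\rightarrow 0\) the falling-factorial rate is \(2!\) times the binomial one.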

We note here that many of the results found in this paper can be generalized to systems with kinetics, i.e., rate functions \(\lambda _r\), other than mass-action. See Remark 4.7.

Put more succinctly, we have a Markov process on \(\mathbb Z^d_{\ge 0}\) with infinitesimal generator

$$\begin{aligned} \mathcal {L}f(x) = \sum _{r \in \mathcal R} \lambda _{r}(x) (f(x+\nu _r' - \nu _r)-f(x)), \end{aligned}$$

where \(\lambda _r\) is determined via (2), and the above is valid for all functions f that are compactly supported (Ethier and Kurtz 2009). The Kolmogorov forward equation, often called the chemical master equation in the context of reaction networks, is then

$$\begin{aligned} \frac{d}{dt} P_\mu (x,t) = \sum _{r\in \mathcal R} \lambda _r(x-(\nu _r'-\nu _r)) P_\mu (x-(\nu _r'-\nu _r),t) - \sum _{r\in \mathcal R} \lambda _r(x) P_\mu (x,t), \end{aligned}$$

where \(P_\mu (x,t) = P_\mu (X(t) = x)\) is the probability the process X is in state \(x\in \mathbb Z^d_{\ge 0}\) at time t, given an initial distribution of \(\mu \). We take the convention that \(P_\mu (x,t) = 0\) for \(x\notin \mathbb Z^d_{\ge 0}\).

One way to represent the solution to the stochastic model described above is via a representation developed and popularized by Thomas Kurtz. Let \(\{Y_{r}\}_{r\in \mathcal R}\) be a collection of independent, unit-rate Poisson processes, one for each possible reaction, and let \(X(t), t\ge 0\), be the solution to

$$\begin{aligned} X(t) = X(0) + \sum _{r \in \mathcal R} Y_r \left( \int _0^t \lambda _r(X(s)) ds\right) (\nu _r'-\nu _r), \end{aligned}$$
(3)

then X is a continuous-time Markov chain that satisfies the conditions of the model specified above (Anderson and Kurtz 2015; Ethier and Kurtz 2009; Kurtz 1978).
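The representation (3) also suggests how such a chain is simulated in practice. The following is a minimal Gillespie-style simulator (a sketch for illustration; the function name and the tuple encoding of reactions are our own, not from the paper):

```python
import random
from math import comb, prod

def gillespie(reactions, x0, t_max, seed=0):
    """Simulate the mass-action CTMC.  Each reaction is a tuple
    (kappa, nu, nu_prime) of rate constant, source complex, product complex."""
    rng = random.Random(seed)
    t, x = 0.0, list(x0)
    while t < t_max:
        # lambda_r(x) = kappa_r * prod_j binom(x_j, nu_j), as in (2)
        rates = [k * prod(comb(xj, nj) if nj <= xj else 0 for nj, xj in zip(nu, x))
                 for k, nu, _ in reactions]
        total = sum(rates)
        if total == 0:
            break  # absorbing state: no reaction can fire
        t += rng.expovariate(total)
        if t >= t_max:
            break
        _, nu, nup = reactions[rng.choices(range(len(reactions)), weights=rates)[0]]
        x = [xj + npj - nj for xj, nj, npj in zip(x, nu, nup)]
    return x
```

For example, the single conversion reaction \(A\rightarrow B\) encoded as `(1.0, (1, 0), (0, 1))` conserves the total count \(a+b\) along any realization.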

Example 2.1

Suppose we assign rate constants to the example CRN in Fig. 1 as follows:

[Figure: the network of Fig. 1 with rate constants \(10\) for \(A+B\rightarrow 0\), \(2\) for \(0\rightarrow B\), \(\kappa \) for \(B\rightarrow 0\), \(6\) for \(2B\rightarrow 0\), and \(8\) for \(A+2B\rightarrow A\).]

Let \(x=(a,b)\in \mathbb Z_{\ge 0}^2\) denote an arbitrary state of the system. For the particular choice of rate constants given above, the positive transition rates \(q((a,b),\cdot )\), for \(a, b \in \mathbb Z_{\ge 0}\), are

  • Reaction \(A+B\rightarrow 0\): transition \((a,b)\mapsto (a-1,b-1)\), rate \(10ab\).

  • Reaction \(0\rightarrow B\): transition \((a,b)\mapsto (a,b+1)\), rate \(2\).

  • Reactions \(2B\rightarrow 0\) and \(A+2B\rightarrow A\): transition \((a,b)\mapsto (a,b-2)\), rate \(\displaystyle 6\frac{b(b-1)}{2}+8a\frac{b(b-1)}{2}\).

  • Reaction \(B\rightarrow 0\): transition \((a,b)\mapsto (a,b-1)\), rate \(\kappa b\).

We chose to write \(6\frac{b(b-1)}{2}+8a\frac{b(b-1)}{2}\) instead of \(3b(b-1)+4ab(b-1)\) to emphasize our choice of intensity functions. Note that all other rates, such as \(q((a,b),(a+1,b))\) or \(q((a,b),(a+12,b-3))\), are zero. \(\square \)
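The combined rate for the jump \((a,b)\mapsto (a,b-2)\) can also be checked numerically (an illustrative sketch with a hypothetical function name; the symbolic rate constant \(\kappa \) for \(B\rightarrow 0\) plays no role here and is omitted):

```python
from math import comb

def rate_b_minus_2(a, b):
    """Combined rate of 2B -> 0 (kappa = 6) and A + 2B -> A (kappa = 8),
    i.e. 6*binom(b,2) + 8*a*binom(b,2) = 6*b(b-1)/2 + 8*a*b(b-1)/2."""
    return 6 * comb(b, 2) + 8 * a * comb(b, 2)
```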

2.2 Compartment Model

Having fully specified our CRN, \(\mathcal I_{\mathcal K} = (\mathcal S,\mathcal C,\mathcal R,\mathcal K)\), we turn to our next sub-model: the compartment model. As mentioned in the introduction, we will assume that compartments, or cells, can arrive, depart, merge, and divide. We can use the notation of chemical reaction networks to describe the four possibilities visually via a reaction network,

$$\begin{aligned} 0 \leftrightarrows C \leftrightarrows 2C, \end{aligned}$$
(4)

with \(0\rightarrow C\) representing arrivals, \(C\rightarrow 0\) representing departures, \(C \rightarrow 2C\) representing division, or fragmentation, and \(2C \rightarrow C\) representing mergers, or coagulations. Moreover, we assume that the stochastic model tracking the number of compartments behaves as a standard stochastic reaction network as already described in the previous section (however, see Remark 4.7 for an allowable generalization to the choice of kinetics). We will term this reaction network the compartment network, and denote it by \(\mathcal H=(\mathcal S_{\text {comp}},\mathcal C_{\text {comp}},\mathcal R_{\text {comp}})\). Note that \(\mathcal S_{\text {comp}} = \{C\}\) and \(\mathcal C_{\text {comp}}\) is a subset of \(\{0,C,2C\}\) (depending on which rate constants are non-zero). If rate constants are added as follows,

$$\begin{aligned} 0 \underset{\kappa _E}{\overset{\kappa _I}{\leftrightarrows }} C \underset{\kappa _C}{\overset{\kappa _F}{\leftrightarrows }} 2C, \end{aligned}$$
(5)

where each \(\kappa _E, \kappa _I, \kappa _C, \kappa _F\ge 0\), then we will denote the corresponding stochastic mass-action system by \(\mathcal H_\mathcal K=(\mathcal S_{\text {comp}},\mathcal C_{\text {comp}},\mathcal R_{\text {comp}},\mathcal K_{\text {comp}})\). According to (3), if we denote by \(M_C(t)\) the number of compartments at time t, then one way to represent this model is as the solution to

$$\begin{aligned} M_C(t)&= M_C(0) + Y_I\left( \kappa _I t\right) - Y_E\left( \int _0^t \kappa _E M_C(s) ds\right) + Y_F\left( \int _0^t \kappa _F M_C(s) ds\right) \\&\quad - Y_C\left( \int _0^t \kappa _C \frac{M_C(s) (M_C(s) - 1)}{2}\, ds\right) , \end{aligned}$$

where \(Y_I, Y_E, Y_F\), and \(Y_C\) are independent unit-rate Poisson processes.
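A direct Gillespie-style simulation of \(M_C\), equivalent in law to the random time-change representation above, can be sketched as follows (illustration only; the function name and argument order are our own):

```python
import random

def simulate_compartments(m0, kI, kE, kF, kC, t_max, seed=0):
    """Simulate the compartment count M_C for the network 0 <-> C <-> 2C."""
    rng = random.Random(seed)
    t, m = 0.0, m0
    while True:
        # inflow, exit, fragmentation, coagulation rates (mass action)
        rates = [kI, kE * m, kF * m, kC * m * (m - 1) / 2]
        total = sum(rates)
        if total == 0:
            return m  # e.g. m == 0 with no inflow
        t += rng.expovariate(total)
        if t >= t_max:
            return m
        m += (1, -1, 1, -1)[rng.choices(range(4), weights=rates)[0]]
```

With only the exit reaction active, the count is absorbed at zero, matching the fact that \(0\) is then an absorbing state for \(M_C\).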

2.3 Specifying the Full, Combined Model

Our full model, which we will term a reaction network within interacting compartments (RNIC), begins with two networks, one representing the dynamics of the compartments themselves and one representing the chemistry taking place inside the compartments.

  • A CRN \(\mathcal H_\mathcal K\) of the form \(0\leftrightarrows C\leftrightarrows 2C\), called the compartment network. The state of this CRN (in \(\mathbb Z_{\ge 0}\)) will be the number of compartments.

  • A CRN \(\mathcal I_\mathcal K\), called the chemistry (or \(\underline{Internal network}\)), with d species.

The behavior of the model between transitions of the compartment model is straightforward: the CRN within each compartment evolves independently as a Markov chain with transition rates specified by (1). All that remains is to specify what happens to the full model at the transition times of the compartment model. Hence, there are four cases to consider.

  • An arrival: \(0\rightarrow C\). We assume the existence of a probability measure \(\mu \) on \(\mathbb Z^d_{\ge 0}\). Each time an arrival event occurs, we add a new compartment whose initial state is chosen according to \(\mu \), independent of the past. (Note that \(\mu \) is not necessary when \(\kappa _I=0\).)

  • A departure: \(C\rightarrow 0\). When a departure event occurs, we choose one of the compartments, uniformly at random, for deletion.

  • A merger: \(2C \rightarrow C\). When a merger event occurs, we select two compartments, uniformly at random. We replace the chosen compartments with a single compartment. The state of the new compartment is the sum of the states of the two it replaced.

  • A division: \(C \rightarrow 2C\). When a division event occurs, we select a compartment, uniformly at random. We replace the chosen compartment with two new compartments, whose initial states are determined by having each molecule from the chosen compartment select one of the two new compartments uniformly. For example, if there are \(n_A\) type A species in the chosen compartment, then one of the new compartments will get a number of A molecules given by a binomial distribution with parameters \(n_A\) and \(p = \tfrac{1}{2}\), and the other compartment will get \(n_A\) minus that value.

This whole system will be denoted \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\).
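The two content-changing compartment events can be sketched in code (hypothetical helper names, for illustration only; the division rule is exactly the uniform molecule-assignment rule above, so each species count splits as a Binomial\((n,1/2)\)):

```python
import random

def merge_contents(s1, s2):
    """Merger: the new compartment holds the sum of the two states."""
    return tuple(a + b for a, b in zip(s1, s2))

def divide_contents(state, rng):
    """Division: each molecule independently picks one of the two daughters
    uniformly at random, so each species count splits binomially."""
    left = tuple(sum(1 for _ in range(n) if rng.random() < 0.5) for n in state)
    right = tuple(n - l for n, l in zip(state, left))
    return left, right
```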

Remark 2.2

Above, we assume that when divisions, i.e., compartment transitions of the form \(C\rightarrow 2C\), happen, each molecule picks a new compartment uniformly at random. This assumption makes the constructions in this paper easier. However, our proofs only require that the total number of each species across compartments is preserved when each division happens.

Similar to our network representations for reaction networks, we can specify the above model through a picture of the following form:

[Diagram: the compartment network \(0\leftrightarrows C\leftrightarrows 2C\) with rate constants \(\kappa _I,\kappa _E,\kappa _F,\kappa _C\), drawn with a copy of “\(\mathcal I_\mathcal K\)” inside the compartment \(C\).]

where “\(\mathcal I_\mathcal K\)” is a stand-in for a standard CRN diagram, such as the one in (4).

Example 2.3

If \(\mathcal I_\mathcal K\) is exactly the network diagrammed in Example 2.1 and \(\mu \) is the point mass with 3 molecules of A and 4 molecules of B, we would write

\(\square \)

See also Example 2.8 for another specific example.

There are multiple avenues for generalizations. For example, when a merger occurs it could be that not all the molecules make it into the new compartment, or when a division occurs it could be that some molecules are lost, or there is a non-uniform mechanism for distributing the molecules. Moreover, it could be that the rate of compartment fragmentation or exit depends on the internal state of the compartment. These models all fall under the more general framework given in Duso and Zechner (2020) and could be studied mathematically in the future if there is a desire, but for the initial development of the mathematics we choose to keep things simpler.

2.3.1 Simulation Representation

There are multiple ways to describe a Markov model satisfying the information given in the ingredients \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\). The first we give is what we term a “simulation” representation in which we enumerate the compartments and track the counts of the species in each compartment.

The simulation representation will be a Markov chain \(F^{\textrm{sim}}\) whose state is a finite vector of elements of \(\mathbb Z^d_{\ge 0}\), where d, as always, is the number of species. We first describe the model via an example. Afterwards we will provide the mathematical details.

Example 2.4

Consider again the model from Example 2.3. Suppose that at time T there are 4 compartments, where the first has two A and two B, the second has no A and one B, the third again has two of each, and the last has one A and twelve B. Then the state of the model \(F^{\textrm{sim}}\) would be the vector

$$\begin{aligned} \left( \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 0\\ 1\end{array} \right] , \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 1\\ 12\end{array} \right] \right) . \end{aligned}$$

We now suppose that at time T a transition occurs. We first consider four possibilities if the transition is due to a reaction of the compartment model.

  • Suppose first that the compartment transition is an inflow event. We will make the convention that the new compartment due to an inflow reaction will always be placed at the end of the vector of states. Hence, because the initial distribution for arriving compartments is a point mass at (3, 4) the new state of the full system is

    $$\begin{aligned} \left( \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 0\\ 1\end{array} \right] , \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 1\\ 12\end{array} \right] , \left[ \begin{array}{c} 3\\ 4\end{array} \right] \right) . \end{aligned}$$
  • Next suppose that the compartment transition is an exit event. In this case we must choose a compartment at random, delete it from the vector, and re-index the other components. Thus, we start by choosing from \(\{1,2,3,4\}\), each with probability 1/4. Suppose that the value 3 is chosen so that the third compartment will be deleted. In this case, the new state of the full system is

    $$\begin{aligned} \left( \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 0\\ 1\end{array} \right] , \left[ \begin{array}{c} 1\\ 12\end{array} \right] \right) . \end{aligned}$$
  • Now suppose that the compartment transition is a merger, or coagulation. We must select two compartments at random and combine their contents. We adopt the convention that the combined contents are placed in the compartment with the lower index and the compartment with the higher index is deleted. Thus, assuming we choose the compartments indexed 1 and 2, we merge the first and second compartments, place their contents into compartment 1 (since it has the smaller index of the two chosen), and delete the second compartment. The resulting state is

    $$\begin{aligned} \left( \left[ \begin{array}{c} 2\\ 3\end{array} \right] , \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 1\\ 12\end{array} \right] \right) . \end{aligned}$$
  • Finally, we suppose that the compartment transition is a fragmentation. The procedure will be as follows. We will first choose the index of the compartment that fragments, we then create two new compartments and will then split the contents between these new compartments (with each particular molecule choosing between the new compartments with equal probability). The originally chosen compartment will be deleted and the two new compartments will be placed at the end of the vector of states.

    For example, suppose we choose compartment 3 for fragmentation (which occurs with probability 1/4). We then split the contents of the original third compartment (four molecules total, 2 of A and 2 of B) uniformly at random between the two new compartments. Suppose for concreteness that we split as \(\left[ \begin{array}{c} 1\\ 2\end{array}\right] \) and \(\left[ \begin{array}{c} 1\\ 0\end{array}\right] \). Then, after deleting the 3rd compartment and adding these two onto the end we have a new state for the full model of

    $$\begin{aligned} \left( \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 0\\ 1\end{array} \right] , \left[ \begin{array}{c} 1\\ 12\end{array} \right] , \left[ \begin{array}{c} 1\\ 2\end{array}\right] , \left[ \begin{array}{c} 1\\ 0\end{array}\right] \right) . \end{aligned}$$

It is also possible that the transition at time T was due to a reaction taking place within one of the compartments. For example, if the reaction \(A+2B\rightarrow A\) happens inside the fourth compartment, then the state of the whole system, \(F^{\textrm{sim}}\), will become

$$\begin{aligned} \left( \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 0\\ 1\end{array} \right] , \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 1\\ 10\end{array} \right] \right) . \end{aligned}$$

\(\square \)

Now we give the formal mathematical description of \(F^{\textrm{sim}}\). First, let \(\{M_C(t)\}_{t\ge 0}\) be the Markov chain associated to the compartment network \(\mathcal H_\mathcal K\). Then \(M_C(t)\) will be the number of compartments at time t. Let \(\{T_i\}_{i=0}^\infty \) be the jump times for this Markov chain, where \(T_0=0\). For any \(i\ge 0\) and any \(j=1,\dots ,M_C(T_i)\), let \(\{X^i_j(t)\}_{t\in [T_i,T_{i+1}]}\) be realizations of the Markov chain associated to \(\mathcal I_\mathcal K\) with initial distributions (at time \(T_i\)) specified below. Suppose that for any \(i_1,i_2\) and \(j_1,j_2\) with either \(i_1\ne i_2\) or \(j_1\ne j_2\), the chains \(X^{i_1}_{j_1}\) and \(X^{i_2}_{j_2}\) are independent conditional on their initial conditions, and suppose that the initial distributions are chosen in the following manner (which are just formal characterizations of the details provided in the example above):

  • If the compartment transition at time \(T_{i+1}\) was an inflow event (\(0\rightarrow C\)), then let \(X^{i+1}_j(T_{i+1})=X^i_j(T_{i+1})\) for \(j=1,\dots ,M_C(T_i)\), and for \(j=M_C(T_{i+1})=M_C(T_i)+1\) let \(X^{i+1}_j(T_{i+1})\) be distributed according to \(\mu \), independently of everything in the past.

  • If the compartment transition at time \(T_{i+1}\) was an exit event \((C\rightarrow 0\)), then let \(J_i\) be chosen uniformly at random from \(\{1,\cdots ,M_C(T_i)\}\), independently of everything in the past. Let \(X^{i+1}_j(T_{i+1})=X^i_j(T_{i+1})\) for \(j<J_i\), and let \(X^{i+1}_j(T_{i+1})=X^i_{j+1}(T_{i+1})\) for \(j\ge J_i\).

  • If the compartment transition at time \(T_{i+1}\) was a merger, or coagulation, event (\(2C\rightarrow C\)), then let \(J^1_i\) and \(J^2_i\) be chosen uniformly at random from \(\{1,\cdots ,M_C(T_i)\}\) and \(\{1,\cdots ,M_C(T_i)\}\setminus \{J^1_i\}\), respectively, independent of everything in the past. Let \(X^{i+1}_j(T_{i+1})=X^i_j(T_{i+1})\) for \(j<\max \{J^1_i,J^2_i\}\) with \(j\ne \min \{J^1_i,J^2_i\}\), let \(X^{i+1}_j(T_{i+1})=X^i_{j+1}(T_{i+1})\) for \(j\ge \max \{J^1_i,J^2_i\}\), and let \(X^{i+1}_j(T_{i+1})=X^i_{J^1_i}(T_{i+1})+X^i_{J^2_i}(T_{i+1})\) for \(j=\min \{J^1_i,J^2_i\}\).

  • If the compartment transition at time \(T_{i+1}\) was a fragmentation event (\(C\rightarrow 2C\)), then let \(J_i\) be chosen uniformly at random from \(\{1,\cdots ,M_C(T_i)\}\), independently of everything in the past. Let \(\{Z^i_k(x):x\in \mathbb Z^d,k=1,\dots ,d\}\) be a collection of random variables, independent of each other and everything else, with \(Z^i_k(x)\sim \textrm{Binom}\left( x_k, \tfrac{1}{2}\right) \). Let \(Z^i(x)\) denote the vector \(\big (Z^i_1(x),\cdots ,Z^i_d(x)\big )\). Let \(X^{i+1}_j(T_{i+1})=X^i_j(T_{i+1})\) for \(j<J_i\), let \(X^{i+1}_j(T_{i+1})=X^i_{j+1}(T_{i+1})\) for \(j=J_i,\dots ,M_C(T_i)-1\), and for \(j=M_C(T_i)\) let \(X^{i+1}_j(T_{i+1})=Z^i(X^i_{J_i}(T_{i+1}))\) and \(X^{i+1}_{j+1}(T_{i+1})=X^i_{J_i}(T_{i+1})-X^{i+1}_j(T_{i+1})\).

Let \(F^{\textrm{sim}}(t)\) be the vector \(\left( X^i_1(t),X^i_2(t),\cdots ,X^i_{M_C(t)}(t)\right) \), where i is such that \(T_i\le t<T_{i+1}\).
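The indexing conventions above (arrivals appended at the end, exits re-indexing the remaining compartments, mergers placed at the lower index) act directly on the vector of compartment states and can be sketched as follows (hypothetical helper names; indices here are 0-based, unlike the 1-based indices of the text):

```python
def apply_inflow(comps, new_state):
    """Inflow: the arriving compartment is appended at the end."""
    return comps + [new_state]

def apply_exit(comps, j):
    """Exit: delete compartment j and re-index the rest."""
    return comps[:j] + comps[j + 1:]

def apply_merge(comps, j1, j2):
    """Merger: the sum goes into the lower index; the higher index is deleted."""
    lo, hi = min(j1, j2), max(j1, j2)
    merged = tuple(a + b for a, b in zip(comps[j1], comps[j2]))
    out = comps[:hi] + comps[hi + 1:]
    out[lo] = merged
    return out
```

Applied to the four-compartment state of Example 2.4, these reproduce the exit and merger updates shown there.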

Lemma 2.5

The process \(\{F^{\textrm{sim}}(t)\}_{t\ge 0}\) is a continuous-time Markov chain with state space \(\bigcup _{m\ge 0}\left( \mathbb Z_{\ge 0}^d\right) ^m\), the space of finite tuples of elements of \(\mathbb Z_{\ge 0}^d\).

Proof

To show that this is a Markov process we have to show that the holding times are exponential and the updates are independent of the holding times. To see that the holding times are exponential, notice that since \(M_C\) is a Markov chain it has exponential holding times, and similarly for each \(X^i_j\). But the holding times for these processes are independent, and the minimum of independent exponential random variables is itself exponential.

Furthermore, the minimum of a (finite) collection of independent exponential random variables is independent of the index at which the minimum occurs, so the updates are indeed independent of the holding times.

The fact that \(F^{\textrm{sim}}\) takes values in the space of finite tuples is equivalent to \(M_C\) being finite for all time, which in turn is equivalent to the fact that \(M_C\) is not explosive, regardless of the choice of rate constants in \(\mathcal H_\mathcal K\). This is a standard result in the theory of 1-d mass action stochastic reaction networks; see for instance (Xu et al. 2022). \(\square \)

2.3.2 An Explicit Construction of the Simulation Representation

We discuss one way of constructing the model described in Sect. 2.3.1, in the spirit of the Kurtz representation (3). Here, by “construction” we mean an explicit detailing of the random processes and random variables needed to generate a single realization of the process. The construction is of interest since it is amenable to analysis, coupling methods, simulation methods, etc. The construction will be used later in this paper to verify some behaviors of Example 4.23.

Let \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) be as above. Suppose that \(M_C(0)\) is the initial number of compartments in the system and further suppose that \(M_C\) is given as the solution to

$$\begin{aligned} \begin{aligned} M_C(t)&= M_C(0) + Y_I\left( \kappa _I t\right) - Y_E\left( \int _0^t \kappa _E M_C(s) ds\right) + Y_F\left( \int _0^t \kappa _F M_C(s) ds\right) \\&\quad - Y_C\left( \int _0^t \kappa _C \frac{M_C(s) (M_C(s) - 1)}{2}\, ds\right) , \end{aligned} \end{aligned}$$
(6)

where \(Y_I, Y_E, Y_F\), and \(Y_C\) are independent unit-rate Poisson processes. Then \(M_C\) is the Markov chain on \(\mathbb Z_{\ge 0}\) associated to \(\mathcal H_\mathcal K\), so that \(M_C(t)\) gives the number of compartments at any time \(t \ge 0\).

The jump times of the counting processes \(R_I(t) = Y_I\big ( \kappa _I t\big )\), \(R_E(t) = Y_E\big ( \int _0^t \kappa _E M_C(s) ds\big )\), \(R_F(t) = Y_F\left( \int _0^t \kappa _F M_C(s) ds\right) \), and \(R_C(t) = Y_C\Big ( \int _0^t \kappa _C \frac{M_C(s) (M_C(s) - 1)}{2}\, ds\Big )\) determine when the RNIC model transitions due to changes in the count of the compartments. For each such transition we will also require a collection of random variables needed to carry out the updates in the RNIC model. We detail these random variables below. Before proceeding, we remind the reader that a uniform random variable can be used to generate a sample from a given distribution in a number of ways. For some standard transformation methods see, for example, Anderson et al. (2017, Sect. 5.2).

In the description below all random variables are independent of each other and of the Poisson processes \(Y_I, Y_E, Y_F,Y_C\). We require:

  • A collection of independent uniform random variables \(\{u_i^I\}\), \(i = 1,2,\dots \). When \(R_I(T) - R_I(T-) = 1\), the random variable \(u_{R_I(T)}^I\) is used to generate a sample from \(\mu \).

  • A collection of independent uniform random variables \(\{u_i^E\}\), \(i = 1,2, \dots \). When \(R_E(T) - R_E(T-) = 1\), the random variable \(u_{R_E(T)}^E\) is used to determine which compartment exits at that time.

  • Two collections of independent uniform random variables: (i) \(\{u_i^{F}\}\), \(i = 1,2, \dots \), and (ii) an array \(\{{\hat{u}}_{i,j}^F\}\), \(i,j\in \{1,2,\dots \}\). When \(R_F(T) - R_F(T-) = 1\), the random variable \(u_{R_F(T)}^F\) is used to determine which compartment fragments. We then utilize the finite collection \(\{{\hat{u}}^F_{R_F(T),j}\}\), \(j = 1,\dots , M\), where M is the total number of molecules in the chosen compartment, to divide the different molecules between the two new cells.

  • A collection of independent uniform random variables \(\{u_i^C\}\), \(i = 1,2, \dots \). When \(R_C(T) - R_C(T-) = 1\), the random variable \(u_{R_C(T)}^C\) is used to determine which two compartments are chosen to merge.

Note that the collections detailed above are chosen before a realization is generated. Said differently, the realization of the RNIC model is a function of these independent random variables.
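As an illustration of how the array \(\{{\hat{u}}_{i,j}^F\}\) is used during a fragmentation event, the following hypothetical Python helper (our own sketch, not from the paper) sends each molecule of the fragmenting compartment to the first daughter exactly when its associated uniform falls below 1/2:

```python
def fragment(x, uniforms):
    """Split a compartment with species counts x = (x_1, ..., x_d) into two
    daughters, each molecule going to daughter 1 independently with
    probability 1/2.  `uniforms` supplies one Uniform(0,1) sample per
    molecule, playing the role of the array {u-hat^F}.
    """
    it = iter(uniforms)
    daughter1 = [sum(1 for _ in range(count) if next(it) < 0.5)
                 for count in x]
    daughter2 = [c - k for c, k in zip(x, daughter1)]
    return daughter1, daughter2
```

By construction the two daughter states always sum, species by species, to the state of the fragmenting compartment.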

All that remains is to give the timing of the different chemical reactions. One method is the following. Let \(\{Y_{r}\}_{r\in \mathcal R}\) be a collection of independent (of each other, and all other random objects so far), unit-rate Poisson processes, one for each possible reaction in \(\mathcal I_\mathcal K\). Moreover, for each \(r \in \mathcal R\), let \(\{u^r_i\}\), \(i=1,2,\dots \) be a collection of independent uniform random variables. Then, for \(r \in \mathcal R\), we let

$$\begin{aligned} R_r(t) = Y_r \left( \sum _{i\ge 0:\, T_i\le t}\int _{T_i}^{T_{i+1}\wedge t}\sum _{j=1}^{M_C(T_i)} \lambda _r(X^i_j(s))\, ds\right) , \end{aligned}$$

where the \(T_i\) are the jump times of the process \(M_C\), \(X_j^i(s)\) is the state of the process in compartment j at time s, and \(\lambda _r\) is given as in (2). Then \(R_r\) is the counting process that jumps by \(+1\) when the rth reaction takes place in some compartment. When \(R_r(T) - R_r(T-) = 1\), meaning a reaction has taken place somewhere, we use \(u^r_{R_r(T)}\) to determine the compartment within which the reaction took place. In particular, the probability that it took place in compartment k is simply

$$\begin{aligned} \frac{\lambda _r(X_k^i(T-))}{\sum _{j=1}^{M_C(T_i)} \lambda _r(X_j^i(T-))}. \end{aligned}$$
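The placement step just described admits a one-line inversion-method implementation. The helper below is our own illustration (the name `choose_compartment` is not from the paper); it maps the single uniform \(u^r_{R_r(T)}\) to a compartment index with exactly the displayed probabilities:

```python
def choose_compartment(u, intensities):
    """Turn one Uniform(0,1) draw u into a compartment index k, chosen with
    probability intensities[k] / sum(intensities), where intensities[j]
    plays the role of lambda_r evaluated in compartment j."""
    total = sum(intensities)
    threshold = u * total
    running = 0.0
    for k, lam in enumerate(intensities):
        running += lam
        if threshold < running:
            return k
    return len(intensities) - 1  # guard against floating-point round-off
```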

2.3.3 A Coarse-Grained Representation

While the description (and construction) above is often convenient for the sake of analysis and simulation, it is sometimes not the most natural way to think about these models. For example, suppose we have a model with a single species, denoted S, and for which there are two compartments at time t, so that \(M_C(t) = 2\). It is reasonable to think that we would not care to distinguish the situation in which there are 6 molecules of species S in the first compartment and 2 in the second, which is the state (6, 2), versus the situation of 2 molecules of S in the first and 6 in the second, which is the state (2, 6). In this situation, we would simply care that we have one compartment with two S molecules, another with six, and there are no other compartments.

To handle this, we consider a function \(n: \mathbb Z_{\ge 0} \rightarrow \mathbb Z_{\ge 0}\) in which \(n_x:=n(x)\) gives the number of compartments present with precisely x molecules of S (hence the notation that “n” gives the \(\underline{n}\)umber of compartments with different counts). In this case, the state of the example system described above would simply be the function with

$$\begin{aligned} n_x={\left\{ \begin{array}{ll} 1 &{} \text {if } x = 2\\ 1 &{} \text {if } x = 6\\ 0 &{} \text {else.} \end{array}\right. } \end{aligned}$$

Note that in this one-dimensional case we can also think of n as an “infinite vector.” For example, in our example above we would have

$$\begin{aligned} n = (0,0,1,0,0,0,1,0,0,\dots ), \end{aligned}$$

with only zeros continuing on.

For another example, we could consider the case discussed in Example 2.4, where there are two species A and B and the state for the simulation representation was

$$\begin{aligned} \left( \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 0\\ 1\end{array} \right] , \left[ \begin{array}{c} 2\\ 2\end{array} \right] , \left[ \begin{array}{c} 1\\ 12\end{array} \right] \right) . \end{aligned}$$

In this case, the state could naturally be described by the function

$$\begin{aligned} n_x={\left\{ \begin{array}{ll} 2 &{} \text {if } x = (2,2)\\ 1 &{} \text {if } x = (0,1)\\ 1 &{} \text {if } x = (1,12)\\ 0 &{} \text {else.} \end{array}\right. } \end{aligned}$$

Note that in this example, it is not natural to view n as an “infinite vector.” Instead, it would be natural to view it as an “infinite array” with a two in the (2, 2) component, ones in the (0, 1) and (1, 12) components, and zeros elsewhere.
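In code, the coarse-graining map from the ordered tuple state to the function n is a simple multiset count. A hypothetical Python sketch (ours, not from the paper), applied to the four-compartment state of Example 2.4:

```python
from collections import Counter

def coarse_grain(compartments):
    """Collapse the ordered tuple state of F^sim into the function n,
    represented sparsely as a Counter mapping x -> n_x."""
    return Counter(tuple(x) for x in compartments)
```

Summing the values of the resulting Counter recovers the total number of compartments, i.e. the \(\ell ^1\) norm of n.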

Thus, we may take the following approach, as done in Duso and Zechner (2020). The state space of the coarse-grained model will be

$$\begin{aligned} \begin{aligned} \mathcal N:={}&\{ \text {functions } n: \mathbb Z^d_{\ge 0} \rightarrow \mathbb Z_{\ge 0} \text { with compact support}\}\\ ={}&\{ \text {functions } n: \mathbb Z^d_{\ge 0} \rightarrow \mathbb Z_{\ge 0} \text { with finite support}\}\\ ={}&\{ \text {functions } n: \mathbb Z^d_{\ge 0} \rightarrow \mathbb Z_{\ge 0} \text { with finite }\ell ^1\text { norm}\}, \end{aligned} \end{aligned}$$
(7)

where we observe that all three sets are the same. Given \(n:\mathbb Z_{\ge 0}^d\rightarrow \mathbb Z_{\ge 0}\), we write \(n=(n_x)_{x\in \mathbb Z^d_{\ge 0}}\). For each possible state \(x \in \mathbb Z^d_{\ge 0}\) of the chemistry, \(n_x\in \mathbb Z_{\ge 0}\) represents the number of compartments whose chemistry has that particular state. Given Markov chains \(M_C\) and \(X^i_j\) as defined in Sect. 2.3.1, let N be the process where \(N_x(t)\) is the number of compartments in state \(x\in \mathbb Z^d_{\ge 0}\) at time \(t \ge 0\):

$$\begin{aligned} N_x(t)=\sum _{i=0}^\infty \mathbb I\{t\in [T_i,T_{i+1})\}\sum _{j=1}^{M_C(T_i)}\mathbb I\{X^i_j(t)=x\}. \end{aligned}$$

Note that the total number of compartments at time \(t\ge 0\) can be recovered from N(t) via

$$\begin{aligned} M_C(t)= \left\Vert N(t)\right\Vert _{\ell ^1}:= \sum _{x\in \mathbb Z^d_{\ge 0}} N_x(t). \end{aligned}$$

Note also that the process N transitions iff \(F^{\textrm{sim}}\) does. This fact is important enough that we state it as a lemma:

Lemma 2.6

Let \(F^{\textrm{sim}}\) and N be as above. Then N undergoes a transition at time t iff \(F^{\textrm{sim}}\) does.

Proof

On the one hand, N is defined as a function of \(F^{\textrm{sim}}\) and so N cannot transition if \(F^{\textrm{sim}}\) does not. On the other hand, all possible transitions of \(F^{\textrm{sim}}\) cause a change in N: if \(F^{\textrm{sim}}\) transitions because \(M_C\) does, then \(\left\Vert N\right\Vert _{\ell ^1}=M_C\) changes, whereas if \(F^{\textrm{sim}}\) changes otherwise then the contents of some single compartment are updated, which changes N. \(\square \)

For the lemma below, we recall that for \(x \in \mathbb Z^d_{\ge 0}\) we define \(e_x\) to be the function taking the value of one at x and zero otherwise.

Lemma 2.7

Let N(t) be as defined above. Then \(\{N(t)\}_{t\ge 0}\) is a Markov chain taking values in \(\mathcal N\), defined in (7). Moreover, for \(n\in \mathcal N\), the transition rates are as follows:

| Transition type | Transition | Rate |
| --- | --- | --- |
| Compartment inflow | \(n\mapsto n+e_x\) | \(\kappa _I\mu (x)\) |
| Compartment exit | \(n\mapsto n-e_x\) | \(\kappa _En_x\) |
| Compartment coagulation, \(x\ne y\) | \(n\mapsto n+e_{x+y}-e_x-e_y\) | \(\kappa _C n_xn_y\) |
| Compartment coagulation | \(n\mapsto n+e_{2x}-2e_x\) | \(\kappa _C \left( {\begin{array}{c}n_x\\ 2\end{array}}\right) \) |
| Compartment fragmentation (\(x=y\) allowed here) | \(n\mapsto n-e_{x+y}+e_x+e_y\) | \(\kappa _F n_{x+y}\varphi (x+y,x)\) |
| Internal reaction \(r \in \mathcal R\) | \(n\mapsto n-e_x+e_{x+\nu _r'-\nu _r}\) | \(n_x \kappa _r \left( {\begin{array}{c}x\\ \nu _r\end{array}}\right) \) |

where

$$\begin{aligned} \varphi (z,x):=\prod _{k=1}^d\left( {\begin{array}{c}z_k\\ x_k\end{array}}\right) 2^{-z_k} \end{aligned}$$

so that the distribution of the resulting compartments after a fragmentation is independently binomial in each species. Note that each row mentioning x or y corresponds to an infinite family of transitions and in the last row \(r\in \mathcal R\) also ranges over all reactions of the reaction network \(\mathcal I\).

Proof

The fact that N has finite support follows from the fact that \(F^{\textrm{sim}}\) is always a finite tuple, proved in Lemma 2.5.

The fact that N is Markovian with the rates given follows from consideration of the infinitesimal behavior of \(F^{\textrm{sim}}\). For example, for \(x\ne y \in \mathbb Z^d_{\ge 0}\),

$$\begin{aligned} \mathbb P(N(t+h) = n+e_{x+y}-e_x - e_y | N(t) = n)&= \kappa _C n_x n_y h + o(h), \hbox { as}\ h\rightarrow 0, \end{aligned}$$

since, to leading order, the probability that some compartment in state x merges with a compartment in state y in the time interval \([t,t+h)\) is \(\kappa _C n_x n_yh\). The other rows of the table follow similarly. \(\square \)
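The kernel \(\varphi \) of Lemma 2.7 is straightforward to evaluate numerically. The sketch below is our own (the function name is hypothetical); it also makes it easy to check that \(\varphi (z,\cdot )\) is a probability distribution over daughter states \(x\le z\):

```python
from math import comb

def phi(z, x):
    """The fragmentation kernel of Lemma 2.7: the probability that a
    compartment in state z produces a daughter in state x, binomial with
    parameter 1/2 in each of the d species independently."""
    if any(xk < 0 or xk > zk for zk, xk in zip(z, x)):
        return 0.0
    p = 1.0
    for zk, xk in zip(z, x):
        p *= comb(zk, xk) * 0.5 ** zk
    return p
```

That \(\varphi (z,\cdot )\) sums to one follows from \(\sum _{x_k=0}^{z_k}\left( {\begin{array}{c}z_k\\ x_k\end{array}}\right) =2^{z_k}\) in each species.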

Example 2.8

Consider the following possible compartment model:

[Figure: reaction network diagram for the RNIC of Example 2.8, with reactions \(0\rightarrow S\) (rate constant \(\kappa _b\)) and \(S\rightarrow 0\) (rate constant \(\kappa _d\)), compartment inflow, exit, and coagulation with rate constants \(\kappa _I,\kappa _E,\kappa _C\), and \(\mu = \tfrac{1}{2}\delta _5 + \tfrac{1}{2}\delta _{17}\)]

Here we are keeping track of some chemical S which forms with rate \(\kappa _b\) and degrades with rate \(\kappa _d\). Compartments are allowed to enter with rate \(\kappa _I\), and new compartments that enter this way have either 5 or 17 molecules of S, each with probability 1/2. Compartments can also exit with rate constant \(\kappa _E\), and merge (or coagulate) with rate constant \(\kappa _C\). Since there is only one species, the state space for the chemistry is \(\mathbb Z_{\ge 0}^1=\mathbb Z_{\ge 0}\). As we detail below, we will be assuming mass-action kinetics; in this case that means when the model is in state \(n\in \mathcal N\) the transition rates are given by

| Transition type | Transition | Rate |
| --- | --- | --- |
| Compartment inflow | \(n\mapsto n+e_5\) | \(\kappa _I/2\) |
| Compartment inflow | \(n\mapsto n+e_{17}\) | \(\kappa _I/2\) |
| Compartment exit | \(n\mapsto n-e_x\) | \(\kappa _En_x\) |
| Compartment coagulation (\(x\ne y\)) | \(n\mapsto n+e_{x+y}-e_x-e_y\) | \(\kappa _C n_xn_y\) |
| Compartment coagulation | \(n\mapsto n+e_{2x}-2e_x\) | \(\kappa _C \left( {\begin{array}{c}n_x\\ 2\end{array}}\right) \) |
| S birth | \(n\mapsto n-e_x+e_{x+1}\) | \(\kappa _b n_x\) |
| S death | \(n\mapsto n-e_x+e_{x-1}\) | \(\kappa _d n_x x\) |

As before, each row mentioning x or y corresponds to an infinite family of transitions, one for each choice of \(x, y\in \mathbb Z_{\ge 0}\) (with \(x\ne y\) where indicated), and as always \(e_x\) is the unit vector in direction x. \(\triangle \)
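A minimal Gillespie sketch of the coarse-grained chain of Example 2.8, maintaining n sparsely as a dictionary, may clarify the mechanics. This is our own illustration (the function and variable names are not from the paper); the event rates are exactly those listed for this example.

```python
import random

def simulate_example_2_8(kappa_I, kappa_E, kappa_C, kappa_b, kappa_d,
                         t_max, rng):
    """Gillespie sketch of the coarse-grained chain of Example 2.8.
    The state n is kept sparsely as a dict mapping x (the number of S
    molecules) to n_x; keys with n_x = 0 are dropped."""
    n = {}
    t = 0.0

    def add(x, k):
        n[x] = n.get(x, 0) + k
        if n[x] == 0:
            del n[x]

    while True:
        # Enumerate every possible transition with its current rate.
        events = [("in", kappa_I / 2, 5), ("in", kappa_I / 2, 17)]
        for x, cnt in n.items():
            events.append(("exit", kappa_E * cnt, x))
            events.append(("birth", kappa_b * cnt, x))
            events.append(("death", kappa_d * cnt * x, x))
        states = list(n)
        for i, x in enumerate(states):
            events.append(("coag", kappa_C * n[x] * (n[x] - 1) / 2, (x, x)))
            for y in states[i + 1:]:
                events.append(("coag", kappa_C * n[x] * n[y], (x, y)))
        total = sum(rate for _, rate, _ in events)
        if total == 0.0:
            break
        t += rng.expovariate(total)
        if t > t_max:
            break
        pick = rng.random() * total
        for name, rate, arg in events:
            pick -= rate
            if pick < 0.0:
                break  # this event fires (last event on round-off)
        if name == "in":
            add(arg, 1)
        elif name == "exit":
            add(arg, -1)
        elif name == "birth":
            add(arg, -1)
            add(arg + 1, 1)
        elif name == "death":
            add(arg, -1)
            add(arg - 1, 1)
        else:  # coagulation of a compartment in state x with one in state y
            x, y = arg
            add(x, -1)
            add(y, -1)
            add(x + y, 1)
    return n
```

Since the death rate \(\kappa _d n_x x\) vanishes at \(x=0\), molecule counts stay nonnegative throughout.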

3 Non-explosivity

A Markov chain is explosive if it can undergo infinitely many transitions in finite time. The formal definition is below (Norris 1997).

Definition 3.1

(Explosivity) Let \(\{X(t)\}_{t\ge 0}\) be a continuous-time Markov chain with countable state space \({\mathbb {S}}\). For each \(m\in \mathbb Z_{\ge 0}\), let \(\tau _m\) be the time of the m-th transition of X (formally, \(\tau _0=0\) and \(\tau _m=\inf \{t>\tau _{m-1}:X(t)\ne X(\tau _{m-1})\}\)), and let \(\tau _\infty = \lim _{m\rightarrow \infty } \tau _m\). We say that X explodes if \(\tau _\infty <\infty \). If there is some state \(x\in {\mathbb {S}}\) such that with positive probability X explodes when started in state x, we say that X is explosive.
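A standard illustration of Definition 3.1 (not from the paper): a pure-birth chain that jumps from m to \(m+1\) at rate \(m^2\) holds in state m for an independent \(\text {Exp}(m^2)\) time, so \(\mathbb E[\tau _\infty ]=\sum _{m\ge 1} m^{-2}<\infty \) and the chain is explosive. The partial sum is easy to check numerically:

```python
# A pure-birth chain with rate m^2 in state m holds there for an Exp(m^2)
# time, so the expected time to pass through every state is
# sum_{m >= 1} 1/m^2 = pi^2/6, which is finite: the chain explodes.
expected_explosion_time = sum(1.0 / m**2 for m in range(1, 100000))
```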

We will show that explosivity for the RNIC model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) is determined by explosivity for the internal reaction network \(\mathcal I_\mathcal K\). But to even talk about explosivity for \(\mathcal F\) instead of just the Markov chains \(F^{\textrm{sim}}\) or N, we need the following simple proposition.

Proposition 3.2

Suppose we have a RNIC \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\). Let \(F^{\textrm{sim}}\) and N be the corresponding simulation and coarse-grained representations. Then \(F^{\textrm{sim}}\) is explosive iff N is explosive.

Proof

This is immediate from Lemma 2.6, which says that \(F^{\textrm{sim}}\) and N transition at the same times. \(\square \)

In light of the proposition, we will speak merely of \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) being explosive, and check the explosivity of either \(F^{\textrm{sim}}\) or N depending on convenience. As it turns out, it will be most convenient to check explosivity for \(F^{\textrm{sim}}\). (Indeed, the fact that explosivity is more easily checked for \(F^{\textrm{sim}}\) is one of the major reasons for introducing \(F^{\textrm{sim}}\) in the first place.)

Theorem 3.3

Suppose we have a RNIC \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\). Then \(\mathcal F\) is explosive iff \(\mathcal I_\mathcal K\) is explosive.

Proof

First, suppose that \(\mathcal I_\mathcal K\) is explosive. As discussed above, we intend to show that \(F^{\textrm{sim}}\) is explosive. By assumption, there is some \(x\in \mathbb Z^d\) such that when the Markov chain corresponding to \(\mathcal I_\mathcal K\) is started in state x it explodes with positive probability. In particular, there is some finite (nonrandom) time t so that the chemistry undergoes infinitely many transitions before time t with positive probability. Start \(F^{\textrm{sim}}\) in the state with one compartment whose state is x. With positive probability, no compartment transitions happen before time t. But the compartment transition times are independent of what is happening inside them by construction, and the compartment evolves according to \(\mathcal I_\mathcal K\), so on the event that no compartment transition happens before time t the compartment undergoes infinitely many transitions before time t with positive probability. It follows that \(F^{\textrm{sim}}\) is explosive.

Conversely, suppose that \(\mathcal I_\mathcal K\) is not explosive. Note that \(\mathcal H\), the compartment network, is not explosive for any choice of rate constants (see e.g. Xu et al. 2022). So with probability one \(F^{\textrm{sim}}\) undergoes only finitely many compartment transitions in finite time. But between each pair of consecutive compartment transitions there are finitely many compartments each evolving according to \(\mathcal I_\mathcal K\), and by assumption each of these undergoes only finitely many reactions in finite time a.s. It follows that \(F^{\textrm{sim}}\) undergoes only finitely many transitions total in finite time, and hence is not explosive. \(\square \)

4 Transience, Recurrence, and Positive Recurrence

The following definitions are standard. For example, see Norris (1997).

Definition 4.1

Let M be a Markov chain with countable state space \({\mathbb {S}}\), and for \(x\in {\mathbb {S}}\) let \(T_x=\inf \{t>0:M_t=x\text { but }\exists s\in [0,t], M_s\ne x\}\) be the first time the process returns to x (or just arrives at x, if the process does not start from x). If \(\mathbb P_x(T_x<\infty )=1\), we say that the state x is recurrent, and if \(\mathbb E_x(T_x)<\infty \) we say that the state x is positive recurrent. A state which is not recurrent is called transient, and a recurrent state which is not positive recurrent is null recurrent. If \(\mathbb P_x(T_y<\infty )>0\) we say that y is reachable from x. If every state \(x\in \mathbb {S}\) is positive recurrent, null recurrent, or transient, we say M is positive recurrent, null recurrent, or transient, respectively.

A standard fact about (positive) recurrence is that it is a class property:

Proposition 4.2

(Theorems 3.4.1(iv) and 3.5.3(i)\(\iff \)(ii) in Norris (1997)) Suppose that y is reachable from x and x is recurrent (resp. positive recurrent). Then y is recurrent (resp. positive recurrent).

In other words, if you can get between x and y with positive probability (in both directions), then x and y are either both transient, both null recurrent, or both positive recurrent. So for irreducible chains (ones where you can pass between any two points of the state space with positive probability), the chain M is always positive recurrent, null recurrent, or transient.

Before proceeding with the theory, we summarize the results of this section with a table. The way to read Table 1 is as follows:

  • Suppose we have a RNIC \((\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\), and N is the associated coarse-grained model.

  • The top row indicates possible dynamics (transient, null recurrent, or positive recurrent) for \(\mathcal I_\mathcal K\), the chemical model, and the left column indicates possible dynamics for \(\mathcal H_\mathcal K\), the compartment model. Since the possible dynamics for N will turn out to depend crucially on whether the compartments can exit (\(\kappa _E>0\)) or not (\(\kappa _E=0\)), the left column is further subdivided along these lines.

  • Several cells are marked “Impossible", because \(\mathcal H_\mathcal K\) cannot be null recurrent if \(\kappa _E=0\).

  • The numbers inside each cell refer to the relevant theorems, lemmas, or examples that demonstrate the result.

Table 1 The possible dynamics for N, classified in terms of the dynamics for \(\mathcal H_\mathcal K\) and \(\mathcal I_\mathcal K\). In the above “NR” and “PR” stand for “null recurrent” and “positive recurrent”, respectively, whereas “Trans.” stands for “transient”

Note that in all cases where we give an example of a recurrent N, the example is actually positive recurrent. We suspect that null recurrent examples will also exist, but we felt it more interesting to cover the behavioral extremes.

Moving to our theory, we begin by considering the dynamics of the compartment model of Sect. 2.2, which takes the form of a relatively simple reaction network, namely,

$$\begin{aligned} 0 \xrightarrow {\kappa _I} C, \qquad C \xrightarrow {\kappa _E} 0, \qquad C \xrightarrow {\kappa _F} 2C, \qquad 2C \xrightarrow {\kappa _C} C. \end{aligned}$$
(8)

The (positive) recurrence of this model is already completely classified (Xu et al. 2022). We state this classification now as a lemma.

Lemma 4.3

Consider the CRN in (8).

  • Suppose \(\kappa _I=0\). Then 0 is an absorbing state. If some other rate constant is non-zero then all other states are transient, whereas if all four rate constants are zero then all states are absorbing.

  • Suppose \(\kappa _I>0\) and \(\kappa _E > 0\). The irreducible state space is \(\{0,1,2,\dots \}\) and:

    • If \(\kappa _C>0\), then the chain is positive recurrent.

    • If \(\kappa _C=0\) but \(\kappa _F<\kappa _E\), then the chain is positive recurrent.

    • If \(\kappa _C=0\) and \(\kappa _F>\kappa _E\), then the chain is transient.

    • If \(\kappa _C=0\) and \(\kappa _F=\kappa _E\), then either \(\kappa _I\le \kappa _E\) and the chain is null recurrent, or \(\kappa _I>\kappa _E\) and the chain is transient.

  • Suppose \(\kappa _I>0\) and \(\kappa _E = 0\). Then all statements remain the same as in the case \(\kappa _I>0\) and \(\kappa _E > 0\) except the irreducible state space is now \(\{1,2,\dots \}\) (and the state 0 is transient).
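The case analysis of Lemma 4.3 in the irreducible regimes (\(\kappa _I>0\)) can be encoded directly. The following Python helper is our own restatement of the lemma, not part of the paper:

```python
def classify(kappa_I, kappa_E, kappa_F, kappa_C):
    """Case analysis of Lemma 4.3 for the compartment network (8), in the
    irreducible regime kappa_I > 0.  (By the lemma, the kappa_E = 0 case
    has the same classification on the state space {1, 2, ...}.)"""
    assert kappa_I > 0
    if kappa_C > 0:
        return "positive recurrent"
    # From here on kappa_C == 0:
    if kappa_F < kappa_E:
        return "positive recurrent"
    if kappa_F > kappa_E:
        return "transient"
    # kappa_F == kappa_E:
    return "null recurrent" if kappa_I <= kappa_E else "transient"
```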

Now we begin with our positive results. The first fact is simple enough to be stated as a remark:

Remark 4.4

Notice that if N is the coarse-grained representation for \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) and n is a (positive) recurrent state for N, then the number of compartments in n, \(\left\Vert n\right\Vert _{\ell ^1}\), is a (positive) recurrent state for \(\mathcal H_\mathcal K\), since the return time to \(\left\Vert n\right\Vert _{\ell ^1}\) is bounded by the return time to n.

Said succinctly, if n is a positive recurrent state of the full model, then so is \(\left\Vert n\right\Vert _{\ell ^1}\) for the compartment model. One might hope that the converse would be true, and it turns out under relatively mild assumptions it is:

Theorem 4.5

Consider a non-explosive model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) where \(\kappa _E>0\), and let N be its coarse-grained representation. Then a state n is (positive) recurrent for N iff n is reachable from the empty state \(\textbf{0}\) for N and the state \(\left\Vert n\right\Vert _{\ell ^1}\) is (positive) recurrent for \(\mathcal H_\mathcal K\).

Proof

If \(\kappa _I=0\) the conclusions of the theorem are clear, since by Lemma 4.3 the state with no compartments is absorbing for both N and \(\mathcal H_\mathcal K\) and all other states are transient. From here on we assume \(\kappa _I>0\).

Let \(M_C=\left\Vert N\right\Vert _{\ell ^1}\) be the number of compartments; recall that \(M_C\) is a Markov chain which evolves according to \(\mathcal H_\mathcal K\). Suppose first that n is recurrent for N. By Remark 4.4, \(\left\Vert n\right\Vert _{\ell ^1}\) is recurrent for \(\mathcal H_\mathcal K\). Since \(\kappa _E>0\) and \(\kappa _I>0\), by Lemma 4.3, \(\mathcal H_\mathcal K\) is irreducible, so \(\mathcal H_\mathcal K\) eventually hits zero with probability one when started from \(\left\Vert n\right\Vert _{\ell ^1}\). But when \(M_C\) hits zero, \(N=\textbf{0}\). Since n is recurrent for N, it must be that N eventually returns to state n after hitting state \(\textbf{0}\). This proves that n is reachable from \(\textbf{0}\) for N.

Now suppose that n is reachable from \(\textbf{0}\) and the state \(\left\Vert n\right\Vert _{\ell ^1}\) is positive recurrent (resp. recurrent) for \(\mathcal H_\mathcal K\). Since \(\mathcal H_\mathcal K\) is irreducible as in the previous paragraph, it follows that zero is positive recurrent (resp. recurrent) for \(\mathcal H_\mathcal K\). But \(N=\textbf{0}\) exactly when \(M_C\) is 0, so \(\textbf{0}\) is positive recurrent (resp. recurrent) for N. But positive recurrence (resp. recurrence) is a class property and by assumption n is reachable from \(\textbf{0}\), so we conclude that n is positive recurrent (resp. recurrent) for N, as desired. \(\square \)

The same theorem holds, mutatis mutandis, for \(F^{\textrm{sim}}\). The proof is the same so we omit it.

Theorem 4.6

Consider a non-explosive model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) where \(\kappa _E>0\), and let \(F^{\textrm{sim}}\) be its simulation representation. Then a state \((x_1,\dots ,x_k)\) is (positive) recurrent for \(F^{\textrm{sim}}\) iff \((x_1,\dots ,x_k)\) is reachable from the empty vector () for \(F^{\textrm{sim}}\) and the state k is (positive) recurrent for \(\mathcal H_\mathcal K\).

Remark 4.7

Theorems 4.5 and 4.6 hold under more general assumptions. Note that the key idea of both is that 0 is (positive) recurrent for \(\mathcal H_\mathcal K\). Hence, one can generalize to the situation in which \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) has non-mass action kinetics for either \(\mathcal I_\mathcal K\) or \(\mathcal H_\mathcal K\), so long as the system is non-explosive and 0 is (positive) recurrent for \(\mathcal H_\mathcal K\).

4.1 Lyapunov Functions

In what follows we will need to make use of the theory of Lyapunov functions for Markov chains. This short section is devoted to introducing the extent of the theory we will use.

The following theorem is well-known. In full generality, it is due to Meyn and Tweedie (1993). The version below is a specialization to the countable state space case. For a proof of the version given below, see the later paper (Anderson et al. 2020).

Theorem 4.8

Let X be a continuous-time Markov chain on a countable state space \({\mathbb {S}}\) with generator \({\mathcal {L}}\). Suppose there exists a finite set \(K\subset {\mathbb {S}}\) and a positive function V on \({\mathbb {S}}\) such that

$$\begin{aligned} {\mathcal {L}}{\mathcal {V}}(x)\le -1 \end{aligned}$$

for all \(x\in {\mathbb {S}}\setminus K\). Suppose further that V is “norm-like,” in the sense that \(\{x\in {\mathbb {S}}:V(x)<B\}\) is finite for every \(B>0\). Then each state in a closed, irreducible component of \({\mathbb {S}}\) is positive recurrent. Moreover, if \(\tau _{x_0}\) is the time for the process to enter the union of the closed irreducible components given an initial condition \(x_0\), then \(\mathbb E_{x_0}[\tau _{x_0}]<\infty \).
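To apply Theorem 4.8 in practice one computes \({\mathcal {L}}{\mathcal {V}}(x)=\sum _y q(x,y)(V(y)-V(x))\) over the outgoing transitions of x. The small generic helper below is ours, applied to an illustrative birth-death chain that is not from the paper:

```python
def drift(V, transitions, x):
    """(L V)(x) = sum over outgoing transitions of rate * (V(y) - V(x)).
    `transitions(x)` returns a list of (rate, next_state) pairs."""
    return sum(rate * (V(y) - V(x)) for rate, y in transitions(x))

# Illustrative birth-death chain (not from the paper): arrivals at rate 1,
# service at rate 3 from every nonzero state.  V(x) = x is norm-like and
# gives drift 1 - 3 = -2 <= -1 off the finite set K = {0}.
def queue_transitions(x):
    pairs = [(1.0, x + 1)]
    if x > 0:
        pairs.append((3.0, x - 1))
    return pairs
```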

We will also need the following, which provides a method to check for transience.

Theorem 4.9

Let X be a non-explosive continuous-time Markov chain on a countable discrete state space \({\mathbb {S}}\) with generator \(\mathcal L\). Let \(B\subset {\mathbb {S}}\), and let \(\tau _B\) be the time for the process to enter B. Suppose there is some bounded function V such that for all \(x\in B^c\),

$$\begin{aligned} {\mathcal {L}}{\mathcal {V}}(x)\ge 0. \end{aligned}$$

Then \(\mathbb P_{x_0}(\tau _B<\infty )<1\) for any \(x_0\) such that

$$\begin{aligned} \sup _{x\in B}V(x)<V(x_0). \end{aligned}$$

For a version of the theorem above that applies in much greater generality, see Theorem 3.3(i) in Tweedie (1994). Our theorem is not an immediate corollary of theirs (they define restricted versions of the chain X and state their theorem in terms of the generators of the restricted processes), so we will provide a proof of Theorem 4.9 in the appendix.

4.2 Instructive Examples

We now consider some examples. The first is an application of Theorem 4.5, and the rest show the various ways the conclusion of the theorem can fail if the hypothesis \(\kappa _E>0\) is not satisfied. These examples also serve to illustrate various techniques that are useful for analysing recurrence and transience of RNIC models. In Example 4.13, positive recurrence for the RNIC is shown via a Lyapunov function, applying Theorem 4.8. In Example 4.21, transience for the RNIC is shown via a Lyapunov function, applying Theorem 4.9. And in Example 4.23, transience for the RNIC is shown with the help of the construction of \(F^{\textrm{sim}}\) given in Sect. 2.3.2.

In the following, any rate constants not specified are assumed to be positive.

Example 4.10

Consider the following RNIC.

[Figure: reaction network diagram for the RNIC of Example 4.10, with compartment input distribution \(\mu =\delta _0\)]

where \(\delta _0\) is the point mass at zero (so each compartment enters empty). Even though \(\mathcal I_\mathcal K\) is transient, by Theorem 4.5 the empty state is positive recurrent for N. Any state where every compartment has an even number of S molecules is reachable from the empty state, hence positive recurrent. Any state where any compartment has an odd number of S molecules is not reachable from the empty state, hence transient. \(\triangle \)

In all of the remaining examples in this section, we have \(\kappa _E=0\) and hence the state 0 will be transient for \(\mathcal H_\mathcal K\). Hence, when discussing the properties of the model we restrict ourselves to the state space \(\mathcal N\setminus \{0\}\) that does not include the state with zero compartments.

The case where \(\kappa _E=0\) is more complicated than the \(\kappa _E\ne 0\) case. For one thing, it is no longer enough just to look at \(\mathcal H_\mathcal K\) to decide if all states are transient. Indeed, if Example 4.10 is modified so that \(\kappa _E=0\) then every state becomes transient, despite the fact that all states are positive recurrent for the compartment network \(\mathcal H_\mathcal K\):

Example 4.11

Consider the model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) described by

[Figure: reaction network diagram for the RNIC model (9), identical to Example 4.10 except that \(\kappa _E=0\)]

where \(\delta _0\) is again the point mass at zero.

We reiterate that this is exactly the same as the previous example but with \(\kappa _E\) set to zero. However, that is enough to make every state transient for \(\mathcal F\):

Proposition 4.12

In the RNIC model (9), \(\mathcal I_\mathcal K\) is transient, \(\mathcal H_\mathcal K\) is positive recurrent on the irreducible state space \(\{1,2,\dots \}\), and N (the coarse-grained model corresponding to \(\mathcal F\)) is transient.

Proof

Except for the zero-compartment state (which cannot be returned to), all states are positive recurrent for \(\mathcal H_\mathcal K\) by Lemma 4.3. However, the total number of S molecules across all compartments can never shrink, and grows with some positive rate (at least \(\kappa _b\), and larger if there are more compartments), so all states are transient for N. \(\square \)

Thus we see that, in this example, the long-term behavior of \(\mathcal H_\mathcal K\) and the coarse-grained model N are different. \(\triangle \)

The above example shows that when \(\kappa _E=0\) and \(\mathcal I_\mathcal K\) is transient, \(\mathcal F\) may be transient even if \(\mathcal H_\mathcal K\) is not. However, this need not always be the case. Below we have an example that demonstrates that, when \(\kappa _E=0\) and \(\mathcal I_\mathcal K\) is transient, it is still possible for \(\mathcal F\) to be positive recurrent.

Example 4.13

Consider the model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) described by

[Figure: reaction network diagram for the RNIC model (10), with compartment input distribution \(\mu =\delta _{(0,1)}\)]

where \(\delta _{(0,1)}\) is a point mass with zero A molecules and one B molecule. We will show that the chemical model \(\mathcal I_\mathcal K\) is transient but that the coarse-grained model, N, is positive recurrent. Intuitively, this can be understood in the following manner: B should be thought of as an enzyme that degrades the substrate A. Without the compartment model, the enzyme would simply disappear over time, and then the substrate would grow without bound (from the reaction \(0 \rightarrow A\)). However, each compartment brings in a new enzyme allowing for the further degradation of A.

Proposition 4.14

In the RNIC model (10), \(\mathcal I_\mathcal K\) is transient, \(\mathcal H_\mathcal K\) is positive recurrent on the irreducible state space \(\{1,2,\dots \}\), and N (the coarse-grained model corresponding to \(\mathcal F\)) is positive recurrent.

Proof

\(\mathcal H_\mathcal K\) is positive recurrent by Lemma 4.3. \(\mathcal I_\mathcal K\) is transient by the discussion above.

It just remains to check positive recurrence of N. For \(n\in \mathcal N\), let \(C(n)=\left\Vert n\right\Vert _{\ell ^1}=\sum _{a=0}^\infty \sum _{b=0}^\infty n_{(a,b)}\) denote the number of compartments, and let \(A(n)=\sum _{a=0}^\infty \sum _{b=0}^\infty an_{(a,b)}\) and \(B(n)=\sum _{a=0}^\infty \sum _{b=0}^\infty bn_{(a,b)}\) be the total number of A and B molecules, respectively, across all compartments. Define \(V:\mathcal N\rightarrow [0,\infty )\) via

$$\begin{aligned} V(n)={\left\{ \begin{array}{ll} A(n)+B(n)+5C(n)-1 &{} B(n)\ne 0\\ A(n)+B(n)+5C(n)+7 &{} B(n)=0. \end{array}\right. } \end{aligned}$$

We claim that this is a Lyapunov function for N. An upper bound for \({\mathcal {L}}{\mathcal {V}}(n)\), the generator applied to V at n, is given by

$$\begin{aligned} {\mathcal {L}}{\mathcal {V}}(n)\le {\left\{ \begin{array}{ll} -B(n)+7-15C(n)(C(n)-1) &{} B(n)\ge 2\text { and }C(n)\ge 2\\ 14-15C(n)(C(n)-1) &{} B(n)=1\text { and }C(n)\ge 2\\ -1-15C(n)(C(n)-1) &{} B(n)=0\\ -A(n)(A(n)-1)B(n)-B(n)+7 &{} B(n)\ge 2\text { and }C(n)=1\\ -A(n)(A(n)-1)+14 &{} B(n)=1\text { and }C(n)=1 \end{array}\right. } \end{aligned}$$

Note that the first two rows are upper bounds and the last three rows are exact. Specifically, in the first two rows we neglected the contribution of the \(2A+B\rightarrow B\) reaction — unlike everything else it crucially depends on how the A and B molecules are distributed across the compartments.

We see that \({\mathcal {L}}{\mathcal {V}}(n)\le -1\) for all n outside a finite set of states—for instance, you could take the states where there is exactly one compartment and it has at most 7 B and at most 4 A. So V is indeed a Lyapunov function for N, and hence N is positive recurrent by Theorem 4.8. \(\square \)

In the previous example we saw that even when \(\kappa _E=0\), positive recurrent compartments \(\mathcal H_\mathcal K\) can still tame transient chemistry \(\mathcal I_\mathcal K\). It should not be surprising, then, that positive recurrent compartments can tame null recurrent chemistry in the same manner. For the sake of filling in Table 1 completely, we present a modification of Example 4.13 where \(\mathcal I_\mathcal K\) is null recurrent instead of transient.

Example 4.15

Consider the model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) described by

figure j

where \(\delta _{(0,1)}\) is a point mass with zero A molecules and one B molecule.

The verification of this example is similar enough to that of Example 4.13 that we provide only a sketch.

Proposition 4.16

In the RNIC model (11), \(\mathcal I_\mathcal K\) is null recurrent on the irreducible state space \(\{0,1,2,\dots \}\times \{0\}\), \(\mathcal H_\mathcal K\) is positive recurrent on the irreducible state space \(\{1,2,\dots \}\), and N (the coarse-grained model corresponding to \(\mathcal F\)) is positive recurrent.

Proof Sketch

Similarly to Example 4.13, \(\mathcal H_\mathcal K\) is positive recurrent, and \(\mathcal I_\mathcal K\) eventually reduces (after all the B molecules degrade) to the network

figure k

This model is null recurrent by Lemma 4.3.

As for N, let V be the very same Lyapunov function used to prove positive recurrence in Example 4.13. The only difference between this example and that one is the addition of the reactions \(A\rightarrow 0\) and \(A\rightarrow 2A\). But notice that the contribution of \(A\rightarrow 0\) in \({\mathcal {L}}{\mathcal {V}}(n)\) is \(-A(n)\), and the contribution of \(A\rightarrow 2A\) is A(n). These are equal and opposite, so \({\mathcal {L}}{\mathcal {V}}(n)\) is exactly the same in this example and Example 4.13. Thus the remainder of the proof is identical. \(\square \)

Examples 4.11 and 4.13 showed that \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) can be either positive recurrent or transient when \(\kappa _E=0\) and \(\mathcal I_\mathcal K\) is transient. The next few examples are dedicated to showing the same when \(\mathcal I_\mathcal K\) is recurrent. First, if new compartments enter with a huge number of molecules, it can overwhelm otherwise positive recurrent chemistry:

Example 4.17

Consider the RNIC model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) described by

figure l

where \(\mu \) is not yet specified.

Proposition 4.18

Let N be the coarse-grained model associated with the RNIC model (12). For any choice of non-negative rate constants such that \(\kappa _I>0\), there is a distribution \(\mu \) on the non-negative integers such that N is transient.

Proof

We will show that in the case \(\kappa _b = 0\), \(\mu \) can be chosen so that the total number of S molecules is itself a transient Markov chain. The case of \(\kappa _b>0\) then immediately follows by a coupling argument. That portion of the proof is straightforward and is omitted.

Let M(t) denote the number of S molecules across all compartments at time t. Under the assumption that \(\kappa _b=0\), M is a Markov chain which transitions from state \(m\in \mathbb N\) to state \(m-1\) with rate \(\kappa _d m\) and to state \(m+j\) with rate \(\kappa _I\mu (j)\).

Our plan is the following: we will recursively define an increasing sequence of integers \(m_k\) for \(k=1,2,3,\dots \), and define \(\mu (m_k)=2^{-k}\) and \(\mu (j)=0\) otherwise. For \(k=2,3,4,\dots \), we will let \(A_k\) denote the event that the process M reaches \(m_{k-1}\) before it reaches (or exceeds) \(m_{k+1}\). It then suffices to show that \(\sup _k \mathbb P_{m_k}(A_k)<1/2\) to prove transience of M.

We begin by letting \(m_1 = 0\). Now suppose \(m_1,\dots ,m_{k-1}\) have been defined. We will show that for any \(\varepsilon >0\) it is possible to pick \(m_k\) so that \(\mathbb P_{m_k}(A_k)<\varepsilon \) regardless of the values chosen for \(m_{k+1},m_{k+2},\dots \). To show this, we make the following observations.

  1.

    Since M can only go down by one at a time, to get from \(m_k\) to \(m_{k-1}\) before hitting a state equal to or larger than \(m_{k+1}\), the process must visit every state \(m_k,m_k-1,\cdots ,m_{k-1}+1\) at least once.

  2.

    On the event \(A_k\), during each visit to each of the states \(m_{k-1}+1,\dots , m_k\) there was no transition of size \(+m_{k+1}\) (for in that case the state of M would necessarily reach or exceed \(m_{k+1}\)).

The probability of the process M transitioning up by \(m_{k+1}\) while in state m is \(\frac{2^{-(k+1)}\kappa _I}{\kappa _I+\kappa _dm}\) because the total rate out of state m is \(\kappa _I+\kappa _dm\), and the rate of inflows of size \(m_{k+1}\) in state m is \(\mu (m_{k+1})\kappa _I=2^{-k-1}\kappa _I\). Hence, combining the above observations we see

$$\begin{aligned} \mathbb P_{m_k}(A_k)&\le \prod _{m=m_{k-1}+1}^{m_k}\left( 1-\frac{2^{-(k+1)}\kappa _I}{\kappa _I+\kappa _dm}\right) \\&\le \prod _{m=m_{k-1}+1}^{m_k}\exp \left( -\frac{2^{-(k+1)}\kappa _I}{\kappa _I+\kappa _dm}\right) =\exp \left( -2^{-(k+1)}\kappa _I\sum _{m=m_{k-1}+1}^{m_k}\frac{1}{\kappa _I+\kappa _dm}\right) , \end{aligned}$$

where above we use the bound \(1-x\le e^{-x}\).

If \(m_{k-1}\) is fixed and we send \(m_k\rightarrow \infty \), the sum above diverges (it is a tail of a harmonic series). Therefore, \(\mathbb P_{m_k}(A_k)\) can be made as small as we like by choosing \(m_k\) big enough. We conclude that for an appropriate choice of the sequence \(m_k\), the process M is transient, and hence so is N. \(\square \)
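The recursive choice of the thresholds \(m_k\) can be made concrete. The sketch below (illustrative only; the rate constants and target probabilities are arbitrary) solves the displayed bound for \(m_k\) in closed form, using the integral comparison \(\sum _{m=m_{k-1}+1}^{m_k}\frac{1}{\kappa _I+\kappa _dm}\ge \frac{1}{\kappa _d}\log \frac{\kappa _I+\kappa _d(m_k+1)}{\kappa _I+\kappa _d(m_{k-1}+1)}\):

```python
import math

def next_threshold(m_prev, k, kappa_I, kappa_d, eps):
    """Choose m_k so the displayed bound on P_{m_k}(A_k) falls below eps.

    Bounding the harmonic-type sum below by an integral, the bound is at most
    exp(-c * log(ratio)) with c = 2^{-(k+1)} * kappa_I / kappa_d and
    ratio = (kappa_I + kappa_d*(m_k+1)) / (kappa_I + kappa_d*(m_prev+1)),
    so it suffices that ratio >= eps^{-1/c}; we solve that for m_k.
    """
    c = 2.0 ** (-(k + 1)) * kappa_I / kappa_d
    ratio = eps ** (-1.0 / c)
    return math.ceil(((kappa_I + kappa_d * (m_prev + 1)) * ratio - kappa_I) / kappa_d)

kappa_I = kappa_d = 1.0
ms = [0]  # m_1 = 0, as in the proof
for k in range(2, 6):
    # aim for P_{m_k}(A_k) < 2^{-k}
    ms.append(next_threshold(ms[-1], k, kappa_I, kappa_d, 2.0 ** (-k)))
```

The resulting thresholds grow roughly doubly exponentially in k, reflecting how heavy-tailed \(\mu \) must be to produce transience.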

Hence, so long as \(\kappa _E=0\), a distribution \(\mu \) that is “bad enough” can cause the whole model to be transient even if the chemical model \(\mathcal I_\mathcal K\) is positive recurrent.

In the previous example, the distribution \(\mu \) of incoming compartments was unbounded. As it turns out, \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) can be transient even when \(\mathcal I_\mathcal K\) and \(\mathcal H_\mathcal K\) are positive recurrent and \(\mu \) is bounded. The simplest, though not only, reason this can occur is the existence of some conservation law, as the next example demonstrates. Put simply, the total amount of species A and B is preserved by the chemistry, so any inflow of those species, no matter how small, will overwhelm it.

Example 4.19

Consider the RNIC model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) described by

figure m

where \(\mu \) is not yet specified.

Proposition 4.20

Let N be the coarse-grained model associated with the system \(\mathcal F\) from (13). If \(\mu \) is any measure on \(\mathbb Z_{\ge 0}^2\) other than the trivial measure \(\delta _{(0,0)}\), then N is transient even though all states are positive recurrent for \(\mathcal I_\mathcal K\). On the other hand, if \(\mu =\delta _{(0,0)}\) then N is positive recurrent.

Proof

\(\mathcal I_\mathcal K\) is not irreducible, but when it is partitioned into closed irreducible communicating classes, each class is finite, and hence all states are positive recurrent. As always when \(\kappa _E=0\) but \(\kappa _C>0\), the empty state is transient for \(\mathcal H_\mathcal K\) but all other states are positive recurrent.

For \(n\in \mathcal N\), let \(S(n)=\sum _{a=0}^\infty \sum _{b=0}^\infty (a+b)n_{(a,b)}\) denote the sum of the number of A and B molecules, combined across all compartments in n.

First suppose that \(\mu \ne \delta _{(0,0)}\). Then S(N(t)) is non-decreasing, and it increases with positive probability every time a compartment enters. Since compartments continue to enter the system for all time, \(S(N(t))\rightarrow \infty \) almost surely, and hence every state is visited only finitely often. So N is transient in this case.

Now suppose \(\mu =\delta _{(0,0)}\). For \(n\in \mathcal N\), let \(C(n)=\left\Vert n\right\Vert _{\ell ^1}\) be the number of compartments in state n, and let \(V(n)=2C(n)\). Then

$$\begin{aligned} {\mathcal {L}}{\mathcal {V}}(n)=2\kappa _I+2\kappa _FC(n)-\kappa _CC(n)(C(n)-1), \end{aligned}$$

where \({\mathcal {L}}\) is the generator of N. This is less than \(-1\) outside a finite set because it is quadratic in C(n) with negative leading term, provided we restrict the state space to \(\{n\in \mathcal N: S(n)=S(N(0))\}\). So Theorem 4.8 applies and N is positive recurrent, as claimed. \(\square \)
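As a quick numerical illustration of this quadratic bound (with placeholder rate constants \(\kappa _I=\kappa _F=\kappa _C=1\), which are not specified by the example):

```python
def LV(C, kI, kF, kC):
    """Generator of N applied to V(n) = 2*C(n): inflow at rate kI and
    fragmentation at rate kF*C each add a compartment (V jumps by +2), while
    coagulation at rate kC*C*(C-1)/2 removes one (V jumps by -2)."""
    return 2 * kI + 2 * kF * C - kC * C * (C - 1)
```

With these constants, LV(C, 1, 1, 1) <= -1 once C >= 4, so the exceptional set is finite after the restriction to \(\{n\in \mathcal N: S(n)=S(N(0))\}\).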

A natural question at this point is whether, if the behaviors in the last two examples are ruled out, N can still be transient when \(\mathcal I_\mathcal K\) and \(\mathcal H_\mathcal K\) are both separately recurrent. Specifically, if \(\mathcal I_\mathcal K\) and \(\mathcal H_\mathcal K\) are both recurrent, there are no conservation laws, and the number of molecules that an incoming compartment can have is bounded, can N be transient? The answer is yes, as the next example demonstrates.

Example 4.21

Consider the RNIC model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) described by

figure n

where \(\delta _1\) is the point mass at one S.

Proposition 4.22

Let N be the coarse-grained model associated to the network \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) from (14). Then \(\mathcal I_\mathcal K\) is recurrent with no conservation laws and the number of molecules in new compartments is bounded, however every state is transient for N.

Proof

\(\mathcal I_\mathcal K\) is (null) recurrent, and \(\mathcal H_\mathcal K\) is positive recurrent on the irreducible state space \(\{1,2,\dots \}\), by Lemma 4.3.

It remains to show that every state is transient for N. As in all examples with \(\kappa _E=0\), the state with zero compartments can never be returned to and we restrict the state space of the chain to \(\mathcal N\setminus \{0\}\). With this assumption the state space is a closed irreducible set, so it suffices to pick one state and show that it is transient. We will show \(e_0\) (the state with one empty compartment) is transient. Denoting a state of N by n, let \(C(n)=\sum _{x=0}^\infty n_x\) and \(S(n)=\sum _{x=0}^\infty x\cdot n_x\) denote the total number of compartments and S molecules, respectively. Define \(V:\mathcal N\rightarrow [0,1]\) by

$$\begin{aligned} V(n)=\frac{S(n)}{1+S(n)}. \end{aligned}$$

If \({\mathcal {L}}\) denotes the generator of N, notice that

$$\begin{aligned} {\mathcal {L}}{\mathcal {V}}(n)&=(C(n)+S(n)+1)\left( \frac{S(n)+1}{S(n)+2}-\frac{S(n)}{S(n)+1}\right) \\&\qquad +S(n)\left( \frac{S(n)-1}{S(n)}-\frac{S(n)}{S(n)+1}\right) \\&=\frac{C(n)+S(n)+1}{(S(n)+2)(S(n)+1)}-\frac{1}{S(n)+1}\\&=\frac{C(n)-1}{(S(n)+2)(S(n)+1)}\\&\ge 0 \end{aligned}$$

for all \(n\in \mathcal N\setminus \{0\}\). In particular, if \(B=\{e_0\}\), we can apply Theorem 4.9 to conclude that when N is started from \(e_0+e_1\) (the state with two compartments, one empty and the other with one S), then the probability of reaching B is less than 1. But when N is started from \(e_0\), it reaches \(e_0+e_1\) with positive probability (the transition from \(e_0\) to \(e_0+e_1\) corresponds to an inflow event). Putting these together, when N is started from \(e_0\) it fails to return with positive probability, and hence \(e_0\) is transient. As discussed, this is enough to conclude that all states are transient for N. \(\square \)
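The algebraic simplification in the display can be double-checked with exact rational arithmetic. The sketch below reads the total up-rate \(C(n)+S(n)+1\) and down-rate \(S(n)\) directly off the display (a sanity check of the computation, not part of the proof) and verifies the closed form for \(S(n)\ge 1\):

```python
from fractions import Fraction

def V(S):
    return Fraction(S, S + 1)

def LV_direct(C, S):
    """Sum of rate * (V(new) - V(old)) over transitions of N, as in the
    display: moves raising the total S count by one have total rate
    C + S + 1, degradations lower it at total rate S, and transitions that
    change only the compartment count leave V unchanged."""
    up = (C + S + 1) * (V(S + 1) - V(S))
    down = S * (V(S - 1) - V(S)) if S >= 1 else 0
    return up + down

def LV_closed(C, S):
    """The simplified form derived in the proof (for S >= 1)."""
    return Fraction(C - 1, (S + 2) * (S + 1))
```

For \(S(n)=0\) the degradation rate vanishes, and the direct computation gives \((C(n)+1)/2\ge 0\), so the claimed sign \({\mathcal {L}}{\mathcal {V}}(n)\ge 0\) on \(\mathcal N\setminus \{0\}\) holds there as well.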

In the previous example \(\mathcal I_\mathcal K\) was null recurrent. One may still be tempted to think that perhaps if it were positive recurrent then the whole process must be. The next example demonstrates that even this is not guaranteed.

Example 4.23

Consider the compartment model described by

figure o

where m is some non-negative integer and \(\delta _{(m,0)}\) is the point mass at m molecules of A and zero of B. Let \(\gamma >0\) denote the expected number of compartments in stationarity.

Proposition 4.24

Let \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) be the compartment model from (15), and let N be the associated coarse-grained model. Then \(\mathcal I_\mathcal K\) is positive recurrent, but N is transient when \(m > \gamma \).

Proof

That \(\mathcal I_\mathcal K\) is positive recurrent is witnessed by the Lyapunov function

$$\begin{aligned} V(a,b)={\left\{ \begin{array}{ll} 3a+3 &{} b=0\\ 3a+3b-2 &{} b\ge 1 \end{array}\right. } \end{aligned}$$

Indeed, if \({\mathcal {A}}\) denotes the generator of \(\mathcal I_\mathcal K\), then

$$\begin{aligned} {\mathcal {A}} V(a,b)&={\left\{ \begin{array}{ll} 3(1)-2(2) &{} b=0\\ 3(1)+3(2)-1(10a) &{} b=1\\ 3(1)+3(2)-6(20a)-1(10) &{} b=2\\ 3(1)+3(2)-6(10ab)-6(5b(b-1)) &{} b\ge 3 \end{array}\right. }\\&={\left\{ \begin{array}{ll} -1 &{} b=0\\ 9-10a &{} b=1\\ -1-120a &{} b=2\\ 9-60ab-30b(b-1) &{} b\ge 3 \end{array}\right. } \end{aligned}$$

This is at most \(-1\) for every state other than (0, 1), so by Theorem 4.8, \(\mathcal I_\mathcal K\) is positive recurrent.
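A brute-force tabulation of the resulting case analysis (a sanity check only) confirms that these values are at most \(-1\) everywhere except at the state (0, 1):

```python
def AV(a, b):
    """Piecewise value of the generator of I_K applied to V, as computed in
    the display above."""
    if b == 0:
        return -1
    if b == 1:
        return 9 - 10 * a
    if b == 2:
        return -1 - 120 * a
    return 9 - 60 * a * b - 30 * b * (b - 1)
```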

Now regarding transience of N, let \(F^{\textrm{sim}}\) be the simulation representation of \(\mathcal F\), so that N is a function of \(F^{\textrm{sim}}\). Let \(X_A\) and \(X_B\) denote the total number of A and B molecules, respectively, across all compartments in N (equivalently, across all compartments in \(F^{\textrm{sim}}\)). To show that N is transient, we will show that \(X_A(t)\rightarrow \infty \) a.s. as \(t\rightarrow \infty \). To do this, we will make use of the construction of \(F^{\textrm{sim}}\) from Sect. 2.3.2. Let \(Y_I\) and \(Y_C\) be as in that section, so that the process \(M_C\) for the number of compartments is given by

$$\begin{aligned} M_C(t)&= M_C(0) + Y_I\left( t\right) - Y_C\left( \int _0^t \frac{M_C(s) (M_C(s) - 1)}{2} ds\right) . \end{aligned}$$
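In this representation inflow is driven by a unit-rate process and coagulation fires at rate \(M_C(M_C-1)/2\), so the compartment count is a birth-death chain whose stationary mean \(\gamma \) can be computed from detailed balance. A numerical sketch (the truncation level is an arbitrary choice):

```python
import math

def gamma_mean(max_c=60):
    """Stationary mean of the compartment-count chain: births at rate 1,
    deaths at rate c*(c-1)/2. Detailed balance on the irreducible set
    {1, 2, ...} gives pi(c) proportional to 2^(c-1) / ((c-1)! * c!)."""
    cs = range(1, max_c)
    weights = [2.0 ** (c - 1) / (math.factorial(c - 1) * math.factorial(c)) for c in cs]
    Z = sum(weights)
    return sum(c * w for c, w in zip(cs, weights)) / Z
```

This gives \(\gamma \approx 1.78\), so by Proposition 4.24 compartments entering with \(m\ge 2\) molecules of A already suffice for transience.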

Similarly, for \(r\in \{A+B\rightarrow 0,0\rightarrow B,2B\rightarrow 0,0\rightarrow A\}\) let \(Y_r\) be as in Sect. 2.3.2, and let \(R_r\) be the associated counting process for the number of times reaction r has occurred across all compartments, so that

$$\begin{aligned} R_r(t) = Y_r \left( \mathop {\sum _{i\ge 0}}_{T_i\le t}\int _{T_i}^{T_{i+1}\wedge t}\sum _{j=1}^{M_C(T_i)} \lambda _r(X^i_j(s)) ds\right) , \end{aligned}$$

where the \(T_i\) are the jump times of the process \(M_C\), \(X_j^i(s)\) is the state of the process in compartment j at time s, and \(\lambda _r\) is given as in (2). Then

$$\begin{aligned} X_A(t)&=X_A(0)+R_{0\rightarrow A}(t)+mY_I\left( t\right) -R_{A+B\rightarrow 0}(t)\\&=X_A(0)+Y_{0\rightarrow A}\left( \int _0^t M_C(s) ds\right) +mY_I\left( t\right) -R_{A+B\rightarrow 0}(t). \end{aligned}$$

Notice that in the last line above we were able to simplify the expression for \(R_{0\rightarrow A}\) in terms of \(Y_{0\rightarrow A}\) from the expression given above for \(R_r\) in general. This was done by making use of the fact that the total rate of this reaction across all compartments, \(\sum _j \lambda _{0\rightarrow A}(X_j^i(s))\), is exactly the total number of compartments \(M_C(s)\). We cannot hope to do the same for \(R_{A+B\rightarrow 0}\) because the rate of that reaction depends on how the molecules are distributed across the compartments. However, notice that the total number of times the reaction \(A+B\rightarrow 0\) fires is at most the total number of B molecules ever present in the system:

$$\begin{aligned}&R_{A+B\rightarrow 0}(t) \le X_B(0)+R_{0\rightarrow B}(t)\\&\quad =X_B(0)+Y_{0\rightarrow B}\left( 2\int _0^t M_C(s) ds\right) . \end{aligned}$$

Therefore,

$$\begin{aligned} X_A(t)&\ge X_A(0)-X_B(0)\\&\quad +Y_{0\rightarrow A}\left( \int _0^t M_C(s) ds\right) +mY_I\left( t\right) -Y_{0\rightarrow B}\left( 2\int _0^t M_C(s) ds\right) . \end{aligned}$$

Recall that \(\gamma \) denotes the expected number of C in the CRN \(\mathcal H_\mathcal K\) at stationarity. By the CTMC ergodic theorem (see Theorem 45 in Chapter 4 of Serfozo (2009)), \(\frac{1}{t}\int _0^t M_C(s)ds\rightarrow \gamma \) almost surely as \(t\rightarrow \infty \). This will matter in its own right; it also follows that \(\int _0^t M_C(s)ds\rightarrow \infty \) a.s. as \(t\rightarrow \infty \). It is a standard fact about unit-rate Poisson processes Y that \(Y(t)/t\rightarrow 1\) a.s. as \(t\rightarrow \infty \). Composing this Poisson limit with the limit from the previous sentence, we get that

$$\begin{aligned} \frac{Y_{0\rightarrow B}\left( 2\int _0^tM_C(s)ds\right) }{2\int _0^tM_C(s)ds}\rightarrow 1 \end{aligned}$$

a.s. as \(t\rightarrow \infty \), and similarly for \(Y_{0\rightarrow A}\). Putting this all together we have

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{X_A(t)}{t}&\ge \lim _{t\rightarrow \infty }\Bigg [ \frac{Y_{0\rightarrow A}\left( \int _0^tM_C(s)ds\right) }{\int _0^tM_C(s)ds}\cdot \frac{1}{t}\int _0^tM_C(s)ds+m\frac{Y_I(t)}{t}\\&\hspace{1in}-\frac{Y_{0\rightarrow B}\left( 2\int _0^tM_C(s)ds\right) }{2\int _0^tM_C(s)ds}\cdot \frac{2}{t}\int _0^tM_C(s)ds\Bigg ]\\&=\gamma +m-2\gamma . \end{aligned}$$

almost surely. Therefore, as long as the integer m is strictly larger than \(\gamma \), \(X_A(t)/t\) is eventually bounded below by a positive constant almost surely. In this case \(X_A(t)\rightarrow \infty \) a.s. as \(t\rightarrow \infty \), and hence N is transient. \(\square \)

Note that the above example shows the potential usefulness of the RNIC representation provided in Sect. 2.3.2.

5 Stationary Distribution in a Special Case

In light of Theorem 4.5, whenever \(\mathcal H_\mathcal K\) is positive recurrent and \(\kappa _E>0\), then N, the coarse-grained model associated to \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\), is positive recurrent for at least some states. In this case, the standard theory of Markov chains tells us that there is a stationary distribution supported on those states. Ideally, it would be possible to write down a formula for this stationary distribution in terms of information about the CRNs \(\mathcal I_\mathcal K\) and \(\mathcal H_\mathcal K\). Under the further assumption that \(\kappa _C=0=\kappa _F\) (so that compartments are not interacting), we are able to do so.

Theorem 5.1

Consider a non-explosive model \(\mathcal F=(\mathcal I_\mathcal K,\mathcal H_\mathcal K,\mu )\) with \(\kappa _F=\kappa _C=0\), and \(\kappa _E>0\):

figure p

Let N be the coarse-grained model associated to \(\mathcal F\). For \(x\in \mathbb Z_{\ge 0}^d\) and \(t\in [0,\infty )\), let \(P_\mu (x,t)\) denote the probability that \(\mathcal I_\mathcal K\) is in state x at time t when started from time zero with initial distribution \(\mu \). For \(x\in \mathbb Z_{\ge 0}^d\) define \(\alpha (x)\) via

$$\begin{aligned} \alpha (x)=\int _0^\infty P_\mu (x,t) \kappa _E e^{-\kappa _E t}\mathrm dt, \end{aligned}$$

and define a distribution \(\pi \) on \(\mathcal N\) via

$$\begin{aligned} \pi (n)=\left( \prod _{x\in \mathbb Z_{\ge 0}^d}\frac{\alpha (x)^{n_x}}{n_x!}\right) \cdot \left[ e^{-\kappa _I/\kappa _E}\cdot \left( \frac{\kappa _I}{\kappa _E}\right) ^{\left\Vert n\right\Vert _{\ell ^1}}\right] \end{aligned}$$

Then \(\pi \) is the unique stationary distribution for N.
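In words, \(\pi \) makes the counts \(n_x\) independent Poisson random variables with means \((\kappa _I/\kappa _E)\alpha (x)\): the number of compartments is \({\text {Poisson}}(\kappa _I/\kappa _E)\) and, given that number, the contents are i.i.d. draws from \(\alpha \). A numerical sanity check that \(\pi \) sums to one, using a made-up two-point \(\alpha \) and rate ratio:

```python
import math

def pi_prob(n, alpha, rho):
    """pi(n) from the theorem, with rho = kappa_I/kappa_E and n a dict
    mapping compartment contents x to the count n_x."""
    prod = 1.0
    for x, nx in n.items():
        prod *= alpha[x] ** nx / math.factorial(nx)
    return prod * math.exp(-rho) * rho ** sum(n.values())

# Made-up two-point alpha and rate ratio, purely for illustration.
alpha = {0: 0.3, 1: 0.7}
rho = 2.0
total = sum(pi_prob({0: i, 1: j}, alpha, rho)
            for i in range(40) for j in range(40))
```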

Remark 5.2

To apply Theorem 5.1, one needs to know not just the stationary distribution for the chemistry but the distribution for all time. This requirement may seem daunting, and indeed for many models this distribution is not known. One class of models where it is known is the class of DR models of Anderson et al. (2020). A second class consists of monomolecular reaction networks with arbitrary initial conditions; see Jahnke and Huisinga (2007). Note that Anderson et al. (2020) allows for more general networks (all monomolecular networks satisfy the DR condition), while Jahnke and Huisinga (2007) allows for more general initial conditions (the DR paper requires Poisson initial conditions).

Proof of Theorem 5.1

Note that by Theorem 4.5, any state which is reachable from the zero state is positive recurrent, and all other states are transient. Furthermore, notice that N is irreducible if restricted to the set of states which are reachable from the zero state, since zero is reachable from any state. Thus there is a unique stationary distribution. To prove that the \(\pi \) given above is indeed this unique stationary distribution, it suffices to show that \(\pi \) is a distribution and \(\pi Q=0\), where Q is the transition rate matrix for N. That \(\pi \) is a distribution follows from the fact that \(\alpha \) is a distribution, which we will check later in the proof. So fix \(n\in \mathcal N\); we wish to show that \(\sum _{n'\in \mathcal N}\pi (n') q(n',n)=0\).

Note that there are only three possible types of transitions: inflow of compartment, outflow of compartment, and transition of reaction network. Expanding the sum above into three terms, one for each of these types of transitions, the desired equality can be written

$$\begin{aligned}&\sum _{x\in \mathbb Z_{\ge 0}^d}\Bigg [\pi (n-e_x)q(n-e_x,n)+\pi (n+e_x)q(n+e_x,n)\\&\qquad +\sum _j \pi (n-e_x+e_{x-\nu '_j+\nu _j})q(n-e_x+e_{x-\nu '_j+\nu _j},n)\Bigg ]\\&\quad =\pi (n)\sum _{x\in \mathbb Z_{\ge 0}^d}\left( q(n,n+e_x)+q(n,n-e_x)+\sum _j q(n,n-e_x+e_{x+\nu '_j-\nu _j})\right) \end{aligned}$$

or

$$\begin{aligned}&\sum _{x\in \mathbb Z_{\ge 0}^d}\Bigg [\pi (n-e_x)\kappa _I\mu (x)+\pi (n+e_x)\kappa _E (n_x+1)\nonumber \\&\qquad +\sum _j \pi (n-e_x+e_{x-\nu '_j+\nu _j})(n_{x-\nu '_j+\nu _j}+1)\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \Bigg ]\nonumber \\&\quad =\pi (n)\sum _{x\in \mathbb Z_{\ge 0}^d}\left( \kappa _I\mu (x)+\kappa _En_x+\sum _j n_x\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) \right) \end{aligned}$$
(16)

To prove this equality, we will consider two cases. Suppose first that n is such that \(n_y>0\) for some \(y\in \mathbb Z_{\ge 0}^d\) with \(\alpha (y)=0\), and fix such a y. Then \(\alpha (y)\) participates in the product defining \(\pi (n)\), and hence \(\pi (n)=0\). Thus the right-hand side of (16) is zero; we claim that the left-hand side is also zero. Specifically, we will argue that, for each x and each j, each of the three terms in the sum is zero. So fix x and j:

  • \(\pi (n-e_x)\kappa _I\mu (x)\): Notice that if \(x\ne y\) then \(\pi (n-e_x)=0\) for the same reason that \(\pi (n)=0\). If \(x=y\) then \(\mu (x)=0\), since if \(\mu (y)>0\) it would be the case that \(P_\mu (y,t)>0\) for all small enough t, and hence the integral defining \(\alpha (y)\) would be positive.

  • \(\pi (n+e_x)\kappa _E (n_x+1)\): Regardless of x, \(\pi (n+e_x)=0\) for the same reason that \(\pi (n)=0\).

  • \(\pi (n-e_x+e_{x-\nu '_j+\nu _j})(n_{x-\nu '_j+\nu _j}+1)\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \): As before, if \(x\ne y\) then \(\pi (n-e_x+e_{x-\nu '_j+\nu _j})=0\). Suppose towards a contradiction that \(\pi (n-e_y+e_{y-\nu '_j+\nu _j})\ne 0\) and that \(\kappa _j\left( {\begin{array}{c}y-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \ne 0\). Then \(\pi (n-e_y+e_{y-\nu '_j+\nu _j})\ne 0\) implies that \(\alpha (y-\nu '_j+\nu _j)\ne 0\), and hence \(P_\mu (y-\nu '_j+\nu _j,t)\ne 0\) for some t. But this means that the state \(y-\nu '_j+\nu _j\) is reachable for \(\mathcal I\) when started with initial distribution \(\mu \). But \(\kappa _j\left( {\begin{array}{c}y-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \ne 0\) implies that y is reachable from \(y-\nu '_j+\nu _j\) for \(\mathcal I\) via the j-th reaction, so we conclude that y is reachable from \(\mu \). But this implies that \(P_\mu (y,t)\ne 0\) for \(t>0\), which in turn means that \(\alpha (y)>0\). This contradicts our choice of y, so it must be that our assumption was wrong: either \(\pi (n-e_y+e_{y-\nu '_j+\nu _j})=0\) or \(\kappa _j\left( {\begin{array}{c}y-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) =0\). But either of those imply the desired equality \(\pi (n-e_y+e_{y-\nu '_j+\nu _j})(n_{y-\nu '_j+\nu _j}+1)\kappa _j\left( {\begin{array}{c}y-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) =0\).

This proves that (16) reduces to \(0=0\) in this case. The remainder of the proof will be devoted to the second case; namely, the case where n is such that \(n_y=0\) for all \(y\in \mathbb Z_{\ge 0}^d\) with \(\alpha (y)=0\).

Let \(\mathbb X=\{x\in \mathbb Z_{\ge 0}^d:\alpha (x)\ne 0\}\). We claim that for every \(x\notin \mathbb X\) and every j, every summand in (16) is zero. So fix \(x\notin \mathbb X\) and j:

  • \(\pi (n-e_x)\kappa _I\mu (x)\): Since \(\alpha (x)=0\), by choice of n we have \(n_x=0\). But this means that \(n-e_x\) is negative at x and hence \(n-e_x\notin \mathcal N\), so \(\pi (n-e_x)=0\).

  • \(\pi (n+e_x)\kappa _E (n_x+1)\): Notice that \(\alpha (x)=0\) participates in the product defining \(\pi (n+e_x)\), and hence \(\pi (n+e_x)=0\).

  • \(\pi (n-e_x+e_{x-\nu '_j+\nu _j})(n_{x-\nu '_j+\nu _j}+1)\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \): As before, \(n-e_x+e_{x-\nu '_j+\nu _j}\notin \mathcal N\) and hence \(\pi (n-e_x+e_{x-\nu '_j+\nu _j})=0\).

  • \(\kappa _I\mu (x)\): Since \(\alpha (x)=0\), it must be the case that \(\mu (x)=0\), as otherwise \(P_\mu (x,t)\) would be positive for sufficiently small t.

  • \(\kappa _En_x\): Since \(\alpha (x)=0\), by choice of n we have \(n_x=0\).

  • \(n_x\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) \): Once again, \(n_x=0\).

Thus we have shown that terms with \(x\notin \mathbb X\) do not contribute to (16). So to complete the proof, we have only to show that

$$\begin{aligned}&\sum _{x\in \mathbb X}\Bigg [\pi (n-e_x)\kappa _I\mu (x)+\pi (n+e_x)\kappa _E (n_x+1)\nonumber \\&\qquad +\sum _j \pi (n-e_x+e_{x-\nu '_j+\nu _j})(n_{x-\nu '_j+\nu _j}+1)\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \Bigg ]\nonumber \\&\quad =\pi (n)\sum _{x\in \mathbb X}\left( \kappa _I\mu (x)+\kappa _En_x+\sum _j n_x\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) \right) . \end{aligned}$$
(17)

Let \(x\in \mathbb X\) be arbitrary. Integration by parts gives

$$\begin{aligned}&\int _0^\infty \left( \frac{\mathrm d}{\mathrm dt}P_\mu (x,t)\right) \kappa _E e^{-\kappa _E t}\mathrm dt \\&=\kappa _E e^{-\kappa _E t}P_\mu (x,t)\Big |_{t=0}^{t=\infty }-\int _0^\infty P_\mu (x,t)(-\kappa _E^2 e^{-\kappa _E t})\mathrm dt\\&=-\kappa _E\mu (x)+\kappa _E\alpha (x). \end{aligned}$$
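To illustrate the identity, consider the simplest possible chemistry, pure degradation \(S\rightarrow 0\) at unit rate started from a single molecule, so that \(P_\mu (1,t)=e^{-t}\), \(\mu (1)=1\), and \(\alpha (1)=\kappa _E/(1+\kappa _E)\). The quadrature check below uses an arbitrary value of \(\kappa _E\):

```python
import math

def quad(f, T=50.0, steps=100000):
    """Composite trapezoid rule on [0, T]; T is large enough that the
    exponentially decaying tail is negligible."""
    h = T / steps
    return h * (0.5 * (f(0.0) + f(T)) + sum(f(i * h) for i in range(1, steps)))

kE = 0.7
P = lambda t: math.exp(-t)        # P_mu(1, t) for pure degradation from one S
dP = lambda t: -math.exp(-t)      # d/dt P_mu(1, t)

lhs = quad(lambda t: dP(t) * kE * math.exp(-kE * t))
alpha1 = quad(lambda t: P(t) * kE * math.exp(-kE * t))
rhs = -kE * 1.0 + kE * alpha1     # -kE*mu(1) + kE*alpha(1)
```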

Because \(P_\mu \) is the distribution for \(\mathcal I_\mathcal K\), the Kolmogorov forward equations for \(\mathcal I_\mathcal K\) tell us that

$$\begin{aligned} \frac{\mathrm d}{\mathrm dt}P_\mu (x,t)&=\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) P_\mu (x-\nu _j'+\nu _j,t) -\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) P_\mu (x,t) \end{aligned}$$

for each t. Plugging this in above and rearranging yields

$$\begin{aligned} \begin{aligned}\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x-\nu '_j{+}\nu _j\\ \nu _j\end{array}}\right) \alpha (x-\nu _j'+\nu _j) {-}\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) \alpha (x)&=-\kappa _E\mu (x)+\kappa _E\alpha (x)\\\kappa _E\frac{\mu (x)}{\alpha (x)}+\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \frac{\alpha (x-\nu _j'+\nu _j)}{\alpha (x)}&=\kappa _E+\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) . \end{aligned} \end{aligned}$$

Note that we did not divide by zero in the second line because \(\alpha (x)\ne 0\) by definition of \(\mathbb X\). Since \(x\in \mathbb X\) was arbitrary, we can multiply through by \(n_x\) and sum over x, which yields

$$\begin{aligned}&\sum _{x\in \mathbb X}\bigg (n_x\kappa _E\frac{\mu (x)}{\alpha (x)}+n_x\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \frac{\alpha (x-\nu _j'+\nu _j)}{\alpha (x)}\bigg )\nonumber \\&\quad =\sum _{x\in \mathbb X}\bigg (n_x\kappa _E+n_x\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) \bigg ). \end{aligned}$$
(18)

Now we claim that \(\mu \) and \(\alpha \) are both probability measures supported on \(\mathbb X\). We know that \(\mu \) is a probability measure by assumption; it is supported on \(\mathbb X\) because if \(\mu (x)>0\) then \(P_\mu (x,t)>0\) for small enough t and hence \(\alpha (x)>0\). We know that \(\alpha \) is supported on \(\mathbb X\) by definition of \(\mathbb X\); to see that it is a probability measure, use the fact that the integrand in the definition of \(\alpha \) is non-negative to interchange a sum over x with the integral and then apply the fact that \(P_\mu (x,t)\) is a probability measure for each t. Therefore \(\mu \) and \(\alpha \) are both probability measures supported on \(\mathbb X\), as claimed; it follows that \(\sum _{x\in \mathbb X} \kappa _I \mu (x)=\kappa _I=\sum _{x\in \mathbb X}\kappa _I\alpha (x)\). So adding \(\kappa _I\) to both sides of (18) gives

$$\begin{aligned}&\sum _{x\in \mathbb X}\bigg (n_x\kappa _E\frac{\mu (x)}{\alpha (x)}+\kappa _I\alpha (x)+n_x\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) \frac{\alpha (x-\nu _j'+\nu _j)}{\alpha (x)}\bigg )\nonumber \\&\quad =\sum _{x\in \mathbb X}\bigg (\kappa _I\mu (x)+n_x\kappa _E+n_x\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) \bigg ). \end{aligned}$$
(19)

Now notice that, directly from the definition of \(\pi \), we have

$$\begin{aligned} \frac{\pi (n-e_x)}{\pi (n)}&=\frac{n_x}{\alpha (x)}\frac{\kappa _E}{\kappa _I}\\ \frac{\pi (n+e_x)}{\pi (n)}&=\frac{\alpha (x)}{n_x+1}\frac{\kappa _I}{\kappa _E}\\ \frac{\pi (n-e_x+e_{x-\nu _j'+\nu _j})}{\pi (n)}&=\frac{\alpha (x-\nu _j'+\nu _j)}{\alpha (x)}\frac{n_x}{n_{x-\nu _j'+\nu _j}+1}, \end{aligned}$$

where the last equality holds for each reaction \(\nu _j\rightarrow \nu _j'\). Applying these three in order on the left-hand side of (19), we get

$$\begin{aligned}&\sum _{x\in \mathbb X}\bigg (\kappa _I\mu (x)\frac{\pi (n-e_x)}{\pi (n)}+\kappa _E(n_x+1)\frac{\pi (n+e_x)}{\pi (n)}\\&\qquad +\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) (n_{x-\nu _j'+\nu _j}+1)\frac{\pi (n-e_x+e_{x-\nu _j'+\nu _j})}{\pi (n)}\bigg )\\&\quad =\sum _{x\in \mathbb X}\bigg (\kappa _I\mu (x)+n_x\kappa _E+n_x\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) \bigg )\\&\sum _{x\in \mathbb X}\bigg (\kappa _I\mu (x)\pi (n-e_x)+\kappa _E(n_x+1)\pi (n+e_x)\\&\qquad +\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x-\nu '_j+\nu _j\\ \nu _j\end{array}}\right) (n_{x-\nu _j'+\nu _j}+1)\pi (n-e_x+e_{x-\nu _j'+\nu _j})\bigg )\\&\quad =\pi (n)\sum _{x\in \mathbb X}\bigg (\kappa _I\mu (x)+n_x\kappa _E+n_x\sum _{\nu _j\rightarrow \nu _j'}\kappa _j\left( {\begin{array}{c}x\\ \nu _j\end{array}}\right) \bigg ), \end{aligned}$$

which is exactly the desired equality, (17). \(\square \)
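The three ratio identities follow directly from the product form of \(\pi \) and can be verified exactly. In the sketch below the state n, the distribution \(\alpha \), and the ratio \(\rho =\kappa _I/\kappa _E\) are made up, and the constant factor \(e^{-\kappa _I/\kappa _E}\) is dropped because it cancels in every ratio:

```python
import math
from fractions import Fraction

def pi_weight(n, alpha, rho):
    """pi(n) up to the constant factor exp(-kappa_I/kappa_E), with
    rho = kappa_I/kappa_E and n a dict from contents x to counts n_x."""
    w = Fraction(1)
    for x, nx in n.items():
        w *= alpha[x] ** nx / Fraction(math.factorial(nx))
    return w * rho ** sum(n.values())

alpha = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}
rho = Fraction(3, 2)   # stands in for kappa_I/kappa_E
n = {0: 2, 1: 1, 2: 3}

def minus(n, x):
    m = dict(n); m[x] -= 1; return m

def plus(n, x):
    m = dict(n); m[x] = m.get(x, 0) + 1; return m
```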

Let us now consider some examples of applying this result.

Example 5.3

Let \(\lambda \ge 0\), and consider the compartment system

figure q

Then the stationary distribution of the system is given by

$$\begin{aligned} \pi (n)=\left( \prod _{x=0}^\infty \frac{\alpha (x)^{n_x}}{n_x!}\right) \cdot \left[ e^{-\kappa _I/\kappa _E}\cdot \left( \frac{\kappa _I}{\kappa _E}\right) ^{\left\Vert n\right\Vert _{\ell ^1}}\right] , \end{aligned}$$

where \(\alpha (x)\) is

$$\begin{aligned} \begin{aligned}\int _0^\infty \exp \left\{ -(\lambda -\kappa _b/\kappa _d)e^{-\kappa _dt} -\kappa _b/\kappa _d\right\} \frac{((\lambda -\kappa _b/\kappa _d)e^{-\kappa _dt}+\kappa _b/\kappa _d)^x}{x!} \kappa _E e^{-\kappa _E t}\mathrm dt. \end{aligned} \end{aligned}$$

Proof

Check that the distribution

$$\begin{aligned} P_\lambda (x,t):=\exp \left\{ -(\lambda -\kappa _b/\kappa _d)e^{-\kappa _dt}-\kappa _b/\kappa _d\right\} \frac{((\lambda -\kappa _b/\kappa _d)e^{-\kappa _dt}+\kappa _b/\kappa _d)^x}{x!} \end{aligned}$$

is \({\text {Poisson}}(\lambda )\) at time \(t=0\) and satisfies

$$\begin{aligned} \frac{\mathrm d}{\mathrm dt}P_\lambda (x,t)&=\kappa _b P_\lambda (x-1,t)+\kappa _d(x+1) P_\lambda (x+1,t) -\kappa _b P_\lambda (x,t)-\kappa _dx P_\lambda (x,t) \end{aligned}$$

for each x and t, and apply Theorem 5.1. \(\square \)

In the previous example, notice that the expected value of \(\alpha \) is

$$\begin{aligned} \begin{aligned}\sum _{x=0}^\infty x\alpha (x)&=\int _0^\infty \sum _{x=0}^\infty \exp \left\{ -(\lambda -\kappa _b/\kappa _d)e^{-\kappa _dt}-\kappa _b/\kappa _d\right\} \\ {}&\qquad \frac{((\lambda -\kappa _b/\kappa _d)e^{-\kappa _dt}+\kappa _b/\kappa _d)^{x+1}}{x!} \kappa _E e^{-\kappa _E t}\mathrm dt\\ {}&=\int _0^\infty (\lambda -\kappa _b/\kappa _d)\kappa _Ee^{-(\kappa _d+\kappa _E)t}+\frac{\kappa _b\kappa _E}{\kappa _d}e^{-\kappa _E t}\mathrm dt\\ {}&=\frac{(\lambda -\kappa _b/\kappa _d)\kappa _E}{\kappa _d+\kappa _E}+\frac{\kappa _b}{\kappa _d}\\ {}&=\frac{\lambda \kappa _E+\kappa _b}{\kappa _d+\kappa _E}. \end{aligned} \end{aligned}$$
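This computation can be double-checked numerically: since \(P_\lambda (\cdot ,t)\) is Poisson with mean \(m(t)=(\lambda -\kappa _b/\kappa _d)e^{-\kappa _dt}+\kappa _b/\kappa _d\), the mean of \(\alpha \) is \(\int _0^\infty m(t)\kappa _E e^{-\kappa _E t}\mathrm dt\). The parameter values below are arbitrary:

```python
import math

def alpha_mean(lam, kb, kd, kE, T=60.0, steps=200000):
    """E[alpha] for Example 5.3 by the trapezoid rule: integrate the
    time-varying Poisson mean m(t) = (lam - kb/kd)*exp(-kd*t) + kb/kd
    against the exit-time density kE*exp(-kE*t) on [0, T]."""
    h = T / steps
    def f(t):
        m = (lam - kb / kd) * math.exp(-kd * t) + kb / kd
        return m * kE * math.exp(-kE * t)
    return h * (0.5 * (f(0.0) + f(T)) + sum(f(i * h) for i in range(1, steps)))
```

which agrees with \((\lambda \kappa _E+\kappa _b)/(\kappa _d+\kappa _E)\) to within quadrature error.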

This value for the mean matches Duso and Zechner (2020), where the same example is considered in section 2.A (see specifically their equation [20] and the following discussion). Note that in Duso and Zechner (2020), though the expected value of \(\alpha \) is calculated in general, an explicit formula for \(\alpha (x)\) (in their notation, \(P_\infty (x)\)) is given in only two cases. The first is the case where \(\lambda =\kappa _b/\kappa _d\), in which (in section S7.4 of their SI Appendix) they remark that \(\alpha \) is Poisson with mean \(\lambda \); this matches the formula we give above in Example 5.3. The second case they cover is the one where \(\kappa _d=0\). In that case they obtain

$$\begin{aligned} \alpha (x)=(1-\xi )\xi ^x e^{\lambda (1/\xi -1)}\frac{\Gamma (1+x,\lambda /\xi )}{x!}, \end{aligned}$$

where \(\xi =\kappa _b/(\kappa _b+\kappa _E)\) and \(\Gamma \) is the upper incomplete Gamma function. One can check that this agrees with our next example, Example 5.4, in the case where \(\mu \) is taken to be Poisson with parameter \(\lambda \), by applying the binomial theorem in our formula and then making a change of variable in the integral.
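This agreement can also be verified numerically. For \(\mu ={\text {Poisson}}(\lambda )\) and \(\kappa _d=0\), integrating Example 5.4's formula term by term using \(\int _0^\infty t^m e^{-(\kappa _b+\kappa _E)t}\,\mathrm dt=m!/(\kappa _b+\kappa _E)^{m+1}\) yields a finite series, which the sketch below compares against the closed form above; the incomplete Gamma function is expanded via the integer-shape identity \(\Gamma (1+x,a)=x!\,e^{-a}\sum _{k=0}^{x}a^k/k!\), and the rate values are hypothetical.

```python
import math

# Hypothetical rates; kappa_d = 0 and mu = Poisson(lam), as in the comparison above.
kb, kE, lam = 0.7, 0.9, 2.0
xi = kb / (kb + kE)

def alpha_closed(x):
    # (1 - xi) xi^x e^{lam(1/xi - 1)} Gamma(1+x, lam/xi) / x!, with the upper
    # incomplete Gamma expanded via Gamma(1+x, a) = x! e^{-a} sum_{k<=x} a^k / k!.
    a = lam / xi
    gamma_upper = math.factorial(x) * math.exp(-a) * sum(
        a**k / math.factorial(k) for k in range(x + 1)
    )
    return (1 - xi) * xi**x * math.exp(lam * (1 / xi - 1)) * gamma_upper / math.factorial(x)

def alpha_series(x):
    # Example 5.4's integral with kappa_d = 0 and mu = Poisson(lam), integrated
    # term by term using int_0^infty t^m e^{-(kb+kE)t} dt = m! / (kb+kE)^{m+1}.
    mu = lambda y: math.exp(-lam) * lam**y / math.factorial(y)
    return (1 - xi) * sum(xi**m * mu(x - m) for m in range(x + 1))

for x in range(8):
    assert abs(alpha_closed(x) - alpha_series(x)) < 1e-10
```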

The following example is interesting for a few reasons. First, the chemistry does not converge to any sort of stationary distribution, and yet the whole compartment model does. Second, notice that when \(\mu \) is not a Poisson distribution, \(P_\mu (x,t)\) is not a Poisson distribution in x for all t, unlike in the previous example or, more generally, the DR models from Remark 5.2. Third, as discussed above, it generalizes an example from Duso and Zechner (2020).

Example 5.4

Let \(\mu \) be a probability distribution on \(\mathbb Z_{\ge 0}\), and consider the compartment system

[Figure: the compartment system, with birth chemistry at rate \(\kappa _b\) inside each compartment, compartment inflow at rate \(\kappa _I\), and compartment exit at rate \(\kappa _E\).]

Then the stationary distribution of the system is given by

$$\begin{aligned} \pi (n)=\left( \prod _{x=0}^\infty \frac{\alpha (x)^{n_x}}{n_x!}\right) \cdot \left[ e^{-\kappa _I/\kappa _E}\cdot \left( \frac{\kappa _I}{\kappa _E}\right) ^{\left\Vert n\right\Vert _{\ell ^1}}\right] , \end{aligned}$$

where

$$\begin{aligned} \alpha (x)=\int _0^\infty e^{-\kappa _b t}\left( \sum _{m=0}^x\frac{\kappa _b^mt^m}{m!}\mu (x-m)\right) \kappa _E e^{-\kappa _E t}\mathrm dt. \end{aligned}$$

Proof

Check that the distribution

$$\begin{aligned} P_\mu (x,t):=e^{-\kappa _b t}\left( \sum _{m=0}^x\frac{\kappa _b^mt^m}{m!}\mu (x-m)\right) \end{aligned}$$

satisfies

$$\begin{aligned} \frac{\mathrm d}{\mathrm dt}P_\mu (x,t)&=\kappa _b P_\mu (x-1,t)-\kappa _b P_\mu (x,t), \end{aligned}$$

with initial condition \(P_\mu (x,0)=\mu (x)\), and apply Theorem 5.1. \(\square \)
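As in the previous example, this proof can be sanity-checked numerically. The sketch below uses a hypothetical birth rate and a deliberately non-Poisson \(\mu \), checks the initial condition, and verifies the pure-birth equation at a sample point via a centered finite difference.

```python
import math

# Hypothetical birth rate and a deliberately non-Poisson initial distribution mu,
# supported on {0, 1, 2}; values chosen only for this numerical check.
kb = 0.7
mu = {0: 0.5, 1: 0.2, 2: 0.3}

def P(x, t):
    """P_mu(x, t) from Example 5.4: mu convolved with Poisson(kb * t)."""
    return math.exp(-kb * t) * sum(
        kb**m * t**m / math.factorial(m) * mu.get(x - m, 0.0) for m in range(x + 1)
    )

# Initial condition P_mu(x, 0) = mu(x).
assert abs(P(2, 0.0) - mu[2]) < 1e-12

# Pure-birth equation d/dt P(x,t) = kb P(x-1,t) - kb P(x,t), checked at one
# sample point with a centered finite difference in t.
x, t, h = 3, 0.8, 1e-6
lhs = (P(x, t + h) - P(x, t - h)) / (2 * h)
rhs = kb * P(x - 1, t) - kb * P(x, t)
assert abs(lhs - rhs) < 1e-6
```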