Abstract
The classic Luria–Delbrück model for fluctuation analysis is extended to the case where the split instant distributions of cells are not i.i.d.: the lifetime of each cell is assumed to depend on its birth date. This model also takes into account cell deaths and non-exponentially distributed lifetimes. In particular, it is possible to consider subprobability distributions and to model non-exponential growth. The extended model leads to a family of probability distributions which depend on the expected number of mutations, the death probability of mutant cells, and the split instant distributions of normal and mutant cells. This is deduced from the Bellman–Harris integral equation, written for the birth date inhomogeneous case. A new theorem of convergence for the final mutant counts is proved, using an analytic method. Particular examples like the Haldane model or the case where hazard functions of the split-instant distributions are proportional are studied. The Luria–Delbrück distribution with cell deaths is recovered. A computation algorithm for the probabilities is provided.
1 Introduction
Mutation models are probabilistic descriptions of the growth of a population of cells, in which scarce mutations randomly occur. The first objective of these models is to study the distribution of the number of mutant cells at the end of the growth process. The classic mutation models can be interpreted as the result of the following three ingredients (Hamon and Ycart 2012):
1. a random number of mutations occurring with small probability among a large number of cell divisions. Due to the law of small numbers, the number of mutations approximately follows a Poisson distribution. The expectation of that distribution is the product of the mutation probability by the total number of divisions;
2. from each mutation, a clone of mutant cells growing during a random time. Due to exponential growth, most mutations occur close to the end of the process, and the developing time of a random clone has exponential distribution. The rate of that distribution is the relative fitness, i.e., the ratio of the growth rate of normal cells to that of mutants;
3. the number of mutant cells that any clone developing for a given time will produce. The distribution of this number depends on the modeling assumptions, in particular the lifetimes of mutants.
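The three ingredients can be sketched as a small sampler. This is a minimal illustration, not the authors' code: it assumes the classic case with exponential mutant lifetimes (so that a clone developing for a time d has a geometric size with parameter \(\mathrm{e}^{-d}\)) and time units in which mutants divide at rate 1. The names `sample_final_mutants`, `alpha` (expected number of mutations) and `rho` (relative fitness) are hypothetical.

```python
import math
import random


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_geometric(rng, p):
    """Inversion sampling of P(G = k) = p (1 - p)^(k - 1), k >= 1."""
    if p <= 0.0:
        raise ValueError("p must be positive")
    if p >= 1.0:
        return 1
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return 1 + int(math.log(u) / math.log1p(-p))


def sample_final_mutants(rng, alpha, rho):
    """One draw of the final mutant count (assumed classic setting)."""
    n_mutations = sample_poisson(rng, alpha)       # ingredient 1
    total = 0
    for _ in range(n_mutations):
        dev_time = rng.expovariate(rho)            # ingredient 2
        total += sample_geometric(rng, math.exp(-dev_time))  # ingredient 3
    return total
```

Under these assumptions the probability of observing no mutant is \(\mathrm{e}^{-\alpha}\), which gives a quick sanity check on simulated samples.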
One of the most used mutation models is the well-known Luria–Delbrück model (Luria and Delbrück 1943). Mathematical descriptions were introduced by Lea and Coulson (1949), followed by Armitage (1952) and Bartlett (1978). In that model, division times of mutant cells were supposed to be exponentially distributed. Thus, a clone develops according to a Yule process [see Yule (1925, p. 35), Athreya and Ney (1972, p. 109)], and its size at any given time follows a geometric distribution. The distribution of final mutant counts is also explicit when lifetimes of mutant cells are supposed to be constant. This latter model is called the Haldane model by Sarkar (1991); an explicit form of the asymptotic distribution is given by Ycart (2013). General lifetimes have also been studied by Ycart (2013), but no explicit distribution is available apart from the cases of exponential and constant lifetimes. Other extensions of the Luria–Delbrück model take into account the case where cells have a certain probability to die rather than divide (Angerer 2001, Sec. 3.1; Dewanji et al. 2005; Komarova et al. 2007; Ycart 2014), where the final number of cells is random (Angerer 2001; Komarova et al. 2007; Ycart and Veziris 2014), or where the cell divisions are asymmetric (Montgomery-Smith and Oveys 2016).
In the mutation models cited above, the lifetimes of the cells are supposed to be i.i.d., which is quite unrealistic. Indeed, during an experiment, a colony of cells grows in an environment which contains a finite amount of resources. Consider two instants \(s_1\) and \(s_2\) such that \(s_1\ll s_2\): a cell born at time \(s_1\) will complete its lifetime faster than a cell born at time \(s_2\). The Verhulst model (Verhulst 1838) is one of the best-known deterministic growth models which take this limitation into account. Logistic-type stochastic models are described by Allen (2010, Sec. 9.4.2) and mathematically studied by several authors, among which Tan (1986), Tan and Piantadosi (1991), Lambert (2005). The independence of lifetimes for single-type branching processes was questioned quite early (Kendall 1952). Experiments have evidenced correlation between a cell and its descendants, and between two sisters conditioning on their mother (Wang et al. 2010). The effects of these correlations and many models have been discussed since then: see Louhichi and Ycart (2015) and references cited therein.
When the lifetimes are i.i.d., the theory of branching processes (Bellman and Harris 1952; Athreya and Ney 1972) shows that the distribution of the total number of mutant cells converges, as the initial number of cells tends to infinity, to a heavy-tailed distribution. This convergence has also been proved by an analytic method by Bartlett (1978, Sec. 4.31), but only for the restrictive case where the fitness is set to 1 (equal growth rates for normal and mutant cells). Stewart et al. (1990) proposed an approach to take into account the decreasing rate of division as the cells run out of resources. Houchmandzadeh (2015) described a discrete formulation without assumption on the growth model for the mutant clones. However, no result for the case of non-i.i.d. lifetimes has been stated until now. In particular, there is no convergence result for the distribution of final mutant counts.
The main objective of this paper is to extend classic mutation models to the case where the split instant of a cell depends on its birth date. Cell deaths and non-exponential lifetimes are also taken into account with this approach. General modeling assumptions are described in Sect. 2. The main tool used in this paper is an extension of the Bellman–Harris integral equation (Bellman and Harris 1952), which is discussed in Sect. 3. General solutions are also provided. They are used in the asymptotic context (large observation time and small mutation probability) of mutation models in Sect. 4. Some examples, among which the Haldane model and a more general case (non-exponential growth), are provided. The convergence results are finally applied in Sect. 5 to the case where the hazard functions associated with the split instant distribution of normal and mutant cells are proportional. In particular, the Luria–Delbrück distribution with cell deaths, denoted here by LDD distribution (Ycart 2014), is recovered. A computation algorithm is proposed in Sect. 6.
2 Hypotheses and Models
In this section, the probabilistic model is defined as a tree-indexed process [see Pemantle (1995), Benjamini and Peres (1994) for general references]. Denote by \(\mathbb {T}\) the infinite complete binary tree and by 0 its root. The vertices of \(\mathbb {T}\) are interpreted as cells, and each vertex has exactly two daughters. If x is a vertex of \(\mathbb {T}\), the number of edges between the root 0 and x is denoted by |x|. The subtree of \(\mathbb {T}\) composed of a vertex x and all its descendants will be denoted by \(\mathbb {T}_x\). If x and y are two vertices of \(\mathbb {T}\), \(x\preccurlyeq y\) is the order relation that holds if x is in the path from 0 to y; \(x\prec y\) holds if x is strictly in the path from 0 to y; \(x\wedge y\) is the most recent common ancestor of x and y. The mother of the cell x is denoted by \(\overline{x}\): it is the cell such that \(\overline{x}\preccurlyeq x\) and \(|\overline{x}|=|x|-1\). Each cell produces two cells upon completion of its lifetime. Then a cell \(x_0\) which is not the root 0 has a sister cell \(x_1\). A vertex \(x_0\) and its sister \(x_1\) satisfy \(\overline{x_0}=\overline{x_1}=x_0\wedge x_1\).
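The tree notation above can be made concrete with a small sketch. The encoding is an assumption made for illustration only: a vertex is a tuple of bits recording the left/right choices on the path from the root, so the root 0 is the empty tuple. All function names are hypothetical.

```python
def depth(x):
    """|x|: number of edges between the root and x."""
    return len(x)


def mother(x):
    """The vertex just above x on the path to the root."""
    if not x:
        raise ValueError("the root has no mother")
    return x[:-1]


def sister(x):
    """The other daughter of the mother of x."""
    if not x:
        raise ValueError("the root has no sister")
    return x[:-1] + (1 - x[-1],)


def is_ancestor_or_equal(x, y):
    """True when x lies on the path from the root to y (x may equal y)."""
    return y[:len(x)] == x


def common_ancestor(x, y):
    """Most recent common ancestor: the longest common prefix."""
    prefix = []
    for a, b in zip(x, y):
        if a != b:
            break
        prefix.append(a)
    return tuple(prefix)
```

With this encoding, the identity \(\overline{x_0}=\overline{x_1}=x_0\wedge x_1\) becomes `mother(x) == mother(sister(x)) == common_ancestor(x, sister(x))`.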
The evolution of a clone stemming from a single cell at instant 0 will be modeled by a stochastic process \(\left( C_x\right) _{x\in \mathbb {T}}\) indexed by the binary tree \(\mathbb {T}\). For any \(x\in \mathbb {T}\), \(C_x\) is a couple \((B_x,T_x)\) where \(B_x\) describes the nature (dead, normal or mutant) of the cell x:
- \(B_x=0\) if the cell x is dead;
- \(B_x=1\) if the cell x is normal;
- \(B_x=2\) if the cell x is mutant;
and \(T_x\) is the instant at which the cell x completes its lifetime. The instant \(T_x\) will be called the final instant of x. Denote by \(\overline{\mathbb {R}}_+=\mathbb {R}_+\cup \{+\infty \}\) the extended real half-line, and by \(\mathcal {B}(\overline{\mathbb {R}}_+)\) its Borel \(\sigma \)-field. From the above settings, the stochastic process \(\left( C_x\right) _{x\in \mathbb {T}}\) is defined on the measurable space \((\Omega ,\mathcal {A})\), where \(\Omega =\{0,1,2\}\times \overline{\mathbb {R}}_+\) and its \(\sigma \)-algebra is \(\mathcal {A}=\mathcal {P}(\{0,1,2\})\times \mathcal {B}(\overline{\mathbb {R}}_+)\). The fact that \(T_x\) can be infinite will be discussed below. It remains to define a probability distribution on that space. Recall that the birth date of the root is set to 0. Assume that its nature \(B_0\) is known. Note that a dead cell has no descendants. In other words, if \(B_x=0\) then \(B_y=0\) for any \(y\in \mathbb {T}_x\). Let \(\pi \), \(\gamma \), \(\delta \) be real numbers in \((0\,{;}\,1)\), respectively, interpreted as the probability of mutation, of dying for a normal cell, and of dying for a mutant cell.
Consider a cell \(x_0\ne 0\) and its sister \(x_1\). Their nature \(B_{x_0}\) and \(B_{x_1}\) depend only on the nature \(B_{\overline{x_0}}\) of the mother cell:
- if \(B_{\overline{x_0}}=0\), then \(B_{x_0}=0\) and \(B_{x_1}=0\) with probability 1;
- if \(B_{\overline{x_0}}=1\), then:
  - \(B_{x_0}=1\) and \(B_{x_1}=2\) with probability \(\pi /2\);
  - \(B_{x_0}=2\) and \(B_{x_1}=1\) with probability \(\pi /2\);
  - \(B_{x_0}=0\) and \(B_{x_1}=0\) with probability \(\gamma \);
  - \(B_{x_0}=B_{x_1}=1\) with probability \(1-\pi -\gamma \);
- if \(B_{\overline{x_0}}=2\), then:
  - \(B_{x_0}=0\) and \(B_{x_1}=0\) with probability \(\delta \);
  - \(B_{x_0}=B_{x_1}=2\) with probability \(1-\delta \).
In other words, upon completion of its lifetime, any normal cell produces one normal and one mutant cell with probability \(\pi \) (this event is called a mutation), dies with probability \(\gamma \) or produces two normal cells with probability \(1-\pi -\gamma \). Upon completion of its lifetime, any mutant cell dies with probability \(\delta \) or produces two mutant cells with probability \(1-\delta \). Moreover, the events of death or mutation do not depend on the final instant of the cell.
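The division rules can be sketched as a sampler for the pair \((B_{x_0},B_{x_1})\) given the nature of the mother. This is an illustration under the rules just stated; the function name and the integer constants are hypothetical labels mirroring the values 0, 1, 2 of \(B_x\).

```python
import random

DEAD, NORMAL, MUTANT = 0, 1, 2


def sample_daughters(rng, mother_nature, pi, gamma, delta):
    """Draw the natures of the two sisters given the nature of their mother."""
    if mother_nature == DEAD:
        return (DEAD, DEAD)               # a dead cell has only dead descendants
    u = rng.random()
    if mother_nature == NORMAL:
        if u < pi / 2:
            return (NORMAL, MUTANT)       # mutation, second sister is the mutant
        if u < pi:
            return (MUTANT, NORMAL)       # mutation, first sister is the mutant
        if u < pi + gamma:
            return (DEAD, DEAD)           # death of the normal mother
        return (NORMAL, NORMAL)           # ordinary division
    if mother_nature == MUTANT:
        if u < delta:
            return (DEAD, DEAD)           # death of the mutant mother
        return (MUTANT, MUTANT)
    raise ValueError("unknown nature")
```

The draw uses a single uniform variable and does not look at any time variable, in line with the assumption that death and mutation events do not depend on the final instant of the cell.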
For any cell x (such that \(B_x\ne 0\)), its final instant \(T_x\) depends on its nature \(B_x\) and its birth date, i.e., on the split instant of its mother \(T_{\overline{x}}\): if \(B_x=1\) and \(T_{\overline{x}}=s\), the cumulative distribution function (cdf) of \(T_x\) is denoted by \(F_\nu (s,\cdot )\); if \(B_x=2\) and \(T_{\overline{x}}=s\), the cdf of \(T_x\) is denoted by \(F_\mu (s,\cdot )\). These cdfs satisfy \(F_\nu (s,t)=0\) and \(F_\mu (s,t)=0\) for \(t\leqslant s\). Moreover, the total number of cells is in practice bounded by the carrying capacity. It corresponds to the maximum sustainable population: the closer to this bound the number of cells, the slower the growth of the population. In other words, some cells do not produce descendants before the end of the growth process. Thus, for any cell x, the distribution of \(T_x\) may have a positive mass at infinity. This hypothesis requires the notion of subprobability measure on a measurable space \((\Omega \,,\,\mathcal {A})\), i.e., a measure \(\tilde{\eta }\) on \((\Omega \,,\,\mathcal {A})\) such that \(\tilde{\eta }(\Omega )\leqslant 1\). For more details about subprobability measures, see for example Nguyen (2006, p. 170). Consider a subprobability measure \(\tilde{\eta }\) on \((\mathbb {R}_+\,,\,\mathcal {B}(\mathbb {R}_+))\). Then the limit of its cdf \(\tilde{F}(t)\) as t tends to infinity is smaller than 1. Let us define for any \(a,b\in \mathbb {R}_+\) the following measure \(\overset{\centerdot }{\eta }\):
and
Since the set \(\left\{ ]a\,{;}\,b[\,{;}\,[0\,{;}\,b[\,{;}\,]a\,{;}\,+\infty ]\,|\,a,b\in \mathbb {R}_+\right\} \) is a topological basis of \(\overline{\mathbb {R}}_+\), Carathéodory's extension theorem can be applied to extend the measure \(\overset{\centerdot }{\eta }\) to \(\mathcal {B}(\overline{\mathbb {R}}_+)\). From the measure \(\overset{\centerdot }{\eta }\), the following probability measure \(\eta \) can be defined for any \(A\in \mathcal {B}(\overline{\mathbb {R}}_+)\):
with the associated cdf:
Remark that if \(\tilde{\eta }\) is a probability measure on \(\mathbb {R}_+\), then \(\eta \) is also a probability measure on \(\mathbb {R}_+\).
The stochastic process \((C_x)_{x\in \mathbb {T}}=(B_x,T_x)_{x\in \mathbb {T}}\) models the evolution of a clone stemming from a single cell at instant 0. Actually, the evolution of a clone stemming from a single cell y can be deduced from the above process. It consists of a stochastic process \((C_x)_{x\in \mathbb {T}_y}\) indexed by the binary tree \(\mathbb {T}_y\) with the same modeling assumptions as above, conditionally on \(C_y\).
It remains to define the dependence structure between the cells. Consider a cell \(x_0\ne 0\) and its sister \(x_1\). The final instants \(T_{x_0}\) and \(T_{x_1}\) are assumed to be independent conditionally on \(C_{\overline{x_0}}\). By extension, the clones \((C_y)_{y\in \mathbb {T}_{x_0}}\) and \((C_y)_{y\in \mathbb {T}_{x_1}}\) are independent conditionally on \(C_{\overline{x_0}}\). Consider now two cells \(x\ne 0\) and \(y\ne 0\) and their common ancestor \(x\wedge y\). Assume that their common ancestor is neither x nor y. Then only one of the daughter cells of \(x\wedge y\) is in the path from 0 to x, and its sister is in the path from 0 to y. Thus, according to the previous dependence assumption, the final instants \(T_x\) and \(T_y\) are independent conditionally on \(C_{x\wedge y}\). Therefore, the clones \((C_w)_{w\in \mathbb {T}_x}\) and \((C_w)_{w\in \mathbb {T}_y}\) are independent conditionally on \(C_{x\wedge y}\).
Thereafter, the root 0 is assumed to be a normal cell, i.e., \(B_0=1\). The model can be summarized as follows:
- at time 0, a single normal cell is present;
- the final instant of any cell depends on its nature and its birth date;
- the final instant of a normal cell born at time s is a random variable with cdf \(F_\nu (s,\cdot )\) defined on \(\overline{\mathbb {R}}_+\);
- upon completion of the lifetime of a normal cell:
  - with probability \(\pi \), one normal and one mutant cell are produced;
  - with probability \(\gamma \), the cell dies out;
  - with probability \(1-\gamma -\pi \), two normal cells are produced;
- the final instant of a mutant cell born at time s is a random variable with cdf \(F_\mu (s,\cdot )\) defined on \(\overline{\mathbb {R}}_+\);
- upon completion of the lifetime of a mutant cell:
  - with probability \(\delta \), the cell dies out;
  - with probability \(1-\delta \), two mutant cells are produced;
- for any cell, the events of death or mutation do not depend on its final instant;
- two cells, whatever their nature, are independent conditionally on their common ancestor;
- two clones are independent conditionally on the common ancestor of the two cells which started those clones.
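The summary above translates into a short simulation of one clone observed at a fixed time. This is a sketch under stated assumptions: the lifetime law is passed in as a callable `sample_final_instant(rng, nature, birth)` (a hypothetical interface standing for \(F_\nu (s,\cdot )\) and \(F_\mu (s,\cdot )\)), and the helper `exponential_lifetimes` only covers the homogeneous exponential case, chosen because it is easy to check.

```python
import math
import random

NORMAL, MUTANT = 1, 2


def exponential_lifetimes(rate_normal, rate_mutant):
    """Hypothetical sampler: lifetimes independent of the birth date."""
    rates = {NORMAL: rate_normal, MUTANT: rate_mutant}

    def sample_final_instant(rng, nature, birth):
        return birth + rng.expovariate(rates[nature])

    return sample_final_instant


def simulate_clone(rng, t_obs, pi, gamma, delta, sample_final_instant):
    """Return (normal, mutant) counts at time t_obs, from one normal cell at 0."""
    counts = {NORMAL: 0, MUTANT: 0}
    stack = [(NORMAL, 0.0)]                      # (nature, birth date)
    while stack:
        nature, birth = stack.pop()
        final = sample_final_instant(rng, nature, birth)
        if final > t_obs:                        # still alive at t_obs (final may be inf)
            counts[nature] += 1
            continue
        u = rng.random()                         # independent of the final instant
        if nature == NORMAL:
            if u < pi:
                daughters = (NORMAL, MUTANT)     # mutation
            elif u < pi + gamma:
                daughters = ()                   # death
            else:
                daughters = (NORMAL, NORMAL)
        else:
            daughters = () if u < delta else (MUTANT, MUTANT)
        for daughter in daughters:
            stack.append((daughter, final))      # daughters are born at the split instant
    return counts[NORMAL], counts[MUTANT]
```

Because daughters inherit the split instant of their mother as birth date, a sampler that depends on `birth` reproduces the birth-date inhomogeneity of the model without any change to `simulate_clone`.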
All the results will be given in terms of the bivariate probability generating functions (pgf) of the numbers of normal and mutant cells, and the pgf of the number of mutant cells. The next section is dedicated to the calculation of these functions.
3 Integral Equations for Probability Generating Functions
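Denote by \(\mathcal {N}(s,t)=(N_1(s,t),N_2(s,t))\) the couple of numbers of normal and mutant cells at time t in the clone stemming from a single normal cell born at time s. Its bivariate pgf is defined by
$$\begin{aligned} \varphi (y,z,s,t)=\mathbb {E}\left[ y^{N_1(s,t)}\,z^{N_2(s,t)}\right] \,. \end{aligned}\tag{2}$$
Note that \(\varphi (1,z,s,t)\) is the pgf of the number of mutant cells in the clone stemming from a normal cell born at time s. Denote by M(s, t) the number at time t of mutant cells in the clone stemming from a single mutant cell born at time s. Its pgf is defined by
$$\begin{aligned} \psi (z,s,t)=\mathbb {E}\left[ z^{M(s,t)}\right] \,. \end{aligned}\tag{3}$$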
One way to study these pgfs is to apply the well-known Bellman–Harris integral equation (Bellman and Harris 1952). However, this equation has been justified only for the case where the lifetimes are i.i.d. This section extends it to the case where the final instant of a cell depends on its birth date. These equations have already been stated for the case of i.i.d. lifetimes [see for example Kimmel and Axelrod (2002, Chap. 5)]. The case of multitype branching processes with non-i.i.d. lifetimes is studied here using similar arguments. It leads to equation (BHM). Applying (BHM) to (2) gives Eq. (4). Descriptions of this kind of process with i.i.d. lifetimes can be found in Athreya and Ney (1972, Chap. 5) or Kimmel and Axelrod (2002, Chap. 6). Consider a multitype branching process with a number \(l+1\) of cell types (including the “dead” type, indexed by 0). The model is the following:
- the final instant of a cell of type \(i>0\) born at time s is a random variable with cdf \(F_i(s,\cdot )\) defined on \(\overline{\mathbb {R}}_+\) such that \(F_i(s,t)=0\) if \(t\leqslant s\);
- upon completion of the lifetime of a cell of type \(i>0\):
  - for any \(j\in \{1;\dots ;l\}\), a random number \(K_{i,j}\) of cells of type j is produced;
  - with probability \(\gamma _i\), the cell dies out;
- two cells, whatever their nature, are independent conditionally on their common ancestor;
- two clones are independent conditionally on the common ancestor of the two cells which started those clones.
For any \(1\leqslant i\leqslant l\), denote by \(\chi _i\) the pgf of \(\left( K_{i,j}\right) _{j=1,\dots ,l}\) defined by
$$\begin{aligned} \chi _i(Z)=\mathbb {E}\left[ \prod _{j=1}^{l}z_j^{K_{i,j}}\right] \end{aligned}$$
for any \(Z=(z_j)_{j=1,\dots ,l}\in [0\,{;}\,1]^l\). Denote by \(X_{i,j}(s,t)\) the number at time t of cells of type j in the clone stemming from a cell of type i born at time s. Denote by \(\varphi _i(Z,s,t)\) the pgf of \(\left( X_{i,j}(s,t)\right) _{j=1,\dots ,l}\) defined by
$$\begin{aligned} \varphi _i(Z,s,t)=\mathbb {E}\left[ \prod _{j=1}^{l}z_j^{X_{i,j}(s,t)}\right] \,. \end{aligned}$$
Assume the initial cell completes its lifetime at a given time \(u>s\). For times \(t<u\), the mother cell is still alone in the corresponding clone. For times \(t\geqslant u\), the number of cells of type j is equal to the sum of cells of type j in the clones stemming from the cells produced by this first division. Then, the number \(X_{i,j}(s,t|u)\) of cells of type j in the considered clone, knowing that the initial cell (of type i) completes its lifetime at a time u, is given by
where \(X_{i,j}^{(h)}(u,t)\) are i.i.d. copies of the variable \(X_{i,j}(u,t)\). Denote by \(\varphi _i(Z,s,t|u)\) the pgf of \(\left( X_{i,j}(s,t|u)\right) _{j=1,\dots ,l}\) defined by
Since the growths of the clones are mutually independent, the following equation is obtained:
Integrating with respect to the distribution \(F_i(s,\cdot )\) removes the conditioning on the final instant of the cell. Then the following integral equation is obtained:
As in the homogeneous case (Bellman and Harris 1952), an intuitive interpretation of (BHM) can be given. For a given time \(t>s\), there are two possibilities. Either the division of the cell takes place after t, with probability \(1-F_i(s,t)\); then there is still one cell of type i, and \(\varphi _i(Z,s,t)=z_i\) [second term in (BHM)]. Or the division of the cell takes place in \([u\,{;}\,u+\,\mathrm {d}u]\) (where \(s<u\leqslant t\)), with probability \(\mathrm {d}F_i(s,u)\); in that case, each cell of type j produced by the division starts a clone, accounted for by \(\varphi _j(Z,u,t)\). Since the numbers of cells of each type produced by the division are given by the pgf \(\chi _i\), this accounts for the integral term of (BHM).
Consider now the model described in the previous section. In that case, \(l=2\). Assume that type 1 corresponds to the normal type. Then, for any \((y,z)\in [0\,{;}\,1]^2\):
and
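Hence, applying (BHM) to pgf (2) and (3) leads to the following integral equations:
$$\begin{aligned} \varphi (y,z,s,t)=\int _s^t\left[ \pi \,\varphi (y,z,u,t)\,\psi (z,u,t)+\gamma +(1-\pi -\gamma )\,\varphi (y,z,u,t)^2\right] \,\mathrm {d}F_\nu (s,u)+y\left( 1-F_\nu (s,t)\right) \end{aligned}\tag{4}$$
and
$$\begin{aligned} \psi (z,s,t)=\int _s^t\left[ \delta +(1-\delta )\,\psi (z,u,t)^2\right] \,\mathrm {d}F_\mu (s,u)+z\left( 1-F_\mu (s,t)\right) \,. \end{aligned}\tag{5}$$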
Until now, there were no specific assumptions on \(F_\nu \) and \(F_\mu \), except their definition domain and the fact that \(F_\nu (s,t)=F_\mu (s,t)=0\) if \(t\leqslant s\). Thereafter, in order to solve (4), some hypotheses on \(F_\nu \) are specified. For any \(s\geqslant 0\), let \(F(s,\cdot )\) be a cdf on \(\overline{\mathbb {R}}_+\) such that \(F(s,t)=0\) if \(t\leqslant s\). The cdf F will satisfy \((\mathcal {H})\) if there exists a cdf of a subprobability measure on \(\mathbb {R}_+\), denoted by \(\tilde{F}(s,\cdot )\), such that the following holds:
- \((\mathcal {H}_{1})\): the cdf \(\tilde{F}\) is differentiable with respect to s and t, and decreasing in s;
- \((\mathcal {H}_{2})\): \(\displaystyle \lim _{t\rightarrow +\infty }\tilde{F}(s,t)\leqslant 1\) for all \(s\in \mathbb {R}_+\) and \(\tilde{F}(s,t)=0\) if \(t\leqslant s\);
- \((\mathcal {H}_{3})\): for any \(s\geqslant 0\), \(F(s,\cdot )\) is deduced from \(\tilde{F}(s,\cdot )\) as (1);
- \((\mathcal {H}_{4})\): the function h defined for all \((s,t)\in \mathbb {R}_+^2\) by
$$\begin{aligned} h(s,t) = -\log \left( 1-\tilde{F}(s,t)\right) \,, \end{aligned}$$
satisfies for any \(t\geqslant s\):
$$\begin{aligned} h(s,t)=h(0,t)-h(0,s)\,. \end{aligned}$$
Remark that h is by definition positive, differentiable with respect to s and t, increasing in t, decreasing in s and for any \((s,t)\in \mathbb {R}_+^2\):
For the moment, no assumptions on \(F_\mu \) are required. However, some of the particular cases introduced here will assume that \(F_\mu \) satisfies \((\mathcal {H})\) too. In that case, its related function defined in \((\mathcal {H}_{4})\) will be denoted by \(h_\mu \). From now on, assume that \(F_\nu \) satisfies \((\mathcal {H})\), and denote by \(h_\nu \) its related function defined in \((\mathcal {H}_{4})\). Thus there exists a positive, continuous, \(\mathbb {R}_+\)-valued function \(\lambda _\nu \) such that, for any \(0\leqslant s\leqslant t\):
$$\begin{aligned} h_\nu (s,t)=\int _s^t\lambda _\nu (u)\,\mathrm {d}u\,. \end{aligned}$$
The function \(h_\nu \) can be interpreted as the cumulative rate on an interval \([s\,{;}\,t]\) associated to \(F_\nu \). The function \(\lambda _\nu \) can be interpreted as the instantaneous rate associated to \(F_\nu \) on \(\mathbb {R}_+\). The cdf \(F_\nu (s,\cdot )\) is then defined on \(\overline{\mathbb {R}}_+\) for any \(s\in \mathbb {R}_+\) by:
$$\begin{aligned} F_\nu (s,t)=1-\mathrm {e}^{-h_\nu (s,t)}\quad \text {for } s\leqslant t<+\infty \,,\qquad F_\nu (s,+\infty )=1\,. \end{aligned}$$
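Under \((\mathcal {H})\), a final instant can be sampled by inversion of the cumulative rate, which also shows where the mass at infinity comes from. The sketch below assumes the survival form \(1-F(s,t)=\mathrm {e}^{-(h(0,t)-h(0,s))}\) for finite t, and uses an illustrative bounded cumulative rate \(h(0,t)=h_\infty (1-\mathrm {e}^{-t})\) that is not taken from the paper. The function names and the parameter `total_hazard` are hypothetical.

```python
import math
import random


def sample_final_instant(rng, birth, cum_hazard, inv_cum_hazard,
                         total_hazard=math.inf):
    """Sample T with P(T > t) = exp(-(H(t) - H(birth))), H = cum_hazard.

    inv_cum_hazard must invert H on [0, total_hazard), where
    total_hazard is the limit of H(t) as t tends to infinity.
    """
    target = cum_hazard(birth) + rng.expovariate(1.0)
    if target >= total_hazard:
        return math.inf          # the cell never completes its lifetime
    return inv_cum_hazard(target)


# Illustrative bounded cumulative rate (an assumption, not from the paper).
H_INF = 2.0


def saturating_hazard(t):
    return H_INF * (1.0 - math.exp(-t))


def saturating_hazard_inverse(x):
    return -math.log(1.0 - x / H_INF)
```

With this choice, a cell born at time s never divides with probability \(\mathrm {e}^{-(h_\infty -h(0,s))}\), which increases with s: cells born late are more likely to remain undivided, as expected near a carrying capacity.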
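Since \(h_\nu (s,t)=h_\nu (0,t)-h_\nu (0,s)\), replacing \(F_\nu \) by its expression in (4) leads to:
$$\begin{aligned} \mathrm {e}^{-h_\nu (0,s)}\varphi (y,z,s,t)=\int _s^t\left[ \pi \,\varphi (y,z,u,t)\,\psi (z,u,t)+\gamma +(1-\pi -\gamma )\,\varphi (y,z,u,t)^2\right] \lambda _\nu (u)\,\mathrm {e}^{-h_\nu (0,u)}\,\mathrm {d}u+y\,\mathrm {e}^{-h_\nu (0,t)}\,. \end{aligned}$$
Taking the derivative with respect to s and dividing by \(\mathrm {e}^{-h_\nu (0,s)}\) leads to the following Riccati equation:
$$\begin{aligned} \frac{\partial \varphi }{\partial s}(y,z,s,t)=\lambda _\nu (s)\left[ \varphi (y,z,s,t)-\pi \,\varphi (y,z,s,t)\,\psi (z,s,t)-\gamma -(1-\pi -\gamma )\,\varphi (y,z,s,t)^2\right] \end{aligned}\tag{R1}$$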
with the condition \(\varphi (y,z,t,t)=y\). Riccati equations may have explicit solutions depending on the coefficients [see for example Kucera (1973), Harko et al. (2014)]. One of the cases where (R1) can be solved is when \(\gamma =0\). This case makes sense in the context of mutation models: the objective is to obtain an explicit pgf for the mutant counts only. The number of divisions occurring in dying clones remains bounded and can be neglected. Thus, it can be considered that observed mutants only come from divisions in surviving clones.
In order to simplify expressions, define the following function:
If \(\gamma =0\), (R1) reduces to a Bernoulli equation of order 2. Then the change of variable \(\widetilde{\varphi }=1/\varphi \) leads to the solution \(\varphi (y,z,s,t)\).
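The change of variable can be written out as a short worked step. The display below assumes that, for \(\gamma =0\), (R1) takes the form \(\partial _s\varphi =\lambda _\nu (s)\left[ (1-\pi \psi )\varphi -(1-\pi )\varphi ^2\right] \), which is what the division rules of Sect. 2 give; the published display may arrange the terms differently. For \(y\in (0\,{;}\,1]\):

$$\begin{aligned} \widetilde{\varphi }(y,z,s,t)&=\frac{1}{\varphi (y,z,s,t)}\,,\\ \frac{\partial \widetilde{\varphi }}{\partial s}&=-\frac{1}{\varphi ^2}\,\frac{\partial \varphi }{\partial s}=-\lambda _\nu (s)\left[ \left( 1-\pi \,\psi (z,s,t)\right) \widetilde{\varphi }-(1-\pi )\right] \,,\\ \widetilde{\varphi }(y,z,t,t)&=\frac{1}{y}\,. \end{aligned}$$

The equation for \(\widetilde{\varphi }\) is linear of first order in s, so it can be integrated with an integrating factor once \(\psi \) is known.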
Proposition 3.1
Assume \(\gamma =0\). The general solution of the Riccati equation (R1) is given by
Another case where (R1) has an explicit solution is when \(\gamma =\delta \) and \(F_\mu \) satisfies \((\mathcal {H})\) with \(h_\mu (s,t)=h_\nu (s,t)\) for any (s, t) in \(\mathbb {R}_+^2\). In that case, \(\psi \) is a particular solution, and the general solution of (R1) is explicit (Harko et al. 2014). However, it corresponds to the case where mutant and normal cells have the same death probability and the same final instant distribution. Mutant and normal cells are then indistinguishable, which seems to be of less practical relevance. An explicit solution can also be obtained if \(1-\pi -\gamma =0\). However, \(\gamma \) has to be less than 0.5 (supercritical process), and \(\pi \) is in practice typically of order \(10^{-11}\) to \(10^{-9}\). Thus this case will not be studied here.
As a direct consequence of Proposition 3.1, setting \(y=1\) and \(s=0\) in (7) leads to the following result.
Corollary 3.1
Assume \(\gamma =0\). The mutant count at time t starting with a single normal cell at time 0 follows the distribution with pgf:
From now on, \(\gamma \) will be set to 0. Observe that no assumptions on the cdf \(F_\mu \) are required. In particular, \(F_\mu \) does not necessarily satisfy \((\mathcal {H})\). As long as the pgf \(\psi \) is known, the distribution of the mutant counts at a given time is explicit. As an example, consider the Haldane model: the lifetimes of the normal cells are exponentially distributed with rate \(\lambda \) and the lifetimes of the mutant cells are equal to a constant a. In this case, \(\lambda _\nu \equiv \lambda \) and \(h_\nu \) is given by
$$\begin{aligned} h_\nu (s,t)=\lambda (t-s)\,, \end{aligned}$$
and the cdf \(F_\mu (s,\cdot )\) is defined for any \(t\geqslant s\) by
$$\begin{aligned} F_\mu (s,t)=\mathbb {1}_{[s+a\,{;}\,+\infty )}(t)\,. \end{aligned}$$
Then \(F_\mu \) does not satisfy \((\mathcal {H})\): property \((\mathcal {H}_{1})\) is not satisfied. However, the pgf \(\psi \) is easily identifiable. Assume that the lifetime of any mutant cell is equal to a. Considering a cell born at time s, let \(b_i(z)\) be the pgf of the size of its clone in the interval \([s+ia\,{;}\,s+(i+1)a)\). Then \(b_0(z)=z\), and for all positive integers i,
$$\begin{aligned} b_i(z)=\delta +(1-\delta )\,b_{i-1}(z)^2\,. \end{aligned}$$
Therefore the pgf of the size at time t of a clone starting at time s is:
$$\begin{aligned} \psi (z,s,t)=b_{\lfloor (t-s)/a\rfloor }(z)\,. \end{aligned}$$
Functions I(z, s, t) and \(\varphi (z,t)\) can be made explicit. This example will be continued in the next section where the asymptotic model is considered.
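The Haldane clone pgf lends itself to a direct computation. The sketch below assumes the recursion \(b_0(z)=z\), \(b_i(z)=\delta +(1-\delta )\,b_{i-1}(z)^2\), which follows from the division rule for mutants with constant lifetime a, and evaluates it after \(\lfloor (t-s)/a\rfloor \) generations. The function names are hypothetical.

```python
import math


def haldane_clone_pgf(z, s, t, a, delta):
    """Evaluate psi(z, s, t) for constant mutant lifetimes a (assumed recursion)."""
    generations = math.floor((t - s) / a)
    value = z                                        # b_0(z) = z
    for _ in range(generations):
        value = delta + (1.0 - delta) * value * value  # b_i from b_{i-1}
    return value


def haldane_clone_distribution(s, t, a, delta):
    """Coefficients of the polynomial psi(., s, t): probs[k] = P(size = k)."""
    probs = [0.0, 1.0]                               # b_0(z) = z
    for _ in range(math.floor((t - s) / a)):
        squared = [0.0] * (2 * len(probs) - 1)
        for i, p in enumerate(probs):                # polynomial square
            for j, q in enumerate(probs):
                squared[i + j] += p * q
        probs = [(1.0 - delta) * c for c in squared]
        probs[0] += delta                            # death of the mother cell
    return probs
```

Without deaths (\(\delta =0\)) the clone size is deterministic and equal to \(2^{\lfloor (t-s)/a\rfloor }\). When \((t-s)/a\) is computed in floating point and lies very close to an integer, the floor may be off by one, so exact multiples of a deserve care.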
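Assume now that \(F_\mu \) satisfies \((\mathcal {H})\). There exists a function \(\lambda _\mu \), with the same properties as \(\lambda _\nu \), such that \(F_\mu (s,\cdot )\) is given by
$$\begin{aligned} F_\mu (s,t)=1-\exp \left( -\int _s^t\lambda _\mu (u)\,\mathrm {d}u\right) \quad \text {for } s\leqslant t<+\infty \,,\qquad F_\mu (s,+\infty )=1\,. \end{aligned}$$
By the same reasoning as for \(\varphi \), the following Riccati equation for \(\psi \) is obtained from (5):
$$\begin{aligned} \frac{\partial \psi }{\partial s}(z,s,t)=\lambda _\mu (s)\left[ \psi (z,s,t)-\delta -(1-\delta )\,\psi (z,s,t)^2\right] \end{aligned}\tag{R2}$$
with the condition \(\psi (z,t,t)=z\). The general solution of (R2) can be made explicit without specific hypotheses: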
Proposition 3.2
The general solution of the Riccati equation (R2) is given by
where:
The proof of Proposition 3.2 simply consists in observing that 1 is a particular solution of (R2). Then the general solution of (R2) is explicit (Harko et al. 2014) and given by (10).
Observe that if \(\lambda _\mu \equiv \lambda \) with \(\lambda \) a positive constant, then \(h_\mu (s,t) = \lambda (t-s)\) and Proposition 3.2 reduces to the example of Athreya and Ney (1972, p. 109). Moreover, if \(\lambda _\mu \equiv \lambda \) and \(\delta =0\), (R2) reduces to a Bernoulli equation. Its solution is the pgf of the geometric distribution with parameter \(\mathrm {e}^{-\lambda (t-s)}\), i.e., the pgf of a Yule process with parameter \(\lambda \) [see Yule (1925, p. 35) or Athreya and Ney (1972, p. 109)]. As a direct consequence of Proposition 3.2, the distribution of the number of mutant cells in a mutant clone started at time s can be made explicit.
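A numerical check of this kind of statement is straightforward. The sketch below assumes that (R2) reads \(\partial _s\psi =\lambda _\mu (s)\left[ \psi -\delta -(1-\delta )\psi ^2\right] \) with \(\psi (z,t,t)=z\). The closed form used for comparison is obtained here by the substitution \(v=1/(1-\psi )\) and is not necessarily written in the same form as (10). It requires \(\delta \ne 1/2\). All function names are hypothetical.

```python
import math


def closed_form_clone_pgf(z, h, delta):
    """pgf of the clone size when the cumulative rate h_mu(s, t) equals h."""
    if z == 1.0:
        return 1.0
    v_inf = (1.0 - delta) / (1.0 - 2.0 * delta)
    v = v_inf + (1.0 / (1.0 - z) - v_inf) * math.exp(-(1.0 - 2.0 * delta) * h)
    return 1.0 - 1.0 / v


def riccati_rhs(psi, rate, delta):
    """Right-hand side of the assumed form of (R2)."""
    return rate * (psi - delta - (1.0 - delta) * psi * psi)


def solve_r2_backward(z, s, t, rate_fn, delta, steps=2000):
    """Classical RK4 integration in the birth date, from t down to s."""
    step = (s - t) / steps                 # negative step: integrate backwards
    u, psi = t, z                          # terminal condition psi(z, t, t) = z
    for _ in range(steps):
        k1 = riccati_rhs(psi, rate_fn(u), delta)
        k2 = riccati_rhs(psi + 0.5 * step * k1, rate_fn(u + 0.5 * step), delta)
        k3 = riccati_rhs(psi + 0.5 * step * k2, rate_fn(u + 0.5 * step), delta)
        k4 = riccati_rhs(psi + step * k3, rate_fn(u + step), delta)
        psi += step * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        u += step
    return psi
```

For \(\delta =0\) the closed form reduces to \(z\,\mathrm {e}^{-h}/\left( 1-z(1-\mathrm {e}^{-h})\right) \), the pgf of the geometric distribution with parameter \(\mathrm {e}^{-h}\), in agreement with the Yule case mentioned above.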
Proposition 3.3
Denote by \(\left( p_k(s,t)\right) _{k\in \mathbb {N}}\) the probabilities of the size at time t of a clone starting from a single mutant cell born at time s. Then:
and for \(k\geqslant 1\),
where:
and:
In other words, a random variable with pgf \(\psi (z,s,t)\) is the following random mixture: either 0 with probability \(p_0(s,t)\) or a geometric random variable with parameter P(s, t).
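The mixture structure can be illustrated numerically. The sketch below assumes \(\delta <1/2\) and uses expressions for the extinction probability and the geometric parameter derived here from the assumed form \(\partial _s\psi =\lambda _\mu (s)\left[ \psi -\delta -(1-\delta )\psi ^2\right] \) of (R2); they may be arranged differently from the displays of Proposition 3.3. The argument `h` stands for \(h_\mu (s,t)\), and the function name is hypothetical.

```python
import math


def clone_size_probabilities(h, delta, k_max):
    """Return [p_0, ..., p_{k_max}] for a mutant clone with cumulative rate h."""
    decay = math.exp(-(1.0 - 2.0 * delta) * h)        # inverse of the mean clone size
    denom = 1.0 - delta - delta * decay
    p_zero = delta * (1.0 - decay) / denom            # probability that the clone is extinct
    geom = (1.0 - 2.0 * delta) * decay / denom        # parameter of the geometric part
    probs = [p_zero]
    for k in range(1, k_max + 1):
        probs.append((1.0 - p_zero) * geom * (1.0 - geom) ** (k - 1))
    return probs
```

A quick consistency check is that the mean of this distribution equals \(\mathrm {e}^{(1-2\delta )h}\), and that for \(\delta =0\) it reduces to the geometric distribution with parameter \(\mathrm {e}^{-h}\).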
Proof
(Proposition 3.3) Writing \(\psi \) as the following rational function:
where
and
Then the probabilities \(p_k(s,t)\) can be recursively identified:
and
Denote by P(s, t) the second term of the above product:
Then for \(k\geqslant 2\):
\(\square \)
Observe that if \(\lambda _\mu \equiv \lambda \) with \(\lambda \) a positive constant, Proposition 3.3 reduces to formula (3.1) of Ycart (2014). Moreover, the expectation of the size at a finite time t of a mutant clone started at time s is \(\mathrm {e}^{h_\mu ^*(s,t)}\). Assume there exists \(\rho >0\) such that for any \(s\geqslant 0\):
The constant \(\rho \) can be interpreted as the instantaneous ratio of hazard functions. The assumption of proportional hazard functions is not new: in survival analysis, it is known as the Cox proportional-hazard regression model (Cox 1972), which is widely used. This assumption generalizes the notion of fitness defined in homogeneous mutation models as the ratio of the growth rate of normal cells to that of mutants. Thereafter, the parameter \(\rho \) will be referred to as the fitness. In that case, (6) can be made explicit:
In particular, let f be a continuous, nonnegative, and increasing function on \(\mathbb {R}_+\). Let \(\mu \) be defined for (s, t) in \(\mathbb {R}_+^2\) by
Then \(\lambda _\mu \) is given by
and the expectation of the size at a finite time t of a mutant clone started at time s is \((f(t)/f(s))^{1-2\delta }\). In other words, it is possible to fit the average trajectory of the development of the clones to any appropriate function of time. Moreover, if \(\delta =0\), plugging (11) in (8) and applying the change of variable \(w=f(u)/f(t)\) leads to:
For example, if f is defined for any \(t\geqslant 0\) by \(f(t)=\mathrm {e}^{t}\), (12) is given by
which is the inverse of formula (10) of Bartlett (1978, p. 155). Functions with a carrying capacity, such as logistic or Gompertz functions, can also be considered and plugged into (12).
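Fitting the mean clone trajectory to a growth curve can be sketched as follows. The code assumes that the curve f enters through \(h_\mu (s,t)=\log \left( f(t)/f(s)\right) \), which is the choice consistent with the stated mean clone size \((f(t)/f(s))^{1-2\delta }\). The logistic and Gompertz parametrizations, normalized so that \(f(0)=1\), and all names are illustrative assumptions.

```python
import math


def cumulative_hazard(f, s, t):
    """Assumed h_mu(s, t) = log(f(t) / f(s)) for an increasing positive f."""
    return math.log(f(t) / f(s))


def mean_clone_size(f, s, t, delta):
    """Expected size at time t of a mutant clone started at time s."""
    return (f(t) / f(s)) ** (1.0 - 2.0 * delta)


def logistic(t, capacity=1e3, rate=1.0):
    """Logistic curve with f(0) = 1 and limit `capacity`."""
    return capacity / (1.0 + (capacity - 1.0) * math.exp(-rate * t))


def gompertz(t, capacity=1e3, rate=1.0):
    """Gompertz curve with f(0) = 1 and limit `capacity`."""
    return capacity ** (1.0 - math.exp(-rate * t))
```

For curves with a carrying capacity, `cumulative_hazard(f, 0, t)` stays bounded as t grows, which corresponds to a split instant distribution with positive mass at infinity.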
Corollary 3.1 is used in the next section in a relevant asymptotic context to get the convergence in distribution of the mutant count when n normal cells are initially present.
4 Asymptotic for Mutation Models
In this section, the previous results are applied to mutation models. A convergence theorem for the final number of mutant cells is proved, generalizing the analytic method initiated by Bartlett (1978, Sec. 4.31). A mutation model consists of n independent copies of the model described in Sect. 2. Denote by \(\phi _n(z,t)\) the pgf for the mutant counts at time t starting with n normal cells at time 0. Because of the independence of the n initial cells, \(\phi _n(z,t)\) is the n-th power of (8). Let \((\tau _n)_{n\in \mathbb {N}}\) be a sequence of observation instants, tending to infinity as n tends to infinity. Let \((\pi _n)_{n\in \mathbb {N}}\) be a sequence of mutation probabilities, tending to 0 as n tends to infinity. Moreover, assume that
where \(\alpha \) is some fixed positive real number. Note that the constant \(\alpha \) corresponds in the classic case to the mean number of mutations. Considering this asymptotic context, the main objective of this section is to establish the convergence as n tends to infinity of \(\phi _n(z,\tau _n)\). Before stating the result, recall that \(\gamma =0\), and that \(F_\nu \) satisfies \((\mathcal {H})\). Denote by \(h_\nu \) its related function defined by \((\mathcal {H}_{4})\), and by \(\lambda _\nu \) the associated instantaneous rate. The limit of \(h_\nu (0,t)\) as t tends to infinity will be denoted by \(h_{\nu ,\infty }\). Note that the result presented in this section does not require that \(F_\mu \) satisfies \((\mathcal {H})\). The Haldane model described earlier will be considered as an example. Define now the function \(\mathcal {I}\) as
Remark that, assuming the probability distribution function of a mutation instant is given by \(\lambda _\nu \mathrm {e}^{-h_\nu (u,t)}\mathbb {1}_{[0\,{;}\,t]}\), the function \(\mathcal {I}\) could be interpreted as the pgf of the size at a given time t of any mutant clone. The main result of this paper is the following convergence theorem:
Theorem 4.1
Assume \(\gamma =0\). Let \(\pi =\left( \pi _n\right) _{n\in \mathbb {N}}\) and \(\tau =\left( \tau _n\right) _{n\in \mathbb {N}}\) be two sequences, and \(\alpha \) a positive real number such that:
Assume that the limit
exists and is finite. As n tends to infinity, the pgf of the number of mutants at time \(\tau _n\), starting with n normal cells at time 0, converges to
where
and
Observe that (15) is the pgf of a compound Poisson distribution with parameter m. By analogy with the homogeneous case, this parameter can be interpreted as the mean number of mutations, assuming that the number of mutation occasions is almost surely equivalent to \(n\left( \mathrm {e}^{h_\nu (0,\tau _n)}-1\right) \) as n tends to infinity.
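For reference, the generic compound Poisson form can be written out as follows. This is only a sketch: the symbols N, \(X_i\) and g are notation introduced for this remark and are not taken from the paper. If N follows the Poisson distribution with parameter m, and the \(X_i\)'s are i.i.d. integer-valued random variables with pgf g, independent of N, then

\[ \mathbb {E}\left[ z^{X_1+\cdots +X_N}\right] = \sum _{k=0}^{\infty } \mathrm {e}^{-m}\frac{m^k}{k!}\, g(z)^k = \exp \left( -m\left( 1-g(z)\right) \right) . \]

Reading (15) in this way presumably amounts to taking N as the number of mutations and g as the limiting pgf of the size of a mutant clone.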
The main tool required to prove Theorem 4.1 is Lemma 1 below.
Lemma 1
For any \(\pi \in [0\,{;}\,1[\), \(z\in [0\,{;}\,1]\), \(t\in \mathbb {R}_+\), and \(s\in [0\,{;}\,t]\), the following bound holds:
The proof uses a power series expansion of \(\mathrm {e}^{\pm \pi I(z,s,t)}\):
Proof
(Lemma 1)
Hence:
\(\square \)
An analytic proof for the case where mutant and normal cells are exponentially i.i.d. with equal rates has been provided by Bartlett (1978, Sec. 4.31). This approach has been adapted to prove Theorem 4.1.
Proof
(Theorem 4.1) Define the following two functions:
According to Lemma 1:
Then, the second factor in (8)
can be written as:
Let:
and
Since the limit (14) exists and is finite, the limit of \(f_3(z,t)\) as t tends to infinity exists and is finite. Consider now the following term:
It can be written as:
Multiplying by \(\mathrm {e}^{h_\nu (0,\tau _n)}\):
Remark now that, according to the inequality satisfied by \(f_1\) in (16):
Then, since \(n\pi _n\mathrm {e}^{h_\nu (0,\tau _n)}\) tends to \(\alpha \) as n tends to infinity, \(\pi _n\mathrm {e}^{h_\nu (0,\tau _n)}\) tends to 0. Hence:
Denote by \(o(\pi _n,\tau _n)\) any function such that \(n o(\pi _n,\tau _n)\) tends to 0 as n tends to infinity. Then:
Let \((\phi _n)_{n\in \mathbb {N}}\) be the sequence of functions defined by \(\phi _n(z,\tau _n)=\phi (z,\tau _n)^n\). Then:
Since \(\pi _n n\) is equivalent to \(\alpha \mathrm {e}^{-h_{\nu ,\infty }}\), the following limit is obtained:
with
\(\square \)
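The passage from the expansion of \(\phi (z,\tau _n)\) to the limit of its n-th power relies on an elementary fact, recalled here as a sketch. The sequence \((u_n)\) and the limit u are notation introduced for this remark only. If \(n u_n\) tends to u as n tends to infinity, then \(u_n\) tends to 0 and

\[ \left( 1+u_n\right) ^n = \exp \left( n\log (1+u_n)\right) = \exp \left( n u_n + O\left( n u_n^2\right) \right) \longrightarrow \mathrm {e}^{u}. \]

In the proof above, \(u_n\) presumably collects the term of order \(\pi _n\) together with the remainder \(o(\pi _n,\tau _n)\), which is why the remainder is required to satisfy \(n\, o(\pi _n,\tau _n)\rightarrow 0\).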
Observe that Theorem 4.1 holds whether \(F_\mu \) satisfies \((\mathcal {H})\) or not. As an example, consider again the Haldane model presented in Sect. 3: the final instants of normal cells are exponentially distributed with rate \(\lambda \) and the lifetimes of mutant cells are equal to a constant a. The function \(\mathcal {I}(z,t)\) is then given by
where the \(b_i\)’s are given by (9). Hence the limit of \(\mathcal {I}(z,t)\) as t tends to infinity:
Remark that for \(\delta =0\) and \(a=\log (2)\), the result obtained by Ycart (2013) is recovered. The pgf of the final number of mutants can then be made explicit by applying Theorem 4.1. The probabilities of final mutant counts under the Haldane model with \(\delta >0\) will also be made explicit in Sect. 6.
5 Proportional Hazard Functions
Here, the assumptions on \(\gamma \) and \(F_\nu \) made in the previous section are still valid. Assume that \(F_\mu \) also satisfies \((\mathcal {H})\). Denote by \(h_\mu \) its related function defined by \((\mathcal {H}_{4})\), and by \(\lambda _\mu \) the associated instantaneous rate. The limit of \(h_\mu (0,t)\) as t tends to infinity will be denoted by \(h_{\mu ,\infty }\). Moreover, assume there exists \(\rho >0\) such that for any \(s\geqslant 0\):
An interpretation of assumption ( \(\text {H}_\rho \) ) was given at the end of Sect. 3. In this section, a convergence theorem is deduced from Theorem 4.1, and examples are discussed. In particular, the Luria–Delbrück model with cell deaths is recovered.
Consider first the case where:
Under ( \(\text {H}_\rho \) ), the function I is given by (11). Then the limit
exists and is finite, and Theorem 4.1 can be applied. Consider now the case where \(h_{\mu ,\infty }\) is infinite. Then:
Thus:
Then the following result can be deduced from Theorem 4.1:
Theorem 5.1
Assume \(\gamma =0\). Let \(\pi =\left( \pi _n\right) _{n\in \mathbb {N}}\) and \(\tau =\left( \tau _n\right) _{n\in \mathbb {N}}\) be two sequences, and \(\alpha \) a positive real such that:
\[ \lim _{n\rightarrow \infty } \pi _n = 0, \qquad \lim _{n\rightarrow \infty } \tau _n = +\infty , \qquad \lim _{n\rightarrow \infty } n\pi _n\mathrm {e}^{h_\nu (0,\tau _n)} = \alpha . \]
Under ( \(\text {H}_\rho \) ), as n tends to infinity, the pgf of the number of mutants at time \(\tau _n\), starting with n normal cells at time 0, converges to
where
with
and
As a general application of Theorem 5.1, let f be a nonnegative and increasing function on \(\mathbb {R}_+\), with finite limit \(f_\infty \) as t tends to infinity. Let \(\mu \) be defined for (s, t) in \(\mathbb {R}_+^2\) by
Assume that hypothesis ( \(\text {H}_\rho \) ) is satisfied. Taking the limit as t tends to infinity of (11):
and:
where \(f^*(t)=f(t)^{1-2\delta }\) for any positive t, and \(f^*_\infty =f_\infty ^{1-2\delta }\). Note that only the ratio of \(f_\infty \) to f(0) influences \(\mathcal {I}_\infty (z)\). Another natural example is the classic case where the final instants of normal and mutant cells are both exponentially distributed, i.e.:
where \(\lambda \) is a positive constant. Then the LDD distribution (Ycart 2014) is recovered. Actually, if (\(\text {H}_\rho \)) is satisfied and \(h_{\mu ,\infty }=+\infty \), the LDD distribution can be recovered from Theorem 5.1:
Corollary 5.1
Assume \(\gamma =0\). Let \(\pi =\left( \pi _n\right) _{n\in \mathbb {N}}\) and \(\tau =\left( \tau _n\right) _{n\in \mathbb {N}}\) be two sequences, and \(\alpha \) a positive real such that:
\[ \lim _{n\rightarrow \infty } \pi _n = 0, \qquad \lim _{n\rightarrow \infty } \tau _n = +\infty , \qquad \lim _{n\rightarrow \infty } n\pi _n\mathrm {e}^{h_\nu (0,\tau _n)} = \alpha . \]
Under ( \(\text {H}_\rho \) ), assume that \(h_{\mu ,\infty }=+\infty \). As n tends to infinity, the distribution of the number of mutants at time \(\tau _n\) starting with n normal cells at time 0, converges to the distribution with pgf:
where
and
In other words, the LDD distribution can be extended to the case where \(F_\nu (s,\cdot )\) and \(F_\mu (s,\cdot )\) are non-exponential distributions, as long as \(F_\nu (s,\cdot )\) and \(F_\mu (s,\cdot )\) are cdfs of true measures on \(\mathbb {R}_+\) and the associated hazard functions \(\lambda _\nu \) and \(\lambda _\mu \) are proportional.
6 Calculation Algorithm
A probability computation algorithm for the distribution of the final mutant counts is described here under the hypotheses of Theorem 5.1. The pgf \(\psi \) is given by
where the \(p_k\)’s are defined in Proposition 3.3. Thus, (13) can be written as:
where \(r_k\) is defined for any \(k\geqslant 0\) by
Hence:
and for all \(k>0\):
Then \(\mathcal {I}_\infty (z)\) can be given by
where for any \(k\geqslant 0\):
In other words:
and for all \(k>0\):
Moreover, (18) admits a series expansion for any \(z\in [0\,{;}\,1]\):
where the \(q_k\)’s can easily be expressed as functions of the \(r_k\)’s, using the following algorithm described by Embrechts and Hawkes (1982). Firstly:
The derivative of \(\phi \) with respect to z is given by
On the other hand:
Hence for any \(k>0\):
Naturally, if \(h_{\mu ,\infty }=+\infty \), the probabilities of the LDD distribution (Ycart 2014) are recovered.
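The recursion described above can be sketched in code. The sketch assumes that the limit pgf has the compound Poisson form \(\exp (-m(1-\mathcal {I}_\infty (z)))\) with \(\mathcal {I}_\infty (z)=\sum _k r_k z^k\), so that identifying the coefficients of the derivative gives \(q_0=\mathrm {e}^{-m(1-r_0)}\) and \(k\,q_k=m\sum _{i=1}^{k} i\,r_i\,q_{k-i}\) for \(k>0\). The function name and the example values are hypothetical. The example clone-size probabilities \(r_k=1/(k(k+1))\) correspond to the classic Luria–Delbrück case (exponential lifetimes, unit relative fitness, no cell deaths), chosen here only as a familiar check.

```python
import math


def compound_poisson_probabilities(m, r, n_terms):
    """Return q_0, ..., q_{n_terms-1} for the pgf exp(-m * (1 - sum_k r[k] z^k)).

    m       : Poisson parameter (assumed: the mean number of mutations)
    r       : list of clone-size probabilities r[0], r[1], ...
              (entries beyond len(r) are treated as zero)
    n_terms : number of probabilities to compute
    """
    q = [0.0] * n_terms
    q[0] = math.exp(-m * (1.0 - r[0]))
    for k in range(1, n_terms):
        total = 0.0
        # k * q_k = m * sum_{i=1}^{k} i * r_i * q_{k-i}
        for i in range(1, min(k, len(r) - 1) + 1):
            total += i * r[i] * q[k - i]
        q[k] = m * total / k
    return q


# Hypothetical example: classic Luria-Delbruck clone sizes, r_k = 1 / (k (k + 1)).
n_terms = 50
r_classic = [0.0] + [1.0 / (k * (k + 1)) for k in range(1, n_terms)]
q_classic = compound_poisson_probabilities(1.5, r_classic, n_terms)
```

The double loop costs \(O(K^2)\) operations for K probabilities, which is usually acceptable for the few hundred terms needed in fluctuation analysis.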
Consider now the Haldane model. In that case, the pgf \(\mathcal {I}_\infty \) is defined by (17). Considering a cell born at time s, let \(\left( p_k^{(i)}\right) _{k\in \mathbb {N}}\) be the probabilities of the size of its clone in the interval \([s+ia\,{;}\,s+(i+1)a)\). In other words:
for any \(i\geqslant 0\), where the \(b_i\)’s are given by (9). Therefore:
and the probabilities \(\left( r_k\right) _{k\in \mathbb {N}}\) associated to pgf \(\mathcal {I}_\infty \) are given by
Hence, the probabilities \(\left( r_k\right) _{k\in \mathbb {N}}\) can be made explicit if the probabilities \(\left( p_k^{(i)}\right) _{k\in \mathbb {N}}\) can be made explicit for any \(i\geqslant 0\). In practice, the Fast Fourier Transform can be used to identify the \(r_k^{(i)}\)’s. Then, the \(q_k\)’s can be computed using the algorithm of Embrechts and Hawkes (1982) described above.
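One common way to use the Fast Fourier Transform for this purpose is to sample a pgf on the unit circle and recover its coefficients by a discrete Fourier transform. The sketch below illustrates that generic technique. It is not necessarily the procedure intended by the author, and the function names and the Poisson test pgf are hypothetical. The recovered value at index k is in fact \(p_k+p_{k+N}+p_{k+2N}+\cdots \), so the number of sample points N must be large enough for the tail beyond N to be negligible.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def pgf_coefficients(pgf, size):
    """Approximate the first `size` coefficients of a pgf.

    The pgf is evaluated at the `size`-th roots of unity, and the
    coefficients are recovered as fft(samples) / size (up to aliasing).
    """
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    samples = [pgf(cmath.exp(2j * cmath.pi * j / size)) for j in range(size)]
    return [c.real / size for c in fft(samples)]


def poisson_pgf(z, lam=2.0):
    """Hypothetical test pgf: Poisson distribution with parameter lam."""
    return cmath.exp(lam * (z - 1.0))


coeffs = pgf_coefficients(poisson_pgf, 64)
```

For heavy-tailed clone-size distributions the aliasing term decays slowly, so a larger `size` (or a rescaling of z inside the unit disc) may be needed before the output is fed to the recursion for the \(q_k\)’s.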
7 Conclusion and Perspectives
An extension of the classic mutation models to the case where the final instant of a cell depends on its birth date has been proposed. The main results are based on the theory of supercritical branching processes. They lead to a family of distributions modeling the asymptotic number of mutants. These distributions depend on the expected number of mutations m, the death probability of mutant cells \(\delta \), and the final instant distributions \(F_\nu (s,\cdot )\) and \(F_\mu (s,\cdot )\) for normal and mutant cells born at a given time s. A convergence theorem for the final count of mutants has been proved for both cases where \(F_\nu (s,\cdot )\) and \(F_\mu (s,\cdot )\) are defined on \(\overline{\mathbb {R}}_+\) or \(\mathbb {R}_+\). The first case allows for the possibility that a cell does not split or die before the end of the experiment. This makes it possible to model more realistic growths, such as logistic growth. The particular case where the hazard functions \(\lambda _\nu \) and \(\lambda _\mu \) associated with \(F_\nu (s,\cdot )\) and \(F_\mu (s,\cdot )\) are proportional has been studied. A computation algorithm for the probabilities has been described. Moreover, the LDD distribution is recovered when \(F_\nu (s,\cdot )\) and \(F_\mu (s,\cdot )\) are defined on \(\mathbb {R}_+\) and the associated hazard functions are proportional. The consequences for statistical inference and simulation remain to be developed. Since the R package flan (available on CRAN: https://cran.r-project.org/package=flan) provides tools for inference of mutation models for the case where final instants are i.i.d., an extension to the model proposed here is planned.
References
Allen L (2010) An introduction to stochastic processes with applications to biology, 2nd edn. Chapman and Hall/CRC, Boca Raton
Angerer W (2001) An explicit representation of the Luria–Delbrück distribution. J Math Biol 42(2):145–174
Armitage P (1952) The statistical theory of bacterial populations subject to mutation. J R Stat Soc B 14:1–40
Athreya K, Ney P (1972) Branching processes. Springer, Berlin
Bartlett MS (1978) An introduction to stochastic processes, with special reference to methods and applications, 3rd edn. Cambridge University Press, Cambridge
Bellman R, Harris T (1952) On age-dependent binary branching processes. Ann Math 55(2):280–295
Benjamini I, Peres Y (1994) Markov chains indexed by trees. Ann Probab 22(1):219–243
Cox D (1972) Regression models and life-tables. J R Stat Soc (Ser B) 34(2):187–220
Dewanji A, Luebeck E, Moolgavkar S (2005) A generalized Luria–Delbrück model. Math Biosci 197(2):140–152
Embrechts P, Hawkes J (1982) A limit theorem for tails of discrete infinitely divisible laws with applications to fluctuation theory. J Aust Math Soc Ser A 32:412–422
Hamon A, Ycart B (2012) Statistics for the Luria–Delbrück distribution. Electron J Stat 6:1251–1272
Harko T, Lobo F, Mak M (2014) Analytical solutions of the Riccati equation with coefficients satisfying integral or differential conditions with arbitrary functions. Univ J Appl Math 2(2):109–118
Houchmandzadeh B (2015) General formulation of Luria-Delbrück distribution of the number of mutants. Phys Rev E Stat Nonlin Soft Matter Phys 92:012719
Kendall D (1952) On the choice of a mathematical model to represent normal bacterial growth. J R Stat Soc B 14(1):41–44
Kimmel M, Axelrod D (2002) Branching processes in biology. Springer, New York
Komarova NL, Wu L, Baldi P (2007) The fixed-size Luria–Delbrück model with a nonzero death rate. Math Biosci 210(1):253–290
Kucera V (1973) A review of the matrix Riccati equation. Kybernetika 9(1):42–61
Lambert A (2005) The branching process with logistic growth. Ann Appl Probab 15(2):1506–1535
Lea D, Coulson C (1949) The distribution of the number of mutants in bacterial populations. J Genet 49(3):264–285
Louhichi S, Ycart B (2015) Exponential growth of bifurcating processes with ancestral dependence. Adv Appl Probab 47(2):545–564
Luria S, Delbrück M (1943) Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28(6):491–511
Montgomery-Smith S, Oveys H (2016) Age-dependent branching processes and applications to the Luria–Delbrück experiment. arXiv preprint arXiv:1608.06314
Nguyen H (2006) An introduction to random sets. Chapman & Hall/CRC, Boca Raton
Pemantle R (1995) Tree-indexed process. Stat Sci 10(2):200–213
Sarkar S (1991) Haldane’s solution of the Luria–Delbrück distribution. Genetics 127:257–261
Stewart F, Gordon D, Levin B (1990) Fluctuation analysis: the probability distribution of the number of mutants under different conditions. Genetics 124(1):175–185
Tan W (1986) A stochastic Gompertz birth-death process. Stat Probab Lett 4(4):25–28
Tan W, Piantadosi S (1991) On stochastic growth processes with application to stochastic logistic growth. Stat Sin 1:527–540
Verhulst PF (1838) Notice sur la loi que la population suit dans son accroissement. In: Garnier J, Quetelet A (eds) Correspondance mathématique et physique, vol 10. Société Belge de Librairie, Bruxelles, pp 113–121
Wang P, Robert L, Pelletier J, Dang WL, Taddei F, Wright A, Jun S (2010) Robust growth of Escherichia coli. Curr Biol 20:1099–1103
Ycart B (2013) Fluctuation analysis: Can estimates be trusted? PLoS ONE 8(12):1–12
Ycart B (2014) Fluctuation analysis with cell deaths. J Appl Probab Statist 9(1):13–29
Ycart B, Veziris N (2014) Unbiased estimates of mutation rates under fluctuating final counts. PLoS ONE 9(7):1–10
Yule G (1925) A mathematical theory of evolution, based on the conclusions of Dr. J.C. Willis, F.R.S. Phil Trans R Soc Lond Ser B 213:21–87
Acknowledgements
This research was supported by Laboratoire d’Excellence TOUCAN (Toulouse Cancer). The author is grateful to Bernard Ycart for comments on earlier drafts of the paper and to the anonymous referees for helpful remarks.
Mazoyer, A. Time Inhomogeneous Mutation Models with Birth Date Dependence. Bull Math Biol 79, 2929–2953 (2017). https://doi.org/10.1007/s11538-017-0357-3
DOI: https://doi.org/10.1007/s11538-017-0357-3
Keywords
- Branching process
- Probability generating function
- Fluctuation analysis
- Luria–Delbrück distribution
- Cell kinetics
- Non-exponential growth