1 Introduction: The Logistic Birth and Death Model

We consider a stochastic individual-based model (IBM) with trait structure, evolving through births and deaths, that was introduced by Dieckmann and Law [9] and Metz et al. [23] and studied in rigorous detail by Champagnat [2]. We study its limit on an evolutionary time scale, when the population is large and mutations are rare.

Champagnat [2] established the first rigorous proof of the convergence of a sequence of such IBMs to the trait substitution sequence process (TSS) introduced by Metz et al. [22] (with Metz et al. [23] as a follow-up). Following Dieckmann and Law [9], the TSS can be explained as follows. In the limit, the time scales of ecology and evolution are separated. Mutations are rare, and before a mutant arises, the resident population stabilizes around an equilibrium. Under the “invasion implies substitution” Ansatz, there cannot be long-term coexistence of two different traits. Evolution thus proceeds as a succession of monomorphic population equilibria. The fine structure of the transitions disappears on the time scale under consideration, and when a mutation occurs, invades, and fixates in the population by completely replacing the resident trait, the TSS jumps from one state to another. Champagnat’s proof combines several approximations of the microscopic process by means of fine estimates, including large deviation results. We separate the different time scales that are involved using averaging techniques due to Kurtz [17], and thus propose a new simplified proof of Champagnat’s results that avoids many technicalities linked with fine approximations of the IBM. The aim of this paper is to exemplify the use of such averaging techniques in adaptive dynamics, which we hope will pave the way for generalizations of the TSS.

We consider a structured population where each individual is characterized by a trait \(x\in \mathbb{X}\), where \(\mathbb{X}\) is a compact subset of \(\mathbb{R}^{d}\). We are interested in large populations. We assume that the population’s initial size is proportional to a parameter \(K\in \mathbb{N}^{*}=\{1,2,\dots\} \), to be interpreted as the area to which the population is confined. We will let K go to infinity while keeping the density constant by counting the individuals with weight 1/K. The population is assumed to be well mixed and its density is assumed to be limited by a fixed availability of resources per unit area. The population at time t can be described by the following point measure on \(\mathbb{X}\)

$$\begin{aligned} X^{K}(t) = \frac{1}{K} \sum _{i=1}^{N^{K}(t)} \delta_{x^i_t}, \end{aligned}$$
(1.1)

where \(N^{K}(t)\) is the total number of individuals in the population at time t and where \(x^{i}_{t}\in \mathbb{X}\) denotes the trait of individual i living at time t, the individuals being numbered in lexicographical order.

The population evolves by births and deaths. An individual with trait \(x\in \mathbb{X}\) gives birth to new individuals at rate b(x), where b(x) is a positive continuous function on \(\mathbb{X}\). With probability \(u_{K} p(x)\in[0,1]\), the daughter is a mutant with trait y, where y is drawn from the mutation kernel m(x,dy) supported on \(\mathbb{X}\). Here \(u_{K}\in[0,1]\) is a parameter depending on K that scales the probability of mutation. With probability \(1-u_{K} p(x)\), the daughter is a clone of her mother and has the same trait x. In a population described by \(X\in \mathcal{M}_{F}(\mathbb{X})\), an individual with trait x dies at rate \(d(x)+\int_{\mathbb{X}}\alpha(x,y)X(dy)\), where the natural death rate d(x) and the competition kernel α(x,y) are positive continuous functions.
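Concretely, these dynamics can be simulated by a Gillespie-type algorithm. The sketch below is a minimal illustration with naming of our own choosing (`mutate` plays the role of a draw from m(x,dy)); it is not the SDE construction used later in the paper.

```python
import random

def simulate_ibm(K, b, d, alpha, p, u_K, mutate, x0, n0, t_max):
    """Gillespie-style sketch of the logistic birth-death IBM.

    `traits` holds one entry per living individual; competition is
    scaled by 1/K, as in the death rate d(x) + <X, alpha(x, .)>.
    """
    traits = [x0] * n0
    t = 0.0
    while traits and t < t_max:
        birth = [b(x) for x in traits]
        death = [d(x) + sum(alpha(x, y) for y in traits) / K for x in traits]
        total = sum(birth) + sum(death)
        t += random.expovariate(total)      # waiting time to the next event
        r = random.uniform(0.0, total)      # pick an event by its rate
        for i, x in enumerate(traits):
            if r < birth[i]:
                # mutant daughter with probability u_K * p(x), clone otherwise
                traits.append(mutate(x) if random.random() < u_K * p(x) else x)
                break
            r -= birth[i]
            if r < death[i]:
                del traits[i]
                break
            r -= death[i]
    return traits
```

With b≡2, d≡1 and α≡1, the rescaled population size ⟨X^K,1⟩ fluctuates around the equilibrium (b−d)/α=1, i.e. around K individuals.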

Assumption 1.1

We assume that the functions b, d and α satisfy the following hypotheses:

  1. (A)

    For all \(x\in \mathbb{X}\), b(x)−d(x)>0 and p(x)>0.

  2. (B)

    “Invasion implies substitution”: For all x and y in \(\mathbb{X}\), we either have:

    $$\begin{aligned} & \bigl(b(y)-d(y)\bigr)\alpha(x,x)-\bigl(b(x)-d(x)\bigr)\alpha(y,x)<0 \end{aligned}$$
    (1.2)
    $$\begin{aligned} \mbox{or}\quad& \left \{ \begin{array}{l} (b(y)-d(y))\alpha(x,x)-(b(x)-d(x))\alpha(y,x)>0 \\ (b(x)-d(x))\alpha(y,y)-(b(y)-d(y))\alpha(x,y)<0. \end{array} \right . \end{aligned}$$
    (1.3)
  3. (C)

    There exist \(\underline{\alpha}\) and \(\overline {\alpha}>0\) such that for every \(x,y\in \mathbb{X}\):

    $$ 0<\underline{\alpha}\leq\alpha(x,y)\leq \overline {\alpha}. $$
    (1.4)

Part (A) of Assumption 1.1 says that in the absence of competition, the population has a positive natural growth rate. Also the probability of a birth resulting in a mutation is positive. Part (B) corresponds to a condition known in adaptive dynamics as “invasion implies substitution”. It can be obtained from the analysis of the equilibria of the Lotka-Volterra system that results from the ordinary large number limit of the logistic competition process without mutation. The consequence of this condition is that when the mutant population manages to reach a sufficiently large size, it wipes out the resident population. Of course the other possibility is that the mutant population becomes extinct quickly. Hence, the population should be monomorphic away from the mutation events.
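For a concrete pair of traits, Assumption 1.1(B) can be checked by evaluating the two quantities appearing in (1.2)-(1.3); the helper below (our naming) returns True exactly when one of the two alternatives holds, so that long-term coexistence is excluded.

```python
def invasion_implies_substitution(x, y, b, d, alpha):
    """True if Assumption 1.1(B) holds for the pair (x, y): either y
    cannot invade x (1.2), or y invades x while x cannot invade y
    back (1.3)."""
    inv_y_in_x = (b(y) - d(y)) * alpha(x, x) - (b(x) - d(x)) * alpha(y, x)
    inv_x_in_y = (b(x) - d(x)) * alpha(y, y) - (b(y) - d(y)) * alpha(x, y)
    return inv_y_in_x < 0 or (inv_y_in_x > 0 and inv_x_in_y < 0)
```

Note that `inv_y_in_x` equals α(x,x) times the invasion fitness Fit(y,x) of (2.14). For instance, with b(x)=2−x², d≡1 and α≡1, the condition holds for every pair with |x|≠|y|.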

In this paper, we use the methods of Kurtz [18], based on martingale problems, for separating time scales. We show that on one hand the populations stabilize around their equilibria on the fast ecological scale, while on the other hand, rare mutations at the evolutionary time scale may induce switches from one trait to another. Our proof differs from the one in [2] in that we do not require comparisons with differential equations and large deviation results. Instead we use comparisons with branching processes to exhibit the stabilization of the population around the equilibria determined by the resident trait. Some of our arguments are similar in nature to those presented in [16] for a related model but with finite population sizes.

In Sect. 2, we describe the IBM introduced in [2]. The model accounts only for a trait structure and otherwise has very simple dynamics. More general trait spaces (possibly functional, as first introduced in [10, 24]) are considered in [14], where the “invasion implies substitution” assumption is also relaxed (see also [3, 13, 23]). There exist several other generalizations of the IBMs that underlie the TSS, including e.g. physiological structure [11, 19], diploidy [7] or multiple resources [6]. We consider the process that counts the new traits appearing due to mutations, and the occupation measure Γ K of the process X K under a changed time scale. The tightness of this pair of processes is studied in Sect. 3. The limiting values are shown to satisfy an equation that is considered in Sect. 4. This equation says that when a favorable mutant appears, the distribution describing the population jumps to the equilibrium characterized by the new trait, with a probability depending on the fitness of the mutant trait compared to the resident trait. From the consideration of monomorphic and dimorphic populations and by constructing couplings with branching processes, we prove the convergence in distribution of {Γ K} to the occupation measure Γ of a pure jump Markov process that is called the TSS.

Notation

Let E be a Polish space and let \(\mathcal{B}(E)\) be its Borel sigma field. We denote by \(\mathcal{M}_{F}(E)\) (resp. \(\mathcal{M}_{P}(E)\)) the set of nonnegative finite (resp. point) measures on E, endowed with the topology of weak convergence. If E is compact, this topology coincides with the topology of vague convergence (see e.g. [15]) and for any M>0, the set \(\{\mu\in\mathcal{M}_{F}(E) : \mu(E) \leq M\}\) is compact. For a measure μ, we denote its support by supp(μ). If f is a bounded measurable function on E and \(\mu\in\mathcal{M}_{F}(E)\), we use the notation \(\langle\mu,f\rangle=\int_{E} f(x)\mu(dx)\). With an abuse of notation, \(\langle\mu,x\rangle=\int_{E} x\,\mu(dx)\). Convergence in distribution of a sequence of random variables (or processes) is denoted by ‘⇒’. The minimum of any two numbers \(a,b \in \mathbb{R}\) is denoted by \(a\wedge b\). For any \(a\in \mathbb{R}\), its positive part is denoted by \([a]^{+}\) and ⌊a⌋ denotes the largest integer less than or equal to a. For any two \(\mathbb{N}^{*}\)-valued sequences \(\{a_{K} : K \in \mathbb{N}^{*}\}\) and \(\{b_{K} : K \in \mathbb{N}^{*}\}\) we say that \(a_{K} \ll b_{K}\) if \(a_{K}/b_{K}\to 0\) as K→∞.

Define a class of test functions on \(\mathcal{M_{F}}(\mathbb{X})\) by

$$\begin{aligned} \mathbb{F}^{2}_b = \bigl\{ F_f : F_f(\mu) = F \bigl(\langle\mu, f \rangle\bigr), f \in C_b(\mathbb{X},\mathbb{R}) \text{ and } F \in C^2_b(\mathbb{R}, \mathbb{R}) \text{ with compact support}\bigr\}. \end{aligned}$$

Here \(C_{b}(\mathbb{X},\mathbb{R})\) is the set of all continuous and bounded real functions on \(\mathbb{X}\) and \(C^{2}_{b}(\mathbb{R},\mathbb{R})\) is the set of bounded, twice continuously differentiable real-valued functions on \(\mathbb{R}\) with bounded first and second derivatives. This class \(\mathbb{F}^{2}_{b}\) is separable and it is known (see for example [8]) that it characterizes the convergence in distribution on \(\mathcal{M}_{F}(\mathbb{X})\).

The class of càdlàg processes from \(\mathbb{R}_{+}\) to E is denoted by \(\mathbb{D}(\mathbb{R}_{+},E)\).

The value at time t of a process X is denoted by X(t) or sometimes X t for notational convenience.

2 IBM in the Evolutionary Time-Scale

The process \(X^{K}\) is characterized by its generator \(L^{K}\), defined as follows. For any \(F_{f} \in\mathbb{F}^{2}_{b}\) let

$$\begin{aligned} L^K F_{f}(X) & =K \int_{\mathbb{X}} b(x) \biggl[ \int_{\mathbb{X}} \biggl( F_f \biggl( X+ \frac{1}{K} \delta_{y} \biggr) - F_f(X) \biggr)M^K(x,dy) \biggr] X(dx) \\ &\quad +K \int_{\mathbb{X}} \bigl(d(x)+\bigl\langle X,\alpha(x,.)\bigr \rangle \bigr) \biggl( F_f \biggl( X- \frac{1}{K} \delta_{x} \biggr) - F_f(X) \biggr) X(dx), \end{aligned}$$
(2.1)

where M K is the transition kernel given by

$$ M^K(x,dy)=u_K p(x) m(x,dy)+\bigl(1-u_K p(x) \bigr) \delta_{x}(dy). $$
(2.2)

Let \(K\in \mathbb{N}^{*}\) be fixed. The martingale problem for \(L^{K}\) has a unique solution for any initial condition \(X^{K}(0) \in \mathcal {M}_{F}(\mathbb{X})\), and the solution can be constructed from a stochastic differential equation (SDE) driven by Poisson point processes, which corresponds to the IBM used for simulations (see [4, 5]). The following estimate, proved in [2, Lemma 1], will be needed in the sequel.

Lemma 2.1

Suppose that \(\sup_{K \in \mathbb{N}^{*} } \mathbb{E}( \langle X^{K}(0),1 \rangle ^{2} ) < \infty\), then

$$\sup_{K\geq1,\ t\geq0}\mathbb{E}\bigl(\bigl\langle X^K(t),1\bigr \rangle^2 \bigr)<\infty. $$

In the sequel, we make some assumptions on the initial condition.

Assumption 2.2

Suppose that the sequence of \(\mathcal{M}_{F}(\mathbb{X})\)-valued random variables \(\{X^{K}(0) : K \in \mathbb{N}^{*}\}\) satisfies the following conditions.

  1. (A)

    There exists a \(x_{0} \in \mathbb{X}\) such that supp(X K(0))={x 0} for all \(K \in \mathbb{N}^{*}\).

  2. (B)

    \(\sup_{K\in \mathbb{N}^{*}}\mathbb{E}(\langle X^{K}(0),1\rangle ^{2} ) < \infty\).

  3. (C)

    X K(0)⇒X(0) as K→∞ and 〈X(0),1〉>0 a.s.

From (2.1), we can see that the dynamics has two time scales. The slow time scale, of order \(1/(Ku_{K})\), corresponds to the occurrence of new mutants, while the fast time scale, of order 1, corresponds to the birth-death dynamics. We consider rare mutations and so we require that \(Ku_{K}\to 0\) as K→∞. Moreover, as in [2], we will assume that the arrival of new mutants is slow enough to give each mutant enough time to invade the population if it can (a time of order \(\log K\)) and fast enough to ensure that a new mutant arrives while the resident population is still near its equilibrium (which it leaves only after a time of order \(e^{cK}\) for some c>0). We will therefore make the following assumption.

Assumption 2.3

For any c>0

$$\log K \ll\frac{1}{K u_K } \ll e^{cK}. $$
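As a sanity check, one hypothetical choice satisfying Assumption 2.3 is \(u_{K}=e^{-\sqrt{K}}/K\) (our example, not prescribed by the paper), for which \(Ku_{K}=e^{-\sqrt{K}}\to 0\) and \(1/(Ku_{K})=e^{\sqrt{K}}\). Comparing the three scales through their logarithms avoids numerical overflow:

```python
import math

def log_scales(K, c):
    """Logarithms of the three scales in Assumption 2.3, for the trial
    choice u_K = exp(-sqrt(K)) / K, so that 1/(K u_K) = exp(sqrt(K))."""
    return math.log(math.log(K)), math.sqrt(K), c * K

# the separation log K << 1/(K u_K) << e^{cK} holds once K > 1/c^2
for K in (10 ** 2, 10 ** 4, 10 ** 6):
    low, mid, high = log_scales(K, c=0.5)
    assert low < mid < high
```

For a fixed c>0 the right-hand inequality only kicks in once K>1/c²; since Assumption 2.3 is asymptotic in K, this is enough.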

Consider the process

$$\begin{aligned} Z^K(t) = X^K \biggl( \frac{t}{Ku_K} \biggr), \quad t \geq0. \end{aligned}$$
(2.3)

In what follows, we denote by \(\{ \mathcal{F}^{K}_{t} : t\geq0\}\) the canonical filtration associated with \(Z^{K}\). Due to the change in time, the generator \(\mathbb{L}^{K}\) of \(Z^{K}\) is the generator \(L^{K}\) of \(X^{K}\) multiplied by \(1/(Ku_{K})\). Hence for any \(F_{f} \in\mathbb{F}^{2}_{b}\)

$$\begin{aligned} \mathbb{L}^K F_f(Z)&= \int_{\mathbb{X}} p(x) b(x) \biggl[ \int_{\mathbb{X}} \biggl( F_f \biggl(Z+\frac{1}{K}\delta_{y} \biggr)-F_f(Z) \biggr) m(x,dy) \biggr]Z(dx) \\ &\quad + \frac{1}{K u_K} \biggl[ \int_{\mathbb{X}} b(x) \bigl(1-u_K p(x)\bigr) K \biggl( F_f \biggl( Z+ \frac{1}{K} \delta_{x} \biggr) - F_f(Z) \biggr) Z(dx) \\ &\quad + \int_{\mathbb{X}} \bigl(d(x)+\bigl\langle Z,\alpha(x,.) \bigr\rangle \bigr) K \biggl( F_f \biggl( Z- \frac{1}{K} \delta_{x} \biggr) - F_f(Z) \biggr) Z(dx) \biggr]. \end{aligned}$$
(2.4)

In the process \(Z^{K}\) we have compressed time so that the mutations occur at a rate of order 1. When we work at this time scale we can expect that between subsequent mutations, the fast birth-death dynamics will average out (see e.g. [18]). Our aim is to exploit this separation between the time scales of ecology (related to the births and deaths of individuals) and of evolution (linked to mutations).

To study the averaging phenomenon for the fast birth-death dynamics, we use the martingale techniques developed by Kurtz [18]. We introduce the occupation measure \(\varGamma^{K}\) defined for any t≥0 and for any set \(A \in\mathcal{B} ( \mathcal{M}_{F}(\mathbb{X}) )\) by

$$\begin{aligned} \varGamma^K \bigl( [0,t] \times A \bigr) = \int_0^t \mathbf{1}_{A} \bigl( Z^K(s) \bigr)\, ds. \end{aligned}$$
(2.5)

Kurtz’s techniques have been used in the context of measure-valued processes in [20, 21] for different population dynamic problems, but an additional difficulty arises here due to the presence of non-linearities at the fast time scale.

We introduce an \(\mathcal{M}_{P}(\mathbb{X})\)-valued process \(\{\chi^{K}(t):t\geq0\}\) which keeps track of the traits that have appeared in the population. That is, for each t≥0, \(\chi^{K}(t)\) is a counting measure on \(\mathbb{X}\) that records the traits that have appeared in the population up to time t. The process \(\chi^{K}\) is a pure-jump Markov process that satisfies the following martingale problem. For any \(F_{f} \in\mathbb{F}^{2}_{b}\)

$$\begin{aligned} M^{\chi,K}_t&:= F_f\bigl(\chi^K(t) \bigr)-F_f\bigl(\chi^K(0)\bigr) \end{aligned}$$
(2.6)
$$\begin{aligned} &\quad - \int_0^t \int_{\mathbb{X}} p(x) b(x) \int_{\mathbb{X}} \bigl(F_f\bigl(\chi ^K(s)+\delta_{y}\bigr)-F_f\bigl( \chi^K(s)\bigr) \bigr) m(x,dy)\, Z^K(s,dx)ds \\ &= F_f\bigl(\chi^K(t)\bigr)-F_f\bigl( \chi^K(0)\bigr) \\ &\quad - \int_0^t \int_{\mathcal{M}_F(\mathbb{X})} \biggl[ \int_{\mathbb{X}} p(x) b(x) \int_{\mathbb{X}} \bigl( F_f\bigl(\chi^K(s)+\delta_{y} \bigr) \\ &\quad -F_f\bigl(\chi^K(s)\bigr) \bigr) m(x,dy) \mu(dx) \biggr] \varGamma^K(ds \times d\mu), \end{aligned}$$
(2.7)

is a square integrable \(\{\mathcal{F}^{K}_{t}\}\)-martingale.

The main result of the paper proves the convergence of {(χ K,Γ K)} to a limit (χ,Γ), where the slow component χ is a jump Markov process and the fast component stabilizes in an equilibrium that depends on the value of the slow component.

Theorem 2.4

Suppose that Assumptions 1.1, 2.2 and 2.3 hold.

  1. (A)

    There exists a process χ with paths in \(\mathbb{D}(\mathbb{R}_{+}, \mathcal{M}_{P}(\mathbb{X}) )\) and a random measure \(\varGamma\in\mathcal{M}_{F} ([0,\infty)\times \mathcal{M}_{F}(\mathbb{X}) )\) such that (χ K,Γ K)⇒(χ,Γ) as K→∞, where (χ,Γ) satisfy the following. For all functions \(F_{f} \in\mathbb {F}^{2}_{b}\),

    $$\begin{aligned} & F_f\bigl(\chi(t)\bigr) - F_f\bigl(\chi(0)\bigr) \\ &\quad -\int_{0}^{t} \int_{\mathcal{M}_F(\mathbb{X})} \biggl[\int_\mathbb{X}b(x)p(x) \int_{\mathbb{X}} \bigl(F_f\bigl(\chi(s)+\delta_{y}\bigr) \\ &\quad -F_f \bigl(\chi(s)\bigr) \bigr) m(x,dy) \mu(dx) \biggr]\varGamma(ds \times d \mu) \end{aligned}$$
    (2.8)

    is a square integrable martingale with respect to the filtration

    $$\begin{aligned} \mathcal{F}_t = \sigma \bigl\{\chi(s), \varGamma \bigl([0,s] \times A \bigr) : s \in[0,t], A \in\mathcal{B} \bigl( \mathcal{M}_F(\mathbb{X}) \bigr) \bigr\} \end{aligned}$$
    (2.9)

    and for any t≥0

    $$\begin{aligned} \int_{0}^{t}\int _{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f (\mu )\varGamma(ds \times d \mu) = 0\quad \textit{for all }F_f \in\mathbb {F}^{2}_b \ \textit{a.s.}, \end{aligned}$$
    (2.10)

    where the nonlinear operator \(\mathbb{B}\) is defined by

    $$\begin{aligned} \mathbb{B}F_f(\mu) = F' \bigl( \langle\mu,f \rangle \bigr)\int_{ \mathbb{X}} \bigl( b(x) - \bigl( d(x)+\bigl\langle\mu, \alpha (x,.)\bigr\rangle \bigr) \bigr) f(x) \mu(dx). \end{aligned}$$
    (2.11)
  2. (B)

    Let Γ be as in part (A). Then for any t>0 and \(A \in\mathcal{B} ( \mathcal{M}_{F}(\mathbb{X}) )\) we have

    $$\begin{aligned} \varGamma \bigl( [0,t] \times A \bigr) = \int_0^t \mathbf{1}_{A} \bigl( \widehat{n}_{\chi'(s)} \delta_{\chi'(s)} \bigr)\, ds, \end{aligned}$$
    (2.12)

    where {χ′(t):t≥0} is a \(\mathbb{X}\)-valued Markov jump process with χ′(0)=x 0 and generator given by

    $$\begin{aligned} \mathbb{C} f(x) = b(x)p(x) \widehat{n}_{x} \int _{\mathbb{X}} \frac {[\mathrm{Fit}(y,x)]^{+}}{b(y)} \bigl( f(y) - f(x) \bigr) m(x,dy), \end{aligned}$$
    (2.13)

    for any \(f \in C_{b}(\mathbb{X},\mathbb{R})\). Here the population equilibrium \(\widehat {n}_{x}\) and the fitness function Fit(y,x) are defined as

    $$ \widehat{n}_x=\bigl(b(x)-d(x)\bigr)/\alpha(x,x)\quad\textit{and} \quad\mathrm {Fit}(y,x) = b(y) - d(y) - \alpha(y,x)\widehat{n}_x . $$
    (2.14)

Note that part (B) of the theorem characterizes the limiting occupation measure Γ. Once Γ is known, χ is characterized by the martingale problem (2.8). The next remark motivates the formula for \(\widehat{n}_{x}\) and the fitness function Fit(y,x).

Remark 2.5

Consider a Lotka-Volterra system that results from the large number limit of the logistic competition process without mutation. If the system has two traits x and y, then the population sizes \(n_{x}(t)\) and \(n_{y}(t)\) corresponding to these traits evolve according to the system of ordinary differential equations given by

$$\begin{aligned} \begin{aligned} \frac{dn_x}{dt}&= n_x(t) \bigl(b(x) - d(x) - \alpha(x,x)n_x(t) - \alpha(x,y) n_y(t) \bigr) \\ \frac{dn_y}{dt}&= n_y(t) \bigl(b(y) - d(y) - \alpha(y,x)n_x(t) - \alpha(y,y) n_y(t) \bigr), \end{aligned} \end{aligned}$$
(2.15)

where α is the competition kernel. Observe that \((\widehat {n}_{x},0)\) is a fixed point for this system. The fitness function Fit(y,x) describes the growth rate of a negligible mutant population with trait y in an environment characterized by \(\widehat {n}_{x}\). Furthermore, the fixed point \((\widehat{n}_{x},0)\) is asymptotically stable if and only if Fit(y,x)<0. However, the analysis of such a dynamical system (done in [2]) is not necessary for our purpose.
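The stability claim can also be observed numerically; the following forward Euler integration of (2.15) is a rough sketch, with step size and horizon chosen ad hoc.

```python
def lotka_volterra(n0, x, y, b, d, alpha, dt=1e-3, steps=200_000):
    """Forward Euler integration of the dimorphic system (2.15),
    started from n0 = (n_x(0), n_y(0))."""
    nx, ny = n0
    for _ in range(steps):
        # per-capita growth rates of the two subpopulations
        gx = b(x) - d(x) - alpha(x, x) * nx - alpha(x, y) * ny
        gy = b(y) - d(y) - alpha(y, x) * nx - alpha(y, y) * ny
        nx, ny = nx + dt * nx * gx, ny + dt * ny * gy
    return nx, ny
```

With b(z)=2−z², d≡1, α≡1, resident x=0.2 and mutant y=0.5, we get \(\widehat{n}_{x}=0.96\) and Fit(y,x)=−0.21<0: a small mutant population started near \((\widehat{n}_{x},0)\) dies out and the integration returns approximately (0.96, 0).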

3 Tightness of {(χ K,Γ K)}

To study the limit when K→∞, we proceed by a tightness-uniqueness argument. First, we show the tightness of the distributions of \(\{ ( \chi^{K},\varGamma^{K} ) : K \in \mathbb{N}^{*}\}\) and derive certain properties of the limiting distribution. The limiting values of {Γ K} satisfy an equation that characterizes the state of the population between two mutations, thanks to the “invasion implies substitution” assumption.

Theorem 3.1

Suppose that Assumption 1.1 is satisfied, \(\sup_{K\geq 1}\mathbb{E}(\langle X^{K}(0),1\rangle^{2})<\infty\) and \(Ku_{K}\to 0\) as K→∞. Then:

  1. (A)

    The distributions of \(\{ (\chi^{K},\varGamma^{K}) : K \in \mathbb{N}^{*}\}\) are tight in the space

    $$\mathcal{P} \bigl( \mathbb{D}\bigl(\mathbb{R}_+, \mathcal{M}_P(\mathbb{X}) \bigr) \times \mathcal{M}_F \bigl(\mathbb{R}_+\times \mathcal{M}_F(\mathbb{X}) \bigr) \bigr). $$
  2. (B)

    Suppose that (χ K,Γ K)⇒(χ,Γ), along some subsequence, as K→∞. Then χ satisfies the martingale problem given by (2.8) and Γ satisfies (2.10).

Proof

To prove the tightness of \(\{\chi^{K} : K \in \mathbb{N}^{*}\}\), we use a criterion from [12]. Let \(n^{K}(t)=\langle\chi^{K}(t),1\rangle\) for t≥0. This process counts the number of mutations that have occurred in the population. For any T>0 we have by Lemma 2.1 that

$$\begin{aligned} \sup_{K \geq1} \mathbb{E}\bigl( n^K(T)\bigr) \leq \|b\|_\infty T \sup_{K \geq1, t\leq T} \mathbb{E}\bigl( \bigl\langle Z^K(t),1\bigr\rangle \bigr)\leq\|b \|_\infty T \sup_{K\geq1, t \geq0} \mathbb{E}\bigl( \bigl\langle X^K(t),1\bigr\rangle \bigr)< \infty. \end{aligned}$$
(3.1)

From this estimate and the martingale problem (2.7), it can be checked by using Aldous and Rebolledo criteria that for every test function \(f\in C_{b}(\mathbb{X},\mathbb{R})\), the laws of 〈χ K,f〉 are tight in \(\mathbb{D}(\mathbb{R}_{+},\mathbb{R})\) and the compact containment condition is satisfied.

Let us now prove the tightness of \(\{\varGamma^{K} : K \in \mathbb{N}^{*}\}\). Let ϵ>0 be fixed. Using Lemma 2.1, there exists a N ϵ >0 such that

$$\begin{aligned} \sup_{K \geq1, t \geq0} \mathbb{P}\bigl(\bigl\langle Z^K(t),1\bigr\rangle> N_{\epsilon }\bigr)<\epsilon. \end{aligned}$$
(3.2)

Since \(\mathbb{X}\) is compact, the set \(\mathcal{K}_{\epsilon}=\{\mu\in \mathcal{M}_{F}(\mathbb{X}),\ \langle\mu,1\rangle\leq N_{\epsilon}\}\) is compact. We deduce that for any T>0

$$ \inf_{K \geq1} \mathbb{E}\bigl( \varGamma^K \bigl([0,T] \times\mathcal {K}_{\epsilon} \bigr) \bigr) \geq(1-\epsilon)T. $$
(3.3)

Indeed,

$$\begin{aligned} \mathbb{E}\bigl( \varGamma^K \bigl([0,T] \times\mathcal{K}_{\epsilon} \bigr) \bigr) = \mathbb{E}\biggl( \int_0^T \mathbf{1}_{\mathcal{K}_{\epsilon}} \bigl( Z^K(t) \bigr)\, dt \biggr) = \int_0^T \mathbb{P}\bigl( \bigl\langle Z^K(t),1 \bigr\rangle\leq N_{\epsilon} \bigr)\, dt, \end{aligned}$$

and the result follows from Fubini’s theorem and (3.2). From Lemma 1.3 of [18], \(\{\varGamma^{K} : K \in \mathbb{N}^{*}\}\) is a tight family of random measures. The joint tightness of \(\{ ( \chi^{K},\varGamma^{K} ) : K\in \mathbb{N}^{*} \}\) is immediate from the tightness of \(\{\chi^{K} : K \in \mathbb{N}^{*} \} \) and \(\{\varGamma^{K} : K \in \mathbb{N}^{*}\}\). This proves part (A).

We now prove part (B). Our proof is adapted from the proof of Theorem 2.1 in [18]. From part (A) we know that the distributions of \(\{(\chi^{K},\varGamma^{K}) : K \in \mathbb{N}^{*}\}\) are tight. Therefore there exists a subsequence \(\{\eta_{K}\}\) along which \((\chi^{K},\varGamma^{K})\) converges in distribution to a limit (χ,Γ). We can take the limit in (2.7) along this subsequence and show that \(M_{t}^{\chi,K}\) converges in distribution to the martingale given by (2.8).

Let us now show that the limiting value Γ satisfies (2.10). From (2.4), for any \(F_{f} \in \mathbb{F}^{2}_{b}\), we get that

$$\begin{aligned} m^{F,f,K}_t & = F_f \bigl(Z^K(t) \bigr)-F_f \bigl(Z^K(0) \bigr)- \int_0^t \mathbb{L}^K F_f\bigl(Z^K(s)\bigr) ds \\ & = F_f \bigl(Z^K(t) \bigr) - F_f \bigl(Z^K(0) \bigr)- \biggl( \frac{1}{K u_K} \biggr)\int_{0}^{t} \int _{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f(\mu) \varGamma^K(ds \times d\mu) \\ &\quad - \frac{\delta ^{F,f,K}(t)}{K u_K} \end{aligned}$$
(3.4)

is a martingale. Here the operator \(\mathbb{B}\) is defined by (2.11) and

$$\begin{aligned} \delta^{F,f,K}(t)= \int_{0}^{t} \bigl(K u_K \mathbb{L}^K F_f \bigl(Z^K(s)\bigr) - \mathbb{B} F_f\bigl(Z^K(s) \bigr) \bigr)ds. \end{aligned}$$
(3.5)

For any \(\mu\in\mathcal{M}_{F}(\mathbb{X})\) we have

$$\begin{aligned} &K u_K \mathbb{L}^K F_f(\mu) - \mathbb{B} F_f(\mu) \\ &\quad = K u_K \int_{\mathbb{X}} p(x) b(x) \biggl[ \int_{\mathbb{X}} \biggl( F_f \biggl( \mu+\frac{1}{K}\delta_{y} \biggr)-F_f(\mu) \biggr) m(x,dy) \biggr]\mu(dx) \\ &\qquad + K \int_{\mathbb{X}} b(x) \biggl( F_f \biggl(\mu+ \frac{1}{K} \delta_{x} \biggr) - F_f(\mu) - \frac{1}{K} F' \bigl( \langle\mu,f \rangle \bigr)f(x) \biggr) \mu(dx) \\ &\qquad + K \int_{\mathbb{X}} \bigl( d(x) + \bigl\langle\mu, \alpha(x,.) \bigr\rangle \bigr) \biggl( F_f \biggl( \mu- \frac{1}{K} \delta_{x} \biggr) - F_f(\mu) + \frac{1}{K} F' \bigl( \langle \mu, f \rangle \bigr)f(x) \biggr) \mu(dx) \\ &\qquad - K u_K \int_{\mathbb{X}} p(x)b(x) \biggl( F_f \biggl( \mu+ \frac{1}{K} \delta_{x} \biggr) - F_f(\mu) \biggr)\mu(dx). \end{aligned}$$
(3.6)

For any \(x\in \mathbb{X}\) and \(\mu\in\mathcal{M}_{F}(\mathbb{X})\), we have by Taylor expansion that for some α 1,α 2∈(0,1):

$$\begin{aligned} F_f \biggl( \mu\pm\frac{1}{K} \delta_{x} \biggr) - F_f(\mu) & = \pm F' \biggl(\langle\mu, f\rangle+ \alpha_1 \frac{f(x)}{K} \biggr) \frac{f(x)}{K} \\ & = \pm F' \bigl(\langle\mu,f\rangle\bigr) \frac{f(x)}{K} + \frac {f(x)^2}{2K^2} F'' \biggl(\langle\mu,f\rangle+ \alpha_2 \frac {f(x)}{K} \biggr). \end{aligned}$$

Therefore we get

$$\begin{aligned} \biggl \vert F_f \biggl( \mu\pm\frac{1}{K} \delta_{x} \biggr) - F_f(\mu ) \biggr \vert \leq \frac{\|F'\|_\infty\|f\|_{\infty}}{K} \end{aligned}$$

and

$$\begin{aligned} \biggl \vert K \biggl( F_f \biggl( \mu\pm\frac{1}{K} \delta_{x} \biggr) - F_f(\mu) \mp\frac{1}{K} F' \bigl(\langle\mu,f\rangle\bigr)f(x) \biggr) \biggr \vert \leq \frac{\|F''\|_\infty\|f\|_{\infty}^2}{2K}. \end{aligned}$$

Using these estimates and Assumption 1.1,

$$\begin{aligned} \bigl \vert K u_K \mathbb{L}^K F_f(\mu) - \mathbb{B} F_f(\mu) \bigr \vert & \leq2 u_K \Vert b \Vert _{\infty} \bigl\|F'\bigr\|_\infty\|f \|_{\infty} \langle \mu,1 \rangle \\ &\quad + \frac{\|F''\|_\infty\|f\|_{\infty}^2}{2K} \bigl( \bigl(\Vert b\Vert _{\infty} +\Vert d \Vert _{\infty} \bigr) \langle\mu ,1 \rangle+ \overline {\alpha} \langle\mu,1 \rangle^2 \bigr). \end{aligned}$$

Pick any T>0. This estimate along with Lemma 2.1 implies that as K→∞, δ F,f,K(t) (given by (3.5)) converges to 0 in \(L^{1}(d\mathbb{P})\), uniformly in t∈[0,T]. Multiplying (3.4) by \(Ku_{K}\) and letting K→∞, we get that along the subsequence \(\{\eta_{K}\}\), the sequence of martingales \(\{ Ku_{K} m^{F,f,K} : K \in \mathbb{N}^{*}\}\) converges in \(L^{1}(d\mathbb{P})\), uniformly in t∈[0,T], to \(\int_{0}^{t} \int_{\mathcal{M}_{F}(\mathbb{X})} \mathbb{B} F_{f} (\mu)\varGamma(ds \times d\mu)\). The limit is itself a martingale. Since it is continuous and has paths of bounded variation, it must be 0 at all times a.s. Hence for any \(F_{f} \in\mathbb{F}^{2}_{b}\),

$$\int_{0}^{t} \int_{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f (\mu )\varGamma(ds \times d\mu) = 0\quad \text{a.s.} $$

Separability of \(\mathbb{F}^{2}_{b}\) ensures that (2.10) also holds. □

Theorem 3.1 shows tightness of the family of occupation measures \(\{\varGamma^{K} : K \in \mathbb{N}^{*}\}\). In the next section we will prove that this family has a unique limit point Γ which is the occupation measure of the process \(\{ \widehat{n}_{\chi'(t)} \delta_{\chi'(t) } : t \geq0 \}\) (see Theorem 2.4) that corresponds to the TSS. This demonstrates the convergence of Z K to the TSS in the sense of occupation measures. Note that the process Z K does not converge in the Skorokhod topology as K→∞ (see Proposition 1 in [2]), due to sharp transitions in the process at the time of mutations. Hence the convergence of Z K to the TSS is shown in the sense of finite dimensional distributions in [2]. In our approach, instead of working with Z K we work with its occupation measure Γ K which remains relatively unaffected by the behaviour of Z K over small time intervals. This mitigates the problem of having sharp mutation-induced transitions in Z K and simplifies the analysis.

4 Characterization of the Limiting Values

4.1 Dynamics Without Mutation

As in [2, 4], to understand the information provided by (2.10), we need to consider the dynamics of monomorphic and dimorphic populations. Our purpose in this section is to show that under the time scale separation given by Assumption 2.3, the operator (2.11) and Assumption 1.1(B) characterize the state of the population between two mutant arrivals. Because of Assumption 1.1(B), we will see that two different traits cannot coexist in the long term and thus it suffices to work with monomorphic or dimorphic initial populations (i.e. the support of \(Z^{K}_{0}\) is one or two singletons).

In Sect. 4.1.1, we consider monomorphic or dimorphic populations and show convergence of the occupation measures when the final trait composition of the population is known. For instance, if the final trait is \(x_{0}\), then the occupation measure of \(Z^{K}\) converges to \(\widehat{n}_{x_{0}} \delta_{x_{0}}(dx)\,dt\). In Sect. 4.1.2, we use couplings with linear birth and death processes to show that the distribution of the final trait composition of the population can be computed from the fitness of the mutant and the resident.

4.1.1 Convergence of the Occupation Measure Γ K in the Absence of Mutation

First, we show that the “invasion implies substitution” assumption (Assumption 1.1(B)) provides information on the behavior of a dimorphic population when we know which trait is fixed.

Definition 4.1

Let \(L^{K}_{0}\) be the operator L K (given by (2.1)) with p(x)=0 for all \(x \in \mathbb{X}\). We will denote by {Y K(t):t≥0} a process with generator \(L^{K}_{0}\) and an initial condition that varies according to the case that is studied. This process has the same birth-death dynamics as a process with generator L K, but there is no mutation.

In this section we investigate how a process with generator \(L^{K}_{0}\) behaves at time scales of order \((Ku_{K})^{-1}\), when the population is monomorphic or dimorphic. We start by proving a simple proposition.

Proposition 4.2

For any \(x,y \in \mathbb{X}\) suppose that \(\pi\in\mathcal{P} ( \mathcal{M}_{F}(\mathbb{X}) )\) is such that

$$\begin{aligned} \pi \bigl( \bigl\{ \mu\in\mathcal{M}_{F}(\mathbb{X}) : \{x\} \subset \mathrm{supp}(\mu) \subset\{x,y\} \bigr\} \bigr) = 1 \end{aligned}$$
(4.1)

and

$$ \int_{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f (\mu)\pi(d\mu) = 0 $$
(4.2)

for all \(F_{f} \in\mathbb{F}^{2}_{b}\). Then for any \(A \in\mathcal {B} ( \mathcal{M}_{F}(\mathbb{X}) )\) we have

$$ \pi(A) = \delta_{\widehat{n}_x \delta_{x}}(A), $$

where \(\widehat{n}_{x}\) has been defined in (2.14).

Proof

Since π satisfies (4.1), any μ picked from the distribution π has the form \(\mu=n_{x}\delta_{x}+n_{y}\delta_{y}\) with \(n_{x}>0\). Let Φ be the map from \(\mathcal{M}_{F}(\mathbb{X})\) to \(\mathbb{R}_{+} \times \mathbb{R}_{+}\) defined by

$$\varPhi(\mu) = \bigl( \mu\{x\}, \mu\{y\} \bigr) = (n_x, n_y). $$

Let \(\pi^{*}=\pi\circ\varPhi^{-1} \in\mathcal{P} ( \mathbb{R}_{+} \times \mathbb{R}_{+} )\) be the image of the distribution π under Φ. Replacing the operator \(\mathbb{B}\) by its definition, we can rewrite (4.2) as

$$\begin{aligned} 0 & = \int_{\mathcal{M}_F(\mathbb{X})}F'\bigl(\langle\mu,f\rangle \bigr) \biggl(\bigl\langle\mu, (b-d)f\bigr\rangle-\int_{\mathbb{X}} f(x) \bigl\langle\mu, \alpha (x,.)\bigr\rangle\mu(dx) \biggr) \pi(d\mu) \\ & = \int_{\mathbb{R}_{+} \times \mathbb{R}_{+}} F' \bigl( f(x)n_x+f(y)n_y \bigr) \bigl[ \bigl( b(x)-d(x) - \alpha(x,x)n_x - \alpha(x,y)n_y \bigr) n_x f(x) \\ &\quad + \bigl( b(y)-d(y) - \alpha(y,x)n_x - \alpha(y,y)n_y \bigr) n_y f(y) \bigr] \pi^{*}(dn_x,dn_y). \end{aligned}$$

This equation can hold for all \(F_{f} \in\mathbb{F}^{2}_{b}\) only if the support of \(\pi^{*}\) consists of \((n_{x},n_{y})\) with \(n_{x}>0\) that satisfy \((b(x)-d(x)-\alpha(x,x)n_{x}-\alpha(x,y)n_{y})n_{x}=0\) and \((b(y)-d(y)-\alpha(y,x)n_{x}-\alpha(y,y)n_{y})n_{y}=0\). The only possible solutions are \((\widehat{n}_{x},0)\) and

$$\begin{aligned} (\widetilde{n}_x,\widetilde{n}_y) = \biggl(&\frac{(b(x)-d(x))\alpha(y,y)-(b(y)-d(y))\alpha(x,y)}{\alpha (x,x)\alpha(y,y)-\alpha(x,y)\alpha(y,x)},\\ & \frac{(b(y)-d(y))\alpha (x,x)-(b(x)-d(x))\alpha(y,x)}{\alpha(x,x)\alpha(y,y)-\alpha (x,y)\alpha(y,x)} \biggr). \end{aligned}$$

However due to Assumption 1.1, either \(\widetilde{n}_{x}\) or \(\widetilde{n}_{y}\) is negative and hence \((\widetilde{n}_{x},\widetilde {n}_{y})\) cannot be in the support of \(\pi^{*}\). Therefore \(\pi ^{*} (\{(\widehat{n}_{x},0)\} ) = 1\) and this proves the proposition. □

Remark 4.3

Note that (0,0), \((\widehat{n}_{x},0)\), \((0,\widehat{n}_{y})\) and \((\widetilde{n}_{x},\widetilde{n}_{y})\) are the stationary solutions of the Lotka-Volterra system given by (2.15).
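These stationary solutions can be evaluated explicitly; the helper below (our naming, assuming the determinant α(x,x)α(y,y)−α(x,y)α(y,x) is nonzero) computes the interior fixed point \((\widetilde{n}_{x},\widetilde{n}_{y})\) by Cramer's rule, and in an example satisfying Assumption 1.1 one coordinate comes out negative.

```python
def interior_fixed_point(x, y, b, d, alpha):
    """Interior stationary solution (n_tilde_x, n_tilde_y) of (2.15);
    requires alpha(x,x)*alpha(y,y) != alpha(x,y)*alpha(y,x)."""
    det = alpha(x, x) * alpha(y, y) - alpha(x, y) * alpha(y, x)
    rx, ry = b(x) - d(x), b(y) - d(y)
    # Cramer's rule for alpha(x,x)n_x + alpha(x,y)n_y = rx, etc.
    return ((rx * alpha(y, y) - ry * alpha(x, y)) / det,
            (ry * alpha(x, x) - rx * alpha(y, x)) / det)
```

For example, with b(z)=2−z², d≡1 and the (hypothetical) kernel α(u,v)=1+0.5|u−v|, the pair x=0.2, y=0.5 satisfies (1.2), and \(\widetilde{n}_{x}\approx-0.30<0\), so the interior point cannot lie in the support of π*.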

Heuristically, the “invasion implies substitution” assumption prevents two traits from coexisting in the long run. If we know which trait survives, then we know that it fixates and Proposition 4.2 provides the form of the solution π to (4.2). In this case, we can deduce the convergence of the occupation measure of Y K(⋅/Ku K ).

Corollary 4.4

Let \(x,y \in \mathbb{X}\). For each \(K \in \mathbb{N}^{*}\), let {Y K(t):t≥0} be a process with generator \(L^{K}_{0}\) and supp(Y K(0))={x,y}. Let T>0, and suppose that there exists a δ>0 such that:

$$\begin{aligned} \lim_{K \to\infty} \mathbb{P}\biggl(Y^K_t \{x\} < \delta \ \textit{for some}\ t \in \biggl[ 0 ,\frac{T}{K u_K} \biggr] \biggr) = 0. \end{aligned}$$
(4.3)

Then for any \(F_{f}\in\mathbb{F}^{2}_{b}\),

$$\int_0^T \int_{\mathcal{M}_F(\mathbb{X})}F_f( \mu)\varGamma_0^K(dt\times d\mu):=\int _{0}^{T} F_f \biggl( Y^K \biggl( \frac{t}{K u_K} \biggr) \biggr) dt \Rightarrow T \times F_f ( \widehat{n}_x \delta_{x} ) $$

as K→∞.

Proof

As in part (A) of Theorem 3.1, we can show that \(\{ \varGamma^{K}_{0} : K \in \mathbb{N}^{*}\}\) is tight in the space \(\mathcal {P} (\mathcal{M}_{F} ([0,T]\times \mathcal{M}_{F}(\mathbb{X}) ) )\). Let \(\varGamma_{0}\) be a limit point. Then from part (C) of Theorem 3.1 we get that

$$\begin{aligned} \int_{0}^{T}\int_{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f (\mu )\varGamma_0(dt \times d\mu) = 0 \quad \text{for all } F_f \in\mathbb {F}^{2}_b \text{ a.s.,} \end{aligned}$$
(4.4)

where the operator \(\mathbb{B}\) is given by (2.11).

Since \(\mathrm{supp}(Y^{K}(0))\subset\{x,y\}\), we also have \(\mathrm{supp}(Y^{K}(t))\subset\{x,y\}\) for all t≥0. Let

$$\mathcal{S}_{\delta} = \bigl\{ \mu\in\mathcal{M}_F(\mathbb{X}) : \mu \{ x\} \geq\delta \bigr\}. $$

Observe that \(\varGamma^{K}_{0}([0,T]\times\mathcal{S}_{\delta})\leq T\) a.s. and

$$\varGamma^{K}_{0}\bigl([0,T]\times\mathcal{S}_{\delta}\bigr)\geq T\,\mathbf{1}_{ \{ Y^K_t\{x\} \geq \delta\ \text{for all}\ t \in [ 0, T/(Ku_K) ] \} } \quad\text{a.s.}$$

Hence by (4.3) we get that \(\varGamma^{K}_{0} ([0,T] \times\mathcal{S}_{\delta}) \) converges to T in \(L^{1}(d\mathbb{P})\). Because \(\mathcal{S}_{\delta}\) is a closed set, \(\varGamma_{0} ([0,T] \times\mathcal{S}_{\delta}) = T \text{ a.s.}\)

Let π be the \(\mathcal{P} ( \mathcal{M}_{F}(\mathbb{X}) )\)-valued random variable defined by \(\pi(A)=\varGamma_{0}([0,T]\times A)/T\) for any \(A \in\mathcal{B} ( \mathcal{M}_{F}(\mathbb{X}) )\). Then \(\pi ( \mathcal{S}_{\delta} ) = 1\) and hence π satisfies (4.1) almost surely. Furthermore \(\int_{\mathcal{M}_{F}(\mathbb{X})} \mathbb{B} F_{f} (\mu)\pi(d\mu) = 0\) for all \(F_{f} \in\mathbb{F}^{2}_{b}\), almost surely. Proposition 4.2 then proves the corollary. □
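The averaging mechanism behind this corollary can be checked numerically. The sketch below is a minimal Gillespie-type simulation with hypothetical rates b, d, α for a one-trait population (the monomorphic special case, not the coupling used in the proofs): the time average of the rescaled population size concentrates near \(\widehat{n}_{x}=(b(x)-d(x))/\alpha(x,x)\).

```python
import random

# Sketch with hypothetical rates: time average of the rescaled size of a
# one-trait logistic birth-death process concentrates at n_hat_x.
def logistic_bd_time_average(K, b, d, a, T, rng):
    """Gillespie simulation of the 1-trait logistic birth-death process;
    returns the time average of N_t / K over [0, T]."""
    n = int(0.2 * K)          # start away from equilibrium
    t, acc = 0.0, 0.0
    while t < T and n > 0:
        birth = b * n
        death = (d + a * n / K) * n       # logistic per-capita death rate
        total = birth + death
        dt = rng.expovariate(total)       # waiting time to next event
        acc += (n / K) * min(dt, T - t)   # accumulate occupation weight
        t += dt
        n += 1 if rng.random() < birth / total else -1
    return acc / T

b_x, d_x, a_xx = 2.0, 1.0, 1.0
n_hat_x = (b_x - d_x) / a_xx              # equilibrium density 1.0
avg = logistic_bd_time_average(K=400, b=b_x, d=d_x, a=a_xx,
                               T=50.0, rng=random.Random(2))
print(round(avg, 2))  # close to n_hat_x = 1.0
```

Increasing K shrinks the fluctuations of this time average around \(\widehat{n}_{x}\), roughly at rate \(1/\sqrt{K}\).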

4.1.2 Fixation Probabilities

We have seen in Corollary 4.4 that the behaviour of a dimorphic population is known provided we know which trait survives and fixates. Following Champagnat et al. [2, 4], we answer this question by using couplings with branching processes. This is done in Propositions 4.6 and 4.7, whose proofs are given in the Appendix. These propositions study populations evolving as Markov processes with generator \(L^{K}_{0}\) (see Definition 4.1). Proposition 4.6 shows that over a time period of order \((Ku_{K})^{-1}\), a monomorphic population with a non-negligible initial size does not die out, and a monomorphic population that starts near equilibrium remains near equilibrium. Proposition 4.7 considers a dimorphic population in which the resident population is near equilibrium while the mutant population is small. It shows that an unfavourable mutant dies out quickly (on the evolutionary time scale), while a favourable mutant invades the population with a positive probability whose value can be computed explicitly. Moreover, after a successful invasion the mutant population does not die out over a time period of order \((Ku_{K})^{-1}\). Formally, the process we consider in Propositions 4.6 and 4.7 is defined as follows.

Definition 4.5

Pick two trait values \(x,y\in \mathbb{X}\) and two \(\mathbb{N}^{*}\)-valued sequences \(\{ z_{1}^{K} : K \in \mathbb{N}^{*} \}\) and \(\{z_{2}^{K} : K \in \mathbb{N}^{*} \}\). Let {Y K(t):t≥0} be the process with generator \(L^{K}_{0}\) (see Definition 4.1) and initial condition

$$\begin{aligned} Y^K(0)=\frac{ z_1^K}{K} \delta_x + \frac{ z_2^K}{K} \delta_y. \end{aligned}$$

Here x and y should be seen as the resident trait and the mutant trait respectively.

For any \(x \in \mathbb{X}\) and ϵ>0 let

$$\begin{aligned} \mathcal{N}_{\epsilon}(x) = \bigl\{ \mu\in \mathcal{M}_F(\mathbb{X}) : \mathrm{supp}(\mu)=\{x\} \text{ and } \langle \mu, 1 \rangle\in [ \widehat{n}_x -\epsilon, \widehat{n}_x+ \epsilon ] \bigr\}. \end{aligned}$$
(4.5)

Proposition 4.6

(Behaviour of a monomorphic population)

Suppose that Assumptions 1.1 and 2.3 hold. Let {Y K(t):t≥0} be the process given by Definition 4.5. Assume that \(z^{K}_{2}= 0\) for all \(K \in \mathbb{N}^{*}\). Then for any T>0 we have the following.

  1. (A)

    A monomorphic population with a non-negligible size does not die out in a time of order \((Ku_{K})^{-1}\): Suppose that for some ϵ>0, \(z^{K}_{1} \geq K \epsilon\) for each \(K \in \mathbb{N}^{*}\). Then for some δ>0

    $$\begin{aligned} \lim _{K \to\infty} \mathbb{P}\biggl( \exists t \in \biggl[ 0 , \frac{T}{K u_K} \biggr],\ Y^K_t\{x\} < \delta \biggr) = 0. \end{aligned}$$
    (4.6)
  2. (B)

    A monomorphic population with a size around equilibrium remains near equilibrium for a time of order \((Ku_{K})^{-1}\): Suppose that for some ϵ>0, \(z^{K}_{1} \in[ K(\widehat{n}_{x} -\epsilon), K(\widehat{n}_{x}+\epsilon)]\) for each \(K \in \mathbb{N}^{*}\). Then

    $$\begin{aligned} \lim _{K \to\infty} \mathbb{P}\biggl(\exists t \in \biggl[ 0 , \frac{T}{K u_K} \biggr],\ Y^K(t) \notin\mathcal{N}_{2 \epsilon}(x) \biggr) = 0. \end{aligned}$$
    (4.7)

Proposition 4.7

(Behaviour of a dimorphic population)

Suppose that Assumptions 1.1 and 2.3 hold. Let \(\{Y^{K}(t):t\geq0\}\) be the process given by Definition 4.5. We assume that the resident population is near equilibrium, that is, for a small ϵ>0 we have \(z^{K}_{1} \in[ K(\widehat{n}_{x} -\epsilon), K(\widehat {n}_{x}+\epsilon)]\) for all \(K \in \mathbb{N}^{*}\). Suppose that \(\{t_{K}\}\) is any \(\mathbb{N}^{*}\)-valued sequence such that \(\log K \ll t_{K} \ll 1/(Ku_{K})\). Let \(\mathcal{S}_{K}=( \widehat{n}_{x}-2\epsilon, \widehat{n}_{x}+2\epsilon )\times( 0 ,2 \epsilon)\) and let \(T_{\mathcal{S}_{K}}\) be the stopping time

$$\begin{aligned} T_{\mathcal{S}_K} = \inf \bigl\{ t \geq0 : Y^K(t) \notin\mathcal {S}_K \bigr\}. \end{aligned}$$
(4.8)

Then we have the following.

  1. (A)

    A favorable mutant with a non-negligible size does not die out in a time of order \((Ku_{K})^{-1}\): Suppose that Fit(y,x)>0 and \(z^{K}_{2} > K \epsilon\) for all \(K \in \mathbb{N}^{*}\). There exists an \(\epsilon_{0}>0\) such that if \(\epsilon<\epsilon_{0}\) then

    $$\begin{aligned} \lim _{K \to\infty} \mathbb{P}\biggl(\exists t \in \biggl[ 0 , \frac{T}{K u_K} \biggr],\ Y^K_t\{y\} < \frac{\epsilon}{2} \biggr) = 0. \end{aligned}$$
    (4.9)
  2. (B)

    An unfavorable mutant dies out within time \(t_{K}\): Let Fit(y,x)<0 and \(z^{K}_{2} < K \epsilon\) for all \(K \in \mathbb{N}^{*}\). There exists an \(\epsilon_{0}>0\) such that if \(\epsilon<\epsilon_{0}\) then

    $$\begin{aligned} \lim _{K \to\infty} \mathbb{P}\bigl( T_{\mathcal{S}_K} \leq t_K , Y^K_{ T_{\mathcal{S}_K}}\{y\} = 0 \bigr) = 1. \end{aligned}$$
    (4.10)
  3. (C)

    A favorable mutant either dies out or invades the population within time \(t_{K}\); the probability of invasion is the fitness of the mutant with respect to the resident trait divided by its birth rate: let Fit(y,x)>0 and \(z^{K}_{2} = 1\) for all \(K \in \mathbb{N}^{*}\). Then there exist positive constants \(c,\epsilon_{0}\) such that for all \(\epsilon<\epsilon_{0}\) we have

    $$\begin{aligned} & \limsup _{K \to\infty} \biggl \vert \mathbb{P}\bigl( T_{\mathcal{S}_K} \leq t_K , Y^K_{ T_{\mathcal{S}_K} }\{y\} \geq2 \epsilon \bigr)- \frac{ \mathrm{Fit}(y,x) }{b(y)} \biggr \vert \leq c \epsilon, \end{aligned}$$
    (4.11)
    $$\begin{aligned} \textit{and}\quad & \limsup_{K \to\infty} \biggl \vert \mathbb{P}\bigl( T_{\mathcal{S}_K} \leq t_K , Y^K_{ T_{\mathcal{S}_K} }\{y\} = 0 \bigr) - \biggl( 1- \frac{\mathrm{Fit}(y,x)}{b(y)} \biggr) \biggr \vert \leq c \epsilon. \end{aligned}$$
    (4.12)

Using Propositions 4.6 and 4.7, we can determine the state of the process in a large time window \([\epsilon/(Ku_{K}), \epsilon^{-1}/(Ku_{K})]\) from the initial condition. This allows us to understand what happens to the population if we neglect the transitions due to rare mutation events.

Corollary 4.8

Suppose that Assumptions 1.1 and 2.3 hold. For each \(K \in \mathbb{N}^{*}\) let {Y K(t):t≥0} be a process with generator \(L^{K}_{0}\).

  1. (A)

    Suppose that for some \(x \in \mathbb{X}\) and ϵ>0 we have supp(Y K(0))={x} and \(Y^{K}_{0}\{x\}>\epsilon\) for all \(K \in \mathbb{N}^{*}\). Then

    $$ \lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal {N}_{\epsilon}(x) \ \textit{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) = 1. $$
    (4.13)
  2. (B)

    Suppose that for some \(x,y \in \mathbb{X}\) and ϵ>0, we have Fit(y,x)<0, supp(Y K(0))={x,y}, \(Y^{K}_{0}\{x\}\in [ \widehat{n}_{x} -\epsilon, \widehat {n}_{x}+\epsilon ]\) and \(Y^{K}_{0}\{y\}<\epsilon\) for all \(K \in \mathbb{N}^{*}\). Then for a sufficiently small ϵ,

    $$ \lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal {N}_{\epsilon}(x) \ \textit{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) = 1. $$
  3. (C)

    Suppose that for some \(x,y \in \mathbb{X}\) and ϵ>0 we have Fit(y,x)>0, supp(Y K(0))={x,y}, \(Y^{K}_{0}\{x\} \in [ \widehat{n}_{x} -\epsilon, \widehat {n}_{x}+\epsilon ]\) and \(Y^{K}_{0}\{y\} = 1/K\) for all \(K \in \mathbb{N}^{*}\). Then

    $$\begin{aligned} & \lim_{\epsilon\to0} \lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \textit{for all } t \in \biggl[\frac {\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) \\ &\quad = 1- \lim_{\epsilon\to0} \lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(y) \ \textit{for all } t \in \biggl[\frac {\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) \\ &\quad = 1 - \frac{ \mathrm{Fit}(y,x)}{b(y)}. \end{aligned}$$

Proof

We first prove part (A). Let \(Y^{K}\) be the process with generator \(L^{K}_{0}\) such that \(\mathrm{supp}(Y^{K}(0))=\{x\}\) and \(Y^{K}_{0}\{x\}>\epsilon\). Part (A) of Proposition 4.6 implies that for some δ>0

$$\begin{aligned} \lim_{K \to\infty} \mathbb{P}\biggl( Y^K_t\{x\} < \delta \text{ for some } t \in \biggl[ 0 ,\frac{\epsilon^{-1}}{K u_K} \biggr] \biggr) = 0. \end{aligned}$$

From Corollary 4.4 we know that for any t≥0 and \(F_{f}\in\mathbb{F}^{2}_{b}\),

$$\int_{0}^{t} F_f \biggl( Y^K \biggl( \frac{s}{K u_K} \biggr) \biggr) ds \Rightarrow t\, F_f ( \widehat{n}_x \delta_{x} ) $$

as K→∞. Hence if we define \(\sigma^{K}_{\epsilon} = \inf\{ t \geq0: Y^{K}(t) \in \mathcal{N}_{\epsilon/2}(x)\}\), then \(K u_{K}\sigma^{K}_{\epsilon} \rightarrow0\) in probability as K→∞. Now let the process \(\{\widetilde{Y}^{K}(t): t \geq0\}\) be given by \(\widetilde{Y}^{K}(t) = Y^{K}(t+\sigma^{K}_{\epsilon})\). By the strong Markov property, this process also has generator \(L^{K}_{0}\). Moreover, its initial state lies in \(\mathcal{N}_{\epsilon/2}(x)\). Using part (B) of Proposition 4.6 proves part (A).

For part (B), consider the set \(\mathcal{S}_{K}=( \widehat{n}_{x} - 2\epsilon, \widehat{n}_{x} + 2\epsilon) \times(0,2\epsilon)\) and let \(T_{\mathcal{S}_{K}}\) be given by (4.8). Let the sequence \(\{t_{K}\}\) be as in Proposition 4.7 and consider the event \(E^{K}(\epsilon)=\{ T_{\mathcal{S}_{K}} \leq t_{K} , Y^{K}_{T_{\mathcal{S}_{K}} } \{y\} = 0\}\). Since Fit(y,x)<0, part (B) of Proposition 4.7 shows that, for sufficiently small ϵ>0, the probability of the event \(E^{K}(\epsilon)\) approaches 1 as K→∞. Note that \(Ku_{K}t_{K}\to0\) as K→∞. The proof of part (B) then follows from part (A) of this corollary together with the strong Markov property at time \(T_{\mathcal{S}_{K}}\).

For part (C), fix an ϵ>0 and define {Y K(t):t≥0} with the initial condition specified in the statement. Let \(\mathcal {S}_{K}\) and \(T_{\mathcal{S}_{K}}\) be as in the proof of part (B). We can write

$$\begin{aligned} &\mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \text{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) \\ &\quad = \sum_{i=1}^3 \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \text{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac {\epsilon^{-1}}{Ku_K} \biggr] ; E^K_i( \epsilon) \biggr), \end{aligned}$$
(4.14)

where \(E^{K}_{1}(\epsilon)= \{ T_{\mathcal{S}_{K}} \leq t_{K} , Y^{K}_{T_{\mathcal{S}_{K}}}\{y\} = 0 \}\), \(E^{K}_{2}(\epsilon)= \{ T_{\mathcal{S}_{K}} \leq t_{K} , Y^{K}_{T_{\mathcal{S}_{K}}}\{y\} \geq 2\epsilon \}\) and \(E^{K}_{3}(\epsilon)= ( E^{K}_{1}(\epsilon )\cup E^{K}_{2}(\epsilon) )^{c}\).

Let us consider the term in (4.14) corresponding to i=1. On the event \(E_{1}^{K}(\epsilon)\), we have \(Y^{K}_{T_{\mathcal{S}_{K}}}\{x\}\in (\widehat{n}_{x}-2\epsilon,\widehat{n}_{x}+2\epsilon)\) and \(Y^{K}_{T_{\mathcal{S}_{K}}}\{y\} = 0\). The strong Markov property at time \(T_{\mathcal{S}_{K}}\), together with part (A) of this corollary and part (C) of Proposition 4.7, implies that this term converges to 1−Fit(y,x)/b(y) as K→∞ and ϵ→0.

The term corresponding to i=2 in (4.14) can be written as

$$\begin{aligned} \mathbb{E}\biggl[ \mathbf{1}_{E^K_2(\epsilon)}\, \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \text{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \ \bigg\vert\ \mathcal{F}_{T_{\mathcal{S}_K}} \biggr) \biggr]. \end{aligned}$$

(4.15)

On the event \(E_{2}^{K}(\epsilon)\), \(Y^{K}_{T_{\mathcal{S}_{K}}}\{y\}\geq 2\epsilon\) and \(Y^{K}_{T_{\mathcal{S}_{K}}}\{x\}\in( \widehat{n}_{x} - 2\epsilon, \widehat{n}_{x} + 2\epsilon)\). From part (A) of Proposition 4.7, the probability of the process \(\{ Y^{K}_{t}\{y\} : t \geq0\}\) going below ϵ between times \(T_{\mathcal{S}_{K}}\) and \(T_{\mathcal{S}_{K}}+\epsilon^{-1}/K u_{K}\) tends to 0 as K→∞. Hence the probability of the event \(\{ \exists t \in[T_{\mathcal{S}_{K}}, T_{\mathcal {S}_{K}}+\epsilon^{-1}/K u_{K}] : y \notin\mathrm{supp}(Y^{K}_{t}) \}\) also tends to 0 as K→∞. Note that if \(y \in\mathrm{supp}(Y^{K}_{t}) \) then \(Y^{K}(t) \notin \mathcal{N}_{\epsilon}(x) \). Conditioning by \(\mathcal{F}_{T_{\mathcal{S}_{K}}}\) and using the strong Markov property shows that the term corresponding to i=2 in (4.14) converges to 0 as K→∞ and ϵ→0.

Part (C) of Proposition 4.7 implies that

$$\begin{aligned} \lim_{\epsilon\to0} \lim_{K \to\infty} \mathbb{P}\bigl( E^K_1(\epsilon )\cup E^K_2( \epsilon) \bigr) = 1. \end{aligned}$$

Hence

$$\lim_{\epsilon\to0 }\lim_{K \to\infty} \mathbb{P}\bigl( E^K_3(\epsilon ) \bigr) = 0 $$

which shows that the term corresponding to i=3 in (4.14) converges to 0.

Gathering the results for i∈{1,2,3}, we get

$$\lim_{\epsilon\to0 }\lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \text{for all } t \in \biggl[\frac {\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) = 1- \frac{\mathrm{Fit}(y,x)}{b(y)}. $$

The proof of

$$\lim_{\epsilon\to0 }\lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(y) \ \text{for all } t \in \biggl[\frac {\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) = \frac{\mathrm{Fit}(y,x)}{b(y)} $$

is similar. This completes the proof of part (C) of the corollary. □

4.2 Proof of Theorem 2.4: Convergence to the TSS

We now have the tools to prove Theorem 2.4. By Theorem 3.1, the distributions of \(\{(\chi ^{K},\varGamma^{K}) : K \in \mathbb{N}^{*}\}\) are tight. Let (χ,Γ) be a limiting value satisfying (2.8) and (2.10). If we can prove that (2.12) holds for the process {χ′(t):t≥0} (introduced in the statement of Theorem 2.4), then the distribution of Γ is uniquely determined, which in turn uniquely determines the distribution of χ due to the martingale problem given by (2.8). Hence we only have to prove part (B) of Theorem 2.4.

Since (χ,Γ) is a limiting value, by Prohorov’s theorem there exists a subsequence \(\{(\widetilde {\chi}^{K},\widetilde {\varGamma}^{K})\}\) that converges in distribution to (χ,Γ) as K→∞. The Skorokhod representation theorem (see e.g. [1]) says that, on the same probability space as (χ,Γ), we can construct a sequence, which by an abuse of notation we again denote by \(\{(\chi^{K},\varGamma^{K})\}\), such that \((\chi^{K},\varGamma^{K})\) converges to (χ,Γ) a.s. and \((\chi^{K},\varGamma^{K})\) has the same marginal distributions as \(\{ (\widetilde {\chi}^{K},\widetilde {\varGamma}^{K})\}\).

Assuming \((\chi^{K},\varGamma^{K})\to(\chi,\varGamma)\) a.s. as K→∞, we now identify (χ,Γ). The main idea is that between successive appearances of new mutants, our process \(\{X^{K}(t):t\geq0\}\) behaves like the process considered in Corollary 4.8. When a fit mutant appears, either it goes extinct quickly or the process stabilizes around the new monomorphic equilibrium characterized by the mutant trait. Between two rare mutations, the trait and size of the population can be inferred from the occupation measure, because the population is monomorphic and its size is shown to reach an equilibrium.

Throughout this section, whenever we say “as K→∞ and ϵ→0” we mean that the limit K→∞ is taken first and the limit ϵ→0 next. For \(K, i \in \mathbb{N}^{*}\), let \(\tau^{K}_{i}\) and \(\tau_{i}\) be the i-th jump times of the processes \(\chi^{K}\) and χ respectively. For convenience we set \(\tau^{K}_{0}= \tau_{0} =0\). Since \((\chi^{K},\varGamma^{K})\to(\chi,\varGamma)\) a.s., for any \(m \in \mathbb{N}^{*}\) we have \((\tau^{K}_{1},\dots,\tau ^{K}_{m} ) \rightarrow (\tau_{1} , \dots, \tau_{m} )\) a.s. Using (2.8) and Lemma 2.1, we know that \(\tau_{i} - \tau_{i-1}>0\) almost surely for each \(i \in \mathbb{N}^{*}\). Thus

$$\begin{aligned} \lim_{\epsilon\rightarrow0}\lim _{K \to\infty} \mathbb{P}\bigl( \tau ^K_{i} - \tau^K_{i-1} > \epsilon \bigr) = 1. \end{aligned}$$
(4.16)

Pick an arbitrary \(x_{\text{arb}} \in \mathbb{X}\). For any \(i \in \mathbb{N}^{*}\) and ϵ>0 define

$$\begin{aligned} R^{K,\epsilon}_i = \begin{cases} \dfrac{\int_{\tau^K_{i-1}+\epsilon}^{\tau^K_i}\int_{\mathcal{M}_F(\mathbb{X})}\langle\mu,x\rangle \varGamma^K(dt\times d\mu)}{\int_{\tau^K_{i-1}+\epsilon}^{\tau^K_i}\int_{\mathcal{M}_F(\mathbb{X})}\langle\mu,1\rangle \varGamma^K(dt\times d\mu)} &\text{if } \tau^K_i>\tau^K_{i-1}+\epsilon,\\ x_{\text{arb}} &\text{otherwise.} \end{cases} \end{aligned}$$

(4.17)

Note that if \(\mathrm{supp}(Z^{K}(s))=\{x\}\) for all \(s \in[ \tau^{K}_{i-1} + \epsilon, \tau^{K}_{i} )\), then we have \(R^{K,\epsilon}_{i} = x\). Heuristically, \(R^{K,\epsilon}_{i}\) is an estimator of the trait that fixates between the (i−1)-th and the i-th mutation. Define an event

$$\begin{aligned} E^{K}_i(\epsilon) = & \bigl\{ \tau^K_i \geq\tau^K_{i-1} + \epsilon \mbox{ and } Z^K(t) \in\mathcal{N}_{\epsilon}\bigl(R^{K,\epsilon}_i \bigr) \ \text{for all } t \in\bigl[ \tau^K_{i-1} + \epsilon, \tau^K_i \bigr) \bigr\}, \end{aligned}$$

where for any \(x \in \mathbb{X}\), \(\mathcal{N}_{\epsilon}(x)\) is defined by (4.5). The next proposition shows how \(R^{K,\epsilon}_{i}\) and \(E^{K}_{i}(\epsilon) \) behave as K→∞ and ϵ→0.

Proposition 4.9

Suppose that Assumptions 1.1, 2.2 and 2.3 hold. Then for any \(i\in \mathbb{N}^{*}\)

$$\begin{aligned} \lim_{\epsilon\to0 } \lim_{K \to\infty} \mathbb{P}\bigl( E^K_i(\epsilon ) \bigr) = 1. \end{aligned}$$
(4.18)

Furthermore for any measurable set \(A \subset \mathbb{X}\), \(x \in \mathbb{X}\) and \(i \in \mathbb{N}^{*}\) we have

$$\begin{aligned} \lim_{\epsilon\to0}\lim_{K\to\infty} \mathbb{P}\bigl( R^{K,\epsilon}_{i+1}\in A \big\vert R^{K,\epsilon}_{i}=x \bigr) = \int_{\mathbb{X}} \biggl[ \mathbf{1}_{A}(y)\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} + \mathbf{1}_{A}(x) \biggl(1-\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} \biggr) \biggr] m(x,dy). \end{aligned}$$

(4.19)

Proof

For each \(i \in \mathbb{N}^{*}\), we can construct a process \(\{ Y^{K}_{i}(t) : t \geq0 \}\) with generator \(L^{K}_{0}\) such that

$$\begin{aligned} Z^K \bigl(\tau^K_{i-1} + t\bigr) = Y^K_i \biggl( \frac{t}{K u_K} \biggr) \ \text{for all } t \in\bigl[0, \tau^K_{i} - \tau^K_{i-1}\bigr). \end{aligned}$$
(4.20)

For i=1, we obtain from part (B) of Assumption 2.2 and part (A) of Corollary 4.8 that

$$\begin{aligned} \lim_{K \to\infty} \mathbb{P}\biggl( Y^K_1(t) \in\mathcal{N}_{\epsilon }(x_0) \ \text{for all } t \in \biggl[ \frac{\epsilon}{K u_K} , \frac{\epsilon^{-1}}{K u_K} \biggr] \biggr) = 1. \end{aligned}$$
(4.21)

This limit along with (4.16) and (4.20) imply that \(R^{K,\epsilon}_{1} \to x_{0}\) with probability converging to 1 as K→∞ and ϵ→0. Moreover (4.18) holds for i=1 thanks to (4.20) and (4.21).

For any \(i \in \mathbb{N}^{*}\), let \(U^{K}_{i}\) denote the type of the new mutant that appears at time \(\tau^{K}_{i}\). Pick \(x,y \in \mathbb{X}\). On the event \(\{ E^{K}_{i}(\epsilon), R^{K,\epsilon}_{i} = x,U^{K}_{i} = y \}\), we have \(\mathrm{supp}(Z^{K} ( \tau^{K}_{i} )) = \{x,y\}\), \(Z^{K}_{\tau^{K}_{i}}\{ y\} = 1/K\) and \(Z^{K}_{\tau^{K}_{i}}\{x\} \in[ \widehat{n}_{x} -\epsilon, \widehat{n}_{x}+\epsilon]\). Using parts (B) and (C) of Corollary 4.8 and (4.20), we obtain that

$$\begin{aligned} &\lim_{\epsilon\to0}\lim_{K \to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon),R^{K,\epsilon}_{i+1} = x \big\vert E^K_i(\epsilon), R^{K,\epsilon}_i = x,U^K_i = y \bigr) = \biggl(1- \frac{ [\mathrm{Fit}(y,x)]^+}{b(y)} \biggr) \\ \text{and}\quad &\lim_{\epsilon\to0}\lim_{K \to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon),R^{K,\epsilon}_{i+1} = y \big\vert E^K_i(\epsilon), R^{K,\epsilon}_i= x,U^K_i = y \bigr) = \frac{ [\mathrm{Fit}(y,x)]^+}{b(y)}. \end{aligned}$$

Since the distribution of \(U^{K}_{i}\) conditional on \(\{ E^{K}_{i}(\epsilon ), R^{K,\epsilon}_{i} = x\}\) is m(x,dy), for any measurable set \(A \subset \mathbb{X}\) we get

$$\begin{aligned} \lim_{\epsilon\to0}\lim_{K\to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon), R^{K,\epsilon}_{i+1}\in A \big\vert E^K_i(\epsilon), R^{K,\epsilon}_{i}=x \bigr) = \int_{\mathbb{X}} \biggl[ \mathbf{1}_{A}(y)\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} + \mathbf{1}_{A}(x) \biggl(1-\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} \biggr) \biggr] m(x,dy). \end{aligned}$$

(4.22)

Taking \(A = \mathbb{X}\) in (4.22) we obtain

$$\begin{aligned} \lim_{\epsilon\to0}\lim _{K \to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon) \big\vert E^K_i(\epsilon) \bigr) = 1. \end{aligned}$$
(4.23)

This relation shows that if (4.18) holds for some \(i \in \mathbb{N}^{*}\), then it also holds for (i+1). Since we have already shown that (4.18) holds for i=1, by induction we can conclude that (4.18) holds for all \(i \in \mathbb{N}^{*}\). From (4.22) and (4.23) we can deduce that for any measurable set \(A \subset \mathbb{X}\), \(x \in \mathbb{X}\) and \(i \in \mathbb{N}^{*}\)

$$\begin{aligned} \lim_{\epsilon\to0 } \lim_{K \to\infty} \mathbb{P}\bigl( R^{K,\epsilon }_{i+1} \in A \big\vert R^{K,\epsilon}_{i} = x \bigr) = \lim_{\epsilon\to0}\lim_{K \to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon), R^{K,\epsilon}_{i+1} \in A \big\vert E^K_i(\epsilon), R^{K,\epsilon}_i = x \bigr). \end{aligned}$$

Hence (4.19) holds due to (4.22). This completes the proof of the proposition. □

We now complete the proof of part (B) of Theorem 2.4 by using Proposition 4.9.

Proof of Theorem 2.4(B)

For each \(i \in \mathbb{N}^{*}\) define

$$\begin{aligned} R_i= \frac{\int_{\tau_{i-1}}^{\tau_i } \int_{\mathcal{M}_F(\mathbb{X})}\langle\mu, x \rangle\varGamma(dt \times d \mu)}{\int_{\tau _{i-1}}^{\tau_i} \int_{\mathcal{M}_F(\mathbb{X})}\langle\mu, 1\rangle \varGamma(dt \times d \mu)}. \end{aligned}$$

Since \((\chi^{K},\varGamma^{K})\to(\chi,\varGamma)\) a.s., for any positive integer m we must have \((R^{K,\epsilon}_{1} ,\dots, R^{K,\epsilon}_{m} ) \to( R_{1},\dots ,R_{m})\) a.s. as K→∞ and ϵ→0. Proposition 4.9 implies that \(\{R_{i} : i \in \mathbb{N}^{*}\}\) is a Markov chain on \(\mathbb{X}\) with \(R_{1}=x_{0}\) and transition probabilities given by

$$\begin{aligned} \mathbb{P}\bigl( R_{i+1}\in A \big\vert R_{i}=x \bigr) = \int_{\mathbb{X}} \biggl[ \mathbf{1}_{A}(y)\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} + \mathbf{1}_{A}(x) \biggl(1-\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} \biggr) \biggr] m(x,dy) \end{aligned}$$

(4.24)

for any measurable set \(A \subset \mathbb{X}\), \(x \in \mathbb{X}\) and \(i \in \mathbb{N}^{*}\).

Define two \(\mathbb{X}\)-valued processes as

The almost sure convergence of (χ K,Γ K)→(χ,Γ) implies that χK,ϵ converges almost surely to χ′ as K→∞ and ϵ→0 in the Skorokhod space \(\mathbb{D}([0,T],\mathbb{X})\) for any T>0.

We now show that the process {χ′(t):t≥0} uniquely characterizes the distribution of the limiting occupation measure Γ. For any \(F_{f} \in\mathbb{F}^{2}_{b}, i \in \mathbb{N}^{*}\) and t≥0, define the real-valued variables

$$\begin{aligned} \begin{aligned} &\rho^{K,\epsilon}_i = \int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} \int_{\mathcal{M}_F(\mathbb{X})}F_f(\mu) \varGamma ^K(ds \times d \mu)\quad \text{ and }\\ &\rho_i = \int_{\tau_{i-1} \wedge t}^{\tau_i \wedge t} \int_{\mathcal{M}_F(\mathbb{X})}F_f(\mu) \varGamma(ds \times d \mu ).\end{aligned} \end{aligned}$$
(4.25)

Then certainly \(\rho^{K,\epsilon}_{i} \to\rho_{i}\) a.s. as K→∞ and ϵ→0. Suppose F f has the form F f (μ)=F(〈μ,f〉) for some \(f \in C_{b}(\mathbb{X},\mathbb{R})\) and \(F \in C^{2}_{b}(\mathbb{R},\mathbb{R})\). Then for any \(\mu\in\mathcal{N}_{\epsilon}(x)\)

$$\begin{aligned} \bigl| F_f( \mu) - F_f( \widehat {n}_x \delta_x )\bigr| = \bigl| F\bigl( \langle\mu,f \rangle\bigr) - F\bigl( \widehat {n}_x f(x)\bigr) \bigr| \leq\bigl\|F'\bigr\|_{\infty} \|f \|_\infty \epsilon. \end{aligned}$$

Therefore, on the event \(E^{K}_{i}(\epsilon)\),

$$\begin{aligned} \biggl \vert \rho^{K,\epsilon}_i - \int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} F \bigl( \widehat{n}_{R^{K,\epsilon}_i} f\bigl(R^{K,\epsilon}_i\bigr) \bigr) ds\biggr \vert & = \biggl \vert \int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} \bigl( F_f \bigl( Z^K_s \bigr) - F \bigl( \widehat {n}_{R^{K,\epsilon}_i} f\bigl(R^{K,\epsilon}_i\bigr) \bigr) \bigr) ds \biggr \vert \\ & \leq \int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} \bigl \vert F_f \bigl( Z^K_s \bigr) - F \bigl( \widehat{n}_{R^{K,\epsilon }_i} f\bigl(R^{K,\epsilon}_i\bigr) \bigr) \bigr \vert ds \\ & \leq\bigl\|F'\bigr\|_{\infty} \|f\|_\infty \epsilon\bigl( \tau^K_i \wedge t - \bigl( \tau^K_{i-1} + \epsilon\bigr) \wedge t \bigr). \end{aligned}$$

Since

$$\int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} F \bigl( \widehat{n}_{R^{K,\epsilon}_i} f\bigl(R^{K,\epsilon}_i\bigr) \bigr) ds = \bigl( \tau^K_{i} \wedge t - \bigl(\tau^K_{i-1} + \epsilon\bigr) \wedge t \bigr) F \bigl( \widehat{n}_{R^{K,\epsilon}_i} f \bigl(R^{K,\epsilon}_i\bigr) \bigr) $$

converges to \(( \tau_{i} \wedge t - \tau_{i-1} \wedge t ) F_{f} ( \widehat{n}_{R_{i}} \delta_{R_{i}} )\) as K→∞ and ϵ→0, we have that

$$\begin{aligned} \rho_i = ( \tau_{i} \wedge t - \tau_{i-1} \wedge t ) F_f ( \widehat{n}_{R_i} \delta_{R_i} ) = \int_{\tau _{i-1} \wedge t}^{\tau_i \wedge t} F_f ( \widehat{n}_{\chi '(s)} \delta_{\chi'(s)} )ds. \end{aligned}$$

This is true for each \(i \in \mathbb{N}^{*}\) and this ensures that (2.12) holds with χ′ defined as above.

Recall that the process {χ(t):t≥0} satisfies the martingale problem given by (2.8). Using (2.12) we obtain that for any \(F_{f} \in\mathbb{F}^{2}_{b}\)

$$\begin{aligned} &F_f\bigl(\chi(t)\bigr) - F_f\bigl(\chi(0) \bigr) -\int_{0}^{t} b\bigl(\chi'(s) \bigr)p\bigl(\chi'(s)\bigr) \widehat{n}_{\chi'(s)}\\ &\quad \times \biggl( \int _{\mathbb{X}} \bigl( F_f\bigl(\chi(s)+ \delta_{y}\bigr)-F_f\bigl(\chi(s)\bigr) \bigr) m\bigl( \chi'(s),dy\bigr) \biggr) ds, \end{aligned}$$

is a martingale. This shows that if χ′(t)=x, then the next jump time of the process χ is exponentially distributed with parameter \(b(x)p(x) \widehat{n}_{x}\). Therefore for each \(i \in \mathbb{N}^{*}\), (τ i τ i−1) is exponentially distributed with parameter \(b(R_{i})p(R_{i}) \widehat{n}_{R_{i}}\). Since \(\{R_{i} : i \in \mathbb{N}^{*}\}\) is a Markov chain with R 1=x 0 and transition probabilities given by (4.19), we can conclude that {χ′(t):t≥0} is a Markov process with generator \(\mathbb{C}\) (given by (2.13)) and initial state x 0. This completes the proof of part (B) of Theorem 2.4. □
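To make the limiting object concrete, the following sketch simulates a TSS of this form directly: waiting times are exponential with parameter \(b(x)p(x)\widehat{n}_{x}\), and a mutant y drawn from m(x,·) replaces x with probability \([\mathrm{Fit}(y,x)]^{+}/b(y)\). All model ingredients (b, d, α, p and the Gaussian mutation kernel on \(\mathbb{X}=[0,1]\)) are hypothetical choices for illustration, not taken from the paper.

```python
import random

# Hypothetical model ingredients on X = [0,1] (assumed, for illustration)
b = lambda x: 2.0 + 2.0 * x              # birth rate
d = lambda x: 1.0                        # death rate
alpha = lambda x, y: 1.0 + (x - y) ** 2  # competition kernel
p = lambda x: 0.5                        # mutation probability
n_hat = lambda x: (b(x) - d(x)) / alpha(x, x)            # monomorphic equilibrium
fit = lambda y, x: b(y) - d(y) - alpha(y, x) * n_hat(x)  # invasion fitness

def tss(x0, horizon, rng):
    """Simulate the trait substitution sequence up to time `horizon`."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        t += rng.expovariate(b(x) * p(x) * n_hat(x))     # next mutation time
        if t > horizon:
            return path
        y = min(1.0, max(0.0, x + rng.gauss(0.0, 0.1)))  # mutant from m(x,.)
        if rng.random() < max(fit(y, x), 0.0) / b(y):    # fixation probability
            x = y
            path.append((t, x))

path = tss(0.2, 200.0, random.Random(1))
traits = [v for _, v in path]
print(len(traits), round(traits[-1], 3))
```

With these assumed rates, \(\mathrm{Fit}(y,x)=2(y-x)-(y-x)^{2}(1+2x)\) is positive only for y>x, so every simulated substitution increases the trait and the TSS drifts monotonically toward the boundary of the trait space.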