1 Introduction: The Logistic Birth and Death Model

We consider a stochastic individual-based model (IBM) with trait structure, evolving through births and deaths, that was introduced by Dieckmann and Law [9] and Metz et al. [23] and studied in rigorous detail by Champagnat [2]. We study its limit on an evolutionary time scale, when the population is large and mutations are rare.

Champagnat [2] established the first rigorous proof of the convergence of a sequence of such IBMs to the trait substitution sequence process (TSS) introduced by Metz et al. [22] (with Metz et al. [23] as a follow-up). Following Dieckmann and Law [9], the TSS can be explained as follows. In the limit, the time scales of ecology and evolution are separated. Mutations are rare, and before a mutant arises, the resident population stabilizes around an equilibrium. Under the “invasion implies substitution” Ansatz, there cannot be long-term coexistence of two different traits. Evolution thus proceeds as a succession of monomorphic population equilibria. The fine structure of the transitions disappears on the time scale under consideration, and when a mutation occurs, invades, and fixates in the population by completely replacing the resident trait, the TSS jumps from one state to another. Champagnat’s proof combines several approximations of the microscopic process by means of fine estimates, including large deviation results. We separate the different time scales that are involved using averaging techniques due to Kurtz [17], and thus propose a new simplified proof of Champagnat’s results that avoids many technicalities linked with fine approximations of the IBM. The aim of this paper is to exemplify the use of such averaging techniques in adaptive dynamics, which we hope will pave the way for generalizations of the TSS.

We consider a structured population where each individual is characterized by a trait \(x\in \mathbb{X}\), where \(\mathbb{X}\) is a compact subset of \(\mathbb{R}^{d}\). We are interested in large populations. We assume that the population’s initial size is proportional to a parameter \(K\in \mathbb{N}^{*}=\{1,2,\dots\} \), to be interpreted as the area to which the population is confined. We will let K go to infinity while keeping the density constant by counting the individuals with weight 1/K. The population is assumed to be well mixed and its density is assumed to be limited by a fixed availability of resources per unit area. The population at time t can be described by the following point measure on \(\mathbb{X}\)

$$\begin{aligned} X^{K}(t) = \frac{1}{K} \sum _{i=1}^{N^{K}(t)} \delta_{x^i_t}, \end{aligned}$$
(1.1)

where \(N^{K}(t)\) is the total number of individuals in the population at time t and where \(x^{i}_{t}\in \mathbb{X}\) denotes the trait of individual i living at time t, the individuals being numbered in lexicographical order.

The population evolves by births and deaths. An individual with trait \(x\in \mathbb{X}\) gives birth to new individuals at rate b(x), where b(x) is a positive continuous function on \(\mathbb{X}\). With probability \(u_{K} p(x)\in[0,1]\), the daughter is a mutant with trait y, where y is drawn from the mutation kernel m(x,dy) supported on \(\mathbb{X}\). Here \(u_{K}\in[0,1]\) is a parameter depending on K that scales the probability of mutation. With probability \(1-u_{K} p(x)\), the daughter is a clone of her mother and has the same trait x. In a population described by \(X\in \mathcal{M}_{F}(\mathbb{X})\), an individual with trait x dies at rate \(d(x)+\int_{\mathbb{X}}\alpha(x,y)X(dy)\), where the natural death rate d(x) and the competition kernel α(x,y) are positive continuous functions.
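Concretely, these dynamics can be simulated by a Gillespie-type algorithm. The sketch below is a minimal illustration with naming of our own choosing (`mutate` plays the role of a draw from m(x,dy)); it is not the SDE construction used later in the paper.

```python
import random

def simulate_ibm(K, b, d, alpha, p, u_K, mutate, x0, n0, t_max):
    """Gillespie-style sketch of the logistic birth-death IBM.

    `traits` holds one entry per living individual; competition is
    scaled by 1/K, as in the death rate d(x) + <X, alpha(x, .)>.
    """
    traits = [x0] * n0
    t = 0.0
    while traits and t < t_max:
        birth = [b(x) for x in traits]
        death = [d(x) + sum(alpha(x, y) for y in traits) / K for x in traits]
        total = sum(birth) + sum(death)
        t += random.expovariate(total)      # waiting time to the next event
        r = random.uniform(0.0, total)      # pick an event by its rate
        for i, x in enumerate(traits):
            if r < birth[i]:
                # mutant daughter with probability u_K * p(x), clone otherwise
                traits.append(mutate(x) if random.random() < u_K * p(x) else x)
                break
            r -= birth[i]
            if r < death[i]:
                del traits[i]
                break
            r -= death[i]
    return traits
```

With b≡2, d≡1 and α≡1, the rescaled population size ⟨X^K,1⟩ fluctuates around the equilibrium (b−d)/α=1, i.e. around K individuals.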

Assumption 1.1

We assume that the functions b, d and α satisfy the following hypotheses:

  1. (A)

    For all \(x\in \mathbb{X}\), b(x)−d(x)>0 and p(x)>0.

  2. (B)

    “Invasion implies substitution”: For all x and y in \(\mathbb{X}\), we either have:

    $$\begin{aligned} & \bigl(b(y)-d(y)\bigr)\alpha(x,x)-\bigl(b(x)-d(x)\bigr)\alpha(y,x)<0 \end{aligned}$$
    (1.2)
    $$\begin{aligned} \mbox{or}\quad& \left \{ \begin{array}{l} (b(y)-d(y))\alpha(x,x)-(b(x)-d(x))\alpha(y,x)>0 \\ (b(x)-d(x))\alpha(y,y)-(b(y)-d(y))\alpha(x,y)<0. \end{array} \right . \end{aligned}$$
    (1.3)
  3. (C)

    There exist \(\underline{\alpha}\) and \(\overline {\alpha}>0\) such that for every \(x,y\in \mathbb{X}\):

    $$ 0<\underline{\alpha}\leq\alpha(x,y)\leq \overline {\alpha}. $$
    (1.4)

Part (A) of Assumption 1.1 says that in the absence of competition, the population has a positive natural growth rate. Also the probability of a birth resulting in a mutation is positive. Part (B) corresponds to a condition known in adaptive dynamics as “invasion implies substitution”. It can be obtained from the analysis of the equilibria of the Lotka-Volterra system that results from the ordinary large number limit of the logistic competition process without mutation. The consequence of this condition is that when the mutant population manages to reach a sufficiently large size, it wipes out the resident population. Of course the other possibility is that the mutant population becomes extinct quickly. Hence, the population should be monomorphic away from the mutation events.
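For a concrete pair of traits, Assumption 1.1(B) can be checked by evaluating the two quantities appearing in (1.2)-(1.3); the helper below (our naming) returns True exactly when one of the two alternatives holds, so that long-term coexistence is excluded.

```python
def invasion_implies_substitution(x, y, b, d, alpha):
    """True if Assumption 1.1(B) holds for the pair (x, y): either y
    cannot invade x (1.2), or y invades x while x cannot invade y
    back (1.3)."""
    inv_y_in_x = (b(y) - d(y)) * alpha(x, x) - (b(x) - d(x)) * alpha(y, x)
    inv_x_in_y = (b(x) - d(x)) * alpha(y, y) - (b(y) - d(y)) * alpha(x, y)
    return inv_y_in_x < 0 or (inv_y_in_x > 0 and inv_x_in_y < 0)
```

Note that `inv_y_in_x` equals α(x,x) times the invasion fitness Fit(y,x) of (2.14). For instance, with b(x)=2−x², d≡1 and α≡1, the condition holds for every pair with |x|≠|y|.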

In this paper, we use the methods of Kurtz [18], based on martingale problems, for separating time scales. We show that on one hand the populations stabilize around their equilibria on the fast ecological scale, while on the other hand, rare mutations at the evolutionary time scale may induce switches from one trait to another. Our proof differs from the one in [2] in that we do not require comparisons with differential equations and large deviation results. Instead we use comparisons with branching processes to exhibit the stabilization of the population around the equilibria determined by the resident trait. Some of our arguments are similar in nature to those presented in [16] for a related model but with finite population sizes.

In Sect. 2, we describe the IBM introduced in [2]. The model accounts only for a trait structure and otherwise has very simple dynamics. More general trait spaces (possibly functional, as first introduced in [10, 24]) are considered in [14], where the “invasion implies substitution” assumption is also relaxed (see also [3, 13, 23]). There exist several other generalizations of the IBMs that underlie the TSS, including e.g. physiological structure [11, 19], diploidy [7] or multiple resources [6]. We consider the process that counts the new traits appearing due to mutations, and the occupation measure Γ K of the process X K under a changed time scale. The tightness of this pair of processes is studied in Sect. 3. The limiting values are shown to satisfy an equation that is considered in Sect. 4. This equation says that when a favorable mutant appears, the distribution describing the population jumps to the equilibrium characterized by the new trait, with a probability depending on the fitness of the mutant trait compared to the resident trait. From the consideration of monomorphic and dimorphic populations and by constructing couplings with branching processes, we prove the convergence in distribution of {Γ K} to the occupation measure Γ of a pure jump Markov process that is called the TSS.

Notation

Let E be a Polish space and let \(\mathcal{B}(E)\) be its Borel sigma field. We denote by \(\mathcal{M}_{F}(E)\) (resp. \(\mathcal{M}_{P}(E)\)) the set of nonnegative finite (resp. point) measures on E, endowed with the topology of weak convergence. If E is compact, this topology coincides with the topology of vague convergence (see e.g. [15]) and for any M>0, the set \(\{\mu\in\mathcal{M}_{F}(E) : \mu(E) \leq M\}\) is compact. For a measure μ, we denote its support by supp(μ). If f is a bounded measurable function on E and \(\mu\in\mathcal{M}_{F}(E)\), we use the notation \(\langle\mu,f\rangle=\int_{E} f(x)\mu(dx)\). With an abuse of notation, \(\langle\mu,x\rangle=\int_{E} x\,\mu(dx)\). Convergence in distribution of a sequence of random variables (or processes) is denoted by ‘⇒’. The minimum of any two numbers \(a,b \in \mathbb{R}\) is denoted by \(a\wedge b\). For any \(a\in \mathbb{R}\), its positive part is denoted by \([a]^{+}\) and ⌊a⌋ denotes the largest integer less than or equal to a. For any two \(\mathbb{N}^{*}\)-valued sequences \(\{a_{K} : K \in \mathbb{N}^{*}\}\) and \(\{b_{K} : K \in \mathbb{N}^{*}\}\) we say that \(a_{K} \ll b_{K}\) if \(a_{K}/b_{K}\to 0\) as K→∞.

Define a class of test functions on \(\mathcal{M_{F}}(\mathbb{X})\) by

$$\begin{aligned} \mathbb{F}^{2}_b = \bigl\{ F_f : F_f(\mu) = F \bigl(\langle\mu, f \rangle\bigr), f \in C_b(\mathbb{X},\mathbb{R}) \text{ and } F \in C^2_b(\mathbb{R}, \mathbb{R}) \text{ with compact support}\bigr\}. \end{aligned}$$

Here \(C_{b}(\mathbb{X},\mathbb{R})\) is the set of all continuous and bounded real functions on \(\mathbb{X}\) and \(C^{2}_{b}(\mathbb{R},\mathbb{R})\) is the set of bounded, twice continuously differentiable real-valued functions on \(\mathbb{R}\) with bounded first and second derivatives. This class \(\mathbb{F}^{2}_{b}\) is separable and it is known (see for example [8]) that it characterizes the convergence in distribution on \(\mathcal{M}_{F}(\mathbb{X})\).

The class of càdlàg processes from \(\mathbb{R}_{+}\) to E is denoted by \(\mathbb{D}(\mathbb{R}_{+},E)\).

The value at time t of a process X is denoted by X(t) or sometimes X t for notational convenience.

2 IBM in the Evolutionary Time-Scale

The process \(X^{K}\) is characterized by its generator \(L^{K}\), defined as follows. For any \(F_{f} \in\mathbb{F}^{2}_{b}\) let

$$\begin{aligned} L^K F_{f}(X) & =K \int_{\mathbb{X}} b(x) \biggl[ \int_{\mathbb{X}} \biggl( F_f \biggl( X+ \frac{1}{K} \delta_{y} \biggr) - F_f(X) \biggr)M^K(x,dy) \biggr] X(dx) \\ &\quad +K \int_{\mathbb{X}} \bigl(d(x)+\bigl\langle X,\alpha(x,.)\bigr \rangle \bigr) \biggl( F_f \biggl( X- \frac{1}{K} \delta_{x} \biggr) - F_f(X) \biggr) X(dx), \end{aligned}$$
(2.1)

where M K is the transition kernel given by

$$ M^K(x,dy)=u_K p(x) m(x,dy)+\bigl(1-u_K p(x) \bigr) \delta_{x}(dy). $$
(2.2)

Let \(K\in \mathbb{N}^{*}\) be fixed. The martingale problem for \(L^{K}\) has a unique solution for any initial condition \(X^{K}(0) \in \mathcal {M}_{F}(\mathbb{X})\), and the solution can be constructed from a stochastic differential equation (SDE) driven by Poisson point processes, which corresponds to the IBM used for simulations (see [4, 5]). The following estimate, proved in [2, Lemma 1], will be needed in the sequel.

Lemma 2.1

Suppose that \(\sup_{K \in \mathbb{N}^{*} } \mathbb{E}( \langle X^{K}(0),1 \rangle ^{2} ) < \infty\), then

$$\sup_{K\geq1,\ t\geq0}\mathbb{E}\bigl(\bigl\langle X^K(t),1\bigr \rangle^2 \bigr)<\infty. $$

In the sequel, we make some assumptions on the initial condition.

Assumption 2.2

Suppose that the sequence of \(\mathcal{M}_{F}(\mathbb{X})\)-valued random variables \(\{X^{K}(0) : K \in \mathbb{N}^{*}\}\) satisfies the following conditions.

  1. (A)

    There exists a \(x_{0} \in \mathbb{X}\) such that supp(X K(0))={x 0} for all \(K \in \mathbb{N}^{*}\).

  2. (B)

    \(\sup_{K\in \mathbb{N}^{*}}\mathbb{E}(\langle X^{K}(0),1\rangle ^{2} ) < \infty\).

  3. (C)

    X K(0)⇒X(0) as K→∞ and 〈X(0),1〉>0 a.s.

From (2.1), we can see that the dynamics has two time scales. The slow time scale, of order \(1/(Ku_{K})\), corresponds to the occurrence of new mutants, while the fast time scale, of order 1, corresponds to the birth-death dynamics. We consider rare mutations and so we require that \(Ku_{K}\to 0\) as K→∞. Moreover, as in [2], we will assume that the arrival of new mutants is slow enough to give each mutant enough time to invade the population if it can (a time of order \(\log K\)) and fast enough to ensure that a new mutant arrives while the resident population is still near its equilibrium (which it leaves only after a time of order \(e^{cK}\) for some c>0). We will therefore make the following assumption.

Assumption 2.3

For any c>0

$$\log K \ll\frac{1}{K u_K } \ll e^{cK}. $$
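As a sanity check, one hypothetical choice satisfying Assumption 2.3 is \(u_{K}=e^{-\sqrt{K}}/K\) (our example, not prescribed by the paper), for which \(Ku_{K}=e^{-\sqrt{K}}\to 0\) and \(1/(Ku_{K})=e^{\sqrt{K}}\). Comparing the three scales through their logarithms avoids numerical overflow:

```python
import math

def log_scales(K, c):
    """Logarithms of the three scales in Assumption 2.3, for the trial
    choice u_K = exp(-sqrt(K)) / K, so that 1/(K u_K) = exp(sqrt(K))."""
    return math.log(math.log(K)), math.sqrt(K), c * K

# the separation log K << 1/(K u_K) << e^{cK} holds once K > 1/c^2
for K in (10 ** 2, 10 ** 4, 10 ** 6):
    low, mid, high = log_scales(K, c=0.5)
    assert low < mid < high
```

For a fixed c>0 the right-hand inequality only kicks in once K>1/c²; since Assumption 2.3 is asymptotic in K, this is enough.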

Consider the process

$$\begin{aligned} Z^K(t) = X^K \biggl( \frac{t}{Ku_K} \biggr), \quad t \geq0. \end{aligned}$$
(2.3)

In what follows, we denote by \(\{ \mathcal{F}^{K}_{t} : t\geq0\}\) the canonical filtration associated with \(Z^{K}\). Due to the change in time, the generator \(\mathbb{L}^{K}\) of \(Z^{K}\) is the generator \(L^{K}\) of \(X^{K}\) multiplied by \(1/(Ku_{K})\). Hence for any \(F_{f} \in\mathbb{F}^{2}_{b}\)

$$\begin{aligned} \mathbb{L}^K F_f(Z)&= \int_{\mathbb{X}} p(x) b(x) \biggl[ \int_{\mathbb{X}} \biggl( F_f \biggl(Z+\frac{1}{K}\delta_{y} \biggr)-F_f(Z) \biggr) m(x,dy) \biggr]Z(dx) \\ &\quad + \frac{1}{K u_K} \biggl[ \int_{\mathbb{X}} b(x) \bigl(1-u_K p(x)\bigr) K \biggl( F_f \biggl( Z+ \frac{1}{K} \delta_{x} \biggr) - F_f(Z) \biggr) Z(dx) \\ &\quad + \int_{\mathbb{X}} \bigl(d(x)+\bigl\langle Z,\alpha(x,.) \bigr\rangle \bigr) K \biggl( F_f \biggl( Z- \frac{1}{K} \delta_{x} \biggr) - F_f(Z) \biggr) Z(dx) \biggr]. \end{aligned}$$
(2.4)

In the process \(Z^{K}\) we have compressed time so that the mutations occur at a rate of order 1. When we work at this time scale we can expect that between subsequent mutations, the fast birth-death dynamics will average out (see e.g. [18]). Our aim is to exploit this separation between the time scales of ecology (related to the births and deaths of individuals) and of evolution (linked to mutations).

To study the averaging phenomenon for the fast birth-death dynamics, we use the martingale techniques developed by Kurtz [18]. We introduce the occupation measure \(\varGamma^{K}\) defined for any t≥0 and for any set \(A \in\mathcal{B} ( \mathcal{M}_{F}(\mathbb{X}) )\) by

$$\begin{aligned} \varGamma^K \bigl( [0,t] \times A \bigr) = \int_0^t \mathbf{1}_{A} \bigl( Z^K(s) \bigr)\, ds. \end{aligned}$$
(2.5)

Kurtz’s techniques have been used in the context of measure-valued processes in [20, 21] for different population dynamic problems, but an additional difficulty arises here due to the presence of non-linearities at the fast time scale.

We introduce an \(\mathcal{M}_{P}(\mathbb{X})\)-valued process \(\{\chi^{K}(t):t\geq0\}\) which keeps track of the traits that have appeared in the population. That is, for each t≥0, \(\chi^{K}(t)\) is a counting measure on \(\mathbb{X}\) that records the traits that have appeared in the population up to time t. The process \(\chi^{K}\) is a pure-jump Markov process that satisfies the following martingale problem. For any \(F_{f} \in\mathbb{F}^{2}_{b}\)

$$\begin{aligned} M^{\chi,K}_t&:= F_f\bigl(\chi^K(t) \bigr)-F_f\bigl(\chi^K(0)\bigr) \end{aligned}$$
(2.6)
$$\begin{aligned} &\quad - \int_0^t \int_{\mathbb{X}} p(x) b(x) \int_{\mathbb{X}} \bigl(F_f\bigl(\chi ^K(s)+\delta_{y}\bigr)-F_f\bigl( \chi^K(s)\bigr) \bigr) m(x,dy)\, Z^K(s,dx)ds \\ &= F_f\bigl(\chi^K(t)\bigr)-F_f\bigl( \chi^K(0)\bigr) \\ &\quad - \int_0^t \int_{\mathcal{M}_F(\mathbb{X})} \biggl[ \int_{\mathbb{X}} p(x) b(x) \int_{\mathbb{X}} \bigl( F_f\bigl(\chi^K(s)+\delta_{y} \bigr) \\ &\quad -F_f\bigl(\chi^K(s)\bigr) \bigr) m(x,dy) \mu(dx) \biggr] \varGamma^K(ds \times d\mu), \end{aligned}$$
(2.7)

is a square integrable \(\{\mathcal{F}^{K}_{t}\}\)-martingale.

The main result of the paper proves the convergence of {(χ K,Γ K)} to a limit (χ,Γ), where the slow component χ is a jump Markov process and the fast component stabilizes in an equilibrium that depends on the value of the slow component.

Theorem 2.4

Suppose that Assumptions 1.1, 2.2 and 2.3 hold.

  1. (A)

    There exists a process χ with paths in \(\mathbb{D}(\mathbb{R}_{+}, \mathcal{M}_{P}(\mathbb{X}) )\) and a random measure \(\varGamma\in\mathcal{M}_{F} ([0,\infty)\times \mathcal{M}_{F}(\mathbb{X}) )\) such that (χ K,Γ K)⇒(χ,Γ) as K→∞, where (χ,Γ) satisfy the following. For all functions \(F_{f} \in\mathbb {F}^{2}_{b}\),

    $$\begin{aligned} & F_f\bigl(\chi(t)\bigr) - F_f\bigl(\chi(0)\bigr) \\ &\quad -\int_{0}^{t} \int_{\mathcal{M}_F(\mathbb{X})} \biggl[\int_\mathbb{X}b(x)p(x) \int_{\mathbb{X}} \bigl(F_f\bigl(\chi(s)+\delta_{y}\bigr) \\ &\quad -F_f \bigl(\chi(s)\bigr) \bigr) m(x,dy) \mu(dx) \biggr]\varGamma(ds \times d \mu) \end{aligned}$$
    (2.8)

    is a square integrable martingale with respect to the filtration

    $$\begin{aligned} \mathcal{F}_t = \sigma \bigl\{\chi(s), \varGamma \bigl([0,s] \times A \bigr) : s \in[0,t], A \in\mathcal{B} \bigl( \mathcal{M}_F(\mathbb{X}) \bigr) \bigr\} \end{aligned}$$
    (2.9)

    and for any t≥0

    $$\begin{aligned} \int_{0}^{t}\int _{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f (\mu )\varGamma(ds \times d \mu) = 0\quad \textit{for all }F_f \in\mathbb {F}^{2}_b \ \textit{a.s.}, \end{aligned}$$
    (2.10)

    where the nonlinear operator \(\mathbb{B}\) is defined by

    $$\begin{aligned} \mathbb{B}F_f(\mu) = F' \bigl( \langle\mu,f \rangle \bigr)\int_{ \mathbb{X}} \bigl( b(x) - \bigl( d(x)+\bigl\langle\mu, \alpha (x,.)\bigr\rangle \bigr) \bigr) f(x) \mu(dx). \end{aligned}$$
    (2.11)
  2. (B)

    Let Γ be as in part (A). Then for any t>0 and \(A \in\mathcal{B} ( \mathcal{M}_{F}(\mathbb{X}) )\) we have

    $$\begin{aligned} \varGamma \bigl( [0,t] \times A \bigr) = \int_0^t \mathbf{1}_{A} \bigl( \widehat{n}_{\chi'(s)} \delta_{\chi'(s)} \bigr)\, ds, \end{aligned}$$
    (2.12)

    where {χ′(t):t≥0} is a \(\mathbb{X}\)-valued Markov jump process with χ′(0)=x 0 and generator given by

    $$\begin{aligned} \mathbb{C} f(x) = b(x)p(x) \widehat{n}_{x} \int _{\mathbb{X}} \frac {[\mathrm{Fit}(y,x)]^{+}}{b(y)} \bigl( f(y) - f(x) \bigr) m(x,dy), \end{aligned}$$
    (2.13)

    for any \(f \in C_{b}(\mathbb{X},\mathbb{R})\). Here the population equilibrium \(\widehat {n}_{x}\) and the fitness function Fit(y,x) are defined as

    $$ \widehat{n}_x=\bigl(b(x)-d(x)\bigr)/\alpha(x,x)\quad\textit{and} \quad\mathrm {Fit}(y,x) = b(y) - d(y) - \alpha(y,x)\widehat{n}_x . $$
    (2.14)

Note that part (B) of the theorem characterizes the limiting occupation measure Γ. Once Γ is known, χ is characterized by the martingale problem (2.8). The next remark motivates the formula for \(\widehat{n}_{x}\) and the fitness function Fit(y,x).

Remark 2.5

Consider a Lotka-Volterra system that results from the large number limit of the logistic competition process without mutation. If the system has two traits x and y, then the population sizes \(n_{x}(t)\) and \(n_{y}(t)\) corresponding to these traits evolve according to the system of ordinary differential equations given by

$$\begin{aligned} \begin{aligned} \frac{dn_x}{dt}&= n_x(t) \bigl(b(x) - d(x) - \alpha(x,x)n_x(t) - \alpha(x,y) n_y(t) \bigr) \\ \frac{dn_y}{dt}&= n_y(t) \bigl(b(y) - d(y) - \alpha(y,x)n_x(t) - \alpha(y,y) n_y(t) \bigr), \end{aligned} \end{aligned}$$
(2.15)

where α is the competition kernel. Observe that \((\widehat {n}_{x},0)\) is a fixed point for this system. The fitness function Fit(y,x) describes the growth rate of a negligible mutant population with trait y in an environment characterized by \(\widehat {n}_{x}\). Furthermore, the fixed point \((\widehat{n}_{x},0)\) is asymptotically stable if and only if Fit(y,x)<0. However, the analysis of such a dynamical system (done in [2]) is not necessary for our purpose.
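The stability claim can also be observed numerically; the following forward Euler integration of (2.15) is a rough sketch, with step size and horizon chosen ad hoc.

```python
def lotka_volterra(n0, x, y, b, d, alpha, dt=1e-3, steps=200_000):
    """Forward Euler integration of the dimorphic system (2.15),
    started from n0 = (n_x(0), n_y(0))."""
    nx, ny = n0
    for _ in range(steps):
        # per-capita growth rates of the two subpopulations
        gx = b(x) - d(x) - alpha(x, x) * nx - alpha(x, y) * ny
        gy = b(y) - d(y) - alpha(y, x) * nx - alpha(y, y) * ny
        nx, ny = nx + dt * nx * gx, ny + dt * ny * gy
    return nx, ny
```

With b(z)=2−z², d≡1, α≡1, resident x=0.2 and mutant y=0.5, we get \(\widehat{n}_{x}=0.96\) and Fit(y,x)=−0.21<0: a small mutant population started near \((\widehat{n}_{x},0)\) dies out and the integration returns approximately (0.96, 0).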

3 Tightness of {(χ K,Γ K)}

To study the limit when K→∞, we proceed by a tightness-uniqueness argument. First, we show the tightness of the distributions of \(\{ ( \chi^{K},\varGamma^{K} ) : K \in \mathbb{N}^{*}\}\) and derive certain properties of the limiting distribution. The limiting values of {Γ K} satisfy an equation that characterizes the state of the population between two mutations, thanks to the “invasion implies substitution” assumption.

Theorem 3.1

Suppose that Assumption 1.1 is satisfied, \(\sup_{K\geq 1}\mathbb{E}(\langle X^{K}(0),1\rangle^{2})<\infty\) and \(Ku_{K}\to 0\) as K→∞. Then:

  1. (A)

    The distributions of \(\{ (\chi^{K},\varGamma^{K}) : K \in \mathbb{N}^{*}\}\) are tight in the space

    $$\mathcal{P} \bigl( \mathbb{D}\bigl(\mathbb{R}_+, \mathcal{M}_P(\mathbb{X}) \bigr) \times \mathcal{M}_F \bigl(\mathbb{R}_+\times \mathcal{M}_F(\mathbb{X}) \bigr) \bigr). $$
  2. (B)

    Suppose that (χ K,Γ K)⇒(χ,Γ), along some subsequence, as K→∞. Then χ satisfies the martingale problem given by (2.8) and Γ satisfies (2.10).

Proof

To prove the tightness of \(\{\chi^{K} : K \in \mathbb{N}^{*}\}\), we use a criterion from [12]. Let \(n^{K}(t)=\langle\chi^{K}(t),1\rangle\) for t≥0. This process counts the number of mutations that have occurred in the population. For any T>0 we have by Lemma 2.1 that

$$\begin{aligned} \sup_{K \geq1} \mathbb{E}\bigl( n^K(T)\bigr) \leq \|b\|_\infty T \sup_{K \geq1, t\leq T} \mathbb{E}\bigl( \bigl\langle Z^K(t),1\bigr\rangle \bigr)\leq\|b \|_\infty T \sup_{K\geq1, t \geq0} \mathbb{E}\bigl( \bigl\langle X^K(t),1\bigr\rangle \bigr)< \infty. \end{aligned}$$
(3.1)

From this estimate and the martingale problem (2.7), it can be checked by using Aldous and Rebolledo criteria that for every test function \(f\in C_{b}(\mathbb{X},\mathbb{R})\), the laws of 〈χ K,f〉 are tight in \(\mathbb{D}(\mathbb{R}_{+},\mathbb{R})\) and the compact containment condition is satisfied.

Let us now prove the tightness of \(\{\varGamma^{K} : K \in \mathbb{N}^{*}\}\). Let ϵ>0 be fixed. Using Lemma 2.1, there exists a N ϵ >0 such that

$$\begin{aligned} \sup_{K \geq1, t \geq0} \mathbb{P}\bigl(\bigl\langle Z^K(t),1\bigr\rangle> N_{\epsilon }\bigr)<\epsilon. \end{aligned}$$
(3.2)

Since \(\mathbb{X}\) is compact, the set \(\mathcal{K}_{\epsilon}=\{\mu\in \mathcal{M}_{F}(\mathbb{X}),\ \langle\mu,1\rangle\leq N_{\epsilon}\}\) is compact. We deduce that for any T>0

$$ \inf_{K \geq1} \mathbb{E}\bigl( \varGamma^K \bigl([0,T] \times\mathcal {K}_{\epsilon} \bigr) \bigr) \geq(1-\epsilon)T. $$
(3.3)

Indeed,

$$\begin{aligned} \mathbb{E}\bigl( \varGamma^K \bigl([0,T] \times\mathcal{K}_{\epsilon} \bigr) \bigr) = \mathbb{E}\biggl( \int_0^T \mathbf{1}_{\mathcal{K}_{\epsilon}} \bigl( Z^K(t) \bigr)\, dt \biggr) = \int_0^T \mathbb{P}\bigl( \bigl\langle Z^K(t),1 \bigr\rangle\leq N_{\epsilon} \bigr)\, dt, \end{aligned}$$

and the result follows from Fubini’s theorem and (3.2). From Lemma 1.3 of [18], \(\{\varGamma^{K} : K \in \mathbb{N}^{*}\}\) is a tight family of random measures. The joint tightness of \(\{ ( \chi^{K},\varGamma^{K} ) : K\in \mathbb{N}^{*} \}\) is immediate from the tightness of \(\{\chi^{K} : K \in \mathbb{N}^{*} \} \) and \(\{\varGamma^{K} : K \in \mathbb{N}^{*}\}\). This proves part (A).

We now prove part (B). Our proof is adapted from the proof of Theorem 2.1 in [18]. From part (A) we know that the distributions of \(\{(\chi^{K},\varGamma^{K}) : K \in \mathbb{N}^{*}\}\) are tight. Therefore there exists a subsequence \(\{\eta_{K}\}\) along which \((\chi^{K},\varGamma^{K})\) converges in distribution to a limit (χ,Γ). We can take the limit in (2.7) along this subsequence and show that \(M_{t}^{\chi,K}\) converges in distribution to the martingale given by (2.8).

Let us now show that the limiting value Γ satisfies (2.10). From (2.4), for any \(F_{f} \in \mathbb{F}^{2}_{b}\), we get that

$$\begin{aligned} m^{F,f,K}_t & = F_f \bigl(Z^K(t) \bigr)-F_f \bigl(Z^K(0) \bigr)- \int_0^t \mathbb{L}^K F_f\bigl(Z^K(s)\bigr) ds \\ & = F_f \bigl(Z^K(t) \bigr) - F_f \bigl(Z^K(0) \bigr)- \biggl( \frac{1}{K u_K} \biggr)\int_{0}^{t} \int _{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f(\mu) \varGamma^K(ds \times d\mu) \\ &\quad - \frac{\delta ^{F,f,K}(t)}{K u_K} \end{aligned}$$
(3.4)

is a martingale. Here the operator \(\mathbb{B}\) is defined by (2.11) and

$$\begin{aligned} \delta^{F,f,K}(t)= \int_{0}^{t} \bigl(K u_K \mathbb{L}^K F_f \bigl(Z^K(s)\bigr) - \mathbb{B} F_f\bigl(Z^K(s) \bigr) \bigr)ds. \end{aligned}$$
(3.5)

For any \(\mu\in\mathcal{M}_{F}(\mathbb{X})\) we have

$$\begin{aligned} &K u_K \mathbb{L}^K F_f(\mu) - \mathbb{B} F_f(\mu) \\ &\quad = K u_K \int_{\mathbb{X}} p(x) b(x) \biggl[ \int_{\mathbb{X}} \biggl( F_f \biggl( \mu+\frac{1}{K}\delta_{y} \biggr)-F_f(\mu) \biggr) m(x,dy) \biggr]\mu(dx) \\ &\qquad + K \int_{\mathbb{X}} b(x) \biggl( F_f \biggl(\mu+ \frac{1}{K} \delta_{x} \biggr) - F_f(\mu) - \frac{1}{K} F' \bigl( \langle\mu,f \rangle \bigr)f(x) \biggr) \mu(dx) \\ &\qquad + K \int_{\mathbb{X}} \bigl( d(x) + \bigl\langle\mu, \alpha(x,.) \bigr\rangle \bigr) \biggl( F_f \biggl( \mu- \frac{1}{K} \delta_{x} \biggr) - F_f(\mu) + \frac{1}{K} F' \bigl( \langle \mu, f \rangle \bigr)f(x) \biggr) \mu(dx) \\ &\qquad - K u_K \int_{\mathbb{X}} p(x)b(x) \biggl( F_f \biggl( \mu+ \frac{1}{K} \delta_{x} \biggr) - F_f(\mu) \biggr)\mu(dx). \end{aligned}$$
(3.6)

For any \(x\in \mathbb{X}\) and \(\mu\in\mathcal{M}_{F}(\mathbb{X})\), we have by Taylor expansion that for some α 1,α 2∈(0,1):

$$\begin{aligned} F_f \biggl( \mu\pm\frac{1}{K} \delta_{x} \biggr) - F_f(\mu) & = \pm F' \biggl(\langle\mu, f\rangle+ \alpha_1 \frac{f(x)}{K} \biggr) \frac{f(x)}{K} \\ & = \pm F' \bigl(\langle\mu,f\rangle\bigr) \frac{f(x)}{K} + \frac {f(x)^2}{2K^2} F'' \biggl(\langle\mu,f\rangle+ \alpha_2 \frac {f(x)}{K} \biggr). \end{aligned}$$

Therefore we get

$$\begin{aligned} \biggl \vert F_f \biggl( \mu\pm\frac{1}{K} \delta_{x} \biggr) - F_f(\mu ) \biggr \vert \leq \frac{\|F'\|_\infty\|f\|_{\infty}}{K} \end{aligned}$$

and

$$\begin{aligned} \biggl \vert K \biggl( F_f \biggl( \mu\pm\frac{1}{K} \delta_{x} \biggr) - F_f(\mu) \mp\frac{1}{K} F' \bigl(\langle\mu,f\rangle\bigr)f(x) \biggr) \biggr \vert \leq \frac{\|F''\|_\infty\|f\|_{\infty}^2}{2K}. \end{aligned}$$

Using these estimates and Assumption 1.1,

$$\begin{aligned} \bigl \vert K u_K \mathbb{L}^K F_f(\mu) - \mathbb{B} F_f(\mu) \bigr \vert & \leq2 u_K \Vert b \Vert _{\infty} \bigl\|F'\bigr\|_\infty\|f \|_{\infty} \langle \mu,1 \rangle \\ &\quad + \frac{\|F''\|_\infty\|f\|_{\infty}^2}{2K} \bigl( \bigl(\Vert b\Vert _{\infty} +\Vert d \Vert _{\infty} \bigr) \langle\mu ,1 \rangle+ \overline {\alpha} \langle\mu,1 \rangle^2 \bigr). \end{aligned}$$

Pick any T>0. This estimate along with Lemma 2.1 implies that as K→∞, δ F,f,K(t) (given by (3.5)) converges to 0 in \(L^{1}(d\mathbb{P})\), uniformly in t∈[0,T]. Multiplying (3.4) by \(Ku_{K}\) and letting K→∞, we get that along the subsequence \(\{\eta_{K}\}\), the sequence of martingales \(\{ Ku_{K} m^{F,f,K} : K \in \mathbb{N}^{*}\}\) converges in \(L^{1}(d\mathbb{P})\), uniformly in t∈[0,T], to \(\int_{0}^{t} \int_{\mathcal{M}_{F}(\mathbb{X})} \mathbb{B} F_{f} (\mu)\varGamma(ds \times d\mu)\). The limit is itself a martingale. Since it is continuous and has paths of bounded variation, it must be 0 at all times a.s. Hence for any \(F_{f} \in\mathbb{F}^{2}_{b}\),

$$\int_{0}^{t} \int_{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f (\mu )\varGamma(ds \times d\mu) = 0\quad \text{a.s.} $$

Separability of \(\mathbb{F}^{2}_{b}\) ensures that (2.10) also holds. □

Theorem 3.1 shows tightness of the family of occupation measures \(\{\varGamma^{K} : K \in \mathbb{N}^{*}\}\). In the next section we will prove that this family has a unique limit point Γ which is the occupation measure of the process \(\{ \widehat{n}_{\chi'(t)} \delta_{\chi'(t) } : t \geq0 \}\) (see Theorem 2.4) that corresponds to the TSS. This demonstrates the convergence of Z K to the TSS in the sense of occupation measures. Note that the process Z K does not converge in the Skorokhod topology as K→∞ (see Proposition 1 in [2]), due to sharp transitions in the process at the time of mutations. Hence the convergence of Z K to the TSS is shown in the sense of finite dimensional distributions in [2]. In our approach, instead of working with Z K we work with its occupation measure Γ K which remains relatively unaffected by the behaviour of Z K over small time intervals. This mitigates the problem of having sharp mutation-induced transitions in Z K and simplifies the analysis.

4 Characterization of the Limiting Values

4.1 Dynamics Without Mutation

As in [2, 4], to understand the information provided by (2.10), we need to consider the dynamics of monomorphic and dimorphic populations. Our purpose in this section is to show that under the time scale separation given by Assumption 2.3, the operator (2.11) and Assumption 1.1(B) characterize the state of the population between two mutant arrivals. Because of Assumption 1.1(B), we will see that two different traits cannot coexist in the long term and thus it suffices to work with monomorphic or dimorphic initial populations (i.e. the support of \(Z^{K}_{0}\) is one or two singletons).

In Sect. 4.1.1, we consider monomorphic or dimorphic populations and show convergence of the occupation measures when the final trait composition of the population is known. For instance, if the final trait is \(x_{0}\), then the occupation measure of \(Z^{K}\) converges to \(\widehat{n}_{x_{0}} \delta_{x_{0}}(dx)\,dt\). In Sect. 4.1.2, we use couplings with linear birth and death processes to show that the distribution of the final trait composition of the population can be computed from the fitness of the mutant and the resident.

4.1.1 Convergence of the Occupation Measure Γ K in the Absence of Mutation

First, we show that the “invasion implies substitution” assumption (Assumption 1.1(B)) provides information on the behavior of a dimorphic population when we know which trait is fixed.

Definition 4.1

Let \(L^{K}_{0}\) be the operator L K (given by (2.1)) with p(x)=0 for all \(x \in \mathbb{X}\). We will denote by {Y K(t):t≥0} a process with generator \(L^{K}_{0}\) and an initial condition that varies according to the case that is studied. This process has the same birth-death dynamics as a process with generator L K, but there is no mutation.

In this section we investigate how a process with generator \(L^{K}_{0}\) behaves at time scales of order \((Ku_{K})^{-1}\), when the population is monomorphic or dimorphic. We start by proving a simple proposition.

Proposition 4.2

For any \(x,y \in \mathbb{X}\) suppose that \(\pi\in\mathcal{P} ( \mathcal{M}_{F}(\mathbb{X}) )\) is such that

$$\begin{aligned} \pi \bigl( \bigl\{ \mu\in\mathcal{M}_{F}(\mathbb{X}) : \{x\} \subset \mathrm{supp}(\mu) \subset\{x,y\} \bigr\} \bigr) = 1 \end{aligned}$$
(4.1)

and

$$ \int_{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f (\mu)\pi(d\mu) = 0 $$
(4.2)

for all \(F_{f} \in\mathbb{F}^{2}_{b}\). Then for any \(A \in\mathcal {B} ( \mathcal{M}_{F}(\mathbb{X}) )\) we have

$$ \pi(A) = \delta_{\widehat{n}_x \delta_{x}}(A), $$

where \(\widehat{n}_{x}\) has been defined in (2.14).

Proof

Since π satisfies (4.1), any μ picked from the distribution π has the form \(\mu=n_{x}\delta_{x}+n_{y}\delta_{y}\) with \(n_{x}>0\). Let Φ be the map from \(\mathcal{M}_{F}(\mathbb{X})\) to \(\mathbb{R}_{+} \times \mathbb{R}_{+}\) defined by

$$\varPhi(\mu) = \bigl( \mu\{x\}, \mu\{y\} \bigr) = (n_x, n_y). $$

Let \(\pi^{*}=\pi\circ\varPhi^{-1} \in\mathcal{P} ( \mathbb{R}_{+} \times \mathbb{R}_{+} )\) be the image of the distribution π under Φ. Replacing the operator \(\mathbb{B}\) by its definition, we can rewrite (4.2) as

$$\begin{aligned} 0 & = \int_{\mathcal{M}_F(\mathbb{X})}F'\bigl(\langle\mu,f\rangle \bigr) \biggl(\bigl\langle\mu, (b-d)f\bigr\rangle-\int_{\mathbb{X}} f(x) \bigl\langle\mu, \alpha (x,.)\bigr\rangle\mu(dx) \biggr) \pi(d\mu) \\ & = \int_{\mathbb{R}_{+} \times \mathbb{R}_{+}} F' \bigl( f(x)n_x+f(y)n_y \bigr) \bigl[ \bigl( b(x)-d(x) - \alpha(x,x)n_x - \alpha(x,y)n_y \bigr) n_x f(x) \\ &\quad + \bigl( b(y)-d(y) - \alpha(y,x)n_x - \alpha(y,y)n_y \bigr) n_y f(y) \bigr] \pi^{*}(dn_x,dn_y). \end{aligned}$$

This equation can hold for all \(F_{f} \in\mathbb{F}^{2}_{b}\) only if the support of \(\pi^{*}\) consists of \((n_{x},n_{y})\) with \(n_{x}>0\) that satisfy \((b(x)-d(x)-\alpha(x,x)n_{x}-\alpha(x,y)n_{y})n_{x}=0\) and \((b(y)-d(y)-\alpha(y,x)n_{x}-\alpha(y,y)n_{y})n_{y}=0\). The only possible solutions are \((\widehat{n}_{x},0)\) and

$$\begin{aligned} (\widetilde{n}_x,\widetilde{n}_y) = \biggl(&\frac{(b(x)-d(x))\alpha(y,y)-(b(y)-d(y))\alpha(x,y)}{\alpha (x,x)\alpha(y,y)-\alpha(x,y)\alpha(y,x)},\\ & \frac{(b(y)-d(y))\alpha (x,x)-(b(x)-d(x))\alpha(y,x)}{\alpha(x,x)\alpha(y,y)-\alpha (x,y)\alpha(y,x)} \biggr). \end{aligned}$$

However due to Assumption 1.1, either \(\widetilde{n}_{x}\) or \(\widetilde{n}_{y}\) is negative and hence \((\widetilde{n}_{x},\widetilde {n}_{y})\) cannot be in the support of \(\pi^{*}\). Therefore \(\pi ^{*} (\{(\widehat{n}_{x},0)\} ) = 1\) and this proves the proposition. □

Remark 4.3

Note that (0,0), \((\widehat{n}_{x},0)\), \((0,\widehat{n}_{y})\) and \((\widetilde{n}_{x},\widetilde{n}_{y})\) are the stationary solutions of the Lotka-Volterra system given by (2.15).
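These stationary solutions can be evaluated explicitly; the helper below (our naming, assuming the determinant α(x,x)α(y,y)−α(x,y)α(y,x) is nonzero) computes the interior fixed point \((\widetilde{n}_{x},\widetilde{n}_{y})\) by Cramer's rule, and in an example satisfying Assumption 1.1 one coordinate comes out negative.

```python
def interior_fixed_point(x, y, b, d, alpha):
    """Interior stationary solution (n_tilde_x, n_tilde_y) of (2.15);
    requires alpha(x,x)*alpha(y,y) != alpha(x,y)*alpha(y,x)."""
    det = alpha(x, x) * alpha(y, y) - alpha(x, y) * alpha(y, x)
    rx, ry = b(x) - d(x), b(y) - d(y)
    # Cramer's rule for alpha(x,x)n_x + alpha(x,y)n_y = rx, etc.
    return ((rx * alpha(y, y) - ry * alpha(x, y)) / det,
            (ry * alpha(x, x) - rx * alpha(y, x)) / det)
```

For example, with b(z)=2−z², d≡1 and the (hypothetical) kernel α(u,v)=1+0.5|u−v|, the pair x=0.2, y=0.5 satisfies (1.2), and \(\widetilde{n}_{x}\approx-0.30<0\), so the interior point cannot lie in the support of π*.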

Heuristically, the “invasion implies substitution” assumption prevents two traits from coexisting in the long run. If we know which trait survives, then we know that it fixates and Proposition 4.2 provides the form of the solution π to (4.2). In this case, we can deduce the convergence of the occupation measure of Y K(⋅/Ku K ).

Corollary 4.4

Let \(x,y \in \mathbb{X}\). For each \(K \in \mathbb{N}^{*}\), let {Y K(t):t≥0} be a process with generator \(L^{K}_{0}\) and supp(Y K(0))={x,y}. Let T>0, and suppose that there exists a δ>0 such that:

$$\begin{aligned} \lim_{K \to\infty} \mathbb{P}\biggl(Y^K_t \{x\} < \delta \ \textit{for some}\ t \in \biggl[ 0 ,\frac{T}{K u_K} \biggr] \biggr) = 0. \end{aligned}$$
(4.3)

Then for any \(F_{f}\in\mathbb{F}^{2}_{b}\),

$$\int_0^T \int_{\mathcal{M}_F(\mathbb{X})}F_f( \mu)\varGamma_0^K(dt\times d\mu):=\int _{0}^{T} F_f \biggl( Y^K \biggl( \frac{t}{K u_K} \biggr) \biggr) dt \Rightarrow T \times F_f ( \widehat{n}_x \delta_{x} ) $$

as K→∞.

Proof

As in part (A) of Theorem 3.1, we can show that \(\{ \varGamma^{K}_{0} : K \in \mathbb{N}^{*}\}\) is tight in the space \(\mathcal {P} (\mathcal{M}_{F} ([0,T]\times \mathcal{M}_{F}(\mathbb{X}) ) )\). Let \(\varGamma_{0}\) be a limit point. Then from part (C) of Theorem 3.1 we get that

$$\begin{aligned} \int_{0}^{T}\int_{\mathcal{M}_F(\mathbb{X})} \mathbb{B} F_f (\mu )\varGamma_0(dt \times d\mu) = 0 \quad \text{for all } F_f \in\mathbb {F}^{2}_b \text{ a.s.,} \end{aligned}$$
(4.4)

where the operator \(\mathbb{B}\) is given by (2.11).

Since \(\mathrm{supp}(Y^{K}(0))\subset\{x,y\}\), we also have \(\mathrm{supp}(Y^{K}(t))\subset\{x,y\}\) for all t≥0. Let

$$\mathcal{S}_{\delta} = \bigl\{ \mu\in\mathcal{M}_F(\mathbb{X}) : \mu \{ x\} \geq\delta \bigr\}. $$

Observe that \(\varGamma^{K}_{0}([0,T]\times\mathcal{S}_{\delta})\leq T\) a.s. and

$$\varGamma^{K}_{0}\bigl([0,T]\times\mathcal{S}_{\delta}\bigr)\geq T\,\mathbf{1}_{ \{ Y^K_t\{x\} \geq \delta\ \text{for all}\ t \in [ 0, T/(Ku_K) ] \} } \quad\text{a.s.}$$

Hence by (4.3) we get that \(\varGamma^{K}_{0} ([0,T] \times\mathcal{S}_{\delta}) \) converges to T in \(L^{1}(d\mathbb{P})\). Because \(\mathcal{S}_{\delta}\) is a closed set, \(\varGamma_{0} ([0,T] \times\mathcal{S}_{\delta}) = T \text{ a.s.}\)

Let π be the \(\mathcal{P} ( \mathcal{M}_{F}(\mathbb{X}) )\)-valued random variable defined by \(\pi(A)=\varGamma_{0}([0,T]\times A)/T\) for any \(A \in\mathcal{B} ( \mathcal{M}_{F}(\mathbb{X}) )\). Then \(\pi ( \mathcal{S}_{\delta} ) = 1\) and hence π satisfies (4.1) almost surely. Furthermore \(\int_{\mathcal{M}_{F}(\mathbb{X})} \mathbb{B} F_{f} (\mu)\pi(d\mu) = 0\) for all \(F_{f} \in\mathbb{F}^{2}_{b}\), almost surely. Proposition 4.2 then proves the corollary. □
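The averaging mechanism behind this corollary can be checked numerically. The sketch below is a minimal Gillespie-type simulation with hypothetical rates b, d, α for a one-trait population (the monomorphic special case, not the coupling used in the proofs): the time average of the rescaled population size concentrates near \(\widehat{n}_{x}=(b(x)-d(x))/\alpha(x,x)\).

```python
import random

# Sketch with hypothetical rates: time average of the rescaled size of a
# one-trait logistic birth-death process concentrates at n_hat_x.
def logistic_bd_time_average(K, b, d, a, T, rng):
    """Gillespie simulation of the 1-trait logistic birth-death process;
    returns the time average of N_t / K over [0, T]."""
    n = int(0.2 * K)          # start away from equilibrium
    t, acc = 0.0, 0.0
    while t < T and n > 0:
        birth = b * n
        death = (d + a * n / K) * n       # logistic per-capita death rate
        total = birth + death
        dt = rng.expovariate(total)       # waiting time to next event
        acc += (n / K) * min(dt, T - t)   # accumulate occupation weight
        t += dt
        n += 1 if rng.random() < birth / total else -1
    return acc / T

b_x, d_x, a_xx = 2.0, 1.0, 1.0
n_hat_x = (b_x - d_x) / a_xx              # equilibrium density 1.0
avg = logistic_bd_time_average(K=400, b=b_x, d=d_x, a=a_xx,
                               T=50.0, rng=random.Random(2))
print(round(avg, 2))  # close to n_hat_x = 1.0
```

Increasing K shrinks the fluctuations of this time average around \(\widehat{n}_{x}\), roughly at rate \(1/\sqrt{K}\).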

4.1.2 Fixation Probabilities

We have seen in Corollary 4.4 that the behaviour of a dimorphic population is known provided we know which trait survives and fixates. Following Champagnat et al. [2, 4], we answer this question by using couplings with branching processes. This is done in Propositions 4.6 and 4.7, whose proofs are given in the Appendix. These propositions study populations evolving as Markov processes with generator \(L^{K}_{0}\) (see Definition 4.1). Proposition 4.6 shows that over a time period of order \((Ku_{K})^{-1}\), a monomorphic population with a non-negligible initial size does not die out, and a monomorphic population that starts near equilibrium remains near equilibrium. Proposition 4.7 considers a dimorphic population in which the resident population is near equilibrium while the mutant population is small. It shows that an unfavourable mutant dies out quickly (on the evolutionary time scale), while a favourable mutant invades the population with a positive probability whose value can be computed explicitly. Moreover, after a successful invasion the mutant population does not die out over a time period of order \((Ku_{K})^{-1}\). Formally, the process we consider in Propositions 4.6 and 4.7 is defined as follows.

Definition 4.5

Pick two trait values \(x,y\in \mathbb{X}\) and two \(\mathbb{N}^{*}\)-valued sequences \(\{ z_{1}^{K} : K \in \mathbb{N}^{*} \}\) and \(\{z_{2}^{K} : K \in \mathbb{N}^{*} \}\). Let {Y K(t):t≥0} be the process with generator \(L^{K}_{0}\) (see Definition 4.1) and initial condition

$$\begin{aligned} Y^K(0)=\frac{ z_1^K}{K} \delta_x + \frac{ z_2^K}{K} \delta_y. \end{aligned}$$

Here x and y should be seen as the resident trait and the mutant trait respectively.

For any \(x \in \mathbb{X}\) and ϵ>0 let

$$\begin{aligned} \mathcal{N}_{\epsilon}(x) = \bigl\{ \mu\in \mathcal{M}_F(\mathbb{X}) : \mathrm{supp}(\mu)=\{x\} \text{ and } \langle \mu, 1 \rangle\in [ \widehat{n}_x -\epsilon, \widehat{n}_x+ \epsilon ] \bigr\}. \end{aligned}$$
(4.5)

Proposition 4.6

(Behaviour of a monomorphic population)

Suppose that Assumptions 1.1 and 2.3 hold. Let {Y K(t):t≥0} be the process given by Definition 4.5. Assume that \(z^{K}_{2}= 0\) for all \(K \in \mathbb{N}^{*}\). Then for any T>0 we have the following.

  1. (A)

    A monomorphic population with a non-negligible size does not die out in a time of order \((Ku_{K})^{-1}\): Suppose that for some ϵ>0, \(z^{K}_{1} \geq K \epsilon\) for each \(K \in \mathbb{N}^{*}\). Then for some δ>0

    $$\begin{aligned} \lim _{K \to\infty} \mathbb{P}\biggl( \exists t \in \biggl[ 0 , \frac{T}{K u_K} \biggr],\ Y^K_t\{x\} < \delta \biggr) = 0. \end{aligned}$$
    (4.6)
  2. (B)

    A monomorphic population with a size around equilibrium remains near equilibrium for a time of order \((Ku_{K})^{-1}\): Suppose that for some ϵ>0, \(z^{K}_{1} \in[ K(\widehat{n}_{x} -\epsilon), K(\widehat{n}_{x}+\epsilon)]\) for each \(K \in \mathbb{N}^{*}\). Then

    $$\begin{aligned} \lim _{K \to\infty} \mathbb{P}\biggl(\exists t \in \biggl[ 0 , \frac{T}{K u_K} \biggr],\ Y^K(t) \notin\mathcal{N}_{2 \epsilon}(x) \biggr) = 0. \end{aligned}$$
    (4.7)

Proposition 4.7

(Behaviour of a dimorphic population)

Suppose that Assumptions 1.1 and 2.3 hold. Let \(\{Y^{K}(t):t\geq0\}\) be the process given by Definition 4.5. We assume that the resident population is near equilibrium, that is, for a small ϵ>0 we have \(z^{K}_{1} \in[ K(\widehat{n}_{x} -\epsilon), K(\widehat {n}_{x}+\epsilon)]\) for all \(K \in \mathbb{N}^{*}\). Suppose that \(\{t_{K}\}\) is any \(\mathbb{N}^{*}\)-valued sequence such that \(\log K \ll t_{K} \ll 1/(Ku_{K})\). Let \(\mathcal{S}_{K}=( \widehat{n}_{x}-2\epsilon, \widehat{n}_{x}+2\epsilon )\times( 0 ,2 \epsilon)\) and let \(T_{\mathcal{S}_{K}}\) be the stopping time

$$\begin{aligned} T_{\mathcal{S}_K} = \inf \bigl\{ t \geq0 : Y^K(t) \notin\mathcal {S}_K \bigr\}. \end{aligned}$$
(4.8)

Then we have the following.

  1. (A)

    A favorable mutant with a non-negligible size does not die out in a time of order \((Ku_{K})^{-1}\): Suppose that Fit(y,x)>0 and \(z^{K}_{2} > K \epsilon\) for all \(K \in \mathbb{N}^{*}\). There exists an \(\epsilon_{0}>0\) such that if \(\epsilon<\epsilon_{0}\) then

    $$\begin{aligned} \lim _{K \to\infty} \mathbb{P}\biggl(\exists t \in \biggl[ 0 , \frac{T}{K u_K} \biggr],\ Y^K_t\{y\} < \frac{\epsilon}{2} \biggr) = 0. \end{aligned}$$
    (4.9)
  2. (B)

    An unfavorable mutant dies out within time \(t_{K}\): Let Fit(y,x)<0 and \(z^{K}_{2} < K \epsilon\) for all \(K \in \mathbb{N}^{*}\). There exists an \(\epsilon_{0}>0\) such that if \(\epsilon<\epsilon_{0}\) then

    $$\begin{aligned} \lim _{K \to\infty} \mathbb{P}\bigl( T_{\mathcal{S}_K} \leq t_K , Y^K_{ T_{\mathcal{S}_K}}\{y\} = 0 \bigr) = 1. \end{aligned}$$
    (4.10)
  3. (C)

    A favorable mutant either dies out or invades the population within time \(t_{K}\); the probability of invasion is the fitness of the mutant with respect to the resident trait divided by its birth rate: let Fit(y,x)>0 and \(z^{K}_{2} = 1\) for all \(K \in \mathbb{N}^{*}\). Then there exist positive constants \(c,\epsilon_{0}\) such that for all \(\epsilon<\epsilon_{0}\) we have

    $$\begin{aligned} & \limsup _{K \to\infty} \biggl \vert \mathbb{P}\bigl( T_{\mathcal{S}_K} \leq t_K , Y^K_{ T_{\mathcal{S}_K} }\{y\} \geq2 \epsilon \bigr)- \frac{ \mathrm{Fit}(y,x) }{b(y)} \biggr \vert \leq c \epsilon, \end{aligned}$$
    (4.11)
    $$\begin{aligned} \textit{and}\quad & \limsup_{K \to\infty} \biggl \vert \mathbb{P}\bigl( T_{\mathcal{S}_K} \leq t_K , Y^K_{ T_{\mathcal{S}_K} }\{y\} = 0 \bigr) - \biggl( 1- \frac{\mathrm{Fit}(y,x)}{b(y)} \biggr) \biggr \vert \leq c \epsilon. \end{aligned}$$
    (4.12)

Using Propositions 4.6 and 4.7, we can determine the state of the process in a large time window \([\epsilon/(Ku_{K}), \epsilon^{-1}/(Ku_{K})]\) from the initial condition. This allows us to understand what happens to the population if we neglect the transitions due to rare mutation events.

Corollary 4.8

Suppose that Assumptions 1.1 and 2.3 hold. For each \(K \in \mathbb{N}^{*}\) let {Y K(t):t≥0} be a process with generator \(L^{K}_{0}\).

  1. (A)

    Suppose that for some \(x \in \mathbb{X}\) and ϵ>0 we have supp(Y K(0))={x} and \(Y^{K}_{0}\{x\}>\epsilon\) for all \(K \in \mathbb{N}^{*}\). Then

    $$ \lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal {N}_{\epsilon}(x) \ \textit{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) = 1. $$
    (4.13)
  2. (B)

    Suppose that for some \(x,y \in \mathbb{X}\) and ϵ>0, we have Fit(y,x)<0, supp(Y K(0))={x,y}, \(Y^{K}_{0}\{x\}\in [ \widehat{n}_{x} -\epsilon, \widehat {n}_{x}+\epsilon ]\) and \(Y^{K}_{0}\{y\}<\epsilon\) for all \(K \in \mathbb{N}^{*}\). Then for a sufficiently small ϵ,

    $$ \lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal {N}_{\epsilon}(x) \ \textit{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) = 1. $$
  3. (C)

    Suppose that for some \(x,y \in \mathbb{X}\) and ϵ>0 we have Fit(y,x)>0, supp(Y K(0))={x,y}, \(Y^{K}_{0}\{x\} \in [ \widehat{n}_{x} -\epsilon, \widehat {n}_{x}+\epsilon ]\) and \(Y^{K}_{0}\{y\} = 1/K\) for all \(K \in \mathbb{N}^{*}\). Then

    $$\begin{aligned} & \lim_{\epsilon\to0} \lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \textit{for all } t \in \biggl[\frac {\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) \\ &\quad = 1- \lim_{\epsilon\to0} \lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(y) \ \textit{for all } t \in \biggl[\frac {\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) \\ &\quad = 1 - \frac{ \mathrm{Fit}(y,x)}{b(y)}. \end{aligned}$$

Proof

We first prove part (A). Let \(Y^{K}\) be the process with generator \(L^{K}_{0}\) such that \(\mathrm{supp}(Y^{K}(0))=\{x\}\) and \(Y^{K}_{0}\{x\}>\epsilon\). Part (A) of Proposition 4.6 implies that for some δ>0

$$\begin{aligned} \lim_{K \to\infty} \mathbb{P}\biggl( Y^K_t\{x\} < \delta \text{ for some } t \in \biggl[ 0 ,\frac{\epsilon^{-1}}{K u_K} \biggr] \biggr) = 0. \end{aligned}$$

From Corollary 4.4 we know that for any t≥0 and \(F_{f}\in\mathbb{F}^{2}_{b}\),

$$\int_{0}^{t} F_f \biggl( Y^K \biggl( \frac{s}{K u_K} \biggr) \biggr) ds \Rightarrow t\, F_f ( \widehat{n}_x \delta_{x} ) $$

as K→∞. Hence if we define \(\sigma^{K}_{\epsilon} = \inf\{ t \geq0: Y^{K}(t) \in \mathcal{N}_{\epsilon/2}(x)\}\), then \(K u_{K}\sigma^{K}_{\epsilon} \rightarrow0\) in probability as K→∞. Now let the process \(\{\widetilde{Y}^{K}(t): t \geq0\}\) be given by \(\widetilde{Y}^{K}(t) = Y^{K}(t+\sigma^{K}_{\epsilon})\). By the strong Markov property, this process also has generator \(L^{K}_{0}\). Moreover, its initial state lies in \(\mathcal{N}_{\epsilon/2}(x)\). Using part (B) of Proposition 4.6 proves part (A).

For part (B), consider the set \(\mathcal{S}_{K}=( \widehat{n}_{x} - 2\epsilon, \widehat{n}_{x} + 2\epsilon) \times(0,2\epsilon)\) and let \(T_{\mathcal{S}_{K}}\) be given by (4.8). Let the sequence \(\{t_{K}\}\) be as in Proposition 4.7 and consider the event \(E^{K}(\epsilon)=\{ T_{\mathcal{S}_{K}} \leq t_{K} , Y^{K}_{T_{\mathcal{S}_{K}} } \{y\} = 0\}\). Since Fit(y,x)<0, part (B) of Proposition 4.7 shows that, for sufficiently small ϵ>0, the probability of the event \(E^{K}(\epsilon)\) approaches 1 as K→∞. Note that \(Ku_{K}t_{K}\to0\) as K→∞. The proof of part (B) then follows from part (A) of this corollary together with the strong Markov property at time \(T_{\mathcal{S}_{K}}\).

For part (C), fix an ϵ>0 and define {Y K(t):t≥0} with the initial condition specified in the statement. Let \(\mathcal {S}_{K}\) and \(T_{\mathcal{S}_{K}}\) be as in the proof of part (B). We can write

$$\begin{aligned} &\mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \text{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) \\ &\quad = \sum_{i=1}^3 \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \text{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac {\epsilon^{-1}}{Ku_K} \biggr] ; E^K_i( \epsilon) \biggr), \end{aligned}$$
(4.14)

where \(E^{K}_{1}(\epsilon)= \{ T_{\mathcal{S}_{K}} \leq t_{K} , Y^{K}_{T_{\mathcal{S}_{K}}}\{y\} = 0 \}\), \(E^{K}_{2}(\epsilon)= \{ T_{\mathcal{S}_{K}} \leq t_{K} , Y^{K}_{T_{\mathcal{S}_{K}}}\{y\} \geq 2\epsilon \}\) and \(E^{K}_{3}(\epsilon)= ( E^{K}_{1}(\epsilon )\cup E^{K}_{2}(\epsilon) )^{c}\).

Let us consider the term in (4.14) corresponding to i=1. On the event \(E_{1}^{K}(\epsilon)\), we have \(Y^{K}_{T_{\mathcal{S}_{K}}}\{x\}\in (\widehat{n}_{x}-2\epsilon,\widehat{n}_{x}+2\epsilon)\) and \(Y^{K}_{T_{\mathcal{S}_{K}}}\{y\} = 0\). The strong Markov property at time \(T_{\mathcal{S}_{K}}\), together with part (A) of this corollary and part (C) of Proposition 4.7, implies that this term converges to 1−Fit(y,x)/b(y) as K→∞ and ϵ→0.

The term corresponding to i=2 in (4.14) can be written as

$$\begin{aligned} \mathbb{E}\biggl[ \mathbf{1}_{E^K_2(\epsilon)}\, \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \text{for all } t \in \biggl[\frac{\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \ \bigg\vert\ \mathcal{F}_{T_{\mathcal{S}_K}} \biggr) \biggr]. \end{aligned}$$

(4.15)

On the event \(E_{2}^{K}(\epsilon)\), \(Y^{K}_{T_{\mathcal{S}_{K}}}\{y\}\geq 2\epsilon\) and \(Y^{K}_{T_{\mathcal{S}_{K}}}\{x\}\in( \widehat{n}_{x} - 2\epsilon, \widehat{n}_{x} + 2\epsilon)\). From part (A) of Proposition 4.7, the probability of the process \(\{ Y^{K}_{t}\{y\} : t \geq0\}\) going below ϵ between times \(T_{\mathcal{S}_{K}}\) and \(T_{\mathcal{S}_{K}}+\epsilon^{-1}/K u_{K}\) tends to 0 as K→∞. Hence the probability of the event \(\{ \exists t \in[T_{\mathcal{S}_{K}}, T_{\mathcal {S}_{K}}+\epsilon^{-1}/K u_{K}] : y \notin\mathrm{supp}(Y^{K}_{t}) \}\) also tends to 0 as K→∞. Note that if \(y \in\mathrm{supp}(Y^{K}_{t}) \) then \(Y^{K}(t) \notin \mathcal{N}_{\epsilon}(x) \). Conditioning by \(\mathcal{F}_{T_{\mathcal{S}_{K}}}\) and using the strong Markov property shows that the term corresponding to i=2 in (4.14) converges to 0 as K→∞ and ϵ→0.

Part (C) of Proposition 4.7 implies that

$$\begin{aligned} \lim_{\epsilon\to0} \lim_{K \to\infty} \mathbb{P}\bigl( E^K_1(\epsilon )\cup E^K_2( \epsilon) \bigr) = 1. \end{aligned}$$

Hence

$$\lim_{\epsilon\to0 }\lim_{K \to\infty} \mathbb{P}\bigl( E^K_3(\epsilon ) \bigr) = 0 $$

which shows that the term corresponding to i=3 in (4.14) converges to 0.

Gathering the results for i∈{1,2,3}, we get

$$\lim_{\epsilon\to0 }\lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(x) \ \text{for all } t \in \biggl[\frac {\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) = 1- \frac{\mathrm{Fit}(y,x)}{b(y)}. $$

The proof of

$$\lim_{\epsilon\to0 }\lim_{K \to\infty} \mathbb{P}\biggl( Y^K(t) \in \mathcal{N}_{\epsilon}(y) \ \text{for all } t \in \biggl[\frac {\epsilon}{K u_K}, \frac{\epsilon^{-1}}{Ku_K} \biggr] \biggr) = \frac{\mathrm{Fit}(y,x)}{b(y)} $$

is similar. This completes the proof of part (C) of the corollary. □

4.2 Proof of Theorem 2.4: Convergence to the TSS

We now have the tools to prove Theorem 2.4. By Theorem 3.1, the distributions of \(\{(\chi ^{K},\varGamma^{K}) : K \in \mathbb{N}^{*}\}\) are tight. Let (χ,Γ) be a limiting value satisfying (2.8) and (2.10). If we can prove that (2.12) holds for the process {χ′(t):t≥0} (introduced in the statement of Theorem 2.4), then the distribution of Γ is uniquely determined, which in turn uniquely determines the distribution of χ due to the martingale problem given by (2.8). Hence we only have to prove part (B) of Theorem 2.4.

Since (χ,Γ) is a limiting value, by Prohorov’s theorem there exists a subsequence \(\{(\widetilde {\chi}^{K},\widetilde {\varGamma}^{K})\}\) that converges in distribution to (χ,Γ) as K→∞. The Skorokhod representation theorem (see e.g. [1]) says that, on the same probability space as (χ,Γ), we can construct a sequence, which by an abuse of notation we again denote by \(\{(\chi^{K},\varGamma^{K})\}\), such that \((\chi^{K},\varGamma^{K})\) converges to (χ,Γ) a.s. and \((\chi^{K},\varGamma^{K})\) has the same marginal distributions as \(\{ (\widetilde {\chi}^{K},\widetilde {\varGamma}^{K})\}\).

Assuming \((\chi^{K},\varGamma^{K})\to(\chi,\varGamma)\) a.s. as K→∞, we now identify (χ,Γ). The main idea is that between successive appearances of new mutants, our process \(\{X^{K}(t):t\geq0\}\) behaves like the process considered in Corollary 4.8. When a fit mutant appears, either it goes extinct quickly or the process stabilizes around the new monomorphic equilibrium characterized by the mutant trait. Between two rare mutations, the trait and size of the population can be inferred from the occupation measure, because the population is monomorphic and its size is shown to reach an equilibrium.

Throughout this section, whenever we say “as K→∞ and ϵ→0” we mean that the limit K→∞ is taken first and the limit ϵ→0 next. For \(K, i \in \mathbb{N}^{*}\), let \(\tau^{K}_{i}\) and \(\tau_{i}\) be the i-th jump times of the processes \(\chi^{K}\) and χ respectively. For convenience we set \(\tau^{K}_{0}= \tau_{0} =0\). Since \((\chi^{K},\varGamma^{K})\to(\chi,\varGamma)\) a.s., for any \(m \in \mathbb{N}^{*}\) we have \((\tau^{K}_{1},\dots,\tau ^{K}_{m} ) \rightarrow (\tau_{1} , \dots, \tau_{m} )\) a.s. Using (2.8) and Lemma 2.1, we know that \(\tau_{i} - \tau_{i-1}>0\) almost surely for each \(i \in \mathbb{N}^{*}\). Thus

$$\begin{aligned} \lim_{\epsilon\rightarrow0}\lim _{K \to\infty} \mathbb{P}\bigl( \tau ^K_{i} - \tau^K_{i-1} > \epsilon \bigr) = 1. \end{aligned}$$
(4.16)

Pick an arbitrary \(x_{\text{arb}} \in \mathbb{X}\). For any \(i \in \mathbb{N}^{*}\) and ϵ>0 define

$$\begin{aligned} R^{K,\epsilon}_i = \begin{cases} \dfrac{\int_{\tau^K_{i-1}+\epsilon}^{\tau^K_i}\int_{\mathcal{M}_F(\mathbb{X})}\langle\mu,x\rangle \varGamma^K(dt\times d\mu)}{\int_{\tau^K_{i-1}+\epsilon}^{\tau^K_i}\int_{\mathcal{M}_F(\mathbb{X})}\langle\mu,1\rangle \varGamma^K(dt\times d\mu)} &\text{if } \tau^K_i>\tau^K_{i-1}+\epsilon,\\ x_{\text{arb}} &\text{otherwise.} \end{cases} \end{aligned}$$

(4.17)

Note that if \(\mathrm{supp}(Z^{K}(s))=\{x\}\) for all \(s \in[ \tau^{K}_{i-1} + \epsilon, \tau^{K}_{i} )\), then we have \(R^{K,\epsilon}_{i} = x\). Heuristically, \(R^{K,\epsilon}_{i}\) is an estimator of the trait that fixates between the (i−1)-th and the i-th mutation. Define an event

$$\begin{aligned} E^{K}_i(\epsilon) = & \bigl\{ \tau^K_i \geq\tau^K_{i-1} + \epsilon \mbox{ and } Z^K(t) \in\mathcal{N}_{\epsilon}\bigl(R^{K,\epsilon}_i \bigr) \ \text{for all } t \in\bigl[ \tau^K_{i-1} + \epsilon, \tau^K_i \bigr) \bigr\}, \end{aligned}$$

where for any \(x \in \mathbb{X}\), \(\mathcal{N}_{\epsilon}(x)\) is defined by (4.5). The next proposition shows how \(R^{K,\epsilon}_{i}\) and \(E^{K}_{i}(\epsilon) \) behave as K→∞ and ϵ→0.

Proposition 4.9

Suppose that Assumptions 1.1, 2.2 and 2.3 hold. Then for any \(i\in \mathbb{N}^{*}\)

$$\begin{aligned} \lim_{\epsilon\to0 } \lim_{K \to\infty} \mathbb{P}\bigl( E^K_i(\epsilon ) \bigr) = 1. \end{aligned}$$
(4.18)

Furthermore for any measurable set \(A \subset \mathbb{X}\), \(x \in \mathbb{X}\) and \(i \in \mathbb{N}^{*}\) we have

$$\begin{aligned} \lim_{\epsilon\to0}\lim_{K\to\infty} \mathbb{P}\bigl( R^{K,\epsilon}_{i+1}\in A \big\vert R^{K,\epsilon}_{i}=x \bigr) = \int_{\mathbb{X}} \biggl[ \mathbf{1}_{A}(y)\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} + \mathbf{1}_{A}(x) \biggl(1-\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} \biggr) \biggr] m(x,dy). \end{aligned}$$

(4.19)

Proof

For each \(i \in \mathbb{N}^{*}\), we can construct a process \(\{ Y^{K}_{i}(t) : t \geq0 \}\) with generator \(L^{K}_{0}\) such that

$$\begin{aligned} Z^K \bigl(\tau^K_{i-1} + t\bigr) = Y^K_i \biggl( \frac{t}{K u_K} \biggr) \ \text{for all } t \in\bigl[0, \tau^K_{i} - \tau^K_{i-1}\bigr). \end{aligned}$$
(4.20)

For i=1, we obtain from part (B) of Assumption 2.2 and part (A) of Corollary 4.8 that

$$\begin{aligned} \lim_{K \to\infty} \mathbb{P}\biggl( Y^K_1(t) \in\mathcal{N}_{\epsilon }(x_0) \ \text{for all } t \in \biggl[ \frac{\epsilon}{K u_K} , \frac{\epsilon^{-1}}{K u_K} \biggr] \biggr) = 1. \end{aligned}$$
(4.21)

This limit along with (4.16) and (4.20) imply that \(R^{K,\epsilon}_{1} \to x_{0}\) with probability converging to 1 as K→∞ and ϵ→0. Moreover (4.18) holds for i=1 thanks to (4.20) and (4.21).

For any \(i \in \mathbb{N}^{*}\), let \(U^{K}_{i}\) denote the type of the new mutant that appears at time \(\tau^{K}_{i}\). Pick \(x,y \in \mathbb{X}\). On the event \(\{ E^{K}_{i}(\epsilon), R^{K,\epsilon}_{i} = x,U^{K}_{i} = y \}\), we have \(\mathrm{supp}(Z^{K} ( \tau^{K}_{i} )) = \{x,y\}\), \(Z^{K}_{\tau^{K}_{i}}\{ y\} = 1/K\) and \(Z^{K}_{\tau^{K}_{i}}\{x\} \in[ \widehat{n}_{x} -\epsilon, \widehat{n}_{x}+\epsilon]\). Using parts (B) and (C) of Corollary 4.8 and (4.20), we obtain that

$$\begin{aligned} &\lim_{\epsilon\to0}\lim_{K \to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon),R^{K,\epsilon}_{i+1} = x \big\vert E^K_i(\epsilon), R^{K,\epsilon}_i = x,U^K_i = y \bigr) = \biggl(1- \frac{ [\mathrm{Fit}(y,x)]^+}{b(y)} \biggr) \\ \text{and}\quad &\lim_{\epsilon\to0}\lim_{K \to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon),R^{K,\epsilon}_{i+1} = y \big\vert E^K_i(\epsilon), R^{K,\epsilon}_i= x,U^K_i = y \bigr) = \frac{ [\mathrm{Fit}(y,x)]^+}{b(y)}. \end{aligned}$$

Since the distribution of \(U^{K}_{i}\) conditional on \(\{ E^{K}_{i}(\epsilon ), R^{K,\epsilon}_{i} = x\}\) is m(x,dy), for any measurable set \(A \subset \mathbb{X}\) we get

$$\begin{aligned} \lim_{\epsilon\to0}\lim_{K\to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon), R^{K,\epsilon}_{i+1}\in A \big\vert E^K_i(\epsilon), R^{K,\epsilon}_{i}=x \bigr) = \int_{\mathbb{X}} \biggl[ \mathbf{1}_{A}(y)\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} + \mathbf{1}_{A}(x) \biggl(1-\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} \biggr) \biggr] m(x,dy). \end{aligned}$$

(4.22)

Taking \(A = \mathbb{X}\) in (4.22) we obtain

$$\begin{aligned} \lim_{\epsilon\to0}\lim _{K \to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon) \big\vert E^K_i(\epsilon) \bigr) = 1. \end{aligned}$$
(4.23)

This relation shows that if (4.18) holds for some \(i \in \mathbb{N}^{*}\), then it also holds for (i+1). Since we have already shown that (4.18) holds for i=1, by induction we can conclude that (4.18) holds for all \(i \in \mathbb{N}^{*}\). From (4.22) and (4.23) we can deduce that for any measurable set \(A \subset \mathbb{X}\), \(x \in \mathbb{X}\) and \(i \in \mathbb{N}^{*}\)

$$\begin{aligned} \lim_{\epsilon\to0 } \lim_{K \to\infty} \mathbb{P}\bigl( R^{K,\epsilon }_{i+1} \in A \big\vert R^{K,\epsilon}_{i} = x \bigr) = \lim_{\epsilon\to0}\lim_{K \to\infty} \mathbb{P}\bigl( E^K_{i+1}(\epsilon), R^{K,\epsilon}_{i+1} \in A \big\vert E^K_i(\epsilon), R^{K,\epsilon}_i = x \bigr). \end{aligned}$$

Hence (4.19) holds due to (4.22). This completes the proof of the proposition. □

We now complete the proof of part (B) of Theorem 2.4 by using Proposition 4.9.

Proof of Theorem 2.4(B)

For each \(i \in \mathbb{N}^{*}\) define

$$\begin{aligned} R_i= \frac{\int_{\tau_{i-1}}^{\tau_i } \int_{\mathcal{M}_F(\mathbb{X})}\langle\mu, x \rangle\varGamma(dt \times d \mu)}{\int_{\tau _{i-1}}^{\tau_i} \int_{\mathcal{M}_F(\mathbb{X})}\langle\mu, 1\rangle \varGamma(dt \times d \mu)}. \end{aligned}$$

Since \((\chi^{K},\varGamma^{K})\to(\chi,\varGamma)\) a.s., for any positive integer m we must have \((R^{K,\epsilon}_{1} ,\dots, R^{K,\epsilon}_{m} ) \to( R_{1},\dots ,R_{m})\) a.s. as K→∞ and ϵ→0. Proposition 4.9 implies that \(\{R_{i} : i \in \mathbb{N}^{*}\}\) is a Markov chain on \(\mathbb{X}\) with \(R_{1}=x_{0}\) and transition probabilities given by

$$\begin{aligned} \mathbb{P}\bigl( R_{i+1}\in A \big\vert R_{i}=x \bigr) = \int_{\mathbb{X}} \biggl[ \mathbf{1}_{A}(y)\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} + \mathbf{1}_{A}(x) \biggl(1-\frac{[\mathrm{Fit}(y,x)]^{+}}{b(y)} \biggr) \biggr] m(x,dy) \end{aligned}$$

(4.24)

for any measurable set \(A \subset \mathbb{X}\), \(x \in \mathbb{X}\) and \(i \in \mathbb{N}^{*}\).

Define two \(\mathbb{X}\)-valued processes as

The almost sure convergence of (χ K,Γ K)→(χ,Γ) implies that χK,ϵ converges almost surely to χ′ as K→∞ and ϵ→0 in the Skorokhod space \(\mathbb{D}([0,T],\mathbb{X})\) for any T>0.

We now show that the process {χ′(t):t≥0} uniquely characterizes the distribution of the limiting occupation measure Γ. For any \(F_{f} \in\mathbb{F}^{2}_{b}, i \in \mathbb{N}^{*}\) and t≥0, define the real-valued variables

$$\begin{aligned} \begin{aligned} &\rho^{K,\epsilon}_i = \int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} \int_{\mathcal{M}_F(\mathbb{X})}F_f(\mu) \varGamma ^K(ds \times d \mu)\quad \text{ and }\\ &\rho_i = \int_{\tau_{i-1} \wedge t}^{\tau_i \wedge t} \int_{\mathcal{M}_F(\mathbb{X})}F_f(\mu) \varGamma(ds \times d \mu ).\end{aligned} \end{aligned}$$
(4.25)

Then certainly \(\rho^{K,\epsilon}_{i} \to\rho_{i}\) a.s. as K→∞ and ϵ→0. Suppose F f has the form F f (μ)=F(〈μ,f〉) for some \(f \in C_{b}(\mathbb{X},\mathbb{R})\) and \(F \in C^{2}_{b}(\mathbb{R},\mathbb{R})\). Then for any \(\mu\in\mathcal{N}_{\epsilon}(x)\)

$$\begin{aligned} \bigl| F_f( \mu) - F_f( \widehat {n}_x \delta_x )\bigr| = \bigl| F\bigl( \langle\mu,f \rangle\bigr) - F\bigl( \widehat {n}_x f(x)\bigr) \bigr| \leq\bigl\|F'\bigr\|_{\infty} \|f \|_\infty \epsilon. \end{aligned}$$

Therefore, on the event \(E^{K}_{i}(\epsilon)\),

$$\begin{aligned} \biggl \vert \rho^{K,\epsilon}_i - \int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} F \bigl( \widehat{n}_{R^{K,\epsilon}_i} f\bigl(R^{K,\epsilon}_i\bigr) \bigr) ds\biggr \vert & = \biggl \vert \int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} \bigl( F_f \bigl( Z^K_s \bigr) - F \bigl( \widehat {n}_{R^{K,\epsilon}_i} f\bigl(R^{K,\epsilon}_i\bigr) \bigr) \bigr) ds \biggr \vert \\ & \leq \int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} \bigl \vert F_f \bigl( Z^K_s \bigr) - F \bigl( \widehat{n}_{R^{K,\epsilon }_i} f\bigl(R^{K,\epsilon}_i\bigr) \bigr) \bigr \vert ds \\ & \leq\bigl\|F'\bigr\|_{\infty} \|f\|_\infty \epsilon\bigl( \tau^K_i \wedge t - \bigl( \tau^K_{i-1} + \epsilon\bigr) \wedge t \bigr). \end{aligned}$$

Since

$$\int_{( \tau^K_{i-1} + \epsilon) \wedge t}^{\tau^K_i \wedge t} F \bigl( \widehat{n}_{R^{K,\epsilon}_i} f\bigl(R^{K,\epsilon}_i\bigr) \bigr) ds = \bigl( \tau^K_{i} \wedge t - \bigl(\tau^K_{i-1} + \epsilon\bigr) \wedge t \bigr) F \bigl( \widehat{n}_{R^{K,\epsilon}_i} f \bigl(R^{K,\epsilon}_i\bigr) \bigr) $$

converges to \(( \tau_{i} \wedge t - \tau_{i-1} \wedge t ) F_{f} ( \widehat{n}_{R_{i}} \delta_{R_{i}} )\) as K→∞ and ϵ→0, we have that

$$\begin{aligned} \rho_i = ( \tau_{i} \wedge t - \tau_{i-1} \wedge t ) F_f ( \widehat{n}_{R_i} \delta_{R_i} ) = \int_{\tau _{i-1} \wedge t}^{\tau_i \wedge t} F_f ( \widehat{n}_{\chi '(s)} \delta_{\chi'(s)} )ds. \end{aligned}$$

This is true for each \(i \in \mathbb{N}^{*}\) and this ensures that (2.12) holds with χ′ defined as above.

Recall that the process {χ(t):t≥0} satisfies the martingale problem given by (2.8). Using (2.12) we obtain that for any \(F_{f} \in\mathbb{F}^{2}_{b}\)

$$\begin{aligned} &F_f\bigl(\chi(t)\bigr) - F_f\bigl(\chi(0) \bigr) -\int_{0}^{t} b\bigl(\chi'(s) \bigr)p\bigl(\chi'(s)\bigr) \widehat{n}_{\chi'(s)}\\ &\quad \times \biggl( \int _{\mathbb{X}} \bigl( F_f\bigl(\chi(s)+ \delta_{y}\bigr)-F_f\bigl(\chi(s)\bigr) \bigr) m\bigl( \chi'(s),dy\bigr) \biggr) ds, \end{aligned}$$

is a martingale. This shows that if χ′(t)=x, then the next jump time of the process χ is exponentially distributed with parameter \(b(x)p(x) \widehat{n}_{x}\). Therefore for each \(i \in \mathbb{N}^{*}\), (τ i τ i−1) is exponentially distributed with parameter \(b(R_{i})p(R_{i}) \widehat{n}_{R_{i}}\). Since \(\{R_{i} : i \in \mathbb{N}^{*}\}\) is a Markov chain with R 1=x 0 and transition probabilities given by (4.19), we can conclude that {χ′(t):t≥0} is a Markov process with generator \(\mathbb{C}\) (given by (2.13)) and initial state x 0. This completes the proof of part (B) of Theorem 2.4. □
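To make the limiting object concrete, the following sketch simulates a TSS of this form directly: waiting times are exponential with parameter \(b(x)p(x)\widehat{n}_{x}\), and a mutant y drawn from m(x,·) replaces x with probability \([\mathrm{Fit}(y,x)]^{+}/b(y)\). All model ingredients (b, d, α, p and the Gaussian mutation kernel on \(\mathbb{X}=[0,1]\)) are hypothetical choices for illustration, not taken from the paper.

```python
import random

# Hypothetical model ingredients on X = [0,1] (assumed, for illustration)
b = lambda x: 2.0 + 2.0 * x              # birth rate
d = lambda x: 1.0                        # death rate
alpha = lambda x, y: 1.0 + (x - y) ** 2  # competition kernel
p = lambda x: 0.5                        # mutation probability
n_hat = lambda x: (b(x) - d(x)) / alpha(x, x)            # monomorphic equilibrium
fit = lambda y, x: b(y) - d(y) - alpha(y, x) * n_hat(x)  # invasion fitness

def tss(x0, horizon, rng):
    """Simulate the trait substitution sequence up to time `horizon`."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        t += rng.expovariate(b(x) * p(x) * n_hat(x))     # next mutation time
        if t > horizon:
            return path
        y = min(1.0, max(0.0, x + rng.gauss(0.0, 0.1)))  # mutant from m(x,.)
        if rng.random() < max(fit(y, x), 0.0) / b(y):    # fixation probability
            x = y
            path.append((t, x))

path = tss(0.2, 200.0, random.Random(1))
traits = [v for _, v in path]
print(len(traits), round(traits[-1], 3))
```

With these assumed rates, \(\mathrm{Fit}(y,x)=2(y-x)-(y-x)^{2}(1+2x)\) is positive only for y>x, so every simulated substitution increases the trait and the TSS drifts monotonically toward the boundary of the trait space.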