1 Introduction

The extinction of local populations can happen frequently in nature, particularly in small and fragmented habitats due to various causes, including genetic deterioration, over-harvesting, climate change, and environmental catastrophes. Even in the absence of all other causes, the finiteness of population size and the resultant demographic stochasticity will eventually drive any isolated population to extinction. Therefore, the expected time until population extinction due to demographic stochasticity alone provides a baseline scenario estimation for the long-term viability of the population. It is closely related to the concept of minimal viable population size, and is of great importance to the conservation of species and global biodiversity (Shaffer 1981; Traill et al. 2007).

The study of population extinction due to demographic stochasticity is a long-standing yet rapidly advancing topic of research, with Francis Galton’s famous problem of the extinction of family names already proposed in 1873 [for reviews of the history see Kendall (1966)]. In the last decades, new mathematical tools have been developed to analyse stochastic population dynamics. A number of such tools, such as the Fokker–Planck approximation and the Wentzel–Kramers–Brillouin (WKB) approximation methods, were originally developed for solving problems in statistical mechanics and quantum mechanics. We can take advantage of analogies between biological systems and corresponding physical systems [e.g., the extinction of population from a steady state driven by weak noise is very similar to the escaping problem of particles in a trapping potential (Dykman et al. 1994)], and apply methods developed for tackling physical problems to answering biological questions.

In this paper, we provide a pedagogical comparative study of the WKB and Fokker–Planck approximation methods in analyzing population extinction from a stable state driven by weak demographic fluctuations. We examine some widely-used stochastic models of population extinction as examples, and show that the nature of the stable states at the mean-field level determines the behaviour of the mean extinction time. In systems with an attracting fixed point or limit cycle, extinction is caused by rare events, and the WKB method is the natural approach. For systems with marginally stable states, extinction is driven by typical Gaussian fluctuations, so the Fokker–Planck approximation is also valid.

2 Extinction Time of Populations Formed by a Single Species

2.1 The Deterministic Logistic Growth Model

One of the most widely applied population growth models of a single species is the logistic growth model, or the Verhulst model (Verhulst 1838). This model has been extensively used in modelling the saturation of population size due to resource limitations (Murray 2007; McElreath and Boyd 2008; Haefner 2012), and formed the basis for several extended models that predict more accurately the population growth in real biological systems, such as the Gompertz, Richards, Schnute, and Stannard models [for a review, see Tsoularis and Wallace (2002)].

The classic logistic model takes the form

$$\begin{aligned} \frac{{\hbox {d}}n}{{\hbox {d}}t}=r n \left( 1-\frac{n}{K}\right) , \end{aligned}$$
(1)

where n represents population size, the positive constant r defines the growth rate and K is the carrying capacity. The unimpeded growth rate is modeled by the first term rn and the second term captures the competition for resources, such as food or living space. The solution to the equation has the form of a logistic function

$$\begin{aligned} n(t) = \frac{K n_0 e^{rt}}{K + n_0 \left( e^{rt} - 1\right) }, \end{aligned}$$
(2)

where \(n_{0}\) is the initial population size. Note that \(\displaystyle \lim _{t\rightarrow \infty }n(t)=K\). This limit is approached asymptotically for any positive initial population size, so the deterministic model never predicts extinction.
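As a quick numerical sanity check, the closed-form solution in Eq. (2) can be compared against the rate equation (1). The sketch below is only illustrative: the parameter values and the function names `logistic_solution` and `logistic_rhs` are assumptions introduced here, not part of the original model description.

```python
import math

def logistic_solution(t, n0, r, K):
    """Closed-form solution, Eq. (2), of the logistic equation, Eq. (1)."""
    growth = math.exp(r * t)
    return K * n0 * growth / (K + n0 * (growth - 1.0))

def logistic_rhs(n, r, K):
    """Right-hand side of Eq. (1)."""
    return r * n * (1.0 - n / K)

# Illustrative (assumed) parameter values.
r, K, n0 = 0.5, 100.0, 2.0

# Compare dn/dt from a central finite difference of Eq. (2) with Eq. (1) at t = 3.
t, h = 3.0, 1e-5
derivative = (logistic_solution(t + h, n0, r, K)
              - logistic_solution(t - h, n0, r, K)) / (2.0 * h)
residual = derivative - logistic_rhs(logistic_solution(t, n0, r, K), r, K)

# At late times the solution saturates at K and never returns to zero.
late_value = logistic_solution(60.0, n0, r, K)
print(residual, late_value)
```

The residual should vanish up to finite-difference error, and the late-time value should be indistinguishable from K.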

2.2 Population Dynamics Under Demographic Stochasticity

When the typical size of the population is very large (\(1/K \ll 1\)), fluctuations in the observed number of individuals are typically small. In this case, the deterministic logistic growth model generally provides a good approximation to the population dynamics by predicting that the population will evolve towards and then persist at the stable stationary state where \(n=K\). However, in the presence of demographic noise, occasional large fluctuations can still induce extinction, making the states that are stable at the deterministic level metastable. In any finite population, extinction will occur as \(t\rightarrow \infty \) with unit probability.

In an established population under logistic growth with a large carrying capacity, the population size fluctuates around K due to random birth and death events, and typically the fluctuation is small in the large K limit. But from time to time, a rare large fluctuation can happen, and it may lead to the extinction of the population. In such situations, it is interesting and often biologically important to determine the most probable paths and the mean extinction time, starting from the stable population size. A rigorous approach for solving these problems in the weak noise limit is the large deviation theory (Touchette 2009). We use the logistic growth model to illustrate the main idea.

Let the function \(T(n\rightarrow m)\) represent the probability of the transition \(n\rightarrow m\) per unit time. For the logistic model, \(T(n\rightarrow n+1)=\lambda _n=Bn\) describes the birth rate of the population, where B is the per capita birth rate, and \(T(n \rightarrow n-1)=\mu _n=n+B n^2/K\) describes the death rate of the population, in which the first term represents spontaneous death, and the second term represents death caused by competition. The function \(P(n,t)\) is the probability for the system to be in the state with population size n at time t, and obeys the master equation

$$\begin{aligned} \frac{{\hbox {d}} P(n,t)}{ {\hbox {d}} t}&=\sum _{m} \left[ T(m \rightarrow n,t) P(m,t)-T(n \rightarrow m,t) P(n,t)\right] \nonumber \\&=\mu _{n+1}P(n+1,t)+\lambda _{n-1}P(n-1,t)-(\mu _{n}+\lambda _{n})P(n,t). \end{aligned}$$
(3)

The initial condition is \(P(n,t=t_0)=\delta _{n,n(0)}\). Since \(n=0\) is an absorbing state, \(T(0 \rightarrow m)=0\) for \(m>0\), and we have

$$\begin{aligned} \frac{{\hbox {d}}P(n=0,t)}{{\hbox {d}}t}=\sum _{m>0} T(m\rightarrow 0) P(m,t). \end{aligned}$$
(4)

The average population size \(\overline{n}=\sum _{n} P(n,t) n\) satisfies a deterministic averaged (mean-field) rate equation

$$\begin{aligned} \frac{{\hbox {d}}\overline{n}}{{\hbox {d}}t}=(B-1)\overline{n}-B\frac{\overline{n}^2}{K}, \end{aligned}$$
(5)

where we neglect the number fluctuation, namely, \(\overline{n^2}=\overline{n}^2\) (mean-field). Equation (5) has the same logistic form as Eq. (1), with growth rate \(B-1\) and carrying capacity \((B-1)K/B\), and is thus the mean-field limit of the stochastic logistic model. Equation (5) has two fixed points: an attracting fixed point \(\overline{n}_\mathrm{s}=(B-1)K/B\), provided \(B>1\); and a repelling fixed point \(\overline{n}_\mathrm{e}=0\) (extinction point). In the presence of noise there is a quasi-stationary state for \(B>1\), in which the population fluctuates near \(\overline{n}_\mathrm{s}\). However, rare events will eventually drive the system to \(n=\overline{n}_\mathrm{e}=0\), where extinction happens. It is therefore important to estimate the extinction time.
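The birth-death process defined by \(\lambda _n\) and \(\mu _n\) can be sampled directly with a standard stochastic simulation (Gillespie-type) algorithm. The following is a minimal sketch under assumed parameter values (B = 2, K = 100); the function name `gillespie_logistic` is hypothetical and the algorithm is a generic illustration, not a procedure taken from the text. It checks that the time-averaged population size stays close to the attracting point \((B-1)K/B\) of Eq. (5).

```python
import random

def gillespie_logistic(B, K, n_start, t_max, seed=0):
    """Sample one trajectory with rates lambda_n = B n and mu_n = n + B n^2 / K.

    Returns (time-averaged population size, final population size).
    Assumes n_start > 0 and t_max > 0.
    """
    rng = random.Random(seed)
    n, t, weighted_sum = n_start, 0.0, 0.0
    while n > 0:
        birth = B * n
        death = n + B * n * n / K
        total = birth + death
        wait = rng.expovariate(total)      # exponential waiting time to next event
        if t + wait >= t_max:              # stop at the end of the time window
            weighted_sum += n * (t_max - t)
            t = t_max
            break
        weighted_sum += n * wait
        t += wait
        # Choose birth or death in proportion to the two rates.
        n += 1 if rng.random() < birth / total else -1
    return weighted_sum / t, n

B, K = 2.0, 100                            # assumed illustrative values
n_s = (B - 1.0) * K / B                    # attracting fixed point of Eq. (5)
time_average, n_final = gillespie_logistic(B, K, n_start=int(n_s), t_max=50.0)
print(n_s, time_average, n_final)
```

On such a short time window extinction is extremely unlikely, which is exactly why direct simulation becomes impractical for estimating the mean extinction time when K is large.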

The commonly used methods for estimating the time until extinction include the Fokker–Planck approximation (also called the diffusion approximation in the population genetics literature), and the Wentzel–Kramers–Brillouin (WKB) method. The former has a long history of application in studying biological population dynamics, going back to Fisher (1922), and has been widely adopted since the seminal work of Kimura (1964). Nowadays it has become an indispensable topic in population genetics textbooks (Ewens 2004; Svirezhev and Passekov 2012). But despite its honourable place in mathematical biology, the application of the Fokker–Planck approximation is restricted to systems where the extinction is driven by typical Gaussian fluctuations (such as genetic drift), characterised by frequent but small jumps (Gardiner 1985). The WKB method was introduced into biology much later (most works were published only in the last two decades), yet it has been gaining popularity steadily, as it generally provides more accurate predictions of the mean extinction time if the extinction is driven by rare events, and can be applied under much broader conditions.

In the following we will first introduce the more general WKB method and then the classic Fokker–Planck approximation, in order to facilitate the comparison of the two methods later on.

2.2.1 Wentzel–Kramers–Brillouin (WKB) Method

The Wentzel–Kramers–Brillouin (WKB) method was named after the three physicists Gregor Wentzel (Wentzel 1926), Hendrik Kramers (Kramers 1926) and Léon Brillouin (Brillouin 1926). It provides a systematic and controllable approximation method to calculate the mean extinction time in the small-fluctuation limit. It has been applied widely in studying different extinction problems, such as large fluctuations in numbers of molecules in chemical reactions (Dykman et al. 1994), the fixation of a strategy in evolutionary games (Black et al. 2012), and the extinction of epidemics (Chen et al. 2017).

In a finite population under logistic growth, once the stationary state is reached, the population size fluctuates around the metastable attractor \(\bar{n}_\mathrm{s}\). The characteristic scale of the fluctuations is of the order of \(1/\sqrt{K}\) (Central Limit Theorem). However, occasionally much larger fluctuations also happen that take the system far from the stable state (Dykman et al. 1994). Such large fluctuations are rare events, and their probabilities form the tails of the quasi-stationary population state distribution. The mean extinction time \(\tau \) (mean time to reach the absorbing state \(n_\mathrm{e}=0\)) is determined by this quasi-stationary distribution according to Fermi’s golden rule (Landau and Lifshitz 2013)

$$\begin{aligned} \tau ^{-1}=\sum _{n>0} T(n\rightarrow 0) P_\mathrm{st}(n), \end{aligned}$$
(6)

where the stationary distribution \(P_\mathrm{st}(n)\) satisfies

$$\begin{aligned} 0=\sum _{m} \left[ T(m \rightarrow n,t) P_\mathrm{st}(m)-T(n \rightarrow m,t) P_\mathrm{st}(n)\right] . \end{aligned}$$
(7)

We introduce the rescaled population size \(x=n/K=n \epsilon \) with \(\epsilon =1/K\), and the rescaled rates \(\lambda (x)=\lambda _n/K=Bx\) and \(\mu (x)=\mu _n/K=x+Bx^2\). We look for the solution of Eq. (7) by proposing a large deviation form of the stationary distribution

$$\begin{aligned} P_\mathrm{st}(x) = C \exp \left( -\mathcal S_{\epsilon }/\epsilon \right) \end{aligned}$$
(8)

with the WKB ansatz: \(\mathcal S_{\epsilon }=\sum ^{\infty }_{i=0} \epsilon ^{i} \mathcal S_{i}\). Here \(\epsilon \) characterises the noise level, and in the weak-noise limit, \(\epsilon \rightarrow 0\). An asymptotic expansion in small \(\epsilon \) corresponds to a semiclassical approximation. In both quantum mechanics and statistical mechanics this is also known as a WKB expansion. In the former case, \(\epsilon \) is the Planck constant \(\hbar \), characterising quantum fluctuations; and in the latter case, \(\epsilon \) is the temperature, characterising thermal fluctuations. In stochastic population dynamics, meanwhile, the small parameter \(\epsilon \) is \(1/K\), characterising population size fluctuations.

Plugging Eq. (8) into Eq. (7) and expanding \(\mathcal S_{\epsilon }\) to \(\mathcal {O}(\epsilon )\), we obtain

$$\begin{aligned} \mathcal S_0(x)=\int ^{x} p(x') \ dx', \quad \mathcal S_1(x)=\frac{1}{2} \ln [\mu (x)\lambda (x)] \end{aligned}$$
(9)

where \(p(x)=\ln \left[ \mu (x)/\lambda (x)\right] =\ln \left[ (1+Bx)/B\right] \). It is possible to construct an effective Hamiltonian such that the solution describes an optimal path which represents the ground (lowest-energy) state of the effective Hamiltonian:

$$\begin{aligned} H(x,p)=\lambda (x)(e^{p}-1)+\mu (x)(e^{-p}-1), \end{aligned}$$
(10)

where the canonical momentum \(p=\partial \mathcal S_0/ \partial x\).

We hence obtain the stationary distribution

$$\begin{aligned} P_\mathrm{st}(x)=\frac{B-1}{\sqrt{2\pi K B x^2(1+Bx)}} e^{-K \mathcal S_0(x)}, \end{aligned}$$
(11)

where

$$\begin{aligned} \mathcal S_0(x)=1-B^{-1}-x+(x+B^{-1})\ln (x+B^{-1}). \end{aligned}$$
(12)

The leading-order WKB action \(\mathcal S_0\) describes an effective exponential barrier to extinction and the prefactor in Eq. (11) is proportional to \(e^{-\mathcal S_1(x)}\).

Using Eq. (6) we obtain the mean extinction time for the logistic growth model (Assaf and Meerson 2010) for \(1/K \ll x \ll 1/\sqrt{K}\),

$$\begin{aligned} \tau =\sqrt{\frac{2 \pi B}{K}}\frac{1}{(B-1)^2} e^{K\mathcal S_{0}(0)}, \end{aligned}$$
(13)

which is exponentially large in K. The analytical result for the mean extinction time, Eq. (13), shows excellent agreement with Monte Carlo simulations (Assaf and Meerson 2017).
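For moderate K, the WKB estimate in Eq. (13) can also be compared with a direct numerical solution of the first-passage problem. The sketch below does not follow the derivation above. Instead it solves the standard backward equation \(\sum _m T(n\rightarrow m)(\tau _m-\tau _n)=-1\) with \(\tau _0=0\), which is swapped in here as an independent check. The truncation at `n_max`, the parameter values, and the function names are assumptions made for illustration, and the size parameter under the square root in Eq. (13) is read as the carrying capacity K.

```python
import numpy as np

def mean_extinction_times(B, K, n_max):
    """Mean time to reach n = 0 from each n = 1..n_max (backward equation).

    The chain is truncated at n_max by switching off births there, which is
    an assumption that is harmless when n_max is far above the attracting point.
    """
    n = np.arange(1, n_max + 1, dtype=float)
    birth = B * n
    death = n + B * n**2 / K
    birth[-1] = 0.0                          # reflecting upper boundary
    A = np.diag(-(birth + death))
    A += np.diag(birth[:-1], k=1)            # coefficient of tau_{n+1}
    A += np.diag(death[1:], k=-1)            # coefficient of tau_{n-1} (tau_0 = 0)
    return np.linalg.solve(A, -np.ones(n_max))

def tau_wkb(B, K):
    """Eq. (13), with S_0(0) taken from Eq. (12) at x = 0."""
    s0 = 1.0 - 1.0 / B + np.log(1.0 / B) / B
    return np.sqrt(2.0 * np.pi * B / K) / (B - 1.0) ** 2 * np.exp(K * s0)

B, K = 2.0, 50                               # assumed illustrative values
tau = mean_extinction_times(B, K, n_max=4 * K)
n_s = int(round((B - 1.0) * K / B))
tau_numeric = tau[n_s - 1]                   # start at the attracting point
ratio = tau_numeric / tau_wkb(B, K)
print(tau_numeric, tau_wkb(B, K), ratio)
```

The two numbers are expected to agree up to corrections that shrink as K grows, so the ratio should be of order one.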

In this section we derived the mean extinction time of a population under logistic growth in a pedagogical way, for illustrating the basic concepts and techniques of the WKB method. For more applications of the WKB method in single species stochastic population models, Ovaskainen and Meerson (2010) provide an excellent overview. A recent review of Assaf and Meerson (2017) includes various applications of the WKB method in multi-species population dynamics. On the technical aspect, an introduction to the path integral representation of master equations can be found in Weber and Frey (2017).

2.2.2 Fokker–Planck Approximation Method

The master equation, the exact formulation of the stochastic population dynamics, is generally difficult to solve. The WKB method provides a systematic and controllable way to approximately solve the stationary master equation by utilising the small parameter \(\epsilon =1/K\). Another way of approximately solving the master equation is to start from a formal Kramers–Moyal expansion:

$$\begin{aligned} \frac{\partial P(X,t)}{\partial t}=\sum ^{\infty }_{m=1}\frac{(-1)^{m}}{m!}\frac{\partial ^{m}}{ \partial X^{m} }\left[ a_m(X,t)P(X,t)\right] , \end{aligned}$$
(14)

where

$$\begin{aligned} a_{m}(X,t)=\int dY (Y-X)^{m} T(X\rightarrow Y). \end{aligned}$$
(15)

The Pawula theorem states that the expansion in Eq. (14) either stops after the first or second term, or must contain an infinite number of terms. If the expansion stops after the second term, it is called the Fokker–Planck equation (Risken 1996). Van Kampen made the Kramers–Moyal expansion controllable by introducing a small parameter that is the inverse of a system size \(\Omega ^{-1}\) (Gardiner 1985). In the context of population dynamics governed by the logistic growth function, \(\Omega \) corresponds to the carrying capacity K, and the random variable X in Eq. (14) corresponds to the population size n. Since we use the example of logistic growth throughout Sect. 2, we will trade generality for consistency and use K and n in the following. In terms of the scaled variable \(x=n/K\), \(a_m \sim K^{1-m/2}\), so the Kramers–Moyal expansion can be truncated at the second term when K is large, and the system reduces to the Fokker–Planck equation. However, the Van Kampen system size expansion should be used with caution. It may be valid only when x is in the vicinity of its fixed point. For rare events driven by large fluctuations, the Fokker–Planck approximation may yield large errors.

For the logistic growth model, the system size is characterised by the carrying capacity K. In terms of rescaled variable \(x=n/K\), the master equation (3) becomes

$$\begin{aligned} \frac{{\hbox {d}} P(x,t)}{{\hbox {d}} t}&=K\mu (x+\delta x)P(x+\delta x,t)+K\lambda (x-\delta x)P(x-\delta x,t)\nonumber \\&\quad -K(\mu (x)+\lambda (x))P(x,t), \end{aligned}$$
(16)

where \(\delta x=1/K\). Expanding Eq. (16) to \((\delta x)^2\), we obtain the Fokker–Planck equation

$$\begin{aligned} \frac{{\hbox {d}} P(x,t)}{{\hbox {d}}t} = \frac{1}{2K}\frac{\partial ^2 (g^2 P)}{\partial x^2} -\frac{\partial (f P)}{\partial x}, \end{aligned}$$
(17)

where \(g^2=\lambda +\mu =(B+1)x+Bx^2\) and \(f=\lambda -\mu =(B-1)x-Bx^2\). In the population genetics literature, the first term is often attributed to the effect of genetic drift, and the second term is attributed to directional selection (Kimura 1964; Ewens 2004). A diffusive process described by a Fokker–Planck equation can be equivalently described by a corresponding Langevin-type stochastic differential equation (Gardiner 1985). For Eq. (17), the corresponding stochastic differential equation reads

$$\begin{aligned} {\hbox {d}}x =f(x,t)\,{\hbox {d}}t+K^{-1/2}g(x,t)\, {\hbox {d}}W(t), \end{aligned}$$
(18)

where \(W(t)\) is a Wiener process, whose formal derivative \(\xi (t)={\hbox {d}}W/{\hbox {d}}t\) is Gaussian white noise with \(\langle \xi (t) \xi (t') \rangle =\delta (t-t')\). Note that all higher-order cumulants of the noise vanish, reflecting that the stochastic process is diffusive, consistent with the Fokker–Planck equation.
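The Langevin description can be explored with a simple Euler–Maruyama discretisation. This is a minimal sketch, assuming the Itô reading of Eq. (18) with the drift multiplied by dt, and assuming illustrative values B = 2, K = 500; the function name `simulate_langevin` and the clipping at x = 0 are choices made here for illustration. Linearising f and g around the attracting point suggests a stationary variance of about 1/K, which the simulated path should roughly reproduce.

```python
import numpy as np

def simulate_langevin(B, K, x0, dt, n_steps, seed=0):
    """Euler-Maruyama integration of dx = f dt + K**-0.5 * g dW (Ito form)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        xi = x[i]
        f = (B - 1.0) * xi - B * xi**2                 # drift, lambda - mu
        g = np.sqrt((B + 1.0) * xi + B * xi**2)        # noise strength, sqrt(lambda + mu)
        dW = rng.normal(0.0, np.sqrt(dt))              # Wiener increment, variance dt
        # Clip at zero so that x = 0 acts as the absorbing extinction state.
        x[i + 1] = max(xi + f * dt + g * dW / np.sqrt(K), 0.0)
    return x

B, K = 2.0, 500                                        # assumed illustrative values
x_s = (B - 1.0) / B
path = simulate_langevin(B, K, x0=x_s, dt=0.01, n_steps=20000)
print(path.mean(), path.var(), 1.0 / K)
```

Such a simulation samples only the typical Gaussian fluctuations around the attracting point; it says nothing reliable about the far tail that controls extinction.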

The stationary distribution of Eq. (17) reads (Gardiner 1985)

$$\begin{aligned} P_\mathrm{st}(x)\propto e^{-K \mathcal {S}_\mathrm{FP}(x)}, \end{aligned}$$
(19)

where \(0<x<x_\mathrm{s}=\bar{n}_\mathrm{s}/K\) and the effective potential

$$\begin{aligned} \mathcal {S}_\mathrm{FP}(x)=\int ^{x_\mathrm{s}}_{x} {\hbox {d}}y \frac{2f(y)}{g^2(y)}=2\left[ x-1+B^{-1}-2\ln \left( \frac{1+B+Bx}{2B}\right) \right] . \end{aligned}$$
(20)

In the vicinity of the stable point (the attracting fixed point at the deterministic level) \(x_\mathrm{s}=(B-1)/B\), \(\mathcal {S}_{0}(x)\simeq \mathcal {S}_\mathrm{FP}(x)\simeq (x-x_\mathrm{s})^2/2\ll 1\), leading to Gaussian fluctuations. A comparison between \(\mathcal {S}_0(x)\) and \(\mathcal {S}_\mathrm{FP}(x)\) for different x values is shown in Fig. 1. Near the stable fixed point, fluctuations are Gaussian, and hence the stochastic process can be well approximated by the Fokker–Planck equation. But if we are interested in rare events driven by large fluctuations, for example the extinction event, the Fokker–Planck approximation becomes invalid. As shown in the previous section, the mean extinction time is determined by the effective potential at \(x=0\), which is far from \(x_\mathrm{s}\) unless \(B\rightarrow 1\).

Fig. 1 Comparison between \(\mathcal {S}_0(x)\) in Eq. (12) and \(\mathcal {S}_\mathrm{FP}(x)\) in Eq. (20) for \(B=10\) (Color figure online)

Compare the effective potential given by the WKB approximation

$$\begin{aligned} \mathcal {S}_0(0)=1-B^{-1}+B^{-1}\ln B^{-1}, \end{aligned}$$
(21)

and the corresponding result given by Fokker–Planck approximation

$$\begin{aligned} \mathcal {S}_\mathrm{FP}(0)=2\left\{ -1+B^{-1}-2\ln \left[ (1+B)/2B\right] \right\} , \end{aligned}$$
(22)

we can see that although the Fokker–Planck approximation predicts the correct functional form of the mean extinction time, namely, \(\tau \sim e^{c K}\), it yields an error that is exponentially large in K (Doering et al. 2005; Bressloff and Newby 2014). Only in the special case \(B\rightarrow 1\) can the difference \(\mathcal {S}_0(0)-\mathcal {S}_\mathrm{FP}(0)=o ((B-1)^2)\) be neglected. In this limit, \(x_\mathrm{s} \rightarrow 0\), and hence the extinction is a typical event driven by Gaussian fluctuations. In summary, the Fokker–Planck approximation is valid only in the special case \(B\rightarrow 1\), where extinction is driven by typical Gaussian fluctuations; for generic \(B>1\), the extinction is caused by rare events, and the Fokker–Planck approximation fails to give accurate estimates of the mean extinction time.
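The size of the discrepancy is easy to evaluate numerically from Eqs. (12) and (20). The sketch below is illustrative only: the function names and the chosen values of B and K are assumptions. It tabulates the two barrier heights at \(x=0\) and the factor \(e^{K[\mathcal {S}_0(0)-\mathcal {S}_\mathrm{FP}(0)]}\) by which the exponential parts of the two estimates differ.

```python
import math

def s0_wkb(x, B):
    """Leading-order WKB action, Eq. (12)."""
    return 1.0 - 1.0 / B - x + (x + 1.0 / B) * math.log(x + 1.0 / B)

def s_fp(x, B):
    """Fokker-Planck effective potential, Eq. (20)."""
    return 2.0 * (x - 1.0 + 1.0 / B
                  - 2.0 * math.log((1.0 + B + B * x) / (2.0 * B)))

def exponential_mismatch(B, K):
    """exp{K [S_0(0) - S_FP(0)]}: ratio of the exponential factors of tau."""
    return math.exp(K * (s0_wkb(0.0, B) - s_fp(0.0, B)))

for B in (1.1, 2.0, 10.0):                   # assumed illustrative values
    print(f"B = {B:5.1f}  S_0(0) = {s0_wkb(0.0, B):.5f}  "
          f"S_FP(0) = {s_fp(0.0, B):.5f}  "
          f"mismatch at K = 1000: {exponential_mismatch(B, 1000):.3g}")
```

Even when the two barrier heights differ only in the third decimal place, multiplying the difference by a large K in the exponent produces a large multiplicative error in the predicted extinction time.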

The difference in the range of application between the WKB method and the Fokker–Planck method arises from the fundamental difference between the master equation and the Fokker–Planck equation. A diffusion process characterised by the Fokker–Planck equation can always be approximated by a jump process described by the master equation, while the reverse is true only if the jumps are frequent and their step sizes are small compared with the time and length scales of the observables (Gardiner 1985).

3 Extinction Time of Populations of Two Interacting Species

In populations of two interacting species (e.g. predator and prey) or two different types of individuals (e.g. susceptible and infected), the equilibrium state predicted by the deterministic rate equations can be a stable fixed point, a stable limit cycle, or a family of marginally stable cycles, or there may be no attractor at all. In general, for an attracting fixed point or a stable limit cycle, the extinction from a stable quasi-stationary coexistence state is a rare event driven by large fluctuations, and the mean extinction time will be exponentially large in population size. In this situation the Fokker–Planck approximation is invalid, whereas the WKB approximation method can provide a fully controlled weak-noise expansion. But if the coexistence state is marginally stable, then extinction is a diffusive process driven by typical fluctuations rather than by a rare large jump. In this case the Fokker–Planck approximation is also valid and the mean extinction time grows algebraically with the initial population size. We discuss the different cases separately in the following.

3.1 Extinction from an Attracting Fixed Point

As an example of multi-species stochastic systems with an attracting fixed point, we consider the endemic SIR model. The SIR model describes the spread of a disease in a population, with susceptible (S), infected (I) and recovered (R) individuals. Assuming that N is the total population size at equilibrium, individuals are born (as susceptible) at rate \(\mu N\). Susceptible, infected, and recovered individuals die at rates \(\mu S\), \(\mu _I I\), and \(\mu _R R\), respectively. Susceptible individuals become infected at rate \((\beta /N)SI\), and infected individuals recover at rate \(\gamma I\). The corresponding deterministic rate equations for the SIR model are

$$\begin{aligned} \frac{{\hbox {d}}S}{{\hbox {d}}t}&=\mu N-\mu S-(\beta /N) S I, \nonumber \\ \frac{{\hbox {d}}I}{{\hbox {d}}t}&=-\mu _I I - \gamma I +(\beta /N) S I, \nonumber \\ \frac{{\hbox {d}}R}{{\hbox {d}}t}&=-\mu _R R + \gamma I. \end{aligned}$$
(23)

According to this formulation, the R individuals obtain lifelong immunity and will never become S or I again, so their dynamics are decoupled from those of the other two subpopulations. For simplicity, we will ignore the R individuals, and focus on the population dynamics of only S and I individuals. By setting \(\mu _I+\gamma =\Gamma \), which measures the effective death rate of the infected, we obtain the corresponding SI model:

$$\begin{aligned} \frac{{\hbox {d}}S}{{\hbox {d}}t}&=\mu N-\mu S-(\beta /N) S I, \nonumber \\ \frac{{\hbox {d}} I}{{\hbox {d}}t}&=-\Gamma I +(\beta /N) S I. \end{aligned}$$
(24)

For a sufficiently high infection rate, \(\beta >\Gamma \), there is an attracting fixed point \(\bar{S}=N \Gamma / \beta \), \(\bar{I}=\mu (\beta -\Gamma )N/(\beta \Gamma )\), corresponding to an endemic state, and an unstable fixed point \(\bar{S}=N\), \(\bar{I}=0\), describing an uninfected steady-state population.
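The stability statements for the two fixed points of Eq. (24) can be checked by linearising the rate equations. The following sketch uses assumed parameter values with \(\beta >\Gamma \); the function names are hypothetical and the Jacobian is written out by hand from Eq. (24).

```python
import numpy as np

def si_rhs(S, I, mu, Gamma, beta, N):
    """Right-hand side of the SI rate equations, Eq. (24)."""
    dS = mu * N - mu * S - (beta / N) * S * I
    dI = -Gamma * I + (beta / N) * S * I
    return np.array([dS, dI])

def si_jacobian(S, I, mu, Gamma, beta, N):
    """Jacobian of Eq. (24) with respect to (S, I)."""
    return np.array([
        [-mu - (beta / N) * I, -(beta / N) * S],
        [(beta / N) * I, -Gamma + (beta / N) * S],
    ])

# Illustrative (assumed) parameter values with beta > Gamma.
mu, Gamma, beta, N = 1.0, 10.0, 20.0, 1000.0

endemic = (N * Gamma / beta, mu * (beta - Gamma) * N / (beta * Gamma))
disease_free = (N, 0.0)

eig_endemic = np.linalg.eigvals(si_jacobian(*endemic, mu, Gamma, beta, N))
eig_disease_free = np.linalg.eigvals(si_jacobian(*disease_free, mu, Gamma, beta, N))
print(eig_endemic, eig_disease_free)
```

Both eigenvalues at the endemic point should have negative real parts (attracting), while the disease-free point should have one positive eigenvalue (unstable against the introduction of infection).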

Accounting for the demographic stochasticity and random contacts between the susceptible and infected, the master equation for the probability \(P(n,m,t)\) of finding n susceptible and m infected individuals at time t reads

$$\begin{aligned} \frac{dP(n,m,t)}{dt}&=\mu \left[ N(P(n-1,m)-P(n,m))+(n+1)P(n+1,m)-nP(n,m)\right] \nonumber \\&\quad +\,\Gamma \left[ (m+1)P(n,m+1)-mP(n,m)\right] \nonumber \\&\quad +\,(\beta /N)\left[ (n+1)(m-1)P(n+1,m-1)-nmP(n,m)\right] . \end{aligned}$$
(25)

In a finite population, the extinction of the disease, starting from the quasi-stationary endemic state, occurs within finite time due to rare events. It is therefore of interest to find the mean time it takes for the I subpopulation to go extinct. For weak fluctuations (\(1/N \ll 1\)), a long-lived quasi-stationary distribution has a Gaussian peak around the stable state of the deterministic model. The Fokker–Planck approximation to the master equation can accurately describe small deviations from the stable state, but it fails to describe the probability of large fluctuations.

In Sect. 2 we applied the WKB approximation directly to the quasi-stationary distribution that solves the stationary master equation. Elgart and Kamenev (2004) proposed an alternative method that instead approximates the evolution equation for the probability generating function. The generating function associated with the probability distribution is defined as

$$\begin{aligned} G(p_{S},p_{I},t)=\sum _{n,m}p^n_{S}p^m_{I}P(n,m,t). \end{aligned}$$
(26)

Using the ansatz \(G(p_{S},p_{I},t)=\exp [-S_{\epsilon }(p_{S},p_{I},t)/\epsilon ]\) with \(S_{\epsilon }(p_{S},p_{I},t)=\sum _{i=0}\epsilon ^{i}S_{i}\) and \(\epsilon =1/N\), to the leading order in \(\epsilon \), one obtains the Hamilton-Jacobi equation \(\partial _t \mathcal {S}_0+H=0\), where H is the effective classical Hamiltonian (Kamenev and Meerson 2008):

$$\begin{aligned} H=\mu (p_{S}-1)(N-S)-\Gamma (p_{I}-1)I-(\beta /N)(p_{S}-p_{I})p_{I} SI. \end{aligned}$$
(27)

The meanings of \(p_{S}\) and \(p_{I}\) are clear now. They are the canonical momenta of S and I respectively, and \(S=-\partial _{p_{S}} \mathcal {S}_0\) and \(I=-\partial _{p_{I}}\mathcal {S}_0\). The phase space defined by the Hamiltonian in Eq. (27) provides an important tool to study the extinction dynamics. Demographic stochasticity that induces the extinction of the disease proceeds along the optimal path: a particular trajectory in the phase space. All the mean-field trajectories, described by Eq. (24), lie in the zero-energy (\(H=0\)) plane \(p_S=p_I=1\). As illustrated in Fig. 2, the attracting fixed point of the mean-field theory becomes a hyperbolic point \(A=[\bar{S},\bar{I},1,1]\) in this phase space. There are two more zero-energy fixed points in the system: the point \(C=[N,0,1,1]\) that is present in the mean-field description, and the emergent fixed point \(B=[N,0,1,\Gamma /\beta ]\) due to the presence of fluctuations. Both of them are hyperbolic and describe extinction of the disease.
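That the three points A, B and C are zero-energy fixed points can be verified directly from Eq. (27). The sketch below assumes the sign convention \(\dot{S}=\partial H/\partial p_S\), \(\dot{I}=\partial H/\partial p_I\), \(\dot{p}_S=-\partial H/\partial S\), \(\dot{p}_I=-\partial H/\partial I\), which reproduces Eq. (24) on the plane \(p_S=p_I=1\); the parameter values and function names are assumptions made for illustration.

```python
import numpy as np

def hamiltonian(S, I, pS, pI, mu, Gamma, beta, N):
    """Effective Hamiltonian of Eq. (27)."""
    return (mu * (pS - 1.0) * (N - S) - Gamma * (pI - 1.0) * I
            - (beta / N) * (pS - pI) * pI * S * I)

def hamilton_flow(S, I, pS, pI, mu, Gamma, beta, N):
    """(dS/dt, dI/dt, dpS/dt, dpI/dt), assuming dq/dt = dH/dp, dp/dt = -dH/dq."""
    k = beta / N
    dS = mu * (N - S) - k * pI * S * I
    dI = -Gamma * I - k * (pS - 2.0 * pI) * S * I
    dpS = mu * (pS - 1.0) + k * (pS - pI) * pI * I
    dpI = Gamma * (pI - 1.0) + k * (pS - pI) * pI * S
    return np.array([dS, dI, dpS, dpI])

# Illustrative (assumed) parameter values with beta > Gamma.
params = dict(mu=1.0, Gamma=10.0, beta=20.0, N=1000.0)
mu, Gamma, beta, N = params["mu"], params["Gamma"], params["beta"], params["N"]

points = {
    "A": (N * Gamma / beta, mu * (beta - Gamma) * N / (beta * Gamma), 1.0, 1.0),
    "B": (N, 0.0, 1.0, Gamma / beta),
    "C": (N, 0.0, 1.0, 1.0),
}
for name, point in points.items():
    print(name, hamiltonian(*point, **params), hamilton_flow(*point, **params))
```

Point B differs from C only in the momentum \(p_I=\Gamma /\beta \), so it is invisible in the mean-field plane and appears only once fluctuations are included.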

The optimal path (instanton) that brings the system from the stable endemic state to the extinction of the disease is given by the trajectory that minimises the WKB action \(\mathcal S_{0}\). The optimal path must be a zero-energy trajectory. It turns out that there is no trajectory going directly from A to C (see Fig. 2). Instead, the fluctuational extinction point B is crucial in the disease extinction.

Fig. 2 (a) Projection of the optimal path on the (x,y) plane (thick black line) and the mean-field trajectory (\(p_x=p_y=0\)) describing an epidemic outbreak (thin red line). (b) Projection of the optimal path on the (\(p_x\), \(p_y\)) plane. \(x=S/N-1\), \(y=I/N\); \(K=20\) and \(\delta \equiv 1-\Gamma /\beta =0.5\) (Kamenev and Meerson 2008). Permission for reuse obtained from the publisher (Color figure online)

The mean extinction time of the disease \(\tau \) is exponentially large in \(N \gg 1\) and

$$\begin{aligned} \tau \sim \exp \{N\mathcal {S}_0[\text {optimal path}]\}, \end{aligned}$$
(28)

where

$$\begin{aligned} {\mathcal S}_0[\text {optimal path}]=\int ^{\infty }_{-\infty } (p_{S}\dot{S}+p_{I}\dot{I}) \ {\hbox {d}}t, \end{aligned}$$
(29)

and the integration is evaluated along the optimal path going from A to B (Kamenev and Meerson 2008).

For populations of more than one species interacting with each other, the analytical form of the mean extinction time is generally not available (Assaf and Meerson 2017), and the optimal path can be computed only numerically. It is also worth mentioning that, for extinction from a deterministically stable limit cycle driven by large fluctuations, the corresponding mean extinction time is also exponentially large in the population size N (Smith and Meerson 2016).

3.2 Extinction from Marginally Stable Equilibrium States

If the extinction is not driven by rare events, it can occur much more quickly. As we will see, the mean extinction time may have a power-law dependence on the population size in the predator-prey and competitive Lotka–Volterra models. In these models, since extinction is driven by Gaussian fluctuations, the Fokker–Planck approximation can be applied.

We first take the classic Lotka–Volterra predator-prey model as an example. Using the continuous variables \(q_1\) and \(q_2\) to represent the predator and prey populations, respectively, the deterministic rate equations are:

$$\begin{aligned} \frac{{\hbox {d}}q_1}{{\hbox {d}}t}&=-\sigma q_1 + \lambda q_1 q_2, \nonumber \\ \frac{{\hbox {d}}q_2}{{\hbox {d}}t}&=\mu q_2 -\lambda q_1 q_2, \end{aligned}$$
(30)

where \(\sigma \) represents the death rate of the predator, \(\mu \) represents the birth rate of the prey, and \(\lambda \) is the rate of interaction between a predator and a prey. Note that this formulation assumes that the prey have no intrinsic death, so their population will grow exponentially in the absence of the predator. There are three fixed points: \((q_1, q_2)=(0, 0)\), \((0,\infty )\), and \((\mu /\lambda ,\sigma /\lambda )\). The first one corresponds to the case where both species are extinct. The second one describes the population explosion of the prey due to the extinction of the predator. The third one represents the steady state where the predator and the prey coexist at the population sizes \(N_1=\mu /\lambda \) and \(N_2=\sigma /\lambda \), respectively.

A particular feature of the Lotka–Volterra model is that there is an “accidental” conserved quantity:

$$\begin{aligned} G=\lambda q_1 -\mu -\mu \ln (q_1 \lambda /\mu )+\lambda q_2-\sigma -\sigma \ln (q_2 \lambda /\sigma ), \end{aligned}$$
(31)

where \(G=0\) corresponds to the coexistence fixed point, and \(G>0\) corresponds to larger amplitude cycles (Parker and Kamenev 2009). An illustration of orbits at different G values is shown in Fig. 3. For a given initial condition, the predator and prey populations cycle along a closed orbit.
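The conservation of G along the deterministic orbits of Eq. (30) can be confirmed numerically. The sketch below integrates the rate equations with a classical fourth-order Runge–Kutta scheme and evaluates Eq. (31) at the start and end of the run; the integrator, the parameter values (chosen so that \(N_1=N_2=100\)), the initial condition, and the function names are all assumptions made for illustration.

```python
import math

def lv_rhs(q1, q2, sigma, mu, lam):
    """Right-hand side of Eq. (30): q1 = predators, q2 = prey."""
    return (-sigma * q1 + lam * q1 * q2, mu * q2 - lam * q1 * q2)

def conserved_G(q1, q2, sigma, mu, lam):
    """Conserved quantity of Eq. (31)."""
    return (lam * q1 - mu - mu * math.log(q1 * lam / mu)
            + lam * q2 - sigma - sigma * math.log(q2 * lam / sigma))

def rk4_step(q1, q2, dt, *params):
    """One classical fourth-order Runge-Kutta step for Eq. (30)."""
    k1 = lv_rhs(q1, q2, *params)
    k2 = lv_rhs(q1 + 0.5 * dt * k1[0], q2 + 0.5 * dt * k1[1], *params)
    k3 = lv_rhs(q1 + 0.5 * dt * k2[0], q2 + 0.5 * dt * k2[1], *params)
    k4 = lv_rhs(q1 + dt * k3[0], q2 + dt * k3[1], *params)
    q1 += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    q2 += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return q1, q2

sigma, mu, lam = 1.0, 1.0, 0.01            # assumed values: N_1 = N_2 = 100
q1, q2 = 150.0, 100.0                      # start away from the fixed point
G_start = conserved_G(q1, q2, sigma, mu, lam)
for _ in range(20000):                     # integrate up to t = 20
    q1, q2 = rk4_step(q1, q2, 0.001, sigma, mu, lam)
G_end = conserved_G(q1, q2, sigma, mu, lam)
G_fixed_point = conserved_G(mu / lam, sigma / lam, sigma, mu, lam)
print(G_start, G_end, G_fixed_point)
```

Since nothing in the deterministic dynamics changes G, any drift between orbits in the stochastic model must come entirely from demographic noise.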

Fig. 3 Orbits of constant \(G=(0.01, 0.1, 0.4, 1, 1.7, 2.7, 4.2)\) in units of \(\sqrt{\sigma \mu }\). The evolution proceeds clockwise around the mean-field fixed point of \(N_1=N_2=100\) (Parker and Kamenev 2009). Permission for reuse obtained from the publisher

The existence of an “accidental” conserved quantity G not only leads to closed orbits, but also makes them marginally stable. Population fluctuations due to demographic noise are isotropic in the space \((q_1,q_2)\), leading to slow diffusion between the mean-field orbits. Even large deviations from a mean-field orbit, such as extinction, can be seen as the accumulation of many small step fluctuations in the radial direction. This is in contrast with systems with a stable fixed point or limit cycle, such as the endemic SIR model discussed in the previous section. In those systems, large deviations proceed only along very special optimal paths in the phase space (Dykman et al. 1994; Elgart and Kamenev 2004; Kamenev and Meerson 2008). Consequently, the mean extinction time in marginally stable systems such as the predator-prey model has a power-law dependence on the sizes of the two populations.

In the presence of demographic noise, the corresponding master equation is

$$\begin{aligned} \frac{{\hbox {d}}P(m,n,t)}{{\hbox {d}}t}&=\sigma \left[ (m+1)P(m+1,n)-mP(m,n)\right] \nonumber \\&\quad +\,\mu \left[ (n-1)P(m,n-1)-nP(m,n)\right] \nonumber \\&\quad +\,\lambda \left[ (m-1)(n+1)P(m-1,n+1)-mnP(m,n)\right] , \end{aligned}$$
(32)

where \(P(m,n,t)\) is the probability of the system having m predators and n prey at time t.

Since extinction in this case is driven by Gaussian fluctuations rather than large jumps, the Fokker–Planck approximation can be appropriately applied. G can be identified as a “slow” dynamic variable that is responsible for the long-time behaviour of the system. In the presence of demographic stochasticity, after averaging out the “fast” variable (the angle in \((q_1,q_2)\) space), one obtains a one-dimensional Fokker–Planck equation for the probability distribution of G. Solving the mean first-passage time of this one-dimensional problem gives the mean extinction time \(\tau \sim N^{3/2}_1/N^{1/2}_2\) with \(N_1\le N_2\) (Parker and Kamenev 2009).

In the previous example of the predator-prey Lotka–Volterra model, overcrowding and intra-specific competition are not considered. The death of prey is solely caused by predation, and the per capita reproduction rate of predators only depends on the abundance of prey. These paradise-like conditions are seldom met in real biological systems. Instead, competition is the norm, and battles over resources for survival and reproduction can often be fierce and unforgiving. The competitive Lotka–Volterra model captures the self-limiting behaviour of population growth. The corresponding deterministic rate equations are:

$$\begin{aligned} \frac{{\hbox {d}}x}{{\hbox {d}}t}=r_1 x \left( 1 - x-\alpha y\right) , \end{aligned}$$
(33)
$$\begin{aligned} \frac{{\hbox {d}}y}{{\hbox {d}}t}=r_2 y \left( 1 - y-\alpha x\right) , \end{aligned}$$
(34)

where \(x=n_1/K_1\) and \(y=n_2/K_2\) are the rescaled population sizes, in which \(n_1\) and \(n_2\) are the population sizes of the two competing species, \(K_{i}\) is the carrying capacity of species i, \(r_1\) and \(r_2\) are the intrinsic growth rates of the two species when competition is absent, and \(\alpha \in [0,1]\) is the competition coefficient between the two species.
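The deterministic behaviour of Eqs. (33) and (34) can be checked numerically. The sketch below integrates the two rate equations with a classical fourth-order Runge–Kutta scheme. For \(0<\alpha <1\), setting both right-hand sides to zero with \(x,y>0\) gives \(x=y=1/(1+\alpha )\), and for \(\alpha =1\) every point on the line \(x+y=1\) is a fixed point. The function names, step size, parameter values and initial condition are assumptions chosen only for illustration.

```python
def competitive_lv_rhs(x, y, r1, r2, alpha):
    """Right-hand sides of Eqs. (33) and (34) in rescaled variables."""
    dx = r1 * x * (1.0 - x - alpha * y)
    dy = r2 * y * (1.0 - y - alpha * x)
    return dx, dy


def integrate_rk4(x, y, r1, r2, alpha, dt=0.01, steps=20000):
    """Advance (x, y) by steps * dt time units with classical RK4."""
    for _ in range(steps):
        k1x, k1y = competitive_lv_rhs(x, y, r1, r2, alpha)
        k2x, k2y = competitive_lv_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y,
                                      r1, r2, alpha)
        k3x, k3y = competitive_lv_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y,
                                      r1, r2, alpha)
        k4x, k4y = competitive_lv_rhs(x + dt * k3x, y + dt * k3y,
                                      r1, r2, alpha)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x, y


if __name__ == "__main__":
    # Hypothetical parameters and initial condition.
    x_end, y_end = integrate_rk4(0.1, 0.8, r1=1.0, r2=2.0, alpha=0.5)
    # Compare the end point with the coexistence value 1 / (1 + alpha).
    print(x_end, y_end, 1.0 / (1.0 + 0.5))
```

Running the same integration with `alpha=1.0` from different initial conditions ends at different points on the line \(x+y=1\), which illustrates why the final ratio of the two populations depends on where the trajectory starts.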

In the limiting case when \(\alpha = 0\), the two species grow independently of each other. When \(0<\alpha <1\), there is an attracting fixed point \(x^*=y^*=1/(1+\alpha )\) where the two species coexist. If \(\alpha = 1\), the two species are competitively identical. Assuming that they also have the same carrying capacity \(K_1=K_2=K\), the only difference is that one species reproduces faster and dies sooner than the other. This leads to the degenerate case where there is a line of fixed points corresponding to the marginally stable coexistence of the two species, with the ratio of populations determined uniquely by the initial conditions. In the degenerate case, the Fokker–Planck approximation can be applied. The corresponding Fokker–Planck equation is equivalent to stochastic differential equations for x(t) and y(t), which can be reduced to one dimension by introducing \(z(t)=x(t)-y(t)\):

$$\begin{aligned} {\hbox {d}}z=v(z)\,{\hbox {d}}t+ \sqrt{2 D(z)}\, {\hbox {d}}W(t). \end{aligned}$$
(35)

Here W(t) is a Wiener process. Once the drift v(z) and the diffusion D(z) terms are determined, the mean absorption time (the time until one of the species goes extinct) is found to scale as \(\tau \sim K\) (Lin et al. 2012).
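A one-dimensional stochastic differential equation with drift v(z) and diffusion D(z) can be explored numerically with the Euler–Maruyama scheme. The sketch below estimates the mean absorption time by simulating many paths until z leaves an interval. It assumes that the dynamics stay close to the line \(x+y=1\), so that z ranges over \([-1,1]\) with absorption at either end. The drift and diffusion functions passed in are placeholders supplied by the user; they are not the expressions derived by Lin et al. (2012), and all function names are hypothetical.

```python
import math
import random


def absorption_time(z0, drift, diffusion, dt, rng, z_min=-1.0, z_max=1.0):
    """Euler-Maruyama path of dz = v(z) dt + sqrt(2 D(z)) dW until z exits."""
    z, t = z0, 0.0
    while z_min < z < z_max:
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over dt
        z += drift(z) * dt + math.sqrt(2.0 * diffusion(z)) * dw
        t += dt
    return t


def mean_absorption_time(z0, drift, diffusion, dt, trials, seed=0):
    """Average the absorption time over independent simulated paths."""
    rng = random.Random(seed)
    total = sum(absorption_time(z0, drift, diffusion, dt, rng)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Placeholder choice: zero drift and constant diffusion D = 0.5.
    # For this special case the exact mean exit time from [-1, 1]
    # is (1 - z0**2) / (2 * D), which gives a sanity check on the scheme.
    estimate = mean_absorption_time(0.0, lambda z: 0.0, lambda z: 0.5,
                                    dt=1e-3, trials=400)
    print(estimate)
```

The estimate carries both sampling error and a time-step bias, because exits are only detected at discrete times, so a smaller `dt` and more trials are needed before comparing with an analytical result.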

Parsons et al. (2008) also studied the competition dynamics of a fast-living species and a slow-living species that have the same carrying capacity. The authors compared the absorption time with the prediction of the corresponding Wright–Fisher model of fixed population size, and found that it depends on the relative abundance of the two species. The absorption time is longer when the initial frequency of the fast-living species is higher, and shorter when it is lower. Kogan et al. (2014) combined the “fast” and “slow” life-history features with infectious disease dynamics and studied the absorption time when two pathogens compete for the same susceptible host population, in which one pathogen has a higher infection rate yet its hosts recover more quickly than those of the other pathogen. Additional interesting works on extinction along a quasi-neutral line, where population dynamics can be validly modelled by the Fokker–Planck approximation, include Parsons and Quince (2007) and Constable et al. (2013).

4 Discussion and Conclusions

In this paper, we provide a comparative analysis of the WKB and Fokker–Planck approximation methods in analysing the problem of population extinction under weak demographic fluctuations. In particular, we focus on estimating the mean extinction/absorption time of well-mixed systems containing a single species or two interacting species. The mean extinction time behaves differently depending on the nature of the stationary states (fixed points) of the corresponding deterministic model. If the fixed point is attracting (for instance, in the logistic growth model and the endemic SIR model), extinction is driven by rare events and the mean extinction time is exponentially large in the population size. In this case, the WKB method gives the correct result, whereas the Fokker–Planck approximation leads to an exponentially large error in the mean extinction time. If the stationary state is marginally stable (for instance, in the competitive Lotka–Volterra model when the two species have the same carrying capacity), extinction is instead driven by typical Gaussian fluctuations and the mean extinction time has a power law dependence on the population size. In this situation, the Fokker–Planck approach is also appropriate.

Here we have only included examples of applying the WKB method to a few basic population dynamics models, but the method has much broader applications in stochastic population dynamics. For instance, it provides a powerful tool for studying population extinction in fragmented landscapes with dispersal between habitat patches (Meerson and Sasorov 2011; Khasin et al. 2012) and on heterogeneous networks (Hindes and Schwartz 2016, 2017). In addition, it has been applied to study the most likely path of extinction from species coexistence in the context of evolutionary games (Park and Traulsen 2017). For further reading on the many applications of the WKB approximation method, we recommend the following reviews and the references therein. The concise review of Ovaskainen and Meerson (2010) provides an excellent overview of the WKB approximation in single-species stochastic population models. On the technical side, Weber and Frey (2017) provide a comprehensive introduction to the path integral representation of master equations. The recent review of Assaf and Meerson (2017) gives a detailed account of applications of the WKB method in various models and points out interesting open questions.

Through this paper, we hope to arouse biologists' interest in the WKB method and its great potential for solving stochastic population dynamics problems. Using several examples of successful applications of the WKB and Fokker–Planck methods to evolutionary biology problems, we highlight the great value of knowledge transfer between physics and biology, and we encourage further exchange of knowledge and collaboration between physicists and biologists in developing novel approaches to modelling biological evolution.