1 Introduction

Understanding the mechanisms underlying speciation remains a central question in evolutionary biology. The main puzzle is the origin of isolating barriers that prevent gene flow among populations or within a population. Ecological speciation has been extensively studied, highlighting the relations between sexual selection and speciation, and demonstrating negative links (Servedio and Bürger 2014, 2015) as well as beneficial ones (Boughman 2001). Lande (1981) was the first to popularize the idea that sexual selection promotes speciation, and numerous authors have since studied it in depth (Wu 1985; Turner and Burrows 1995; Higashi et al. 1999; Van Doorn et al. 2004; Ritchie 2007; Pennings et al. 2008). Furthermore, biological examples of speciation involving well-studied mechanisms of sexual selection are numerous and well documented, such as the Hawaiian cricket Laupala (Otte 1989; Shaw and Parsons 2002; Mendelson and Shaw 2005), the Amazonian frog Physalaemus (Boul et al. 2007), or the cichlid fish species of Lake Victoria (Seehausen et al. 2008). Modelling approaches make it possible to investigate the relative roles of stochastic processes, ecological factors, and sexual selection in limiting gene flow. The role of so-called ‘magic’ or ‘multiple effect’ traits, which combine adaptation to a new ecological niche with a mate preference, as enhancers of speciation has been demonstrated in many experimental studies (Merrill et al. 2012) as well as theoretical ones (Lande and Kirkpatrick 1988; Van Doorn et al. 1998). However, the role of sexual selection itself as a trigger of speciation, without ecological adaptation, has received less attention (Gavrilets 2014), although some authors have illustrated the promoting role of sexual preference alone using numerical simulations (Kondrashov and Shpak 1998; M’Gonigle et al. 2012).
In this paper, we introduce and study mathematically a stochastic model in which gene flow between two subpopulations is stopped by sexual preference alone.

We consider a population of hermaphroditic haploid individuals characterized by their genotype at one multi-allelic locus, and by their position in a space that is divided into several patches. This population is modeled by a multi-type birth and death process with competition, which is ecologically neutral in the sense that individuals with different genotypes are not characterized by different adaptations to the environment or by different resource preferences. However, individuals reproduce sexually according to mating preferences that depend on their genotype: two individuals with the same genotype are more likely to mate. This form of assortative mating (assortative mating by phenotype matching) has been highlighted notably in plant species, in particular due to simultaneous maturation of male and female reproductive organs (Herrero 2003; Savolainen et al. 2006), and its selective advantages have been studied and modeled by Darwin (1871) and more recently in the review by Jones and Ratterman (2009). This review provides a detailed description of these models, as well as some empirical examples supporting mate preference evolution. In addition to this sexual preference, individuals can migrate from one patch to another, at a rate depending on the frequency of individuals carrying the other genotype and living in the same patch. Examples of animals migrating to find suitable mates are well documented (Schwagmeyer 1988; Höner et al. 2007). A migration mechanism similar to the one presented in our paper has been studied by Payne and Krakauer (1997) in a continuous space model.

The class of stochastic individual-based models with competition and varying population size that we study was introduced by Bolker and Pacala (1997) and Dieckmann and Law (2000), and made rigorous in a probabilistic setting in the seminal paper of Fournier and Méléard (2004). These models have since been studied by many authors (see for instance Champagnat 2006; Champagnat et al. 2006; Costa et al. 2015; Leman 2016 and references therein). Initially restricted to asexual populations, such models have evolved to incorporate sexual reproduction, in both haploid (Smadi 2015) and diploid (Collet et al. 2013; Coron 2015; Neukirch and Bovier 2016) populations. Taking into account varying population sizes and stochasticity is necessary to better understand phenomena involving small populations, like mutational meltdown (Coron et al. 2013), invasion of a mutant population (Champagnat 2006), evolutionary suicide and rescue (Abu Awad and Billiard 2017) or population extinction time (Theorem 3 of the current paper). Rudnicki and Zwoleński (2015) considered both random and assortative mating in a phenotypically structured population. In the present article, we consider a different kind of mechanism of sexual preference (see Sect. 3 for a detailed discussion), and our model is spatially structured.

We study both the stochastic individual-based model and its deterministic limit in large populations. We give a complete description of the equilibria of the limiting deterministic dynamical system, and prove that the stable equilibria are the ones where only one genotype survives in each patch. We use classical arguments based on Lyapunov functions (LaSalle 1960; Chicone 2006) to derive the convergence at exponential speed of the solution to one of the stable equilibria, depending on the initial condition. Our theoretical results hold for small migration rates, but simulations lead us to conjecture that they hold for all possible migration rates. This fine study of the large population limit is essential to derive the average behaviour of the stochastic process. Then, using coupling techniques with branching processes, we derive bounds for the time needed for speciation to occur in the stochastic process. These bounds are explicit functions of the individual birth rate and the mating preference parameter. We also propose several generalisations of our model, and prove that our findings are robust under those generalisations.

The structure of the paper is the following. In Sect. 2 we describe the model and present the main results. Section 3 is devoted to a discussion on the biological assumptions of the model. In Sects. 4 and 5 we state properties of the deterministic limit and of the stochastic population process, respectively. They are key tools in the proofs of the main results, which are then completed. In Sect. 6 we illustrate our findings and make a conjecture on a more general result with the help of numerical simulations. Section 7 is devoted to some generalisations of the model. Finally, we state in the “Appendix” technical results needed in the proofs.

2 Model and main results

We consider a sexual haploid population with Mendelian reproduction (Griffiths et al. 2000, chap. 3). Time is continuous. At any moment, an individual can die, give birth or migrate. As a consequence, generations are overlapping and there is no specific period for individuals to reproduce. Each individual carries an allele belonging to the genetic type space \(\mathcal {A}=\{A,a\}\), and lives in a patch i in \({\mathcal {I}}=\{1,2\}\). We denote by \({\mathcal {E}}={\mathcal {A}}\times {\mathcal {I}}\) the type space, by \((\mathbf {e}_{\alpha ,i}, (\alpha ,i)\in {\mathcal {E}})\) the canonical basis of \(\mathbb {R}^{{\mathcal {E}}}\), and by \(\bar{\alpha }\) the complement of \(\alpha \) in \(\mathcal {A}\). The population is modeled by a multi-type birth and death process with values in \(\mathbb {N}^{{\mathcal {E}}}\). More precisely, we denote by \(n_{\alpha ,i}\) the current number of \(\alpha \)-individuals in the patch i and by \(\mathbf {n}=(n_{\alpha ,i}, (\alpha ,i) \in {\mathcal {E}})\) the current state of the population. The birth rate results from the following mechanism: at a rate \(B>0\), any individual encounters another individual chosen uniformly at random in its deme. Indeed, all the individuals are assumed to be ecologically and demographically equivalent, thus the probability that they are at the same place at the same time is uniform. Mathematically, the probability of encountering an individual of genotype \(\alpha ^{\prime }\) in the patch i is

$$\begin{aligned} \frac{n_{\alpha ^{\prime },i}}{n_{\alpha ,i}+n_{\bar{\alpha },i}}, \end{aligned}$$
(1)

at the time of the encounter. Then the probability that the encounter leads to a successful mating with the birth of an offspring is \(b\beta /B\le 1\) if the two individuals carry the same genotype, and \(b/B \le 1\) otherwise. As a consequence, the birth rate of individuals with genotype \(\alpha \) in the deme i is equal to

$$\begin{aligned} \begin{aligned} \lambda _{\alpha ,i}(\mathbf {n}):=&\,b\left( n_{\alpha ,i}\beta \frac{ n_{\alpha ,i}}{n_{\alpha ,i}+n_{\bar{\alpha },i}}+\frac{1}{2}n_{\alpha ,i}\frac{n_{\bar{\alpha },i}}{n_{\alpha ,i}+n_{\bar{\alpha },i}}+\frac{1}{2}n_{\bar{\alpha },i}\frac{n_{\alpha ,i}}{n_{\alpha ,i}+n_{\bar{\alpha },i}}\right) \\ =&\, b n_{\alpha ,i} \frac{\beta n_{\alpha ,i}+n_{\bar{\alpha },i}}{n_{\alpha ,i}+n_{\bar{\alpha },i}} . \end{aligned} \end{aligned}$$
(2)

In other words, the parameter \(\beta >1\) represents the “mating preference”. Indeed, individuals meet uniformly at random and two individuals that encounter each other are \(\beta \) times more likely to mate and give birth to a viable offspring if they carry the same allele. This modeling of mating preferences, directly determined by the genome of each individual, is biologically relevant, considering the works by Hollocher et al. (1997) or Haesler and Seehausen (2005) for instance. Note that, in our model, the preference does not only bias the distribution of genotypes among mating partners but also affects the rate of mating for choosy individuals, unlike what is assumed in standard sexual selection models (Lande 1981; Kirkpatrick 1982).
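As an illustration, the two lines of Eq. (2) can be evaluated and compared numerically. The following Python sketch is not part of the original text: the function names `birth_rate_expanded` and `birth_rate` and the parameter values are assumptions made for the example, and exact rational arithmetic is used only so that the two expressions can be compared without rounding.

```python
from fractions import Fraction


def birth_rate_expanded(n_same, n_other, b, beta):
    """First line of Eq. (2): one homogamous and two heterogamous terms."""
    total = n_same + n_other
    if total == 0:
        return Fraction(0)
    b, beta = Fraction(b), Fraction(beta)
    half = Fraction(1, 2)
    homogamous = n_same * beta * Fraction(n_same, total)
    # Each mixed pair transmits the focal allele with probability 1/2.
    mixed = (half * n_same * Fraction(n_other, total)
             + half * n_other * Fraction(n_same, total))
    return b * (homogamous + mixed)


def birth_rate(n_same, n_other, b, beta):
    """Second line of Eq. (2): simplified closed form."""
    total = n_same + n_other
    if total == 0:
        return Fraction(0)
    return Fraction(b) * n_same * (Fraction(beta) * n_same + n_other) / total


# Hypothetical values: the focal genotype in the majority, then in the minority.
print(birth_rate(30, 10, b=2, beta=3))
print(birth_rate(10, 30, b=2, beta=3))
```

With these values, the per-capita birth rate of the majority genotype is larger than that of the minority genotype, which is the frequency-dependent advantage created by \(\beta >1\).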

Fig. 1 Migrations of A- and a-individuals between the patches

The death rate of \(\alpha \)-individuals in the patch i is

$$\begin{aligned} d^K_{\alpha ,i}(\mathbf {n}):= \left( d+\frac{c}{K}\left( n_{\alpha ,i}+n_{\bar{\alpha },i}\right) \right) n_{\alpha ,i}, \end{aligned}$$
(3)

where K is an integer accounting for the quantity of available resources or space. This parameter is related to the concept of carrying capacity, which is the maximum population size that the environment can sustain indefinitely, and is consequently a scaling parameter for the size of the community. The individual intrinsic death rate d is assumed to be non-negative and less than b:

$$\begin{aligned} 0\le d<b. \end{aligned}$$
(4)

The death rate definition (3) implies that all the individuals are ecologically equivalent: the competition pressure does not depend on the alleles carried by the two individuals involved in an event of competition for food or space. The competition intensity is denoted by \(c>0\). Last, the migration of \(\alpha \)-individuals from patch \(\bar{i}= {\mathcal {I}} {\setminus } \{i\}\) to patch i occurs at a rate

$$\begin{aligned} \rho _{\alpha , \bar{i} \rightarrow i}(\mathbf {n}):= p \left( 1-\frac{n_{\alpha ,\bar{i}}}{n_{\alpha ,\bar{i}}+n_{\bar{\alpha },\bar{i}}}\right) n_{\alpha ,\bar{i}}=p\frac{n_{\alpha ,\bar{i}}n_{\bar{\alpha },\bar{i}}}{n_{\alpha ,\bar{i}}+n_{\bar{\alpha },\bar{i}}}, \end{aligned}$$
(5)

(see Fig. 1). The individual migration rate of \(\alpha \)-individuals is proportional to the frequency of \(\bar{\alpha }\)-individuals in the patch. It reflects the fact that individuals prefer being in an environment with a majority of individuals of their own type. In particular, if all the individuals living in a patch are of the same type, there is no more migration out of this patch. Note that the migration rate from patch \(\bar{i}\) to i is equal for A- and a-individuals, hence to simplify notation, we write

$$\begin{aligned} \rho _{\bar{i}\rightarrow i}(\mathbf {n})=\rho _{A,\bar{i}\rightarrow i}(\mathbf {n})=\rho _{a,\bar{i}\rightarrow i}(\mathbf {n}). \end{aligned}$$

A biological discussion of the model is provided in Sect. 3. Besides, extensions of this model are presented and studied in Sect. 7.

The community is therefore represented at every time \(t\ge 0\) by a stochastic process with values in \(\mathbb {R}^{\mathcal {E}}\):

$$\begin{aligned} \left( \mathbf {N}^K(t),t\ge 0\right) :=\left( N^K_{\alpha ,i}(t), (\alpha ,i)\in {\mathcal {E}},\quad t\ge 0\right) , \end{aligned}$$

whose transitions are, for \(\mathbf {n} \in \mathbb {N}^{\mathcal {E}}\) and \((\alpha ,i)\in {\mathcal {E}}\):

$$\begin{aligned} \begin{array}{lllll} \mathbf {n} &{} \longrightarrow &{} \mathbf {n}+\mathbf {e}_{\alpha ,i} &{}\quad \hbox {at rate}&{}\lambda _{\alpha ,i}(\mathbf {n}),\\ &{} \longrightarrow &{} \mathbf {n}-\mathbf {e}_{\alpha ,i} &{}\quad \hbox {at rate}&{}d^K_{\alpha ,i}(\mathbf {n}),\\ &{} \longrightarrow &{} \mathbf {n} + \mathbf {e}_{\alpha ,i} -\mathbf {e}_{\alpha ,\bar{i}} &{} \quad \hbox {at rate}&{} \rho _{\bar{i}\rightarrow i}(\mathbf {n}). \end{array} \end{aligned}$$
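The transitions above can be simulated with a standard Gillespie (stochastic simulation) algorithm. The sketch below is only an illustration under stated assumptions: the state is stored as a list `[n_A1, n_a1, n_A2, n_a2]` (an indexing convention chosen here), the helper names `event_rates` and `gillespie` are hypothetical, the parameter values are arbitrary, and the migration rate (5) is coded as the rate at which \(\alpha \)-individuals leave their current patch.

```python
import random

# Index convention chosen for this sketch: n = [n_A1, n_a1, n_A2, n_a2].


def event_rates(n, b, beta, d, c, p, K):
    """List the (rate, jump) pairs of all transitions with positive rate."""
    events = []
    for patch in (0, 1):
        for allele in (0, 1):
            k = 2 * patch + allele              # index of (alpha, i)
            k_bar = 2 * patch + (1 - allele)    # index of (alpha_bar, i)
            k_dest = 2 * (1 - patch) + allele   # index of (alpha, i_bar)
            if n[k] == 0:
                continue
            total = n[k] + n[k_bar]
            birth = b * n[k] * (beta * n[k] + n[k_bar]) / total   # Eq. (2)
            death = (d + c * total / K) * n[k]                    # Eq. (3)
            leave = p * n[k] * n[k_bar] / total                   # Eq. (5)
            for rate, jump in ((birth, {k: 1}), (death, {k: -1}),
                               (leave, {k: -1, k_dest: 1})):
                if rate > 0:
                    events.append((rate, jump))
    return events


def gillespie(n0, t_max, rng, **params):
    """Simulate the jump process up to time t_max and return the final state."""
    n, t = list(n0), 0.0
    while True:
        events = event_rates(n, **params)
        total_rate = sum(rate for rate, _ in events)
        if total_rate == 0:
            return n                       # empty population: absorbing state
        t += rng.expovariate(total_rate)   # exponential waiting time
        if t > t_max:
            return n
        u = rng.random() * total_rate      # choose an event proportionally to its rate
        for rate, jump in events:
            u -= rate
            if u < 0:
                break
        for index, change in jump.items():
            n[index] += change


final = gillespie([150, 50, 50, 150], t_max=5.0, rng=random.Random(0),
                  b=2.0, beta=2.0, d=1.0, c=1.0, p=0.5, K=100)
print(final)
```

Recomputing all rates after each event is simple but not efficient; for large K a more careful implementation would update only the rates affected by the last jump.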

As originally done by Fournier and Méléard (2004), it is convenient to represent a trajectory of the process \(\mathbf {N}^K\) as the unique solution of a system of stochastic differential equations driven by Poisson point measures. We introduce twelve independent Poisson point measures \((R_{\alpha ,i}, M_{\alpha ,i}, D_{\alpha ,i},(\alpha ,i)\in {\mathcal {E}})\) on \(\mathbb {R}_+^2\) with intensity \(ds \, d\theta \). These measures represent respectively the birth, migration and death events in the population \(N^K_{\alpha ,i}\). We obtain for every \(t\ge 0\),

$$\begin{aligned} \mathbf {N}^K(t)= & {} \mathbf {N}^K(0)+\sum _{(\alpha ,i)\in {\mathcal {E}}}\left[ \int _0^t \int _0^\infty \mathbf {e}_{\alpha ,i} \mathbf {1}_{\{\theta \le \lambda _{\alpha ,i}(\mathbf {N}^K(s-))\}}R_{\alpha ,i}(ds,d\theta )\right. \nonumber \\&\left. -\,\int _0^t \int _0^\infty \mathbf {e}_{\alpha ,i}\mathbf {1}_{\{\theta \le d^K_{\alpha ,i}(\mathbf {N}^K(s-))\}}D_{\alpha ,i}(ds,d\theta )\right. \nonumber \\&\left. +\,\int _0^t \int _0^\infty (\mathbf {e}_{\alpha ,i}-\mathbf {e}_{\alpha ,\bar{i}})\mathbf {1}_{\{\theta \le \rho _{\bar{i}\rightarrow i}(\mathbf {N}^K(s-))\}}M_{\alpha ,i}(ds,d\theta ) \right] . \end{aligned}$$
(6)

In the sequel, we will assume that the initial population sizes \((N^K_{\alpha ,i}(0),(\alpha ,i)\in {\mathcal {E}})\) are of order K. As a consequence, we consider a rescaled stochastic process

$$\begin{aligned} \left( \mathbf {Z}^K(t),t\ge 0\right) =\left( Z^K_{\alpha ,i}(t),(\alpha ,i)\in {\mathcal {E}},\quad t\ge 0\right) =\left( \frac{\mathbf {N}^K(t)}{K},t\ge 0\right) , \end{aligned}$$

which will be comparable to a solution of the dynamical system

$$\begin{aligned} \left\{ \begin{array}{l} \frac{d}{dt}z_{A,1}(t)= z_{A,1}\left[ b\frac{\beta z_{A,1}+z_{a,1}}{z_{A,1}+z_{a,1}}-d-c(z_{A,1}+z_{a,1})-p\frac{z_{a,1}}{z_{A,1}+z_{a,1}}\right] +p\frac{z_{A,2}z_{a,2}}{z_{A,2}+z_{a,2}}\\ \frac{d}{dt}z_{a,1}(t)= z_{a,1}\left[ b\frac{\beta z_{a,1}+z_{A,1}}{z_{A,1}+z_{a,1}}-d-c(z_{A,1}+z_{a,1})-p\frac{z_{A,1}}{z_{A,1}+z_{a,1}}\right] +p\frac{z_{A,2}z_{a,2}}{z_{A,2}+z_{a,2}}\\ \frac{d}{dt}z_{A,2}(t)= z_{A,2}\left[ b\frac{\beta z_{A,2}+z_{a,2}}{z_{A,2}+z_{a,2}}-d-c(z_{A,2}+z_{a,2})-p\frac{z_{a,2}}{z_{A,2}+z_{a,2}}\right] +p\frac{z_{A,1}z_{a,1}}{z_{A,1}+z_{a,1}}\\ \frac{d}{dt}z_{a,2}(t)= z_{a,2}\left[ b\frac{\beta z_{a,2}+z_{A,2}}{z_{A,2}+z_{a,2}}-d-c(z_{A,2}+z_{a,2})-p\frac{z_{A,2}}{z_{A,2}+z_{a,2}}\right] +p\frac{z_{A,1}z_{a,1}}{z_{A,1}+z_{a,1}}.\end{array}\right. \end{aligned}$$
(7)

Note that, from a mathematical point of view, it is possible to reduce the number of parameters b, c, d, p, \(\beta \). Using a time scaling and a size scaling, we can prove that only three effective parameters are necessary to describe the mathematical behaviour of the system, corresponding to a reformulation of the parameters \(\beta \), d and p (we refer the interested reader to the “Appendix” for more details). However, since each parameter has a biological meaning, we will keep these notations.
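The system (7) can be integrated numerically to get a feeling for its behaviour. The following Python sketch uses a classical fixed-step fourth-order Runge-Kutta scheme. It is an illustration only: the function names, the step size, the parameter values and the initial condition are assumptions chosen for the example, and the state is ordered as \((z_{A,1},z_{a,1},z_{A,2},z_{a,2})\).

```python
def vector_field(z, b, beta, d, c, p):
    """Right-hand side of system (7); z = (z_A1, z_a1, z_A2, z_a2)."""
    def rhs(x, y, x_other, y_other):
        # x, y: focal and opposite type in the patch;
        # x_other, y_other: the same two types in the other patch.
        total, total_other = x + y, x_other + y_other
        local = 0.0
        if total > 0:
            local = x * (b * (beta * x + y) / total - d - c * total - p * y / total)
        immigration = p * x_other * y_other / total_other if total_other > 0 else 0.0
        return local + immigration
    zA1, za1, zA2, za2 = z
    return [rhs(zA1, za1, zA2, za2), rhs(za1, zA1, za2, zA2),
            rhs(zA2, za2, zA1, za1), rhs(za2, zA2, za1, zA1)]


def integrate(z0, t_end, dt=0.01, **params):
    """Classical fourth-order Runge-Kutta scheme with fixed step dt."""
    def shifted(z, k, h):
        return [zi + h * ki for zi, ki in zip(z, k)]
    z = list(z0)
    for _ in range(round(t_end / dt)):
        k1 = vector_field(z, **params)
        k2 = vector_field(shifted(z, k1, dt / 2), **params)
        k3 = vector_field(shifted(z, k2, dt / 2), **params)
        k4 = vector_field(shifted(z, k3, dt), **params)
        z = [zi + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
             for zi, s1, s2, s3, s4 in zip(z, k1, k2, k3, k4)]
    return z


# Hypothetical parameters; A is in the majority in patch 1 and a in patch 2.
params = dict(b=2.0, beta=2.0, d=1.0, c=1.0, p=0.5)
z_end = integrate([2.0, 1.0, 0.5, 2.5], t_end=40.0, **params)
print(z_end)
```

The final state can be compared with the equilibria listed in Theorem 1 and with the convergence statement of Theorem 2.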

Let us denote by

$$\begin{aligned} \left( \mathbf {{z}}^{(\mathbf {z}^0)}(t),t\ge 0\right) =\left( {z}^{(\mathbf {z}^0)}_{\alpha ,i}(t),(\alpha ,i)\in {\mathcal {E}}\right) _{t\ge 0} \end{aligned}$$

the unique solution to (7) starting from \(\mathbf {z}(0)=\mathbf {z}^0 \in \mathbb {R}_+^{\mathcal {E}}\). The uniqueness derives from the fact that the vector field is locally Lipschitz and that the solutions do not explode in finite time (Chicone 2006). We have the following classical approximation result which will be proven in “Appendix A”:

Lemma 1

Let T be in \(\mathbb {R}_+^*\). Assume that the sequence \((\mathbf {Z}^K(0),K \ge 1)\) converges in probability when K goes to infinity to a deterministic vector \(\mathbf{{z}^0} \in \mathbb {R}_+^{\mathcal {E}}\). Then

$$\begin{aligned} \underset{K \rightarrow \infty }{\lim }\ \sup _{s\le T}\ \left\| \mathbf {Z}^K(s)-\mathbf {z}^{(\mathbf {z}^0)}(s) \right\| =0 \quad \text {in probability}, \end{aligned}$$
(8)

where \(\Vert . \Vert \) denotes the \(L^\infty \)-norm on \(\mathbb {R}^{\mathcal {E}}\).

When K is large, this convergence result allows one to derive the global behaviour of the population process \(\mathbf {N}^K\) from the behaviour of the dynamical system (7). Therefore, a fine study of (7) is needed. To this aim, let us introduce the parameter

$$\begin{aligned} \zeta := \frac{\beta b-d}{c}, \end{aligned}$$
(9)

which corresponds to the equilibrium size of the \(\alpha \)-population for the dynamical system (7), in a patch with no \(\bar{\alpha }\)-individuals and no migration. Let us also define the parameters

$$\begin{aligned} \widetilde{\zeta }:= & {} \dfrac{b^2(\beta ^2-1)+2p(b-d)-2bd(\beta -1)}{4c(b(\beta -1)+p)} \quad \text {and}\nonumber \\ \quad \Delta:= & {} \zeta \left( \zeta -2p\frac{\widetilde{\zeta }}{b(\beta -1)+p}\right) > 0 \end{aligned}$$
(10)

(see (30) for the positivity of \(\Delta \)). We derive in Sect. 4 the following properties of the dynamical system (7):

Theorem 1

  1. For \(\beta \ge 1\), the following points for which only one type remains, in only one patch

    $$\begin{aligned} (\zeta ,0,0,0) \quad (0,\zeta ,0,0)\quad (0,0,\zeta ,0) \quad (0,0,0,\zeta ) \end{aligned}$$
    (11)

    are non-null and non-negative equilibria of the dynamical system (7).

  2. For \(\beta > 1\), the remaining non-null and non-negative fixed points are exactly:

  • Equilibria for which each type is present in exactly one patch

    $$\begin{aligned} (\zeta ,0,0,\zeta ),\quad (0,\zeta ,\zeta ,0) \end{aligned}$$
    (12)
  • Equilibria for which only one type remains present, in both patches

    $$\begin{aligned} (\zeta ,0,\zeta ,0), \quad (0,\zeta ,0,\zeta ) \end{aligned}$$
    (13)
  • Equilibria with both types remaining in both patches

$$\begin{aligned} \left( \frac{b(\beta +1)-2d}{4c},\frac{b(\beta +1)-2d}{4c},\frac{b(\beta +1)-2d}{4c},\frac{b(\beta +1)-2d}{4c}\right) \end{aligned}$$
(14)
$$\begin{aligned}&\left( \frac{\zeta +\sqrt{\Delta }}{2},\frac{\zeta -\sqrt{\Delta }}{2},\widetilde{\zeta },\widetilde{\zeta } \right) , \quad \left( \frac{\zeta -\sqrt{\Delta }}{2},\frac{\zeta +\sqrt{\Delta }}{2},\widetilde{\zeta },\widetilde{\zeta } \right) , \end{aligned}$$
(15)
$$\begin{aligned}&\left( \widetilde{\zeta },\widetilde{\zeta },\frac{\zeta +\sqrt{\Delta }}{2},\frac{\zeta -\sqrt{\Delta }}{2} \right) , \quad \left( \widetilde{\zeta },\widetilde{\zeta },\frac{\zeta -\sqrt{\Delta }}{2},\frac{\zeta +\sqrt{\Delta }}{2} \right) . \end{aligned}$$
(16)

The only stable equilibria of the dynamical system (7) are those defined in Eq. (12), for which each of the two alleles is present in exactly one patch, and those given in Eq. (13) for which only one type remains.

  3. For \(\beta = 1\), the remaining non-null and non-negative fixed points are exactly the two sets

    $$\begin{aligned} {\mathcal {L}}=\{\mathbf {u}(x)=(\zeta -x,x,x,\zeta -x),x\in [0,\zeta ]\} \end{aligned}$$

and

$$\begin{aligned} \tilde{\mathcal {L}}=\{\tilde{\mathbf {u}}(x)=(\zeta -x,x,\zeta -x,x),x\in [0,\zeta ]\}. \end{aligned}$$

Those equilibria are non-hyperbolic. For any \(x\in [0,\zeta ]\setminus \{\zeta /2\}\), the Jacobian matrix at the equilibrium \(\mathbf {u}(x)\) admits 0 as an eigenvalue (associated with the eigenvector \((1,-1,-1,1)\), direction of the line \({\mathcal {L}}\)) and three negative eigenvalues. Some symmetrical results hold for \(\tilde{\mathbf {u}}(x)\). The Jacobian matrix at the equilibrium \(\mathbf {u}(\zeta /2)=\tilde{\mathbf {u}}(\zeta /2)\) admits two negative eigenvalues and the eigenvalue 0 which is of multiplicity two.
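The fixed points (11) to (16) of Theorem 1 can be checked numerically by evaluating the right-hand side of (7) at each of them. The Python sketch below does this for one arbitrary parameter set with \(\beta >1\). It only checks that the vector field vanishes, not that the list is exhaustive nor which points are stable, and all names and parameter values are assumptions made for the illustration.

```python
import math


def vector_field(z, b, beta, d, c, p):
    """Right-hand side of system (7); z = (z_A1, z_a1, z_A2, z_a2)."""
    def rhs(x, y, x_other, y_other):
        total, total_other = x + y, x_other + y_other
        local = 0.0
        if total > 0:
            local = x * (b * (beta * x + y) / total - d - c * total - p * y / total)
        immigration = p * x_other * y_other / total_other if total_other > 0 else 0.0
        return local + immigration
    zA1, za1, zA2, za2 = z
    return [rhs(zA1, za1, zA2, za2), rhs(za1, zA1, za2, zA2),
            rhs(zA2, za2, zA1, za1), rhs(za2, zA2, za1, zA1)]


def candidate_equilibria(b, beta, d, c, p):
    """Points (11)-(16) of Theorem 1, built from Eqs. (9) and (10)."""
    zeta = (beta * b - d) / c
    zeta_t = ((b**2 * (beta**2 - 1) + 2 * p * (b - d) - 2 * b * d * (beta - 1))
              / (4 * c * (b * (beta - 1) + p)))
    delta = zeta * (zeta - 2 * p * zeta_t / (b * (beta - 1) + p))
    hi = (zeta + math.sqrt(delta)) / 2
    lo = (zeta - math.sqrt(delta)) / 2
    s = (b * (beta + 1) - 2 * d) / (4 * c)
    return [
        (zeta, 0, 0, 0), (0, zeta, 0, 0), (0, 0, zeta, 0), (0, 0, 0, zeta),  # (11)
        (zeta, 0, 0, zeta), (0, zeta, zeta, 0),                              # (12)
        (zeta, 0, zeta, 0), (0, zeta, 0, zeta),                              # (13)
        (s, s, s, s),                                                        # (14)
        (hi, lo, zeta_t, zeta_t), (lo, hi, zeta_t, zeta_t),                  # (15)
        (zeta_t, zeta_t, hi, lo), (zeta_t, zeta_t, lo, hi),                  # (16)
    ]


def max_residual(**params):
    """Largest absolute component of the vector field over all candidates."""
    return max(abs(v) for z in candidate_equilibria(**params)
               for v in vector_field(z, **params))


print(max_residual(b=2.0, beta=2.0, d=1.0, c=1.0, p=0.5))
```

A residual at the level of floating-point rounding is what one expects if the formulas (9) to (16) are transcribed consistently.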

The equilibria (12) and (13) correspond to the case where reproductive isolation occurs, since the gene flow between the two patches eventually vanishes. The dynamics of the solutions are fundamentally different in the cases \(\beta >1\) and \(\beta =1\). The solutions converge to an equilibrium without gene flow when \(\beta >1\), whereas when \(\beta =1\), depending on the initial condition, they converge to different equilibria with a nonzero migration rate, that is, without reproductive isolation. The following proposition states that for each x, we can construct particular trajectories of the system which converge to \(\mathbf {u}(x)\).

Proposition 1

Let us introduce for any \(w\in (0,+\infty )\) and \(x\in [0,w]\) the vector

$$\begin{aligned} \mathbf {v}(w,x):=(w-x,x,x,w-x). \end{aligned}$$

The solution \(z^{(\mathbf {v}(w,x))}\) of the system (7) with \(\beta =1\) such that \(z^{(\mathbf {v}(w,x))}(0)=\mathbf {v}(w,x)\) converges when \(t\rightarrow \infty \) to the equilibrium \(\mathbf {u}(\zeta x/w)\).
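The limit in Proposition 1 can be made explicit by a short computation, given here as an illustrative sketch and not as the proof. On a state of the form \(\mathbf {v}(w,x)\) with \(\beta =1\), the emigration and immigration terms of (7) cancel, so each coordinate grows at the common per-capita rate \(b-d-cw\). The form \(\mathbf {v}(w,x)\) is then preserved, the ratio x/w stays constant, and the total size w of each patch solves a logistic equation. The Python function below, whose name and parameter values are assumptions, evaluates the resulting closed-form trajectory.

```python
import math


def symmetric_solution(w0, x0, t, b, d, c):
    """Closed-form solution of (7) with beta = 1 started from v(w0, x0)."""
    zeta = (b - d) / c                                # Eq. (9) with beta = 1
    decay = math.exp(-(b - d) * t)
    w_t = zeta * w0 / (w0 + (zeta - w0) * decay)      # logistic solution for w
    x_t = w_t * x0 / w0                               # the ratio x / w is conserved
    return [w_t - x_t, x_t, x_t, w_t - x_t]


# Hypothetical values: zeta = 2, initial ratio x0 / w0 = 1/4.
for t in (0.0, 1.0, 5.0, 50.0):
    print(t, symmetric_solution(1.0, 0.25, t, b=2.0, d=1.0, c=0.5))
```

In this example the trajectory approaches \(\mathbf {u}(\zeta x/w)\) with \(\zeta x/w = 0.5\), whatever the value of p.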

In particular, the equilibria (12) are not asymptotically stable when \(\beta =1\) since solutions starting in any neighbourhood of (12) can converge to different equilibria. Note that the shape of the migration is not sufficient to entail reproductive isolation although it seems to reinforce the homogamy described by the \(\beta \) parameter. Thanks to simulations in Sect. 6, we will see that the effect of migration on the system dynamics is rather involved.

As a consequence, we assume \(\beta >1\) in the sequel. The following theorem gives the long-time convergence of the dynamical system (7) toward a stable equilibrium of interest, when starting from an explicit subset of \(\mathbb {R}_+^{\mathcal {E}}\). To state it, we need to define the subset of \(\mathbb {R}_+^{\mathcal {E}}\)

$$\begin{aligned} {\mathcal {D}}:=\left\{ \mathbf {z} \in \mathbb {R}_+^{\mathcal {E}}, z_{A,1}-z_{a,1}>0, z_{a,2}-z_{A,2}>0 \right\} , \end{aligned}$$
(17)

and the positive real number

$$\begin{aligned} p_0:=\frac{\sqrt{b(\beta -1)[b(3\beta +1)-4d]}-b(\beta -1)}{2}. \end{aligned}$$
(18)

Notice that under Assumption (4) and as \(\beta >1\),

$$\begin{aligned} p_0 < b(\beta +1)-2d. \end{aligned}$$
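The threshold (18) and the inequality above are easy to evaluate for concrete parameters. The Python sketch below is an illustration with arbitrary parameter triples satisfying Assumption (4) and \(\beta >1\); the function name `p0` is chosen here.

```python
import math


def p0(b, beta, d):
    """Threshold migration rate of Eq. (18)."""
    x = b * (beta - 1)
    y = b * (3 * beta + 1) - 4 * d
    return (math.sqrt(x * y) - x) / 2


# Hypothetical parameter triples (b, beta, d) with 0 <= d < b and beta > 1.
for b, beta, d in [(2.0, 2.0, 1.0), (1.0, 1.1, 0.0), (5.0, 3.0, 4.9)]:
    bound = b * (beta + 1) - 2 * d
    print(b, beta, d, p0(b, beta, d), bound, 0 < p0(b, beta, d) < bound)
```

Writing \(x=b(\beta -1)\) and \(y=b(3\beta +1)-4d\), both the positivity of \(p_0\) and the inequality \(p_0<b(\beta +1)-2d\) reduce to \(x<y\), that is \(2d<b(\beta +1)\), which holds under (4).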

Finally, for \(p <b(\beta +1)-2d\), we introduce the set

$$\begin{aligned} {\mathcal {K}}_p:= & {} \left\{ \mathbf {z} \in {\mathcal {D}}, \; \{ z_{A,1}+z_{a,1}, \ z_{A,2}+z_{a,2} \}\right. \nonumber \\&\left. \in \left[ \frac{b(\beta +1)-2d-p}{2c}, \frac{2b\beta -2d+p}{2c}\right] \right\} . \end{aligned}$$
(19)

Then we have the following result:

Theorem 2

Let \(p<p_0\). Then

  • Any solution to (7) which starts from \( {\mathcal {D}}\) converges to the equilibrium \((\zeta ,0,0,\zeta )\).

  • If the initial condition of (7) lies in \({\mathcal {K}}_p\), there exist two positive constants \(k_1\) and \(k_2\), depending on the initial condition, such that for every \(t \ge 0\),

    $$\begin{aligned} \Vert \mathbf {z}(t)- (\zeta ,0,0,\zeta )\Vert \le k_1 e^{- k_2 t}. \end{aligned}$$

Symmetrical results hold for the equilibria \((0,\zeta ,\zeta ,0)\), \((\zeta ,0,\zeta ,0)\) and \((0,\zeta ,0,\zeta )\).

Note that the limit reached depends on the genotype which is initially in the majority in each patch, since the subset \({\mathcal {D}}\) is invariant under the dynamical system (7). Moreover, when \(p=0\), the results of Theorem 2 can be proven easily since the two patches are independent of each other. The difficulty is thus to prove the result when \(p>0\). Our argument allows us to deduce an explicit constant \(p_0\) below which we have convergence to an equilibrium with reproductive isolation between patches. However, we are not able to deduce a rigorous result for all p. Indeed, when p increases, there is more mixing between the two patches, which makes the model difficult to study. Nevertheless, simulations in Sect. 6 suggest that the result remains true.

Let us now introduce our main result on the probability and the time needed for the stochastic process \(\mathbf {N}^K\) to reach a neighbourhood of the equilibria defined in (12).

Theorem 3

Assume that \(\mathbf {Z}^K(0)\) converges in probability to a deterministic vector \(\mathbf{{z}^0}\) belonging to \({\mathcal {D}}\), with \((z_{a,1}^0,z_{A,2}^0)\ne (0,0)\). Introduce the following bounded set depending on \(\varepsilon >0\):

$$\begin{aligned} {\mathcal {B}}_\varepsilon := [(\zeta -\varepsilon )K,(\zeta +\varepsilon )K] \times \{0\} \times \{0\} \times [(\zeta -\varepsilon )K,(\zeta +\varepsilon )K]. \end{aligned}$$

Then there exist three positive constants \(\varepsilon _0\), \(C_0\) and m, and a positive constant V depending on \((m,\varepsilon _0)\) such that if \(p < p_0\) and \(\varepsilon \le \varepsilon _0\),

$$\begin{aligned} \lim _{K \rightarrow \infty }\mathbb {P}\left( \left| \frac{T^K_{{\mathcal {B}}_{\varepsilon }}}{\log K}-\frac{1}{b(\beta -1)} \right| \le C_0\varepsilon , \quad \mathbf {N}^K\left( T^K_{{\mathcal {B}}_{\varepsilon }}+t\right) \in {\mathcal {B}}_{m\varepsilon }\; \forall t \le e^{VK} \right) = 1, \end{aligned}$$
(20)

where, for \({\mathcal {B}} \subset \mathbb {R}_+^{\mathcal {E}}\), \(T^K_{\mathcal {B}}\) is the hitting time of the set \({\mathcal {B}}\) by the population process \(\mathbf {N}^K\).

Symmetrical results hold for the equilibria \((0,\zeta ,\zeta ,0)\), \((\zeta ,0,\zeta ,0)\) and \((0,\zeta ,0,\zeta )\).

This theorem gives the order of magnitude of the time to reproductive isolation between the two patches, as a function of the population size scaling factor K. This isolation time is infinite for the dynamical system (7), which corresponds to K equal to infinity. Note that the time needed to reach reproductive isolation is inversely proportional to \(\beta -1\) which, as studied previously, suggests that the system behaves differently for \(\beta =1\). Moreover, the time does not depend on the parameter p. Intuitively, this can be understood as follows: the time needed to reach a neighbourhood of the state \((\zeta ,0,0,\zeta )\) is of order 1, whereas from this neighbourhood the time needed for the complete extinction of the a-individuals in the patch 1 and the A-individuals in the patch 2 is much longer, of order \(\log K\). During this second phase, the migrations between the two patches are already balanced, which entails the independence with respect to p. Furthermore, the constant does not depend on d and c since there is no ecological difference between the two types and the two patches: during the second phase, the natural birth rate of the a-individuals in the patch 1 is approximately b since the patch 1 is almost entirely filled with A-individuals, and their natural death rate can be approximated by \(d+c\zeta =b\beta \) where the term \(c\zeta \) comes from the competition exerted by the A-individuals. Thus, their natural growth rate is approximately \(b-b\beta \) which only depends on the birth parameters.
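The heuristic above can be turned into a small numerical illustration. A minority population of order \(\varepsilon K\) individuals decaying at rate \(b(\beta -1)\) reaches order one after a time close to \(\log (\varepsilon K)/(b(\beta -1))\), whose leading term is \(\log K/(b(\beta -1))\). The Python sketch below evaluates the rare-type growth rate and this leading-order time. It is a back-of-the-envelope computation with hypothetical function names and parameter values, not the branching process argument used in the proof.

```python
import math


def rare_type_growth_rate(b, beta, d, c):
    """Approximate per-capita growth rate of the rare type near (zeta, 0, 0, zeta)."""
    zeta = (beta * b - d) / c
    birth = b                 # almost all encounters are with the other type
    death = d + c * zeta      # competition exerted by the resident type
    return birth - death      # equals -b * (beta - 1)


def leading_order_isolation_time(K, b, beta):
    """Leading-order term log K / (b (beta - 1)) appearing in (20)."""
    return math.log(K) / (b * (beta - 1))


# Hypothetical parameters: the growth rate does not depend on d or c.
print(rare_type_growth_rate(b=2.0, beta=2.0, d=1.0, c=1.0))
print(rare_type_growth_rate(b=2.0, beta=2.0, d=0.5, c=3.0))
for K in (10**2, 10**4, 10**6):
    print(K, leading_order_isolation_time(K, b=2.0, beta=2.0))
```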

Note that Theorem 3 not only gives an estimate of the time to reach a neighbourhood of the limit, but also proves that the population process stays for a long time in the neighbourhood of the equilibria (12) after this time.

Finally, the assumption \((z_{a,1}^0,z_{A,2}^0)\ne (0,0)\) is necessary to get the lower bound in (20). Indeed, if \((z_{a,1}^0,z_{A,2}^0)= (0,0)\), the set \({\mathcal {B}}_\varepsilon \) is reached faster, and thus only the upper bound still holds. In this case, the speed to reach the set \({\mathcal {B}}_\varepsilon \) will depend on the speed of convergence of the sequence \((Z^K_{a,1},Z^K_{A,2})\) to the limit (0, 0). In the trivial example where \((Z^K_{a,1},Z^K_{A,2})=(0,0)\), \(T^K_{{\mathcal {B}}}\) will be of order 1 which is the time needed for the processes \(Z^K_{A,1}\) and \(Z^K_{a,2}\) to reach a neighbourhood of the equilibrium \(\zeta \).

3 Discussion of the model

Assortative mating and genetic incompatibilities have been modeled and studied by many authors using discrete time models (see for instance the works by Gavrilets and Boake (1998), Matessi et al. (2002), Gavrilets (2003), Bürger and Schneider (2006), Servedio (2010) and references therein). Comparing continuous time models with discrete models with non-overlapping generations is delicate. Indeed, some concepts that are clearly defined for the latter class of models, like mating success or cost of choosiness, are hard to adapt to the former. In this section we discuss our model in relation to previous work.

3.1 Assortative mating

Assortative mating can result from different factors. Here, we are interested in assortative mating by phenotypic matching. That is to say, we consider uniform encounters between individuals and assume that assortative mating is the consequence of an increased mating probability between individuals with the same phenotype, when encountering. This leads to the following birth rate of \(\alpha \)-individuals on patch i

$$\begin{aligned} bn_{\alpha ,i}\frac{\beta n_{\alpha ,i}+n_{\bar{\alpha },i}}{ n_{\alpha ,i}+n_{\bar{\alpha },i}}. \end{aligned}$$
(21)

We think of a behavioural or a mechanical prezygotic isolation after encountering. As an example of our birth rate definition, we can think of high density populations of the milkweed longhorn beetle Tetraopes tetraophthalmus, where assortative mating is strong because at high density, large males are more likely to interfere with small males’ copulation with large females (McLain and Boromisa 1987). Other examples of this type can be found in the recent review on assortative mating in animals by Jiang et al. (2013). Note that we can also interpret the birth rate (21) as post-zygotic isolation (Ravigné et al. 2010), thinking of a low survival probability of the diploid zygotes of genotype Aa after mating (Gavrilets 2004; Bank et al. 2012).

In contrast with our model, most papers about sexual preferences (see for instance the works by Gavrilets and Boake 1998; Matessi et al. 2002; Bürger and Schneider 2006; Servedio 2010) use generational models with infinite population size, and study the evolution through time of the frequency of each genotype. As a consequence, they express the population dynamics in terms of a table describing the frequencies of mating at each generation. With our notations, the table of the Supplementary Material of the article by Servedio (2010) giving probabilities that the individuals with genotype \(\alpha \) mate with any individual with genotype \(\alpha ^{\prime }\) in the deme i and transmit their genotype writes:

$$\begin{aligned} \begin{array}{c|c|c} \alpha {\setminus }\alpha ^{\prime } &{}A &{} a \\ \hline A &{} \frac{ \beta n^2_{A,i}}{(n_{A,i}+n_{a,i})(\beta n_{A,i}+n_{a,i})} &{} \frac{ n_{A,i} n_{a,i} }{(n_{A,i}+n_{a,i})(\beta n_{A,i}+n_{a,i})} \\ \hline a &{} \frac{ n_{A,i} n_{a,i}}{(n_{A,i}+n_{a,i})( n_{A,i}+\beta n_{a,i})} &{} \frac{ \beta n^2_{a,i}}{(n_{A,i}+n_{a,i})(n_{A,i}+\beta n_{a,i})} \\ \end{array} \end{aligned}$$
(22)

Here the rows give the genotype \(\alpha \) transmitted to the offspring (often called the female genotype). Note that the same probabilities are derived by Gavrilets and Boake (1998) (in the case \(n=\infty \)), or by Matessi et al. (2002). At first glance, these mating probabilities may seem very different from the equations governing the births in our model (21). However, as made explicit in the Supplementary Information of the article by Servedio (2010), the mate choice mechanism is similar to ours: during mating, individuals encounter each other uniformly and are more likely to mate with an individual carrying the same allele as themselves.

The difference comes from the fact that the models studied by Gavrilets and Boake (1998), Matessi et al. (2002), Bürger and Schneider (2006), Servedio (2010) are in discrete time whereas ours is in continuous time. Moreover, they assume that all the females reproduce once. In our model, any individual can reproduce many times or even die before having any chance to reproduce. To compare our model with a generational one, we can compute the probability for a given \(\alpha \)-individual in the deme i to reproduce with an \(\alpha ^{\prime }\)-individual and transmit its genotype at time t, conditionally on the event that this \(\alpha \)-individual reproduces and transmits its genotype at time t. We get:

$$\begin{aligned} \begin{aligned}&\mathbb {P}( \alpha {\text { mate with any }} \alpha {\text { and transmits at }} t |\alpha {\text { reproduces and transmits at }} t )\\&\qquad \qquad \qquad =\,\frac{\mathbb {P}( \alpha {\text { mate with any }} \alpha {\text { and transmits at }} t ) }{\mathbb {P}(\alpha {\text { reproduces and transmits at }} t)}\\&\qquad \qquad \qquad =\,\frac{ \frac{b\beta }{B} \frac{n_{\alpha ,i}}{(n_{\alpha ,i}+n_{\bar{\alpha },i}) } }{ \frac{b\beta }{B} \frac{ n_{\alpha ,i}}{(n_{\alpha ,i}+n_{\bar{\alpha },i})} +\frac{b}{B} \frac{ n_{\bar{\alpha },i}}{(n_{\alpha ,i}+n_{\bar{\alpha },i})} } = \frac{\beta n_{\alpha ,i}}{\beta n_{\alpha ,i}+n_{\bar{\alpha },i}}, \end{aligned} \end{aligned}$$

and

$$\begin{aligned}&\mathbb {P}( \alpha \text { mate with any } \bar{\alpha }\text { and transmits at }\\&\quad t |\alpha \text { reproduces at } t \text { and transmits}) = \frac{ n_{\bar{\alpha },i} }{\beta n_{\alpha ,i}+n_{\bar{\alpha },i}}. \end{aligned}$$

Multiplying by the frequency of the \(\alpha \)-individuals in the deme i gives the expressions derived in classical generational models (22). That is to say, in both cases, the mating probabilities at the mating time are similar.
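
This comparison can be checked numerically. The following is a minimal sketch, not part of the model definition: the function names are hypothetical, and the common factor \(b/(B(n_{\alpha ,i}+n_{\bar{\alpha },i}))\) is dropped since it cancels in the conditional probabilities.

```python
def conditional_mating_probs(n_same, n_other, beta):
    """Probabilities that a focal individual mates with its own genotype or
    with the other genotype, given that it reproduces and transmits."""
    total = beta * n_same + n_other
    return beta * n_same / total, n_other / total


def generational_table_row(n_same, n_other, beta):
    """Row of table (22) for the transmitted genotype whose count is n_same."""
    n = n_same + n_other
    denom = n * (beta * n_same + n_other)
    return beta * n_same ** 2 / denom, n_same * n_other / denom


# Illustrative values (assumptions, not taken from the article).
n_A, n_a, beta = 30.0, 70.0, 2.5
p_same, p_other = conditional_mating_probs(n_A, n_a, beta)
freq_A = n_A / (n_A + n_a)
row = generational_table_row(n_A, n_a, beta)
print(freq_A * p_same, row[0])   # the two numbers should coincide
print(freq_A * p_other, row[1])  # the two numbers should coincide
```

The two conditional probabilities sum to one, and multiplying them by the genotype frequency recovers the corresponding row of (22).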

Assortative mating can also derive from a non-homogeneous mating of individuals as proposed by Rudnicki and Zwoleński (2015). In their case, the birth rate for an individual with genotype \(\alpha \) in the deme i is

$$\begin{aligned} b n_{\alpha ,i} \frac{\beta n_{\alpha ,i} + \frac{1}{2} n_{\bar{\alpha },i}}{\beta n_{\alpha ,i} + n_{\bar{\alpha },i}} + b n_{\bar{\alpha },i} \frac{\frac{1}{2} n_{\alpha ,i}}{n_{\alpha ,i} + \beta n_{\bar{\alpha },i}}. \end{aligned}$$
(23)

This non-uniform encountering and mating may reflect an ecological or temporal isolation of reproducing individuals. Indeed, with this expression, individuals of the same genotype are more likely to encounter each other than individuals of different genotypes, as if individuals of the same genotype were more likely to be at the same place at the same time. As an example, this definition of birth rate can model reproduction of hermaphroditic plants with uniform pollen dispersal within each deme and simultaneous maturation of both male and female reproductive organs, at a time that depends on the plant genotype, as studied by Herrero (2003) and Savolainen et al. (2006).

3.2 Cost of choosiness

The cost of choosiness for populations having specific mating periods and limited mating trials has been studied notably by Gavrilets and Boake (1998), Bürger and Schneider (2006), and Kopp and Hermisson (2008). In these articles, each female can reproduce at most once, and the cost of choosiness is quantified by a maximum number n of encounters that a female can make in order to reproduce. In the present article, we assume a constant availability of both male and female organs of hermaphroditic individuals (as in sponges, sea anemones, tapeworms, snails, earthworms, or some fishes (Avise and Mank 2009), for instance). However, the reproductive potential of each individual is limited by its lifespan, which is stochastic.

3.3 Initial conditions

Concerning the initial allelic diversity, which is a highly debated question in the literature on speciation (Weissing et al. 2011), we have in mind populations where traits evolved neutrally before taking part in mating preferences after a change in the environment or a migration of the population to a new environment. This is the case, for example, for the two sister species P. nyererei and P. pundamilia. Males of these two species have different nuptial colorations (red and blue, respectively), and females of these two species have preferences for a specific male nuptial coloration in clear water (red for P. nyererei and blue for P. pundamilia). These mating preferences have been proven to be heritable (Haesler and Seehausen 2005), and uniformly random mating in turbid water has been inferred from phenotype frequency distribution in nature (Seehausen et al. 1997).

3.4 Migration

In our model the migration rate of a given individual is proportional to the frequency of individuals that do not have the same genotype as the considered individual. The idea is that an individual is more prone to move if it does not find suitable mates in its deme. This particular form of mating-success-dependent dispersal has also been studied by Payne and Krakauer (1997) for a continuous space. Chaput-Bardy et al. (2010) study the dispersal behaviour of the banded damselfly Calopteryx splendens, which displays a lek mating system. They observe that females move to find a suitable mate, and that they disperse less when the sex ratio is male biased. This is in agreement with our hypothesis that an individual's migration rate is a decreasing function of the frequency of suitable mates. More generally, correlations between male dispersal and mating success have been empirically observed (see the articles by Schwagmeyer (1988) or Höner et al. (2007) for instance).

To emphasize the fact that migration is also governed by mate choice, we could rewrite the migration parameter p as \((\beta -1)p^{\prime }\). Some formulations would then be modified although the results would be identical. A degree of freedom would be kept thanks to \(p^{\prime }\), showing that migration and mate choice are not completely linked in our model, but we could no longer have systems with migration and no preference (\(\beta =1\)). In Sect. 6, we provide a detailed study of the influence of the parameter p on the behaviour of the system.

4 Study of the dynamical system

In this section, we study the dynamical system (7) in order to prove Theorems 1 and 2. In the first subsection, we are concerned with the equilibria of (7) and their local stability (Theorem 1). In the second subsection, we look more closely at the case where the migration rate p is lower than \(p_0\) and prove the convergence of the solution to (7) towards one of the equilibria with an exponential rate once the trajectory belongs to \({\mathcal {K}}_p\) (Theorem 2).

4.1 Fixed points and stability when \(\beta >1\)

First of all, we prove that all nonnegative and non-zero stationary points of (7) are given in Theorem 1. Let us write the four equations defining equilibria \((z_{A,1},z_{a,1},z_{A,2},z_{a,2})\) of the dynamical system (7):

$$\begin{aligned}&z_{A,1}\left[ b\frac{\beta z_{A,1}+z_{a,1}}{z_{A,1}+z_{a,1}}-d-c(z_{A,1}+z_{a,1})-p\frac{z_{a,1}}{z_{A,1}+z_{a,1}}\right] +p\frac{z_{A,2}z_{a,2}}{z_{A,2}+z_{a,2}}=0, \end{aligned}$$
(24)
$$\begin{aligned}&z_{a,1}\left[ b\frac{\beta z_{a,1}+z_{A,1}}{z_{A,1}+z_{a,1}}-d-c(z_{A,1}+z_{a,1})-p\frac{z_{A,1}}{z_{A,1}+z_{a,1}}\right] +p\frac{z_{A,2}z_{a,2}}{z_{A,2}+z_{a,2}}=0, \end{aligned}$$
(25)
$$\begin{aligned}&z_{A,2}\left[ b\frac{\beta z_{A,2}+z_{a,2}}{z_{A,2}+z_{a,2}}-d-c(z_{A,2}+z_{a,2})-p\frac{z_{a,2}}{z_{A,2}+z_{a,2}}\right] +p\frac{z_{A,1}z_{a,1}}{z_{A,1}+z_{a,1}}=0, \end{aligned}$$
(26)
$$\begin{aligned}&z_{a,2}\left[ b\frac{\beta z_{a,2}+z_{A,2}}{z_{A,2}+z_{a,2}}-d-c(z_{A,2}+z_{a,2})-p\frac{z_{A,2}}{z_{A,2}+z_{a,2}}\right] +p\frac{z_{A,1}z_{a,1}}{z_{A,1}+z_{a,1}}=0. \end{aligned}$$
(27)

Subtracting (25) from (24), and (27) from (26), we get

$$\begin{aligned} (z_{A,i}-z_{a,i})\left( b\beta -d-c(z_{A,i}+z_{a,i})\right) =0, \quad i \in {\mathcal {I}}. \end{aligned}$$

Therefore equilibria are defined by the four following cases:

$$\begin{aligned} \left\{ \begin{aligned}&z_{A,1}=z_{a,1}\\&\text {or}\\&z_{A,1}+z_{a,1}=(b\beta -d)/c \end{aligned} \right. \quad \text {and}\quad \left\{ \begin{aligned}&z_{A,2}=z_{a,2}\\&\text {or}\\&z_{A,2}+z_{a,2}=(b\beta -d)/c. \end{aligned} \right. \end{aligned}$$

1st case: \(z_{A,1}=z_{a,1}\) and \(z_{A,2}=z_{a,2}\).

From (24) and (26) we derive

$$\begin{aligned} z_{A,1}\left[ b\frac{(\beta +1)}{2}-d-2cz_{A,1}-\frac{p}{2}\right] =-\frac{z_{A,2}p}{2}, \end{aligned}$$

and

$$\begin{aligned} -\frac{z_{A,1}p}{2}=z_{A,2}\left[ b\frac{(\beta +1)}{2}-d-2cz_{A,2}-\frac{p}{2}\right] . \end{aligned}$$

By summing, we get \(P(z_{A,1})=P(z_{A,2})\) where P is the polynomial function defined by:

$$\begin{aligned} P(X)=X\left[ b\frac{(\beta +1)}{2}-d-p\right] -2cX^2, \end{aligned}$$

whose roots are 0 and

$$\begin{aligned} \frac{b(\beta +1)-2d-2p}{4c}. \end{aligned}$$

Then, either \(z_{A,1}=z_{A,2}\) or \(z_{A,1}\) and \(z_{A,2}\) are symmetrical with respect to the maximum of P which leads to

$$\begin{aligned} z_{A,1}=\frac{b(\beta +1)-2d-2p}{4c}-z_{A,2}. \end{aligned}$$

In the first case \(z_{A,1}=z_{A,2}\), Eq. (24) implies that either \(z_{A,1}=0\), which gives the null equilibrium or

$$\begin{aligned} z_{A,1}=\frac{b(\beta +1)-2d}{4c}, \end{aligned}$$

which gives equilibrium (14). In the second case, we inject the expression of \(z_{A,2}\) in (24) to obtain that \(z_{A,1}\) satisfies:

$$\begin{aligned} -2cX^2+AX+\frac{p}{4c}A=0, \end{aligned}$$

with \(A= b(\beta +1)/2-d-p\). The discriminant of this degree 2 equation is \(A(A+2p)\). Therefore, either

$$\begin{aligned} z_{A,1}=\frac{A+\sqrt{A(A+2p)}}{4c} \quad \text {and}\quad z_{A,2}=\frac{A-\sqrt{A(A+2p)}}{4c}, \end{aligned}$$

or

$$\begin{aligned} z_{A,1}=\frac{A-\sqrt{A(A+2p)}}{4c} \quad \text {and}\quad z_{A,2}=\frac{A+\sqrt{A(A+2p)}}{4c}. \end{aligned}$$

However, these equilibria are not positive.
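
Indeed, a short verification using only the two expressions above (and \(p>0\)) gives the sum and the product of the two candidate coordinates:

$$\begin{aligned} z_{A,1}+z_{A,2}=\frac{A}{2c}, \qquad z_{A,1}z_{A,2}=\frac{A^2-A(A+2p)}{16c^2}=-\frac{pA}{8c^2}. \end{aligned}$$

If \(A>0\), the product is negative, so one of the two coordinates is negative. If \(A\le 0\), the sum is nonpositive, so the two coordinates cannot both be positive.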

2nd case: \( z_{A,1}+z_{a,1}=(b\beta -d)/c=\zeta =z_{A,2}+z_{a,2}\). As previously, we obtain

$$\begin{aligned} (b(\beta -1)+p)z_{A,1}\left( \frac{z_{A,1}}{\zeta }-1\right) =pz_{A,2}\left( \frac{z_{A,2}}{\zeta }-1\right) , \end{aligned}$$

and

$$\begin{aligned} pz_{A,1}\left( \frac{z_{A,1}}{\zeta }-1\right) =(b(\beta -1)+p)z_{A,2}\left( \frac{z_{A,2}}{\zeta }-1\right) . \end{aligned}$$

By summing these equalities, we get \(Q(z_{A,1})=Q(z_{A,2})\) with

$$\begin{aligned} Q(X)=X\left( \frac{X}{\zeta }-1\right) \left( b(\beta -1)+2p\right) . \end{aligned}$$

Then, either \(z_{A,1}=z_{A,2}\) and (24) gives that

$$\begin{aligned} z_{A,1}\left( \frac{z_{A,1}}{\zeta }-1\right) =0, \end{aligned}$$

which gives equilibrium (13), or \(z_{A,1}=\zeta -z_{A,2}\) which implies \( z_{A,1}(z_{A,1}/\zeta -1)=0 \) and gives equilibrium (12).

3rd case: \( z_{A,1}=z_{a,1}, \text { and } z_{A,2}+z_{a,2}=(b\beta -d)/c=\zeta \).

Substituting in Eqs. (24) and (27) we get that

$$\begin{aligned} z_{A,1}\left[ b\frac{\beta +1}{2}-d-2c z_{A,1}-\frac{p}{2}\right] +p\frac{z_{A,2}(\zeta -z_{A,2})}{\zeta }=0, \end{aligned}$$

and

$$\begin{aligned} (\zeta - z_{A,2})\left[ \dfrac{b}{\zeta }\left( \beta \zeta +(1-\beta ) z_{A,2}\right) -d-c \zeta -p\frac{z_{A,2}}{\zeta }\right] +p\frac{z_{A,1}}{2}=0. \end{aligned}$$

Therefore, since \(\zeta =(b\beta -d)/c\), these equations become

$$\begin{aligned} z_{A,1}=\dfrac{2}{p}(z_{A,2}-\zeta ) z_{A,2} \left[ \dfrac{b(1-\beta )-p}{\zeta }\right] , \end{aligned}$$
(28)

and

$$\begin{aligned}&\dfrac{(z_{A,2}-\zeta ) z_{A,2}}{\zeta } \left\{ \dfrac{2}{p} \left[ b(1-\beta )-p\right] \right. \\&\left. \quad \times \left[ b\frac{\beta +1}{2}-d-\frac{p}{2} -\dfrac{4c}{p} (z_{A,2}-\zeta ) z_{A,2} \dfrac{b(1-\beta )-p}{\zeta } \right] -{p}\right\} =0. \end{aligned}$$

This last equation provides the following possible cases:

  • \(z_{A,2}=0\), which implies \(z_{a,2}=\zeta \), and from (28) \(z_{A,1}=z_{a,1}=0\) [Equilibrium (11)],

  • \(z_{A,2}=\zeta \), which implies \(z_{a,2}=0 \), and from (28) \(z_{A,1}=z_{a,1}=0\) [Equilibrium (11)],

  • \(z_{A,2}\) solution of

    $$\begin{aligned} (b(1-\beta )-p) \left[ b\frac{\beta +1}{2}-d-\frac{p}{2} -\dfrac{4c}{p} (z_{A,2}-\zeta ) z_{A,2} \dfrac{b(1-\beta )-p}{\zeta } \right] - \dfrac{p^2}{2}=0, \end{aligned}$$

    which can be summarized as

    $$\begin{aligned} (z_{A,2}-\zeta )z_{A,2}+C=0, \end{aligned}$$
    (29)

    where

    $$\begin{aligned} C=\dfrac{p\zeta }{8c(b(\beta -1)+p)^2} \left[ b^2(\beta ^2-1)+2p(b-d)-2bd(\beta -1) \right] . \end{aligned}$$

    The discriminant \(\Delta \) of the degree 2 Eq. (29) was introduced in Eq. (10). A simple computation gives the sign of \(\Delta \):

    $$\begin{aligned} \begin{aligned} \Delta&={\zeta ^2-4C}\\&= \zeta ^2-\dfrac{p\zeta }{2c(b(\beta -1)+p)^2} \left[ b^2(\beta ^2-1)+2p(b-d)-2bd(\beta -1) \right] \\&= \dfrac{\zeta }{2c(b(\beta -1)+p)^2} \left[ 2b^2(\beta -1)^2(b\beta -d)\right. \\&\quad \left. +\,2bp(\beta -1)[b\beta -d+p]+b^2(\beta -1)^2p \right] > 0. \end{aligned} \end{aligned}$$
    (30)

    Thus (29) has two distinct solutions:

    $$\begin{aligned} z_{A,2}^{+}=\dfrac{\zeta +\sqrt{\Delta }}{2}>0 \quad \text {and} \quad z_{A,2}^{-}=\dfrac{\zeta -\sqrt{\Delta }}{2}. \end{aligned}$$

    Since \(C>0\), both roots \(z_{A,2}^{-}\) and \(z_{A,2}^{+}\) are strictly positive. We finally deduce from (28) and (29) that in both cases, \(z_{A,2}=z_{A,2}^{-}\) and \(z_{A,2}= z_{A,2}^{+}\), we have

    $$\begin{aligned} z_{A,1}=z_{a,1}=\dfrac{b^2(\beta ^2-1)+2p(b-d)-2bd(\beta -1)}{4c(b(\beta -1)+p)}. \end{aligned}$$

    This gives equilibria (15) and (16), by symmetry between patches 1 and 2.
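
The case analysis above can be sanity-checked numerically. The sketch below is an illustration under assumptions: the right-hand side of (7) is read off from the left-hand sides of (24)-(27), the helper names are hypothetical, and the parameter values are arbitrary illustrative choices.

```python
import math


def rhs(z, b, beta, d, c, p):
    """Right-hand side of system (7), read off from Eqs. (24)-(27).
    z = (z_A1, z_a1, z_A2, z_a2)."""
    zA1, za1, zA2, za2 = z
    n1, n2 = zA1 + za1, zA2 + za2
    # Per-genotype migration flux leaving each patch (zero if the patch is empty).
    m1 = p * zA1 * za1 / n1 if n1 > 0 else 0.0
    m2 = p * zA2 * za2 / n2 if n2 > 0 else 0.0

    def growth(x, y, n):
        # Birth with preference beta, natural death d, competition c.
        return x * (b * (beta * x + y) / n - d - c * n) if n > 0 else 0.0

    return (growth(zA1, za1, n1) - m1 + m2,
            growth(za1, zA1, n1) - m1 + m2,
            growth(zA2, za2, n2) - m2 + m1,
            growth(za2, zA2, n2) - m2 + m1)


def third_case_equilibria(b, beta, d, c, p):
    """Equilibria with z_A1 = z_a1 and z_A2 + z_a2 = zeta, from (28)-(30)."""
    zeta = (b * beta - d) / c
    bracket = b ** 2 * (beta ** 2 - 1) + 2 * p * (b - d) - 2 * b * d * (beta - 1)
    big_b = b * (beta - 1) + p
    zeta_tilde = bracket / (4 * c * big_b)
    const_c = p * zeta * bracket / (8 * c * big_b ** 2)
    root = math.sqrt(zeta ** 2 - 4 * const_c)  # sqrt of the discriminant Delta
    return [(zeta_tilde, zeta_tilde, y, zeta - y)
            for y in ((zeta + root) / 2, (zeta - root) / 2)]


def max_residual(z, b, beta, d, c, p):
    return max(abs(r) for r in rhs(z, b, beta, d, c, p))


params = dict(b=2.0, beta=2.0, d=1.0, c=1.0, p=0.5)  # illustrative values
zeta = (params["b"] * params["beta"] - params["d"]) / params["c"]
sym = (params["b"] * (params["beta"] + 1) - 2 * params["d"]) / (4 * params["c"])
candidates = [(0.0, 0.0, 0.0, zeta), (zeta, 0.0, 0.0, zeta), (0.0, zeta, 0.0, zeta),
              (sym, sym, sym, sym)] + third_case_equilibria(**params)
for z in candidates:
    print(z, max_residual(z, **params))
```

Each printed residual should vanish up to rounding error, which is consistent with the list of stationary points obtained in the four cases.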

The end of this subsection provides a detailed exposition of the stability of fixed points of (7). We consider separately each equilibrium and use symmetries of the dynamical system between patches 1 and 2 and between alleles A and a.

Equilibrium (11): By subtracting (25) from (24), we obtain:

$$\begin{aligned} \frac{d}{dt}(z_{A,1}-z_{a,1})=(z_{A,1}-z_{a,1})\left( b\beta -d-c(z_{A,1}+z_{a,1})\right) . \end{aligned}$$
(31)

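This equation yields the instability of this equilibrium: since \(z_{A,1}+z_{a,1}=0\) there, the factor \(b\beta -d-c(z_{A,1}+z_{a,1})\) is positive in a neighbourhood of the equilibrium, so any small nonzero difference \(z_{A,1}-z_{a,1}\) grows.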

Equilibrium (12): We consider the equilibrium \((\zeta ,0,0,\zeta )\). The Jacobian matrix of the dynamical system at this fixed point is:

$$\begin{aligned} \begin{pmatrix} -(b\beta -d) &{} b(1-2\beta )+d-p &{} p &{} 0 \\ 0 &{} b(1-\beta )-p &{} p &{} 0 \\ 0 &{} p &{} b(1-\beta )-p &{} 0 \\ 0 &{} p &{} b(1-2\beta )+d-p &{} -(b\beta -d) \end{pmatrix} \end{aligned}$$

The eigenvalues are: \(-b(\beta -1)\), \(-b(\beta -1)-2p\), and \(-(b\beta -d)\). All of them are negative, and \(-(b\beta -d)\) is of multiplicity two. The equilibrium is therefore asymptotically stable.
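
The Jacobian matrix and its eigenvalues can be cross-checked numerically. The sketch below is only an illustration: it assumes that the right-hand side of (7) is given by the left-hand sides of (24)-(27), uses hypothetical helper names and arbitrary parameter values, approximates the Jacobian by central finite differences, and tests each claimed eigenvalue through the characteristic determinant.

```python
def rhs(z, b, beta, d, c, p):
    """Right-hand side of system (7), read off from Eqs. (24)-(27)."""
    zA1, za1, zA2, za2 = z
    n1, n2 = zA1 + za1, zA2 + za2
    m1 = p * zA1 * za1 / n1 if n1 > 0 else 0.0
    m2 = p * zA2 * za2 / n2 if n2 > 0 else 0.0

    def growth(x, y, n):
        return x * (b * (beta * x + y) / n - d - c * n) if n > 0 else 0.0

    return (growth(zA1, za1, n1) - m1 + m2, growth(za1, zA1, n1) - m1 + m2,
            growth(zA2, za2, n2) - m2 + m1, growth(za2, zA2, n2) - m2 + m1)


def numerical_jacobian(f, z, h=1e-6):
    """Central finite-difference approximation of the Jacobian of f at z."""
    n = len(z)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        zp, zm = list(z), list(z)
        zp[j] += h
        zm[j] -= h
        fp, fm = f(zp), f(zm)
        for i in range(n):
            jac[i][j] = (fp[i] - fm[i]) / (2 * h)
    return jac


def stated_jacobian_12(b, beta, d, p):
    """Jacobian matrix at (zeta, 0, 0, zeta) as printed in the text."""
    r = b * beta - d
    s = b * (1 - 2 * beta) + d - p
    q = b * (1 - beta) - p
    return [[-r, s, p, 0.0], [0.0, q, p, 0.0], [0.0, p, q, 0.0], [0.0, p, s, -r]]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def char_det(m, lam):
    """det(m - lam * I); it vanishes exactly when lam is an eigenvalue of m."""
    return det([[m[i][j] - (lam if i == j else 0.0) for j in range(len(m))]
                for i in range(len(m))])


b, beta, d, c, p = 2.0, 2.0, 1.0, 1.0, 0.5  # illustrative values
zeta = (b * beta - d) / c
jac = stated_jacobian_12(b, beta, d, p)
jac_num = numerical_jacobian(lambda z: rhs(z, b, beta, d, c, p), [zeta, 0.0, 0.0, zeta])
print(max(abs(jac[i][j] - jac_num[i][j]) for i in range(4) for j in range(4)))
for lam in (-b * (beta - 1), -b * (beta - 1) - 2 * p, -(b * beta - d)):
    print(lam, char_det(jac, lam))
```

The same check applies to Equilibrium (13) after exchanging the roles of A and a in patch 1.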

Equilibrium (13): We consider the equilibrium \((0,\zeta ,0,\zeta )\). The Jacobian matrix of the dynamical system at this fixed point is:

$$\begin{aligned} \begin{pmatrix} b(1-\beta )-p &{} 0 &{} p &{} 0 \\ b(1-2\beta )+d-p &{} -(b\beta -d) &{} p &{} 0 \\ p &{} 0 &{}b(1-\beta )-p &{} 0 \\ p &{} 0 &{} b(1-2\beta )+d-p &{} -(b\beta -d) \end{pmatrix} \end{aligned}$$

The eigenvalues are: \(-b(\beta -1)\), \(-b(\beta -1)-2p\), and \(-(b\beta -d)\). All of them are negative, and \(-(b\beta -d)\) is of multiplicity two. The equilibrium is therefore asymptotically stable.

Equilibrium (14): The Jacobian matrix of the dynamical system at this fixed point is:

$$\begin{aligned} \dfrac{1}{4} \begin{pmatrix} 2(d-b)-p &{} 2(d-b\beta )-p &{} p &{} p \\ 2(d-b\beta )-p &{} 2(d-b)-p &{} p &{} p \\ p &{} p &{} 2(d-b)-p &{} 2(d-b\beta )-p \\ p &{} p &{} 2(d-b\beta )-p &{} 2(d-b)-p \end{pmatrix} \end{aligned}$$

The eigenvalues are: \(-(b(\beta +1)/2-d)\), \(-(b(\beta +1)/2-d+p)\) and \(b(\beta -1)/2\). \(-(b(\beta +1)/2-d)\) and \(-(b(\beta +1)/2-d+p)\) are negative, and \(b(\beta -1)/2\) is positive and of multiplicity two. The equilibrium is thus unstable.
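
For this symmetric matrix the eigenvalues can be verified directly against explicit eigenvectors. The sketch below is an illustration; the four eigenvectors are our own choice (they are not given in the text), and the parameter values are arbitrary.

```python
def jacobian_14(b, beta, d, p):
    """Jacobian matrix at Equilibrium (14) as printed in the text."""
    a = 2 * (d - b) - p
    e = 2 * (d - b * beta) - p
    rows = ([a, e, p, p], [e, a, p, p], [p, p, a, e], [p, p, e, a])
    return [[x / 4 for x in row] for row in rows]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def eigenpairs_14(b, beta, d, p):
    """Claimed eigenvalues paired with candidate eigenvectors (our assumption)."""
    return [(-(b * (beta + 1) / 2 - d), [1, 1, 1, 1]),
            (-(b * (beta + 1) / 2 - d + p), [1, 1, -1, -1]),
            (b * (beta - 1) / 2, [1, -1, 1, -1]),
            (b * (beta - 1) / 2, [1, -1, -1, 1])]


b, beta, d, p = 2.0, 2.0, 1.0, 0.5  # illustrative values
jac = jacobian_14(b, beta, d, p)
for lam, vec in eigenpairs_14(b, beta, d, p):
    gap = max(abs(x - lam * v) for x, v in zip(matvec(jac, vec), vec))
    print(lam, gap)  # gap should vanish if (lam, vec) is an eigenpair
```

The two eigenvectors associated with \(b(\beta -1)/2\) are linearly independent, which is consistent with this eigenvalue having multiplicity two.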

Equilibrium (15): Recall the definition of \(\widetilde{\zeta }\) in (10) and assume that \(z_{A,1}=z_{a,1}=\widetilde{\zeta }\). We first prove that at this fixed point,

$$\begin{aligned} z_{A,1}+z_{a,1}= 2 \widetilde{\zeta } <\zeta , \end{aligned}$$
(32)

which is equivalent to

$$\begin{aligned} b^2(\beta ^2-1)+2p(b-d)-2bd(\beta -1)<2(b(\beta -1)+p)(b\beta -d). \end{aligned}$$

A straightforward computation leads to

$$\begin{aligned}&b^2(\beta ^2-1)+2p(b-d)-2bd(\beta -1)-2(b(\beta -1)+p)(b\beta -d)\\&\quad =-b(\beta -1)(2p+b(\beta -1)), \end{aligned}$$

which is negative and thus proves the inequality. From (32) we deduce that near the equilibrium (15), \(b\beta -d-c(z_{A,1}+z_{a,1})>0\). The instability then derives from Eq. (31).

4.2 Fixed points and stability when \(\beta =1\)

Following a reasoning similar to the one in Sect. 4.1, we obtain that the equilibria of the system are exactly the lines \( {\mathcal {L}}\) and \( \tilde{\mathcal {L}} \) defined in Theorem 1. A study of the Jacobian matrices proves that these equilibria are no longer hyperbolic. This ends the proof of Theorem 1.

4.3 Proof of Proposition 1

This subsection is devoted to the proof of Proposition 1. The idea is to find a solution of the form

$$\begin{aligned} \psi (t)=\gamma (t)\mathbf {v}(w,x) \quad \text {with} \quad \gamma (0)=1, \end{aligned}$$

where \(\mathbf {v}(w,x)=(w-x,x,x,w-x)\) has been introduced in Proposition 1. Assuming that \(\psi \) is solution to the system (7) with \(\beta =1\), we deduce that for all \((\alpha ,i)\in {\mathcal {E}}\):

$$\begin{aligned} \begin{aligned} \frac{d}{dt}\psi _{\alpha ,i}(t)&=\frac{d}{dt}\gamma (t) v_{\alpha ,i} (w,x)\\&=\psi _{\alpha ,i}(t)(b-d-c(\psi _{\alpha ,i}(t)+\psi _{\bar{\alpha },i}(t)))\\&\quad -\,p\frac{\psi _{\alpha ,i}(t)\psi _{\bar{\alpha },i}(t)}{\psi _{\alpha ,i}(t)+\psi _{\bar{\alpha },i}(t)}+p\frac{\psi _{\alpha ,\bar{i}}(t)\psi _{\bar{\alpha },\bar{i}}(t)}{\psi _{\alpha ,\bar{i}}(t)+\psi _{\bar{\alpha },\bar{i}}(t)}\\&=\gamma (t)v_{\alpha ,i} (w,x)(b-d-cw\gamma (t)). \end{aligned} \end{aligned}$$

Thus \(\gamma (t)\) satisfies the logistic equation

$$\begin{aligned} \frac{d}{dt}\gamma (t)=\gamma (t)(b-d-cw\gamma (t)), \end{aligned}$$

whose solution starting from 1 is given by

$$\begin{aligned} \gamma (t)=\frac{e^{t(b-d)}}{1+\frac{cw}{b-d}(e^{t(b-d)}-1)}. \end{aligned}$$
(33)

In particular \(\gamma (t)\) converges to \((b-d)/(cw)=\zeta /w\) as \(t\rightarrow \infty \).

A standard computation proves that \(\psi (t)=\gamma (t)\mathbf {v}(w,x)\) with \(\gamma \) chosen according to (33) is the solution to (7) starting from \(\mathbf {v}(w,x)\) and converges to \(\zeta \mathbf {v}(w,x)/w= \mathbf {u}( \zeta x/w)\). This ends the proof of Proposition 1.
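
The explicit solution can be compared with a direct numerical integration of (7). The sketch below is an illustration under assumptions: the right-hand side of (7) is read off from (24)-(27) with \(\beta =1\), a classical fourth-order Runge-Kutta scheme is used, and the helper names, step size and parameter values are arbitrary choices.

```python
import math


def rhs(z, b, beta, d, c, p):
    """Right-hand side of system (7), read off from Eqs. (24)-(27)."""
    zA1, za1, zA2, za2 = z
    n1, n2 = zA1 + za1, zA2 + za2
    m1 = p * zA1 * za1 / n1 if n1 > 0 else 0.0
    m2 = p * zA2 * za2 / n2 if n2 > 0 else 0.0

    def growth(x, y, n):
        return x * (b * (beta * x + y) / n - d - c * n) if n > 0 else 0.0

    return (growth(zA1, za1, n1) - m1 + m2, growth(za1, zA1, n1) - m1 + m2,
            growth(zA2, za2, n2) - m2 + m1, growth(za2, zA2, n2) - m2 + m1)


def rk4_step(f, z, dt):
    """One step of the classical fourth-order Runge-Kutta scheme."""
    k1 = f(z)
    k2 = f([zi + 0.5 * dt * k for zi, k in zip(z, k1)])
    k3 = f([zi + 0.5 * dt * k for zi, k in zip(z, k2)])
    k4 = f([zi + dt * k for zi, k in zip(z, k3)])
    return [zi + dt * (a + 2 * q + 2 * r + s) / 6
            for zi, a, q, r, s in zip(z, k1, k2, k3, k4)]


def gamma(t, b, d, c, w):
    """Closed-form logistic solution (33) with gamma(0) = 1."""
    e = math.exp(t * (b - d))
    return e / (1 + c * w / (b - d) * (e - 1))


def max_gap(w, x, b, d, c, p, dt=0.01, t_end=5.0):
    """Largest coordinate-wise gap between the RK4 solution and gamma(t) v(w, x)."""
    v = (w - x, x, x, w - x)
    f = lambda z: rhs(z, b, 1.0, d, c, p)  # beta = 1: no mating preference
    z, gap = list(v), 0.0
    for k in range(1, int(round(t_end / dt)) + 1):
        z = rk4_step(f, z, dt)
        g = gamma(k * dt, b, d, c, w)
        gap = max(gap, max(abs(zi - g * vi) for zi, vi in zip(z, v)))
    return gap


print(max_gap(w=1.5, x=0.4, b=2.0, d=1.0, c=1.0, p=0.5))  # illustrative values
```

A gap of the order of the integration error supports the claim that the trajectory stays on the ray spanned by \(\mathbf {v}(w,x)\) and follows the logistic profile (33).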

4.4 Containment and Lyapunov function for a small migration rate

In this subsection, we are mainly interested in Equilibrium (12). Recall the definition of \({\mathcal {D}}\) in (17)

$$\begin{aligned} {\mathcal {D}}:=\left\{ z\in \mathbb {R}_+^{\mathcal {E}}, z_{A,1}-z_{a,1}>0, z_{a,2}-z_{A,2}>0 \right\} . \end{aligned}$$

First, we prove that we can restrict our attention to the bounded set \({\mathcal {K}}_p\subset {\mathcal {D}}\) defined in (19). For the sake of readability, we introduce the two real numbers

$$\begin{aligned} z_{min}:= \frac{b(\beta +1)-2d-p}{2c}\le \zeta \le \zeta +\frac{p}{2c}=:z_{max}, \end{aligned}$$
(34)

which allows one to write the set \( {\mathcal {K}}_p\) defined in (19) as

$$\begin{aligned} {\mathcal {K}}_p:= \left\{ \mathbf {z} \in {\mathcal {D}}, \; \{ z_{A,1}+z_{a,1}, \ z_{A,2}+z_{a,2} \} \in \left[ z_{min}, z_{max}\right] \right\} . \end{aligned}$$

Lemma 2

Assume that \(p<b(\beta +1)-2d\). The set \({\mathcal {K}}_p\) is invariant under the dynamical system (7). Moreover, any solution to (7) starting from the set \( {\mathcal {D}} \) reaches \({\mathcal {K}}_p\) after a finite time.

Proof

First, Eq. (31) and the symmetrical equation for the patch 2 are sufficient to prove that the subset \( {\mathcal {D}}\) is invariant under the dynamical system.

Second, we prove that the trajectory reaches the bounded set \({\mathcal {K}}_p\) in finite time and, third, that \({\mathcal {K}}_p\) is invariant. The total population size \(n=z_{A,1}+z_{a,1}+z_{A,2}+z_{a,2}\) satisfies

$$\begin{aligned} \frac{dn}{dt}= & {} n(\beta b-d)-2b(\beta -1)\left( \frac{z_{A,1}z_{a,1}}{z_{A,1}+z_{a,1}} + \frac{z_{A,2}z_{a,2}}{z_{A,2}+z_{a,2}} \right) \\&-c((z_{A,1}+z_{a,1})^2+(z_{A,2}+z_{a,2})^2). \end{aligned}$$

Since \((a+b)^2\le 2(a^2+b^2)\) for all real numbers a and b,

$$\begin{aligned} \frac{dn}{dt}\le n \left( \beta b -d - \frac{c}{2}n \right) . \end{aligned}$$

Using classical results on logistic equations, we deduce that

$$\begin{aligned} \limsup _{t\rightarrow +\infty } n(t)\le 2\zeta . \end{aligned}$$
(35)

Let \(\varepsilon \) be positive, and suppose that for any \(t>0\), \( (z_{A,1}+z_{a,1})(t)\le \zeta -\varepsilon , \) then using (31) we have for \(t \ge 0\),

$$\begin{aligned} z_{A,1}(t)\ge (z_{A,1}-z_{a,1})(t)\ge (z_{A,1}-z_{a,1})(0)e^{c\varepsilon t} \underset{t\rightarrow +\infty }{\rightarrow } +\infty . \end{aligned}$$
(36)

This contradicts (35). As a consequence,

$$\begin{aligned} \exists \ t <\infty ,\quad (z_{A,1}+z_{a,1})(t) \ge \zeta -\varepsilon . \end{aligned}$$
(37)

In particular, this result holds for \(\zeta -\varepsilon _0=z_{min}\) where \(\varepsilon _0=(p+b(\beta -1))/2c\).

Furthermore, the dynamics of the total population size in the patch 1 satisfies the following equation:

$$\begin{aligned} \frac{d}{dt}(z_{A,1}+z_{a,1})= & {} (z_{A,1}+z_{a,1})(b \beta - d - c (z_{A,1}+z_{a,1}))\nonumber \\&-\,2(b(\beta -1)+p)\frac{z_{A,1}z_{a,1}}{z_{A,1}+z_{a,1}}+2p\frac{z_{A,2}z_{a,2}}{z_{A,2}+z_{a,2}}.\qquad \end{aligned}$$
(38)

By noticing that \(z_{A,1}z_{a,1} \le (z_{A,1}+z_{a,1})^2/4\), we get

$$\begin{aligned} \frac{d}{dt}(z_{A,1}+z_{a,1})\ge & {} (z_{A,1}+z_{a,1})\left( b \beta -d-c (z_{A,1}+z_{a,1})\right) \nonumber \\&- \left( b(\beta -1)+p\right) \frac{z_{A,1}+z_{a,1}}{2}\nonumber \\\ge & {} c(z_{A,1}+z_{a,1})\left( z_{min}- (z_{A,1}+z_{a,1})\right) . \end{aligned}$$
(39)

The last term becomes positive as soon as \(z_{A,1}+z_{a,1}\le z_{min}\). As a consequence, once the total population size in the patch 1 is larger than \(z_{min}\), it stays larger than this threshold. Using symmetrical arguments, the same conclusion holds for the patch 2. Using additionally (37), we find \(t_{min}>0\) such that \(\forall t\ge t_{min}\),

$$\begin{aligned} z_{A,i}(t)+z_{a,i}(t) \ge z_{min}, \quad \forall i\in {\mathcal {I}}, \text { and } n(t)\le 2\zeta +1. \end{aligned}$$
(40)

We now focus on the upper bound of the set \({\mathcal {K}}_p\) by bounding from above the total population size in the patch i, for all \(t\ge t_{min}\),

$$\begin{aligned} \begin{aligned} \frac{d}{dt}(z_{A,i}+z_{a,i})&\le (2\zeta +1)(c\zeta - c(z_{A,i}+z_{a,i}))+\frac{p}{2} (2\zeta +1)\\&\le c(2\zeta +1)\left( z_{max}-(z_{A,i}+z_{a,i})\right) . \end{aligned} \end{aligned}$$
(41)

This implies that, if \(\alpha >0\) is fixed, there exists \(t_\alpha \ge t_{min}\) such that \(z_{A,i}(t)+z_{a,i}(t)\le z_{max}+\alpha \) for all \(i \in {\mathcal {I}}\) and \(t\ge t_{\alpha }\).

Finally, we use a proof by contradiction to ensure that the trajectory hits the compact \({\mathcal {K}}_p\). Let us assume that for any \(t\ge t_\alpha \),

$$\begin{aligned} z_{A,1}(t)+z_{a,1}(t)\ge z_{max}-\alpha . \end{aligned}$$
(42)

From (31), and choosing \(\alpha <p/2c\), we deduce that \(z_{A,1}-z_{a,1}\) converges to 0. Together with (42), this yields \(t_{\alpha }^{\prime }\ge t_\alpha \) such that for any \(t\ge t_{\alpha }^{\prime }\),

$$\begin{aligned} \frac{z_{A,1}(t)z_{a,1}(t)}{z_{A,1}(t)+z_{a,1}(t)} \ge \frac{1}{4}\left( z_{max}-2\alpha \right) . \end{aligned}$$
(43)

We insert (43) in the equation (38) to deduce that, for all \(t\ge t_{\alpha }^{\prime }\),

$$\begin{aligned} \begin{aligned} \frac{d}{dt}(z_{A,1}+z_{a,1})&\le c\left( 2\zeta +1\right) (\zeta - (z_{A,1}+z_{a,1})) -\frac{b(\beta -1)+p}{2} \left( z_{max}-2\alpha \right) \\ {}&\quad +\frac{p}{2} \left( 2\zeta +1\right) . \\&\le c\left( 2\zeta +1\right) \left( z_{max}-2\alpha - (z_{A,1}+z_{a,1})\right) +2\alpha c(2\zeta +1)\\&\quad -\,\frac{b(\beta -1)+p}{2} \left( z_{max}-2\alpha \right) . \end{aligned} \end{aligned}$$

The first term of the last line is negative under Assumption (42), thus, if \(\alpha \) is sufficiently small,

$$\begin{aligned} \frac{d}{dt}(z_{A,1}+z_{a,1})\le & {} -\frac{1}{2} \left[ (b(\beta -1)+p)z_{max}\right] +\alpha \left[ b(\beta -1) +\frac{p}{2c}(2\zeta +1) \right] \nonumber \\\le & {} -\frac{1}{4} \left[ (b(\beta -1)+p)z_{max} \right] . \end{aligned}$$
(44)

This contradicts (42). Thus, the total population size of the patch 1 is lower than \(z_{max}-\alpha \) after a finite time. Moreover, (41) ensures that once the total population size of the patch 1 has reached the threshold \(z_{max}\), it stays smaller than this threshold. Reasoning similarly for the patch 2, we finally find a finite time such that the trajectory hits the compact \({\mathcal {K}}_p\) and remains in it afterwards. This ends the proof of Lemma 2.
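
Lemma 2 can be illustrated on a single numerical trajectory. The sketch below is only an illustration, not a proof: it assumes that the right-hand side of (7) is given by the left-hand sides of (24)-(27), integrates with a classical Runge-Kutta scheme, and uses hypothetical helper names and arbitrary parameter values and initial condition.

```python
def rhs(z, b, beta, d, c, p):
    """Right-hand side of system (7), read off from Eqs. (24)-(27)."""
    zA1, za1, zA2, za2 = z
    n1, n2 = zA1 + za1, zA2 + za2
    m1 = p * zA1 * za1 / n1 if n1 > 0 else 0.0
    m2 = p * zA2 * za2 / n2 if n2 > 0 else 0.0

    def growth(x, y, n):
        return x * (b * (beta * x + y) / n - d - c * n) if n > 0 else 0.0

    return (growth(zA1, za1, n1) - m1 + m2, growth(za1, zA1, n1) - m1 + m2,
            growth(zA2, za2, n2) - m2 + m1, growth(za2, zA2, n2) - m2 + m1)


def rk4_step(f, z, dt):
    """One step of the classical fourth-order Runge-Kutta scheme."""
    k1 = f(z)
    k2 = f([zi + 0.5 * dt * k for zi, k in zip(z, k1)])
    k3 = f([zi + 0.5 * dt * k for zi, k in zip(z, k2)])
    k4 = f([zi + dt * k for zi, k in zip(z, k3)])
    return [zi + dt * (a + 2 * q + 2 * r + s) / 6
            for zi, a, q, r, s in zip(z, k1, k2, k3, k4)]


def in_kp(z, z_min, z_max):
    """Membership test for K_p: z in D and both patch sizes in [z_min, z_max]."""
    zA1, za1, zA2, za2 = z
    in_d = zA1 - za1 > 0 and za2 - zA2 > 0
    return in_d and all(z_min <= n <= z_max for n in (zA1 + za1, zA2 + za2))


def entry_time_and_invariance(z0, b, beta, d, c, p, dt=0.01, t_end=40.0):
    """Return (first time in K_p or None, whether the trajectory stayed inside)."""
    zeta = (b * beta - d) / c
    z_min = (b * (beta + 1) - 2 * d - p) / (2 * c)
    z_max = zeta + p / (2 * c)
    f = lambda z: rhs(z, b, beta, d, c, p)
    z, entry, stayed = list(z0), None, True
    for k in range(int(round(t_end / dt)) + 1):
        inside = in_kp(z, z_min, z_max)
        if entry is None and inside:
            entry = k * dt
        elif entry is not None and not inside:
            stayed = False
        z = rk4_step(f, z, dt)
    return entry, stayed


# Initial condition in D with both patch sizes below z_min (illustrative).
print(entry_time_and_invariance((0.3, 0.1, 0.2, 0.5), 2.0, 2.0, 1.0, 1.0, 0.5))
```

With these values \(p<b(\beta +1)-2d\), so the lemma predicts a finite entry time and no subsequent exit from \({\mathcal {K}}_p\).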

As \({\mathcal {D}}\) is invariant under the dynamical system (7), we can consider the function \(V: {\mathcal {D}} \rightarrow \mathbb {R}\):

$$\begin{aligned} V(\mathbf {z})= \ln \left( \frac{z_{A,1}+z_{a,1}}{z_{A,1}-z_{a,1}}\right) +\ln \left( \frac{z_{a,2}+z_{A,2}}{z_{a,2}-z_{A,2}}\right) . \end{aligned}$$
(45)

It characterizes the dynamics of (7) on \({\mathcal {K}}_p\). Indeed, as proved in the next lemma, V is a Lyapunov function if p is sufficiently small. This will allow us to prove that the solutions to (7) converge to \((\zeta ,0,0,\zeta )\) exponentially fast as soon as their trajectory hits the set \({\mathcal {K}}_p\). Before stating the next lemma, we introduce the positive real number:

$$\begin{aligned} C_1:= \frac{1}{2}\left( \frac{2b(\beta -1)+2p}{z_{max}}- \frac{2p}{z_{min}} \right) , \end{aligned}$$
(46)

where \(z_{min}\) and \(z_{max}\) have been defined in (34). Then we have the following result:

Lemma 3

Assume that \(p<p_0\) defined in (18). Then \(V(\mathbf {z}(t))\) is non-negative and non-increasing on \({\mathcal {K}}_p\), and satisfies

$$\begin{aligned} \frac{d}{dt}V(\mathbf {z}(t)) \le - C_1 \left( z_{a,1}(t)+z_{A,2}(t)\right) , \quad t \ge 0. \end{aligned}$$
(47)

Proof

For \(i\in {\mathcal {I}}\) and \(\mathbf {z}\in {\mathcal {K}}_p\), \(0<z_{\alpha _i,i}-z_{\bar{\alpha }_i,i}\le z_{\alpha _i,i}+z_{\bar{\alpha }_i,i}\), where \(\alpha _1=A, \alpha _2=a\) and \(\bar{\alpha }_i={\mathcal {A}} {\setminus } \alpha _i\). Thus, \(V(\mathbf {z})\ge 0\). Now,

$$\begin{aligned} \frac{d}{dt}V(\mathbf {z}(t))= & {} \frac{\dot{z}_{A,1}(t)+\dot{z}_{a,1}(t)}{z_{A,1}(t)+z_{a,1}(t)} -\frac{\dot{z}_{A,1}(t)-\dot{z}_{a,1}(t)}{z_{A,1}(t)-z_{a,1}(t)}\nonumber \\&+ \frac{\dot{z}_{A,2}(t)+\dot{z}_{a,2}(t)}{z_{A,2}(t)+z_{a,2}(t)} -\frac{\dot{z}_{a,2}(t) -\dot{z}_{A,2}(t)}{z_{a,2}(t)-z_{A,2}(t)}\nonumber \\= & {} -\underset{i=1,2}{\sum } \frac{z_{A,i}z_{a,i}}{z_{A,i}+z_{a,i}}\left[ \frac{2b(\beta -1)+2p}{z_{A,i}+z_{a,i}}- \frac{2p}{z_{A,\bar{i}}+z_{a,\bar{i}}} \right] , \end{aligned}$$
(48)

from (31) and (38). Thus, \(dV(\mathbf {z}(t))/dt\) is nonpositive if

$$\begin{aligned} \frac{b(\beta -1)}{p} > \max \left\{ \frac{z_{A,1}+z_{a,1}}{z_{A,2}+z_{a,2}}-1, \frac{z_{A,2}+z_{a,2}}{z_{A,1}+z_{a,1}}-1\right\} . \end{aligned}$$
(49)

Since \(\mathbf {z}\) belongs to \({\mathcal {K}}_p\), the right-hand side of (49) can be bounded from above by

$$\begin{aligned} \frac{z_{max}}{z_{min}}-1= \frac{b(\beta -1)+2p}{b(\beta +1)-2d-p}. \end{aligned}$$

Therefore, the condition (49) is satisfied if

$$\begin{aligned} \frac{b(\beta -1)}{p}> \frac{b(\beta -1)+2p}{b(\beta +1)-2d-p}, \end{aligned}$$

that is, if

$$\begin{aligned} p<\frac{\sqrt{b(\beta -1)[b(3\beta +1)-4d]}-b(\beta -1)}{2}=p_0, \end{aligned}$$

and under this condition,

$$\begin{aligned} \frac{2b(\beta -1)+2p}{z_{A,i}+z_{a,i}}- \frac{2p}{z_{A,\bar{i}}+z_{a,\bar{i}}}\ge 2C_1, \quad \mathbf{z} \in {\mathcal {K}}_p, \quad i \in {\mathcal {I}}. \end{aligned}$$

Moreover, as the set \({\mathcal {D}}\) is invariant under the dynamical system (7), \(z_{A,1}\) stays larger than \(z_{a,1}\), and

$$\begin{aligned} \frac{z_{A,1}}{z_{A,1}+z_{a,1}}\ge \frac{1}{2}. \end{aligned}$$

In the same way,

$$\begin{aligned} \frac{z_{a,2}}{z_{A,2}+z_{a,2}}\ge \frac{1}{2}. \end{aligned}$$

As a consequence, the first derivative of V satisfies (47) for every \(t \ge 0\).
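
The threshold \(p_0\) appearing in the proof can be checked with a few lines of arithmetic. In the sketch below (hypothetical function names, arbitrary parameter values), the quadratic polynomial is our own intermediate step: clearing denominators in the sufficient condition gives \(2p^2+2b(\beta -1)p-b(\beta -1)(b(\beta +1)-2d)<0\), whose positive root should be \(p_0\).

```python
import math


def p0(b, beta, d):
    """Threshold p_0 as written at the end of the proof of Lemma 3."""
    g = b * (beta - 1)
    return (math.sqrt(g * (b * (3 * beta + 1) - 4 * d)) - g) / 2


def quadratic(p, b, beta, d):
    """Polynomial obtained by clearing denominators (our intermediate step)."""
    g = b * (beta - 1)
    return 2 * p ** 2 + 2 * g * p - g * (b * (beta + 1) - 2 * d)


def sufficient_condition(p, b, beta, d):
    """Inequality b(beta-1)/p > (b(beta-1)+2p)/(b(beta+1)-2d-p)."""
    return b * (beta - 1) / p > (b * (beta - 1) + 2 * p) / (b * (beta + 1) - 2 * d - p)


b, beta, d = 2.0, 2.0, 1.0  # illustrative values
threshold = p0(b, beta, d)
print(threshold, quadratic(threshold, b, beta, d))
print(sufficient_condition(0.99 * threshold, b, beta, d))
print(sufficient_condition(1.01 * threshold, b, beta, d))
```

The polynomial vanishes at \(p_0\), and the sufficient condition switches from true to false as p crosses this value.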

We now have all the ingredients to prove Theorem 2.

4.5 Proof of Theorem 2

Lemma 2 states that any solution to (7) starting from the set \( {\mathcal {D}} \) reaches \({\mathcal {K}}_p\) after a finite time. Let us show that because of Lemma 3, any solution to (7) which starts from \({\mathcal {K}}_p\) converges exponentially fast to \((\zeta ,0,0,\zeta )\) when t tends to infinity. To do this, we need to introduce some positive constants

$$\begin{aligned} C_2:= & {} z_{min}^2 e^{-V(\mathbf {z}(0))} , \quad C_3:= \frac{2}{C_2}z_{max}\\ C_4:= & {} \frac{z_{max}}{2}V(\mathbf {z}(0)) , \quad C_5:= \zeta (4b\beta -2d+3p) C_4 , \end{aligned}$$

where we recall that \(z_{min}\) and \(z_{max}\) have been defined in (34).

First, we prove that the population density differences \(z_{A,1}-z_{a,1}\) and \(z_{a,2}-z_{A,2}\) cannot be too small. To do this, we use the decay of the function V stated in Lemma 3:

$$\begin{aligned} V(\mathbf {z}(0))\ge V(\mathbf {z}(t))= & {} \ln \left( \frac{z_{A,1}(t)+z_{a,1}(t)}{z_{A,1}(t)-z_{a,1}(t)}\frac{z_{a,2}(t)+z_{A,2}(t)}{z_{a,2}(t)-z_{A,2}(t)}\right) \\\ge & {} \ln \left( \frac{z_{min}^2}{(z_{A,1}(t)-z_{a,1}(t))(z_{a,2}(t)-z_{A,2}(t))}\right) . \end{aligned}$$

This implies that

$$\begin{aligned} (z_{A,1}(t)-z_{a,1}(t))(z_{a,2}(t)-z_{A,2}(t))\ge C_2. \end{aligned}$$
(50)

Now, from the inequality \(\ln x \le x-1\) for \(x \ge 1\) we deduce for \(\mathbf {z}\) in \({\mathcal {K}}_p\),

$$\begin{aligned} V(\mathbf {z})\le & {} \left( \frac{z_{A,1}+z_{a,1}}{z_{A,1}-z_{a,1}}-1 \right) + \left( \frac{z_{a,2}+z_{A,2}}{z_{a,2}-z_{A,2}}-1 \right) \nonumber \\= & {} 2\frac{z_{a,1}(z_{a,2}-z_{A,2})+z_{A,2}(z_{A,1}-z_{a,1})}{(z_{A,1}-z_{a,1})(z_{a,2}-z_{A,2})} \le C_3 (z_{a,1}+z_{A,2}) , \end{aligned}$$
(51)

where we have used that \(\mathbf{z}\in {\mathcal {K}}_p\) and inequality (50). Then combining (47) and (51), we get

$$\begin{aligned} \frac{d}{dt}V(\mathbf {z}(t)) \le - \frac{C_1}{C_3}V(\mathbf {z}(t)) , \end{aligned}$$
(52)

which implies for every \(t \ge 0\):

$$\begin{aligned} V(\mathbf {z}(t)) \le V(\mathbf {z}(0))e^{-C_1 t /C_3}. \end{aligned}$$
(53)

Now, from the inequality \(\ln x \ge (x-1)/x\) for \(x \ge 1\) we deduce for \(\mathbf {z}\) in \({\mathcal {K}}_p\),

$$\begin{aligned} \begin{aligned} V(\mathbf {z}) \ge&\left( \frac{z_{A,1}+z_{a,1}}{z_{A,1}-z_{a,1}}-1 \right) \frac{z_{A,1}-z_{a,1}}{z_{A,1}+z_{a,1}} + \left( \frac{z_{a,2}+z_{A,2}}{z_{a,2}-z_{A,2}}-1 \right) \frac{z_{a,2}-z_{A,2}}{z_{a,2}+z_{A,2}} \\ =&\frac{2z_{a,1}}{z_{A,1}+z_{a,1}} + \frac{2z_{A,2}}{z_{a,2}+z_{A,2}} \ge \frac{2}{z_{max}}(z_{a,1}+z_{A,2}). \end{aligned} \end{aligned}$$
(54)

Hence,

$$\begin{aligned} z_{a,1}(t)+z_{A,2}(t) \le C_4 e^{-C_1 t /C_3}, \end{aligned}$$
(55)

and the exponential convergence of \( z_{a,1}\) and \(z_{A,2}\) to 0 is proved. Let us now focus on the two other variables, \( z_{A,1}\) and \(z_{a,2}\). From the definition of the dynamical system in (7), and noticing that \(| z_{A,1}(t)- \zeta |\le \zeta \) as \(\mathbf{z}\in {\mathcal {K}}_p\), we get

$$\begin{aligned} \begin{aligned}&\frac{d}{dt} \left( z_{A,1}(t)- \zeta \right) ^2 \\&\quad = - 2c z_{A,1}(t) \left( z_{A,1}(t)- \zeta \right) ^2 + 2pz_{a,2}(t)( z_{A,1}(t)- \zeta )\frac{z_{A,2}(t)}{z_{A,2}(t)+z_{a,2}}\\&\qquad - 2z_{a,1}(t)( z_{A,1}(t)- \zeta )\left( c z_{A,1}(t)+(p+b(\beta - 1))\frac{z_{A,1}(t)}{z_{A,1}(t)+z_{a,1}(t)} \right) \\&\quad \le -c z_{min} \left( z_{A,1}(t)- \zeta \right) ^2 + 2p\zeta z_{A,2}(t)+2\zeta z_{a,1}(t)\left( c z_{max}+p+b(\beta - 1) \right) \\&\quad \le -c z_{min} \left( z_{A,1}(t)- \zeta \right) ^2 + \zeta (4b\beta -2d+3p)( z_{a,1}(t)+ z_{A,2}(t) )\\&\quad \le -c z_{min} \left( z_{A,1}(t)- \zeta \right) ^2 + C_5 e^{-C_1 t /C_3} . \end{aligned} \end{aligned}$$

Hence, a classical comparison of nonnegative solutions of ordinary differential equations yields

$$\begin{aligned} ( z_{A,1}(t)- \zeta )^2\le & {} \left( ( z_{A,1}(0)- \zeta )^2 - \frac{C_5}{c z_{min}- C_1/C_3} \right) e^{-c z_{min}t}\\&+\,\frac{C_5}{c z_{min}- C_1/C_3}e^{-C_1 t /C_3}, \end{aligned}$$

which gives the exponential convergence of \(z_{A,1}\) to \(\zeta \). Reasoning similarly for the term \( z_{a,2}\) ends the proof of Theorem 2.
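
For completeness, the comparison step used for \(z_{A,1}\) can be made explicit. We write \(u(t)=(z_{A,1}(t)-\zeta )^2\), \(\alpha =c z_{min}\) and \(\gamma =C_1/C_3\) (a shorthand introduced here only for readability) and we assume \(\alpha \ne \gamma \). The differential inequality on \(u\) obtained above gives

$$\begin{aligned} \frac{d}{dt}\left( e^{\alpha t}u(t)\right) = e^{\alpha t}\left( u^{\prime }(t)+\alpha u(t)\right) \le C_5 e^{(\alpha -\gamma )t}, \end{aligned}$$

and integrating between 0 and t, then multiplying by \(e^{-\alpha t}\), yields

$$\begin{aligned} u(t) \le u(0)e^{-\alpha t}+\frac{C_5}{\alpha -\gamma }\left( e^{-\gamma t}-e^{-\alpha t}\right) = \left( u(0)-\frac{C_5}{\alpha -\gamma }\right) e^{-\alpha t}+\frac{C_5}{\alpha -\gamma }e^{-\gamma t}, \end{aligned}$$

which is the bound stated above. Whatever the sign of \(\alpha -\gamma \), the right-hand side decays at the exponential rate \(\min (\alpha ,\gamma )\).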

5 Stochastic process

In this section, we study properties of the stochastic process \((\mathbf {N}^K(t),t\ge 0)\). We derive an approximation for the extinction time of subpopulations under some small initial conditions, and then combine the results of this section with those on the dynamical system (Sect. 4) to prove Theorem 3.

5.1 Approximation of the extinction time

Let us first study the stochastic system \((\mathbf {Z}^{K}(t), t \ge 0)\) around the equilibrium \((\zeta ,0,0,\zeta )\) when K is large. The aim is to estimate the time before the loss of all a-individuals in the patch 1 and all A-individuals in the patch 2, which we denote by

$$\begin{aligned} T_0^K=\inf \left\{ t\ge 0, Z^K_{a,1}(t)+Z^K_{A,2}(t)=0\right\} . \end{aligned}$$
(56)

Recall that \(\zeta =(b\beta -d)/c>0\) and that the sequence of initial states \((\mathbf {Z}^K(0),K\ge 1)\) converges in probability when K goes to infinity to a deterministic vector \(\mathbf{{z}^0}=(z_{A,1}^0, z_{a,1}^0, z_{A,2}^0, z_{a,2}^0)\in \mathbb {R}_+^{\mathcal {E}}\).

Proposition 2

There exist two positive constants \(\varepsilon _0\) and \(C_0\) such that for any \(\varepsilon \le \varepsilon _0\), if there exists \(\eta \in ]0,1/2[\) such that \(\max (|z_{A,1}^0-\zeta |,|z_{a,2}^0-\zeta |) \le \varepsilon \) and \(\eta \varepsilon /2 \le z_{a,1}^0,z_{A,2}^0 \le \varepsilon /2\), then

$$\begin{aligned} \begin{aligned}&\text {for any } C>(b(\beta -1))^{-1}+C_0\varepsilon ,&\quad \mathbb {P}(T_0^K\le C \log (K)) \underset{K\rightarrow +\infty }{\rightarrow } 1,\\&\text {for any } C <(b(\beta -1))^{-1}-C_0\varepsilon ,&\quad \mathbb {P}(T_0^K\le C \log (K)) \underset{K\rightarrow +\infty }{\rightarrow } 0. \end{aligned} \end{aligned}$$

Note that the upper bound on \(T^K_0\) still holds if \(z_{a,1}^0=0\) or \(z_{A,2}^0=0\). Moreover, if \(z_{a,1}^0=z_{A,2}^0=0\), then the upper bound is satisfied with \(C_0= 0\). In the case where \(\eta =0\), the upper bound on the extinction time still holds but the lower bound does not. Indeed, as the initial conditions \(z_{a,1}^0\) and \(z_{A,2}^0\) go to 0, extinction occurs faster.

Proof

The proof relies on several coupling arguments. Our first step is to prove that the population sizes \(Z^{K}_{A,1}\) and \(Z^{K}_{a,2}\) remain close to \(\zeta \) on a long time scale. In a second step, we couple the processes \(Z^{K}_{a,1}\) and \(Z^{K}_{A,2}\) with subcritical branching processes whose extinction times are known. We begin by introducing some additional notation: for any \(\gamma , \varepsilon >0\) and \((\alpha ,i)\in {\mathcal {E}}\),

$$\begin{aligned} R^{K,\gamma }_{\alpha ,i}:=\inf \left\{ t\ge 0, |Z^K_{\alpha ,i}(t)-\zeta |\ge \gamma \right\} , \end{aligned}$$
(57)

and

$$\begin{aligned} T^{K,\varepsilon }_{\alpha ,i}:=\inf \left\{ t\ge 0, Z^{K}_{\alpha ,i}(t)\ge \varepsilon \right\} . \end{aligned}$$
(58)

Step 1: The first step consists in proving that as long as the population processes \(Z^K_{a,1}\) and \(Z^K_{A,2}\) have small values, the processes \(Z^K_{A,1}\) and \(Z^K_{a,2}\) stay close to \(\zeta \). To this aim, we study the system on the time interval

$$\begin{aligned} I_1^{K,\varepsilon }:= \left[ 0, R^{K,\zeta /2}_{A,1} \wedge R^{K,\zeta /2}_{a,2} \wedge T^{K,\varepsilon }_{a,1} \wedge T^{K,\varepsilon }_{A,2}\right] , \end{aligned}$$

where \(a\wedge b\) stands for \(\min (a,b)\).

Let us first bound the rates of the population process \(Z^K_{A,1}\). The rates \(\tilde{\lambda }_{\alpha ,i}\), \(\tilde{\rho }_{i \rightarrow j}\) and \(\tilde{d}_{\alpha ,i}\), for \(\alpha \in \mathcal {A}\) and \(i \in \mathcal {I}\), are defined in (67).

  • We start with the birth rate of A-individuals in the patch 1. Let us remark that as \(\beta >1\), the ratio \((\beta x +y)/(x+y)\le \beta \) for any \(x,y \in \mathbb {R}_+\). Moreover, the function \(x\mapsto (\beta x +y)/(x+y)\) increases with x, for any \(y\in \mathbb {R}_+\). Combining these observations with the fact that for any \(t<T^{K,\varepsilon }_{a,1}\wedge R^{K, \zeta /2}_{A,1}\), \(0\le Z^{K}_{a,1}(t)\le \varepsilon \) and \(Z^{K}_{A,1}(t)\ge \zeta /2\), we deduce that the birth rate of A-individuals in the patch 1, \(K\widetilde{\lambda }_{A,1}(\mathbf {Z}^K(t))\), can be bounded by:

    $$\begin{aligned} b\beta \left( \frac{\zeta }{\zeta +2\varepsilon }\right) K Z^K_{A,1}(t) \le K\widetilde{\lambda }_{A,1}(\mathbf {Z}^K(t)) \le b\beta K Z^K_{A,1}(t). \end{aligned}$$
  • The migration rate of A-individuals from the patch 2 to the patch 1 is sandwiched as follows for any \(t<T^{K,\varepsilon }_{a,1}\wedge R^{K,\zeta /2}_{A,1}\):

    $$\begin{aligned} 0 \le K\widetilde{\rho }_{2\rightarrow 1}(\mathbf {Z}^K(t))\le Kp\varepsilon . \end{aligned}$$
  • The death rate of A-individuals in the patch 1 and the migration rate from patch 1 to patch 2 are bounded as follows:

    $$\begin{aligned} \begin{aligned}&(d+c Z^K_{A,1}(t))K Z^K_{A,1}(t) \le K\widetilde{d}_{A,1}(\mathbf {Z}^K(t)) \le (d+c\varepsilon +cZ^K_{A,1}(t))K Z^K_{A,1}(t), \\&0 \le K\widetilde{\rho }_{1 \rightarrow 2}(\mathbf {Z}^K(t))\le K p\varepsilon . \end{aligned} \end{aligned}$$
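
The bounds on the birth rate stated in the first item can be checked numerically. The following Python sketch evaluates the per-capita birth rate \(b(\beta x+y)/(x+y)\) on a grid of values \(x\in [\zeta /2,3\zeta /2]\) and \(y\in [0,\varepsilon ]\) and verifies that it lies between \(b\beta \zeta /(\zeta +2\varepsilon )\) and \(b\beta \). The function names, the grid and the value of \(\varepsilon \) are choices made here for illustration; the demographic parameters are those of Sect. 6.

```python
def birth_factor(x, y, beta):
    """Preference factor (beta*x + y)/(x + y) for a type of size x facing size y."""
    return (beta * x + y) / (x + y)


def check_birth_bounds(b, beta, zeta, eps, n_grid=50, tol=1e-12):
    """Check b*beta*zeta/(zeta+2*eps) <= b*birth_factor(x, y) <= b*beta on a grid.

    x ranges over [zeta/2, 3*zeta/2] (values allowed for Z_{A,1} before the
    stopping time R) and y over [0, eps] (values allowed for Z_{a,1}).
    """
    lower = b * beta * zeta / (zeta + 2 * eps)
    upper = b * beta
    for i in range(n_grid + 1):
        x = zeta / 2 + zeta * i / n_grid
        for j in range(n_grid + 1):
            y = eps * j / n_grid
            rate = b * birth_factor(x, y, beta)
            if not (lower - tol <= rate <= upper + tol):
                return False
    return True


# Demographic parameters of Sect. 6; eps = 0.5 is an illustrative value.
b, beta, d, c = 2.0, 2.0, 1.0, 0.1
zeta = (b * beta - d) / c
print(check_birth_bounds(b, beta, zeta, eps=0.5))
```

The lower bound is attained in the limit at the corner \(x=\zeta /2\), \(y=\varepsilon \), where the factor equals \((\beta \zeta +2\varepsilon )/(\zeta +2\varepsilon )\).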

Hence, using an explicit construction of the process \(Z^K_{A,1}\) by means of Poisson point measures as in (6), we deduce that on the time interval \(I_1^{K,\varepsilon }\), \(Z^{K}_{A,1}\) is stochastically bounded by

$$\begin{aligned} \mathcal {Y}^K_{inf} \preccurlyeq Z^K_{A,1} \preccurlyeq \mathcal {Y}^K_{sup}, \end{aligned}$$

where \(\mathcal {Y}^K_{inf}\) is a \(\mathbb {N}/ K\)-valued Markov jump process with transition rates

$$\begin{aligned} \begin{aligned} Kb\beta \left( 1-\frac{2\varepsilon }{\zeta +2\varepsilon }\right) \frac{i}{K}&\quad \text { from } \frac{i}{K} \text { to } \frac{(i+1)}{K},\\ K\left( \left( d+c\varepsilon +c \frac{i}{K} \right) \frac{i}{K} +p\varepsilon \right)&\quad \text { from } \frac{i}{K} \text { to } \frac{(i-1)}{K}, \end{aligned} \end{aligned}$$

and initial value \(Z^K_{A,1}(0)\), and \(\mathcal {Y}^K_{sup}\) is a \(\mathbb {N}/ K\)-valued Markov jump process with transition rates

$$\begin{aligned} \begin{aligned} K\left( b\beta \frac{i}{K} + p\varepsilon \right)&\quad \text { from } \frac{i}{K} \text { to } \frac{(i+1)}{K},\\ K\left( d +c \frac{i}{K} \right) \frac{i}{K}&\quad \text { from } \frac{i}{K} \text { to } \frac{(i-1)}{K}, \end{aligned} \end{aligned}$$

and initial value \(Z^K_{A,1}(0)\).

Let us focus on the process \(\mathcal {Y}^K_{inf}\). Using a proof similar to the one of Lemma 1, we prove that since the sequence \((\mathcal {Y}^K_{inf}(0),K\ge 1)\) converges in probability to the deterministic value \(z_{A,1}^0\),

$$\begin{aligned} \underset{K\rightarrow +\infty }{\lim } \underset{s\le t}{\sup } \ |\mathcal {Y}^K_{inf}(s)-\varPhi _{inf}(s)|=0\quad \quad a.s. \end{aligned}$$

for every finite time \(t>0\), where \(\varPhi _{inf}\) is the solution to

$$\begin{aligned} \varPhi ^{\prime }(t)=b\beta (1-2\varepsilon /(\zeta +2\varepsilon )) \varPhi (t)-p\varepsilon -(d+c\varepsilon +c\varPhi (t))\varPhi (t) \end{aligned}$$
(59)

with initial value \(z_{A,1}^0\). Let us study the trajectory of \(\varPhi _{inf}\). The polynomial in \(\varPhi (t)\) on the r.h.s. of (59) has two roots

$$\begin{aligned} \varPhi ^{\pm }_{inf}= & {} \frac{1}{2c}\left( b\beta \left( 1-\frac{2\varepsilon }{\zeta +2\varepsilon }\right) -d-c\varepsilon \pm \sqrt{\left( b\beta \left( 1-\frac{2\varepsilon }{\zeta +2\varepsilon }\right) -d-c\varepsilon \right) ^2-4pc\varepsilon } \right) \nonumber \\= & {} \frac{\zeta }{2}-\frac{\varepsilon }{2} \left( \frac{2b\beta }{(\zeta +2\varepsilon )c}+1\right) \pm \sqrt{\left( \frac{\zeta }{2}- \frac{\varepsilon }{2} \left( \frac{2b\beta }{(\zeta +2\varepsilon )c}+1\right) \right) ^2-\frac{p\varepsilon }{c}}. \end{aligned}$$
(60)
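
The dependence of the two roots on \(\varepsilon \) can be explored numerically. The following Python sketch evaluates (60) for the demographic parameters of Sect. 6 with an illustrative migration rate \(p=1\) (an assumption made here, as is the function name) and prints \(\varPhi ^{-}_{inf}/\varepsilon \) and \((\zeta -\varPhi ^{+}_{inf})/\varepsilon \), two ratios that are expected to remain bounded as \(\varepsilon \) decreases.

```python
import math


def phi_roots(b, beta, d, c, p, eps):
    """Roots of -c*x**2 + B*x - p*eps, the right-hand side of (59)."""
    zeta = (b * beta - d) / c
    B = b * beta * (1 - 2 * eps / (zeta + 2 * eps)) - d - c * eps
    disc = B * B - 4 * p * c * eps
    if disc < 0:
        raise ValueError("eps is too large: the roots are not real")
    root = math.sqrt(disc)
    return (B - root) / (2 * c), (B + root) / (2 * c)


# Sect. 6 demographic parameters; p = 1 is chosen for illustration.
b, beta, d, c, p = 2.0, 2.0, 1.0, 0.1, 1.0
zeta = (b * beta - d) / c
for eps in (1e-1, 1e-2, 1e-3):
    phi_minus, phi_plus = phi_roots(b, beta, d, c, p, eps)
    print(f"eps={eps:g}  phi_minus/eps={phi_minus / eps:.4f}  "
          f"(zeta - phi_plus)/eps={(zeta - phi_plus) / eps:.4f}")
```

Since the product of the two roots of the polynomial is exactly \(p\varepsilon /c\), the smaller root is of order \(\varepsilon \) as soon as the larger one stays close to \(\zeta \).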

As a consequence, \(\varPhi ^{\prime }>0\) if and only if \(\varPhi \in ] \varPhi ^{-}_{inf}, \varPhi ^{+}_{inf}[\). Since the product of the two roots equals \(p\varepsilon /c\) and \(\varPhi ^{+}_{inf}\) tends to \(\zeta \) when \(\varepsilon \) tends to 0, definition (60) implies that for small \(\varepsilon \),

$$\begin{aligned} \varPhi ^{-}_{inf}\sim \frac{p\varepsilon }{c\zeta }. \end{aligned}$$

Hence, if \(\varepsilon _0\) is chosen sufficiently small and for any \(\varepsilon <\varepsilon _0\),

$$\begin{aligned} \varPhi ^{-}_{inf}\le \frac{2p\varepsilon _0}{c\zeta } <z^0_{A,1}. \end{aligned}$$

Thus, we observe that any solution to (59) with initial condition \(\varPhi _{inf}(0)\in [2p\varepsilon _0/(c\zeta ),+\infty [\) is monotone and converges to \(\varPhi _{inf}^+\). Similarly, we obtain that if \(\varepsilon _0\) is sufficiently small, then there exists \(M^{\prime }>0\) such that for any \(\varepsilon <\varepsilon _0\), \(|\varPhi _{inf}^+-\zeta |\le M^{\prime }\varepsilon \). We define the stopping time

$$\begin{aligned} R^{K,M^{\prime }}_{ \mathcal {Y}^K_{inf}}:=\inf \left\{ t\ge 0, \mathcal {Y}^K_{inf}(t) \not \in [\zeta -(M^{\prime }+1)\varepsilon ,\zeta +(M^{\prime }+1)\varepsilon ] \right\} . \end{aligned}$$

As in the proof of Theorem 3(c) in the article by Champagnat (2006), we can construct a family of Markov jump processes \(\widetilde{\mathcal {Y}}^K_{inf}\) with transition rates that are positive, bounded, Lipschitz and uniformly bounded away from 0, for which we can find the following estimate (Chapter 5 of the book by Freidlin and Wentzell 1984): there exists \(V^{\prime }>0\) such that,

$$\begin{aligned} \mathop \mathrm{lim}\limits _{K\rightarrow +\infty }\mathbb {P}\left( R^{K,M^{\prime }}_{\mathcal {Y}^K_{inf}}> e^{KV^{\prime }}\right) =\mathop \mathrm{lim}\limits _{K\rightarrow +\infty }\mathbb {P}\left( R^{K,M^{\prime }}_{\widetilde{\mathcal {Y}}^K_{inf}}>e^{KV^{\prime }}\right) =1. \end{aligned}$$

We can deal with the process \(\mathcal {Y}^K_{sup}\) similarly and find \(M^{\prime \prime }>0\) and \(V^{\prime \prime }>0\) such that

$$\begin{aligned} \mathbb {P}\left( R^{K,M^{\prime \prime }}_{\mathcal {Y}^K_{sup}}> e^{KV^{\prime \prime }}\right) \underset{K\rightarrow +\infty }{\rightarrow } 1, \end{aligned}$$

with

$$\begin{aligned} R^{K,M^{\prime \prime }}_{\mathcal {Y}^K_{sup}}:=\inf \left\{ t\ge 0, \mathcal {Y}^K_{sup}(t) \not \in [\zeta -(M^{\prime \prime }+1)\varepsilon , \zeta +(M^{\prime \prime }+1)\varepsilon ]\right\} . \end{aligned}$$

Finally, for \(M_1=M^{\prime } \vee M^{\prime \prime }\) and \(V_1=V^{\prime } \wedge V^{\prime \prime }\), we deduce that

$$\begin{aligned} \mathbb {P}(R^{K,M_1}_{\mathcal {Y}^K_{inf}} \wedge R^{K,M_1}_{\mathcal {Y}^K_{sup}}>e^{KV_1})\underset{K\rightarrow +\infty }{\rightarrow }1. \end{aligned}$$

Moreover, if \(R^{K,(M_1+1)\varepsilon }_{A,1} \le R^{K,\zeta /2}_{A,1} \wedge R^{K,\zeta /2}_{a,2} \wedge T^{K,\varepsilon }_{a,1} \wedge T^{K,\varepsilon }_{A,2},\) then

$$\begin{aligned} R^{K,(M_1+1)\varepsilon }_{A,1}\ge R^{K,M_1}_{\mathcal {Y}^K_{inf}} \wedge R^{K,M_1}_{\mathcal {Y}^K_{sup}}. \end{aligned}$$

Thus

$$\begin{aligned} \mathbb {P}\left( R^{K,\zeta /2}_{A,1} \wedge R^{K,\zeta /2}_{a,2} \wedge T^{K,\varepsilon }_{a,1} \wedge T^{K,\varepsilon }_{A,2} \wedge e^{KV_1} > R^{K,(M_1+1)\varepsilon }_{A,1}\right) \underset{K\rightarrow +\infty }{\rightarrow }0. \end{aligned}$$
(61)

Using symmetrical arguments for the population process \(Z^K_{a,2}\), we find \(M_2>0\) and \(V_2>0\) such that

$$\begin{aligned} \mathbb {P}\left( R^{K,\zeta /2}_{A,1} \wedge R^{K,\zeta /2}_{a,2} \wedge T^{K,\varepsilon }_{a,1} \wedge T^{K,\varepsilon }_{A,2} \wedge e^{KV_2} > R^{K,(M_2+1)\varepsilon }_{a,2}\right) \underset{K\rightarrow +\infty }{\rightarrow }0. \end{aligned}$$
(62)

Finally, we set \(M=M_1 \vee M_2\) and \(V=V_1 \wedge V_2\). Limits (61) and (62) still hold with M and V. Thus we have proved that, as long as the size of the a-population in patch 1 and the size of the A-population in patch 2 are small and as long as the time is smaller than \(e^{KV}\), the processes \(Z^{K}_{A,1}\) and \(Z^K_{a,2}\) stay close to \(\zeta \), i.e. they belong to \([\zeta -(M+1)\varepsilon , \zeta +(M+1)\varepsilon ]\).

Note that if \(\varepsilon _0\) is sufficiently small, \(R^{K,(M+1)\varepsilon }_{A,1}\le R^{K,\zeta /2}_{A,1}\) and \(R^{K,(M+1)\varepsilon }_{a,2}\le R^{K,\zeta /2}_{a,2}\) a.s. for all \(\varepsilon <\varepsilon _0\). So we reduce our study to the time interval

$$\begin{aligned} I_2^{K,\varepsilon }:= \left[ 0, R^{K,(M+1)\varepsilon }_{A,1} \wedge R^{K,(M+1)\varepsilon }_{a,2} \wedge T^{K,\varepsilon }_{a,1} \wedge T^{K,\varepsilon }_{A,2} \right] . \end{aligned}$$

Step 2: In the sequel we study the extinction time of the stochastic processes \((Z^K_{a,1}(t),t\ge 0)\) and \((Z^K_{A,2}(t),t\ge 0)\). We recall that there exists \(\eta \in ]0,1/2[\) such that \(\eta \varepsilon /2 \le z_{a,1}^0,z_{A,2}^0 \le \varepsilon /2\). Bounding the birth and death rates of \((Z^K_{a,1}(t),t\ge 0)\) and \((Z^K_{A,2}(t),t\ge 0)\) as previously, we deduce that the sum \((Z^K_{a,1}(t)+Z^K_{A,2}(t),t\ge 0)\) is stochastically bounded as follows, on the time interval \(I_2^{K,\varepsilon }\):

$$\begin{aligned} \frac{\mathcal {N}^K_{inf}}{K} \preccurlyeq Z^K_{a,1}+Z^K_{A,2} \preccurlyeq \frac{\mathcal {N}^K_{sup}}{K}, \end{aligned}$$

where \(\mathcal {N}^K_{inf}\) is a \(\mathbb {N}\)-valued binary branching process with birth rate \(b+p \frac{\zeta -(M+1)\varepsilon }{\zeta -M\varepsilon }\), death rate \(d +c\zeta +c(M+2)\varepsilon +p \) and initial state \(\lfloor \eta \varepsilon K \rfloor \), and \(\mathcal {N}^K_{sup}\) is a \(\mathbb {N}\)-valued binary branching process with birth rate

$$\begin{aligned} b\frac{\zeta +\varepsilon (\beta -M-1)}{\zeta -M\varepsilon }+p, \end{aligned}$$

death rate

$$\begin{aligned} d +c\zeta -c(M+1)\varepsilon +p \frac{\zeta -(M+1)\varepsilon }{\zeta -M\varepsilon } , \end{aligned}$$

and initial state \(\lfloor \varepsilon K \rfloor +1\).

It remains to estimate the extinction time for a binary branching process \((\mathcal {N}_t,t\ge 0)\) with a birth rate B and a death rate \(D> B\). Applying (68) with \(i=\lfloor \eta \varepsilon K \rfloor \), we get:

$$\begin{aligned} \begin{aligned} \forall C<(D-B)^{-1}, \quad&\mathbb {P}(S_{0}^{\mathcal {N}} \le C \log (K))\underset{K\rightarrow +\infty }{\rightarrow } 0,\\ \forall C>(D-B)^{-1}, \quad&\mathbb {P}(S_{0}^{\mathcal {N}} \le C \log (K))\underset{K\rightarrow +\infty }{\rightarrow } 1. \end{aligned} \end{aligned}$$
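
The dichotomy above can be illustrated with the classical closed-form expression for the extinction probability of a linear birth and death process (each individual gives birth at rate B and dies at rate D, independently). The following Python sketch uses this textbook formula, which may be stated differently in (68); the function name, the rates \(B=3\), \(D=5\) (so that \(D-B=2=b(\beta -1)\) for the parameters of Sect. 6) and the initial fraction 0.01 are illustrative assumptions.

```python
import math


def extinction_prob_by(t, birth, death, n0):
    """P(extinction before time t) for a subcritical linear birth-death process.

    The process starts from n0 independent individuals, each giving birth
    at rate `birth` and dying at rate `death`, with death > birth.
    """
    if t <= 0:
        return 1.0 if n0 == 0 else 0.0
    r = death - birth
    decay = math.exp(-r * t)
    # Probability that the lineage of one initial individual survives to time t.
    survive_one = r * decay / (death - birth * decay)
    # All n0 lineages must be extinct; logs keep the product numerically stable.
    return math.exp(n0 * math.log1p(-survive_one))


B, D = 3.0, 5.0
threshold = 1.0 / (D - B)
for K in (1e4, 1e8, 1e12):
    n0 = int(0.01 * K)  # plays the role of floor(eta * eps * K)
    t_low = 0.5 * threshold * math.log(K)   # C below 1/(D-B)
    t_high = 2.0 * threshold * math.log(K)  # C above 1/(D-B)
    print(K, extinction_prob_by(t_low, B, D, n0), extinction_prob_by(t_high, B, D, n0))
```

As K grows, the first probability decreases to 0 and the second increases to 1, in line with the two limits displayed above.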

Moreover, if

$$\begin{aligned} S_{\lfloor \varepsilon K \rfloor }^{\mathcal {N}}:= \inf \{t >0, \mathcal {N}(t)\ge \lfloor \varepsilon K \rfloor \} , \end{aligned}$$

then

$$\begin{aligned} \mathbb {P}\left( S_{0}^{\mathcal {N}}>K \wedge S_{\lfloor \varepsilon K \rfloor }^{\mathcal {N}}\right) \underset{K \rightarrow +\infty }{\rightarrow } 0 \end{aligned}$$
(63)

(cf. Theorem 4 in the article by Champagnat 2006). Thus

$$\begin{aligned} \begin{aligned} \mathbb {P}&(T^K_0<C \log (K))-\mathbb {P}\left( S_{0}^{\mathcal {N}^K_{inf}}<C \log (K)\right) \\&\le \mathbb {P}\left( T^K_0>T^{K,\varepsilon }_{a,1}\wedge T^{K,\varepsilon }_{A,2}\wedge K\right) \\&\quad + \mathbb {P}\left( T^{K,\varepsilon }_{a,1}\wedge T^{K,\varepsilon }_{A,2}\wedge K> R^{K,(M+1)\varepsilon }_{A,1} \wedge R^{K,(M+1)\varepsilon }_{a,2}\right) \\&\le \mathbb {P}\left( S_{0}^{\mathcal {N}^K_{sup}}>S_{\lfloor \varepsilon K \rfloor }^{\mathcal {N}^K_{sup}}\wedge K\right) + \mathbb {P}\left( T^{K,\varepsilon }_{a,1}\wedge T^{K,\varepsilon }_{A,2}\wedge K> R^{K,(M+1)\varepsilon }_{A,1} \wedge R^{K,(M+1)\varepsilon }_{a,2}\right) . \end{aligned} \end{aligned}$$

The last term of the last line converges to 0 when K tends to infinity according to (61) and (62). The first one also tends to 0 according to (63). Thus,

$$\begin{aligned} \lim _{K\rightarrow +\infty }\mathbb {P}\left( T^K_0<C \log (K)\right) \le \lim _{K\rightarrow +\infty }\mathbb {P}\left( S_{0}^{\mathcal {N}^K_{inf}}<C \log (K)\right) . \end{aligned}$$

We prove similarly that

$$\begin{aligned} \lim _{K\rightarrow +\infty }\mathbb {P}\left( T^K_0<C \log (K)\right) \ge \lim _{K\rightarrow +\infty }\mathbb {P}\left( S_{0}^{\mathcal {N}^K_{sup}}<C \log (K)\right) . \end{aligned}$$

We conclude the proof by noticing that the growth rates of the processes \(\mathcal {N}^K_{inf}\) and \(\mathcal {N}^K_{sup}\) are equal to \(-b(\beta -1)\) up to a constant times \(\varepsilon \).
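
The last observation can be checked directly from the rates given in Step 2. The following Python sketch computes the birth and death rates of \(\mathcal {N}^K_{inf}\) and \(\mathcal {N}^K_{sup}\) and compares their growth rates with \(-b(\beta -1)\). The function names, the value \(M=1\) and the migration rate \(p=1\) are illustrative assumptions; the demographic parameters are those of Sect. 6.

```python
def coupling_rates(b, beta, d, c, p, M, eps):
    """Birth and death rates of the two bounding branching processes of Step 2.

    Returns ((birth_inf, death_inf), (birth_sup, death_sup)).
    """
    zeta = (b * beta - d) / c
    ratio = (zeta - (M + 1) * eps) / (zeta - M * eps)
    inf_rates = (b + p * ratio, d + c * zeta + c * (M + 2) * eps + p)
    sup_rates = (
        b * (zeta + eps * (beta - M - 1)) / (zeta - M * eps) + p,
        d + c * zeta - c * (M + 1) * eps + p * ratio,
    )
    return inf_rates, sup_rates


def growth(rates):
    """Growth rate (birth minus death) of a binary branching process."""
    birth, death = rates
    return birth - death


# Sect. 6 demographic parameters; p = 1 and M = 1 are illustrative values.
b, beta, d, c, p, M = 2.0, 2.0, 1.0, 0.1, 1.0, 1.0
for eps in (1e-1, 1e-2, 1e-3):
    inf_rates, sup_rates = coupling_rates(b, beta, d, c, p, M, eps)
    print(eps, growth(inf_rates), growth(sup_rates), -b * (beta - 1))
```

Both growth rates are negative and differ from \(-b(\beta -1)\) by a quantity of order \(\varepsilon \), the lower process being slightly more subcritical than the upper one.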

5.2 Proof of Theorem 3

We can now prove our main result:

Let \(\varepsilon \) be a small positive number. Applying Lemma 1 and Theorem 1 we get the existence of a positive real number \(s_\varepsilon \) such that

$$\begin{aligned} \lim _{K \rightarrow \infty } \mathbb {P}\left( \Vert \mathbf {N}^K(s_\varepsilon ) - (\zeta K,0,0,\zeta K)\Vert \le \varepsilon K/2 \right) =1 . \end{aligned}$$

Using Proposition 2 and the Markov property yields that there exists \(C_0>0\) such that

$$\begin{aligned} \lim _{K \rightarrow \infty }\mathbb {P}\left( \left| \frac{T^K_{ {\mathcal {B}}_{\varepsilon }}}{\log K}-\frac{1}{b(\beta -1)} \right| \le C_0\varepsilon \right) = 1, \end{aligned}$$

where by definition, we recall that \(T^K_{ {\mathcal {B}}_{\varepsilon }}\) is the hitting time of \( {\mathcal {B}}_{\varepsilon }\). Moreover, the migration rates are equal to zero for any \(t\ge T^K_{ {\mathcal {B}}_{\varepsilon }}\), so

$$\begin{aligned} Z^K_{a,1}(t)=Z^K_{A,2}(t)=0, \ \text { for any } \ t\ge T^K_{ {\mathcal {B}}_\varepsilon }. \end{aligned}$$

After the time \(T^K_{ {\mathcal {B}}_\varepsilon }\), the A-population in the patch 1 and the a-population in the patch 2 evolve independently from each other according to two logistic birth and death processes with birth rate \(b\beta \), death rate d and competition rate c. Using Theorem 3(c) in the article by Champagnat (2006), we deduce that for any \(m>1\), there exists \(V>0\) such that

$$\begin{aligned} \inf _{X \in {\mathcal {B}}_\varepsilon } \mathbb {P}_{X}(T^K_{{\mathcal {B}}_{m\varepsilon }} \ge e^{KV}) \underset{K\rightarrow +\infty }{\rightarrow } 1, \end{aligned}$$

which ends the proof.

6 Influence of the migration parameter p: numerical simulations

In this section, we present some simulations of the deterministic dynamical system (7). We are concerned with the influence of the migration rate p on the time to reach a neighbourhood of the equilibrium (12). Note that p has no impact on the corresponding relaxation time for the stochastic system, because extinction of the minorities happens on a longer time scale.

For any value of p, we evaluate the first time \(T_{\varepsilon }(p)\) such that the solution \((z_{A,1}(t),z_{a,1}(t), z_{A,2}(t),z_{a,2}(t))\) to (7) belongs to the set

$$\begin{aligned} \mathcal {S}_\varepsilon =\left\{ (z_{A,1},z_{a,1},z_{A,2},z_{a,2})\in \mathbb {R}_+^4, (z_{A,1}-\zeta )^2+z_{a,1}^2+z_{A,2}^2+(z_{a,2}-\zeta )^2\le \varepsilon ^2 \right\} , \end{aligned}$$

which corresponds to the first time the solution enters an \(\varepsilon \)-neighbourhood of \((\zeta ,0,0,\zeta )\).

In the following simulations, the demographic parameters are given by:

$$\begin{aligned} \beta =2, \qquad b=2, \qquad d=1 \qquad \text {and} \qquad c=0.1. \end{aligned}$$

For these parameters,

$$\begin{aligned} \zeta =30 \qquad \text {and} \qquad p_0=\sqrt{5}-1\simeq 1.24. \end{aligned}$$

The migration rate as well as the initial condition vary.
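
As an illustration of how \(T_{\varepsilon }(p)\) can be evaluated, the following Python sketch integrates the system with a fixed-step fourth-order Runge-Kutta scheme and returns the first grid time at which the solution lies in \(\mathcal {S}_{\varepsilon }\). It assumes that system (7) is system (64) of Sect. 7.1 with identical parameters in both patches. The function names, the time step, the time horizon and the values of p are choices made here and are not taken from the simulations reported in this section.

```python
def rhs(z, b, beta, d, c, p):
    """Right-hand side of the two-patch system with identical patches."""
    zA1, za1, zA2, za2 = z
    n1, n2 = zA1 + za1, zA2 + za2
    out1 = p * zA1 * za1 / n1  # flow of each type leaving patch 1
    out2 = p * zA2 * za2 / n2  # flow of each type leaving patch 2
    return (
        zA1 * (b * (beta * zA1 + za1) / n1 - d - c * n1) - out1 + out2,
        za1 * (b * (beta * za1 + zA1) / n1 - d - c * n1) - out1 + out2,
        zA2 * (b * (beta * zA2 + za2) / n2 - d - c * n2) - out2 + out1,
        za2 * (b * (beta * za2 + zA2) / n2 - d - c * n2) - out2 + out1,
    )


def rk4_step(z, dt, *params):
    """One classical Runge-Kutta step of size dt."""
    k1 = rhs(z, *params)
    k2 = rhs(tuple(zi + 0.5 * dt * ki for zi, ki in zip(z, k1)), *params)
    k3 = rhs(tuple(zi + 0.5 * dt * ki for zi, ki in zip(z, k2)), *params)
    k4 = rhs(tuple(zi + dt * ki for zi, ki in zip(z, k3)), *params)
    return tuple(
        zi + dt * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        for zi, s1, s2, s3, s4 in zip(z, k1, k2, k3, k4)
    )


def hitting_time(z0, p, eps=0.01, dt=1e-3, t_max=100.0,
                 b=2.0, beta=2.0, d=1.0, c=0.1):
    """First grid time at which the solution is in S_eps, or None if not reached."""
    zeta = (b * beta - d) / c
    z, t = tuple(z0), 0.0
    while t < t_max:
        zA1, za1, zA2, za2 = z
        dist2 = (zA1 - zeta) ** 2 + za1 ** 2 + zA2 ** 2 + (za2 - zeta) ** 2
        if dist2 <= eps ** 2:
            return t
        z = rk4_step(z, dt, b, beta, d, c, p)
        t += dt
    return None


z0 = (10.0, 9.9, 1.0, 30.0)  # initial condition of Fig. 3a
for p in (0.0, 0.5, 1.0):
    print(p, hitting_time(z0, p))
```

A smaller time step refines the estimate of the hitting time. The scheme and step size used for Figs. 2 and 3 are not specified, so numerical values obtained with this sketch may differ slightly from those shown in the figures.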

Description of the figures. Figure 2 presents the plots of \(p \mapsto T_{\varepsilon }(p)-T_{\varepsilon }(0)\). The simulations are computed with \(\varepsilon =0.01\) and with initial conditions \((z_{A,1}(0),z_{a,1}(0),z_{A,2}(0),z_{a,2}(0))\) such that \(z_{a,1}(0)=z_{A,1}(0)-0.1\) with \(z_{A,1}(0)\in \{0.3,0.5,1,2,3,5,10,15\}\) and \((z_{A,2}(0),z_{a,2}(0))\in \{(1,30),(15,16)\}\). Figure 3 presents the trajectories of some solutions to the dynamical system (7) in the two phase planes which represent the two patches. We use the same parameters as in Fig. 2 and the initial conditions are given in the captions. For each initial condition, we plot the trajectories for three different values of p: 0, 1 and 20.

Conjecture. First of all, we observe that for all values under consideration, the time \(T_{\varepsilon }(p)\) to reach the set \(\mathcal {S}_{\varepsilon }\) is finite even if \(p>p_0\). Therefore, we make the following conjecture:

Conjecture 1

For any initial condition \((z_{A,1}(0),z_{a,1}(0),z_{A,2}(0),z_{a,2}(0))\in {\mathcal {D}}\), where \({\mathcal {D}}\) is defined by (17),

$$\begin{aligned} (z_{A,1}(t),z_{a,1}(t),z_{A,2}(t),z_{a,2}(t)) \underset{t\rightarrow +\infty }{\longrightarrow } (\zeta ,0,0,\zeta ). \end{aligned}$$
Fig. 2

For different values of the initial condition, we plot \(p \mapsto T_{\varepsilon }(p)-T_{\varepsilon }(0)\). The initial condition is \((z_{A,1}(0),z_{A,1}(0)-0.1,z_{A,2}(0),z_{a,2}(0))\) where \(z_{A,1}(0)\in \{0.3,0.5,1,2,3,5,10,15\}\) as represented by the colors of the legend. a \((z_{A,2}(0),z_{a,2}(0))=(1,30)\). b \((z_{A,2}(0),z_{a,2}(0))=(15,16)\) (color figure online)

Fig. 3

For four different initial conditions, we plot the trajectories in the phase planes which represent patch 1 (left) and patch 2 (right) for \(t\in [0,10]\) and for three values of p: \(p=0\) (red), \(p=1\) (blue), \(p=20\) (green). Note that the initial conditions in a and c (resp. b and d) correspond to the dark green (resp. light green) curve in Fig. 2a, b. a (10, 9.9, 1, 30). b (1, 0.9, 1, 30). c (10, 9.9, 15, 16). d (1, 0.9, 15, 16) (color figure online)

Influence of p when the initial condition in patch 2 is close to the equilibrium. Figure 2a presents the results for \((z_{A,2}(0),z_{a,2}(0))=(1,30)\), that is, when the initial condition in patch 2 is close to its equilibrium (recall that \(\zeta =30\) with the parameters under study). Observe that for any value of \((z_{A,1}(0),z_{a,1}(0)=z_{A,1}(0)-0.1)\), the time for reproductive isolation to occur is reduced when the migration rate is large. Hence, the migration rate seems here to strengthen the homogamy. This is confirmed by Fig. 3a, b, where examples of trajectories with the same initial conditions as in Fig. 2a are drawn. Figures 3a and 3b present similar behaviours: when p increases, the number of a-individuals in patch 1 decreases at any time, whereas the number and the proportion of a-individuals in patch 2 remain almost constant. These behaviours derive from two phenomena. On the one hand, the a-individuals are able to leave patch 1 faster when p is large. On the other hand, the value of p does not affect the migration out of patch 2, which is almost zero in view of the small proportion of A-individuals in patch 2.

Influence of p when a- and A-population sizes are initially similar in patch 2. In Fig. 2b we are interested in the case where the initial A- and a-populations in patch 2 have similar sizes and the sum \(z_{A,2}(0)+z_{a,2}(0)\) is close to \(\zeta \). Observe that for \(z_{A,1}(0)\in \{5,10,15\}\), the time \(T_{\varepsilon }(p)\) decreases with respect to p but not as fast as previously. By plotting some trajectories when \(z_{A,1}(0)=10\) in Fig. 3c, we note that the dynamics is not the same as in the previous case (Fig. 3a). Here, a large migration rate affects the migration out of the two patches in such a way that the equilibrium is reached faster.

Finally, Fig. 2b also presents behaviours that are essentially different for \(z_{A,1}(0)\in \{0.3,0.5,1,2,3\}\). In these cases, the migration rate does not strengthen the homogamy. We plot some trajectories from this latter case in Fig. 3d where \(z_{A,1}(0)=1\). Observe that a high value of p favors the migration out of patch 2 for the two types a and A since the proportions of the two alleles in patch 2 are almost equal at time \(t=0\). This is not the case in patch 1, where the value of p does not significantly affect the initial migration out of patch 1 since the population sizes are smaller. Hence, patch 1 is filled by the individuals that flee patch 2, where the migration rate is high. Therefore, both a- and A-populations increase at first, but the A-individuals remain dominant in patch 1 and thus the a-population is disadvantaged. Finally, the a-individuals that flee patch 2 find a less favorable environment in patch 1, and therefore the time needed to reach the equilibrium is extended because of the dynamics in patch 1.

Conclusion. As in selection-migration models (see e.g. the article by Akerman and Bürger 2014), migration can have different impacts on the population dynamics. On the one hand, a large migration rate helps the individuals to escape a disadvantageous habitat (Clobert et al. 2001); on the other hand, moving through unfamiliar or less suitable habitat carries risks. Thus, a trade-off between the two phenomena explains the influence of p on the time to reach the equilibrium.

7 Generalisations of the model

Until now we have studied a simple model in order to highlight the properties that lead to spatial segregation between patches. We now show that our findings are robust by studying some generalisations of the model in which several assumptions are relaxed and spatial segregation still occurs.

7.1 Differences between patches

We assumed that the patches were ecologically equivalent in the sense that the birth, death and competition rates b, d and c, respectively, did not depend on the label of the patch \(i \in {\mathcal {I}}\). In fact we could make these parameters depend on the patch, and denote them \(b_i\), \(d_i\) and \(c_i\), \(i \in {\mathcal {I}}\). In the same way, the sexual preference \(\beta _i\) and the migration rate \(p_i\) could depend on the label of the patch \(i \in {\mathcal {I}}\). As a consequence, the dynamical system (7) becomes

$$\begin{aligned} \left\{ \begin{array}{l} \frac{d}{dt}z_{A,1}(t)= z_{A,1}\left[ b_1\frac{\beta _1 z_{A,1}+z_{a,1}}{z_{A,1}+z_{a,1}}-d_1-c_1(z_{A,1}+z_{a,1})-p_1\frac{z_{a,1}}{z_{A,1}+z_{a,1}}\right] +p_2\frac{z_{A,2}z_{a,2}}{z_{A,2}+z_{a,2}}\\ \frac{d}{dt}z_{a,1}(t)= z_{a,1}\left[ b_1\frac{\beta _1 z_{a,1}+z_{A,1}}{z_{A,1}+z_{a,1}}-d_1-c_1(z_{A,1}+z_{a,1})-p_1\frac{z_{A,1}}{z_{A,1}+z_{a,1}}\right] +p_2\frac{z_{A,2}z_{a,2}}{z_{A,2}+z_{a,2}}\\ \frac{d}{dt}z_{A,2}(t)= z_{A,2}\left[ b_2\frac{\beta _2 z_{A,2}+z_{a,2}}{z_{A,2}+z_{a,2}}-d_2-c_2(z_{A,2}+z_{a,2})-p_2\frac{z_{a,2}}{z_{A,2}+z_{a,2}}\right] +p_1\frac{z_{A,1}z_{a,1}}{z_{A,1}+z_{a,1}}\\ \frac{d}{dt}z_{a,2}(t)= z_{a,2}\left[ b_2\frac{\beta _2 z_{a,2}+z_{A,2}}{z_{A,2}+z_{a,2}}-d_2-c_2(z_{A,2}+z_{a,2})-p_2\frac{z_{A,2}}{z_{A,2}+z_{a,2}}\right] +p_1\frac{z_{A,1}z_{a,1}}{z_{A,1}+z_{a,1}}.\end{array}\right. \end{aligned}$$
(64)

The set \({\mathcal {D}}\) is still invariant under this new system and the solutions to (64) with initial conditions in \({\mathcal {D}}\) hit in finite time the invariant set

$$\begin{aligned} {\mathcal {K}}_p^{\prime }:= \left\{ \mathbf {z} \in {\mathcal {D}}, \; z_{A,i}+z_{a,i} \in \left[ \frac{b_i(\beta _i+1)-2d_i-p_i}{2c_i}, \zeta _i+\frac{p_{\bar{i}}}{2c_i}\right] , \ i \in {\mathcal {I}} \right\} , \end{aligned}$$

where

$$\begin{aligned} \zeta _i: = \frac{b_i\beta _i- d_i}{c_i}. \end{aligned}$$

As \({\mathcal {D}}\) is an invariant set under (64), we can define the function V as in (45) for every solution of (64) with initial condition in \({\mathcal {D}}\). Its first order derivative is

$$\begin{aligned} \frac{d}{dt}V(\mathbf {z}(t)) = -\underset{i=1,2}{\sum } \frac{z_{A,i}z_{a,i}}{z_{A,i}+z_{a,i}}\left[ \frac{2b_i(\beta _i-1)+2p_i}{z_{A,i}+z_{a,i}}- \frac{2p_{i}}{z_{A,\bar{i}}+z_{a,\bar{i}}} \right] . \end{aligned}$$

As a consequence, we can prove similar results to Theorems 2 and 3 under the assumption that \(p_1\) and \(p_2\) satisfy

$$\begin{aligned} p_i c_{\bar{i}} (2 c_i \zeta _i + p_{\bar{i}}) < c_i (b_i(\beta _i-1)+p_i) (b_{\bar{i}}(\beta _{\bar{i}}+1)-2d_{\bar{i}}-p_{\bar{i}}), \text { for } i \in {\mathcal {I}}, \end{aligned}$$

and where the constant in front of the time \(\log K\) is no longer \(\frac{1}{b(\beta -1)}\) but \(\frac{1}{\omega _{1,2}}\) with

$$\begin{aligned} \omega _{1,2}= & {} \frac{1}{2}(b_1(\beta _1-1)+p_1+b_2(\beta _2-1)+p_2)\\&-\frac{1}{2}\sqrt{(b_1(\beta _1-1)+p_1-b_2(\beta _2-1)-p_2)^2+4p_1p_2}. \end{aligned}$$

Note that this constant now depends on the birth, preference and migration parameters of both patches. Indeed, since the two patches are no longer ecologically equivalent, the simplifications and cancellations of the previous model do not occur.
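
The constant \(\omega _{1,2}\) is easy to evaluate. The following Python sketch computes it and checks that, when the two patches share the same parameters, it reduces to \(b(\beta -1)\), the constant of Theorem 3. The function name and the asymmetric parameter values are illustrative assumptions.

```python
import math


def omega(b1, beta1, p1, b2, beta2, p2):
    """Constant omega_{1,2} governing the log K time scale for unequal patches."""
    a1 = b1 * (beta1 - 1) + p1
    a2 = b2 * (beta2 - 1) + p2
    return 0.5 * (a1 + a2) - 0.5 * math.sqrt((a1 - a2) ** 2 + 4 * p1 * p2)


# Identical patches (Sect. 6 values with p = 1): omega equals b*(beta - 1).
print(omega(2.0, 2.0, 1.0, 2.0, 2.0, 1.0), 2.0 * (2.0 - 1.0))

# Hypothetical asymmetric patches.
print(omega(2.0, 2.0, 1.0, 3.0, 1.5, 0.5))
```

One way to read this formula, offered here as an interpretation, is that \(\omega _{1,2}\) is the smaller eigenvalue of the matrix with diagonal entries \(b_i(\beta _i-1)+p_i\) and off-diagonal entries \(-p_2\) and \(-p_1\), which is the matrix obtained by linearising the equations for \(z_{a,1}\) and \(z_{A,2}\) in (64) around the equilibrium.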

7.2 Migration

The migration rates under consideration increase when the genetic diversity increases. Indeed, let us consider

$$\begin{aligned} H_T^{(i)} := 1-\left[ \left( \frac{n_{A,i}}{n_{A,i}+n_{a,i}}\right) ^2 + \left( \frac{n_{a,i}}{n_{A,i}+n_{a,i}}\right) ^2\right] \end{aligned}$$

as a measure of the genetic diversity in the patch \(i \in {\mathcal {I}}\). Note that \(H_T^{(i)} \in [0,1/2]\) is known as the “total gene diversity” in the patch i (see the article by Nei 1975 for instance) and is widely used as a measure of diversity. When we express the migration rates in terms of this measure, we get

$$\begin{aligned} \rho _{\alpha ,i \rightarrow \bar{i} }(n)= p \frac{n_{A,i}n_{a,i}}{n_{A,i}+n_{a,i}}= \frac{p}{2} (n_{A,i}+n_{a,i})H_T^{(i)} . \end{aligned}$$

Hence migration can be seen as helping speciation. Let us show that the same kind of result holds for a more general form of the migration rate, provided that it is symmetrical and bounded. More precisely,

$$\begin{aligned} \rho _{\alpha ,\bar{i} \rightarrow i }(n)= p(n_{A,\bar{i}}, n_{a,\bar{i}}), \end{aligned}$$

and we assume

$$\begin{aligned} p(n_{A,\bar{i}}, n_{a,\bar{i}})=p(n_{a,\bar{i}}, n_{A,\bar{i}})\quad \text {and} \quad p(n_{A,\bar{i}}, n_{a,\bar{i}}) \frac{n_{A,\bar{i}}+n_{a,\bar{i}}}{n_{A,\bar{i}}n_{a,\bar{i}}}<p_0, \end{aligned}$$

where \(p_0\) has been defined in (18). Note that the second condition on the function p imposes that as one of the population sizes goes to 0, then so does the migration rate. In particular, this condition ensures that the points given by (12) and (13) are still equilibria of the system. Theorems 2 and 3 still hold with this new definition for the migration rate.
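
The identity between the original migration rate and the total gene diversity used at the beginning of this subsection can be checked numerically. In the following Python sketch, the function names and the sample population sizes are illustrative choices.

```python
def gene_diversity(n_A, n_a):
    """Total gene diversity H_T = 1 - (x_A**2 + x_a**2) in a patch."""
    total = n_A + n_a
    return 1.0 - ((n_A / total) ** 2 + (n_a / total) ** 2)


def migration_rate(n_A, n_a, p):
    """Migration rate p * n_A * n_a / (n_A + n_a) of the original model."""
    return p * n_A * n_a / (n_A + n_a)


def migration_rate_from_diversity(n_A, n_a, p):
    """Same rate written as (p/2) * (n_A + n_a) * H_T."""
    return 0.5 * p * (n_A + n_a) * gene_diversity(n_A, n_a)


for n_A, n_a in ((5, 5), (10, 1), (300, 7), (1, 1000)):
    print(n_A, n_a, gene_diversity(n_A, n_a),
          migration_rate(n_A, n_a, 1.0),
          migration_rate_from_diversity(n_A, n_a, 1.0))
```

The diversity is maximal, equal to 1/2, when both types are equally represented, and vanishes when one type is absent, in which case the migration rate is zero.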

7.3 Number of patches

Finally, we restricted our attention to the case of two patches, but we can consider an arbitrary number \(N \in \mathbb {N}\) of patches. We assume that all the patches are ecologically equivalent but that migrating individuals move to another patch with a probability that depends on the geometry of the system. Moreover, we allow the individuals to migrate outside the N patches. In other words, for \(\alpha \in {\mathcal {A}}\), \(i\le N\), \(j\le N+1\) and \(\mathbf {n} \in (\mathbb {N}^{{\mathcal {A}}})^{N}\),

$$\begin{aligned} \rho _{\alpha , i \rightarrow j}(\mathbf {n})= p_{ij}\frac{n_{A,i}n_{a,i}}{n_{A,i}+n_{a,i}}, \end{aligned}$$

where the “patch” \(N+1\) denotes the outside of the system.

As a consequence, we obtain the following limiting dynamical system for the rescaled process, when the initial population sizes are of order K in all the patches: for every \(1 \le i \le N\),

$$\begin{aligned} \begin{aligned} \frac{dz_{A,i}(t)}{dt}&= z_{A,i}\left[ b\frac{\beta z_{A,i}+z_{a,i}}{z_{A,i}+z_{a,i}}-d-c(z_{A,i}+z_{a,i})-\sum _{j\ne i, j\le N+1} p_{ij}\frac{z_{a,i}}{z_{A,i}+z_{a,i}}\right] \\&\quad +\,\sum _{j \ne i, j\le N} p_{ji}\frac{z_{A,j}z_{a,j}}{z_{A,j}+z_{a,j}}\\ \frac{dz_{a,i}(t)}{dt}&= z_{a,i}\left[ b\frac{\beta z_{a,i}+z_{A,i}}{z_{A,i}+z_{a,i}}-d-c(z_{A,i}+z_{a,i})-\sum _{j\ne i, j\le N+1} p_{ij}\frac{z_{A,i}}{z_{A,i}+z_{a,i}}\right] \\&\quad +\,\sum _{j \ne i, j\le N} p_{ji}\frac{z_{A,j}z_{a,j}}{z_{A,j}+z_{a,j}}\end{aligned} \end{aligned}$$
(65)

For the sake of readability, we introduce the following two notations:

$$\begin{aligned} p_{i\rightarrow }:=\sum _{j\ne i, j\le N+1} p_{ij} \quad \text {and} \quad p_{i\leftarrow }:= \sum _{j\ne i, j\le N} p_{ji}. \end{aligned}$$
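As a worked illustration of this notation, the first equation of (65) can be rewritten by factoring the outgoing migration term, which only involves \(p_{i\rightarrow }\):

$$\begin{aligned} \frac{dz_{A,i}(t)}{dt}&= z_{A,i}\left[ b\frac{\beta z_{A,i}+z_{a,i}}{z_{A,i}+z_{a,i}}-d-c(z_{A,i}+z_{a,i})\right] -p_{i\rightarrow }\frac{z_{A,i}z_{a,i}}{z_{A,i}+z_{a,i}}\\&\quad +\,\sum _{j \ne i, j\le N} p_{ji}\frac{z_{A,j}z_{a,j}}{z_{A,j}+z_{a,j}}, \end{aligned}$$

and symmetrically for \(z_{a,i}\). The incoming term cannot be factored in the same way, since each \(p_{ji}\) is weighted by the composition of patch j; the quantity \(p_{i\leftarrow }\) only appears as the sum of these weights.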

Let \(N_A\) be an integer smaller than N, giving the number of patches in which individuals of type A are initially in the majority. We can assume without loss of generality that

$$\begin{aligned} z_{A,i}(0)> z_{a,i}(0),\ \text {for} \ 1 \le i \le N_A, \quad \text {and} \quad z_{A,i}(0)<z_{a,i}(0), \ \text {for} \ N_A+1\le i \le N . \end{aligned}$$

Let us introduce the subset of \((\mathbb {R}_+^{\mathcal {A}})^{N}\)

$$\begin{aligned} {\mathcal {D}}_{N_A,N}:=\{ \mathbf {z} \in (\mathbb {R}_+^{\mathcal {A}})^{N}, z_{A,i}-z_{a,i}>0\,\, \text {for} \,\, i \le N_A, \quad \text {and} \quad z_{a,i}-z_{A,i}>0 \ \text {for} \ i >N_A \}. \end{aligned}$$

We assume that the sequence \((p_{ij})_{i,j\in \{1,\ldots ,N\}}\) satisfies: for all \(i\in \{1,\ldots ,N\}\),

$$\begin{aligned} p_{i\rightarrow }< b(\beta +1)-2d \ \text { and } \ \frac{b(\beta -1)+p_{i\rightarrow }}{2c\zeta +p_{i\leftarrow }}-\sum _{j\ne i, j\le N+1}\frac{p_{ij}}{b(\beta +1)-2d-p_{j\rightarrow }}>0. \end{aligned}$$
(66)

Then we have the following result:

Theorem 4

We assume that Assumption (66) holds. Let us assume that \(\mathbf {Z}^K(0)\) converges in probability to a deterministic vector \(\mathbf{{z}^0}\) belonging to \({\mathcal {D}}_{N_A,N}\) with \((z_{a,1}^0,z_{A,2}^0)\ne (0,0)\). Introduce the following bounded set depending on \(\varepsilon >0\):

$$\begin{aligned} {\mathcal {B}}_{N_A,N,\varepsilon }:= \left( [({\zeta }-\varepsilon )K,({\zeta }+\varepsilon )K] \times \{0\} \right) ^{N_A}\times \left( \{0\} \times [({\zeta }-\varepsilon )K,({\zeta }+\varepsilon )K]\right) ^{N-N_A}. \end{aligned}$$

Then there exist three positive constants \(\varepsilon _0\), \(C_0\) and m, and a positive constant V depending on \((m,\varepsilon _0)\) such that if \(\varepsilon \le \varepsilon _0\),

$$\begin{aligned} \lim _{K \rightarrow \infty }\mathbb {P}\left( \left| \frac{T^K_{{\mathcal {B}}_{N_A,N,\varepsilon }}}{\log K}-\frac{1}{b(\beta -1)} \right| \le C_0\varepsilon , \mathbf {N}^K\left( T^K_{{\mathcal {B}}_{N_A,N,\varepsilon }}+t\right) \in {\mathcal {B}}_{N_A,N,m\varepsilon }\; \forall t \le e^{VK} \right) = 1, \end{aligned}$$

where, for \({\mathcal {B}} \subset \mathbb {R}_+^{\mathcal {E}}\), \(T^K_{\mathcal {B}}\) denotes the hitting time of the set \({\mathcal {B}}\) by the population process \(\mathbf {N}^K\).
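To give an order of magnitude for the first part of Theorem 4, the sketch below evaluates the leading-order term \(\log K/(b(\beta -1))\). The function name and the numerical values are hypothetical, and we assume \(\beta >1\) so that the expression is positive. The theorem only controls the ratio of the hitting time to \(\log K\) up to \(C_0\varepsilon \), with probability tending to one, so this is an asymptotic estimate and not an exact time.

```python
import math


def speciation_time_estimate(K, b, beta):
    """Leading-order term log(K) / (b * (beta - 1)) from Theorem 4.

    K    : scaling parameter (order of the population size)
    b    : birth parameter
    beta : mating preference parameter, assumed here to satisfy beta > 1
    """
    if K <= 1 or b <= 0 or beta <= 1:
        raise ValueError("expected K > 1, b > 0 and beta > 1")
    return math.log(K) / (b * (beta - 1))


# Hypothetical values: the estimate grows only logarithmically with K
# and decreases when the preference parameter beta increases.
for K in (10**3, 10**6):
    for beta in (1.5, 2.0):
        print(K, beta, speciation_time_estimate(K, b=2.0, beta=beta))
```

Note that the migration parameters \(p_{ij}\) do not enter this leading-order term.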

The proof is very similar to that of the two-patch case. To handle the deterministic part of the proof, we first show that for every initial condition in \({\mathcal {D}}_{N_A,N}\), the solution of (65) hits the set

$$\begin{aligned} {\mathcal {K}}_{N_A,N}:= & {} \left\{ \mathbf {z} \in \left( (\mathbb {R}_+^*)^{{\mathcal {A}}}\right) ^{N}, \; \{z_{A,i}+z_{a,i} \}\right. \\&\left. \in \left[ \frac{b(\beta +1)-2d-p_{i\rightarrow }}{2c}, \zeta +\frac{p_{i\leftarrow }}{2c}\right] \forall i \le N \right\} \cap {\mathcal {D}}_{N_A,N} \end{aligned}$$

in finite time, and that this set is invariant under (65). Then, we conclude with the Lyapunov function

$$\begin{aligned} \mathbf {z} \in {\mathcal {K}}_{N_A,N} \mapsto \underset{i \le N_A}{\sum }\ln \left( \frac{z_{A,i}+z_{a,i}}{z_{A,i}-z_{a,i}}\right) + \underset{N_A<i \le N}{\sum }\ln \left( \frac{z_{a,i}+z_{A,i}}{z_{a,i}-z_{A,i}}\right) . \end{aligned}$$
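The Lyapunov function above is straightforward to evaluate. The sketch below does so for a state given as a list of \((z_{A,i},z_{a,i})\) pairs; the function name and the choice to raise an error outside \({\mathcal {D}}_{N_A,N}\) are our own conventions. Each summand is the logarithm of a ratio that is at least 1 on \({\mathcal {D}}_{N_A,N}\), so the function is nonnegative there and vanishes exactly when the minority type is absent from every patch.

```python
import math


def lyapunov(z, n_A):
    """Sum over patches of ln((major + minor) / (major - minor)).

    z   : list of (z_A, z_a) pairs, one per patch
    n_A : number of patches (listed first) where type A is the majority
    """
    value = 0.0
    for i, (zA, za) in enumerate(z):
        # In the first n_A patches A is the majority type, in the others a.
        major, minor = (zA, za) if i < n_A else (za, zA)
        if major <= minor:
            raise ValueError("state is outside D_{N_A,N}")
        value += math.log((major + minor) / (major - minor))
    return value


# Hypothetical states with N = 2 and N_A = 1.
mixed_state = [(3.0, 1.0), (1.0, 3.0)]   # minority types still present
pure_state = [(3.0, 0.0), (0.0, 3.0)]    # one type per patch
print(lyapunov(mixed_state, 1), lyapunov(pure_state, 1))
```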

In conclusion, several generalisations are possible and many of the assumptions of the initial simple model can be relaxed. We can also combine some of these generalisations to fit the needs of a particular system. Observe, however, that in each case the mating preference influences the time needed to reach speciation in the same way.