1 Introduction

All populations are structured and experience environmental fluctuations. Population structure may arise from individual differences in age, size, and spatial location (Metz and Diekmann 1986; Caswell 2001; Holyoak et al. 2005). Temporal fluctuations in environmental factors such as light, precipitation, and temperature occur in all natural marine, freshwater and terrestrial systems. Since these environmental factors can influence survival, growth, and reproduction, environmental fluctuations result in demographic fluctuations that may influence species persistence and the composition of ecological communities (Tuljapurkar 1990; Chesson 2000b; Kuang and Chesson 2009). Here we present, for the first time, a general approach to studying coexistence of structured populations in fluctuating environments.

For species interacting in an ecosystem, a fundamental question is what are the minimal conditions to ensure the long-term persistence of all species. Historically, theoretical ecologists characterized persistence by the existence of an asymptotic equilibrium in which the proportion of each population is strictly positive (May 1975; Roughgarden 1979). More recently, coexistence was equated with the existence of an attractor bounded away from extinction (Hastings 1988), a definition that ensures populations will persist despite small, random perturbations of the populations (Schreiber 2006, 2007). However, “environmental perturbations are often vigourous shake-ups, rather than gentle stirrings” (Jansen and Sigmund 1998). To account for large, but rare, perturbations, the concept of permanence, or uniform persistence, was introduced in the late 1970s (Freedman and Waltman 1977; Schuster et al. 1979). Uniform persistence requires that asymptotically species densities remain uniformly bounded away from extinction. In addition, permanence requires that the system is dissipative, i.e. asymptotically species densities remain uniformly bounded from above. Various mathematical approaches exist for verifying permanence (Hutson and Schmitt 1992; Smith and Thieme 2011) including topological characterizations with respect to chain recurrence (Butler and Waltman 1986; Hofbauer and So 1989), average Lyapunov functions (Hofbauer 1981; Hutson 1984; Garay and Hofbauer 2003), and measure theoretic approaches (Schreiber 2000; Hofbauer and Schreiber 2010). The latter two approaches involve the long-term, per-capita growth rates of species when rare. For discrete-time, unstructured models of the form \(x_{t+1}^i= f_i(x_t)x_t^i\) where \(x_t=(x_t^1,\dots ,x_t^n)\) is the vector of population densities at time \(t\), the long-term growth rate of species \(i\) with initial community state \(x_0=x\) equals

$$\begin{aligned} r_i (x) = \limsup _{t\rightarrow \infty }\frac{1}{t}\sum _{s=0}^{t-1} \log f_i (x_s). \end{aligned}$$

Garay and Hofbauer (2003) showed, under appropriate assumptions, that the system is permanent provided there exist positive weights \(p_1,\dots ,p_n\) associated with each species such that \(\sum _i p_i r_i (x)>0\) for any initial condition \(x\) with one or more missing species (i.e. \(\prod _i x^i=0\)). Intuitively, the community persists if on average the community increases when rare.
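The time average defining \(r_i(x)\) is straightforward to approximate numerically. The following minimal sketch replaces the \(\limsup \) by a finite-time average and uses a hypothetical two-species competition model, \(f_i(x)=\lambda _i/(1+x^1+x^2)\), that is not taken from the text; the function names and parameter values are illustrative assumptions.

```python
import math

def growth_rates(fitness, x0, t):
    """Finite-time proxy for r_i(x0): (1/t) * sum_{s<t} log f_i(x_s)."""
    x = list(x0)
    log_sums = [0.0] * len(x)
    for _ in range(t):
        f = fitness(x)
        log_sums = [acc + math.log(fi) for acc, fi in zip(log_sums, f)]
        x = [fi * xi for fi, xi in zip(f, x)]  # x_{t+1}^i = f_i(x_t) x_t^i
    return [acc / t for acc in log_sums]

def fitness(x, lam=(2.0, 3.0)):
    """Hypothetical competition model: f_i(x) = lam_i / (1 + x^1 + x^2)."""
    crowding = 1.0 + sum(x)
    return [l / crowding for l in lam]

# Species 2 is absent, so x = (0.5, 0) lies in the extinction set.
r = growth_rates(fitness, [0.5, 0.0], 2000)
weights = (0.5, 0.5)  # illustrative positive weights p_1, p_2
weighted = sum(p * ri for p, ri in zip(weights, r))
```

For this initial condition the resident's growth rate is close to zero and the missing species has growth rate close to \(\log (3/2)\), so the weighted sum is positive. This checks a single point of the extinction set only; the criterion requires the inequality for every such point with one fixed choice of weights.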

The permanence criterion for unstructured populations also extends to structured populations. However, in this case, the long-term growth rate is more complicated. Consider, for example, when both time and the structuring variables are discrete; the population dynamics are given by \(x_{t+1}^i=A_i(x_t)x_t^i\) where \(x_t^i\) is a vector corresponding to the densities of the stages of species \(i\), \(x_t=(x_t^1,\dots ,x_t^n)\), and \(A_i(x)\) are non-negative matrices. Then the long term growth rate \(r_i(x)\) of species \(i\) corresponds to the dominant Lyapunov exponent associated with the matrices \(A_i(x)\) along the population trajectory:

$$\begin{aligned} r_i(x) = \limsup _{t\rightarrow \infty } \frac{1}{t} \log \Vert A_i(x_{t-1})\dots A_i(x_0)\Vert . \end{aligned}$$

At the extinction state \(x=0\), the long-term growth rate \(r_i(0)\) simply corresponds to the \(\log \) of the largest eigenvalue of \(A_i(0)\). For structured single-species models, Cushing (1998) and Kon et al. (2004) proved that \(r_1(0)>0\) implies permanence. For structured, continuous-time, multiple species models, \(r_i(x)\) can be defined in an analogous manner to the discrete-time case using the fundamental matrix of the variational equation. Hofbauer and Schreiber (2010) showed, under appropriate assumptions, that \(\sum _i p_i r_i(x)>0\) for all \(x\) in the extinction set is sufficient for permanence. For discrete-time structured models, however, there exists no general proof of this fact (see, however, Salceanu and Smith 2009a, b, 2010). When both time and the structuring variables are continuous, the models become infinite dimensional and may be formulated as partial differential equations or functional differential equations. Much work has been done in this direction (Hutson and Moran 1987; Zhao and Hutson 1994; Thieme 2009, 2011; Magal et al. 2010; Xu and Zhao 2003; Jin and Zhao 2009). In particular, for reaction-diffusion equations, the long-term growth rates correspond to growth rates of semi-groups of linear operators, and \(\sum _i p_i r_i(x)>0\) for all \(x\) in the extinction set also ensures permanence for these models (Hutson and Moran 1987; Zhao and Hutson 1994; Cantrell and Cosner 2003).
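The statement that \(r_i(0)\) is the \(\log \) of the largest eigenvalue of \(A_i(0)\) can be checked numerically. The sketch below, which assumes a hypothetical \(2\times 2\) primitive matrix chosen only for illustration, compares the norm growth rate \(\frac{1}{t}\log \Vert vA^t\Vert \) with the closed-form dominant eigenvalue of a \(2\times 2\) matrix.

```python
import math

def norm_growth_rate(A, t):
    """(1/t) log ||v A^t|| for a positive row vector v, renormalizing each step."""
    n = len(A)
    v = [1.0 / n] * n
    log_growth = 0.0
    for _ in range(t):
        v = [sum(v[j] * A[j][k] for j in range(n)) for k in range(n)]  # v <- v A
        total = sum(v)  # l1 norm of a non-negative vector
        log_growth += math.log(total)
        v = [vk / total for vk in v]
    return log_growth / t

# Hypothetical 2-stage projection matrix at the extinction state.
A0 = [[0.0, 2.0],
      [0.5, 0.5]]
trace = A0[0][0] + A0[1][1]
det = A0[0][0] * A0[1][1] - A0[0][1] * A0[1][0]
log_dominant = math.log((trace + math.sqrt(trace**2 - 4 * det)) / 2)
estimate = norm_growth_rate(A0, 500)
```

Renormalizing at every step avoids overflow or underflow in long products; the accumulated logarithms of the normalizing constants telescope to \(\log \Vert vA^t\Vert \).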

Environmental stochasticity can be a potent force for disrupting population persistence yet maintaining biodiversity. Classical stochastic demography theory for stochastic matrix models \(x_{t+1}=A(t)x_t\) shows that temporally uncorrelated fluctuations in the projection matrices \(A(t)\) reduce the long-term growth rates of populations when rare (Tuljapurkar 1990; Boyce et al. 2006). Hence, increases in the magnitude of these uncorrelated fluctuations can shift populations from persisting to asymptotic extinction. Under suitable conditions, the long-term growth rate for these models is given by the limit \(r=\lim _{t\rightarrow \infty } \frac{1}{t} \ln \Vert A(t)\dots A(1)\Vert \) with probability one. When \(r>0\), the population grows exponentially with probability one for these density-independent models. When \(r<0\), the population declines exponentially with probability one. Hardin et al. (1988) and Benaïm and Schreiber (2009) proved that these conclusions extend to models with compensating density-dependence. However, instead of growing without bound when \(r>0\), the populations converge to a positive stationary distribution with probability one. These results, however, do not apply to models with over-compensating density-dependence or, more generally, non-monotonic responses of demography to density.
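To make the effect of uncorrelated fluctuations concrete, the following sketch estimates \(r\) for a stochastic matrix model in which a hypothetical mean matrix \(B\) is multiplied by a random scalar with mean one. This special case is an assumption made for illustration: with scalar noise the reduction in \(r\) equals \(\mathbb {E}[\ln s_t]\) exactly by Jensen's inequality, whereas the classical results cited above concern more general fluctuations.

```python
import math
import random

def stochastic_growth_rate(matrices, t, seed=0):
    """Estimate r = lim (1/t) ln ||A(t)...A(1)|| for i.i.d. uniform draws from `matrices`."""
    rng = random.Random(seed)
    n = len(matrices[0])
    v = [1.0 / n] * n
    log_growth = 0.0
    for _ in range(t):
        A = matrices[int(rng.random() * len(matrices))]
        v = [sum(A[j][k] * v[k] for k in range(n)) for j in range(n)]  # v <- A v
        total = sum(v)
        log_growth += math.log(total)
        v = [vj / total for vj in v]
    return log_growth / t

def scale(A, c):
    return [[c * a for a in row] for row in A]

B = [[0.0, 2.0], [0.5, 0.5]]                  # hypothetical mean projection matrix
r_mean = stochastic_growth_rate([B], 20000)   # no fluctuations
r_noisy = stochastic_growth_rate([scale(B, 0.5), scale(B, 1.5)], 20000)
r_strong = stochastic_growth_rate([scale(B, 0.2), scale(B, 1.8)], 20000)
```

All three environments share the same mean matrix, yet the growth rate decreases with the magnitude of the fluctuations and becomes negative in the strongest case, in line with the shift from persistence to asymptotic extinction described above.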

Environmental stochasticity can promote diversity through the storage effect (Chesson and Warner 1981; Chesson 1982) in which asynchronous fluctuations of favorable conditions can allow long-lived species competing for space to coexist. The theory for coexistence in stochastic environments has focused on stochastic difference equations of the form \(x_{t+1}^i=x_t^i f_i (\xi _{t+1},x_t)\) where \(\xi _1,\xi _2,\dots \) is a sequence of independent, identically distributed random variables (for a review see Schreiber 2012). Schreiber et al. (2011) prove that coexistence, in a suitable sense, occurs provided that \(\sum _i p_i r_i(x)>0\) with probability one for all \(x\) in the extinction set. Similar to the deterministic case, the long-term growth rate of species \(i\) equals \(r_i (x) = \limsup _{t\rightarrow \infty }\frac{1}{t}\sum _{s=0}^{t-1} \log f_i (x_s)\). Here, stochastic coexistence implies that each species spends an arbitrarily small fraction of time near arbitrarily small densities.

Here, we develop persistence theory for models simultaneously accounting for species interactions, population structure, and environmental fluctuations. Our main result implies that the “community increases when rare” persistence criterion also applies to these models. Our model, assumptions, and a definition of stochastic persistence are presented in Sect. 2. Except for a compactness assumption, our assumptions are quite minimal allowing for overcompensating density dependence and correlated environmental fluctuations. Long-term growth rates for these models and our main theorem are stated in Sect. 3. We apply our results to stochastic models of predator-prey interactions, stage-structured beetle dynamics, and competition in spatially heterogeneous environments. The stochastic models for predator-prey interactions are presented in Sect. 4 and examine to what extent “colored” environmental fluctuations facilitate predator-prey coexistence. In Sect. 5, we develop precise criteria for persistence and exclusion for structured single species models and apply these results to the classic stochastic model of larvae-pupae-adult dynamics of flour beetles (Costantino et al. 1995; Dennis et al. 1995; Costantino et al. 1997; Henson and Cushing 1997) and metapopulation dynamics (Harrison and Quinn 1989; Gyllenberg et al. 1996; Metz and Gyllenberg 2001; Roy et al. 2005; Hastings and Botsford 2006; Schreiber 2010). We show, contrary to initial expectations, that multiplicative noise with logarithmic means of zero can facilitate persistence. In Sect. 6, we examine spatially explicit lottery models (Chesson 1985, 2000a, b) to illustrate how spatial and temporal heterogeneity, collectively, mediate coexistence for transitive and intransitive competitive communities. Proofs of most results are presented in Sect. 8.

2 Model and assumptions

We study the dynamics of \(m\) interacting populations in a random environment. Each individual in population \(i\) can be in one of \(n_i\) individual states such as their age, size, or location. Let \(X_t^i = (X_t^{i1}, \dots , X_t^{in_i})\) denote the row vector of population abundances of individuals in different states for population \(i\) at time \(t\in \mathbb N\). \(X_t^i\) lies in the non-negative cone \(\mathbb R^{n_i}_+\). The population state is the row vector \(X_t=(X_t^1, \dots , X_t^m)\) that lies in the non-negative cone \(\mathbb R^n_+\) where \(n=\sum _{i=1}^mn_i\). To account for environmental fluctuations, we consider a sequence of random variables, \(\xi _1,\xi _2,\dots ,\xi _t,\dots \) where \(\xi _t\) represents the state of the environment at time \(t\).

To define the population dynamics, we consider projection matrices for each population that depend on the population state and the environmental state. More precisely, for each \(i\), let \(A_i(\xi ,X)\) be a non-negative, \(n_i\times n_i\) matrix whose \(j\)-\(k\)th entry corresponds to the contribution of individuals in state \(j\) to individuals in state \(k\), e.g. individuals transitioning from state \(j\) to state \(k\) or the mean number of offspring in state \(k\) produced by individuals in state \(j\). Using these projection matrices and the sequence of environmental states, the population dynamics of population \(i\) are given by

$$\begin{aligned} X_{t+1}^i = X_t^iA_i(\xi _{t+1}, X_t). \end{aligned}$$

where \(X_t^i\) multiplies on the left hand side of \(A_i(\xi _{t+1},X_t)\) as it is a row vector. If we define \(A(\xi ,X)\) to be the \(n\times n\) block diagonal matrix \(\mathrm {diag}(A_1(\xi ,X),\dots ,A_m(\xi ,X))\), then the dynamics of the interacting populations are given by

$$\begin{aligned} X_{t+1} = X_t A(\xi _{t+1},X_t). \end{aligned}$$
(1)
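The row-vector convention in (1) can be sketched in a few lines. In the example below, the two projection matrices, their entries, and the function names are hypothetical and serve only to show how the block-diagonal update acts on each population separately while every block depends on the whole community state.

```python
def step(X, xi, projection_matrices):
    """One step of X_{t+1}^i = X_t^i A_i(xi_{t+1}, X_t) for every population i.

    X is a list of row vectors (one per population); projection_matrices[i]
    is a function (xi, X) -> n_i x n_i non-negative matrix (list of rows).
    """
    X_next = []
    for x_i, A_i in zip(X, projection_matrices):
        A = A_i(xi, X)
        n = len(x_i)
        # Row vector times matrix: entry k collects contributions from all states j.
        X_next.append([sum(x_i[j] * A[j][k] for j in range(n)) for k in range(n)])
    return X_next

# Hypothetical example: a juvenile-adult population (index 0) competing with
# an unstructured population (index 1); xi scales adult fecundity.
def A_structured(xi, X):
    crowding = 1.0 + sum(X[0]) + sum(X[1])
    return [[0.0, 0.5],               # juveniles mature with probability 0.5
            [xi / crowding, 0.8]]     # adults reproduce and survive

def A_scalar(xi, X):
    return [[1.5 / (1.0 + sum(X[0]) + sum(X[1]))]]

X1 = step([[1.0, 2.0], [1.0]], 10.0, [A_structured, A_scalar])
```

Because \(A\) is block diagonal, the populations interact only through the dependence of each \(A_i\) on the full state \(X_t\), which is why each matrix function receives the complete list of abundance vectors.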

For these dynamics, we make the following assumptions:

  1. H1:

    \(\xi _1,\xi _2, \dots \) is an ergodic stationary sequence in a compact Polish space \(E\) (i.e. compact, separable and completely metrizable).

  2. H2:

    For each \(i\), \((\xi ,X) \mapsto A_i(\xi ,X)\) is a continuous map into the space of \(n_i\times n_i\) non-negative matrices.

  3. H3:

    For each population \(i\), the matrix \(A_i\) has fixed sign structure corresponding to a primitive matrix. More precisely, for each \(i\), there is a \(n_i\times n_i\), non-negative, primitive matrix \(P_i\) such that the \(j\)-\(k\)th entry of \(A_i(\xi ,X)\) equals zero if and only if the \(j\)-\(k\)th entry of \(P_i\) equals zero for all \(1\le j,k\le n_i\) and \((\xi ,X)\in E \times \mathbb R^n_+\).

  4. H4:

    There exists a compact set \(S\subset \mathbb R^n_+\) such that for all \(X_0\in \mathbb R^n_+\), \(X_t\in S\) for all \(t\) sufficiently large.

Our analysis focuses on whether the interacting populations tend, in an appropriate stochastic sense, to be bounded away from extinction. Extinction of one or more populations corresponds to the population state lying in the extinction set

$$\begin{aligned} S_0 = \left\{ x \in S : \prod _i \Vert x^i\Vert =0\right\} \end{aligned}$$

where \(\Vert x^i\Vert =\sum _{j=1}^{n_i} x^{ij}\) corresponds to the \(\ell ^1\)–norm of \(x^i\). Given \(X_0=x\), we define stochastic persistence in terms of the empirical measure

$$\begin{aligned} \Pi _t^x = \frac{1}{t}\sum _{s=1}^t \delta _{X_s} \end{aligned}$$
(2)

where \(\delta _y\) denotes a Dirac measure at \(y\), i.e. \(\delta _{y}(A) =1\) if \(y\in A\) and \(0\) otherwise for any Borel set \(A \subset \mathbb R_+^n\). These empirical measures are random measures describing the distribution of the observed population dynamics up to time \(t\). In particular, for any Borel set \(B\subset S\),

$$\begin{aligned} \Pi _t^x(B)= \frac{\#\{ 1\le s \le t | X_s \in B \}}{t} \end{aligned}$$

is the fraction of time that the populations spend in the set \(B\). For instance, if we define

$$\begin{aligned} S_\eta =\{ x \in S : \Vert x^i\Vert \le \eta \text{ for } \text{ some } i\}, \end{aligned}$$

then \(\Pi _t^x(S_\eta )\) is the fraction of time that the total abundance of some population is less than \(\eta \) given \(X_0=x\).

Definition 2.1

The model (1) is stochastically persistent if for all \(\varepsilon >0\), there exists \(\eta >0\) such that, with probability one,

$$\begin{aligned} \Pi _t ^x(S_\eta ) \le \varepsilon \end{aligned}$$

for \(t\) sufficiently large and \(x\in S{\setminus } S_0\).

The set \(S_\eta \) corresponds to community states where one or more populations have a density less than \(\eta \). Therefore, stochastic persistence corresponds to all populations spending an arbitrarily small fraction of time at arbitrarily low densities.
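The quantity \(\Pi _t^x(S_\eta )\) appearing in Definition 2.1 is simply a counting statistic of the observed trajectory. A minimal sketch, using a hand-made trajectory whose values are purely illustrative:

```python
def fraction_near_extinction(trajectory, eta):
    """Pi_t^x(S_eta): fraction of times s = 1..t with ||X_s^i|| <= eta for some i.

    trajectory is [X_1, ..., X_t]; each X_s is a list of per-population
    abundance vectors (non-negative, so the l1 norm is just the sum).
    """
    hits = sum(1 for X in trajectory if any(sum(x_i) <= eta for x_i in X))
    return hits / len(trajectory)

# Hand-made trajectory of two populations (the second has two stages).
trajectory = [
    [[5.0], [0.02, 0.03]],
    [[4.0], [0.50, 0.70]],
    [[0.1], [2.00, 1.00]],
    [[3.0], [1.00, 1.00]],
]
frac_coarse = fraction_near_extinction(trajectory, 0.1)
frac_fine = fraction_near_extinction(trajectory, 0.01)
```

Stochastic persistence asks that this fraction can be made smaller than any prescribed \(\varepsilon \) by shrinking \(\eta \), for all sufficiently long trajectories.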

3 Results

3.1 Long-term growth rates and a persistence theorem

Understanding persistence often involves understanding what happens to each population when it is rare. To this end, we need to understand the propensity of the population to increase or decrease in the long term. Since

$$\begin{aligned} X_{t}^i= X_0^iA_i(\xi _1,X_0)A_i(\xi _2,X_1)\dots A_i(\xi _t, X_{t-1}), \end{aligned}$$

one might be interested in the long-term “growth” of the random product of matrices

$$\begin{aligned} A_i(\xi _1,X_0)A_i(\xi _2,X_1)\dots A_i(\xi _t, X_{t-1}) \end{aligned}$$
(3)

as \(t\rightarrow \infty \). One measurement of this long-term growth rate when \(X_0=x\) is the random variable

$$\begin{aligned} r_i(x) = \limsup _{t\rightarrow \infty } \frac{1}{t} \log \Vert A_i(\xi _1,X_0)A_i(\xi _2,X_1)\dots A_i(\xi _t, X_{t-1})\Vert . \end{aligned}$$
(4)

Population \(i\) is tending to show periods of increase when \(r_i(x)>0\) and asymptotically decreasing when \(r_i(x)<0\). Since, in general, the sequence

$$\begin{aligned} \left\{ \frac{1}{t} \log \Vert A_i(\xi _1,X_0)A_i(\xi _2,X_1)\dots A_i(\xi _t, X_{t-1})\Vert \right\} _{t=1}^\infty \end{aligned}$$

does not converge, the \(\limsup _{t\rightarrow \infty }\) instead of \(\lim _{t\rightarrow \infty }\) in the definition of \(r_i(x)\) is necessary. However, as we discuss in Sect. 3.2, the \(\limsup _{t\rightarrow \infty }\) can be replaced by \(\lim _{t\rightarrow \infty }\) on sets of “full measure”.

An expected, yet useful property of \(r_i(x)\) is that \(r_i(x)\le 0\) with probability one whenever \(\Vert x^i\Vert >0\). In words, whenever population \(i\) is present, its per-capita growth rate in the long-term is non-positive. This fact follows from \(X_t^i\) being bounded above for \(t \ge 0\). Furthermore, on the event of \(\{\limsup _{t\rightarrow \infty } \Vert X^i_t\Vert >0\}\), we get that \(r_i(x)=0\) with probability one. In words, if population \(i\)’s density infinitely often is bounded below by some minimal density, then its long-term growth rate is zero as it is not tending to extinction and its densities are bounded from above. Both of these facts are consequences of results proved in Sect. 8 (i.e. Proposition 8.10, Corollary 8.17 and Proposition 8.19).
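In the unstructured case the matrix product in (4) is a product of scalar fitnesses, so the sum of their logarithms telescopes to \(\log (N_t/N_0)\) and the vanishing of the growth rate for a persisting, bounded population can be seen directly. The sketch below assumes a hypothetical scalar model with Beverton-Holt density dependence and two equally likely environmental states; all parameter values are illustrative.

```python
import math
import random

def long_term_growth(n0, t, r_good=4.0, r_bad=1.1, a=0.01, seed=1):
    """(1/t) * sum_s log f(xi_{s+1}, N_s) for N_{s+1} = R_{s+1} N_s / (1 + a N_s)."""
    rng = random.Random(seed)
    n, log_sum = n0, 0.0
    for _ in range(t):
        fitness = (r_good if rng.random() < 0.5 else r_bad) / (1.0 + a * n)
        log_sum += math.log(fitness)
        n *= fitness
    return log_sum / t, n

r_hat, n_final = long_term_growth(50.0, 10000)
```

Since the density stays between fixed positive bounds in this example, \(|\log (N_t/N_0)|\) is bounded and the time average tends to zero at rate \(1/t\).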

Our main result extends the persistence conditions discussed in the introduction to stochastic models of interacting, structured populations. Namely, if the community increases on average when rare, then the community persists. More formally, we prove the following theorem in Sect. 8.

Theorem 3.1

  If there exist positive constants \(p_1,\dots , p_m\) such that

$$\begin{aligned} \sum _i p_i r_i(x)> 0\quad \text{ with } \text{ probability } \text{ one } \end{aligned}$$
(5)

for all \(x\in S_0\), then the model (1) is stochastically persistent.

For two competing species (\(m=2\)) that persist in isolation (i.e. \(r_1(0)>0\) and \(r_2(0)>0\) with probability one), inequality (5) reduces to the classical mutual invasibility condition. To see why, consider a population state \(x=(x_1,0)\) supporting species \(1\). Since species \(1\) can persist in isolation, Proposition 8.19 implies that \(r_1(x)=0\) with probability one. Hence, inequality (5) for this initial condition becomes \(p_1 r_1(x)+ p_2 r_2(x)=p_2 r_2(x)>0\) with probability one for all initial conditions \(x=(x_1,0)\) supporting species \(1\). Similarly, inequality (5) for an initial condition \(x=(0,x_2)\) supporting species \(2\) becomes \(r_1 (x)>0\) with probability one. In words, stochastic persistence occurs if both competitors have a positive per-capita growth rate when rare. A generalization of the mutual invasibility condition to higher dimensional communities is discussed at the end of the next subsection.

3.2 A refinement using invariant measures

The proof of Theorem 3.1 follows from a more general result that we now present. For this result, we show that one need not verify the persistence condition (5) for all \(x\) in the extinction set \(S_0\). It suffices to verify the persistence condition for invariant measures of the process supported by the extinction set.

Definition 3.2

A Borel probability measure \(\mu \) on \(E \times S\) is an invariant measure for the model (1) provided that

  1. (i)

    \({\mathbb {P}}[\xi _t \in B] = \mu (B\times S)\) for all Borel sets \(B\subset E\), and

  2. (ii)

    if \({\mathbb {P}}[(\xi _0,X_0)\in C]= \mu (C)\) for all Borel sets \(C\subset E\times S\), then \({\mathbb {P}}[(\xi _t,X_t)\in C]= \mu (C)\) for all Borel sets \(C\subset E\times S\) and \(t\ge 0\).

Condition (i) ensures that the invariant measure is consistent with the environmental dynamics. Condition (ii) implies that if the system initially follows the distribution of \(\mu \), then it follows this distribution for all time. When this occurs, we say \((\xi _t, X_t)\) is stationary with respect to \(\mu \). One can think of invariant measures as the stochastic analog of equilibria for deterministic dynamical systems; if the population statistics initially follow \(\mu \), then they follow \(\mu \) for all time.

When an invariant measure \(\mu \) is statistically indecomposable, it is ergodic. More precisely, \(\mu \) is ergodic if it cannot be written as a convex combination of two distinct invariant measures, i.e. if there exist \(0<\alpha <1\) and two invariant measures \(\mu _1,\mu _2\) such that \(\mu =\alpha \mu _1 +(1-\alpha ) \mu _2\), then \(\mu _1=\mu _2=\mu \).

Definition 3.3

If \((\xi _t, X_t)\) is stationary with respect to \(\mu \), the subadditive ergodic theorem implies that \(r_i(X_0)\) is well-defined with probability one. Moreover, we call the expected value

$$\begin{aligned} r_i(\mu )=\int \mathbb {E}[r_i(X_0)|X_0=x,\,\, \xi _1=\xi ]\mu (d\xi ,dx) \end{aligned}$$

the long-term growth rate of species \(i\) with respect to \(\mu \). When \(\mu \) is ergodic, the subadditive ergodic theorem implies that \(r_i(X_0)\) equals \(r_i(\mu )\) for \(\mu \)-almost every \((X_0,\xi _1)\).

With these definitions, we can rephrase Theorem 3.1 in terms of the long-term growth rates \(r_i(\mu )\) as well as provide an alternative characterization of the persistence condition.

Theorem 3.4

If one of the following equivalent conditions holds

  1. (i)

    \(r_*(\mu ) := \max _{1\le i\le m} r_i(\mu )>0\) for every invariant probability measure \(\mu \) with \(\mu (S_0)=1\), or

  2. (ii)

    there exist positive constants \(p_1,\dots ,p_m\) such that

    $$\begin{aligned} \sum _i p_i r_i(\mu )>0 \end{aligned}$$

    for every ergodic probability measure \(\mu \) with \(\mu (S_0)=1\), or

  3. (iii)

    there exist positive constants \(p_1,\dots , p_m\) such that

    $$\begin{aligned} \sum _i p_i r_i(x)> 0\quad \text{ with } \text{ probability } \text{ one } \end{aligned}$$

    for all \(x\in S_0\)

then the model (1) is stochastically persistent.

With Theorem 3.4’s formulation of the stochastic persistence criterion, we can introduce a generalization of the mutual invasibility condition to higher-dimensional communities. To state this condition, observe that for any ergodic, invariant measure \(\mu \), there is a unique set of species \(I\subset \{1,\dots ,m\}\) such that \(\mu (\{x: x^{ij}> 0\) for all \(i\in I, 1\le j \le n_i\})=1\). In other words, \(\mu \) supports the community \(I\). Proposition 8.19 implies that \(r_i(\mu )=0\) for all \(i\in I\). Therefore, if \(I\) is a strict subset of \(\{1,\dots ,m\}\), i.e. not all species are in the community \(I\), then coexistence condition (ii) of Theorem 3.4 requires that there exists a species \(i\notin I\) such that \(r_i(\mu )>0\). In other words, the coexistence condition requires that at least one missing species has a positive per-capita growth rate for any subcommunity represented by an ergodic invariant measure. While this weaker condition is sometimes sufficient to ensure coexistence (e.g. in the two species models that we examine), in general it is not as illustrated in Sect. 6.1. Determining, in general, when this “at least one missing species can invade” criterion is sufficient for stochastic persistence is an open problem.

4 Predator-prey dynamics in auto-correlated environments

To illustrate the applicability of Theorems 3.1 and 3.4, we apply the persistence criteria to stochastic models of predator-prey interactions, stage-structured populations with over-compensating density-dependence, and transitive and intransitive competition in spatially heterogeneous environments.

For unstructured populations, Theorem 3.4 extends Schreiber et al. (2011)’s criteria for persistence to temporally correlated environments. These temporal correlations can have substantial consequences for coexistence as we illustrate now for a stochastic model of predator-prey interactions. In the absence of the predator, assume the prey, with density \(N_t\) at time \(t\), exhibits a noisy Beverton-Holt dynamic

$$\begin{aligned} N_{t+1} = \frac{R_{t+1}N_t}{1+a\,N_t} \end{aligned}$$
(6)

where \(R_t\) is a stationary, ergodic sequence of random variables corresponding to the intrinsic fitness of the prey at time \(t\), and \(a>0\) corresponds to the strength of intraspecific competition. To ensure the persistence of the prey in the absence of the predator, assume \(\mathbb {E}[\ln R_1]>0\) and \(\mathbb {E}[\ln R_1]<\infty \). Under these assumptions, Theorem 1 of Benaïm and Schreiber (2009) implies that \(N_t\) converges in distribution to a positive random variable \(\widehat{N}\) whenever \(N_0>0\). Moreover, the empirical measures \(\Pi ^{(N,P)}_t\) with \(N>0,P=0\) converge almost surely to the law \(\nu \) of the random vector \((\widehat{N},0)\) i.e. the probability measure satisfying \(\nu (A)={\mathbb {P}}[(\widehat{N}, 0)\in A]\) for any Borel set \(A\subset \mathbb R^2_+\).

Let \(P_t\) be the density of predators at time \(t\) and \(\exp (-bP_t)\) be the fraction of prey that “escape” predation during generation \(t\) where \(b\) is the predator attack rate. The mean number of predator offspring produced per consumed prey is \(c\), while \(s\) corresponds to the fraction of predators that survive to the next time step. The predator-prey dynamics are

$$\begin{aligned} \begin{aligned} N_{t+1}&= \frac{R_{t+1}N_t}{1+a\,N_t} \exp (-bP_t) \\ P_{t+1}&= c N_t (1-\exp (-bP_t))+s P_t. \end{aligned} \end{aligned}$$
(7)

To see that (7) is of the form of our models (1), we can expand the exponential term in the second equation. To ensure that (7) satisfies the assumptions of Theorem 3.1, we assume \(R_t\) takes values in the half open interval \((0,R^*]\). Since \(N_{t+1} \le R_{t+1}/a \le R^*/a\) and \(P_{t+1} \le c N_t + sP_t \le cR^*/a+sP_t\), \(X_t=(N_t,P_t)\) eventually enters and remains in the compact set

$$\begin{aligned} S=[0,R^*/a]\times [0,c R^*/(a(1-s))]. \end{aligned}$$
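The invariance of \(S\) can be illustrated by simulation. The sketch below iterates the predator-prey dynamics (7) from an initial condition inside \(S\); drawing \(R_t\) uniformly from \((0,R^*]\) and the specific parameter values are assumptions made only for this illustration.

```python
import math
import random

def predator_prey_step(n, p, r, a=0.01, b=0.01, c=1.0, s=0.1):
    """One step of (7): prey escape predation with probability exp(-b P_t)."""
    escape = math.exp(-b * p)
    n_next = r * n / (1.0 + a * n) * escape
    p_next = c * n * (1.0 - escape) + s * p
    return n_next, p_next

def simulate(n0, p0, t, r_max=4.0, seed=2):
    """Trajectory of (7) with R_t drawn uniformly from (0, r_max] (an assumption)."""
    rng = random.Random(seed)
    path = [(n0, p0)]
    for _ in range(t):
        r = r_max * (1.0 - rng.random())  # 1 - U lies in (0, 1]
        n, p = path[-1]
        path.append(predator_prey_step(n, p, r))
    return path

a, c, s, r_max = 0.01, 1.0, 0.1, 4.0
n_bound, p_bound = r_max / a, c * r_max / (a * (1.0 - s))
path = simulate(100.0, 50.0, 5000)
inside = all(0.0 <= n <= n_bound and 0.0 <= p <= p_bound for n, p in path)
```

Every simulated state remains in the rectangle \([0,R^*/a]\times [0,cR^*/(a(1-s))]\), as the two inequalities above predict for any trajectory that starts in \(S\).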

To apply Theorem 3.1, we need to evaluate \(r_i((N,P))\) for all \(N\ge 0, P\ge 0\) with either \(N=0\) or \(P=0\). Since \((0,P_t)\) converges to \((0,0)\) with probability one whenever \(P_0\ge 0\), we have \(r_1((0,P))=\mathbb {E}[\ln R_t]>0\) and \(r_2((0,P))=\ln s<0\) whenever \(P\ge 0\). Since \(\Pi ^{(N,0)}_t\) with \(N>0\) converges almost surely to \(\nu \), Proposition 8.19 implies \(r_1((N,0))=0\). Moreover,

$$\begin{aligned} r_2((N,0))&= \mathbb {E}\left[ \ln \left( cb \widehat{N} +s \right) \right] \nonumber \\&= \int \ln (c b x + s) \nu (dx). \end{aligned}$$
(8)

By choosing \(p_1=1-\varepsilon \) and \(p_2= \varepsilon >0\) for \(\varepsilon \) sufficiently small (e.g. \(0.5\mathbb {E}[\ln R_t]/(\mathbb {E}[\ln R_t]- \ln s)\)), we have \(\sum _i p_i r_i((N,P))>0\) whenever \(NP=0\) if and only if

$$\begin{aligned} \mathbb {E}\left[ \ln \left( cb \widehat{N} +s \right) \right] >0. \end{aligned}$$
(9)

Namely, the predator and prey coexist whenever the predator can invade the prey-only system. Since \(\ln (cb N +s)\) is a concave function of the prey density and the predator life history parameters \(c,b,s\), Jensen’s inequality implies that fluctuations in any one of these quantities decrease the predator’s growth rate.
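The role of the weights \(p_1=1-\varepsilon \) and \(p_2=\varepsilon \) can be made explicit with a short computation that uses only the growth rates evaluated above. For initial conditions \((0,P)\) and \((N,0)\) with \(N>0\), respectively,

$$\begin{aligned} (1-\varepsilon )\,r_1((0,P))+\varepsilon \,r_2((0,P))&= \mathbb {E}[\ln R_t]-\varepsilon \left( \mathbb {E}[\ln R_t]-\ln s\right) ,\\ (1-\varepsilon )\,r_1((N,0))+\varepsilon \,r_2((N,0))&= \varepsilon \,\mathbb {E}\left[ \ln \left( cb \widehat{N} +s \right) \right] . \end{aligned}$$

The first expression is positive if and only if \(\varepsilon <\mathbb {E}[\ln R_t]/(\mathbb {E}[\ln R_t]-\ln s)\); with \(\varepsilon \) equal to half of this bound it equals \(0.5\,\mathbb {E}[\ln R_t]>0\). The second expression is positive if and only if (9) holds.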

To see how temporal correlations influence whether the persistence criterion (9) holds or not, consider an environment that fluctuates randomly between good and bad years for the prey. In good years, \(R_t\) takes on the value \(R_{good}\), while in bad years it takes on the value \(R_{bad}\). Let the transitions between good and bad years be determined by a Markov chain where the probability of going from a bad year to a good year is \(p\) and the probability of going from a good year to a bad year is \(q\). For simplicity, we assume that \(p=q\) in which case half of the years are good and half of the years are bad in the long run. Under these assumptions, the persistence assumption \(\mathbb {E}[\ln R_1]>0\) for the prey is \(\ln \left( R_{good} R_{bad}\right) >0.\)
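For intermediate values of \(p\), the left-hand side of (9) can be estimated by simulating the prey-only dynamics (6) in this two-state Markovian environment and averaging \(\ln (cbN_t+s)\) over time. The following is a rough Monte Carlo sketch, not the procedure used to produce the figure; the run length, burn-in, seed, and initial density are arbitrary assumptions, and the parameter values are those listed in the caption of Fig. 1.

```python
import math
import random

def predator_invasion_rate(p_switch, t=100000, burn_in=1000, r_good=4.0, r_bad=1.1,
                           a=0.01, b=0.01, c=1.0, s=0.1, seed=3):
    """Monte Carlo estimate of E[ln(c b N_hat + s)] for the prey-only dynamics (6).

    The environment switches between good and bad years with probability
    p_switch each step (the case p = q).
    """
    rng = random.Random(seed)
    good, n, total = True, 50.0, 0.0
    for step in range(t + burn_in):
        if rng.random() < p_switch:
            good = not good
        n = (r_good if good else r_bad) * n / (1.0 + a * n)
        if step >= burn_in:
            total += math.log(c * b * n + s)
    return total / t

r_negative_corr = predator_invasion_rate(0.9)   # years tend to alternate
r_positive_corr = predator_invasion_rate(0.1)   # long runs of good or bad years
```

With these settings the estimated invasion growth rate is larger when years tend to alternate than when good and bad years come in long runs.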

To estimate the left-hand side of (9), we consider the limiting cases of strongly negatively correlated environments (\(p\approx 1\)) and strongly positively correlated environments (\(p\approx 0\)). When \(p\approx 1\), the environmental dynamics are nearly periodic switching nearly every other time step between good and bad years. Hence, one can approximate the stationary distribution \(\widehat{N}\) by the positive, globally stable fixed point of

$$\begin{aligned} \begin{aligned} x_{t+2}&= \frac{R_{good} x_{t+1}}{1+a x_{t+1}}\\&= \frac{R_{good}R_{bad} x_t /(1+a x_t)}{1+ a (R_{bad} x_t/(1+ax_t))}\\&= \frac{R_{good}R_{bad} x_t}{1+a(1+R_{bad})x_t} \end{aligned} \end{aligned}$$

which is given by \(\frac{R_{good}R_{bad}-1}{a(1+R_{bad})}\). Hence, if \(p\approx 1\), then the distribution \(\nu \) of \(\widehat{N}\) approximately puts half of its weight on \(\frac{R_{good}R_{bad}-1}{a(1+R_{bad})}\) and half of its weight on \(\frac{R_{good}R_{bad}-1}{a(1+R_{good})}\) and the persistence criterion (9) is approximately

$$\begin{aligned} \frac{1}{2} \ln \left( bc\frac{R_{good}R_{bad}-1}{a(1+R_{bad})} + s\right) + \frac{1}{2} \ln \left( bc\frac{R_{good}R_{bad}-1}{a(1+R_{good})}+s \right) >0. \end{aligned}$$
(10)

Next, consider the case that \(p\approx 0\) in which there are long runs of good years and long runs of bad years. Due to these long runs, one expects that half the time \(\widehat{N}\) is near the value \((R_{good}-1)/a\) and half the time it is near the value \(\max \{(R_{bad}-1)/a,0\}\). If \(R_{bad}>1\), then the persistence criterion is approximately

$$\begin{aligned} \frac{1}{2} \ln \left( bc\frac{R_{good}-1}{a}+s \right) + \frac{1}{2} \ln \left( bc\frac{R_{bad}-1}{a}+s \right) >0. \end{aligned}$$
(11)

Relatively straightforward algebraic manipulations (e.g. exponentiating the left hand sides of (10) and (11) and multiplying by \((1+R_{bad})(1+R_{good})\)) show that the left hand side of (10) is always greater than the left hand side of (11).
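As a numerical sanity check, the two limiting approximations can be evaluated for the parameter values listed in the caption of Fig. 1. The sketch below first iterates the strictly alternating dynamics to confirm the two-point cycle used in (10), then evaluates the left-hand sides of (10) and (11); the helper names are illustrative.

```python
import math

r_good, r_bad, a, b, c, s = 4.0, 1.1, 0.01, 0.01, 1.0, 0.1  # Fig. 1 parameters

def beverton_holt(x, r):
    return r * x / (1.0 + a * x)

# Strictly alternating years: iterate the two-step map to its fixed point.
x = 1.0
for _ in range(200):
    x = beverton_holt(beverton_holt(x, r_bad), r_good)
after_good, after_bad = x, beverton_holt(x, r_bad)

def log_rate(prey):
    return math.log(b * c * prey + s)

lhs_10 = 0.5 * log_rate(after_good) + 0.5 * log_rate(after_bad)
lhs_11 = 0.5 * log_rate((r_good - 1.0) / a) + 0.5 * log_rate((r_bad - 1.0) / a)
```

For these parameters the left-hand side of (10) is approximately \(0.147\) and that of (11) is approximately \(-0.239\), so the approximations predict predator persistence in the strongly negatively correlated limit and exclusion in the strongly positively correlated limit.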

Biological Interpretation 4.1

Positive autocorrelations, by increasing variability in prey density, hinder predator establishment and, thereby, coexistence of the predator and prey. In contrast, negative autocorrelations, by reducing variability in prey density, can facilitate predator-prey coexistence (Fig. 1).

Fig. 1

Effect of temporal autocorrelations on predator-prey coexistence in a Markovian environment. a The long-term growth rate \(r_2((N,0))\) with \(N>0\) of the predator when rare is plotted as a function of the temporal autocorrelation between good and bad reproductive years for the prey. b, c The mean and interquartile ranges of the long-term distribution of prey and predator densities are plotted as a function of the temporal autocorrelation. Parameters: \(R_{good}=4, R_{bad}=1.1, a=0.01, c=1, s=0.1, b=0.01\)

5 Application to structured single species models

For single species models with negative density-dependence, we can prove sufficient and necessary conditions for stochastic persistence. The following theorem implies that stochastic persistence occurs if the long-term growth rate \(r_1(0)\) when rare is positive and asymptotic extinction occurs with probability one if this long-term growth rate is negative.

Theorem 5.1

  Assume that \(m=1\) (i.e. there is one species), H1-H4 hold and the entries of \(A(\xi ,x)=A_1(\xi ,x)\) are non-increasing functions of \(x\). If \(r_1(0)>0\), then

$$\begin{aligned} X_{t+1} = X_t A(\xi _{t+1},X_t) \end{aligned}$$
(12)

is stochastically persistent. If \(r_1(0)<0\), then \(\lim _{t\rightarrow \infty } X_t = (0,0,\dots ,0)\) with probability one.

Our assumption that the entries \(A(\xi ,x)\) are non-increasing functions of \(x\) ensures that \(r_1(0)\ge r_1(x)\) for all \(x\) which is the key fact used in the proof of Theorem 5.1. It remains an open problem to identify other conditions on \(A(\xi ,x)\) that ensure \(r_1(0)\ge r_1(x)\) for all \(x\).

Proof

The first statement of this theorem follows from Theorem 3.1.

Assume that \(r_1(0)<0\). Provided that \(X_0\) is nonnegative with at least one strictly positive entry, Ruelle’s stochastic version of the Perron Frobenius Theorem (Ruelle 1979b, Proposition 3.2) and the entries of \(A(\xi ,x)\) being non-increasing in \(x\) imply

$$\begin{aligned} \limsup _{t\rightarrow \infty } \frac{1}{t} \log \Vert X_t \Vert \le \lim _{t\rightarrow \infty } \frac{1}{t} \log \Vert X_0 A(\xi _1,0)\dots A(\xi _t,0)\Vert = r_1(0)<0 \end{aligned}$$

with probability one. Hence, \(\lim _{t\rightarrow \infty } X_t=(0,\dots ,0)\) with probability one. \(\square \)

Theorem 5.1 extends Theorem 1 of Benaïm and Schreiber (2009) as it allows for over-compensating density dependence and makes no assumptions about differentiability of \(x\mapsto A(\xi ,x)\). To illustrate its utility, we apply this result to the larvae-pupae-adult model of flour beetles and a metapopulation model.

5.1 A stochastic Larvae-Pupae-Adult model for flour beetles

An important, empirically validated model in ecology is the “Larvae-Pupae-Adult” (LPA) model which describes flour beetle population dynamics (Costantino et al. 1995; Dennis et al. 1995; Costantino et al. 1997). The model keeps track of the densities \(\ell _t, p_t, a_t\) of larvae, pupae, and adults at time \(t\). Adults produce \(b\) eggs each time step. These eggs are cannibalized by adults and larvae at rates \(c_{ea}\) and \(c_{el}\), respectively. The eggs escaping cannibalism become larvae. A fraction \(\mu _l\) of larvae die at each time step. Larvae escaping mortality become pupae. Pupae are cannibalized by adults at a rate \(c_{pa}\). Those individuals escaping cannibalism become adults. A fraction \(\mu _a\) of adults die at each time step. These assumptions result in a system of three difference equations

$$\begin{aligned} \begin{aligned} \ell _{t+1}&= b a_t \exp (-c_{el} \ell _t - c_{ea} a_t) \\ p_{t+1}&= (1-\mu _l) \ell _t \\ a_{t+1}&= \left( p_t \exp (-c_{pa} a_t)+ (1-\mu _a) a_t\right) \end{aligned} \end{aligned}$$
(13)

Environmental fluctuations have been included in these models in at least two ways. Dennis et al. (1995) assumed that each stage experienced random fluctuations due to multiplicative factors \(\exp (\xi _t^l),\exp (\xi _t^p),\exp (\xi _t^a)\) such that \(\xi _t^i\) for \(i=l,p,a\) are independent and normally distributed with mean zero, i.e. on the log-scale the average effect of environmental fluctuations is accounted for by the deterministic model. Alternatively, Henson and Cushing (1997) considered periodic fluctuations in cannibalism rates due to fluctuations in the size \(V_t\) of the habitat, i.e. the volume of the flour. In particular, they assumed that \(c_i = \kappa _i/V_t\) for \(i=ea,el,pa\), for positive constants \(\kappa _i\). If we include both of these effects in the deterministic model, we arrive at the following system of random difference equations:

$$\begin{aligned} \begin{aligned} \ell _{t+1}&= b a_t \exp (-\kappa _{el} \ell _t/V_{t+1} - \kappa _{ea} a_t/V_{t+1}+ \xi _t^l)\\ p_{t+1}&= (1-\mu _l) \ell _t \exp (\xi _t^p)\\ a_{t+1}&= \left( p_t \exp (-\kappa _{pa} a_t/V_{t+1})+ (1-\mu _a) a_t\right) \exp ( \xi _t^a). \end{aligned} \end{aligned}$$
(14)
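As an illustration, the stochastic LPA model (14) can be iterated numerically. The following Python sketch is a minimal example under stated assumptions: the function name `simulate_lpa`, the truncation bound `M = 1`, the uniform distribution chosen for \(\log V_t\), the cap on adult survival, and all parameter values are hypothetical choices made for illustration and are not taken from the cited studies.

```python
import math
import random

def simulate_lpa(b, mu_l, mu_a, kappa, sigma, T, x0=(10.0, 10.0, 10.0), seed=0):
    """Iterate the stochastic LPA model (14) for T steps.

    kappa = (kappa_el, kappa_ea, kappa_pa); sigma is the standard deviation of
    the stage-specific log-scale noise before truncation (illustrative choice).
    """
    rng = random.Random(seed)
    M = 1.0  # hypothetical bound: xi_t^i and log V_t are kept in [-M, M]
    k_el, k_ea, k_pa = kappa
    l, p, a = x0
    path = [(l, p, a)]
    for _ in range(T):
        # Truncated normal noise, drawn independently for each stage.
        xi_l, xi_p, xi_a = (max(-M, min(M, rng.gauss(0.0, sigma))) for _ in range(3))
        # Habitat volume with log V uniform on [-M, M] (an assumption).
        V = math.exp(rng.uniform(-M, M))
        # Adult survival (1 - mu_a) exp(xi_a), capped strictly below one.
        surv_a = min((1.0 - mu_a) * math.exp(xi_a), 0.99)
        l_new = b * a * math.exp(-k_el * l / V - k_ea * a / V + xi_l)
        p_new = (1.0 - mu_l) * l * math.exp(xi_p)
        a_new = p * math.exp(-k_pa * a / V + xi_a) + surv_a * a
        l, p, a = l_new, p_new, a_new
        path.append((l, p, a))
    return path

# Example: a birth rate far below mu_a / (1 - mu_l) = 0.2 drives the population down.
low_b_path = simulate_lpa(b=0.001, mu_l=0.5, mu_a=0.1,
                          kappa=(0.01, 0.01, 0.01), sigma=0.1, T=500)
```

Running the same function with a larger birth rate gives a trajectory that can be inspected for the behaviour described by the persistence results of this section.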

We can use Theorem 3.4 to prove the following persistence result. In the case of \(\xi _t^i =0\) with probability one for \(i=l,p,a\), this theorem can be viewed as a stochastic extension of Theorem 4 of Henson and Cushing (1997) for periodic environments.

Theorem 5.2

Assume \(\kappa _i>0\) for \(i=ea,el,pa\), \(\mu _i \in (0,1)\) for \(i=l,a\), and that \(\xi _t^l, \xi _t^p, \xi _t^a\), and \(V_t\) are ergodic and stationary sequences such that \(\xi _t^i, \log V_t \in (-M,M)\) for \(i=l,p,a\), \(t\ge 0\) and some \(M>0\), and \((1-\mu _a)\exp (\xi ^a_t) \in [0,1-\delta ]\) for some \(\delta >0\) with probability one. Then there exists a critical birth rate \(b_{crit}>0\) such that

Extinction:

If \(b<b_{crit}\), then \(X_t=(\ell _t,p_t,a_t)\) converges almost surely to \((0,0,0)\) as \(t\rightarrow \infty \).

Stochastic persistence:

If \(b>b_{crit}\), then the LPA model is stochastically persistent.

Moreover, if \(\xi _t^l=\xi _t^a=\xi _t^p\) with probability one and \(\mathbb {E}[ \xi _t^l]=0\), then \(b_{crit}= \mu _a /(1-\mu _l)\).

Remark The assumption that \(\xi _t^i\) are compactly supported formally excludes the normal distributions used by Dennis et al. (1995). However, truncated normals with a very large \(M\) can approximate the normal distribution arbitrarily well. The assumption \((1-\mu _a)\exp (\xi ^a_t)\in [0,1-\delta ]\) for some \(\delta >0\) is more restrictive. However, from a biological standpoint, it is necessary as this term corresponds to the fraction of adults surviving to the next time step. Nonetheless, we conjecture that the conclusions of Theorem 5.2 hold when \(\xi _t^i\) are normally distributed with mean \(0\).

Theorem 5.2 implies that including multiplicative noise with \(\log \)-mean zero has no effect on the deterministic persistence criterion when \(\xi _t^l=\xi _t^p=\xi _t^a\) with probability one. However, when these random variables are not perfectly correlated, we conjecture that this form of multiplicative noise always decreases the critical birth rate (Fig. 2). To provide some mathematical evidence for this conjecture, we compute a small noise approximation for the per-capita growth rate \(r_1(\delta _0)\) when the population is rare (Ruelle 1979b; Tuljapurkar 1990). Let

$$\begin{aligned} \begin{aligned} B_t=&\begin{pmatrix} 0&{}\quad (1-\mu _l)\exp (\xi _t^p)&{}\quad 0\\ 0&{}\quad 0&{}\quad \exp (\xi _t^a) \\ b\exp (\xi _t^l)&{}\quad 0 &{}\quad (1-\mu _a)\exp (\xi _t^a) \end{pmatrix}\\ \end{aligned} \end{aligned}$$

be the linearization of the stochastic LPA model (14) at \((\ell ,p,a)=(0,0,0)\). Assume that \(\xi _t^i = \varepsilon Z_t^i\) where \(\mathbb {E}[Z_t^i]=0\) and \(\mathbb {E}[\left( Z_t^i\right) ^2]=1\). Ruelle (1979b, Theorem 3.1) implies that \(r_1(0)\) is an analytic function of \(\varepsilon \). Therefore, one can perform a Taylor series expansion of \(r_1(0)\) as a function of \(\varepsilon \) about the point \(\varepsilon =0\). As we shall shortly show, the first non-zero term of this expansion is of second order. Expanding \(B_t\) to second order in \(\varepsilon \) yields

$$\begin{aligned} \begin{aligned} B_t \approx&\underbrace{ \begin{pmatrix} 0&{}\,\,(1-\mu _l)&{}\,0\\ 0&{}\,0&{}\, 1 \\ b&{}\,0 &{}\,\,(1-\mu _a) \end{pmatrix}}_{=B}\left( I \!+\! \varepsilon \, \mathrm {diag}\{Z_t^l,Z_t^p,Z_t^a\}\!+\! \varepsilon ^2\mathrm {diag}\{Z_t^l,Z_t^p,Z_t^a\}^2/2\right) . \end{aligned} \end{aligned}$$

The entries of the second order term are positive due to the convexity of the exponential function. Hence, Jensen’s inequality implies that fluctuations in \(Z_t^i\) increase the mean matrix \(\mathbb {E}[B_t]\). This observation, in and of itself, suggests that fluctuations in \(Z_t^i\) increase \(r_1(0)\). However, to rigorously verify this assertion, let \(v\) and \(w\) be the left and right Perron-eigenvectors of \(B\) such that \(\sum _i v_i=1\) and \(\sum _i v_iw_i=1\). Let \(r_0\) be the associated Perron eigenvalue of \(B\). Provided the \(Z_t^i\) are independent in time, a small noise approximation for the stochastic growth rate of the random products of \(B_t\) is

$$\begin{aligned} r_1(\delta _0)\approx \log r_0 + \frac{\varepsilon ^2}{2}\left( \mathbb {E}\left[ \sum _i v_i w_i \left( Z_t^i\right) ^2\right] - \mathbb {E}\left[ \left( \sum _i v_i w_i Z_t^i\right) ^2\right] \right) . \end{aligned}$$
(15)

Since the function \(x\mapsto x^2\) is strictly convex and \(\sum _i v_i w_i (Z_t^i)^2\) is a convex combination of \((Z_t^l)^2, (Z_t^p)^2\), and \((Z_t^a)^2\), Jensen’s inequality implies

$$\begin{aligned} \left( \sum _i v_i w_i Z_t^i\right) ^2\le \sum _i v_i w_i \left( Z_t^i\right) ^2. \end{aligned}$$

Therefore

$$\begin{aligned} \mathbb {E}\left[ \left( \sum _i v_i w_i Z_t^i\right) ^2\right] \le \mathbb {E}\left[ \sum _i v_i w_i \left( Z_t^i\right) ^2\right] . \end{aligned}$$

It follows that the order \(\varepsilon ^2\) correction term in (15) is non-negative and equals zero if and only if \(Z_t^l=Z_t^p=Z_t^a\) with probability one. Therefore, “small” multiplicative noise (with \(\log \)-mean zero) which is not perfectly correlated across the stages increases the stochastic growth rate and, therefore, decreases the critical birth rate \(b_{crit}\) required for stochastic persistence.
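The sign of the second-order term in (15) can be checked numerically for a concrete matrix \(B\). The Python sketch below is illustrative only: it computes the Perron eigenvalue and eigenvectors by power iteration and evaluates the bracketed term in (15) for unit-variance noise with a prescribed correlation matrix. The names `perron_vectors` and `second_order_coefficient`, and the values of `b` and `mu_l`, are assumptions introduced for this example; `mu_a` is the value quoted in the caption of Fig. 2.

```python
def perron_vectors(B, iters=2000):
    """Power iteration for the Perron eigenvalue r0 and the left (v B = r0 v)
    and right (B w = r0 w) Perron eigenvectors of a primitive matrix B."""
    n = len(B)
    v = [1.0 / n] * n
    w = [1.0 / n] * n
    r0 = 1.0
    for _ in range(iters):
        v_new = [sum(v[i] * B[i][j] for i in range(n)) for j in range(n)]
        w_new = [sum(B[i][j] * w[j] for j in range(n)) for i in range(n)]
        r0 = sum(v_new)              # valid because sum(v) == 1 before the update
        v = [x / r0 for x in v_new]
        s = sum(w_new)
        w = [x / s for x in w_new]
    # Normalise so that sum_i v_i = 1 and sum_i v_i w_i = 1, as in the text.
    vw = sum(x * y for x, y in zip(v, w))
    w = [x / vw for x in w]
    return r0, v, w

def second_order_coefficient(v, w, corr):
    """Bracketed term in (15) when E[Z^i Z^j] = corr[i][j] and E[(Z^i)^2] = 1."""
    q = [x * y for x, y in zip(v, w)]    # weights v_i w_i, which sum to one
    n = len(q)
    first = sum(q[i] * corr[i][i] for i in range(n))
    second = sum(q[i] * q[j] * corr[i][j] for i in range(n) for j in range(n))
    return first - second

# Hypothetical parameter values for the deterministic matrix B.
b, mu_l, mu_a = 10.0, 0.5, 0.1034
B = [[0.0, 1.0 - mu_l, 0.0],
     [0.0, 0.0, 1.0],
     [b, 0.0, 1.0 - mu_a]]
r0, v, w = perron_vectors(B)

independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
identical = [[1.0] * 3 for _ in range(3)]    # Z^l = Z^p = Z^a
gain_independent = second_order_coefficient(v, w, independent)
gain_identical = second_order_coefficient(v, w, identical)
```

For independent stage-specific noise the coefficient reduces to \(1-\sum _i (v_iw_i)^2\), while for perfectly correlated noise it vanishes, in line with the discussion above.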

Fig. 2

Effects of fluctuations in fecundity and larval survival on the critical birth rate \(b\) required for persistence. \((\xi _t^l)\) are normally distributed with mean \(0\) and variance one, \(\xi _t^a=\xi _t^p=0\) for all \(t\) and \(\mu _a=0.1034\) (the value found in Table 1D in Costantino et al. (1995))

Biological Interpretation 5.3

For the LPA model, there is a critical mean fecundity, above which the population persists and below which the population goes asymptotically to extinction. Fluctuations in the log survival rates decrease the critical mean fecundity unless the log survival rates are perfectly correlated.

Proof of Theorem 5.2

We begin by verifying H1-H4. H1 and H2 follow from our assumptions. To verify H3, notice that the sign structure of the nonlinear projection matrix \(A_t(\xi ,X)\) for (14) is given by

$$\begin{aligned} C= \begin{pmatrix} 0&{}\quad 1&{}\quad 0\\ 0&{}\quad 0&{}\quad 1\\ 1&{}\quad 0&{}\quad 1 \end{pmatrix}. \end{aligned}$$

Since

$$\begin{aligned} C^4= \begin{pmatrix} 1&{}\quad 1&{}\quad 1\\ 1&{}\quad 1&{}\quad 2\\ 2&{}\quad 1&{}\quad 3 \end{pmatrix} \end{aligned}$$

has strictly positive entries, \(C\) is primitive. As \(A_t(\xi ,X)\) has the sign structure of \(C\) for all \(\xi ,X\) and \(t\), H3 holds. Finally, to verify H4, define

$$\begin{aligned} K=b e^{2M-1}/ \kappa _{ea} \end{aligned}$$

Then

$$\begin{aligned} \ell _{t+1} \le b a_t \exp (-\kappa _{ea} a_t/V_{t+1}+ \xi _t^l)\le b a_t \exp (-\kappa _{ea} a_t \exp (-M) +M) \le K \end{aligned}$$

for all \(t\ge 0\). Therefore, \(\ell _t\le K\) for \(t\ge 1\) and

$$\begin{aligned} p_{t} \le \ell _{t-1} e^M \le Ke^M \end{aligned}$$

for all \(t\ge 2\). Hence,

$$\begin{aligned} a_{t+1} \le p_{t}e^M + (1-\delta ) a_t \end{aligned}$$

for all \(t \ge 2\) which implies \(a_t \le Ke^{3M}/\delta \) for \(t\) sufficiently large. The compact forward invariant set \(S=[0,K]\times [0,Ke^M] \times [0,Ke^{3M}/\delta ]\) satisfies H4.
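Both computations used so far in this proof are easy to confirm numerically. The short Python sketch below recomputes the powers of the sign matrix \(C\) and checks the bound \(b a \exp (-\kappa _{ea} a e^{-M}+M)\le K\) on a grid of adult densities. The helper `matmul`, the grid, and the values of `b`, `kappa_ea` and `M` are hypothetical choices for illustration.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Sign structure of the LPA projection matrix.
C = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 1]]
C2 = matmul(C, C)
C3 = matmul(C2, C)
C4 = matmul(C3, C)   # first power of C with all entries positive

# Bound on larval density: K = b e^{2M-1} / kappa_ea (illustrative values).
b, kappa_ea, M = 10.0, 0.01, 1.0
K = b * math.exp(2.0 * M - 1.0) / kappa_ea
grid = [0.5 * k for k in range(4001)]          # adult densities 0, 0.5, ..., 2000
worst = max(b * a * math.exp(-kappa_ea * a * math.exp(-M) + M) for a in grid)
```

The maximum of the bounding function is attained at \(a=e^{M}/\kappa _{ea}\), where it equals \(K\), so `worst` should not exceed `K` up to rounding.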

At low density we get

$$\begin{aligned} B_t=A(\xi _t,0)= \begin{pmatrix} 0&{}\quad (1-\mu _l)\exp (\xi _t^p)&{}\quad 0\\ 0&{}\quad 0 &{}\quad \exp (\xi _t^a) \\ b\exp (\xi _t^l)&{}\quad 0 &{}\quad (1-\mu _a)\exp (\xi _t^a) \end{pmatrix}. \end{aligned}$$

Define \(r(b)\) to be the dominant Lyapunov exponent of the random products of \(B_1,B_2,\dots \). Note that with the notation of Theorem 3.4, \(r(b)=r_1(0)\). Theorem 3.1 of Ruelle (1979b) implies that \(r(b)\) is differentiable for \(b>0\) and the derivative is given by (see, e.g., section 4.1 of Ruelle 1979b)

$$\begin{aligned} r'(b)= \mathbb {E}\left[ \frac{v_t(b) E_{31} w_{t+1}(b)}{v_t(b) B_t(b) w_{t+1}(b)}\right] >0 \end{aligned}$$

where \(v_t(b),w_t(b)\) are the normalized left and right invariant sub-bundles associated with \(B_t(b)\) and \(E_{31}\) is the matrix with \(\exp (\xi _t^l)\) in the \(3-1\) entry and \(0\) entries otherwise. Since the numerator and denominator in the expectation are always positive, \(r(b)\) is a strictly increasing function of \(b\). Since \(r(b)<0\) for all sufficiently small \(b>0\) and \(\lim _{b\rightarrow \infty } r(b)=\infty \), there exists \(b_{crit}>0\) such that \(r(b)<0\) for \(b<b_{crit}\) and \(r(b)>0\) for \(b>b_{crit}\).

If \(b>b_{crit}\), then \(r(b)>0\) and Theorem 5.1 implies that (14) is stochastically persistent. On the other hand, if \(b<b_{crit}\), then \(r(b)<0\) and Theorem 5.1 implies that \((\ell _t,p_t,a_t)\) converges to \((0,0,0)\) with probability one as \(t\rightarrow \infty \).

The final assertion about the stochastic LPA model follows from observing that if \(\xi _t^l=\xi _t^p=\xi _t^a\) with probability one for all \(t\), then

$$\begin{aligned} B_t = \begin{pmatrix} 0&{}\quad (1-\mu _l)&{}\quad 0\\ 0&{}\quad 0&{}\quad 1 \\ b&{}\quad 0 &{}\quad (1-\mu _a) \end{pmatrix}\exp (\xi _t^l) \end{aligned}$$

with probability one. Hence, \(r(b)=\log r_0(b) + \mathbb {E}[\xi _t^l]\) where \(r_0(b)\) is the dominant eigenvalue of the deterministic matrix

$$\begin{aligned} \begin{pmatrix} 0&{}\quad (1-\mu _l)&{}\quad 0\\ 0&{}\quad 0&{}\quad 1 \\ b&{}\quad 0 &{}\quad (1-\mu _a) \end{pmatrix}. \end{aligned}$$

Therefore, if \(\mathbb {E}[\xi _t^l]=0\), then \(r(b)=\log r_0(b)\). Using the Jury conditions, Henson and Cushing (1997) showed that \(r_0(b)>1\) if \(b>\mu _a/(1-\mu _l)\) and \(r_0(b)<1\) if \(b<\mu _a/(1-\mu _l)\). Hence, when \(\xi _t^l=\xi _t^p=\xi _t^a\) with probability one and \(\mathbb {E}[\xi _t^l]=0\), \(b_{crit}\) equals \(\mu _a/(1-\mu _l)\) as claimed. \(\square \)
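The threshold \(b_{crit}=\mu _a/(1-\mu _l)\) can also be confirmed directly. The characteristic polynomial of the deterministic matrix above is \(\lambda ^3-(1-\mu _a)\lambda ^2-(1-\mu _l)b\), which has a single positive root for \(b>0\). The Python sketch below locates that root by bisection; the function name `dominant_eigenvalue` and the value of `mu_l` are hypothetical, and `mu_a` is the value quoted in the caption of Fig. 2.

```python
def dominant_eigenvalue(b, mu_l, mu_a):
    """Positive root of lambda^3 - (1 - mu_a) lambda^2 - (1 - mu_l) b = 0,
    i.e. the Perron eigenvalue r_0(b) of the deterministic LPA matrix."""
    def p(lam):
        return lam**3 - (1.0 - mu_a) * lam**2 - (1.0 - mu_l) * b

    lo, hi = 0.0, 1.0
    while p(hi) < 0.0:          # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):        # bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if p(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu_l, mu_a = 0.5, 0.1034        # mu_l is an illustrative value
b_crit = mu_a / (1.0 - mu_l)
r0_below = dominant_eigenvalue(0.9 * b_crit, mu_l, mu_a)
r0_at = dominant_eigenvalue(b_crit, mu_l, mu_a)
r0_above = dominant_eigenvalue(1.1 * b_crit, mu_l, mu_a)
```

Evaluating the polynomial at \(\lambda =1\) gives \(\mu _a-(1-\mu _l)b\), which changes sign exactly at \(b=b_{crit}\); the bisection merely reproduces this numerically.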

5.2 Metapopulation dynamics

Interactions between movement and spatio-temporal heterogeneities determine how quickly a population grows or declines. Understanding the precise nature of these interactive effects is a central issue in population biology receiving increasing attention from theoretical, empirical, and applied perspectives (Petchey et al. 1997; Lundberg et al. 2000; Gonzalez and Holt 2002; Schmidt 2004; Roy et al. 2005; Boyce et al. 2006; Hastings and Botsford 2006; Matthews and Gonzalez 2007; Schreiber 2010).

A basic model accounting for these interactions considers a population living in an environment with \(n\) patches. Let \(X^r_t\) be the number of individuals in patch \(r\) at time \(t\). Assuming Ricker density-dependent feedbacks at the patch scale, the fitness of an individual in patch \(r\) is \(\lambda ^r_t \exp ( - \alpha _r X_t^r)\) at time \(t\), where \(\lambda _t^r\) is the maximal fitness and \(\alpha _r>0\) measures the strength of intraspecific competition. Let \(d_{rs}\) be the fraction of the population from patch \(r\) that disperses to patch \(s\). Under these assumptions, the population dynamics are given by

$$\begin{aligned} X^r_{t+1}= \sum _{s=1}^n d_{sr} \lambda _t^s X_t^s \exp (-\alpha _s X_t^s) \qquad r =1,\dots , n. \end{aligned}$$
(16)

To write this model more compactly, let \(F(X_t, \lambda _t)\) be the diagonal matrix with diagonal entries \(\lambda _t^1 \exp (-\alpha _1 X_t^1), \dots , \lambda _t^n \exp (-\alpha _n X_t^n)\), and \(D\) be the matrix whose \(i\)-\(j\)th entry is given by \(d_{ij}\). With this notation, (16) simplifies to

$$\begin{aligned} X_{t+1}=X_t F(X_t,\lambda _t) D. \end{aligned}$$

If \(\lambda ^r_t\) are ergodic and stationary, \(\lambda _t^r\) take values in a positive compact interval \([\lambda _*,\lambda ^*]\) and \(D\) is a primitive matrix, then the hypotheses of Theorem 5.1 hold. In particular, stochastic persistence occurs if \(r_1(0)\), corresponding to the dominant Lyapunov exponent of the random matrix product \(F(0,\lambda _t)D\dots F(0,\lambda _1) D\), is positive.
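In general the dominant Lyapunov exponent of these random products has no closed form, but it can be estimated by simulation. The Python sketch below multiplies a row vector by \(F(0,\lambda _t)D\) repeatedly, renormalising at each step and averaging the logarithm of the growth factors. The function name `lyapunov_exponent`, the callback `sample_lambda`, and the example distribution are assumptions introduced for illustration; the draws are taken independently over time, which is only one instance of an ergodic stationary sequence.

```python
import math
import random

def lyapunov_exponent(D, sample_lambda, T=20000, seed=1):
    """Monte Carlo estimate of the dominant Lyapunov exponent of the random
    products F(0, lambda_1) D ... F(0, lambda_T) D.

    D[s][r] is the fraction dispersing from patch s to patch r and
    sample_lambda(rng) returns the list (lambda_t^1, ..., lambda_t^n)."""
    rng = random.Random(seed)
    n = len(D)
    x = [1.0 / n] * n
    total = 0.0
    for _ in range(T):
        lam = sample_lambda(rng)
        y = [x[s] * lam[s] for s in range(n)]                          # x F(0, lambda_t)
        x = [sum(y[s] * D[s][r] for s in range(n)) for r in range(n)]  # ... times D
        growth = sum(x)
        total += math.log(growth)
        x = [value / growth for value in x]                            # renormalise
    return total / T

# Example: two patches, 10% dispersal, fitness 3.0 or 0.2 with equal probability.
D = [[0.9, 0.1], [0.1, 0.9]]
def two_point(rng):
    return [3.0 if rng.random() < 0.5 else 0.2 for _ in range(2)]

estimate = lyapunov_exponent(D, two_point)
```

With a single patch and constant fitness the estimate reduces to \(\log \lambda \), which provides a simple sanity check of the routine.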

When populations are fully mixing (i.e. \(d_{rs}=v_s\) for all \(r,s\)), Metz et al. (1983) derived a simple expression for \(r_1(0)\) given by

$$\begin{aligned} r_1(0)=\mathbb {E}\left[ \log \left( \sum _{r=1}^n v_r \lambda _t^r \right) \right] \end{aligned}$$
(17)

i.e. the temporal log-mean of the spatial arithmetic mean. Owing to the concavity of the log function, Jensen’s inequality applied to the spatial and temporal averages in (17) yields

$$\begin{aligned} \log \left( \sum _{r=1}^n v_r \mathbb {E}[\lambda _t^r]\right) >r_1(0)> \sum _{r=1}^n v_r \mathbb {E}[\log \lambda _t^r]. \end{aligned}$$
(18)

The second inequality implies that dispersal can mediate persistence as \(r_1(0)\) can be positive despite all local growth rates \(\mathbb {E}[\log \lambda _t^r]\) being negative. Hence, populations can persist even when all patches are sinks, a phenomenon that has been observed in the analysis of density-independent models and simulations of density-dependent models (Jansen and Yoshimura 1998; Bascompte et al. 2002; Evans et al. 2013). The first inequality in equation (18), however, implies that dispersal-mediated persistence for well-mixed populations requires that the expected fitness \(\mathbb {E}[\lambda _t^ r]\) is greater than one in at least one patch.
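A small worked example makes the inequalities in (18) concrete. In the Python sketch below, two patches receive independent fitness values, each equal to 3.0 or 0.2 with probability one half, and dispersal is fully mixing with \(v=(1/2,1/2)\). These numbers are hypothetical and chosen only so that every patch is a sink; the three quantities in (18) are then computed exactly by enumerating the four joint outcomes.

```python
import math
from itertools import product

v = [0.5, 0.5]                         # fully mixing dispersal weights
outcomes = [(3.0, 0.5), (0.2, 0.5)]    # (fitness value, probability) in each patch

# r_1(0) = E[log(sum_r v_r lambda^r)], equation (17), by exact enumeration.
r1 = 0.0
for (lam1, p1), (lam2, p2) in product(outcomes, repeat=2):
    r1 += p1 * p2 * math.log(v[0] * lam1 + v[1] * lam2)

mean_fitness = sum(p * lam for lam, p in outcomes)              # E[lambda^r]
mean_log_fitness = sum(p * math.log(lam) for lam, p in outcomes)  # E[log lambda^r]

upper = math.log(sum(v_r * mean_fitness for v_r in v))   # left-hand side of (18)
lower = sum(v_r * mean_log_fitness for v_r in v)         # right-hand side of (18)
```

Here each patch has \(\mathbb {E}[\log \lambda _t^r]=\tfrac{1}{2}\log 0.6<0\), yet the coupled population has a positive growth rate when rare, while the expected fitness 1.6 exceeds one as the first inequality requires.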

For partially mixing populations for which \(d_{rs}=v_s +\varepsilon \delta _{rs}\), Schreiber (2010) developed a first-order approximation of \(r_1(0)\) with respect to \(\varepsilon \). This approximation coupled with Theorem 5.1 implies that temporal autocorrelations for partially mixing populations can mediate persistence even when the expected fitness \(\mathbb {E}[\lambda _t^r]\) is less than one in all patches, a finding related to earlier work by Roy et al. (2005). This dispersal-mediated persistence occurs when spatial correlations are sufficiently weak, temporal fluctuations are sufficiently large and positively autocorrelated, and there are sufficiently many patches.

Biological Interpretation 5.4

Metapopulations with density-dependent growth can stochastically persist despite all local populations being extinction prone in the absence of immigration. Temporal autocorrelations can enhance this effect.

6 Applications to competing species in space

The role of spatial and temporal heterogeneity in maintaining diversity is a fundamental problem of practical and theoretical interest in population biology (Chesson 2000a, b; Loreau et al. 2003; Mouquet and Loreau 2003; Davies et al. 2005). To examine the role of both forms of heterogeneity in maintaining diversity of competitive communities, we consider lottery-type models of \(m\) competing populations in a landscape consisting of \(n\) patches. For these models, competition for vacant space determines the within patch dynamics, while dispersal between the patches couples the local dynamics. After describing a general formulation of these models for an arbitrary number of species with potentially frequency-dependent interactions, we illustrate how to apply our results to the case of two competing species and three competing species exhibiting an intransitive, rock-paper-scissor-like dynamic.

6.1 Formulation of the general model

To describe the general model, let \(X_t^{ir}\) denote the fraction of patch \(r\) occupied by population \(i\) at time \(t\). At each time step, a fraction \(\varepsilon >0\) of individuals die in each patch. The sites emptied by the dying individuals get randomly assigned to progeny in the patch. Birth rates within each patch are determined by local pair-wise interactions. Let \(\xi _t^{ij}(r)\) be the “payoff” to strategy \(i\) interacting with strategy \(j\) in patch \(r\) at time \(t\). Let

$$\begin{aligned} \Xi _t(r)=\left( \xi _t^{ij}(r)\right) _{1\le i, j\le m} \end{aligned}$$
(19)

be the payoff matrix for patch \(r\). The total number of progeny produced by an individual playing strategy \(i\) in patch \(r\) is \(\sum _j \xi _t^{ij}(r) X_t^{jr}\). Progeny disperse between patches with \(d_{sr}\) the fraction of progeny dispersing from patch \(s\) to patch \(r\). Under these assumptions, the spatial-temporal dynamics of the competing populations are given by

$$\begin{aligned} X_{t+1}^{ir}=\varepsilon \frac{\sum _{s} d_{sr} \sum _{j} \xi _t^{ij}(s)X_t^{js} X_t^{is}}{\sum _s d_{sr} \sum _{j,l} \xi _t^{lj}(s) X_t^{js}X_t^{ls}} +(1-\varepsilon ) X_t^{ir}. \end{aligned}$$
(20)

Let \(A_i(\xi ,X)\) be the matrix whose \(s-r\) entry is given by

$$\begin{aligned} \varepsilon \frac{d_{sr} \sum _{j} \xi ^{ij}(s) X^{js}}{\sum _{s'} d_{s'r} \sum _{j,l} \xi ^{lj}(s') X^{js'}X^{ls'}} \end{aligned}$$

for \(r\ne s\), and

$$\begin{aligned} \varepsilon \frac{ d_{sr} \sum _{j} \xi ^{ij}(s) X^{js}}{\sum _{s'} d_{s'r} \sum _{j,l} \xi ^{lj}(s') X^{js'}X^{ls'}}+1-\varepsilon \end{aligned}$$

for \(r=s\). With these definitions, (20) takes on the form of our model (1).
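For concreteness, one time step of (20) can be written out in a few lines. The Python sketch below is a minimal illustration: the function name `lottery_step`, the data layout (`X[i][r]`, `xi[s][i][j]`, `D[s][r]`), and the example values are all assumptions made for this example.

```python
def lottery_step(X, xi, D, eps):
    """One step of the spatial lottery-type model (20).

    X[i][r]     fraction of patch r occupied by population i
    xi[s][i][j] payoff to strategy i interacting with strategy j in patch s
    D[s][r]     fraction of progeny dispersing from patch s to patch r
    eps         fraction of individuals dying in each patch per time step
    """
    m, n = len(X), len(X[0])
    # Progeny of population i produced in patch s.
    births = [[X[i][s] * sum(xi[s][i][j] * X[j][s] for j in range(m))
               for s in range(n)] for i in range(m)]
    X_new = [[0.0] * n for _ in range(m)]
    for r in range(n):
        # Progeny of each population arriving in patch r after dispersal.
        arrivals = [sum(D[s][r] * births[i][s] for s in range(n)) for i in range(m)]
        total = sum(arrivals)
        for i in range(m):
            X_new[i][r] = eps * arrivals[i] / total + (1.0 - eps) * X[i][r]
    return X_new

# Example with two populations and two patches (illustrative values).
X0 = [[0.3, 0.8], [0.7, 0.2]]
xi = [[[1.0, 1.0], [2.0, 2.0]],     # payoffs in patch 0
      [[3.0, 3.0], [1.0, 1.0]]]     # payoffs in patch 1
D = [[0.9, 0.1], [0.1, 0.9]]
X1 = lottery_step(X0, xi, D, eps=0.5)
monoculture = lottery_step([[1.0, 1.0], [0.0, 0.0]], xi, D, eps=0.5)
```

Because the vacated sites are shared among arriving progeny in proportion to their numbers, the occupied fractions in each patch continue to sum to one, and a landscape monopolised by one population stays that way.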

To illustrate the insights that can be gained from a persistence analysis of these models, we consider two special cases. The first case is a spatially explicit version of the lottery model of Chesson and Warner (1981). The second case is a spatial version of a stochastic rock-paper-scissor game. For both of these examples, we assume that a fraction \(d\) of all progeny disperse randomly to all patches and the remaining fraction \(1-d\) do not disperse. Under this assumption, we get \(d_{sr}=d/(n-1)\) for \(s\ne r\) and \(d_{ss}=1-d\), where \(n\) is the number of patches. These populations are fully mixing when \(d=\frac{n-1}{n}\) in which case \(d_{sr}=\frac{1}{n}\) for all \(s,r\).

6.2 A spatially-explicit lottery model

The lottery model of Chesson and Warner (1981) assumes that the competing populations do not exhibit frequency dependent interactions. More specifically, the “payoffs” \(\xi _t^{ij}(r)=\xi _t^i(r)\) for all \(i,j\) are independent of the frequencies of the other species. Consequently, the model takes on a simpler form

$$\begin{aligned} X_{t+1}^{ir}=\varepsilon \frac{\sum _{s} d_{sr} \xi _t^{i}(s) X_t^{is}}{\sum _s d_{sr} \sum _{j} \xi _t^{j}(s)X_t^{js}} +(1-\varepsilon ) X_t^{ir} \end{aligned}$$
(21)

where \(d_{sr}=\frac{d}{n-1}\) for \(r\ne s\) and \(d_{ss}=1-d\).

For two competing species (i.e. \(m=2\)), the population states \(z_1 =(1,\dots ,1,0, \dots , 0)\) and \(z_2=(0,\dots ,0,1,\dots , 1)\) correspond to only species \(1\) and only species \(2\) occupying the landscape, respectively. The extinction set is \(S_0=\{z_1,z_2\}\). Theorem 3.1 implies that a sufficient condition for stochastic persistence is the existence of positive weights \(p_1,p_2\) such that

$$\begin{aligned} p_1 r_1(z_1)+p_2 r_2(z_1)>0 \quad \text{ and } \quad p_1r_1(z_2)+p_2r_2(z_2)>0. \end{aligned}$$

Proposition 8.19 implies that the long-term growth rate of any invariant measure, with a support bounded away from the extinction set, is equal to zero. In particular, this proposition applies to the subsystems of species \(1\) and \(2\), and to the Dirac measures \(\delta _{z_1}\) and \(\delta _{z_2}\), respectively. Therefore \(r_1(z_1)=r_2(z_2)=0\). Hence, the persistence criterion simplifies to

$$\begin{aligned} r_1(z_2)>0\quad \text{ and } \quad r_2(z_1)>0. \end{aligned}$$

In other words, persistence occurs if each species has a positive invasion rate.

To get some biological intuition from the mutual invasibility criterion, we consider the limiting cases of relatively sedentary populations (i.e. \(d\approx 0\)) and highly dispersive populations (i.e. \(d\approx 1\)). In these cases, we get explicit expressions for the realized per-capita growth rates \(r_i(z_j)\) that simplify further for short-lived (i.e. \(\varepsilon \approx 1\)) and long-lived (i.e. \(\varepsilon \approx 0\)) species. Our analytical results are illustrated numerically in Fig. 3.

Fig. 3

Effects of dispersal and survival on coexistence of two species. The log-fecundities \(\xi ^i\) are independent and normally distributed with means \(\mu _1=(5,0,5,0,\dots ,0)\), \(\mu _2=(0,5,0,\dots ,0)\) and variances \(\sigma ^2_1=\sigma ^2_2=(1,\dots ,1)\) for (I) and \((3,\dots ,3)\) for (II). The white lines correspond to the zero-lines of the respective Lyapunov exponents

6.2.1 Relatively sedentary populations

When populations are completely sedentary (i.e. \(d=0\)), the projection matrix \(A_2(\xi ,z_1)\) corresponding to species \(2\) trying to invade a landscape monopolized by species \(1\) reduces to a diagonal matrix whose \(r\)-th diagonal entry equals

$$\begin{aligned} \varepsilon \frac{\xi _t^2(r)}{\xi _t^1(r)} +1-\varepsilon . \end{aligned}$$

The dominant Lyapunov exponent in this limiting case is given by

$$\begin{aligned} r_2(z_1)= \max _r \mathbb {E}\left[ \log \left( \varepsilon \frac{\xi _t^2(r)}{\xi _t^1(r)} +1-\varepsilon \right) \right] . \end{aligned}$$

Proposition 3 from Benaïm and Schreiber (2009) implies that \(r_2(z_1)\) is a continuous function of \(d\). Consequently, \(r_2(z_1)\) is positive for small \(d>0\) provided that \(\mathbb {E}\left[ \log \left( \varepsilon \frac{\xi _t^2(r)}{\xi _t^1(r)} +1-\varepsilon \right) \right] \) is strictly positive for some patch \(r\). Similarly, \(r_1(z_2)\) is positive for small \(d>0\) provided that \(\mathbb {E}\left[ \log \left( \varepsilon \frac{\xi _t^1(r)}{\xi _t^2(r)} +1-\varepsilon \right) \right] \) is strictly positive for some patch \(r\). Thus, coexistence for small \(d>0\) occurs if

$$\begin{aligned} \max _r \mathbb {E}\left[ \log \left( \varepsilon \frac{\xi _t^2(r)}{\xi _t^1(r)} +1-\varepsilon \right) \right] >0\quad \text{ and } \quad \max _r \mathbb {E}\left[ \log \left( \varepsilon \frac{\xi _t^1(r)}{\xi _t^2(r)} +1-\varepsilon \right) \right] >0. \end{aligned}$$

When \(\varepsilon \approx 1\) or \(\varepsilon \approx 0\), we get more explicit forms of this coexistence condition. When the populations are short-lived (\(\varepsilon \approx 1\)), the coexistence condition simplifies to \(\mathbb {E}[\log \xi _t^1(r)]>\mathbb {E}[\log \xi _t^2(r)]\) and \(\mathbb {E}[\log \xi _t^2(s)]>\mathbb {E}[\log \xi _t^1(s)]\) for some patches \(r\ne s\). That is, coexistence requires that each species has at least one patch in which it has the higher geometric mean reproductive output.

When the populations are long lived (\(\varepsilon \approx 0\)) and relatively sedentary (\(d\approx 0\)), the coexistence condition is

$$\begin{aligned} \mathbb {E}\left[ \frac{\xi _t^2(r)}{\xi _t^1(r)}\right] >1\quad \text{ and } \quad \mathbb {E}\left[ \frac{\xi _t^1(s)}{\xi _t^2(s)}\right] >1 \end{aligned}$$

for some patches \(r,s\). Unlike short-lived populations, it is possible that both inequalities are satisfied for the same patch. For example, when the log-fecundities \(\log \xi _t^i(r)\) are independent and normally distributed with mean \(\mu _i(r)\) and variance \(\sigma _i^2(r)\), the ratio \(\xi _t^2(r)/\xi _t^1(r)\) is log-normally distributed with mean \(\exp \left( \mu _2(r)-\mu _1(r)+(\sigma _1^2(r)+\sigma _2^2(r))/2\right) \), so the coexistence condition is

$$\begin{aligned} \mu _2(r) -\mu _1(r) + \frac{\sigma _1^2(r)+\sigma _2^2(r)}{2}>0 \end{aligned}$$

for some patch \(r\) and

$$\begin{aligned} \mu _1(s)-\mu _2(s) + \frac{\sigma _1^2(s)+\sigma _2^2(s)}{2}>0 \end{aligned}$$

for some patch \(s\). Both conditions can be satisfied in the same patch \(r\) provided that \(\sigma _1(r)\) or \(\sigma _2(r)\) is sufficiently large.
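The within-patch storage effect for long-lived, sedentary populations can be illustrated with the standard formula for the mean of a log-normal random variable: for independent log-normal fecundities, \(\mathbb {E}[\xi ^2/\xi ^1]=\exp \left( \mu _2-\mu _1+(\sigma _1^2+\sigma _2^2)/2\right) \), which exceeds one exactly when the exponent is positive. The Python sketch below evaluates both invasion conditions in a single patch; the function names and the parameter values are hypothetical.

```python
import math

def mean_ratio(mu_num, sigma_num, mu_den, sigma_den):
    """E[xi_num / xi_den] for independent log-normal fecundities whose
    logarithms have means mu_* and standard deviations sigma_*."""
    return math.exp(mu_num - mu_den + 0.5 * (sigma_num**2 + sigma_den**2))

def both_can_invade(mu1, sigma1, mu2, sigma2):
    """Long-lived, sedentary limit: each species invades within the same patch
    when both expected fecundity ratios exceed one."""
    return (mean_ratio(mu2, sigma2, mu1, sigma1) > 1.0 and
            mean_ratio(mu1, sigma1, mu2, sigma2) > 1.0)

# Species 1 has the higher mean log-fecundity in this patch (illustrative values).
high_variance = both_can_invade(mu1=0.2, sigma1=1.0, mu2=0.0, sigma2=1.0)
low_variance = both_can_invade(mu1=0.2, sigma1=0.1, mu2=0.0, sigma2=0.1)
```

With large fluctuations both species can invade the same patch, whereas with small fluctuations only the species with the higher mean log-fecundity can.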

Biological Interpretation 6.1

For relatively sedentary populations, coexistence occurs if each species has a patch it can invade when rare. If the populations are also short-lived, coexistence requires that each species has a patch in which it is competitively dominant. Alternatively, if populations are also long-lived, regional coexistence may occur if species coexist locally within a patch due to the storage effect. For uncorrelated and log-normally distributed fecundities, this within-patch storage effect occurs if the difference in the mean log-fecundities is sufficiently smaller than the net variance in the log-fecundities.

6.2.2 Well-mixed populations

For populations that are highly dispersive (i.e. \(d=\frac{n-1}{n}\)), the spatially explicit lottery model reduces to a spatially implicit model where

$$\begin{aligned} \begin{aligned} r_1(z_2)&= \mathbb {E}\left[ \log \left( \varepsilon \frac{\sum _r \xi _t^1(r)}{\sum _r \xi _t^2(r)}+1-\varepsilon \right) \right] \text{ and }\\ r_2(z_1)&= \mathbb {E}\left[ \log \left( \varepsilon \frac{\sum _r \xi _t^2(r)}{\sum _r \xi _t^1(r)}+1-\varepsilon \right) \right] .\\ \end{aligned} \end{aligned}$$

For short-lived populations (\(\varepsilon =1\)), these long-term growth rates simplify to

$$\begin{aligned} \begin{aligned} r_1(z_2)&= \mathbb {E}\left[ \log \sum _r \xi _t^1(r)\right] - \mathbb {E}\left[ \log \sum _r \xi _t^2(r)\right] \text{ and }\\ r_2(z_1)&= \mathbb {E}\left[ \log \sum _r \xi _t^2(r)\right] - \mathbb {E}\left[ \log \sum _r \xi _t^1(r)\right] .\\ \end{aligned} \end{aligned}$$

Since \(r_1(z_2)=-r_2(z_1)\), the persistence criterion that \(r_1(z_2)>0\) and \(r_2(z_1)>0\) cannot be satisfied.

Alternatively, for long-lived populations (\(\varepsilon \approx 0\)), the invasion rates of well-mixed populations become, to first order in \(\varepsilon >0\):

$$\begin{aligned} \begin{aligned} r_1(z_2)&\approx \varepsilon \left( \mathbb {E}\left[ \frac{ \sum _r \xi _t^1(r)}{ \sum _r \xi _t^2(r)}\right] -1\right) \text{ and }\\ r_2(z_1)&\approx \varepsilon \left( \mathbb {E}\left[ \frac{ \sum _r \xi _t^2(r)}{ \sum _r \xi _t^1(r)}\right] -1\right) . \end{aligned} \end{aligned}$$

We conjecture that this coexistence condition is less likely to be met than the coexistence condition for relatively sedentary populations. To see why, consider a small variance approximation of these invasion rates. Assume that \(\xi _t^i(r) = \bar{\xi }^i +\eta Z^i_t(r)\) where \(Z^i_t(r)\) are independent and identically distributed in \(i,r\) and \(\mathbb {E}[Z^i_t(r)]=0\) for all \(i,r\). Let \(\sigma ^2 = \eta ^2\,\mathbb {E}[(Z_t^i(r))^2]\) denote the variance of \(\xi _t^i(r)\). A second order Taylor approximation in \(\eta \) yields the following approximation of the (rescaled) long-term growth rates for well-mixed populations

$$\begin{aligned} \begin{aligned} \mathbb {E}\left[ \frac{ \sum _r \xi _t^1(r)}{ \sum _r \xi _t^2(r)}\right] -1&\approx \frac{\bar{\xi }^1}{\bar{\xi }^2}+\frac{\bar{\xi }^1\sigma ^2/n}{(\bar{\xi }^2)^3}-1 \end{aligned} \end{aligned}$$
(22)

and the following approximation for relatively sedentary populations

$$\begin{aligned} \begin{aligned} \max _r\mathbb {E}\left[ \frac{\xi _t^1(r)}{\xi _t^2(r)}\right] -1&\approx \frac{\bar{\xi }^1}{\bar{\xi }^2}+\frac{\bar{\xi }^1\sigma ^2}{(\bar{\xi }^2)^3}-1. \end{aligned} \end{aligned}$$
(23)

Since (23) is greater than (22), persistence is more likely for relatively sedentary populations in this small noise limit.
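The well-mixed invasion rates can also be estimated by Monte Carlo simulation directly from their defining expectations. In the Python sketch below, the patch fecundities of both species are drawn independently from the same log-normal distribution; this symmetric choice, the function name `well_mixed_invasion_rates`, and all parameter values are assumptions made for illustration.

```python
import math
import random

def well_mixed_invasion_rates(eps, n_patches=5, T=5000, sigma=1.0, seed=2):
    """Monte Carlo estimates of r_1(z_2) and r_2(z_1) for fully mixing
    populations, using independent log-normal fecundities in every patch."""
    rng = random.Random(seed)
    r1 = 0.0
    r2 = 0.0
    for _ in range(T):
        s1 = sum(rng.lognormvariate(0.0, sigma) for _ in range(n_patches))
        s2 = sum(rng.lognormvariate(0.0, sigma) for _ in range(n_patches))
        r1 += math.log(eps * s1 / s2 + 1.0 - eps)   # species 1 invading species 2
        r2 += math.log(eps * s2 / s1 + 1.0 - eps)   # species 2 invading species 1
    return r1 / T, r2 / T

short_lived = well_mixed_invasion_rates(eps=1.0)
long_lived = well_mixed_invasion_rates(eps=0.1)
```

For \(\varepsilon =1\) the two estimates cancel exactly, so both cannot be positive. For \(\varepsilon <1\) their sum is positive whenever the fecundities fluctuate, which leaves room for mutual invasibility.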

Biological Interpretation 6.2

Short-lived and highly dispersive competitors do not satisfy the coexistence condition. Long-lived and highly dispersive competitors may coexist. However, coexistence appears to be less likely than for sedentary populations as spatial averaging reduces the temporal variability experienced by both populations and, thereby, weakens the storage effect.

6.3 The rock-paper-scissor game

In the last few years the rock-paper-scissor game, which might initially seem to be of purely theoretical interest, has emerged as playing an important role in describing the behavior of various real-world systems. These include the evolution of alternative male mating strategies in the side-blotched lizard Uta stansburiana (Sinervo and Lively 1996), the in vitro evolution of bacterial populations (Kerr et al. 2002; Nahum et al. 2011), the in vivo evolution of bacterial populations in mice (Kirkup and Riley 2004), and the competition between genotypes and species in plant communities (Lankau and Strauss 2007; Cameron et al. 2009). More generally, the rock-paper-scissor game – which is characterized by three strategies R, P and S, which satisfy the non-transitive relations: P beats R (in the absence of S), S beats P (in the absence of R), and R beats S (in the absence of P) – serves as a simple prototype for studying the dynamics of more complicated non-transitive systems (Buss and Jackson 1979; Paquin and Adams 1983; May and Leonard 1975; Schreiber 1997; Schreiber and Rittenhouse 2004; Vandermeer and Pascual 2005; Allesina and Levine 2011). Here, we examine a simple spatial version of this evolutionary game in a fluctuating environment.

Let \(x^1_t(r),x^2_t(r)\), and \(x^3_t(r)\) be the frequencies of the rock, paper, and scissor strategies in patch \(r\), respectively. All strategies in patch \(r\) receive a basal payoff of \(a^r_t\) at time \(t\). Winners in an interaction in patch \(r\) receive a payoff of \(b_t^r\) while losers pay a cost \(c_t^r\). Thus, the payoff matrix (19) for the interacting populations in patch \(r\) is

$$\begin{aligned} \Xi _t(r) = a_t^r + \begin{pmatrix} 0&{}\quad -c_t^r &{}\quad b_t^r \\ b_t^r &{}\quad 0&{}\quad -c_t^r \\ -c_t^r &{}\quad b_t^r &{}\quad 0 \end{pmatrix}. \end{aligned}$$

We continue to assume that the fraction of progeny dispersing from patch \(s\) to patch \(r\) equals \(d/(n-1)\) for \(s\ne r\) and \(1-d\) otherwise.

Our first result about the rock-paper-scissor model is that it exhibits a heteroclinic cycle in \(S_0\) between the three equilibria \(E_1=(1,\dots ,1,0,\dots ,0,0,\dots ,0)\), \(E_2=(0,\dots ,0,1,\dots ,1,0,\dots ,0)\) and \(E_3=(0,\dots ,0,0,\dots ,0,1,\dots ,1)\). For two vectors \(x=(x_1,\dots ,x_n)\), \(y=(y_1,\dots ,y_n)\), we write \(x>y\) if \(x_i\ge y_i\) for all \(i\) with at least one strict inequality.

Proposition 6.3

Assume \(d,\varepsilon \in (0,1]\) and \(a_t^r>c_t^r\), \(\log a_t^r, \log c_t^r, \log b_t^r \in [-M,M]\) with probability one for some \(M>0\). If \(x^1_0>(0,\dots ,0)\) and \(x^2_0>(0,\dots ,0)\) and \(x_0^3=(0,\dots ,0)\), then \(\lim _{t\rightarrow \infty } x_t = E_2\) with probability one. If \(x^1_0>(0,\dots ,0)\) and \(x^3_0>(0,\dots ,0)\) and \(x_0^2=(0,\dots ,0)\), then \(\lim _{t\rightarrow \infty } x_t = E_1\) with probability one. If \(x^2_0>(0,\dots ,0)\) and \(x^3_0>(0,\dots ,0)\) and \(x_0^1=(0,\dots ,0)\), then \(\lim _{t\rightarrow \infty } x_t = E_3\) with probability one.

Proof

It suffices to prove the assertion for the case in which \(x^1_0>(0,\dots ,0)\) and \(x^2_0>(0,\dots ,0)\) and \(x_0^3=(0,\dots ,0)\). Let \(\mathbf {1}= (1,\dots ,1) \in \mathbb R^n\). Our assumptions \(b_t^r>0\) and \(a_t^r>c_t^r>0\) imply there exists \(\eta >0\) such that \(A_2(\xi _{t+1},X_t)\gg \exp (\eta ) A_1(\xi _{t+1},X_t)\) with probability one. It follows that

$$\begin{aligned} \begin{aligned} \limsup _{t\rightarrow \infty } \frac{1}{t} \log \Vert X^1_t\Vert&= \limsup _{t\rightarrow \infty } \frac{1}{t} \log \Vert X_0^1A_1 (\xi _1,X_0)\dots A_1(\xi _t,X_{t-1}) \Vert \\&\le \limsup _{t\rightarrow \infty } \frac{1}{t} \log \Vert X_0^1A_2(\xi _1,X_0)\dots A_2(\xi _t,X_{t-1})\Vert -\eta \\&= \limsup _{t\rightarrow \infty } \frac{1}{t} \log \Vert \mathbf {1}A_2(\xi _1,X_0)\dots A_2(\xi _t,X_{t-1})\Vert -\eta \\&\le -\eta \end{aligned} \end{aligned}$$

where the last two lines follow from Proposition 8.16 and its Corollary 8.17. Hence, \(\lim _{t\rightarrow \infty } \Vert X_t^1 \Vert =0\) as claimed. \(\square \)

Proposition 6.3 implies that for any \(x\in S_0\) and \(1\le i \le 3\), \(r_i(x)=r_i(E_j)\) for some \(1\le j \le 3\). Hence, the persistence criterion of Theorem 3.1 requires \(p_1,p_2,p_3>0\) such that

$$\begin{aligned} \sum _i p_i r_i (E_j) > 0 \qquad \text{ for } \text{ all } 1\le j\le 3. \end{aligned}$$

A standard algebraic calculation shows that this persistence criterion is satisfied if and only if

$$\begin{aligned} r_2(E_1)r_3(E_2)r_1(E_3)> -r_3(E_1)r_1(E_2)r_2(E_3) \end{aligned}$$

i.e. the product of the positive invasion rates is greater than the absolute value of the product of the negative invasion rates. The symmetry of our model implies that all the positive invasion rates are equal and all the negative invasion rates are equal. Hence, coexistence requires

$$\begin{aligned} r_2(E_1)>-r_3(E_1). \end{aligned}$$
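The algebra behind the product condition can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' calculation: the array `R` is a hypothetical input with `R[j, i]` standing for \(r_{i+1}(E_{j+1})\), so the diagonal is zero, and the three positive and three negative invasion rates are assumed to be nonzero. When the product condition holds, the function builds explicit weights by scaling the three ratio constraints with a common factor `kappa > 1`.

```python
import numpy as np


def rps_persistence_weights(R):
    """Return p > 0 with sum_i p_i * R[j, i] > 0 for every j, or None if impossible.

    Strategies are ordered rock, paper, scissors. R[0, 1], R[1, 2], R[2, 0] are
    the positive invasion rates; R[0, 2], R[1, 0], R[2, 1] are the negative ones.
    """
    R = np.asarray(R, dtype=float)
    gains = np.array([R[0, 1], R[1, 2], R[2, 0]])
    losses = -np.array([R[0, 2], R[1, 0], R[2, 1]])
    if gains.prod() <= losses.prod():
        return None  # the product condition fails, so no weights exist
    # kappa > 1 is the cube root of the ratio of the two products
    kappa = (gains.prod() / losses.prod()) ** (1.0 / 3.0)
    p3 = 1.0
    p2 = kappa * losses[0] / gains[0] * p3    # makes the inequality at E_1 strict
    p1 = p3 * gains[1] / (kappa * losses[1])  # makes the inequality at E_2 strict
    return np.array([p1, p2, p3])


if __name__ == "__main__":
    R = np.array([[0.0, 1.0, -0.1],
                  [-2.0, 0.0, 0.5],
                  [0.6, -0.4, 0.0]])
    p = rps_persistence_weights(R)
    print(p, R @ p)  # every entry of R @ p should be positive
```

The inequality at \(E_3\) then holds automatically because \(\kappa ^3>\kappa ^2\), which is where the product condition enters.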

As for the case of two competing species, we can derive more explicit coexistence criteria when the populations are relatively sedentary (i.e. \(d \approx 0\)) or the populations are well-mixed (i.e. \(d=\frac{m-1}{m}\), so that \(1-d=d/(m-1)=1/m\)). For relatively sedentary populations, coexistence requires

$$\begin{aligned} \max _r \mathbb {E}\left[ \log \left( 1-\varepsilon + \varepsilon \frac{a_t^r +b_t^r}{a_t^r}\right) \right] > -\max _r \mathbb {E}\left[ \log \left( 1-\varepsilon + \varepsilon \frac{a_t^r -c_t^r}{a_t^r}\right) \right] . \end{aligned}$$

For long-lived populations, this coexistence criterion simplifies further to

$$\begin{aligned} \max _r \mathbb {E}\left[ \frac{b_t^r}{a_t^r} \right] > \min _r \mathbb {E}\left[ \frac{c_t^r}{a_t^r} \right] . \end{aligned}$$
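This simplification can be traced through a first-order expansion. The sketch below assumes that "long-lived" corresponds to small \(\varepsilon \), an interpretation of the parameter defined earlier in the paper. It also uses the bounds on \(\log a_t^r\), \(\log b_t^r\), \(\log c_t^r\) from Proposition 6.3, which keep the remainder terms uniformly of order \(\varepsilon ^2\):

$$\begin{aligned} \log \left( 1-\varepsilon + \varepsilon \frac{a_t^r +b_t^r}{a_t^r}\right)&= \log \left( 1+ \varepsilon \frac{b_t^r}{a_t^r}\right) = \varepsilon \frac{b_t^r}{a_t^r} + O(\varepsilon ^2), \\ \log \left( 1-\varepsilon + \varepsilon \frac{a_t^r -c_t^r}{a_t^r}\right)&= \log \left( 1- \varepsilon \frac{c_t^r}{a_t^r}\right) = -\varepsilon \frac{c_t^r}{a_t^r} + O(\varepsilon ^2). \end{aligned}$$

Taking expectations, the left-hand side of the sedentary criterion becomes \(\varepsilon \max _r \mathbb {E}[b_t^r/a_t^r]\) and the right-hand side becomes \(-\max _r (-\varepsilon \mathbb {E}[c_t^r/a_t^r]) = \varepsilon \min _r \mathbb {E}[c_t^r/a_t^r]\), both up to \(O(\varepsilon ^2)\). Dividing by \(\varepsilon \) gives the displayed inequality to leading order.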

Alternatively, when the populations are well-mixed, coexistence requires

$$\begin{aligned} \mathbb {E}\left[ \log \left( 1-\varepsilon + \varepsilon \frac{\sum _r (a_t^r +b_t^r)}{\sum _r a_t^r}\right) \right] > -\mathbb {E}\left[ \log \left( 1-\varepsilon + \varepsilon \frac{\sum _r (a_t^r -c_t^r)}{\sum _r a_t^r}\right) \right] . \end{aligned}$$

For long-lived populations, this coexistence criterion simplifies further to

$$\begin{aligned} \mathbb {E}\left[ \frac{\sum _r b_t^r}{\sum _r a_t^r} \right] > \mathbb {E}\left[ \frac{\sum _r c_t^r}{\sum _r a_t^r} \right] . \end{aligned}$$
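Both limiting criteria can be evaluated from sampled payoffs. The following is a rough sketch, assuming NumPy and replacing expectations by sample means. The function names and the input layout (arrays of shape `(T, m)`, one row per time step and one column per patch) are assumptions made for illustration. In the well-mixed case the sums are read as \(\sum _r (a_t^r + b_t^r)\) and \(\sum _r (a_t^r - c_t^r)\).

```python
import numpy as np


def sedentary_criterion(a, b, c, eps):
    """Return (lhs, rhs) of the sedentary criterion; coexistence requires lhs > rhs."""
    gain = np.log(1 - eps + eps * (a + b) / a).mean(axis=0)  # one value per patch
    loss = np.log(1 - eps + eps * (a - c) / a).mean(axis=0)
    return gain.max(), -loss.max()


def well_mixed_criterion(a, b, c, eps):
    """Return (lhs, rhs) of the well-mixed criterion; coexistence requires lhs > rhs."""
    total_a = a.sum(axis=1)  # payoffs summed over patches at each time step
    gain = np.log(1 - eps + eps * (total_a + b.sum(axis=1)) / total_a).mean()
    loss = np.log(1 - eps + eps * (total_a - c.sum(axis=1)) / total_a).mean()
    return gain, -loss


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    T, m = 10_000, 3
    a = rng.uniform(2.0, 3.0, size=(T, m))   # illustrative payoffs with a > c
    b = rng.uniform(0.5, 1.5, size=(T, m))
    c = rng.uniform(0.5, 1.5, size=(T, m))
    print(sedentary_criterion(a, b, c, eps=0.5))
    print(well_mixed_criterion(a, b, c, eps=0.5))
```

The sample means only approximate the expectations, so a criterion that is nearly an equality should not be decided from a short sample.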

Biological Interpretation 6.4

For relatively sedentary populations, coexistence only requires that the average benefits (relative to the base payoff) in one patch are greater than the average costs (relative to the base payoff) in another patch. Negative correlations between benefits \(b_t^r\) and basal payoffs \(a_t^r\) promote coexistence. For highly dispersive species whose base payoffs are constant in space and time (i.e. \(a_t^r=a\) for all \(t,r\)), coexistence requires that the spatially and temporally averaged benefits of interactions exceed the spatially and temporally averaged costs of interactions.

7 Discussion

Understanding the conditions that ensure the long-term persistence of interacting populations is of fundamental theoretical and practical importance in population biology. For deterministic models, coexistence naturally corresponds to an attractor bounded away from extinction. Since populations often experience large perturbations, many authors have argued that the existence of a global attractor (i.e. permanence or uniform persistence) may be necessary for long-term persistence (Hofbauer and Sigmund 1998; Smith and Thieme 2011). Most populations experience stochastic fluctuations in their demographic parameters (May 1973), which raises the question (May 1973, p. 621) “How are the various usages of the term [persistence] in deterministic and stochastic circumstances related?” Only recently has it been shown that the deterministic criteria for permanence extend naturally to criteria for stochastic persistence in stochastic difference and differential equations (Benaïm et al. 2008; Schreiber et al. 2011). These criteria assume that the populations are unstructured (i.e. no differences among individuals) and environmental fluctuations are temporally uncorrelated. However, many populations are structured, as highlighted in a recent special issue in Theoretical Population Biology (Tuljapurkar et al. 2012) devoted to this topic. Moreover, many environmental factors such as temperature and precipitation exhibit temporal autocorrelations (Vasseur and Yodzis 2004). Here, we prove that by using long-term growth rates when rare, the standard criteria for persistence extend to models of interacting populations experiencing correlated as well as uncorrelated environmental stochasticity, exhibiting within-population structure, and any form of density-dependent feedbacks. 
To illustrate the utility of these criteria, we applied them to persistence of predator-prey interactions in auto-correlated environments, structured populations with overcompensating density-dependence, and competitors in spatially structured environments.

Mandelbrot (1982) proposed that environmental signals commonly found in nature may be composed of frequencies \(f\) that scale according to an inverse power law \(1/f^\beta \). With this scaling, uncorrelated (i.e. white) noise corresponds to \(\beta =0\), positively auto-correlated (i.e. red or brown) noise corresponds to \(\beta >0\), and negatively auto-correlated (e.g. blue) noise corresponds to \(\beta <0\). Many environmental signals important to ecological processes including precipitation, mean air temperature, degree days, and seasonal indices exhibit positive \(\beta \) exponents (Vasseur and Yodzis 2004). Consistent with prior work on models with compensating density dependence (Roughgarden 1975; Johst and Wissel 1997; Petchey 2000), we found that positive autocorrelations in the maximal per-capita growth rate of a species increase the long-term variability in its densities. If this species is the prey for a predatory species, we showed that this increased variability in prey densities reduced a predator’s realized per-capita growth rate when rare. Hence, positive autocorrelations may impede predator-prey coexistence. In contrast, negative autocorrelations, possibly due to a biotic feedback between the prey species and its resources, may facilitate coexistence by reducing variation in prey densities and, thereby, increasing the predator’s growth rate when rare. These results are qualitatively consistent with prior results that positive autocorrelations in predator-prey systems can increase variation in prey and predator densities when they coexist (Collie and Spencer 1994; Ripa and Ives 2003). Specifically, in a simulation study of predator-prey interactions in pelagic fish stocks, Collie and Spencer (1994) found that reddened noise caused predator-prey densities “to shift between high and low equilibrium levels” and, thereby, increased variability in their abundances. 
Similarly, using linear approximations, Ripa and Ives (2003) found that environmental autocorrelations increased the amplitude of population cycles. All of these results, however, stem from the per-capita growth rate of the predator being an increasing, concave function of prey density. Changes in concavity (e.g. a type-III functional response) could produce an opposing result: increased variability in prey densities may facilitate predator invasions. A more detailed analysis of this alternative is still needed.
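The link between the spectral exponent \(\beta \) and the sign of the autocorrelation can be illustrated with a small simulation. This is a sketch under stated assumptions, not a method from the paper: it uses NumPy, shapes white Gaussian noise in the frequency domain so that its power scales like \(1/f^\beta \), and uses illustrative function names `power_law_noise` and `lag1_autocorrelation`.

```python
import numpy as np


def power_law_noise(n, beta, rng):
    """Series of length n whose power spectrum scales roughly like 1 / f**beta."""
    white = rng.standard_normal(n)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n)                # frequencies 0, 1/n, ..., up to 1/2
    scale = np.zeros_like(freqs)              # zero-frequency term dropped (removes the mean)
    scale[1:] = freqs[1:] ** (-beta / 2.0)    # amplitude ~ f^(-beta/2), power ~ f^(-beta)
    colored = np.fft.irfft(spectrum * scale, n)
    return colored / colored.std()            # rescale to unit variance


def lag1_autocorrelation(x):
    """Sample autocorrelation of x at lag one."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for beta in (-1.0, 0.0, 1.0):
        series = power_law_noise(4096, beta, rng)
        print(beta, lag1_autocorrelation(series))
```

With this construction, positive \(\beta \) puts more weight on low frequencies and yields a positive lag-one autocorrelation, while negative \(\beta \) yields a negative one.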

Classical stochastic demography theory (Tuljapurkar 1990; Boyce et al. 2006) considers population growth rates in the absence of density-dependent feedbacks. Our results for populations experiencing negative density-dependence show that stochastic persistence depends on the population’s long-term growth rate \(r(0)\) when rare. Hence, applying stochastic demography theory to \(r(0)\) provides insights into how environmental stochasticity interacts with population structure to determine stochastic persistence. For example, a fundamental result from stochastic demography is that positive, within-year correlations between vital rates decrease \(r(0)\) and thereby may thwart stochastic persistence, a result consistent with our analysis of the stochastic LPA model for flour beetle dynamics. Stochastic demography theory also highlights that temporal autocorrelations can have subtle effects on \(r(0)\). In particular, for a density-independent version of the metapopulation model considered here, Schreiber (2010) demonstrated that positive temporal autocorrelations can increase the metapopulation growth rate \(r(0)\) when rare for partially mixing populations, a prediction consistent with laboratory experiments (Matthews and Gonzalez 2007) and earlier theoretical work (Roy et al. 2005). In contrast, Tuljapurkar and Haridas (2006) found that negative temporal autocorrelations between years with and without fires increased the realized per-capita growth rate \(r(0)\) for models of the endangered herbaceous perennial Lomatium bradshawii. Our results imply that these results also apply to models accounting for density-dependence.

Spatial heterogeneity of populations has been shown theoretically and empirically to have an effect on coexistence of competing species (see e.g. Amarasekare (2003) or Chesson (2000b) for a review). Coexistence requires species to exhibit niche differentiation that decreases interspecific competition (Chesson 2000a). In a fluctuating environment, these niches can arise as differential responses to temporal variation (McGehee and Armstrong 1977; Armstrong and McGehee 1980; Chesson 2000a, b), spatial variation (May and Hassell 1981; Chesson 2000a, b; Snyder and Chesson 2003), or a combination of both forms of variation (Chesson 1985; Snyder 2007, 2008). For the spatial lottery model where species disperse between a finite number of patches and compete for micro sites within these patches, our coexistence criterion applies, and reduces to the mutual invasibility criterion. Although Chesson (1985) proved this result in the limit of an infinite number of patches with temporally uncorrelated fluctuations, our result is less restrictive as the number of patches can be small and temporal fluctuations can be autocorrelated. Using this mutual invasibility criterion, we derive explicit coexistence criteria for relatively sedentary populations and highly dispersive populations. In the former case, coexistence occurs if each species has a patch it can invade when rare. For short-lived populations, coexistence requires that each species has a patch in which it is competitively dominant. Alternatively, for long-lived populations, regional coexistence may occur if species coexist locally within a patch due to the storage effect (Chesson and Warner 1981; Chesson 1982, 1994) in the one patch case. For highly dispersive populations, the coexistence criterion is only satisfied if populations exhibit overlapping generations, a conclusion consistent with Chesson (1985). 
By providing the first mathematical confirmation of the mutual invasibility criterion for the spatial lottery model with spatial and temporal variation, our result opens the door for mathematically more rigorous investigations in understanding the relative roles of temporal variation, spatial heterogeneity, and dispersal on coexistence.

For lottery models with three or more species, persistence criteria are more subtle and invasibility of all sub communities is not always sufficient (May and Leonard 1975). For example, in rock-paper-scissor communities where species \(2\) displaces species \(1\), \(3\) displaces \(2\), and \(1\) displaces \(3\), all sub communities which consist of a single species are invasible by another, but coexistence may not occur (Hofbauer and Sigmund 1998; Schreiber and Killingback 2013). For the deterministic models, coexistence requires that the geometric mean of the benefits of pair-wise interactions exceeds the costs of these interactions (Schreiber and Killingback 2013). Schreiber et al. (2011) and Schreiber and Killingback (2013) studied these interactions in models separately accounting for temporal fluctuations or spatial heterogeneity. In both cases, temporal heterogeneity or spatial heterogeneity can individually promote coexistence. Here we extend these results to intransitive communities experiencing both spatial heterogeneity and temporal fluctuations, thereby unifying this prior work. Our persistence criterion reduces to: the geometric mean of the positive long-term, low-density growth rates of each species (e.g. invasion rate of rock to scissor) is greater than the geometric mean of the absolute values of the negative, long-term, low-density growth rates (e.g. invasion rate of rock to paper). For relatively sedentary populations, coexistence only requires that the average benefits (relative to the base payoff) in one patch are greater than the average costs in another patch. Moreover, negative correlations between benefits and basal payoffs promote coexistence. For highly dispersive species, coexistence requires that the spatially and temporally averaged benefits of interactions exceed the spatially and temporally averaged costs of interactions, assuming that base payoffs are constant in space and time.

The theory of stochastic population dynamics is confronted with many exciting challenges. First, our persistence criterion requires that every sub community (as represented by an ergodic invariant measure supporting a subset of species) be invasible by at least one missing species. While this invasibility condition in general is not sufficient for coexistence, understanding when it is sufficient remains a challenging open question. For example, it should be sufficient for most food chain models (see the argument for deterministic models in Schreiber 2000), non-interacting prey species sharing a common predator (see the argument for deterministic models in Schreiber 2004), and species competing for a single resource species. However, a simple criterion underlying these examples is still lacking. Second, while we have provided a sufficient condition for stochastic persistence, it is equally important to develop sufficient conditions for the asymptotic exclusion of one or more species with positive probability. In light of the deterministic theory, a natural conjecture in this direction is the following: if there exist non-negative weights \(p_1,\dots , p_k\) such that

$$\begin{aligned} \sum _i p_i r_i(x)<0 \end{aligned}$$

for every population state \(x\) in the extinction set \(S_0\), then there exist positive initial conditions such that \(X_t\) asymptotically approaches \(S_0\) with positive probability. Benaïm et al. (2008) proved a stronger version of this conjecture for stochastic differential equation models where the diffusion term is small and the populations are unstructured. However, it is not clear whether their methods carry over to models with “large” noise or population structure. Another important challenge is relaxing the compactness assumption H4 for our stochastic persistence results. While this assumption is biologically realistic (i.e. populations always have an upper limit on their size), it is theoretically inconvenient as many natural models of environmental noise have non-compact distributions (e.g. log-normal or gamma distributions). One promising approach developed by Benaïm and Schreiber (2009) for structured models of single species is identifying Lyapunov-like functions that decrease on average when population densities get large. Finding sufficient conditions for “stochastic boundedness” is only half of the challenge; extending the stochastic persistence results to these “stochastically bounded” models will require additional innovations. Finally, and most importantly, there is a desperate need to develop more tools to analytically approximate or directly compute the long-term growth rates \(r_i(\mu )\) when rare. One promising approach is the power series representation of Lyapunov exponents recently derived by Pollicott (2010).

8 Proof of Theorems 3.1 and 3.4

This section proves Theorem 3.4 from which Theorem 3.1 follows. Sections 8.1 and 8.2 lead to the statement of Theorem 8.11 which is equivalent to Theorem 3.4. The rest of the section is dedicated to the proof of Theorem 8.11. More specifically, in Sect. 8.1, we recast our stochastic model (1) and our main hypothesis in Arnold’s framework of random dynamical systems (Arnold 1998; Bhattacharya and Majumdar 2007). The purpose of this recasting is to write explicitly the underlying dynamics of the matrix products (3) in order to use the Random Perron-Frobenius Theorem (Ruelle 1979a), a key element in the proof of Theorem 3.4. The Random Perron-Frobenius Theorem requires these underlying dynamics to be invertible which is, a priori, not the case here. Therefore, in Sect. 8.2, we extend the underlying dynamics to invertible dynamics on the trajectory space and state Theorem 8.11 which is equivalent to Theorem 3.4. Working in Arnold’s framework and extending the dynamics to the trajectory space requires three forms of notation (i.e. main text, random dynamical system and trajectory space) that are summarized in Table 1. In Sect. 8.3, we prove basic results about the average per-capita growth rates \(r_i\). In Sect. 8.4, we prove several basic results about occupational measures and their weak* limit points. These basic results are proven for the extended state space. Proposition 8.10 and Lemma 8.20 translate these results to the non-extended state space. A proof of Theorem 8.11 is provided in Sect. 8.5.

Table 1 Notation for the probabilistic, RDS, and trajectory space formulations of the population dynamics

8.1 Random dynamical systems framework

To prove our main result, it is useful to embed (1) and assumptions H1-H4 within Arnold’s general framework of random dynamical systems. Let \(\Omega =E^\mathbb Z\) be the set of possible environmental trajectories, \(\mathcal {F} = \mathcal {E}^\mathbb Z\) be the product \(\sigma \)-algebra on \(\Omega \), \(\theta : \Omega \mapsto \Omega \) be the shift operator defined by \(\theta (\{\omega _t\}_{t \in \mathbb Z}) = \{\omega _{t+1}\}_{t \in \mathbb Z}\), and \({\mathbb {Q}}\) be the probability measure on \(\Omega \) satisfying

$$\begin{aligned} {\mathbb {Q}}(\{\omega \in \Omega : \omega _t \in E_0,\dots ,\omega _{t+k}\in E_k\}) ={\mathbb {P}}(\xi _0\in E_0,\dots ,\xi _k \in E_k) \end{aligned}$$

for any Borel sets \(E_0,\dots ,E_k \subset E\). Since \(E\) is a Polish space, the space \(\Omega \) endowed with the product topology is Polish as well. Therefore, by the Kolmogorov consistency theorem, the probability measure \({\mathbb {Q}}\) is well defined, and by a theorem of Rokhlin (1964), \(\theta \) is ergodic with respect to \({\mathbb {Q}}\). Randomness enters by choosing randomly a point \(\omega =\{\omega _t\}_{t\in \mathbb Z} \in \Omega \) with respect to the probability distribution \({\mathbb {Q}}\) and defining the environmental state at time \(t\) as \(\omega _t\).

In this framework, the dynamics (1) takes on the form

$$\begin{aligned} \left\{ \begin{array}{ll} X_{t+1}(\omega ,x) = X_t(\omega ,x) A(\omega _t,X_t(\omega ,x)) \\ X_0(\omega ,x) =x \in S. \end{array} \right. \end{aligned}$$
(24)

We call (24) the random dynamical system determined by \((\theta ,{\mathbb {P}}, A)\).

Define the skew product

$$\begin{aligned} \begin{array}{ccl} \Phi : &{}\Omega \times \mathbb R^n_+ &{}\rightarrow \Omega \times \mathbb R^n_+ \\ &{} (\omega , x) &{} \mapsto (\theta (\omega ), xA(\omega _0,x)) \end{array} \end{aligned}$$

associated with the dynamics (24) and define the projection maps \(p_1 : \Omega \times \mathbb R^n \rightarrow \Omega \) and \(p_2 : \Omega \times \mathbb R^n \rightarrow \mathbb R^n\) by \(p_1(\omega ,x) = \omega \) and \(p_2(\omega ,x) = x\). Let \(\Phi ^t\) denote the composition of \(\Phi \) with itself \(t\) times, for \(t \in \mathbb N\). Remark 1.1.8 in Arnold (1998) implies that the random dynamical system (24) is characterized by the skew product \(\Phi \) and vice versa. In particular, note that \(X_{t+1}(\omega ,x) = p_2 \circ \Phi ^{t+1}(\omega ,x)\) for \(x\in S\) and \(\omega \in \Omega \). Working with \(\Phi \) allows us to use the theory of discrete-time dynamical systems.
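The relation between the recursion (24) and the skew product can be checked on a toy example. The sketch below makes several assumptions that are not in the paper: the environment \(\omega \) is a finite NumPy array read as a periodic sequence, so the shift \(\theta \) becomes a cyclic roll, and the matrix function `A` with its constant `BASE` is a hypothetical density-dependent example.

```python
import numpy as np

BASE = np.array([[0.5, 0.2],
                 [0.3, 0.4]])  # hypothetical demographic matrix


def A(e, x):
    """Hypothetical matrix A(e, x): environment e scales BASE, total density reduces it."""
    return e * BASE / (1.0 + x.sum())


def theta(omega):
    """Shift operator on a periodic environmental sequence."""
    return np.roll(omega, -1)


def Phi(omega, x):
    """Skew product (omega, x) -> (theta(omega), x A(omega_0, x))."""
    return theta(omega), x @ A(omega[0], x)


def X(t, omega, x):
    """State after t steps of the recursion X_{s+1} = X_s A(omega_s, X_s), X_0 = x."""
    for s in range(t):
        x = x @ A(omega[s % len(omega)], x)
    return x


if __name__ == "__main__":
    omega = np.array([1.5, 0.8, 2.0])
    state = (omega, np.array([1.0, 2.0]))
    for _ in range(5):
        state = Phi(*state)
    print(state[1], X(5, omega, np.array([1.0, 2.0])))  # the two vectors agree
```

Iterating `Phi` and reading off the second coordinate reproduces the recursion, which is the identity \(X_t = p_2\circ \Phi ^t\) in this finite setting.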

Definition 8.1

A compact set \(K\subset \Omega \times \mathbb R^n_+\) is a global attractor for \(\Phi \) if there exists a neighborhood \(V\) of \(K\) such that

  1. (i)

    for all \((\omega ,x) \in \Omega \times \mathbb R^n_+ \), there exists \(T \in \mathbb N\) such that \(\Phi ^t(\omega ,x) \in V\) for all \(t \ge T\);

  2. (ii)

    \(\Phi (V) \subset V\) and \(K=\bigcap _{t\in \mathbb N}\Phi ^t(V)\).

In this random dynamical systems framework, our assumptions H1 and H4 take on the form

  1. H1’:

    \(\Omega \) is a compact space, \({\mathbb {Q}}\) is a Borel probability measure, and \(\theta \) is an invertible map that is ergodic with respect to \({\mathbb {Q}}\), i.e. for every Borel set \(B \subset \Omega \) such that \(\theta ^{-1}(B)=B\), we have \({\mathbb {Q}}(B) \in \{0,1\}\).

  2. H4’:

    There exists a global attractor \(K \subset \Omega \times \mathbb R^n_+\) for \(\Phi \).

Assumptions H2–H3 do not need to be rewritten in the new framework. Since every ergodic stationary process on a Polish space can be described as an ergodic measure-preserving transformation (Kolmogorov consistency theorem and Rokhlin theorem), assumption H1’ is less restrictive than H1. Assumption H4’ is simply a restatement of assumption H4 in the random dynamical systems framework.

To state Theorem 3.4 in this random dynamical systems framework, we define invariant measures for the random dynamical system (24). We follow the definition given by Arnold (1998). First, recall some useful definitions and notations. Let \(M\) be a metric space, and let \(\mathcal {P}(M)\) be the space of Borel probability measures on \(M\) endowed with the weak\(^*\) topology. If \(M'\) is also a metric space and \(f :M \rightarrow M'\) is Borel measurable, then the induced linear map \(f^{*} :\mathcal {P}(M) \rightarrow \mathcal {P}(M')\) associates with \(\nu \in \mathcal {P}(M)\) the measure \(f^{*}(\nu ) \in \mathcal {P}(M')\) defined by

$$\begin{aligned} f^*(\nu ) (B)=\nu (f^{-1}(B)) \end{aligned}$$

for all Borel sets \(B\) in \(M'\). If \(\theta :M \rightarrow M\) is a continuous map, a measure \(\nu \in \mathcal {P}(M)\) is called \(\theta \)-invariant if \(\nu (\theta ^{-1}(B)) = \nu (B)\) for all Borel sets \(B \subset M\). A set \(B\subset M\) is positively invariant if \(\theta (B) \subset B\). For every positively invariant compact set \(B\), let \({{\mathrm{Inv}}}(\theta )(B)\) be the set of all \(\theta \)-invariant measures supported on \(B\).
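For measures on a finite set these definitions reduce to bookkeeping. The following is a minimal sketch, assuming a measure is stored as a Python dictionary mapping points to masses; the names `pushforward` and `is_invariant` are illustrative. Invariance of \(\nu \) under \(\theta \) is the statement that the induced measure \(\theta ^*(\nu )\) equals \(\nu \).

```python
from collections import defaultdict


def pushforward(nu, f):
    """Induced measure f*(nu): the mass of each point y is nu(f^{-1}({y}))."""
    image = defaultdict(float)
    for point, mass in nu.items():
        image[f(point)] += mass
    return dict(image)


def is_invariant(nu, theta, tol=1e-12):
    """Check nu(theta^{-1}(B)) = nu(B) on singletons, i.e. theta*(nu) = nu."""
    image = pushforward(nu, theta)
    support = set(nu) | set(image)
    return all(abs(nu.get(p, 0.0) - image.get(p, 0.0)) <= tol for p in support)


if __name__ == "__main__":
    uniform = {k: 0.25 for k in range(4)}
    rotation = lambda k: (k + 1) % 4
    print(pushforward(uniform, lambda k: k % 2))
    print(is_invariant(uniform, rotation), is_invariant({0: 1.0}, rotation))
```

In the example, the uniform measure on four points is invariant under the cyclic rotation, while a point mass is not.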

Definition 8.2

A probability measure \(\mu \) on \(\Omega \times \mathbb R^n_+\) is invariant for the random dynamical system (24) if

  1. (i)

    \(\mu \in {{\mathrm{Inv}}}(\Phi )(\Omega \times \mathbb R^n_+)\),

  2. (ii)

    \(p_1^*(\mu ) = {\mathbb {Q}}\), i.e. for all Borel sets \(D \subset \Omega \), \(\mu (D \times \mathbb R^n_+)= {\mathbb {Q}}(D)\).

For any positively invariant set \(\Omega \times C\) where \(C \subset \mathbb R^n_+\) is compact, \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times C)\) is the set of all measures \(\mu \) satisfying (i) and (ii) such that \(\mu (\Omega \times C)=1\).

In words, a probability measure \(\mu \) is invariant for the random dynamical system (24) if it is invariant for the skew product \(\Phi \) and if its first marginal is the probability \({\mathbb {Q}}\) on \(\Omega \).

The following result is a consequence of Theorem 1.5.10 in Arnold (1998). In fact, the topology defined in his definition 1.5.3 is finer than the weak\(^*\) topology on the set of all probability measures on \(\Omega \times C\).

Proposition 8.3

If \(C \subset \mathbb R^n_+\) is a positively invariant compact set, then \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times C)\) is a nonempty, convex, compact subset of \(\mathcal {P}(\Omega \times \mathbb R^n_+)\).

The main assumption in Theorem 3.4 deals with the long-term growth rates which characterize, in some sense, the long-term behavior of random matrix products (see Definition 3.3). In order to define those products in the new framework, let \(\mathbf {M}_{d}(\mathbb R)\) be the set of all \(d\times d\) matrices over \(\mathbb R\) and consider the maps \(A_i : \Omega \times S \rightarrow \mathbf {M}_{n_i}(\mathbb R)\), defined by

$$\begin{aligned} A_i(\omega ,x)=A_i(\omega _0,x). \end{aligned}$$

While our choice of notation here differs slightly from the main text, this choice simplifies the proof. We write

$$\begin{aligned} A^t_i(\omega ,x) :=A_i(\omega ,x)A_i(\Phi (\omega ,x)) \cdots A_i(\Phi ^{t-1}(\omega ,x)), \end{aligned}$$
(25)

with the convention that \(A_i^0(\omega ,x) = \mathrm {id}\), the identity matrix.

Then, for each \(i\in \{1,\dots ,m\}\), the asymptotic growth rate of the product (25) associated with \((\omega ,x) \in \Omega \times \mathbb R^n_+\) is

$$\begin{aligned} r_i(\omega ,x) := \limsup _{t \rightarrow \infty }\frac{1}{t}\ln \Vert A_i^t(\omega ,x)\Vert , \end{aligned}$$

which is finite, due to assumptions H3 and H4’. According to Definition 8.2, the invasion rate of species \(i\) with respect to an invariant measure \(\mu \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )\) is

$$\begin{aligned} r_i(\mu ):= \int _{\Omega \times \mathbb R^n_+}r_i(\omega , x) \mu (d\omega , dx). \end{aligned}$$
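The growth rate \(r_i(\omega ,x)\) is a Lyapunov exponent of a matrix product and can be approximated numerically over a finite horizon. The sketch below is an illustration under assumptions, not a procedure from the paper: it takes a finite list of non-negative matrices (the factors of the product (25) along one trajectory), propagates a positive row vector through them, and renormalizes at each step to avoid overflow. The function name `long_term_growth_rate` is hypothetical.

```python
import numpy as np


def long_term_growth_rate(matrices):
    """Finite-horizon estimate of (1/t) * ln || A_1 A_2 ... A_t ||.

    A uniform positive row vector is pushed through the product; for non-negative
    matrices its l1 norm equals the entrywise l1 norm of the product divided by n,
    so the two growth rates differ by a term of order 1/t.
    """
    matrices = list(matrices)
    n = matrices[0].shape[0]
    v = np.full(n, 1.0 / n)
    log_growth = 0.0
    for M in matrices:
        v = v @ M
        norm = np.abs(v).sum()
        log_growth += np.log(norm)  # accumulate the log of each one-step growth factor
        v = v / norm                # renormalize to keep the numbers bounded
    return log_growth / len(matrices)


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    base = np.array([[0.5, 0.2], [0.3, 0.4]])
    env = rng.uniform(0.5, 2.0, size=2000)        # illustrative environmental factors
    print(long_term_growth_rate(e * base for e in env))
    print(long_term_growth_rate([np.diag([2.0, 1.0])] * 500), np.log(2.0))
```

For a constant matrix the estimate approaches the logarithm of its spectral radius, which gives a simple sanity check.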

Remark 8.4

Note that for any \(x\in \mathbb R^n_+\), the random variable \(r_i(x)\) defined by (4) is equal in distribution to the random variable \(r_i(\cdot ,x)\). Also by definition of \({\mathbb {Q}}\) and \(\Phi \) there is a bijection, say \(h\), between the set \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times \mathbb R^n_+)\) and the set of measures defined in Definition 3.2. Moreover the invasion rate with respect to an invariant measure is invariant by \(h\), i.e. for all \(\mu \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )\), \(r_i(\mu )=r_i(h(\mu ))\).

Given a point \((\omega ,x) \in \Omega \times \mathbb R^n_+\), let \(\Pi _t(\omega ,x)\) denote the empirical occupation measure of the trajectory \(\{X_s(\omega ,x)\}_{s\ge 0}\) at time \(t\) defined by

$$\begin{aligned} \Pi _t(\omega ,x):= \frac{1}{t}\sum _{s=0}^{t-1}\delta _{X_s(\omega ,x)}. \end{aligned}$$

For each Borel set \(B \subset \mathbb R^n_+\), the random variable \(\Pi ^x_t(B)\) given by (2) is equal in distribution to the random variable \(\Pi _t(\cdot ,x)(B)\).

For all \(\eta >0\), recall that \(S_{{\eta }}:= \{x \in \mathbb R^n_+ \ : \ \Vert x^i\Vert \le \eta \text { for some }i \}\). We can now rephrase Theorem 3.4 in the framework of random dynamical systems.

Theorem 8.5

If one of the following equivalent conditions holds

  1. (i)

    \(r_*(\mu ):= \max _{1\le i \le m}r_i(\mu ) >0\) for every probability measure \(\mu \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times S_0)\), or

  2. (ii)

    there exist positive constants \(p_1,\dots ,p_m\) such that

    $$\begin{aligned} \sum _i p_i r_i(\mu )>0 \end{aligned}$$

    for every ergodic probability measure \(\mu \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times S_0)\), or

  3. (iii)

    there exist positive constants \(p_1,\dots ,p_m\) such that

    $$\begin{aligned} \sum _i p_i r_i(\omega ,x)>0 \end{aligned}$$

    for every \(x\in S_0\) and \({\mathbb {Q}}\)-almost all \(\omega \in \Omega \),

then for all \(\varepsilon >0\), there exists \(\eta >0\) such that

$$\begin{aligned} \limsup _{t \rightarrow \infty } \Pi _t(\omega ,x)( S_{\eta }) \le \varepsilon \ \ \text { for }{\mathbb {Q}}\text {-almost all } \omega , \end{aligned}$$

whenever \(x \in \mathbb R^n_+ \backslash S_0\).

Remark 8.4 implies that Theorem 8.5 and Theorem 3.4 are equivalent. The remainder of the section is devoted to proving Theorem 8.5.
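The quantity \(\Pi _t(\omega ,x)(S_\eta )\) in the conclusion of Theorem 8.5 is the fraction of time a trajectory spends with at least one species at low density. The following is a minimal sketch of how it could be computed from a simulated trajectory. It assumes NumPy, stores the states \(X_0,\dots ,X_{t-1}\) as rows of an array, uses the \(\ell ^1\) norm for \(\Vert x^i\Vert \), and takes a hypothetical argument `blocks` listing which coordinates belong to each species.

```python
import numpy as np


def occupation_fraction(trajectory, blocks, eta):
    """Fraction of times s < t with ||x_s^i|| <= eta for some species i."""
    trajectory = np.asarray(trajectory, dtype=float)
    # One column per species: the l1 norm of that species' sub-vector at each time.
    norms = np.stack(
        [np.abs(trajectory[:, idx]).sum(axis=1) for idx in blocks], axis=1
    )
    near_extinction = (norms <= eta).any(axis=1)
    return float(near_extinction.mean())


if __name__ == "__main__":
    # Two species: the first has two stages (columns 0 and 1), the second has one.
    trajectory = [[1.0, 1.0, 0.05],
                  [1.0, 1.0, 1.0],
                  [0.01, 0.02, 1.0],
                  [1.0, 1.0, 1.0]]
    print(occupation_fraction(trajectory, blocks=[[0, 1], [2]], eta=0.1))
```

Stochastic persistence in the sense of the theorem means that, for long trajectories, this fraction can be made as small as desired by choosing \(\eta \) small.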

8.2 Trajectory space

The key element of the proof of Theorem 8.5 is Proposition 8.13 due to Ruelle (1979a), in which it is crucial that the map \(\Phi \) is a homeomorphism. However, the map \(\Phi \) is, a priori, not invertible. To circumvent this issue, we extend the dynamics induced by \(\Phi \) to an invertible map on the space of possible trajectories. Then, we state an equivalent version of Theorem 8.5 in this larger space that we prove in Sect. 8.5.

By definition of the global attractor \(K\), there exists a neighborhood \(V\) of \(p_2(K)\) in \(\mathbb R^n_+\) such that \(\Phi (\Omega \times V) \subset \Omega \times V\). By continuity of \(\Phi \), this inclusion still holds for the closure \(\overline{V}\) of \(V\), i.e.

$$\begin{aligned} \Phi (\Omega \times \overline{V}) \subset \Omega \times \overline{V}. \end{aligned}$$

This inclusion implies that, for every point \((\omega ,x) \in \Omega \times \overline{V}\), there exists a sequence \(\{x_t\}_{t\in \mathbb N} \in \overline{V}^{\mathbb N}\) such that \(x_0=x\), and \((\theta ^{t+1}(\omega ),x_{t+1})=\Phi (\theta ^t(\omega ),x_t)\) for all \(t \ge 0\). The sequence \(\{(\theta ^t(\omega ), x_t)\}_{t\ge 0}\) is called a \(\Phi \)-positive trajectory. Note that the first coordinate of a \(\Phi \)-positive trajectory is determined by \(\omega \) and \(\theta \). Therefore a \(\Phi \)-positive trajectory can be seen as a pair \((\omega , \{x_t\}_{t \ge 0})\). In order to create a past for all those \(\Phi \)-positive trajectories, let us pick a point \(x^*\in S\backslash (\overline{V}\cup S_0)\), and consider the product space \(\mathcal {T}:=\Omega \times (\overline{V} \cup \{x^*\})^{\mathbb Z}\) endowed with the product topology, and the homeomorphism \(\Theta : \mathcal {T} \rightarrow \mathcal {T} \), called the shift operator, defined by \(\Theta (\omega ,\{x_t\}_{t\in \mathbb Z}) = (\theta (\omega ),\{x_{t+1}\}_{t\in \mathbb Z})\). Since both \(\Omega \) and \(\overline{V}\cup \{x^*\}\) are compact, the space \(\mathcal {T}\) is compact as well.

Every \(\Phi \)-positive trajectory can be realized as an element of \(\mathcal {T}\) by creating a fixed past (i.e. \(x_t=x^*\) for all \(t<0\)). Then, define

$$\begin{aligned} \Gamma = \overline{\bigcup _{t\in \mathbb Z}\Theta ^t\{ \gamma \in \mathcal {T} : \gamma \ \text { is a }\Phi \text {-positive trajectory} \}}. \end{aligned}$$

In words, \(\Gamma \) is the closure in \(\mathcal {T}\) of the set of all shifted (by \(\Theta ^t\) for some \(t\in \mathbb Z\)) \(\Phi \)-positive trajectories. Since \(\Gamma \) is a closed subset of the compact space \(\mathcal {T}\), it is compact as well. Moreover \(\Gamma \) is invariant under \(\Theta \), which implies that the restriction \({\Theta }|_{\Gamma }\) of \(\Theta \) to \(\Gamma \) is well-defined. To simplify the presentation we still denote this restriction by \(\Theta \). The projection map \(\pi _0 : \Gamma \rightarrow \Omega \times (\overline{V} \cup \{x^*\})\) is defined by \(\pi _0(\gamma )=(\omega ,x_0)\) for all \(\gamma =(\omega ,\{x_t\}_t) \in \Gamma \). By definition, the map \(\pi _0\) is continuous and surjective. From now on, when we write \(\gamma \in \Gamma \), we mean \(\gamma =(\omega ,\{x_t\}_{t\in \mathbb Z})\).

Define the compact set of all \(\Phi \)-total trajectories as

$$\begin{aligned} \Gamma _+:= \pi _0^{-1}(\Omega \times \overline{V}), \end{aligned}$$

and the compact set of \(\Phi \)-total-solution trajectories on the extinction set \(S_0\) as

$$\begin{aligned} \Gamma _0:= \pi _0^{-1}(\Omega \times S_0). \end{aligned}$$

The dynamics induced by \(\Phi \) on \(\Omega \times \overline{V}\) are linked to the dynamics induced by \(\Theta \) on \(\Gamma _+\) by the following semi-conjugacy

$$\begin{aligned} \pi _0 \circ \Theta = \Phi \circ \pi _0. \end{aligned}$$
(26)

Thus, the map \(\Theta \) on \(\Gamma _+\) can be seen as the extension of the map \(\Phi \) on \(\Omega \times \overline{V}\).

In order to write an equivalent statement of Theorem 8.5 with respect to the dynamics of \(\Theta \), we consider a subset of the invariant measures of \(\Theta \) consistent with the set \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times S)\) in the sense of Corollary 8.8 below. For \(B \subset \Gamma \) positively \(\Theta \)-invariant and compact, define

$$\begin{aligned} {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(B) :=\{\tilde{\mu } \in {{\mathrm{Inv}}}(\Theta )(B) : p_1^*\circ \pi _0^*(\tilde{\mu }) = \mathbb {Q} \}. \end{aligned}$$

Proposition 8.6

\({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)\) and \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0)\) are compact and convex subsets of \(\mathcal {P}(\Gamma )\).

Proof

Since \(\Gamma _+\) and \(\Gamma _0\) are positively invariant compact sets, \({{\mathrm{Inv}}}(\Theta )(\Gamma _+)\) and \({{\mathrm{Inv}}}(\Theta )(\Gamma _0)\) are nonempty, compact and convex subsets of \(\mathcal {P}(\Gamma )\). Then, since \(p_1^*\circ \pi _0^*\) is continuous, \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)\) (resp. \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0)\)) is compact as a closed subset of \({{\mathrm{Inv}}}(\Theta )(\Gamma _+)\) (resp. \({{\mathrm{Inv}}}(\Theta )(\Gamma _0)\)). The convexity of \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)\) and \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0)\) is a consequence of the convexity of \({{\mathrm{Inv}}}(\Theta )(\Gamma _+)\) and \({{\mathrm{Inv}}}(\Theta )(\Gamma _0)\), and the linearity of \(p_1^*\circ \pi _0^*\). \(\square \)

As a consequence of equation (26), we have

Proposition 8.7

For every \(\Theta \)-invariant measure \(\tilde{\mu }\) supported on \(\Gamma _+\), \(\pi _0^*(\tilde{\mu })\) is \(\Phi \)-invariant.

Proof

Let \(\tilde{\mu }\) be a \(\Theta \)-invariant measure supported on \(\Gamma _+\). Then the measure \(\pi _0^*(\tilde{\mu })\) is supported by \(\Omega \times \overline{V}\). Let \(B \subset \Omega \times \overline{V}\) be a Borel set. We have

$$\begin{aligned} \pi ^*_0(\tilde{\mu })(\Phi ^{-1}(B))&= \tilde{\mu }(\pi _0^{-1}(\Phi ^{-1}(B)))\\&= \tilde{\mu }(\pi _0^{-1}(\Phi ^{-1}(B) \cap (\Omega \times \overline{V})))\\&= \tilde{\mu }(({ \left. \Phi \right| _{\Omega \times \overline{V}} } \circ \pi _0)^{-1}(B))\\&= \tilde{\mu }((\pi _0\circ { \left. \Theta \right| _{\Gamma _+} })^{-1}(B))\\&= \tilde{\mu }(\pi _0^{-1}(B))\\&= \pi ^*_0(\tilde{\mu })(B). \end{aligned}$$

The second equality follows from the fact that the support of \(\tilde{\mu }\) is included in \(\Gamma _+\), and the fourth is a consequence of the semi-conjugacy (26). \(\square \)

Corollary 8.8

\(\pi _0^*({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+))\) is a compact and convex subset of \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times \overline{V})\).

Proof

Since \(\pi _0^*\) is continuous and linear, Proposition 8.6 implies that \(\pi _0^*({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+))\) is compact and convex. Proposition 8.7 implies that \(\pi _0^*({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)) \subset {{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times \overline{V})\). \(\square \)

Remark 8.9

The definition of \(\Theta \) and assumption H3 imply that the sets \(\Gamma _0\) and \(\Gamma _+\backslash \Gamma _0\) are both positively \(\Theta \)-invariant. Therefore every \(\Theta \)-invariant measure \(\tilde{\mu }\) on \(\Gamma _+\) can be written as a convex combination of two \(\Theta \)-invariant measures \(\tilde{\nu }_0\) and \(\tilde{\nu }_1\) such that \(\tilde{\nu }_0(\Gamma _0)=1\) and \(\tilde{\nu }_1(\Gamma _+\backslash \Gamma _0)=1\).

In order to restate Theorem 8.5 in the space of trajectories, the random matrix products (25) over \(\Phi \) have to be rewritten as products over \(\Theta \). For each \(i\in \{1,\dots ,m\}\), define the maps \(A_i : \Gamma \rightarrow \mathbf {M}_{n_i}(\mathbb R)\) by

$$\begin{aligned} A_i(\gamma )= \left\{ \begin{array}{ll} A_i(\omega , x^*) &{}\quad \text { if } x_0=x^* \\ A_i(\omega , x_0) &{}\quad \text { otherwise}. \end{array} \right. \end{aligned}$$

As in (25), we write

$$\begin{aligned} A^t_i(\gamma ) := A_i(\gamma )\cdots A_i(\Theta ^{t-1}(\gamma )). \end{aligned}$$
(27)

The semi-conjugacy (26) implies that for all \((\omega ,x) \in \Omega \times \overline{V}\) and all \(\gamma \in \pi _0^{-1}(\omega ,x)\), we have

$$\begin{aligned} A^t_i(\gamma )=A^t_i(\omega ,x), \end{aligned}$$
(28)

for all \(t\ge 0\).

Then the long-term growth rate for the product (28) is

$$\begin{aligned} r_i(\gamma ) := \limsup _{t \rightarrow \infty }\frac{1}{t} \ln \Vert A^t_i(\gamma )\Vert , \end{aligned}$$

and, for a \(\Theta \)-invariant measure \(\tilde{\mu }\), the long-term growth rate is

$$\begin{aligned} r_i(\tilde{\mu }) = \int _{\Gamma }r_i(\gamma )d\tilde{\mu }. \end{aligned}$$

The following proposition shows that the long-term growth rates for the product (28) defined on the trajectory space are consistent with those for the product (25) defined on \(\Omega \times \overline{V}\).

Proposition 8.10

For all species \(i\), we have

  1. (i)

    \(r_i(\omega ,x) = r_i(\gamma )\), for all \((\omega ,x) \in \Omega \times \overline{V}\) and for all \(\gamma \in \pi _0^{-1}(\omega ,x)\),

  2. (ii)

    for all \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)\), \(\pi _0^*(\tilde{\mu }) \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Phi )(\Omega \times \overline{V})\), and

    $$\begin{aligned} r_i(\tilde{\mu })=r_i(\pi ^*_0(\tilde{\mu })). \end{aligned}$$

Proof

Assertion (i) is a consequence of equality (28), and assertion (ii) is a consequence of Corollary 8.8. \(\square \)

We can now state an equivalent version of Theorem 8.5 on the space of trajectories \(\Gamma \).

Theorem 8.11

If one of the following equivalent conditions holds

  1. (a)

    \(r_*(\tilde{\mu }):= \max _{1\le i \le m}r_i(\tilde{\mu }) >0\) for every probability measure \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0)\), or

  2. (b)

    there exist positive constants \(p_1,\dots ,p_m\) such that

    $$\begin{aligned} \sum _i p_i r_i(\tilde{\mu })>0 \end{aligned}$$

    for every ergodic probability measure \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0)\), or

  3. (c)

    there exist positive constants \(p_1,\dots ,p_m\) such that

    $$\begin{aligned} \sum _i p_i r_i(\omega ,x)>0 \end{aligned}$$

    for every \(x\in S_0\) and \({\mathbb {Q}}\)-almost all \(\omega \in \Omega \),

then for all \(\varepsilon >0\), there exists \(\eta >0\) such that

$$\begin{aligned} \limsup _{t \rightarrow \infty } \Pi _t(\omega ,x)( S_{\eta }) \le \varepsilon \ \ \text { for }{\mathbb {Q}}\text {-almost all } \omega , \end{aligned}$$

whenever \(x \in \mathbb R^n_+ \backslash S_0\).

Remark 8.12

Condition (c) of Theorem 8.11 and condition (iii) of Theorem 8.5 are equivalent, and the implications from conditions (iii) to (ii) and (ii) to (i) of Theorem 8.5 are direct. The proof of Theorem 8.11 (see Sect. 8.5) shows that (a), (b) and (c) of Theorem 8.11 are equivalent. Finally, condition (i) of Theorem 8.5 implies condition (a) of Theorem 8.11 as a direct consequence of assertion (ii) of Proposition 8.10. Hence, Theorems 8.5 and 8.11 are equivalent.

8.3 Random Perron-Frobenius Theorem and long-term growth rates

In this section, we first state Proposition 3.2 of Ruelle (1979a) (which we call the Random Perron-Frobenius Theorem) in its original framework, and extend it to ours. We use this extension to deduce some properties on the long-term growth rates which are crucial for the proof of Theorem 8.11. Let \({{\mathrm{int}}}\mathbb R_+^d = \{x\in \mathbb R^d_+ : \prod _i x_i >0\}\) be the interior of \(\mathbb R^d_+\).

Proposition 8.13

(Ruelle 1979a) Let \(\Xi \) be a compact space, \(\Psi : \Xi \rightarrow \Xi \) be a homeomorphism. Consider a continuous map \(T : \Xi \rightarrow \mathbf {M}_d(\mathbb R)\) and its transpose \(T^*\) defined by \(T^*(\xi )=T(\xi )^*\). Write

$$\begin{aligned} T^t(\xi )= T(\xi )\cdots T(\Psi ^{t-1}\xi ), \end{aligned}$$

and assume that

  1. A:

    for all \(\xi \in \Xi \), \(T(\xi )(\mathbb R^d_+) \subset \{0\} \cup {{\mathrm{int}}}\mathbb R_+^d\).

Then there exist continuous maps \(u, v : \Xi \rightarrow \mathbb R^d_+\) with \(\Vert u(\xi )\Vert =\Vert v(\xi )\Vert =1\) such that

  1. (i)

    the line bundles \(E\) (resp. \(F\)) spanned by \(u(\cdot )\) (resp. \(v(\cdot )\)) are such that \(\mathbb R^{d} = E\bigoplus F^{\perp }\) where \(b \in F(\xi )^{\perp }\) if and only if \(\langle b(\xi ), v(\xi )\rangle =0\).

  2. (ii)

    \(E\) (resp. \(F\)) is \(T,\Psi \)-invariant (resp. \(T^*,\Psi ^{-1}\)-invariant), i.e. \(E(\Psi (\xi )) = E(\xi )T(\xi )\) and \(F(\Psi \xi )T^*(\Psi \xi )=F(\xi )\), for all \(\xi \in \Xi \);

  3. (iii)

    there exist constants \(\alpha <1\) and \(C>0\) such that for all \(t\ge 0\), and \(\xi \in \Xi \),

    $$\begin{aligned} \Vert b T(\xi )\cdots T(\Psi ^{t-1}\xi ) \Vert \le C\alpha ^t \Vert a T(\xi )\cdots T(\Psi ^{t-1}\xi ) \Vert , \end{aligned}$$

    for all unit vectors \(a \in E(\xi ), b \in F(\xi )^{\perp }\).

Our choice to call Proposition 8.13 the Random Perron-Frobenius Theorem is motivated by the following remark.

Remark 8.14

Assume that the map \(T: \Xi \rightarrow \mathbf {M}_d(\mathbb R)\) is constant, i.e. there exists a positive matrix \(B \in \mathbf {M}_d(\mathbb R)\) such that \(T(\xi )= B\) for all \(\xi \in \Xi \). Then Proposition 8.13 can be restated as follows: there exist \(u,v \in \mathbb R^d_+\) such that \(u(\xi )=u\) and \(v(\xi )=v\) for all \(\xi \in \Xi \); the positive vectors \(u\) and \(v^*\) are respectively the right and left eigenvectors of \(B\) associated with its dominant eigenvalue (also called Perron eigenvalue) \(r>0\); assertion (iii) can be restated as the strong ergodic theorem of demography. That is

$$\begin{aligned} \lim _{t \rightarrow \infty }B^tx/r^t = v^*xu, \end{aligned}$$

for all \(x\in {{\mathrm{int}}}\mathbb R^d_+\). Since \(B^tx\) is the population at time \(t\) with an initial population \(x\), the interpretation of this theorem is that the eigenvector \(u\) represents the stable population structure, and the coefficients of \(v\) are the reproductive values of the population.
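The constant-environment case of Remark 8.14 can be checked numerically. The following is a minimal sketch rather than part of the original argument: the matrix \(B\), the initial population `x0`, and all helper names are hypothetical, and we assume the normalisation \(v^*u=1\), under which the limit reads \((v^*x)u\).

```python
# Illustration only: B, x0 and the helper names are assumptions, not from the paper.

def mat_vec(B, x):
    """Column-vector product B x."""
    return [sum(B[i][j] * x[j] for j in range(len(x))) for i in range(len(B))]

def transpose(B):
    return [list(col) for col in zip(*B)]

def perron(B, iters=500):
    """Power iteration: returns (r, w) with B w = r w, w > 0 and sum(w) = 1."""
    w = [1.0 / len(B)] * len(B)
    r = 1.0
    for _ in range(iters):
        y = mat_vec(B, w)
        r = sum(y)              # growth of the 1-norm, because sum(w) = 1
        w = [c / r for c in y]
    return r, w

B = [[0.5, 2.0],
     [0.3, 0.4]]                # positive projection matrix (hypothetical)

r, u = perron(B)                # right eigenvector: stable population structure
_, v = perron(transpose(B))     # left eigenvector: reproductive values
scale = sum(vi * ui for vi, ui in zip(v, u))
v = [vi / scale for vi in v]    # normalise so that v . u = 1 (assumed convention)

x0 = [3.0, 1.0]
x = x0[:]
for _ in range(60):             # x = B^60 x0 / r^60
    x = [c / r for c in mat_vec(B, x)]

limit = [sum(vi * xi for vi, xi in zip(v, x0)) * ui for ui in u]
gap = max(abs(a - b) for a, b in zip(x, limit))
print(r, x, limit, gap)
```

In this sketch the rescaled population `x` and the predicted limit `limit` agree up to rounding, whatever positive `x0` is chosen.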

In Proposition 8.13, the stable population structure and the reproductive values cannot be fixed vectors, because the long-term dynamics of the population depend on the sequence of environments encapsulated in \(\xi \). Therefore, they have to be functions of the environment, i.e. \(u, v : \Xi \rightarrow \mathbb R^d_+\). To interpret those functions, we look at the following consequence of assertion (iii)

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{xT^t(\Psi ^{-t}\xi )}{\Vert xT^t(\Psi ^{-t}\xi ) \Vert }=u(\xi ), \end{aligned}$$
(29)

and its dual version

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{T^t(\xi )y^*}{\Vert T^t(\xi )y^* \Vert }=v(\xi )^*. \end{aligned}$$
(30)

The former equation appears in the proof of Proposition 8.15 as equation (31). For the sake of interpretation, assume that the environment over time has been fixed (here \(\dots ,\Psi ^{-1}\xi ,\xi ,\Psi ^1\xi ,\dots \)). Then (29) is interpreted as follows: whatever the population was a long time ago (here \(x\)), its structure today is given by \(u(\xi )\). For equation (30), the interpretation is: whatever we assume the reproductive values to be far in the future (here \(y\)), the reproductive values at time \(t=0\) are given by \(v(\xi )\).

In applications, the environment is represented by a stationary and ergodic process \((E_t)\). Here \(\xi \) itself represents a realization of this process, i.e. a possible trajectory of the environment. Therefore, there exist two stationary and ergodic processes \((U_t)\) and \((V_t)\) such that \(u(\xi )\) and \(v(\xi )\) are, respectively, realizations of them. Then equations (29) and (30) can be interpreted as follows: for any initial population, in the long term, the stage structure is given by a version of the process \((U_t)\) and the reproductive values are given by a version of \((V_t)\).
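The interpretation of (29) can be illustrated with a small simulation. This is a sketch under stated assumptions: the two matrices in `MATS`, the i.i.d. choice of environments, the seed, and the use of the 1-norm are ours and serve only as an example. Two very different ancient populations are pushed through the same past environmental sequence and end up with the same structure today.

```python
import random

def vec_mat(x, M):
    """Row-vector product x M."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def normalise(x):
    total = sum(abs(c) for c in x)
    return [c / total for c in x]

def structure_today(x, past):
    """Normalised x T(e_{-t}) ... T(e_{-1}) for a fixed past environment sequence."""
    for M in past:
        x = normalise(vec_mat(x, M))   # renormalising each step avoids overflow
    return x

MATS = [
    [[0.5, 2.0], [0.3, 0.4]],   # hypothetical "good year"
    [[0.2, 1.0], [0.6, 0.1]],   # hypothetical "bad year"
]

rng = random.Random(0)
past = [MATS[rng.randrange(2)] for _ in range(200)]   # e_{-200}, ..., e_{-1}

u_from_x = structure_today([1.0, 0.01], past)
u_from_y = structure_today([0.01, 1.0], past)
gap = max(abs(a - b) for a, b in zip(u_from_x, u_from_y))
print(u_from_x, u_from_y, gap)
```

Changing the seed changes the common limit, which reflects the fact that \(u\) is a function of the environmental history rather than a fixed vector.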

Since assumption H2 does not directly imply assumption A for the map \(A_i(\cdot ,\cdot )\), we need to extend Ruelle’s proposition to the case where

  1. A1’:

    for all \(\xi \in \Xi \), \(T(\xi ){{\mathrm{int}}}\mathbb R^d_+ \subset {{\mathrm{int}}}\mathbb R^d_+\), and

  2. A2’:

    there exists \(s\ge 1\) such that, for all \(\xi \in \Xi \), \(T(\xi )\cdots T(\Psi ^{s-1}\xi )(\mathbb R^d_+) \subset \{0\} \cup {{\mathrm{int}}}\mathbb R^d_+\).

Proposition 8.15

The conclusions of Proposition 8.13 still hold under assumptions A1’–A2’.

Proof

Define the continuous map \( T' : \Xi \rightarrow \mathbf {M}_d(\mathbb R)\) by

$$\begin{aligned} T'(\xi )= T(\xi )\cdots T(\Psi ^{s-1}(\xi )). \end{aligned}$$

By assumption A2’, \(T'(\xi )\mathbb R^{d}_+ \subset \{0\} \cup {{\mathrm{int}}}\mathbb R^{d}_+\). Therefore, Proposition 8.13 applies to the map \(T'\) and to the homeomorphism \(\Psi ^s\), which gives us maps \(u, v : \Xi \rightarrow \mathbb R^d_+\) with \(\Vert u(\xi )\Vert =\Vert v(\xi )\Vert =1\), their respective vector bundles \(E(\cdot ),F(\cdot )\), and some constants \(C,\alpha \) verifying properties (i), (ii), and (iii).

The vector bundles \(E(\cdot ),F(\cdot )\) are our candidate bundles for \(T\). We need only to check properties (ii) and (iii) for the map \(T\) as property (i) is immediate.

We claim that

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{xT^t(\Psi ^{-t}\xi )}{\Vert xT^t(\Psi ^{-t}\xi ) \Vert }=u(\xi ), \end{aligned}$$
(31)

uniformly on all compact subsets of \(\mathbb R^{d}_+\setminus \{0\}\). Equation (31) is motivated by assumption A2’, which implies that the positive cone is contracted after every interval of time of length \(s\). For an interpretation of (31), see Remark 8.14. Before we prove (31), we show that property (ii), i.e. that \(E(\cdot )\) is \(T,\Psi \)-invariant, is a consequence of (31). Let \(y \in {{\mathrm{int}}}\mathbb R^{d}_+\setminus \{0\}\), and \(\xi \in \Xi \). Continuity of \(T\) and equality (31) applied to \(y\) imply

$$\begin{aligned} u(\xi )T(\xi )&= \lim _{t \rightarrow \infty }\frac{yT^t(\Psi ^{-t}\xi )}{\Vert yT^t(\Psi ^{-t}\xi ) \Vert } T(\xi )\\&= \lim _{t \rightarrow \infty }\frac{yT(\Psi ^{-t}\xi )T^t(\Psi ^{-t}(\Psi \xi ))}{\Vert yT^t(\Psi ^{-t}\xi ) \Vert }\\&= u(\Psi \xi ) \lim _{t\rightarrow \infty }\frac{\Vert yT(\Psi ^{-t}\xi )T^t(\Psi ^{-t}(\Psi \xi )) \Vert }{\Vert yT^t(\Psi ^{-t}\xi ) \Vert }, \end{aligned}$$

where the final line follows from (31) with \(\xi \) replaced by \(\Psi \xi \) and \(x =yT(\Psi ^{-t}\xi )/ \Vert y T(\Psi ^{-t}\xi ) \Vert \), which belongs to the compact set \(\{z \in \mathbb R^d_+ : \Vert z \Vert =1\}\) for all \(t\ge 0\). This proves property (ii) for \(E\). The same argument for the transpose \(T'^*\) implies property (ii) for \(F\).

Now we prove (31). Let \(x \in \mathbb R^{d}_+{\setminus }\{0\}\) with \(\Vert x \Vert =1\). For every \(t \ge 0\), define \(s_t:= t- [\frac{t}{s}]s\) where \([q]\) is the integer part of \(q\). We have

$$\begin{aligned} xT^t(\Psi ^{-t}\xi )=xT^{s_t}(\Psi ^{-t}\xi )T'^{[\frac{t}{s}]}(\Psi ^{-t+s_t}\xi ). \end{aligned}$$

Since \(s_t\le s\) for all \(t\ge 0\), continuity of \(T\) and assumption A1’ imply that there is a compact set \(H \subset \mathbb R^d_+{\setminus }\{0\}\) independent of \(x\) such that \(xT^{s_t}(\Psi ^{-t}\xi )\in H\) for all \(t>0\). Then, (31) is a consequence of inclusion (3.2) in the proof of Proposition 3.2 in Ruelle (1979a) applied to the map \(T'\).

It remains to check property (iii): show that there exist \(\alpha ', C'>0\) such that

$$\begin{aligned} \Vert b T^t(\xi ) \Vert \le C'\alpha '^t \Vert u(\xi )T^t(\xi ) \Vert \quad \text { for all } t \ge s, \xi \in \Xi , b \in F(\xi )^{\perp }. \end{aligned}$$

We have

$$\begin{aligned} b T^t(\xi )=b T^{s_t}(\xi )T'^{[\frac{t}{s}]}(\Psi ^{s_t}\xi ). \end{aligned}$$

Since \(F(\cdot )\) is \(T^*\)-invariant, \(b T^{s_t}(\xi ) \in F(\Psi ^{s_t}\xi )^{\perp }\) and property (iii) for \(T'\) implies

$$\begin{aligned} \frac{1}{\Vert b T^{s_t}(\xi ) \Vert }\Vert b T^{s_t}(\xi )T'^{[\frac{t}{s}]}(\Psi ^{s_t}\xi ) \Vert \le \frac{C(\alpha ^{\frac{1}{s}})^t}{\Vert u(\xi ) T^{s_t}(\xi )\Vert } \Vert u(\xi ) T^{s_t}(\xi )T'^{[\frac{t}{s}]}(\Psi ^{s_t}\xi ) \Vert . \end{aligned}$$

The continuity of \(T\) and \(u(\cdot )\), and assumption A1’ imply that there exists a constant \(R\ge 0\) such that

$$\begin{aligned} \frac{\max \{\Vert w T^{k}(\xi )\Vert : \Vert w\Vert =1\}}{\min \{\Vert u(\xi ) T^{k}(\xi )\Vert : \xi \in \Xi \}} \le R, \end{aligned}$$

for all \(k\le s\) and all \(\xi \in \Xi \). Then property (iii) is verified with \(C'=CR\) and \(\alpha '=\alpha ^{\frac{1}{s}}\). \(\square \)
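To see why the weaker assumptions A1’–A2’ are useful, consider a stage-structured matrix with a structural zero: a single step does not map the whole nonnegative cone into its interior, but a product of two steps does. The sketch below is only an illustration in a constant environment. The matrix `T` and the helper names are hypothetical, and entrywise positivity is used as a sufficient check for the cone condition.

```python
# Illustration only: the matrix T and the helper names are assumptions.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def all_positive(M):
    """Sufficient check for the cone condition: every entry strictly positive."""
    return all(entry > 0 for row in M for entry in row)

def first_positive_length(mats):
    """Smallest s such that mats[0] ... mats[s-1] is entrywise positive, else None."""
    product = None
    for s, M in enumerate(mats, start=1):
        product = M if product is None else mat_mul(product, M)
        if all_positive(product):
            return s
    return None

# Stage-structured matrix with a structural zero (hypothetical values)
T = [[0.0, 2.0],
     [0.5, 0.3]]

one_step = all_positive(T)              # assumption A fails for a single step
s = first_positive_length([T, T, T])    # but the product of s steps is positive
print(one_step, s)
```

Note that A2’ requires a single \(s\) that works for every \(\xi \in \Xi \); the sketch only checks one constant sequence.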

Assumptions H2-H3 imply that each continuous map \(A_i : \Gamma \rightarrow \mathbf {M}_{n_i}(\mathbb R)\) satisfies assumptions A1’–A2’. Hence Proposition 8.15 applies to each continuous map \(A_i\), and to the homeomorphism \(\Theta \) on the compact space \(\Gamma \). Then, for each of those maps, there exist row vector maps \(u_i(\cdot )\), \(v_i(\cdot )\), their respective vector bundles \(E_i(\cdot )\), \(F_i(\cdot )\), and constants \(C_i, \alpha _i >0\) satisfying properties (i), (ii), and (iii) of Proposition 8.15.

For each \(i\in \{1,\dots ,m\}\), define the continuous map \(\varvec{\zeta }_i : \Gamma \rightarrow \mathbb R\) by

$$\begin{aligned} \varvec{\zeta }_i(\gamma ) := \ln \Vert u_i(\gamma )A_i(\gamma )\Vert . \end{aligned}$$

In the rest of this subsection, we deduce from Proposition 8.15 some crucial properties of the invasion rates.

Proposition 8.16

For all \(\gamma \in \Gamma \) and every population \(i\), \(r_i(\gamma )\) satisfies the following properties:

  1. (i)
    $$\begin{aligned} r_i(\gamma )=\limsup _{t \rightarrow \infty }\frac{1}{t} \ln \Vert vA^t_i(\gamma )\Vert , \end{aligned}$$

    for all \(v \in \mathbb R^{n_i}_+{\setminus } \{0\}\) and

  2. (ii)
    $$\begin{aligned} r_i(\gamma )=\limsup _{t \rightarrow \infty }\frac{1}{t} \sum _{s=0}^{t-1}\varvec{\zeta }_i(\Theta ^s(\gamma )). \end{aligned}$$

The proof of this proposition follows the ideas of the proof of Proposition 1 in Hofbauer and Schreiber (2010).

Proof

Let \( \gamma \in \Gamma \) be fixed. To prove the first part, we start by showing that

$$\begin{aligned} r_i(\gamma ) = \limsup _{t \rightarrow \infty }\frac{1}{t}\ln \Vert u_i(\gamma ) A^t_i(\gamma )\Vert . \end{aligned}$$
(32)

Let \(v\in \mathbb R^{n_i}\), \(v \ne 0\). Since \(\mathbb R^{n_i} = E_i(\gamma )\bigoplus F_i^{\perp }(\gamma )\), there exist a constant \(a \in \mathbb R\) and a vector \(w \in F_i^{\perp }(\gamma )\) such that \(v = a u_i(\gamma ) + w\). Then, by Proposition 8.15, we have

$$\begin{aligned} \Vert vA^t_i(\gamma ) \Vert&\le a \Vert u_i(\gamma ) A^t_i(\gamma )\Vert + \Vert wA^t_i(\gamma ) \Vert \\&\le \Vert u_i(\gamma ) A^t_i(\gamma )\Vert \left( a+ C_i \alpha _i^t \Vert w\Vert \right) . \end{aligned}$$

Hence,

$$\begin{aligned} \limsup _{t \rightarrow \infty }\frac{1}{t}\ln \Vert vA^t_i(\gamma ) \Vert \le \limsup _{t \rightarrow \infty }\frac{1}{t}\ln \Vert u_i(\gamma ) A^t_i(\gamma )\Vert \end{aligned}$$

for all \(v \in \mathbb R^{n_i} {\setminus } \{0\}\). Since \(\Vert A^t_i(\gamma )\Vert = \sup _{\Vert v\Vert =1}\Vert vA^t_i(\gamma ) \Vert \), the last inequality implies that

$$\begin{aligned} r_i(\gamma ) \le \limsup _{t \rightarrow \infty }\frac{1}{t}\ln \Vert u_i(\gamma ) A^t_i(\gamma )\Vert \le r_i(\gamma ), \end{aligned}$$

which proves the equality (32).

Now, we consider a positive vector \(v \in \mathbb R^{n_i}_+{\setminus } \{0\}\). We show that the equality (32) is also satisfied for \(v\). We write \(v=au_i(\gamma )+w\) with \(a>0\) and \(w\in F_i^{\perp }(\gamma )\). Proposition 8.15 implies

$$\begin{aligned} \Vert vA^t_i(\gamma )\Vert&\ge a \Vert u_i(\gamma ) A^t_i(\gamma ) \Vert - \Vert wA^t_i(\gamma ) \Vert \\&\ge \Vert u_i(\gamma ) A^t_i(\gamma )\Vert \left( a- C_i \alpha _i^ t \Vert w\Vert \right) . \end{aligned}$$

Since \(a>0\),

$$\begin{aligned} r_i(\gamma ) \ge \limsup _{t \rightarrow \infty }\frac{1}{t}\ln \Vert vA^t_i(\gamma ) \Vert \ge \limsup _{t \rightarrow \infty }\frac{1}{t}\ln \Vert u_i(\gamma ) A^t_i(\gamma )\Vert = r_i(\gamma ), \end{aligned}$$

which completes the proof of assertion (i).

The second assertion results directly from the first assertion and the following equalities:

$$\begin{aligned} \ln \Vert u_i(\gamma ) A_i^{t+1}(\gamma )\Vert&= \ln \Vert u_i(\gamma )A^t_i(\gamma )A_i(\Theta ^t(\gamma )) \Vert \\&= \ln \left( \left\| u_i(\Theta ^t(\gamma ))A_i(\Theta ^t(\gamma )) \right\| \left\| u_i(\gamma )A^t_i(\gamma ) \right\| \right) \\&= \varvec{\zeta }_i(\Theta ^t(\gamma )) + \ln \left\| u_i(\gamma )A^t_i(\gamma )\right\| . \end{aligned}$$

The second step is a consequence of the invariance of the line bundle \(E_i\). \(\square \)
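Both assertions of Proposition 8.16 suggest a numerically stable way to estimate a long-term growth rate: accumulate one-step log growth factors along a normalised vector instead of forming the full matrix product. The following sketch makes several assumptions that are not in the text. The matrices `MATS`, the i.i.d. environment, the seeds and the 1-norm are hypothetical, and the invariant direction \(u_i(\Theta ^s(\gamma ))\) is replaced by the running normalised vector, which approaches it by (29).

```python
import math
import random

def vec_mat(x, M):
    """Row-vector product x M."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def log_growth_increments(v, mats):
    """Return [ln ||u_s M_s||], where u_s is the running 1-normalised row vector."""
    total = sum(v)
    u = [c / total for c in v]
    increments = []
    for M in mats:
        y = vec_mat(u, M)
        norm = sum(y)                   # 1-norm of a nonnegative vector
        increments.append(math.log(norm))
        u = [c / norm for c in y]
    return increments

MATS = [
    [[0.5, 2.0], [0.3, 0.4]],   # hypothetical environments
    [[0.2, 1.0], [0.6, 0.1]],
]
rng = random.Random(1)
mats = [MATS[rng.randrange(2)] for _ in range(2000)]

inc_v = log_growth_increments([1.0, 1.0], mats)
inc_w = log_growth_increments([0.2, 3.0], mats)
rate_v = sum(inc_v) / len(mats)         # time average, as in assertion (ii)
rate_w = sum(inc_w) / len(mats)         # other initial vector, as in assertion (i)

# Telescoping check on a short horizon, where the raw product is safe to form
direct = [1.0, 1.0]
for M in mats[:50]:
    direct = vec_mat(direct, M)
telescoped = math.log(2.0) + sum(inc_v[:50])
print(rate_v, rate_w, math.log(sum(direct)), telescoped)
```

The two estimates `rate_v` and `rate_w` differ only by a term of order \(1/t\), in line with assertion (i), and the telescoped sum reproduces the logarithm of the directly computed norm.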

Recall that \(\Gamma _+= \pi _0^{-1}(\Omega \times \overline{V})\) and \(\Gamma _0= \pi _0^{-1}(\Omega \times S_0)\).

Corollary 8.17

For all \(\gamma \in \Gamma _+ {\setminus } \Gamma _0\), and every \(i\in \{1,\dots ,m\}\),

$$\begin{aligned} r_i(\gamma ) \le 0. \end{aligned}$$

Proof

Fix \(i \in \{1,\dots ,m\}\), and \(\gamma \in \Gamma _+\backslash \Gamma _0\) with \((\omega ,x) :=\pi _0(\gamma )\). By definition of \(\Gamma _+\backslash \Gamma _0\), \(x^i\in \mathbb R^{n_i}_+\) and \(x^i \ne 0\). We have

$$\begin{aligned} x^iA^t_i(\gamma )&= x^iA_i(\gamma )\cdots A_i(\Theta ^{t-1}\gamma )\\&= x^iA_i(\omega ,x)\cdots A_i(\Phi ^{t-1}(\omega ,x))\\&= p_2(\Phi ^t(\omega ,x)), \end{aligned}$$

where the second equality is a consequence of (28), and the third one follows from the definition of the cocycle \(\Phi \). Assumption H4’ implies that there exists \(T>0\) such that \(p_2(\Phi ^t(\omega ,x))\) belongs to the compact set \(\overline{V}\) for all \(t\ge T\), which implies that there exists \(R>0\) such that \(\Vert x^iA^t_i(\gamma ) \Vert \le R\) for all \(t\ge T\). Assertion (i) of Proposition 8.16 applied to \(v= x^i\) concludes the proof. \(\square \)

Now we give some properties of the invasion rate with respect to a \(\Theta \)-invariant probability measure.

Proposition 8.18

The invasion rate of each population \(i\) with respect to a \(\Theta \)-invariant measure \(\tilde{\mu }\) satisfies the following property:

$$\begin{aligned} r_i(\tilde{\mu }) = \int _{\Gamma } \varvec{\zeta }_i(\gamma ) d\tilde{\mu }. \end{aligned}$$

Proof

This result is a direct consequence of property (ii) of Proposition 8.16 and Birkhoff’s Ergodic Theorem applied to the continuous maps \(\Theta \) and \(\varvec{\zeta }_i\). \(\square \)

Proposition 8.19

Let \(\tilde{\mu }\) be a \(\Theta \)-invariant measure. If \(\tilde{\mu }\) is supported by \(\Gamma _+ \backslash \Gamma _0\), then \(r_i(\tilde{\mu })=0\) for all \(i \in \{1, \dots ,m\}\).

Proof

Let \(\tilde{\mu }\) be such a probability measure. Fix \(i \in \{1, \dots ,m\}\), and define the set \(\Gamma ^{i,{\eta }}:= \{ \gamma \in \Gamma _+ : \Vert p_2(\pi _0(\gamma ))^i\Vert > \eta \}\). By assumption on the measure \(\tilde{\mu }\), there exists a real number \(\eta ^{*}>0\) such that \(\tilde{\mu }(\Gamma ^{i,{\eta }}) >0\) for all \(\eta <\eta ^*\).

The Poincaré recurrence theorem applies to the map \(\Theta \), and implies that for each \(\eta <\eta ^*\),

$$\begin{aligned} \tilde{\mu }(\{\gamma \in \Gamma ^{i,{\eta }} \vert \ \Theta ^t(\gamma ) \in \Gamma ^{i,{\eta }} \text { infinitely often }\})=1. \end{aligned}$$
(33)

Recall that the conjugacy (26) implies that for every \(\gamma \in \Gamma _+\) with \(\pi _0(\gamma )=(\omega ,x) \in \Omega \times \overline{V}\backslash S_0\), we have

$$\begin{aligned} p_2(\pi _0(\Theta ^t(\gamma )))^i&= p_2(\Phi ^t(\pi _0(\gamma )))^i\\&= x^iA_i^t(\gamma ). \end{aligned}$$

Then, equality (33) means that for \(\tilde{\mu }\)-almost all \(\gamma \in \Gamma ^{i,{\eta }}\) with \(0<\eta <\eta ^*\), \( \Vert x^iA_i^t(\gamma ) \Vert > \eta \) infinitely often. Therefore, Proposition 8.16 (i), applied to \(v=x^i\), implies that \(r_i(\gamma )=\limsup _{t \rightarrow \infty }\frac{1}{t} \ln \Vert x^iA^t_i(\gamma )\Vert \ge 0\) for \(\tilde{\mu }\)-almost all \(\gamma \in \Gamma ^{i,{\eta }}\), with \(\eta <\eta ^*\). Hence \(r_i(\gamma )\ge 0\) for \(\tilde{\mu }\)-almost all \(\gamma \in \bigcup _{n \ge \frac{1}{\eta ^*}} \Gamma ^{i,{1/n}} = \Gamma _+ \backslash \Gamma _0\). Corollary 8.17 completes the proof. \(\square \)

8.4 Properties of the empirical occupation measures

Given a trajectory \(\gamma \in \Gamma _+\), the empirical occupation measure at time \(t\in \mathbb N\) of \(\{\Theta ^s(\gamma )\}_{s\ge 0}\) is

$$\begin{aligned} \tilde{\Lambda }_t(\gamma ) := \frac{1}{t} \sum _{s=0}^{t-1}\delta _{\Theta ^s(\gamma )}, \end{aligned}$$

and given a point \((\omega , x) \in \Omega \times \overline{V}\), the empirical occupation measure at time \(t\in \mathbb N\) of \(\{\Phi ^s(\omega ,x)\}_{s\ge 0}\) is

$$\begin{aligned} \Lambda _t(\omega ,x) := \frac{1}{t} \sum _{s=0}^{t-1}\delta _{\Phi ^s(\omega ,x)}. \end{aligned}$$

In this way, \(\Lambda _t(\omega ,x)(\Omega \times B)=\Pi _t(\omega ,x)(B)\) for every Borel subset \(B \subset \overline{V}\), and \(x \in \overline{V}\).
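As a concrete reading of these definitions, \(\Pi _t(\omega ,x)(B)\) is simply the fraction of the first \(t\) time steps that the population state spends in \(B\). The sketch below computes this fraction for the set \(S_\eta \) on a short, made-up trajectory. The data, the helper names and the choice of the 1-norm are assumptions for illustration only.

```python
def occupation_fraction(trajectory, in_set):
    """Pi_t(B): fraction of the first t states of the trajectory that lie in B."""
    t = len(trajectory)
    return sum(1 for x in trajectory if in_set(x)) / t

def near_extinction(eta):
    """Indicator of S_eta = {x : ||x^i|| <= eta for some species i} (1-norm)."""
    return lambda x: any(sum(abs(c) for c in species) <= eta for species in x)

# Each state lists, per species, its vector of stage densities (made-up values)
trajectory = [
    ([0.50, 0.20], [0.30]),
    ([0.04, 0.03], [0.40]),   # species 1 is rare here
    ([0.30, 0.10], [0.05]),   # species 2 is rare here
    ([0.60, 0.30], [0.70]),
    ([0.20, 0.20], [0.90]),
]

frac = occupation_fraction(trajectory, near_extinction(0.1))
print(frac)
```

The conclusion of Theorem 8.11 states that, for long trajectories, this fraction can be made smaller than any \(\varepsilon >0\) by choosing \(\eta \) small enough.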

The dynamics \(\Theta \) and \(\Phi \) being semi-conjugated by \(\pi _0\), their respective empirical occupation measures are semi-conjugated by \(\pi _0^*\) as follows.

Lemma 8.20

Let \(\gamma \in \Gamma _+\). Then for all \(t\ge 0\) we have

$$\begin{aligned} \pi _0^*(\tilde{\Lambda }_t(\gamma ))=\Lambda _t(\pi _0(\gamma )). \end{aligned}$$

Proof

Let \(B \subset \Omega \times \overline{V}\) be a Borel set, and \(\gamma \in \Gamma _+\). Then we have

$$\begin{aligned} \pi _0^*(\tilde{\Lambda }_t(\gamma ))(B)&= \tilde{\Lambda }_t(\gamma )(\pi _0^{-1}(B))\\&= \frac{1}{t} \sum _{s=0}^{t-1}\delta _{\Theta ^s(\gamma )}(\pi _0^{-1}(B))\\&= \frac{1}{t} \sum _{s=0}^{t-1}\delta _{\Phi ^s(\pi _0(\gamma ))}(B)\\&= \Lambda _t(\pi _0(\gamma ))(B). \end{aligned}$$

The third equality is a consequence of the semi-conjugacy (26). \(\square \)

Proposition 8.21

There exists \(\tilde{\Omega }\) with \({\mathbb {Q}}(\tilde{\Omega })=1\) such that for all \(\gamma \in \pi _0^{-1}(\tilde{\Omega }\times \overline{V})\), the set of all weak\(^*\) limit points of the family of probability measures \(\{\tilde{\Lambda }_t(\gamma )\}_{t\in \mathbb N}\) is a non-empty subset of \({{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)\).

Proof

Since \({\mathbb {Q}}\) is ergodic (assumption H4’), Birkhoff’s Ergodic Theorem implies that there exists a subset \(\tilde{\Omega } \subset \Omega \) such that \({\mathbb {Q}}(\tilde{\Omega })=1\), and for all \(\omega \in \tilde{\Omega }\),

$$\begin{aligned} \lim _{t \rightarrow \infty }\frac{1}{t} \sum _{s=0}^{t-1}\delta _{\theta ^s(\omega )} = {\mathbb {Q}}\end{aligned}$$
(34)

(in the weak\(^*\) topology). Let \((\omega ,x) \in \tilde{\Omega }\times \overline{V}\) and \(\gamma \in \pi _0^{-1}(\omega ,x) \subset \Gamma _+\). For all \(t \in \mathbb N\), we have

$$\begin{aligned} p_1^*\circ \pi _0^*(\tilde{\Lambda }_t(\gamma )) = \frac{1}{t} \sum _{s=0}^{t-1}\delta _{\theta ^s(\omega )}. \end{aligned}$$
(35)

Since \(\Gamma _+\) is positively \(\Theta \)-invariant and compact, the set of all weak\(^*\) limit points of the family of probability measures \(\{\tilde{\Lambda }_t(\gamma )\}_{t\in \mathbb N}\) is a non-empty subset of \(\mathcal {P}(\Gamma _+)\). Let \(\tilde{\mu }\) be such a limit point. Since the maps \(p_1\) and \(\pi _0\) are continuous, equalities (34) and (35) imply that \(p_1^*\circ \pi _0^*(\tilde{\mu })={\mathbb {Q}}\). Moreover, Theorem 6.9 in Walters (1982) implies that \(\tilde{\mu }\) is \(\Theta \)-invariant. Therefore, \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)\), which concludes the proof. \(\square \)

Recall that \(S_{{\eta }}=\{ x \in S : \Vert x^i \Vert \le \eta \text{ for } \text{ some } i\}\), and define the subset \(\Gamma _{\eta }:= \pi _0^{-1}(\Omega \times S_{\eta })\).

Proposition 8.22

If condition (a) of Theorem 8.11 is satisfied, then for all \(\varepsilon >0\) there exists \(\eta ^*>0\) such that

$$\begin{aligned} \tilde{\mu }( \Gamma _{\eta })<\varepsilon , \end{aligned}$$

for all \(\eta <\eta ^*\) and all \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+ \backslash \Gamma _0)\).

Proof

If false, there exist \(\varepsilon >0\) and a sequence of measures \(\{\tilde{\mu }_n\}_{n\in \mathbb N} \subset {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+ \backslash \Gamma _0)\) such that \(\tilde{\mu }_n( \Gamma _{1/n}) > \varepsilon \) for all \(n \ge 1\). By Proposition 8.6, let \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)\) be a weak\(^*\) limit point of the sequence \(\{\tilde{\mu }_n\}_{n\in \mathbb N}\). Proposition 8.19 implies that \(r_*(\tilde{\mu }_n)=0\) for all \(n \ge 0\). Proposition 8.18 and weak\(^*\) convergence imply that \(0 = \lim _{n\rightarrow \infty }r_i(\tilde{\mu }_n) = r_i(\tilde{\mu })\) for all \(i\). Hence, \(r_*(\tilde{\mu })=0\). The Portmanteau theorem (see e.g. Theorem 2.1. in Billingsley (1999)) applied to the closed set \(\Gamma _{1/n}\) implies that for all \(n\ge 1\),

$$\begin{aligned} \tilde{\mu }(\Gamma _{1/n})&\ge \liminf _{m\rightarrow \infty }\tilde{\mu }_m(\Gamma _{1/n})\\&\ge \liminf _{m\rightarrow \infty }\tilde{\mu }_m(\Gamma _{1/m})\\&\ge \varepsilon . \end{aligned}$$

Therefore \(\tilde{\mu }(\Gamma _0) = \tilde{\mu }(\cap _n \Gamma _{1/n})\ge \varepsilon \). Remark 8.9 implies there exists \(\alpha > 0\) such that \(\tilde{\mu } = \alpha \tilde{\nu }_0 + (1-\alpha )\tilde{\nu }_1\) where \(\tilde{\nu }_j\) are \(\Theta \)-invariant probability measures satisfying \(\tilde{\nu }_0(\Gamma _0) =1\) and \(\tilde{\nu }_1(\Gamma _+ \backslash \Gamma _0)=1\). By Proposition 8.19, \(r_i(\tilde{\nu }_1) = 0\) for all \(i \in \{1, \dots ,m \}\). Condition (a) implies that \(r_*(\tilde{\nu }_0) >0\), in which case \(0=r_*(\tilde{\mu }) = \alpha r_*(\tilde{\nu }_0)>0\), which is a contradiction. \(\square \)

8.5 Proof of Theorem 8.11

First, we show that condition (a) of Theorem 8.11 implies that for all \(\varepsilon >0\), there exists \(\eta >0\) such that

$$\begin{aligned} \limsup _{t \rightarrow \infty } \Pi _t(\omega ,x)( S_{\eta }) \le \varepsilon \ \ \text { for }{\mathbb {Q}}\text {-almost all } \omega , \end{aligned}$$

whenever \(x \in \mathbb R^n_+ \backslash S_0\). Second, we prove the equivalence of conditions (a), (b) and (c).

Let \(\tilde{\Omega } \subset \Omega \) be defined as in Proposition 8.21. Choose \((\omega ',x') \in \tilde{\Omega } \times \mathbb R^n_+ \backslash S_0\). By definition of the set \(\overline{V}\), there exists a time \(T\ge 0\) such that \(\Phi ^t(\omega ',x') \in \Omega \times \overline{V}\), for all \(t\ge T\). Choose \(\gamma \in \pi _0^{-1}(\Phi ^T(\omega ',x')) \subset \Gamma _+ \backslash \Gamma _0\). Since a measure \(\mu \) is a weak\(^*\) limit point of the family \(\{\Lambda _t(\Phi ^T(\omega ',x'))\}_{t\ge 0}\) if and only if it is a weak\(^*\) limit point of the family \(\{\Lambda _t(\omega ',x')\}_{t\ge 0}\), we lose no generality by considering \(\{\Lambda _t(\Phi ^T(\omega ',x'))\}_{t\ge 0}\). Since \(\Omega \times \overline{V}\) is compact, the set of all weak\(^*\) limit points of the family of probability measures \(\{\Lambda _t(\Phi ^T(\omega ',x'))\}_{t\in \mathbb N}\) is a non-empty subset of \(\mathcal {P}(\Omega \times \overline{V})\). Let \(\mu = \lim _{k\rightarrow \infty } \Lambda _{t_k}(\Phi ^T(\omega ',x'))\) be such a weak\(^*\) limit point. Since \(\Gamma _+\) is positively \(\Theta \)-invariant and compact, passing to a subsequence if necessary, there exists \(\tilde{\mu } = \lim _{k \rightarrow \infty }\tilde{\Lambda }_{t_k}(\gamma ) \in \mathcal {P}(\Gamma _+)\). By Proposition 8.21, \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _+)\). Furthermore, by Lemma 8.20 and continuity of \(\pi _0\), \(\pi _{0}^*(\tilde{\mu })=\mu \). Hence, Proposition 8.18, the continuity of the map \(\varvec{\zeta }_i\), and property (ii) of Proposition 8.16 imply the following relations for all \(i\):

$$\begin{aligned} r_i(\tilde{\mu })&= \int _{\Gamma } \varvec{\zeta }_i(\eta ) d\tilde{\mu }(\eta ) \\&= \lim _{k \rightarrow \infty }\frac{1}{t_k}\sum _{s=0}^{t_k-1}\varvec{\zeta }_i(\Theta ^s(\gamma ))\\&\le r_i(\gamma ). \end{aligned}$$

Hence, by Corollary 8.17,

$$\begin{aligned} r_i(\tilde{\mu }) \le 0, \ \text { for all } i. \end{aligned}$$
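The identification of \(r_i(\tilde{\mu })\) with a limit of time averages used above can be illustrated in a deliberately simple setting. The following Python sketch is a toy under stated assumptions, not the construction of the paper: the environment is assumed to cycle deterministically through three states, the log growth \(\zeta \) is assumed to depend only on the environmental state, and the growth factors in `GROWTH_FACTORS` are hypothetical. In that case the invariant measure is uniform on the cycle, and the time average of \(\zeta \) along the orbit approaches its integral against that measure.

```python
import math

# Hypothetical per-capita growth factors in three environmental states,
# visited cyclically (0 -> 1 -> 2 -> 0 -> ...). These are assumptions.
GROWTH_FACTORS = [0.5, 1.2, 2.0]


def zeta(state):
    """Log per-capita growth in a given environmental state."""
    return math.log(GROWTH_FACTORS[state])


def time_average(t, start=0):
    """(1/t) * sum over s < t of zeta(state at time s) along the orbit."""
    return sum(zeta((start + s) % 3) for s in range(t)) / t


def space_average():
    """Integral of zeta against the invariant measure (uniform on the cycle)."""
    return sum(zeta(state) for state in range(3)) / 3


for t in (10, 100, 10_000):
    print(t, time_average(t), space_average())
```

In this toy the limit exists along the full sequence of times. In the proof only a subsequence \(t_k\) is available, which is why the final relation is an inequality with respect to \(r_i(\gamma )\).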

Remark 8.9 implies there exists \(\alpha \in [0,1]\) such that \(\tilde{\mu } = \alpha \tilde{\nu }_0 + (1-\alpha )\tilde{\nu }_1\) where the \(\tilde{\nu }_j\) are invariant probability measures satisfying \(\tilde{\nu }_0(\Gamma _0) =1\) and \(\tilde{\nu }_1(\Gamma _+ \backslash \Gamma _0)=1\). By Proposition 8.19, \(r_i(\tilde{\nu }_1) = 0\) for all \(i \in \{1, \dots ,k \}\), so that \(r_i(\tilde{\mu }) = \alpha r_i(\tilde{\nu }_0)\) for all \(i\). Condition (a) implies \(r_*(\tilde{\nu }_0) >0\), which is incompatible with \(r_i(\tilde{\mu }) \le 0\) for all \(i\) unless \(\alpha = 0\). Therefore \(\alpha \) must be zero, i.e. \(\tilde{\mu }(\Gamma _+ \backslash \Gamma _0)=1\). Fix \(\varepsilon >0\). By Proposition 8.22 there exists \(\eta ^*>0\) such that

$$\begin{aligned} \tilde{\mu }( \Gamma _{\eta })<\varepsilon , \ \ \forall \eta <\eta ^*, \end{aligned}$$

which implies

$$\begin{aligned} \mu (\Omega \times S_{\eta })<\varepsilon , \ \ \forall \eta <\eta ^*. \end{aligned}$$

Since \(\eta ^*\) does not depend on \(\mu \), we have

$$\begin{aligned} \limsup _{t \rightarrow \infty }\Lambda _t(\omega ',x')( \Omega \times S_{\eta }) < \varepsilon , \ \ \forall \eta <\eta ^*, \end{aligned}$$

for all \(x' \in \mathbb R^n_+\backslash S_0\) and \(\omega ' \in \tilde{\Omega }\), which concludes the first part of the proof.
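As a numerical illustration of this first part, the following Python sketch tracks the fraction of time a sample path spends within \(\eta \) of extinction. It is a toy under stated assumptions rather than the model of the paper: a single unstructured population following a stochastic Ricker map \(x_{t+1} = x_t \exp (\rho _t - x_t)\), with \(\rho _t\) drawn independently and uniformly from an interval. The function name `fraction_near_extinction`, the map and all parameter values are hypothetical choices, and the occupation measure of \(S_{\eta }\) is assumed to be the fraction of times \(s<t\) at which the population is at most \(\eta \).

```python
import math
import random


def fraction_near_extinction(rho_low, rho_high, x0, eta, steps, seed=0):
    """Fraction of times s < steps with x_s <= eta along one sample path.

    Toy model (an assumption, not the paper's model): a scalar Ricker map
    x_{t+1} = x_t * exp(rho_t - x_t) with rho_t i.i.d. uniform on
    [rho_low, rho_high].
    """
    rng = random.Random(seed)  # fixed seed for a reproducible sample path
    x = x0
    hits = 0
    for _ in range(steps):
        if x <= eta:
            hits += 1
        rho = rng.uniform(rho_low, rho_high)
        x = x * math.exp(rho - x)
    return hits / steps


# Growth rate when rare is E[rho] > 0: time near extinction is negligible.
persistent = fraction_near_extinction(0.2, 1.0, x0=0.01, eta=0.1, steps=10_000)
# Growth rate when rare is E[rho] < 0: the path ends up near extinction.
declining = fraction_near_extinction(-1.0, -0.2, x0=0.5, eta=0.1, steps=10_000)
print(persistent, declining)
```

With a positive growth rate when rare, the path leaves the neighbourhood of extinction after a short transient, so the occupation fraction becomes small as the number of steps grows. With a negative growth rate the fraction approaches one.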

Next, we show the equivalence of conditions (a) and (b). We need the following version of the minimax theorem (see, e.g., Simmons 1998):

Theorem 8.23

(Minimax theorem) Let \(A,B\) be Hausdorff topological vector spaces and let \(\mathcal {L} : A \times B \rightarrow \mathbb R\) be a continuous bilinear function. Finally, let \(E\) and \(F\) be nonempty, convex, compact subsets of \(A\) and \(B\), respectively. Then

$$\begin{aligned} \min _{a\in E}\max _{b\in F}\mathcal {L}(a, b) = \max _{b\in F}\min _{a\in E} \mathcal {L}(a,b). \end{aligned}$$

We have that

$$\begin{aligned} \min _{\tilde{\mu }} \max _i r_i (\tilde{\mu }) = \min _{\tilde{\mu } }\max _{p} \sum _i p_i r_i(\tilde{\mu }) \end{aligned}$$

where the minimum is taken over \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0)\) and the maximum over \(p\in \Delta := \{p \in \mathbb R^m_+ \ : \ \sum _i p_i=1\}\). Define \(A\) to be the dual space to the space of bounded continuous functions from \(\Gamma _0\) to \(\mathbb R\) and define \(B=\mathbb R^m\). Let \(E = {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0) \subset A\), which is nonempty, convex and compact by Proposition 8.6, and let \(F=\Delta \subset B\). Let \(\mathcal {L}: A \times B \rightarrow \mathbb R\) be the bilinear function defined by \(\mathcal {L}(\tilde{\mu }, p):=\sum _i p_i r_i(\tilde{\mu })\). Proposition 8.18 implies that \(\mathcal {L}\) is continuous. With these choices, the Minimax theorem implies that

$$\begin{aligned} \min _{\tilde{\mu }} \max _i r_i (\tilde{\mu }) = \max _{p\in \Delta } \min _{\tilde{\mu }} \sum _i p_i r_i(\tilde{\mu }) \end{aligned} \tag{36}$$

where the minimum is taken over \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0)\). Since \(\tilde{\mu } \mapsto \sum _i p_i r_i(\tilde{\mu })\) is linear, the ergodic decomposition theorem for random dynamical systems (see Lemma 6.19 in Crauel (2002)) implies that the minimum on the right-hand side of (36) is attained at an ergodic probability measure with support in \(\Gamma _0\). Thus, conditions (a) and (b) are equivalent.
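The exchange of minimum and maximum can be checked by hand in a small case. The following Python sketch assumes, purely for illustration, that the invariant measures on the extinction set are the convex combinations of two ergodic measures \(\nu _0,\nu _1\), and that there are two species with the hypothetical growth rates stored in `R`. Each species has a negative growth rate at one of the two ergodic measures, yet both sides of the minimax identity are positive.

```python
from fractions import Fraction

# Hypothetical growth rates R[j][i] = r_i(nu_j) of species i = 0, 1 at two
# ergodic measures nu_0, nu_1 on the extinction set (assumed values).
R = [[Fraction(2), Fraction(-1)],
     [Fraction(-1), Fraction(3)]]

# Grid on [0, 1] fine enough to contain the optimal weight 4/7 exactly.
GRID = [Fraction(k, 700) for k in range(701)]


def min_max():
    """min over invariant measures w*nu_0 + (1-w)*nu_1 of max_i r_i."""
    return min(
        max(w * R[0][i] + (1 - w) * R[1][i] for i in range(2))
        for w in GRID
    )


def max_min():
    """max over p = (q, 1-q) of min over ergodic nu_j of sum_i p_i r_i(nu_j)."""
    return max(
        min(q * R[j][0] + (1 - q) * R[j][1] for j in range(2))
        for q in GRID
    )


print(min_max(), max_min())  # both sides equal 5/7 for this matrix
```

In `max_min` the inner minimum runs over the two ergodic measures only, which suffices because the weighted growth rate is linear in the measure. For this matrix the common value \(5/7\) is attained at \(p=(4/7,3/7)\), so the weighted growth rate is positive at both ergodic measures.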

Finally, we show the equivalence of conditions (b) and (c). As a direct consequence of assertion (i) of Proposition 8.10, condition (c) implies (b). To prove the other direction, let \(\tilde{\Omega } \subset \Omega \) be defined as in the proof of Proposition 8.21. Choose \((\omega ',x') \in \tilde{\Omega } \times S_0\). By the same arguments as above, there exist \(T>0\), \(\gamma \in \pi _0^{-1}(\Phi ^T(\omega ',x')) \subset \Gamma _0\) and \(\tilde{\mu } \in {{\mathrm{Inv}}}_{\mathbb {Q}}(\Theta )(\Gamma _0)\) such that

$$\begin{aligned} r_i(\tilde{\mu })&= \int _{\Gamma } \varvec{\zeta }(\eta ) d\tilde{\mu }(\eta ) \\&= \lim _{k \rightarrow \infty }\frac{1}{t_k}\sum _{s=0}^{t_k-1}\varvec{\zeta }(\Theta ^s(\gamma ))\\&\le r_i(\gamma ). \end{aligned}$$

Assertion (i) of Proposition 8.10 implies that \(r_i(\gamma )=r_i(\Phi ^T(\omega ',x'))\). Since \(\Phi ^T(\omega ',x')\) lies on the same trajectory as \((\omega ',x')\), \(r_i(\tilde{\mu }) \le r_i(\Phi ^T(\omega ',x'))= r_i(\omega ',x')\). Writing \(\tilde{\mu }\) as a convex combination of ergodic probability measures, condition (b) implies \(\sum _i p_i r_i(\tilde{\mu })>0\). Since \(p_i \ge 0\) for all \(i\), it follows that \(\sum _i p_i r_i(\omega ',x') \ge \sum _i p_i r_i(\tilde{\mu })>0\). \(\square \)