Introduction

Almost all ecological and evolutionary processes are in one way or another structured by space, and consequently spatial models are ubiquitous in theoretical ecology (e.g., Durrett and Levin 1994; Dieckmann et al. 2000; Okubo and Levin 2001; Cantrell and Cosner 2003; Hanski and Gaggiotti 2004; Cantrell et al. 2010). Reflecting the many choices to be made when simplifying ecological reality into a mathematical model, theoreticians have developed a great variety of modelling frameworks, including models in which space and time are considered as discrete or continuous, and models with deterministic or stochastic dynamics. To mention a few examples (for a more thorough review, see, e.g., Berec 2002), mathematical frameworks for spatial ecology include partial differential equations (e.g., Cantrell and Cosner 2003), systems of ordinary differential equations (e.g., Hanski and Ovaskainen 2000), integro-difference equations (e.g., Kot et al. 1996), lattice models (e.g., Hiebeler 2000), point process models (e.g., Bolker and Pacala 1997), quantum field theory (e.g., Dodd and Ferguson 2009; O’Dwyer and Green 2010), and individual-based simulation models (e.g., Grimm and Railsback 2005). However, given the complex and hidden nature of ecological reality, choices of modelling framework are often based on familiarity or mathematical tractability, rather than the specific biological characteristics of the system. As models are approximations of nature, many (often somewhat arbitrary) choices are made in the process of modelling, and an important challenge is to understand which choices make qualitative differences in the conclusions of a modelling exercise.

Spatial and stochastic models are often straightforward to simulate but difficult to analyze mathematically. Thus many of the ecological insights gained from such models have been based on simulations, the results of which can be hard to generalize and communicate in spite of the effort that has gone into the development of standardized protocols (Grimm et al. 2006). Tools for linking individual-based models with analytical models include mean-field (Morozov and Poggiale 2012) and scale-transition (Chesson 2012) theories, as well as related mathematical tools such as aggregation of variables (Iwasa et al. 1987), moment closures (Levermore 1996; Keeling 2000; Murrell et al. 2004; Bolker 2004; Barraquand and Murrell 2013), and pair approximations (Matsuda et al. 1992; Keeling et al. 1997; Ellner 2001). However, most of the mathematical methods available for nonlinear stochastic and spatial models are based on heuristic rather than mathematically justified assumptions, so that, e.g., the choice of the moment closure is considered more of an art than a science (Bolker 2004). Thus, one reason for the dominance of simulation approaches in ecological modelling is the continued lack of robust and accessible mathematical tools that would enable the mathematical analysis of spatial and stochastic models.

A specific class of spatial and stochastic models called spatiotemporal point processes are common in both statistical (e.g., Thompson 1955; Penttinen et al. 1992; Shimatani 2002; Law et al. 2009) and theoretical (e.g., Bolker and Pacala 1997; Murrell and Law 2003; Law et al. 2003; Ovaskainen and Cornell 2006b; North et al. 2011a; Barraquand and Murrell 2013) ecology. In this framework, individual organisms are represented by points in space, so that demographic processes such as birth, death, and dispersal can be represented by the appearance, disappearance, and movement of points. This individual-based perspective gives the assumptions and parameters clear biological interpretations and allows demographic stochasticity to be incorporated in a natural way. The points can also be assigned marks (Penttinen et al. 1992; Illian et al. 2008; Ovaskainen and Cornell 2006a; Baddeley 2010), allowing for differentiation among, e.g., species, subpopulations or genotypes, or (with continuous-valued marks) for the modelling of size dynamics of individuals. Sophisticated statistical machinery has been developed to fit such models to data (Baddeley and Turner 2005; Illian et al. 2008, 2012; Baddeley 2010), and this area of statistics is still under active development. In theoretical ecology, spatial point process models have typically been analyzed with the help of a moment closure (Bolker and Pacala 1997; Keeling 2000; Filipe and Gibson 2001; Law et al. 2003; Murrell et al. 2004; Bolker 2004; Barraquand and Murrell 2013), i.e., by truncating the infinite hierarchy of moments by assuming a specific relationship between higher-order (typically third-order) moments and the lower-order moments. However, moment closures are uncontrolled approximations whose suitability to a particular modelling question is difficult to establish a priori, and the optimal closure may depend on the problem and the parameter regime (e.g., Murrell et al. 2004).

Within probability theory, a mathematically rigorous toolbox for the analysis of spatiotemporal point processes has recently started to emerge (Kondratiev et al. 2006b, 2008a, b, 2010; Kondratiev and Skorokhod 2006; Finkelshtein et al. 2012), but these tools have attracted little attention in the ecological literature, for two reasons. First, the mathematical literature is written in a highly technical notation, making it difficult for many theoretical ecologists even to assess the relevance of the results. For example, what most ecologists (or statisticians) would call “spatiotemporal point processes” are instead “Markov evolutions in the space of locally finite configurations” in probabilists’ terminology. Second, much of the mathematical literature focuses on issues such as proving existence and uniqueness of solutions, which may seem trivial from ecologists’ point of view, given that the processes are defined to mimic real systems (although nonexistence or nonuniqueness of solutions may usefully indicate an ambiguity or problem in the mapping from an intuitive idea about an ecological system into a formal mathematical description). Nevertheless, we believe that this formalism affords a more powerful and economical description of spatiotemporal point processes than those previously adopted in theoretical ecology. For instance, in earlier work in theoretical ecology (e.g., Bolker and Pacala 1997; Law et al. 2003; Ovaskainen and Cornell 2006b; North et al. 2011a), spatial moment equations were derived separately for each of the orders. In contrast, the formalism used by the probabilists (e.g., Kondratiev et al. 2008a; Finkelshtein et al. 2009) allows one to derive exact equations for spatial moments simultaneously of all orders.

In this paper, our aim is to translate mathematical literature for spatiotemporal point processes to provide a mathematically rigorous and practical framework for theoretical ecology. More specifically, we will (1) introduce the notation of Markov evolutions in locally finite configurations and (2) show how spatial moment equations of all orders can be systematically derived from the underlying individual-based assumptions. Further, as a new mathematical development, we will go beyond mean-field theory by (3) discussing how spatial moment equations can be perturbatively expanded around the mean-field model. While we have proposed such a perturbation expansion in our previous research (Ovaskainen and Cornell 2006a, b; North and Ovaskainen 2007; Cornell and Ovaskainen 2008; North et al. 2011a, b; Gurarie and Ovaskainen 2013), before the present work, we have not been able to give it a rigorous mathematical justification. In addition to bringing mathematical rigor, the application of the mathematically well-established framework of Markov evolutions allows us to derive our previously defined perturbation expansions in a transparent and systematic manner, which we hope will facilitate the application of the methods.

The framework and the mathematical methods to be discussed are very general in the sense that the modelled entities (typically representing individuals) may follow birth–death dynamics, move, and interact with other entities, and they may carry marks to represent, e.g., different species, genotypes, sexes, and age classes. To simplify the introduction of the mathematical framework, however, we will restrict the discussion in the present paper to a specific model, namely the spatial and stochastic logistic model (Bolker and Pacala 1997; Law et al. 2003; Ovaskainen and Cornell 2006b), which we consider as an illustrative example of a nonlinear stochastic and spatial model with localized interactions.

The modelling framework

Preliminaries

We model individuals (called particles in the mathematical literature) by discrete points in \(\mathbb{R}^{d}\). To avoid boundary conditions and finite-size effects, we consider an infinitely large domain \(\mathbb{R}^{d}\), but assume that the number of individuals within any finite region is finite. Thus, mathematically, we consider the space of locally finite configurations,

$$\Gamma:=\left\{\gamma\subset\mathbb{R}^{d} \bigm||\gamma\cap\Lambda|<\infty \text{ for any bounded } \Lambda\subset\mathbb{R}^{d} \right\}, $$

where, if A is a discrete set of points, then |A| stands for the number of points in A. To define a model, one might like to start from an initial configuration \(\gamma_{0}\) and then give rules on how the configuration evolves in time. However, it turns out to be very difficult to write down the corresponding equation for \(\gamma_{t}\) in a mathematically rigorous manner (for a nonrigorous attempt at such a derivation, see Ovaskainen and Cornell 2006b; Cornell and Ovaskainen 2008); therefore, we use ensembles of such starting configurations as a starting point. We use a probability measure \(\mu_{t}\) on Γ to describe the state of the system at time t. Informally, the measure \(\mu_{t}\) describes how likely the system is to be in a given configuration at time t, given that it starts from an initial state described by the measure \(\mu_{0}\) at time 0.

Due to the complicated nature of the measure \(\mu_{t}\), it is difficult to define the time evolution of the system by formulating an equation for \(\mu_{t}\). Instead, we build the evolution of the measure with the help of observables (functions on Γ which typically take real values but could also be vectors, matrices, etc.), which we denote by F. Thus, given a configuration γ, F(γ) is a numerical quantity that characterizes some property of the configuration γ. As an example, consider the indicator function \(\mathbb{1}_{\Omega}(x)\) of a subdomain \(\Omega\subset\mathbb{R}^{d}\), with value 1 if x ∈ Ω and with value 0 otherwise. In this case, the observable \(F(\gamma)=\sum_{x\in\gamma}\mathbb{1}_{\Omega}(x)\) simply counts how many points the configuration γ has within Ω.

We define the pairing between an observable and a measure by

$$\langle F,\mu\rangle:=\int_{\Gamma} F(\gamma)\,d\mu(\gamma). $$

For the current example, 〈F, μ〉 is the expected (mean) number of points within the region Ω if a configuration is sampled according to the measure μ. For example, if μ corresponds to the Poisson measure, often called complete spatial randomness in the ecological and statistical literatures (Haase 1995), with intensity (expected number of individuals per unit area) ρ, then 〈F, μ〉 = ρ |Ω|, where |Ω| is the volume (in the two-dimensional case, the area) of Ω.
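The identity 〈F, μ〉 = ρ |Ω| for the Poisson measure can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the formal framework: it assumes a two-dimensional setting, restricts the Poisson measure to a finite rectangular window containing Ω, and uses hypothetical function names (`sample_poisson_configuration`, `count_in_box`, `estimate_pairing`) chosen only for illustration.

```python
import random

def sample_poisson_count(mean, rng):
    """Draw a Poisson(mean) count by counting unit-rate arrivals before time `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def sample_poisson_configuration(rho, width, height, rng):
    """Sample a homogeneous Poisson configuration restricted to [0, width] x [0, height]."""
    n = sample_poisson_count(rho * width * height, rng)
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height)) for _ in range(n)]

def count_in_box(config, box):
    """Observable F(gamma): the sum over x in gamma of the indicator of the box Omega."""
    x0, x1, y0, y1 = box
    return sum(1 for (x, y) in config if x0 <= x <= x1 and y0 <= y <= y1)

def estimate_pairing(rho, box, replicates, rng, width=4.0, height=4.0):
    """Monte Carlo estimate of <F, mu>: the average of F over sampled configurations."""
    total = 0
    for _ in range(replicates):
        config = sample_poisson_configuration(rho, width, height, rng)
        total += count_in_box(config, box)
    return total / replicates
```

With intensity ρ = 5 and Ω = [1, 3] × [1, 2] (area 2), the estimate returned by `estimate_pairing` should fluctuate around ρ |Ω| = 10, with a sampling error that shrinks as the number of replicates grows.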

The observable F can be considered as a test function, so that the pairing 〈F, μ〉 gives information about the state of the system, i.e., the measure μ. By varying the function F, it is possible to get essentially any desired information about the state of the system μ. This is described in more detail below, where we employ the spatial moment functions as observables.

Defining a model

A specific model is defined by describing how individual-level events modify an arbitrary observable F. More precisely, the evolution of states is defined through the differential equation

$$ \frac{d}{dt} \langle F,\mu(t)\rangle = \langle LF,\mu(t) \rangle, $$
(1)

where L is a linear operator acting on observables, i.e., functions on Γ.

To illustrate, we consider the evolution of states associated with the spatial and stochastic logistic model (Bolker and Pacala 1997; Law et al. 2003; Ovaskainen and Cornell 2006b), abbreviated henceforth as SSLM. A Lagrangian (individual-based) description of this model is as follows: (1) Sedentary individuals produce propagules at a per capita fecundity rate f (by “rate” we mean probability per time unit, so that in a continuous-time model, the probability of a propagule being produced by a particular individual during a short time dt is f dt). (2) A newly produced propagule is distributed (instantaneously) according to a dispersal kernel, and it is assumed to establish (instantaneously) as a newborn individual, which matures (instantaneously) and starts to produce propagules. (3) Existing individuals may die for two reasons. Firstly, there is a constant background per capita mortality rate m, yielding an exponentially distributed lifetime with mean 1 / m. Secondly, mortality has a density-dependent component (self-thinning), so that competition among the individuals may also lead to death. The density-dependent component of the death rate of a focal individual is a sum of contributions from all the other individuals within the entire \(\mathbb{R}^{d}\), but the strength of the competitive effect decreases with distance.

The SSLM is mathematically defined through the linear operator L with

$$\begin{array}{@{}rcl@{}} (LF)(\gamma)&=&\underbrace{\sum\limits_{x\in \gamma}\left(\underbrace{m\vphantom{\sum\limits_{y\in \gamma\setminus x}}}_{\mathrm{dens-ind}} + \underbrace{\sum\limits_{y\in \gamma\setminus x} a^-(x-y)}_{\mathrm{dens-dep}}\right)[F(\gamma\setminus x)-F(\gamma)]}_{\text{mortality}}\\ &&+ \underbrace{\sum\limits_{y\in \gamma} \int_{\mathbb{R}^{d}} a^+(x-y)[F(\gamma\cup x)-F(\gamma)] dx}_{\text{reproduction}}. \end{array} $$
(2)

Here, the upper line corresponds to a death of an individual at location x, which changes the value of the observable F from F(γ) to \(F(\gamma\setminus x)\) (where the notation \(\gamma\setminus x\) means “the set γ, omitting the point x”). Because only existing individuals may die, the sum goes over x ∈ γ. The death rate of the individual located at x is the sum of the density-independent rate m and the density-dependent rate \(\sum_{y\in \gamma \setminus x} a^{-}(x-y)\). In the latter, the sum goes over all other individuals that are present in the system except the focal individual x, and the kernel \(a^{-}(x-y)\) describes the mortality rate imposed by an individual located at y to the individual located at x. The lower line of Eq. (2) corresponds to the birth of a new individual at location x, which changes the value of the observable F from F(γ) to \(F(\gamma\cup x)\). Here, the sum goes over all individuals which are currently present and which may thus produce propagules. The reproduction kernel \(a^{+}(x-y)\) indicates the rate (per unit area) at which newborn individuals are created at location x by a parent located at y. The new individual may be born anywhere within \(\mathbb{R}^{d}\), as indicated by the integral over the space. Note that the per capita fecundity rate is incorporated in the reproduction kernel \(a^{+}\), i.e., \(f=\int _{\mathbb {R}^{d}} a^{+}(x)dx\).

To finalize the model definition, an initial condition must be given. For example, the initial state \(\mu_{0}\) can be the Poisson measure with intensity parameter ρ(x), where we have included the spatial location x in the argument to emphasize that the initial intensity can vary in space.
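The individual-level events encoded in Eq. (2) can be simulated directly with a standard Gillespie-type algorithm. The sketch below is illustrative only and rests on several assumptions that are not part of the model definition: the domain is a two-dimensional torus of side `size` rather than the whole space, both kernels are Gaussian (with \(a^{-}\) integrating to `comp_strength` and \(a^{+}\) integrating to `fecundity`), the background mortality is positive, and all function and parameter names are hypothetical.

```python
import math
import random

def competition_kernel(dx, dy, strength, scale):
    """Assumed Gaussian kernel a^-(x - y) in two dimensions, integrating to `strength`."""
    return strength * math.exp(-(dx * dx + dy * dy) / (2.0 * scale ** 2)) / (2.0 * math.pi * scale ** 2)

def torus_delta(a, b, size):
    """Shortest signed displacement between coordinates a and b on a periodic domain."""
    d = a - b
    return d - size * round(d / size)

def death_rates(points, mortality, strength, scale, size):
    """Per-individual death rate: m plus the sum of a^-(x - y) over all other individuals y."""
    rates = []
    for i, (xi, yi) in enumerate(points):
        rate = mortality
        for j, (xj, yj) in enumerate(points):
            if j != i:
                rate += competition_kernel(torus_delta(xi, xj, size),
                                           torus_delta(yi, yj, size), strength, scale)
        rates.append(rate)
    return rates

def simulate_sslm(points, fecundity, mortality, comp_strength, comp_scale,
                  disp_scale, size, t_end, rng):
    """Gillespie simulation of the SSLM events of Eq. (2) up to time t_end."""
    points = list(points)
    t = 0.0
    while points:
        d_rates = death_rates(points, mortality, comp_strength, comp_scale, size)
        total_birth = fecundity * len(points)
        total = total_birth + sum(d_rates)
        t += rng.expovariate(total)          # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * total < total_birth:
            # Birth: a uniformly chosen parent places one offspring via the dispersal kernel.
            px, py = rng.choice(points)
            points.append(((px + rng.gauss(0.0, disp_scale)) % size,
                           (py + rng.gauss(0.0, disp_scale)) % size))
        else:
            # Death: an individual is removed with probability proportional to its death rate.
            victim = rng.choices(range(len(points)), weights=d_rates)[0]
            points.pop(victim)
    return points
```

Recomputing all pairwise rates after every event costs a number of operations proportional to the square of the population size, which is acceptable for small illustrative runs but not for large populations.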

Spatial moments and cumulants

While the operator L (together with initial measure \(\mu_{0}\)) defines the model, as such it yields no statistical information on how the population behaves. To analyze the model behavior, we turn to the time evolution of spatial moments and spatial cumulants (Fig. 1), which translate the Lagrangian (individual-based) model definition into an Eulerian (population-based) framework. Spatial moments are also called correlation functions, and spatial cumulants are also called truncated correlation functions or semi-invariants (Ruelle 1964, 1969).

Fig. 1

An overview of the mathematical framework presented in this paper. The evolution of measures represents an individual-based (Lagrangian) description of a spatiotemporal point process, whereas the evolutions of spatial moments and spatial cumulants give equivalent population-based (Eulerian) descriptions. The aggregated distribution of particles shown in panel a was generated by a process in which “habitat patches” are generated at a rate 0.05 per unit area. Each patch consists of a Poisson distributed (with mean 20) number of particles distributed in space according to a two-dimensional Gaussian kernel (length scale 0.5 spatial units) centered at the location of the patch. Particles disappear independently of each other at rate 1. The process was simulated for 20 time units to reach approximately the stationary state. Panels b and c show second-order spatial moments and cumulants, respectively, measured either empirically (the points) from the snapshot shown in panel a or computed analytically for the stationary state (the lines) using the formulae given in Cornell and Ovaskainen (2008). With the above parameters, the stationary density of the particles is one, and thus the second-order spatial moment converges to one when the distance between locations diverges. In contrast, the second-order spatial cumulant converges to zero when the distance between locations diverges and thus statistical dependency vanishes

Before defining spatial moments and cumulants, we start from what we expect to be familiar to many theoretical ecologists, namely the usual nonspatial moments and cumulants of a real-valued random variable X. Denoting the probability density function of the random variable X by f, the moment of order n is defined by

$$m_{n} = E\left[X^{n}\right] = \int_{-\infty}^{\infty} x^{n} f(x) dx. $$

A central tool for working with moments is the moment generating function,

$$ M_{X}(t):= E\left[e^{tX}\right]=1+m_{1} t + m_{2} {t^{2} \over 2!} + m_{3} {t^{3}\over 3!} +... $$
(3)

Cumulants are an alternative to moments: there is a one-to-one mapping between cumulants and moments. In some cases, cumulants lead to simpler algebra than moments, e.g., the cumulants for a sum of two independent variables are simply the sum of the cumulants of the variables. Cumulants can be defined through the cumulant generating function

$$ C_{X}(t):=\log E\left[e^{tX}\right]=c_{1} t + c_{2} {t^{2}\over 2!}+ c_{3}{t^{3}\over 3!}+... $$
(4)

The first few moments relate to the corresponding cumulants as

$$\begin{array}{@{}lll@{}} m_{1} &=& c_{1}\\ m_{2} &=& c_{1}^{2} + c_{2} \\ m_{3} &=& c_{1}^{3}+3 c_{1} c_{2}+c_{3}, \end{array} $$

so that more generally for n ≥ 2,

$$\begin{array}{@{}rcl@{}} c_{n} &=& m_{n} -\sum\limits_{\substack{p_{1}+\ldots+p_{s}=n \\ s\geq 2, p_{i}>0\; \text{for}\; i = 1,\ldots,s}}\frac{1}{s!}\frac{n!}{p_{1}!\cdots p_{s}!}c_{p_{1}}\ldots c_{p_{s}}. \end{array} $$
(5)
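The relations between the first few moments and cumulants listed above can be verified numerically. The sketch below does not use the partition sum directly; instead it assumes the standard recursion \(m_{n}=\sum_{k=1}^{n}\binom{n-1}{k-1}c_{k}m_{n-k}\) (with \(m_{0}=1\)), which follows from differentiating \(M_{X}(t)=e^{C_{X}(t)}\). The function names are hypothetical.

```python
from math import comb

def moments_from_cumulants(cumulants):
    """Return [m_1, ..., m_n] from [c_1, ..., c_n] using m_n = sum_k C(n-1, k-1) c_k m_{n-k}."""
    moments = [1]  # m_0 = 1
    for n in range(1, len(cumulants) + 1):
        moments.append(sum(comb(n - 1, k - 1) * cumulants[k - 1] * moments[n - k]
                           for k in range(1, n + 1)))
    return moments[1:]

def cumulants_from_moments(moments):
    """Invert the same recursion: c_n = m_n - sum_{k<n} C(n-1, k-1) c_k m_{n-k}."""
    full = [1] + list(moments)  # prepend m_0 = 1
    cumulants = []
    for n in range(1, len(moments) + 1):
        lower = sum(comb(n - 1, k - 1) * cumulants[k - 1] * full[n - k]
                    for k in range(1, n))
        cumulants.append(full[n] - lower)
    return cumulants
```

As a usage example, a Poisson random variable with mean 2 has all cumulants equal to 2, and `moments_from_cumulants([2, 2, 2])` reproduces \(m_{1}=c_{1}\), \(m_{2}=c_{1}^{2}+c_{2}\), and \(m_{3}=c_{1}^{3}+3c_{1}c_{2}+c_{3}\) for these values.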

Moving to the spatial case, the nth order spatial moment is denoted by the function \(k^{(n)}(x_{1},\ldots,x_{n})\). It is assumed to be symmetric, i.e., invariant to permutation of the arguments \(x_{1},\ldots,x_{n}\). It is said to correspond to the measure μ if the equation

$$\begin{array}{@{}rcl@{}} &&{} \int_{\Gamma} \sum\limits_{\{x_{1},\ldots,x_{n}\}\subset\gamma} f^{(n)}(x_{1},\ldots,x_{n}) d\mu(\gamma)\\ &&=\frac{1}{n!}\int_{(\mathbb{R}^{d})^{n}} f^{(n)}(x_{1},\ldots,x_{n}) k^{(n)}(x_{1},\ldots,x_{n})\,dx_{1}\ldots dx_{n} \end{array} $$
(6)

holds for all symmetric functions \(f^{(n)}: (\mathbb{R}^{d})^{n} \to \mathbb{R}\). The spatial moments of all orders (n = 0, 1, …) are collected into the family \(k=\bigl \{k^{(n)}\bigr \}\), with the zeroth order defined as \(k^{(0)} = 1\). The vector k of all spatial moments is a sufficient description of the state of the system, i.e., it includes the same statistical information as the measure μ (see Electronic Supplementary Material). These objects are called correlation functions in Mathematical Physics and factorial moments in Probability Theory.

For example, let the function \(f^{(1)}(x)\) be the indicator function of a set Ω. Then, by the above definition, the expected number of points within Ω can be written with the help of the first correlation function \(k^{(1)}(x)\) as

$$\begin{array}{@{}rcl@{}} \int_{\Gamma} \sum\limits_{x\in \gamma} \mathbb{1}_{\Omega}(x) d\mu(\gamma) &=& \int_{\mathbb{R}^{d}} \mathbb{1}_{\Omega}(x) k^{(1)}(x) dx\\ &=& \int_{\Omega} k^{(1)}(x) dx. \end{array} $$

Thus, \(k^{(1)}(x)\) corresponds to the expected population density at location x, so that the probability of there being at least one individual in a small neighborhood (of size dx) around the location x is \(k^{(1)}(x)\,dx\). The second-order spatial moment \(k^{(2)}(x_{1},x_{2})\) measures the density of pairs of individuals, so that the probability that there is at least one individual in the neighborhood of the location \(x_{1}\) and simultaneously at least one individual in the neighborhood of the location \(x_{2}\) is \(k^{(2)}(x_{1},x_{2})\,dx_{1}dx_{2}\). More generally, \(k^{(n)}(x_{1},\ldots,x_{n})\,dx_{1}\ldots dx_{n}\) can be interpreted as the probability that there are simultaneously individuals in the neighborhoods of each of the locations \(x_{i}\). Thus, the first-order spatial moment describes expected population density, while the second and higher orders describe dependency among the individuals, i.e., the degree of clustering of the pattern. In the case of Poisson measure (i.e., complete spatial randomness), the spatial moment function of any order n is simply given by the product (Albeverio et al. 1998)

$$k^{(n)}(x_{1},\ldots,x_{n}) = \prod_{i=1}^{n} k^{(1)}(x_{i}). $$

Analogously to nonspatial cumulants (5), the spatial cumulants can be defined recursively as (see, e.g., Ruelle 1964; Kondratiev et al. 2008b)

$$\begin{array}{@{}rcl@{}} u^{( 0)} :&=&0, \\ u^{( 1)}( x) :&=&k^{( 1)}( x) ,\\ u^{( 2)}( x_{1},x_{2} ) :&=&k^{( 2)}( x_{1},x_{2}) -k^{( 1)}( x_{1})k^{( 1)}( x_{2}) \end{array} $$

and, for any finite configuration η with |η| = n ≥ 2,

$$\begin{array}{@{}rcl@{}} u^{( n)}(\eta) :&\,=\,& k^{( n)}(\eta)\,-\,\sum\limits_{\substack{\eta_{1} \sqcup \ldots \sqcup \eta_{s} =\eta \\ s\geq 2, \eta_{i} \neq \emptyset \text{ for } i = 1,\ldots,s}}\frac{1}{s!}u(\eta_{1})... u\left(\eta_{s} \right). \end{array} $$
(7)

Here, ⊔ means a disjoint union (a modified union operation that indexes the elements according to which set they originated in), and the sum includes all permutations of elements of a given partition of η. Thus, as in the nonspatial case, the cumulant of order n is obtained from the moment of order n by subtracting all combinations of lower-order cumulants.

For the Poisson measure, all spatial cumulants of order at least two are zero. Unlike the spatial moments, the spatial cumulants can often be expected to be small in the sense that they tend to zero as the distance between any two points in the definition tends to infinity.
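The recursive definition (7) can be turned into a small computation. Because each unordered partition into s blocks appears s! times among the ordered ones, the factor 1/s! in Eq. (7) amounts to summing once over every unordered set partition with at least two blocks. The sketch below assumes that the points of η are distinct and that the moment family is supplied as a user-defined callable `moment` (a hypothetical interface); it enumerates all set partitions and is therefore practical only for small n.

```python
from itertools import combinations

def set_partitions(items):
    """Yield every partition of `items` into nonempty blocks (each partition once)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for r in range(len(rest) + 1):
        for chosen in combinations(rest, r):
            block = (first,) + chosen
            remaining = [p for p in rest if p not in chosen]
            for tail in set_partitions(remaining):
                yield [block] + tail

def spatial_cumulant(points, moment):
    """Compute u(eta) from k via Eq. (7): k(eta) minus products of cumulants over proper partitions."""
    points = tuple(points)
    if len(points) == 0:
        return 0.0  # u^(0) = 0 by definition
    total = moment(points)
    for partition in set_partitions(points):
        if len(partition) >= 2:
            product = 1.0
            for block in partition:
                product *= spatial_cumulant(block, moment)
            total -= product
    return total
```

For the Poisson measure with constant intensity ρ, passing `moment = lambda pts: rho ** len(pts)` returns ρ for a single point and zero for any configuration of two or more distinct points, in agreement with the statement above.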

Generating functionals

Moment (and cumulant) generating functions provide one of the most central mathematical tools in the analysis of real-valued random variables. In the same manner, much of the mathematical machinery that can be developed for Markov evolutions in the space of locally finite configurations is based on the use of generating functionals. While the material covered elsewhere in the main text of this paper does not require knowledge of generating functionals, we provide here a short and informal treatment of this topic, a slightly more detailed treatment being given in the Electronic Supplementary Material.

The spatial generalization of a moment generating function is that of a spatial moment generating functional, also called the Bogolyubov functional (Kondratiev and Kuna 2002; Kondratiev et al. 2006a), denoted by B(θ). The Bogolyubov functional corresponding to a measure μ is defined as

$$ B(\theta) := \int_{\Gamma} \prod_{x\in\gamma}\bigl(1+\theta(x)\bigr)\,d\mu(\gamma), $$
(8)

where θ is any real-valued function on \(\mathbb{R}^{d}\) with compact support. We note that as the configuration γ is locally finite, and as the function θ has compact support, the product has only a finite number of factors that differ from 1, and thus there is no problem with convergence of the infinite product. This integral is analogous to the definition \(E\left[e^{tX}\right]\) of the usual moment generating function (3). To see this, we note first that integration with respect to the measure μ over the space Γ corresponds to taking an expectation. In the present case, the random variable is not a real value X, but a point configuration γ, thus we cannot simply multiply it by a scalar and exponentiate. For this reason, the scalar is replaced by the function θ, and the exponential is replaced by the product over the value of 1 + θ evaluated at the locations of the point configuration.

An alternative but equivalent (see Electronic Supplementary Material) way of defining the Bogolyubov functional is through spatial moments k corresponding to the measure μ,

$$ B(\theta) := \int_{\Gamma_{0}} \left(\prod_{x\in\eta}\theta(x)\right) k(\eta)\,d\lambda(\eta) $$
(9)

Here (and elsewhere), η denotes a finite configuration, to be distinguished from a potentially infinite but locally finite configuration denoted by γ. Thus \(\eta\in\Gamma_{0}\), the set of all finite subsets of \(\mathbb{R}^{d}\). Here, k is the vector of all spatial moments, and k(η) is evaluated using the nth order spatial moment, where n = |η| denotes the number of points in the finite configuration η. In the equation above, the integral is taken over the space of all finite configurations with respect to the measure λ. This measure is defined for any function H on \(\Gamma_{0}\) as

$$ \int_{\Gamma_{0}} H(\eta)\,d\lambda(\eta):=H^{(0)}+\sum\limits_{n=1}^{\infty} \frac{1}{n!}\int_{(\mathbb{R}^{d})^{n}}H^{(n)}(x_{1},\ldots,x_{n})\,dx_{1}\ldots dx_{n}. $$
(10)

Thus, while our first definition of the Bogolyubov functional corresponds to \(E\left[e^{tX}\right]\), the second definition corresponds to the series expansion \(1+m_{1} t + m_{2} {t^{2} \over 2!} + m_{3} {t^{3}\over 3!} +...\), where the powers of t have been replaced by products of the function θ.

The spatial cumulant generating functional is defined in the same way as the spatial moment generating functional, but replacing moments with cumulants, i.e.,

$$W( \theta ) = \int_{\Gamma_{0}}\left(\prod_{x\in\eta}\theta(x)\right) u(\eta) d\lambda (\eta). $$

In full agreement with the nonspatial case, the generating functionals of spatial moments and spatial cumulants are related to each other as (see Electronic Supplementary Material)

$$B(\theta) = e^{W(\theta)}. $$
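For the Poisson measure with constant intensity ρ, only the first-order cumulant is nonzero, so that \(W(\theta)=\rho\int\theta(x)\,dx\) and hence \(B(\theta)=\exp\bigl(\rho\int\theta(x)\,dx\bigr)\). The following Monte Carlo sketch compares this closed form with definition (8). It assumes a two-dimensional square window that contains the support of θ, and the function names are hypothetical.

```python
import math
import random

def sample_poisson_count(mean, rng):
    """Draw a Poisson(mean) count by counting unit-rate arrivals before time `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def sample_configuration(rho, side, rng):
    """Homogeneous Poisson configuration restricted to the square [0, side]^2."""
    n = sample_poisson_count(rho * side * side, rng)
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n)]

def bogolyubov_estimate(theta, rho, side, replicates, rng):
    """Monte Carlo estimate of B(theta): the mean of prod_{x in gamma} (1 + theta(x))."""
    total = 0.0
    for _ in range(replicates):
        product = 1.0
        for point in sample_configuration(rho, side, rng):
            product *= 1.0 + theta(point)
        total += product
    return total / replicates

def theta_box(point):
    """Example test function: 0.5 on the unit square [0, 1]^2 and 0 elsewhere."""
    x, y = point
    return 0.5 if (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0) else 0.0
```

With ρ = 2 and `theta_box`, the integral of θ equals 0.5, so the estimate should lie close to exp(2 × 0.5) = e, up to Monte Carlo error.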

Time evolution of spatial moments and cumulants

One central aim of this paper is to provide a general recipe for how to translate an equation for the evolution of measures (i.e., the model definition at the individual level) into a set of equations for spatial moments or equivalently for spatial cumulants (Fig. 1). In other words, we present a mapping from the Markov generator L into another linear operator \(L^{\Delta}\), the latter of which describes the time evolution at the level of spatial moments by

$$ \frac{d}{dt}k_{t}(\eta) = \left(L^{\Delta} k_{t}\right)(\eta). $$
(11)

As the derivation is somewhat technical, we present the full version in the Electronic Supplementary Material, and illustrate here only the end result, i.e., the resulting operator \(L^{\Delta}\). For the case of the SSLM, we obtain

$$\begin{array}{@{}rcl@{}} (L^{\Delta} k)(\eta) &=&- \left(m|\eta|+\sum\limits_{x\in\eta}\sum\limits_{y\in\eta\setminus x}a^-(x-y)\right)k(\eta) \end{array} $$
(12)
$$\begin{array}{@{}rcl@{}} & & -\sum\limits_{y\in\eta}\int_{\mathbb{R}^{d}} a^-(x-y)k(\eta\cup x)dx \end{array} $$
(13)
$$\begin{array}{@{}rcl@{}} & & + \sum\limits_{y\in\eta}\left(\sum\limits_{x\in\eta\setminus y}a^+(x-y)\right)k(\eta\setminus y) \end{array} $$
(14)
$$\begin{array}{@{}rcl@{}} & & + \sum\limits_{y\in\eta}\int_{\mathbb{R}^{d}} a^+(x-y)k((\eta\setminus y)\cup x)dx \end{array} $$
(15)

In earlier work in theoretical ecology (e.g., Bolker and Pacala 1997; Law et al. 2003; Ovaskainen and Cornell 2006b; North et al. 2011a), spatial moment equations were derived separately for each of the orders. In contrast, the above equation contains the exact equations for spatial moments simultaneously of all orders, as the finite configuration η may include any number of points.

While the general formula (12–15) may seem complicated at first glance, the terms in it can be interpreted in an intuitive manner. To see this, consider a particular η consisting of n points, \(\eta=\{x_{1},\ldots,x_{n}\}\). By the interpretation of the correlation function, \(k(\eta)\,dx_{1}\ldots dx_{n}\) can be thought of as the probability that a randomly chosen configuration γ contains individuals in the neighborhoods of each of the locations \(x_{i}\). Let us call the individuals near (within the neighborhoods \(dx_{i}\)) the locations η the η-individuals. The negative terms in Eqs. 12–15 represent the rate at which configurations containing η-individuals are lost, which happens if any of the η-individuals dies. The rate at which this happens due to background mortality is m|η|. Concerning density-dependent mortality, there are two options. First, one of the η-individuals may kill another η-individual, a case represented by the double sum in Eq. 12. Second, an individual which is not one of the η-individuals may kill one of the η-individuals. This possibility is represented by Eq. 13. As the individual that imposes mortality on one of the η-individuals can be located at any location x, this term contains an integral over the space. Because the probability of a configuration γ having points both in the neighborhoods of η and at the point x is given by \(k(\eta\cup x)\), this term includes a spatial moment function of the order |η| + 1.

Similarly, the positive terms represent the rate at which a configuration γ which does not contain η-individuals will start doing so. As the model involves only events in which new individuals appear one at a time, the configuration must already involve individuals near all locations in η except one, denoted by y ∈ η. The density of such configurations is \(k(\eta\setminus y)\). Again, there are two possibilities: the parent of the new individual to appear at y may be part of \(\eta\setminus y\) (Eq. 14), or it may be any other individual of the present configuration (Eq. 15).

To make the link from the general Eqs. 12–15 to the earlier literature, let us set η = {x} and thus consider the first spatial moment. In this case, Eq. 11 simplifies to

$$\begin{array}{@{}rcl@{}} {d \over dt}k_{t}^{(1)}(x) &=&- m k_{t}^{(1)}(x) -\int_{\mathbb{R}^{d}} a^-(x-y)k_{t}^{(2)}(y,x)\, dy\\ &&+ \int_{\mathbb{R}^{d}} a^+(x-y)k_{t}^{(1)}(y)\, dy, \end{array} $$

as derived in previous studies (e.g., Bolker and Pacala 1997; Law et al. 2003; Ovaskainen and Cornell 2006b). As the first spatial moment depends on the second spatial moment (or more generally, the spatial moment of order n depends on the spatial moment of order n + 1 if the model involves pairwise interactions), the spatial moment equations form an infinite hierarchy and thus they cannot be solved exactly.
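To see the hierarchy one level up, the same substitution can be repeated for a pair of points. The following worked case is a sketch obtained by inserting \(\eta=\{x_{1},x_{2}\}\) term by term into Eqs. 12–15; it is given for illustration and has not been taken from the cited studies:

$$\begin{array}{@{}rcl@{}} {d \over dt}k_{t}^{(2)}(x_{1},x_{2}) &=&- \left(2m + a^-(x_{1}-x_{2}) + a^-(x_{2}-x_{1})\right)k_{t}^{(2)}(x_{1},x_{2})\\ &&-\int_{\mathbb{R}^{d}} \left(a^-(x-x_{1})+a^-(x-x_{2})\right)k_{t}^{(3)}(x_{1},x_{2},x)\, dx\\ &&+ a^+(x_{2}-x_{1})k_{t}^{(1)}(x_{2}) + a^+(x_{1}-x_{2})k_{t}^{(1)}(x_{1})\\ &&+ \int_{\mathbb{R}^{d}} a^+(x-x_{1})k_{t}^{(2)}(x_{2},x)\, dx + \int_{\mathbb{R}^{d}} a^+(x-x_{2})k_{t}^{(2)}(x_{1},x)\, dx. \end{array} $$

The four lines correspond, in order, to Eqs. 12, 13, 14, and 15. The second line shows explicitly how the third-order moment enters the equation for the second-order moment.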

The corresponding equations for spatial cumulants can be obtained simply by employing the relationship between spatial cumulants and spatial moments. In general, this results in the equation (see Electronic Supplementary Material for details)

$$ \frac{d}{dt}u_{t}(\eta) = \left(Q^{\Delta} u_{t}\right)(\eta), $$
(16)

where, unlike in the spatial moment equations, the operator \(Q^{\Delta}\) is nonlinear. \(Q^{\Delta}\) can be decomposed into linear and nonlinear parts as

$$Q^{\Delta} = L^{\Delta} + M^{\Delta}, $$

where the linear part \(L^{\Delta}\) is the same as that determining the time evolution for spatial moments. For example, in the case of the SSLM, the first-order equation for spatial cumulants reads

$$\begin{array}{@{}rcl@{}} \frac{d}{dt}u_{t}^{(1)}(x) &=&- m u_{t}^{(1)}(x)- \int_{\mathbb{R}^{d}} a^-(x-y)u_{t}^{(2)}(y,x)\,dy\\ &&+\int_{\mathbb{R}^{d}} a^+(x-y)u_{t}^{(1)}(y) \, dy\\ &&-u_{t}^{(1)}(x) \int_{\mathbb{R}^{d}} a^-(x-y)u_{t}^{(1)}(y)\, dy, \end{array} $$
(17)

where the last term on the right-hand side is the nonlinear component. As with the spatial moment equations, the spatial cumulant equations form an infinite hierarchy that cannot be solved exactly, but unlike spatial moments, spatial cumulants of higher orders can be expected to be small, at least in some useful limit.
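As a brief worked step, the nonlinear term can be traced to the competition integral of the first-moment equation. The sketch below assumes the standard second-order relation between moments and cumulants, \(k^{(2)}(x,y) = u^{(2)}(x,y) + u^{(1)}(x)u^{(1)}(y)\), together with \(k^{(1)} = u^{(1)}\); under these assumptions

$$\begin{array}{@{}rcl@{}} \int_{\mathbb{R}^{d}} a^-(x-y)k_{t}^{(2)}(y,x)\, dy &=& \int_{\mathbb{R}^{d}} a^-(x-y)u_{t}^{(2)}(y,x)\, dy\\ &&+\, u_{t}^{(1)}(x) \int_{\mathbb{R}^{d}} a^-(x-y)u_{t}^{(1)}(y)\, dy. \end{array} $$

The first integral on the right has the same form as in the moment equation and thus belongs to the linear part, whereas the product term, in which \(u_{t}^{(1)}(x)\) factors out of the integral over y, is the contribution of the nonlinear part.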

Perturbation expansion around the mean-field limit

The equations for the time evolution of spatial moments (or equivalently, for the spatial cumulants) correspond exactly to the underlying Markov evolution, but as noted above, they cannot be solved in closed form except for some trivial cases (essentially, cases without ecological interactions or density-dependent terms). The main problem here is that the lower-order moments depend on higher-order moments, so that, for example, in the case of the SSLM, the nth order spatial moments (or cumulants) depend on the (n + 1)th order spatial moments (or cumulants). In theoretical ecology as well as in physics and chemistry, much emphasis has been placed on developing moment closures (e.g., Levermore 1996; Murrell et al. 2004; Bolker 2004) and pair approximations (e.g., Matsuda et al. 1992; Keeling et al. 1997; Ellner 2001), the latter of which are analogous to moment closures in the context of discrete-space problems. Moment closure methods produce a closed set of equations by making some structural assumption on how the higher-order moments depend on the lower-order moments.

While moment closure methods have been successfully applied to a wide range of problems, they do not provide a mathematically satisfactory approximation, in the sense that the approximation is uncontrolled. Comparison to simulations is usually the only way to assess how accurate an approximation a given moment closure is for a particular problem, and how the accuracy of the approximation depends on the parameter regime. To overcome this limitation, some of us (OO and SC) have developed in our earlier work (Ovaskainen and Cornell 2006a, b; North and Ovaskainen 2007; Cornell and Ovaskainen 2008; North et al. 2011a, b) an alternative approach, based on considering the full spatial and stochastic model as a perturbation expansion around the mean-field model and working out the first terms of the resulting expansion. Comparison to simulations has suggested that this approach indeed provides a controlled approximation which becomes asymptotically exact at the mean-field limit. However, as our earlier derivations have been of a heuristic nature, we have not been able to show the convergence of the perturbation expansion to the exact solution in a mathematically rigorous manner. In this section, we re-derive the perturbation expansion, building on the mathematical machinery that some of us (DF, OK, and YK) have developed in our earlier work (Kondratiev and Kuna 2002; Kondratiev et al. 2008a; Finkelshtein et al. 2010, 2011, 2012). As we go here beyond the mean-field limit, this section also contains methods and results that are mathematically new.

A mean-field limit generally refers to a situation in which the law of mass action holds, i.e., it assumes that individuals are (at least locally) well-mixed in the sense that the probability of interaction of a randomly chosen individual with any other individual from the same population does not depend on the individual chosen (Morozov and Poggiale 2012). The mean-field limit, also called a mesoscopic limit (Presutti 2009), can be obtained by various kinds of scalings, which have been called, e.g., mean-field, Vlasov, and Lebowitz-Penrose in the mathematical literature. In this paper, we will consider only one particular limit, which is that of long-ranged interactions (Ovaskainen and Cornell 2006b). To define this limit, we note that interactions are described in the operator L with the help of kernels, which describe pairwise interactions between individuals. For example, in the case of the SSLM, two such kernels are involved, namely \(a^{-}\) (competition/density-dependent mortality) and \(a^{+}\) (reproduction and dispersal). For any such kernel a, we define a scaled version \(a_{\varepsilon}\) by

$$ a_{\varepsilon}(x):=\varepsilon^{d} a(\varepsilon x). $$
(18)

As ε → 0, the kernel becomes increasingly flat and long-ranged, while its integral remains constant, i.e.,

$$\int_{\mathbb{R}^{d}} a_{\varepsilon}(x)\,dx=\int_{\mathbb{R}^{d}} a(x)\,dx $$

independently of ε > 0. For a given model defined by an operator L, we define a scaled model by replacing the operator L by \(L_{\varepsilon}\), meaning that all the kernels of L are rescaled as in Eq. 18. Further, we denote the corresponding operators for the spatial moments and cumulants by \(L^{\triangle }_{\varepsilon }\), \(Q^{\triangle }_{\varepsilon }\), and \(M^{\triangle }_{\varepsilon }\), and the solutions to the corresponding equations by \(\hat {k}_{\varepsilon ,t}(\eta )\) and \(\hat {u}_{\varepsilon ,t}(\eta )\). Thus, e.g., \(\hat {k}_{\varepsilon ,t}\) satisfies

$$ \frac{\partial}{\partial t}\hat{k}_{\varepsilon,t} ( \eta )=\left( L^{\triangle }_{\varepsilon} \hat{k}_{\varepsilon,t} \right) ( \eta ). $$
(19)
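The scaling of Eq. 18 can be checked numerically. The following is a minimal sketch for d = 1, assuming a top-hat kernel of height 1/2 on |x| ≤ 1 (the shape used for the competition kernel in Fig. 2); the function names are illustrative only and do not come from the paper.

```python
import numpy as np


def top_hat(x, height=0.5, radius=1.0):
    """Top-hat kernel: `height` for |x| <= radius, zero elsewhere."""
    return np.where(np.abs(x) <= radius, height, 0.0)


def scaled(kernel, eps, d=1):
    """Return the scaled kernel a_eps(x) = eps**d * a(eps * x) of Eq. 18."""
    return lambda x: eps**d * kernel(eps * x)


def integrate(f, half_width, n=400001):
    """Trapezoidal integral of f over [-half_width, half_width] (d = 1)."""
    x = np.linspace(-half_width, half_width, n)
    y = f(x)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


if __name__ == "__main__":
    for eps in (1.0, 0.5, 0.125):
        a_eps = scaled(top_hat, eps)
        # The support widens to |x| <= 1/eps while the height shrinks by eps,
        # so the integral should stay (numerically) at the unscaled value.
        print(eps, integrate(a_eps, half_width=2.0 / eps))
```

As ε decreases, the kernel becomes wider and lower while the printed integral stays close to that of the unscaled kernel, which is the property used in the long-range limit.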

The main idea here is that as ε → 0, the individuals interact with an increasing number of other individuals, and thus the dynamics of the model are expected to follow that of mass action with increasing accuracy. As ε → 0, the spatial patterns that emerge due to the dynamics (e.g., aggregations) are also expected to become increasingly long-ranged, and thus a limiting shape for the spatial moments (and cumulants) should involve a scaling with ε with respect to space. Motivated by this observation, we consider the following renormalization procedure. For an arbitrary c > 0, we denote by \(S_{c}\) the scaling of space

$$(S_{c} k)(\eta):= k(c\eta), $$

where \(\eta \in \Gamma_{0}\) is any finite point configuration and cη denotes the set {cx : x ∈ η}. We further denote by \(L_{\varepsilon ,\text {ren}}^{\triangle }\) a renormalized version of the operator \(L_{\varepsilon}^{\triangle}\), defined by

$$ L_{\varepsilon,\text{ren}}^{\triangle} :=S_{\varepsilon^{-1}} L_{\varepsilon}^{\triangle} S_{\varepsilon} . $$
(20)

We denote the solution to the spatial moment equation corresponding to \(L_{\varepsilon ,\text {ren}}^{\triangle }\) by \(k_{\varepsilon,t}(\eta)\), so that

$$ \frac{\partial}{\partial t}k_{\varepsilon,t} (\eta )=\left( L^{\triangle }_{\varepsilon,\text{ren}} k_{\varepsilon,t} \right) (\eta ). $$
(21)

Comparing (19) to (21) shows that

$$ \hat{k}_{\varepsilon,t}(\eta) = k_{\varepsilon,t} (\varepsilon \eta). $$
(22)

Before proceeding with the analysis, we note the reason why we consider two differently scaled versions of the spatial moments. If simulating the process with a given ε > 0, we obtain an estimate of the spatial moment \(\hat {k}_{\varepsilon ,t}\). However, to consider the limiting procedure analytically, we do not expect \(\hat {k}_{\varepsilon ,t}\) to yield a nontrivial limit, whereas \(k_{\varepsilon,t}\) is expected to do so. Equation 22 connects these two scalings, making it possible to compare simulations to analytical solutions.

We denote by \(u_{\varepsilon,t}\) the spatial cumulant that corresponds to the spatial moment \(k_{\varepsilon,t}\). As we approach the mean-field limit, we expect the higher-order cumulants, which describe the spatial patterns emerging due to localized interactions, to weaken. More precisely, based on our earlier nonrigorous analyses (Ovaskainen and Cornell 2006b), we expect that

$$ u_{\varepsilon,t}(\eta) = v_{t}(\eta) + \varepsilon^{d} w_{t}(\eta) + o(\varepsilon^{d}), $$
(23)

where \(o(\varepsilon^{d})\) refers to a term that goes to zero faster than \(\varepsilon^{d}\) as ε → 0. Here, \(v_{t}\) represents the mean-field, and \(w_{t}\) the leading "correction term". Inserting this assumption into the equation for the time evolution of spatial cumulants yields evolution equations for \(v_{t}\) and \(w_{t}\). In these equations, lower-order cumulants still depend on higher-order cumulants, but equating powers of ε leads to a closed set of equations (Ovaskainen and Cornell 2006b). In the Electronic Supplementary Material, we make this procedure mathematically rigorous by showing that \(v_{t}\) is nonzero only on the space of one-point configurations. In other words, in the limit ε → 0 only the density of individuals matters, the spatial distribution becoming completely random (thus corresponding to a Poisson measure in mathematical terminology).

The above observation results in a closed-form equation for \(v_{t}\), i.e., the mean-field equation. For the case of the SSLM, the only modification is that the term corresponding to the second-order cumulant is dropped from Eq. 17, resulting in

$$\begin{array}{@{}rcl@{}} \frac{d}{dt}v_{t}^{(1)}(x) &=& - m v_{t}^{(1)}(x) -v_{t}^{(1)}(x)\int_{\mathbb{R}^{d}} a^-(x-y)v_{t}^{(1)}(y)\, dy\\ &&+\int_{\mathbb{R}^{d}} a^+(x-y)v_{t}^{(1)}(y)\, dy. \end{array} $$

The usual nonspatial logistic model is obtained by further assuming translational invariance, i.e., that the initial condition is independent of spatial location. In this case, in the limit ε → 0 the expected population density \(\rho _{t}=v^{(1)}_{t}(x)\) becomes independent of location, and evolves as

$$\frac{d}{dt}\rho_{t} =(A^{+}-m)\rho_{t}-A^{-} \rho_{t}^{2}, $$

where \(A^{+}=\int _{\mathbb {R}^{d}} a^{+}(x) dx\) and \(A^{-}=\int _{\mathbb {R}^{d}} a^{-}(x) dx\) denote the integrals of the reproduction and mortality kernels, respectively.
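To make the nonspatial limit concrete, the sketch below integrates the logistic equation for ρ with a classical fourth-order Runge-Kutta scheme. It is an illustration under stated assumptions rather than the numerical method used in the paper: the function name, the step size, and the default arguments are hypothetical, and the parameter values in the usage example are those reported for Fig. 2 (A⁺ = 2, m = 1, A⁻ = 1).

```python
def mean_field_density(rho0, a_plus, a_minus, m, t_end, dt=1e-3):
    """Integrate d(rho)/dt = (A+ - m) * rho - A- * rho**2 with RK4.

    rho0    : initial density
    a_plus  : integral A+ of the reproduction kernel
    a_minus : integral A- of the density-dependent mortality kernel
    m       : density-independent mortality rate
    """
    def rhs(rho):
        return (a_plus - m) * rho - a_minus * rho * rho

    rho = rho0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(rho)
        k2 = rhs(rho + 0.5 * dt * k1)
        k3 = rhs(rho + 0.5 * dt * k2)
        k4 = rhs(rho + dt * k3)
        rho += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return rho


if __name__ == "__main__":
    # Starting below the equilibrium, the density relaxes towards
    # (A+ - m) / A-, the positive fixed point of the logistic equation.
    print(mean_field_density(rho0=0.5, a_plus=2.0, a_minus=1.0, m=1.0, t_end=3.0))
```

With these parameter values the positive equilibrium (A⁺ − m)/A⁻ equals 1, so an initial density of 1 is already at the mean-field equilibrium.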

The convergence of the SSLM to the mean-field has been rigorously proved earlier by Finkelshtein et al. (2013). Our new results (Electronic Supplementary Material) show that the first-order correction term \(w_{t}\) is nonzero only on the space of one- and two-point configurations. Thus, for long-ranged (but finite-ranged) interactions, the two-point spatial cumulant dominates the spatial pattern, the higher-order cumulants being less important. This result motivates the use of symmetric moment closures at the limit of long-ranged interactions, as suggested by Ovaskainen and Cornell (2006b). In the context of the perturbation expansion, the observation that \(w_{t}\) is zero for configurations of more than two points results in a closed differential equation for \(w_{t}\), where the mean-field solution \(v_{t}\) is present as a source term (see Electronic Supplementary Material for the resulting equation for the SSLM).

To illustrate, Fig. 2 compares individual-based simulations to a numerical solution of \(v_{t}\) and \(w_{t}\) for a particular parameterization of the SSLM. As expected, as ε → 0 the density converges to the mean-field limit \(v_{t}\), and the linearized rate at which the system approaches the mean-field density is given by the correction term \(w_{t}\) (Fig. 2a). Further, the limiting shape of the second cumulant is given by the correction term \(w_{t}\) (Fig. 2b) predicted by the perturbation theory. To see this, note that the simulation results (the thin lines in Fig. 2b) converge to the limiting shape (the thick orange line in Fig. 2b) as the length scale parameter ε approaches zero, i.e., when the range of interactions in the simulations is set to increasingly large values. We note that the second-order spatial cumulant is positive and thus the spatial pattern is more aggregated than in the mean-field case of complete spatial randomness. As observed earlier (Law et al. 2003), this results in elevated competition among the individuals, leading to a lower population density than in the mean-field model. Further comparisons between first-order perturbation theory and individual-based simulations can be found in our earlier work in the contexts of population dynamics (Ovaskainen and Cornell 2006a, b; North and Ovaskainen 2007; Cornell and Ovaskainen 2008), evolutionary dynamics (North et al. 2011a, b), and animal movement (Gurarie and Ovaskainen 2013).
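For readers who wish to reproduce this kind of comparison, the following sketch shows one possible way to estimate the second-order spatial cumulant from a simulated one-dimensional point pattern with periodic boundaries. It is not the estimator used for Fig. 2; the function name and the binning scheme are assumptions, and translational invariance is assumed so that the cumulant depends only on the distance between the two points.

```python
import numpy as np


def second_cumulant(x, domain, r_max, n_bins):
    """Estimate u2(r) = k2(r) - rho**2 on a periodic 1D domain.

    x      : 1D array of individual locations in [0, domain)
    domain : length of the periodic domain (r_max must be < domain / 2)
    Returns bin centres and the estimated second-order cumulant per bin.
    """
    x = np.asarray(x, dtype=float)
    d = np.abs(x[:, None] - x[None, :])
    d = np.minimum(d, domain - d)            # shortest periodic distance
    d = d[~np.eye(x.size, dtype=bool)]       # ordered pairs, no self-pairs
    counts, edges = np.histogram(d, bins=n_bins, range=(0.0, r_max))
    dr = edges[1] - edges[0]
    # Ordered pairs at distance in [r, r + dr) lie on either side of the
    # focal individual, hence the factor 2 * dr in the normalization.
    k2 = counts / (domain * 2.0 * dr)
    rho = x.size / domain
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, k2 - rho**2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pattern = rng.uniform(0.0, 2000.0, 2000)  # complete spatial randomness
    r, u2 = second_cumulant(pattern, domain=2000.0, r_max=5.0, n_bins=10)
    print(np.round(u2, 3))                    # values should scatter around zero
```

For a completely random pattern the estimate fluctuates around zero, whereas an aggregated pattern of the kind discussed above would give positive values at short distances. Following Eq. 22, estimates obtained for different ε can be placed on common axes by multiplying distances by ε and dividing the cumulant by \(\varepsilon^{d}\).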

Fig. 2

A comparison between individual-based simulations of the SSLM and a numerical solution to the first-order perturbation expansion of the spatial cumulant equations. Panel a depicts the first-order spatial cumulant, which corresponds to population density, and panel b the second-order spatial cumulant, which corresponds to spatial patterning at the level of two-point correlations. In panel a, the dots represent simulation results with ε = 1, 1/2, 1/4, 1/8, 1/16, and 1/32, whereas the lines show the predictions of the mean-field model (function \(v_{t}\), horizontal line) and first-order perturbation theory (function \(w_{t}\), decreasing line). In panel b, the thin lines show simulation results, with the color of the line corresponding to the different values of ε shown in panel a, whereas the thick orange line shows the prediction of the first-order perturbation theory (function \(w_{t}\)). Note that both axes in panel b have been scaled by ε to result in a nontrivial and finite limiting shape. As we assume a translationally invariant case, the first-order spatial cumulant is independent of space and the second-order spatial cumulant depends only on the distance Δx between the two points. Parameter values: d = 1 (i.e., one-dimensional space), initial distribution Poisson with intensity 1, fecundity \(A^{+} = 2\), density-independent mortality m = 1, and density-dependent mortality \(A^{-} = 1\). Both the fecundity and mortality kernels are assumed to have a top-hat shape (explaining the sharp edges for the second-order spatial cumulant), i.e., \(a^{+}(x) = 1\) if |x| ≤ 1 and \(a^{+}(x) = 0\) otherwise, and \(a^{-}(x) = 1/2\) if |x| ≤ 1 and \(a^{-}(x) = 0\) otherwise. The state of the process is shown at time t = 3. Simulations were conducted using the Gillespie algorithm (Gillespie 1977) in a domain of size U = 200 ε with periodic boundary conditions, and the results shown are the average of 4,000 replicate simulations.
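The simulation setup described in the caption can be sketched as follows. This is a minimal, unoptimized illustration written for this text, not the code used to produce Fig. 2: the function name, its arguments, and the default domain length are assumptions, the kernels are the top-hat kernels of the caption rescaled as in Eq. 18 (so that their half-width is 1/ε), and the domain length is left as a free parameter that must be at least twice the interaction range.

```python
import numpy as np


def simulate_sslm(eps=1.0, domain=50.0, t_end=1.0, m=1.0,
                  a_plus=2.0, a_minus=1.0, rho0=1.0, seed=0):
    """Gillespie simulation of the SSLM in d = 1 with periodic boundaries.

    Kernels are top-hats of half-width 1/eps whose integrals are a_plus
    (reproduction) and a_minus (density-dependent mortality).
    Returns the array of individual locations at time t_end.
    """
    rng = np.random.default_rng(seed)
    radius = 1.0 / eps                     # range of the scaled kernels
    comp = a_minus / (2.0 * radius)        # height of the competition kernel
    # Poisson initial condition with intensity rho0
    x = rng.uniform(0.0, domain, rng.poisson(rho0 * domain))
    t = 0.0
    while x.size > 0:
        d = np.abs(x[:, None] - x[None, :])
        d = np.minimum(d, domain - d)      # periodic distances
        neighbours = (d <= radius).sum(axis=1) - 1   # exclude the focal one
        death = m + comp * neighbours      # per-individual death rates
        birth_total = a_plus * x.size      # each individual reproduces at A+
        total = death.sum() + birth_total
        t += rng.exponential(1.0 / total)  # waiting time to the next event
        if t > t_end:
            break
        if rng.uniform() * total < birth_total:
            # Birth: a random parent places an offspring according to the
            # normalized reproduction kernel (uniform within the radius).
            parent = x[rng.integers(x.size)]
            child = (parent + rng.uniform(-radius, radius)) % domain
            x = np.append(x, child)
        else:
            # Death: the victim is chosen proportionally to its death rate.
            victim = rng.choice(x.size, p=death / death.sum())
            x = np.delete(x, victim)
    return x


if __name__ == "__main__":
    final = simulate_sslm(eps=0.5, domain=100.0, t_end=3.0, seed=1)
    print(final.size / 100.0)              # realized density of one replicate
```

Recomputing all pairwise distances at every event keeps the sketch short but scales quadratically with population size; averaging over many replicates, as done for Fig. 2, would call for incremental updates of the neighbour counts.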

Discussion

In this paper, we have presented a formalism that applies to a wide range of spatially explicit ecological models. The formalism, called Markov evolutions in the space of locally finite configurations, has become well established in the mathematical literature (Kondratiev et al. 2006b, 2008a, b, 2010; Kondratiev and Skorokhod 2006; Finkelshtein et al. 2012, 2010), but has remained practically unnoticed in the theoretical ecology literature. As a consequence, theoretical ecologists have spent much effort in developing mathematical machinery for spatial and stochastic models on their own, including moment closure methods (Bolker and Pacala 1997; Keeling 2000; Filipe and Gibson 2001; Law et al. 2003; Murrell et al. 2004; Bolker 2004; Barraquand and Murrell 2013) and the use of stochastic differential equations as a heuristic means of deriving perturbation expansions (Ovaskainen and Cornell 2006a, b; Cornell and Ovaskainen 2008; North et al. 2011a).

The advantage of the mathematical formalism presented here is that it is not only mathematically rigorous but also economical and transparent. As an example, to our knowledge, the method that we have presented for deriving spatial moment equations is the first one in the theoretical ecology literature that yields spatial moment equations for all orders simultaneously. Further, it is based on a standardized procedure for mapping one operator (L) into another operator (\(L^{\Delta}\)). Hence, there are no biological or heuristic considerations to be made, and thus only mathematics is needed for the transition from a Lagrangian (individual level) to an Eulerian (population level) description of the system.

A well-known property of spatial moments is that, except for trivial cases, the evolution equations form an infinite hierarchy and consequently they cannot be solved with exact methods analytically or even numerically. This problem cannot be avoided with any mathematical formalism. Thus, the transition from the individual level assumptions to spatial moment equations (i.e., from L to \(L^{\Delta}\)) should not be seen as the end result, but only as a starting point for model analysis. As the next step, we have proposed moving to the spatial cumulant equations (i.e., the transition from \(L^{\Delta}\) to \(Q^{\Delta}\)). This transition reveals the nonlinear mean-field equations and serves as a natural starting point for further model analysis. For example, some moment closures correspond to the assumption that cumulants of a given order (typically 3) and higher are set to zero (e.g., Keeling 2000; Marion et al. 2002).

In this paper, we have followed up our earlier work (Ovaskainen and Cornell 2006a, b; Cornell and Ovaskainen 2008; North et al. 2011a) to propose the use of a systematic perturbation expansion around the mean-field model as an alternative to moment closures. The use of such an approach is mathematically appealing because the resulting approximations are controlled in the mathematical sense, i.e., they are guaranteed to converge to the exact solution at the well-defined limit of long-ranged interactions. As demonstrated by Fig. 2, the expansion can also be accurate enough for practical purposes even far from the mean-field model. In this figure, as is also the case generally, the predicted solution converges to the exact solution as ε → 0. However, the solution is also close to the simulated one for, e.g., ε = 1/2, in which case the individuals compete effectively with only a few neighbors, and the realized population density is only about half of the mean-field density. To gain numerically more accurate predictions, higher-order terms could be computed. The expansion can be expected to continue as

$$u_{\varepsilon,t} = v_{t} + \varepsilon^{d} w_{t} + \varepsilon^{2d} z_{t} + o(\varepsilon^{2d}), $$

where the term \(z_{t}\) would be nonzero only up to three-point configurations, but we leave the treatment of this conjecture for further work. Further, we note that while developing a full form of the expansion would have both theoretical and applied value, the first-order correction term \(w_{t}\) is sufficient for many applications.

We have used a version of the stochastic logistic model as an example throughout this paper. In principle, generalizations to other models, including those with multiple entity types, are straightforward. However, in practice, the derivations are of a somewhat technical nature (see Electronic Supplementary Material) and thus require a level of mathematical expertise. What simplifies the problem is that the equations for the evolution of states and for the evolution of spatial moments are linear, and thus more complex models can be built from simpler components. To see this, note that, e.g., the generator L of the SSLM (2) can be written as \(L=L_{DIM}+L_{DDM}+L_{R}\), where the three components refer to density-independent mortality, density-dependent mortality, and reproduction, respectively. The corresponding operator for spatial moments (12–15) splits in the same way into \(L^{\Delta }=L_{DIM}^{\Delta }+L_{DDM}^{\Delta }+L_{R}^{\Delta }\). One way forward could be to construct a library of basic model ingredients for ecologically relevant processes, such as \(L_{DIM}\), \(L_{DDM}\), \(L_{R}\) and corresponding generators for, e.g., the processes of immigration, movements by jumps or diffusion, mutation, infection, etc. Given precomputed operators \(L^{\Delta}\) for such model ingredients, the spatial moment equations for more complex models could be obtained simply by adding the relevant components together.
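The additive structure can be illustrated with a small sketch restricted to the first spatial moment in the translationally invariant case in d = 1, where the second moment depends only on the displacement r between the two points. Everything here is hypothetical: no such library exists in the paper, and the function names and the representation of each component as a callable are assumptions made purely for illustration.

```python
import numpy as np


def trapezoid(f, r):
    """Trapezoidal integral of the sampled function f over the grid r."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(r)))


def density_independent_mortality(m):
    """Contribution -m * k1 to the first-moment equation."""
    return lambda k1, k2, r: -m * k1


def density_dependent_mortality(kernel):
    """Contribution -(integral of a^-(r) * k2(r) dr)."""
    return lambda k1, k2, r: -trapezoid(kernel(r) * k2, r)


def reproduction(kernel):
    """Contribution +(integral of a^+(r) dr) * k1."""
    return lambda k1, k2, r: trapezoid(kernel(r), r) * k1


def combine(*components):
    """Sum the component contributions, mirroring the sum of operators."""
    return lambda k1, k2, r: sum(c(k1, k2, r) for c in components)


if __name__ == "__main__":
    def a_plus(r):
        return np.where(np.abs(r) <= 1.0, 1.0, 0.0)

    def a_minus(r):
        return np.where(np.abs(r) <= 1.0, 0.5, 0.0)

    sslm = combine(density_independent_mortality(1.0),
                   density_dependent_mortality(a_minus),
                   reproduction(a_plus))
    r = np.linspace(-5.0, 5.0, 10001)
    k1 = 0.8
    k2 = np.full_like(r, k1**2)   # Poisson pattern: k2 = k1**2 everywhere
    print(sslm(k1, k2, r))        # close to the logistic rate at density 0.8
```

Adding a further process, such as immigration, would under this design amount to writing one more component function and passing it to `combine`, without touching the existing components.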

Let us finally note that the perturbation expansion around the mean-field is suitable for the analysis of model behavior both at the stationary state and during transient behavior. In our earlier work, we have used the perturbation of eigenvalues to derive invasibility criteria for the evolutionary applications based on adaptive dynamics (North et al. 2011a). We expect that the methods presented in this paper provide a natural starting point for a mathematically rigorous treatment of such extensions. Major challenges to which our formalism is not likely to provide easy solutions include the analysis of non-local quantities such as the probability of finding one or more particles in a finite region, and large-perturbation theory required, e.g., to understand how a system behaves as it approaches the extinction threshold.