1 Introduction

The movement of animals and humans is a fundamental process that drives gene flow, infectious disease spread, and the flow of information and resources through a population (Hanks and Hooten 2013; Coulon et al. 2006; Hooten et al. 2007; Scharf et al. 2015). Movement behavior is complex, often exhibiting directional persistence, response to local environmental conditions, dependence between conspecifics, and changing behavior in time and space. While technological advances have allowed movement (telemetry) data to be collected at high resolution in time and space, most movement data still exhibit nonnegligible observation error, requiring latent variable approaches, such as hidden Markov models (HMMs) or Bayesian hierarchical models (BHMs), to provide inference for movement parameters. The field of movement ecology is broad and growing; many different modeling approaches have been proposed for different species exhibiting different behaviors (e.g., Hooten et al. 2017). A majority of attempts to model movement stochastically rely on unconstrained stochastic processes, with positive probability of movement to any region in space (typically \(\mathbb {R}^2\)). This assumption of unconstrained movement is realistic for many species, but is clearly violated for others, such as marine animals swimming near shorelines (Bjørge et al. 2002; Johnson et al. 2008; Small et al. 2005) or ants constrained to walk inside the confines of a nest (Mersch et al. 2013; Quevillon et al. 2015). In addition, measurement error on telemetry data often results in biologically impossible recorded animal locations, such as a seal being located miles inland, or two successive ant locations being separated by an impassable wall.

Fig. 1 Sea lion telemetry data. Telemetry data from 30 days of observation of a sea lion (Eumetopias jubatus) in southeast Alaska, obtained using the ARGOS system.

We consider spatially constrained animal movement, where an animal can only be present within a known subset \(\mathcal {D}\) of \(\mathbb {R}^2\). To illustrate the need for movement models constrained by space, we consider movement of a Steller sea lion (Eumetopias jubatus), a marine mammal that stays entirely in the water or hauled-out on the shoreline. Figure 1 shows telemetry data obtained using the ARGOS system (ARGOS 2015) from one sea lion over a 30-day observation period from December 6, 2010, to January 5, 2011. Steller sea lions have experienced recent fluctuations in population size and could be threatened by disease, increased fishing in Northern waters, and other factors (Dalton 2005). Understanding where sea lions spend time can inform species management decisions and fishing regulations off the coast of Alaska. Telemetry data provide a natural approach to studying Steller sea lion space use.

Remote tracking of marine mammals is challenging, because common tracking systems (such as GPS) are impeded by water. While the sea lion is always either in the water or hauled-out meters from the water’s edge, many of the telemetry locations are kilometers inland (Fig. 1). If a movement model were fit to the data without accounting for the constraint that the sea lion remain within water at all times, the posterior distribution of paths the animal could have taken would overlap land. This may lead to biased inference for space use or resource selection of pinnipeds (Brost et al. 2015), which could, in turn, lead to inefficient species management decisions. Additionally, inference without considering the spatial constraint (for example, the need to go around an island between telemetry observations) could lead to biased estimates for parameters governing animal movement.

Statistical inference for constrained movement is computationally challenging, because the spatial constraint \(\mathcal {D}\) often makes the evaluation of density functions only possible numerically. We present an approach for modeling constrained animal movement based on reflected stochastic differential equations (RSDEs), which have been used to model constrained processes in many fields. To implement our approach, we present a Markov chain Monte Carlo (MCMC) algorithm for sampling from the posterior distribution of model parameters by augmenting the constrained process with an unconstrained process. We illustrate our approach through a simulation example and an application to telemetry data from the sea lion shown in Fig. 1.

2 Modeling Constrained Movement with Reflected Stochastic Differential Equations

Stochastic differential equation (SDE) models are popular stochastic process models for animal movement (Brillinger et al. 2002; Brillinger 2003; Johnson et al. 2008; Preisler et al. 2013; Russell et al. 2017). Brillinger (2003) considered simulation of animal movement under a constrained RSDE model, but did not consider inference under such a model. We develop a class of SDE models that can capture a wide range of movement behavior and then propose approaches for simulation and inference under this class of models.

2.1 Modeling Observational Error

In general, we assume that we observe animal locations \(\mathbf {s}_t, t\in \{1,2,\ldots ,T\}\), at T distinct points in time \(\{\tau _t,t=1,\ldots ,T\}\). We assume that the locations are in \(\mathbb {R}^2\), with \(\mathbf {s}_t \equiv (s_t^{(1)},s_t^{(2)})'\) representing the observed location at time \(\tau _t\). The extension to higher dimensions (e.g., three-dimensional space) is straightforward. The observations are assumed to be noisy versions of the true animal location \(\mathbf {x}_t \equiv (x_t^{(1)},x_t^{(2)})'\) at time \(\tau _t\), with observation error distribution

$$\begin{aligned} \mathbf {s}_t \sim \ell (\mathbf {x}_t;{\varvec{\theta }}) \end{aligned}$$
(1)

where \({\varvec{\theta }}\) contains parameters controlling the distribution of observations centered at the true location. We begin by leaving this observation error distribution unspecified, develop a general framework for inference, and then apply a specific class of models to the sea lion telemetry data.

To allow for switching between notation for discrete and continuous-time processes, we adopt the following convention for subscripts. A Greek letter in the subscript implies an observation in continuous time, with \(\mathbf {x}_\tau \) being the location of the animal at time \(\tau \). We use a standard Latin letter in the subscript to index a set of locations at a discrete set of times; thus, \(\mathbf {x}_t=\mathbf {x}_{\tau _t}\) represents the individual’s location at time \(\tau _t\). We also adopt the notation that \(\mathbf {x}_{s:t}\equiv \{\mathbf {x}_s,\mathbf {x}_{s+1},\ldots ,\mathbf {x}_{t-1},\mathbf {x}_t\}\) represents the set of \((t-s+1)\) observations in discrete time between the sth and tth observations (inclusive) in the sequence.

In the next section, we will develop a model for movement based on an approximate solution to an SDE. As this approximation operates in discrete time, with a temporal step size of h, the approximation yields latent animal locations \(\mathbf {x}_\tau : \tau \in \{0,h,2h,\ldots ,Th\}\) at discrete times.

2.2 A General SDE Model for Animal Movement

We first consider the unconstrained case (\(\mathcal {D}\equiv \mathbb {R}^2\)) and then consider constrained processes. A class of SDE models that can capture a wide range of movement behavior is expressed as follows. Let the individual’s position at time \(\tau \) be \(\mathbf {x}_\tau \) and define \(\mathbf {v}_\tau \) to be the individual’s true velocity at time \(\tau \), so that

$$\begin{aligned} {\hbox {d}}\mathbf {x}_\tau = \mathbf {v}_\tau {\hbox {d}}\tau . \end{aligned}$$
(2)

This differential equation may be equivalently written as an integral equation (e.g., Hooten and Johnson 2017)

$$\begin{aligned} \mathbf {x}_\tau = \mathbf {x}_0+\int _0^\tau \mathbf {v}_\gamma {\hbox {d}}\gamma ; \end{aligned}$$

however, we adopt the differential equation form throughout this section.

By modeling the time derivative of an individual’s velocity, we focus on modeling acceleration, or, equivalently, the force applied to an individual animal over time. This provides a natural framework for modeling intrinsic and extrinsic forces applied to a moving animal. Consider the following SDE model for the time derivative of velocity

$$\begin{aligned} {\hbox {d}}\mathbf {v}_\tau = -\beta (\mathbf {v}_\tau -{\varvec{\mu }}(\mathbf {x}_\tau ,\tau )){\hbox {d}}\tau + c(\mathbf {x}_\tau ,\tau )\mathbf {I}{\hbox {d}}\mathbf {w}_\tau . \end{aligned}$$
(3)

In (3), \(\beta \) is an autocorrelation parameter, \({\varvec{\mu }}(\mathbf {x}_\tau ,\tau )\) is a function specifying the vector-valued mean direction of movement (drift), perhaps as a function of time \(\tau \) or current location \(\mathbf {x}_\tau \), \(\mathbf {w}_\tau = (w^{(1)}_\tau ,w^{(2)}_\tau )'\) is a vector of two independent standard Brownian motion processes, \(\mathbf {I}\) is the \(2\times 2\) identity matrix, and \(c(\mathbf {x}_\tau ,\tau )\) is a scalar function controlling the magnitude of the stochastic component of (3).

Several existing models for animal movement fit into the general framework defined by (2)–(3). For example, the continuous-time correlated random walk model developed by Johnson et al. (2008), with a constant drift \({\varvec{\mu }}\), is obtained by setting \({\varvec{\mu }}(\mathbf {x}_\tau ,\tau )={\varvec{\mu }}\) and assuming constant stochastic variance across time \(c(\mathbf {x}_\tau ,\tau )=\sigma \). Johnson et al. (2008) also consider a time-varying drift parameter by modeling \(d\mathbf {v}_\tau \) as the sum of two stochastic processes similar to those in (3), operating on different timescales.

As a second example, the potential function approach to modeling animal movement (Brillinger et al. 2001, 2002; Preisler et al. 2004, 2013) results from specifying the drift function as \({\varvec{\mu }}(\mathbf {x}_\tau ,\tau )=-\nabla H(\mathbf {x}_\tau )\), the negative gradient of a potential surface \(H(\mathbf {x})\), which is a scalar function defined in \(\mathbb {R}^2\). In the overdamped case where \(\beta \rightarrow \infty \), and when the stochastic variance is constant over time and space (\(c(\mathbf {x}_\tau ,\tau )=\sigma \)), the SDE (3) reduces to

$$\begin{aligned} {\hbox {d}}\mathbf {x}_\tau = -\nabla H(\mathbf {x}_\tau ){\hbox {d}}\tau + \sigma \mathbf {I}{\hbox {d}}\mathbf {w}_\tau . \end{aligned}$$
(4)

See Brillinger et al. (2001) for details. The velocity-based movement model of Hanks et al. (2011) results from taking a discrete (Euler) approximation to the SDE in (4).

As a third example, the spatially varying SDE approach of Russell et al. (2017) for modeling spatial variation in motility (overall rate of speed) and directional bias could be approximated by setting \(c(\mathbf {x}_\tau ,\tau ) = \sigma m(\mathbf {x}_\tau )\) and \({\varvec{\mu }}(\mathbf {x}_\tau ,\tau )=-m(\mathbf {x}_\tau )\nabla H(\mathbf {x}_\tau )\), where \(H(\mathbf {x})\) is a potential function as in Brillinger et al. (2001), and \(m(\mathbf {x}_\tau )\) is a spatially varying motility surface that acts by dilating or compressing time, as is done by Hooten and Johnson (2017) using a time warping function. While Russell et al. (2017) allow this motility or time-dilation surface to vary across space, Hooten and Johnson (2017) allowed their warping function to vary across time to capture time-varying movement behavior in which individuals exhibit periods of little or no movement interspersed with periods of higher activity.

2.3 Numerically Approximating Constrained SDEs

We consider simulation of a constrained SDE and describe a related approach for inference. The model in (2)–(3) is a semi-linear Itô SDE (e.g., Allen 2007), and in some cases, such as the continuous-time correlated random walk (CTCRW) of Johnson et al. (2008) and the potential function approach of Brillinger et al. (2001), closed-form solutions are available for the transient distribution without spatial constraints. However, when movement is constrained to occur within a fixed spatial domain \(\mathcal {D}\), no closed form for the general transient distribution exists. Thus, we consider numerical approximations to the solution of the SDE, both without the spatial constraint and using modified approximations that account for the spatial constraint \(\mathcal {D}\).

The simplest and most common numerical approximation to the solution to SDE (2)–(3) is the Euler–Maruyama scheme, which results from a first-order Taylor series approximation (e.g., Kloeden and Platen 1992). Given a temporal step size of h, the Euler–Maruyama iterations are

$$\begin{aligned} \mathbf {x}_{\tau +h}&=\mathbf {x}_\tau +\mathbf {v}_\tau h \end{aligned}$$
(5)
$$\begin{aligned} \mathbf {v}_{\tau +h}&=\mathbf {v}_\tau -\beta (\mathbf {v}_\tau -{\varvec{\mu }}(\mathbf {x}_\tau ,\tau ))h+c(\mathbf {x}_\tau ,\tau )\mathbf {I} \mathbf {w}_\tau , \end{aligned}$$
(6)

where \(\mathbf {w}_\tau \mathop {\sim }\limits ^{iid} N(\mathbf {0},h\mathbf {I})\). The numerical approximation in (5)–(6) is known to be of strong order 1/2 (Kloeden and Platen 1992). Russell et al. (2017) use this Euler–Maruyama numerical procedure to specify an approximate statistical model for spatially varying movement behavior of ants.
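As an illustration of iterations (5)–(6), the following is a minimal Python sketch of an unconstrained Euler–Maruyama simulator. It assumes a constant noise scale \(c(\mathbf {x}_\tau ,\tau )=\sigma \); the function name `euler_maruyama` and the drift callback `mu` are hypothetical and are not taken from the authors' code.

```python
import math
import random


def euler_maruyama(x0, v0, beta, sigma, h, n_steps,
                   mu=lambda x, tau: (0.0, 0.0), rng=None):
    """Simulate positions and velocities with the iterations (5)-(6).

    Assumes a constant noise scale c(x, tau) = sigma. `mu` is a hypothetical
    user-supplied drift function returning a 2-vector.
    """
    rng = rng or random.Random()
    xs, vs = [tuple(x0)], [tuple(v0)]
    for k in range(n_steps):
        x, v = xs[-1], vs[-1]
        m = mu(x, k * h)
        # w ~ N(0, h I): each coordinate has standard deviation sqrt(h)
        w = (rng.gauss(0.0, math.sqrt(h)), rng.gauss(0.0, math.sqrt(h)))
        # position update, equation (5)
        x_new = tuple(x[d] + v[d] * h for d in range(2))
        # velocity update, equation (6)
        v_new = tuple(v[d] - beta * (v[d] - m[d]) * h + sigma * w[d]
                      for d in range(2))
        xs.append(x_new)
        vs.append(v_new)
    return xs, vs
```

With `sigma=0` and zero drift, the velocity decays geometrically by a factor \((1-\beta h)\) per step, which gives a quick sanity check on the deterministic part of the scheme.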

2.3.1 A Two-Step Higher-Order Procedure

Brillinger (2003) notes that, for constrained SDEs, the Euler–Maruyama scheme may require a very fine temporal discretization to result in realistic paths, and recommends higher-order numerical schemes be used. One modification of the above Euler–Maruyama procedure involves replacing the velocity \(\mathbf {v}_\tau \) with a first difference approximation. This is similar to the approach taken in Runge–Kutta procedures for solving partial differential equations (e.g., Cangelosi and Hooten 2009; Wikle and Hooten 2010; Cressie and Wikle 2011). From (5), note that \(\mathbf {v}_\tau =(\mathbf {x}_{\tau +h}-\mathbf {x}_\tau )/h\). Substituting this expression for \(\mathbf {v}_{\tau }\) into (6) gives

$$\begin{aligned} \mathbf {x}_{\tau +2h}=\mathbf {x}_{\tau +h}(2-\beta h)+\mathbf {x}_\tau (\beta h -1)+\beta h^2 {\varvec{\mu }}(\mathbf {x}_\tau ,\tau )+\mathbf {w}_\tau , \end{aligned}$$
(7)

where \(\mathbf {w}_\tau \mathop {\sim }\limits ^{iid} N(\mathbf {0},h^3c^2(\mathbf {x}_\tau ,\tau )\mathbf {I})\). This numerical procedure has three main benefits relative to the Euler–Maruyama approach. First, the resulting solution to the unconstrained SDE is an approximation of strong order 1 (Kloeden and Platen 1992) and thus provides a more accurate approximation to the continuous-time solution than does the Euler–Maruyama procedure. Second, this procedure removes the latent velocity \(\mathbf {v}_\tau \) from the probability distribution, which simplifies the transition densities to only rely on animal locations at the two previous time points. Russell et al. (2017) used the Euler–Maruyama approach to motivate a statistical model and treated \(\mathbf {v}_\tau \) as latent variables to be estimated. The two-step procedure in (7) removes the need to make inference on the latent \(\mathbf {v}\). Third, removing the latent velocity from the approximation simplifies the solution in the presence of a spatial constraint \(\mathcal {D}\), because the velocity is not constrained, but the animal’s position is constrained to occur within \(\mathcal {D}\).
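The substitution that leads to (7) can be checked numerically. The sketch below iterates the two-step recursion and compares it with positions generated by (5)–(6) using the same random draws; it assumes zero drift in the comparison and a constant \(c=\sigma \), and the helper name `two_step_path` is hypothetical. The noise term in (7) corresponds to \(h\sigma \) times the \(N(\mathbf {0},h\mathbf {I})\) draw used in (6), which is where the variance \(h^3\sigma ^2\) comes from.

```python
import math
import random


def two_step_path(x0, x1, beta, h, eps, mu=lambda x, tau: (0.0, 0.0)):
    """Iterate the two-step recursion (7) from the first two positions.

    `eps` is a list of 2-vectors playing the role of the noise in (7),
    i.e. draws from N(0, h^3 c^2 I).
    """
    xs = [tuple(x0), tuple(x1)]
    for k, e in enumerate(eps):
        x_prev, x_curr = xs[-2], xs[-1]
        m = mu(x_prev, k * h)  # drift evaluated at the older location, as in (7)
        xs.append(tuple((2 - beta * h) * x_curr[d] + (beta * h - 1) * x_prev[d]
                        + beta * h ** 2 * m[d] + e[d] for d in range(2)))
    return xs


# Compare with Euler-Maruyama (5)-(6) driven by the same randomness.
rng = random.Random(1)
beta, sigma, h, n = 0.4, 2.0, 0.1, 50
x, v = (0.0, 0.0), (1.0, -0.5)
w = [(rng.gauss(0.0, math.sqrt(h)), rng.gauss(0.0, math.sqrt(h)))
     for _ in range(n)]
em = [x]
for wk in w:
    # the right-hand side is evaluated with the old x and v before assignment
    x, v = (tuple(x[d] + v[d] * h for d in range(2)),
            tuple(v[d] - beta * v[d] * h + sigma * wk[d] for d in range(2)))
    em.append(x)

# the draw w_k enters the position two steps later, scaled by h * sigma
eps = [(h * sigma * wk[0], h * sigma * wk[1]) for wk in w[:-1]]
ts = two_step_path(em[0], em[1], beta, h, eps)
max_gap = max(abs(a[d] - b[d]) for a, b in zip(em, ts) for d in range(2))
```

Under these assumptions the two position sequences agree up to floating-point rounding, so `max_gap` is negligibly small.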

2.3.2 Reflected Stochastic Differential Equations for Animal Movement

The SDE in (2)–(3), whose solution is approximated by (7), is not constrained to occur within \(\mathcal {D}\). One theoretical approach to constructing a constrained process is to consider a process \(\mathbf {k}_\tau \) that is defined as the minimal process required to keep \(\mathbf {x}_\tau \) within \(\mathcal {D}\). Thus, we modify (2)–(3) to obtain the constrained stochastic process

$$\begin{aligned} {\hbox {d}}\mathbf {x}_\tau&= \mathbf {v}_\tau {\hbox {d}}\tau +\mathbf {k}_\tau {\hbox {d}}\tau \end{aligned}$$
(8)
$$\begin{aligned} {\hbox {d}}\mathbf {v}_\tau&= -\beta (\mathbf {v}_\tau -{\varvec{\mu }}(\mathbf {x}_\tau ,\tau )){\hbox {d}}\tau + c(\mathbf {x}_\tau ,\tau )\mathbf {I}{\hbox {d}}\mathbf {w}_\tau . \end{aligned}$$
(9)

This approach is a so-called “reflected” stochastic differential equation (RSDE, e.g., Lépingle 1995; Grebenkov 2007; Dangerfield et al. 2012), a generalization of reflected Brownian motion. In reflected Brownian motion, a Brownian trajectory is reflected when it encounters the boundary \(\partial \mathcal {D}\) of the domain \(\mathcal {D}\). While there are many theoretical results for reflected Brownian motion, we note that the SDE in (2)–(3) is a variation on integrated Brownian motion, and therefore results for reflected Brownian motion are not directly applicable here.

The process \(\mathbf {k}_\tau \) is defined as the minimal process required to restrict \(\mathbf {x}_\tau \) to be within \(\mathcal {D}\) and can be described by considering a unit vector \(\mathbf {n}(\mathbf {x})\) that points toward the interior of \(\mathcal {D}\) orthogonal to \(\partial \mathcal {D}\) at \(\mathbf {x}\). Then this minimal process is defined as

$$\begin{aligned} \mathbf {k}_\tau = {\left\{ \begin{array}{ll} \mathbf {0} &{}\text { if }\mathbf {x}_\tau \in \mathcal {D}\\ -\mathbf {n}(\mathbf {x}_\tau )\frac{\mathbf {n}(\mathbf {x}_\tau )'\mathbf {v}_\tau }{\mathbf {n}(\mathbf {x}_\tau )'\mathbf {n}(\mathbf {x}_\tau )} &{}\text { if }\mathbf {x}_\tau \in \partial \mathcal {D} \end{array}\right. }. \end{aligned}$$
(10)

Under this specification, when an individual encounters the boundary \(\partial \mathcal {D}\), the process \(\mathbf {k}_\tau \) nullifies the component of the individual’s velocity that would carry it out of \(\mathcal {D}\), and the individual’s velocity becomes parallel to the boundary \(\partial \mathcal {D}\) until acted upon by other forces (such as \(\mathbf {w}_\tau \)). The process \(\mathbf {k}_\tau \) is defined wherever \(\partial \mathcal {D}\) admits an orthogonal vector \(\mathbf {n}\), which is true for smooth boundaries \(\partial \mathcal {D}\). A natural way to define \(\partial \mathcal {D}\) is as a polygon, which is only piecewise smooth. In this setting, \(\mathbf {n}\) would be undefined at polygon vertices, but for a fine temporal resolution, the latent process \(\mathbf {x}\) will rarely or never directly encounter the vertices.
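The boundary case of (10) amounts to removing the normal component of the velocity. A minimal Python sketch, with the hypothetical helper name `reflect_velocity`, computes \(\mathbf {k}_\tau \) and the effective velocity \(\mathbf {v}_\tau +\mathbf {k}_\tau \) that drives the position in (8):

```python
def reflect_velocity(v, n):
    """Return (k, v + k) with k = -n (n'v) / (n'n), the boundary case of (10).

    `v` is the velocity and `n` a vector normal to the boundary at the
    current location; both are 2-vectors.
    """
    scale = (n[0] * v[0] + n[1] * v[1]) / (n[0] ** 2 + n[1] ** 2)
    k = (-n[0] * scale, -n[1] * scale)
    return k, (v[0] + k[0], v[1] + k[1])


# Example: boundary along the horizontal axis with inward normal (0, 1).
k, v_eff = reflect_velocity((2.0, -3.0), (0.0, 1.0))
# v_eff keeps only the component parallel to the boundary: (2.0, 0.0)
```

The effective velocity is orthogonal to \(\mathbf {n}\) by construction, which matches the statement that movement becomes parallel to the boundary.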

The numerical solution (approximation) to such a constrained SDE (8)–(9) can be obtained in multiple ways. The most common approach (e.g., Lépingle 1995; Grebenkov 2007; Dangerfield et al. 2012) is to consider a projected version of a numerical solution to the unconstrained SDE. This corresponds to the projection approach proposed by Brillinger (2003) for a simpler SDE, who also proposes two other schemes for constraining \(\mathbf {x}_\tau \) to remain within \(\mathcal {D}\). We do not consider these other schemes here, but make note of them in the Discussion.

In a projected approach to solving the RSDE, the two-time-step numerical procedure in (7) is modified by augmenting the solution \(\mathbf {x}_\tau \) to the constrained SDE with an unconstrained process \(\tilde{\mathbf {x}}_\tau \) that may occur outside \(\mathcal {D}\), as follows. Conditioned on the constrained process at all previous times up to \(\tau +h\), denoted \(\mathbf {x}_{1:(\tau +h)}\), the distribution of the unconstrained process \(\tilde{\mathbf {x}}_{\tau +2h}\) is given by (7), written here for a constant stochastic variance \(c(\mathbf {x}_\tau ,\tau )=\sigma \), with

$$\begin{aligned}&\tilde{\mathbf {x}}_{\tau +2h}|\mathbf {x}_{1:(\tau +h)} \sim N\left( (2-\beta h)\mathbf {x}_{\tau +h}+(\beta h-1)\mathbf {x}_{\tau }+\beta h^2 {\varvec{\mu }}(\mathbf {x}_\tau ,\tau ),\sigma ^2h^3\mathbf {I}\right) ,\nonumber \\&\quad \tau =0,h,\ldots ,(T-2)h. \end{aligned}$$
(11)

Any simulated animal location \(\tilde{\mathbf {x}}_{\tau +2h} \notin \mathcal {D}\) that falls outside of the spatial region \(\mathcal {D}\) is projected onto the nearest location \(\mathbf {x}_{\tau +2h} \in \partial \mathcal {D}\) on the spatial boundary

$$\begin{aligned} \mathbf {x}_{\tau +2h} = \mathop {{{\mathrm{argmin}}}}\limits _{\mathbf {u}\in \mathcal {D}} \{ ||\mathbf {u}-\tilde{\mathbf {x}}_{\tau +2h}||\}. \end{aligned}$$
(12)

This results in a computationally efficient approach to simulating sample paths from the constrained SDE in (8)–(9), as the boundary \(\partial \mathcal {D}\) can be approximated as a polygon, and fast algorithms can be specified for projection of a point outside of \(\mathcal {D}\) onto the polygonal boundary \(\partial \mathcal {D}\). Pseudo-code for simulation of the RSDE in (8)–(9) for a given temporal step size h is given in “Appendix A,” and R code to implement this approach is available upon request.
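The projected simulation in (11)–(12) can be sketched in a few lines of Python. This is not the authors' Appendix A pseudo-code or their R implementation; it is an illustrative sketch that assumes \(\mathcal {D}\) is the interior of a single simple polygon given as a list of vertices, zero drift, and constant \(c=\sigma \). The helper names `inside`, `nearest_on_segment`, `project`, and `simulate_rsde` are hypothetical.

```python
import random


def inside(p, poly):
    """Ray-casting test for whether point p lies inside polygon `poly`."""
    x, y = p
    result = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            result = not result
    return result


def nearest_on_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return (a[0] + t * dx, a[1] + t * dy)


def project(p, poly):
    """Projection (12): p itself if inside D, else the nearest boundary point."""
    if inside(p, poly):
        return p
    candidates = [nearest_on_segment(p, a, b)
                  for a, b in zip(poly, poly[1:] + poly[:1])]
    return min(candidates,
               key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def simulate_rsde(x1, x2, beta, sigma, h, n_steps, poly, rng):
    """Simulate (11)-(12) with zero drift and constant c = sigma."""
    xs = [x1, x2]
    sd = sigma * h ** 1.5  # standard deviation implied by variance sigma^2 h^3
    for _ in range(n_steps):
        prev, curr = xs[-2], xs[-1]
        tilde = tuple((2 - beta * h) * curr[d] + (beta * h - 1) * prev[d]
                      + rng.gauss(0.0, sd) for d in range(2))
        xs.append(project(tilde, poly))
    return xs


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
path = simulate_rsde((0.5, 0.5), (0.5, 0.5), 0.5, 5.0, 0.1, 500,
                     square, random.Random(0))
```

For a domain such as open water bounded by land polygons, the inside test would need to be inverted and the candidate edges collected over all polygons; that extension is not shown here.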

3 Inference on RSDE Model Parameters

We now consider inference on the movement parameters \({\varvec{\theta }} \equiv (\beta ,\sigma )^\prime \) from observed telemetry data \(\{\mathbf {s}_i, \ i=1,2,\ldots ,n\}\). To maintain generality in our description, we consider a general observation error model (1), with

$$\begin{aligned} \mathbf {s}_t \sim \ell (\mathbf {x}_t;{\varvec{\theta }}) \end{aligned}$$

and with the latent movement path \(\mathbf {x}_{1:T}\) defined by (11)–(12). We treat our discrete-time approximation (11)–(12) as the statistical model for the latent movement process, rather than the RSDE in (8)–(9). This requires a sufficiently fine temporal resolution h to maintain fidelity to RSDE (8)–(9). Our goal is inference on the latent discrete-time representation of the animal’s movement path \(\{\mathbf {x}_r,\ r=1,2,\ldots ,T\}\) together with the movement parameters \({\varvec{\theta }} \equiv (\beta ,\sigma )^\prime \).

The main difficulty in such inference is the latent unknown movement path \(\mathbf {x}_{1:T}\), because if we were able to condition on \(\mathbf {x}_{1:T}\), inference on \({\varvec{\theta }}\) would be straightforward. If the latent movement path is unconstrained, then model (1), (11)–(12) is a hidden Markov model (HMM), and inference can be made using recursive algorithms such as the Kalman filter (Cappé 2005; Zucchini and MacDonald 2009; Cressie and Wikle 2011).

The projection in (12) is nonlinear; thus, we need to make inference on the states and parameters in a nonlinear (constrained) state space model. Many methods for such inference have been proposed, including the ensemble Kalman filter (Katzfuss et al. 2016), MCMC (Cangelosi and Hooten 2009) and particle filtering (Andrieu et al. 2010; Cappé et al. 2007; Del Moral et al. 2006; Kantas et al. 2009). In particle filtering, the filtering densities \(f(\mathbf {x}_t|\mathbf {s}_{1:t};{\varvec{\theta }})\) are recursively approximated using particles that are propagated at each time point using transition density (11)–(12) and then reweighted based on observation likelihood \(\ell \) (1). Particle filtering approaches to inference, like particle MCMC (Andrieu et al. 2010), are appealing for constrained processes because they do not require the evaluation of transition densities (11)–(12), which are intractable due to the projection, but only require that they be simulated from.
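To make the propagate-and-reweight idea concrete, the following is a minimal bootstrap particle filter sketch in Python. It is not the inference method used in this paper. It assumes a unit-square domain (so that projection reduces to clipping), isotropic Gaussian observation error with standard deviation `obs_sd`, zero drift, and at most one observation per grid time; all function and argument names are hypothetical.

```python
import math
import random


def clip_to_box(p, lo=0.0, hi=1.0):
    """Projection onto the box [lo, hi]^2, a stand-in for a general D."""
    return tuple(min(hi, max(lo, c)) for c in p)


def bootstrap_filter(obs, n_particles, beta, sigma, h, obs_sd, rng):
    """Return filtered mean locations; obs[t] is a 2-vector or None."""
    # each particle stores its two most recent locations (previous, current)
    parts = [((rng.random(), rng.random()),) * 2 for _ in range(n_particles)]
    sd = sigma * h ** 1.5
    means = []
    for s in obs:
        # propagate through (11), then project as in (12)
        new = []
        for prev, curr in parts:
            tilde = tuple((2 - beta * h) * curr[d] + (beta * h - 1) * prev[d]
                          + rng.gauss(0.0, sd) for d in range(2))
            new.append((curr, clip_to_box(tilde)))
        # reweight by the observation likelihood (uniform weights if no fix)
        if s is None:
            weights = [1.0] * n_particles
        else:
            weights = [math.exp(-((s[0] - x[0]) ** 2 + (s[1] - x[1]) ** 2)
                                / (2 * obs_sd ** 2)) for _, x in new]
        total = sum(weights)
        means.append(tuple(sum(w * x[d] for w, (_, x) in zip(weights, new))
                           / total for d in range(2)))
        # multinomial resampling in proportion to the weights
        parts = rng.choices(new, weights=weights, k=n_particles)
    return means


means = bootstrap_filter([(0.5, 0.5)] * 20, 500, 0.5, 1.0, 0.1, 0.1,
                         random.Random(0))
```

Only forward simulation from the transition is needed; the transition density itself is never evaluated, which is the property that makes this family of methods attractive for projected processes.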

3.1 Inference on RSDEs Through Markov Chain Monte Carlo

To make inference on model parameters \({\varvec{\theta }}\equiv (\beta ,\sigma )'\) and the individual’s latent path \(\mathbf {x}_{1:T}\), we constructed an MCMC algorithm to sample from the posterior distribution of \(\mathbf {x}_{1:T},{\varvec{\theta }}|\mathbf {s}_{1:n}\). In doing so, we make explicit use of the simulation procedure in (11)–(12), which is a discretized, constrained movement model. It would be difficult to directly obtain the transition density function, because this would require marginalizing over the auxiliary \(\tilde{\mathbf {x}}_r\)

$$\begin{aligned} q_r(\mathbf {x}_r|\mathbf {x}_{r-1},\mathbf {x}_{r-2}) =\int _{\tilde{\mathbf {x}}_r} [\tilde{\mathbf {x}}_r|\mathbf {x}_{r-1},\mathbf {x}_{r-2}] 1_{\{{{\mathrm{argmin}}}_{\mathbf {u}\in \mathcal {D}} ||\tilde{\mathbf {x}}_r-\mathbf {u}||=\mathbf {x}_r \}} {\hbox {d}}\tilde{\mathbf {x}}_r, \end{aligned}$$

where \([\tilde{\mathbf {x}}_r|\mathbf {x}_{r-1},\mathbf {x}_{r-2}]\) is given by (11).

However, because we have a tractable conditional density for \([\tilde{\mathbf {x}}_r|\mathbf {x}_{r-1},\mathbf {x}_{r-2}]\), and \(\mathbf {x}_r\) is a deterministic function of \(\tilde{\mathbf {x}}_r\) (\(\mathbf {x}_r\) is the projection of \(\tilde{\mathbf {x}}_r\) onto \(\mathcal {D}\)), we constructed an MCMC algorithm that jointly updates \(( \tilde{\mathbf {x}}_r,\mathbf {x}_r\)), as follows. At the mth iteration of the MCMC algorithm, let the current state of the latent constrained process be \(\mathbf {x}_{1:T}^{(m)}\), augmented by the unconstrained \(\tilde{\mathbf {x}}_{1:T}^{(m)}\). To update \(( \tilde{\mathbf {x}}_r,\mathbf {x}_r\)) at one time point r, we propose a new location \(\tilde{\mathbf {x}}_r^* \sim N(\tilde{\mathbf {x}}_r^{(m)},\gamma ^2_r\mathbf {I})\) as a random walk centered on \(\tilde{\mathbf {x}}_r^{(m)}\), with proposal variance \(\gamma ^2_r\). Projecting this proposed location onto \(\mathcal {D}\) (if \(\tilde{\mathbf {x}}_r^*\not \in \mathcal {D}\)) as in (12) gives \(\mathbf {x}_r^*\), the proposed individual location at time \(\tau _r\). The proposed pair \(( \tilde{\mathbf {x}}_r^*,\mathbf {x}_r^*\)) can then be accepted with probability

$$\begin{aligned} p_r=\min \left\{ 1,\frac{[\mathbf {s}_{1:n}|\mathbf {x}_{1:(r-1)}^{(m)},\mathbf {x}^*_r,\mathbf {x}_{(r+1):T}^{(m)}] [\tilde{\mathbf {x}}_{r+2}^{(m)}|\mathbf {x}_{r+1}^{(m)},\mathbf {x}_r^*] [\tilde{\mathbf {x}}_{r+1}^{(m)}|\mathbf {x}_{r}^*,\mathbf {x}_{r-1}^{(m)}] [\tilde{\mathbf {x}}_{r}^*|\mathbf {x}_{r-1}^{(m)},\mathbf {x}_{r-2}^{(m)}] }{[\mathbf {s}_{1:n}|\mathbf {x}^{(m)}_{1:T}] [\tilde{\mathbf {x}}_{r+2}^{(m)}|\mathbf {x}_{r+1}^{(m)},\mathbf {x}_r^{(m)}] [\tilde{\mathbf {x}}_{r+1}^{(m)}|\mathbf {x}_{r}^{(m)},\mathbf {x}_{r-1}^{(m)}] [\tilde{\mathbf {x}}_{r}^{(m)}|\mathbf {x}_{r-1}^{(m)},\mathbf {x}_{r-2}^{(m)}] }\right\} .\nonumber \\ \end{aligned}$$
(13)

The likelihood of the data \([\mathbf {s}_{1:n}|\mathbf {x}_{1:(r-1)}^{(m)},\mathbf {x}^*_r,\mathbf {x}_{(r+1):T}^{(m)}]\) is the likelihood (1) of all observed telemetry locations, conditioned on the latent path and the proposed location \(\mathbf {x}^*_r\) (e.g., Gaussian error for GPS data). Each of the transition densities (e.g., \( [\tilde{\mathbf {x}}_{r}^*|\mathbf {x}_{r-1}^{(m)},\mathbf {x}_{r-2}^{(m)}] \)) is a multivariate normal density given by (11).

This approach allows for Metropolis–Hastings updates for the latent locations \(\mathbf {x}_r\) one at a time. Block updates based on the simulation procedure could also be constructed. We found that updating one location at a time, using adaptive tuning (e.g., Craiu and Rosenthal 2014) for each proposal variance \(\gamma ^2_r\), resulted in acceptable mixing, both in simulation (see Appendix B) and for the sea lion analysis in Sect. 4.
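A single-site update using the acceptance probability (13) might be sketched as follows. This is an illustrative Python sketch, not the authors' implementation. It assumes a unit-square domain (projection by clipping), zero drift, Gaussian observation error at grid times only (so that only the observation at index `r` depends on the updated location), and an interior index `r` for which all three transition terms exist. All names are hypothetical, and adaptive tuning of the proposal scale `gamma` is omitted.

```python
import math
import random


def clip_to_box(p, lo=0.0, hi=1.0):
    """Projection onto the box [lo, hi]^2, a stand-in for a general D."""
    return tuple(min(hi, max(lo, c)) for c in p)


def log_trans(xt_r, x_m1, x_m2, beta, sigma, h):
    """Log Gaussian density (11), up to a constant that cancels in (13)."""
    var = sigma ** 2 * h ** 3
    q = 0.0
    for d in range(2):
        mean = (2 - beta * h) * x_m1[d] + (beta * h - 1) * x_m2[d]
        q += (xt_r[d] - mean) ** 2
    return -q / (2 * var)


def log_obs(s_r, x_r, obs_sd):
    """Gaussian observation log-likelihood (0 if no observation at r)."""
    if s_r is None:
        return 0.0
    return -((s_r[0] - x_r[0]) ** 2 + (s_r[1] - x_r[1]) ** 2) / (2 * obs_sd ** 2)


def update_site(r, x, xt, obs, beta, sigma, h, obs_sd, gamma, rng):
    """One Metropolis-Hastings update of the pair (xt[r], x[r]) via (13).

    Uses 0-based indices and requires 2 <= r <= len(x) - 3.
    """
    xt_star = (rng.gauss(xt[r][0], gamma), rng.gauss(xt[r][1], gamma))
    x_star = clip_to_box(xt_star)  # projection as in (12)

    def log_target(x_r, xt_r):
        return (log_obs(obs[r], x_r, obs_sd)
                + log_trans(xt[r + 2], x[r + 1], x_r, beta, sigma, h)
                + log_trans(xt[r + 1], x_r, x[r - 1], beta, sigma, h)
                + log_trans(xt_r, x[r - 1], x[r - 2], beta, sigma, h))

    log_ratio = log_target(x_star, xt_star) - log_target(x[r], xt[r])
    if rng.random() < math.exp(min(0.0, log_ratio)):
        x[r], xt[r] = x_star, xt_star
        return True
    return False


rng = random.Random(0)
x = [(0.5, 0.5)] * 10      # constrained path
xt = list(x)               # unconstrained augmentation
obs = [None] * 10
obs[5] = (0.6, 0.5)
for _ in range(200):
    for r in range(2, 8):
        update_site(r, x, xt, obs, 0.5, 1.0, 0.1, 0.05, 0.02, rng)
```

Because the random-walk proposal is symmetric, no proposal ratio appears, consistent with (13). With linear interpolation of observation times as in Sect. 4, the likelihood term would also involve observations falling in the neighboring grid intervals.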

To complete the MCMC algorithm, we update the movement parameters \({\varvec{\theta }} \equiv (\beta ,\sigma )'\), conditioned on \(\tilde{\mathbf {x}}_{1:T}\). One could specify conjugate priors (e.g., a Gaussian prior for \(\beta \) and an inverse gamma prior for \(\sigma ^2\)), or one could use block Metropolis–Hastings updates to jointly update the movement parameters \(\beta \) and \(\sigma \) at each iteration of the MCMC algorithm. We favor the latter approach because it allows for more flexible prior specification. Similar update schemes could be used for parameters in observation error model (1).

In “Appendix B,” we show a simulation example where we simulate movement constrained to lie within a polygon \(\mathcal {D}\), and make inference on model parameters. Code to replicate this simulation study is available upon request.

4 Modeling Constrained Sea Lion Movement

Having specified an RSDE-based approach for simulating animal trajectories constrained to lie within a domain \(\mathcal {D}\), and for making inference on model parameters, we now apply this approach to the sea lion telemetry data.

4.1 Telemetry Data

As described in the Introduction, we consider the telemetry observations obtained from a sea lion off the coast of Alaska from December 6, 2010, to January 5, 2011. In this 30-day period of observation, \(n=211\) telemetry observations were obtained using the ARGOS system (ARGOS 2015). The ARGOS system is unable to obtain a location fix when the sea lion is under water; thus, the telemetry observations \(\mathbf {s}_{1:n}\) were obtained at n irregular times \(\tau _i, i\in \{1,2,\ldots ,n\}\). Each ARGOS telemetry location is also accompanied by a code \(c_i\in \{3, 2, 1, 0, A, B\}\) specifying the precision of the location fix at each time point, where \(c_i=3\) corresponds to observations with the highest precision and \(c_i=B\) corresponds to observations with the lowest precision. Multiple studies have shown that ARGOS error has a distinctive X-shaped pattern (Costa et al. 2010; Brost et al. 2015), and that error variance increases as the location class moves from 3 to B.

4.2 Model and Inference

To model the X-shaped error distribution, we follow Brost et al. (2015) and Buderman et al. (2016) and model observation error using a mixture of two multivariate t-distributed random variables, centered at the individual’s true location \(\mathbf {x}_{\tau _i}\)

$$\begin{aligned} \mathbf {s}_i \sim {\left\{ \begin{array}{ll} \text {MVT}(\mathbf {x}_{\tau _i},{\varvec{\Sigma }}_{c_i},\nu _{c_i}) &{}\text { w.p. }0.5 \\ \text {MVT}(\mathbf {x}_{\tau _i},{\varvec{\Sigma }}^*_{c_i},\nu _{c_i}) &{}\text { w.p. }0.5 \end{array}\right. }. \end{aligned}$$
(14)

In (14), \(\nu _{c_i}\) is the degrees of freedom parameter for ARGOS error class \(c_i\), and \({\varvec{\Sigma }}_{c_i}\) and \({\varvec{\Sigma }}^*_{c_i}\) capture the X-shaped ARGOS error pattern

$$\begin{aligned} {\varvec{\Sigma }}_{c} = \kappa _c^2 \begin{bmatrix} 1&\quad \rho _c \sqrt{a_c} \\ \rho _c \sqrt{a_c}&\quad a_c \end{bmatrix}, \quad {\varvec{\Sigma }}_{c}^* = \kappa _c^2 \begin{bmatrix} 1&\quad -\rho _c \sqrt{a_c} \\ -\rho _c \sqrt{a_c}&\quad a_c \end{bmatrix} \end{aligned}$$

and \({\varvec{\theta }}_c\equiv (\kappa _c,\rho _c,a_c,\nu _c)^\prime \) are ARGOS class-specific error parameters. See Brost et al. (2015) for additional details, and Costa et al. (2010) for an empirical analysis of ARGOS error patterns. As the distribution of ARGOS error has been studied extensively, we consider the ARGOS error parameters \({\varvec{\theta }}_c\) to be fixed and known. For this study, we set these parameters equal to the posterior means reported in Appendix D of Brost et al. (2015).
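The mixture in (14) can be evaluated in closed form. The Python sketch below assumes that MVT denotes the standard bivariate t distribution with location \(\mathbf {x}\), scale matrix \({\varvec{\Sigma }}\), and \(\nu \) degrees of freedom, for which the normalizing constant in two dimensions simplifies to \(1/(2\pi |{\varvec{\Sigma }}|^{1/2})\). The function names are hypothetical, and the parameter values in the usage lines are arbitrary illustrations, not the posterior means of Brost et al. (2015).

```python
import math


def mvt2_logpdf(s, x, kappa, rho, a, nu, sign=1.0):
    """Log density of a bivariate t with location x, df nu and scale matrix
    kappa^2 * [[1, sign*rho*sqrt(a)], [sign*rho*sqrt(a), a]]."""
    d1, d2 = s[0] - x[0], s[1] - x[1]
    off = sign * rho * math.sqrt(a)
    det = kappa ** 4 * (a - off ** 2)  # determinant of the scale matrix
    # quadratic form d' Sigma^{-1} d for the 2x2 scale matrix
    quad = kappa ** 2 * (a * d1 ** 2 - 2 * off * d1 * d2 + d2 ** 2) / det
    return (-math.log(2 * math.pi) - 0.5 * math.log(det)
            - 0.5 * (nu + 2) * math.log1p(quad / nu))


def argos_logpdf(s, x, kappa, rho, a, nu):
    """Log density of the equal-weight two-component mixture (14)."""
    l1 = mvt2_logpdf(s, x, kappa, rho, a, nu, sign=1.0)
    l2 = mvt2_logpdf(s, x, kappa, rho, a, nu, sign=-1.0)
    m = max(l1, l2)  # log-sum-exp for numerical stability
    return m + math.log(0.5 * math.exp(l1 - m) + 0.5 * math.exp(l2 - m))


# Arbitrary illustrative values for one error class.
lp_a = argos_logpdf((1.0, 2.0), (0.0, 0.0), kappa=2.0, rho=0.5, a=0.8, nu=4.0)
lp_b = argos_logpdf((1.0, -2.0), (0.0, 0.0), kappa=2.0, rho=0.5, a=0.8, nu=4.0)
```

The two components differ only in the sign of the off-diagonal term, so the mixture density is symmetric under a sign flip of either coordinate of the error, which produces the X shape.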

To model sea lion movement, which is constrained to be in water \((\mathbf {x}_\tau \in \mathcal {D})\), we consider a continuous-time model defined as a linear interpolation of the numerical approximation (11)–(12) of RSDE (8)–(9). For a given temporal step size h, taken to be \(h=5\) min for this analysis, we consider an approximation to the RSDE at times \(t_r\equiv \tau _1+rh, r\in \{0,1,\ldots ,T,\ T\equiv 30\times 24\times 12 = 8640\}\). At any observation time \(\tau _i\), the individual’s position is given by a linear interpolation of the discrete approximation to the RSDE at the two nearest time points \(t_{r(i)},t_{r(i)+1}\), where \(\tau _i \in (t_{r(i)},t_{r(i)+1})\) and

$$\begin{aligned} \mathbf {x}_{\tau _i} = \mathbf {x}_{r(i)+1}\frac{\tau _i-t_{r(i)}}{h}+\mathbf {x}_{r(i)}\frac{t_{r(i)+1}-\tau _i}{h}. \end{aligned}$$
(15)
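A short Python sketch of standard linear interpolation between the two grid points that bracket an observation time is given below, with each endpoint weighted by its proximity to \(\tau _i\). The function name `interpolate` and its argument layout are hypothetical; times are assumed to be measured in the same units as h.

```python
import math


def interpolate(tau, tau1, h, xs):
    """Location at time tau by linear interpolation of grid locations,
    where xs[r] is the location at grid time t_r = tau1 + r * h."""
    r = min(int(math.floor((tau - tau1) / h)), len(xs) - 2)
    w = (tau - (tau1 + r * h)) / h  # weight on the later grid point
    return tuple((1 - w) * xs[r][d] + w * xs[r + 1][d] for d in range(2))


# Number of 5-minute steps in the 30-day observation window.
T = 30 * 24 * 12  # 8640

# Halfway between two grid points, the interpolated location is their midpoint.
mid = interpolate(2.5, 0.0, 5.0, [(0.0, 0.0), (10.0, 20.0)])
```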

The RSDE model for movement is approximated at discrete times \(t_r\equiv \tau _1+rh, r\in \{0,1,\ldots ,T\}\) according to (11)–(12). An alternative to this linear interpolation is to augment the approximation times (\(t_r\equiv \tau _1+rh, r\in \{0,1,\ldots ,T\}\)) with the observation times (\(\tau _i,\ i=1,\ldots ,n\)). This results in a nonuniform step size between time points at which the RSDE is approximated. The computational complexity of simulating the RSDE is linear in the number of time points, so the addition of the n additional time points is computationally feasible in many situations. The error in numerical approximation (7) to the SDE scales with the largest time h between approximation times (Kloeden and Platen 1992), so it is not clear that adding these n additional time points would result in an increase in numerical efficiency. We thus retain our regular temporal resolution, with a step size of h.

For this study of sea lion movement, we characterize space use over time. While it would be possible in some situations to model sea lion movement as being attracted to a haul-out or other central point (Hanks et al. 2011; Brost et al. 2015), we do not consider this here because our goal is only to characterize space use. Thus, we set \({\varvec{\mu }}(\mathbf {x}_\tau ,\tau )=0\). We also assume a constant variance in the velocity process over time and space, with \(\sigma ^2\equiv c^2(\mathbf {x}_\tau ,\tau )\). The resulting model for the discretized movement process constrained to be within water is

$$\begin{aligned} \mathbf {x}_{r}|\tilde{\mathbf {x}}_r&= \mathop {{{\mathrm{argmin}}}}\limits _{\mathbf {u}\in \mathcal {D}} \{ ||\mathbf {u}-\tilde{\mathbf {x}}_{r}||\},\quad&r=1,2,\ldots ,T \end{aligned}$$
(16)
$$\begin{aligned} \tilde{\mathbf {x}}_{r}|\mathbf {x}_{r-1},\mathbf {x}_{r-2}&\sim N\left( (2-\beta h)\mathbf {x}_{r-1}+(\beta h-1)\mathbf {x}_{r-2},\sigma ^2h^3\mathbf {I}\right) ,\quad&r=3,4,\ldots ,T. \end{aligned}$$
(17)

This model is illustrated conceptually in Fig. 2.
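The two-step recursion in (16)–(17) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the implementation used for the analysis: the domain \(\mathcal {D}\) is taken to be the plane outside a circular "island" so that the projection has a closed form, and the names `project_outside_disk` and `simulate_projected_path` are hypothetical.

```python
import numpy as np

def project_outside_disk(u, radius=1.0):
    """Nearest point to u in the toy domain D = {x : ||x|| >= radius}."""
    norm = np.linalg.norm(u)
    if norm >= radius:
        return u                              # already inside D
    if norm == 0.0:
        return np.array([radius, 0.0])        # arbitrary tie-break at the centre
    return u * (radius / norm)                # push radially onto the boundary

def simulate_projected_path(x1, x2, beta, sigma, h, T, rng,
                            project=project_outside_disk):
    """Forward-simulate T positions; rows are 0-indexed, so row r is x_{r+1}."""
    x = np.empty((T, 2))
    x[0], x[1] = x1, x2
    sd = sigma * h ** 1.5                     # sqrt(sigma^2 h^3), as in (17)
    for r in range(2, T):
        mean = (2.0 - beta * h) * x[r - 1] + (beta * h - 1.0) * x[r - 2]
        x_tilde = mean + sd * rng.standard_normal(2)   # unconstrained proposal
        x[r] = project(x_tilde)                        # projection step (16)
    return x
```

For a coastline represented by polygons, `project` would be replaced by a routine that finds the nearest point of the water domain; the recursion itself is unchanged.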

Fig. 2

Modeling movement using projected processes. A time-discretized solution to the reflected SDE is obtained by first forward simulating from the transition density to obtain \(\tilde{\mathbf {x}}_{t}|\mathbf {x}_{t-1},\mathbf {x}_{t-2}\), and then projecting \(\tilde{\mathbf {x}}_t\) onto \(\mathcal {D}\) to obtain \(\mathbf {x}_t\).

We complete the hierarchical state space model by specifying prior distributions for all parameters. For the initial two time points, we specify independent uniform priors

$$\begin{aligned} \mathbf {x}_1\sim \text {Unif}(\mathcal {D}),\quad \mathbf {x}_2\sim \text {Unif}(\mathcal {D}) \end{aligned}$$
(18)

and we specify independent half-normal priors for the autocorrelation parameter \(\beta \) and the Brownian motion standard deviation \(\sigma \)

$$\begin{aligned} {[}\beta ] \propto \exp \left\{ -\beta ^2/(2\gamma ^2_\beta )\right\} 1_{\{\beta>0\}}\quad ,\quad [\sigma ] \propto \exp \left\{ -\sigma ^2/(2\gamma ^2_\sigma )\right\} 1_{\{\sigma >0\}} \end{aligned}$$
(19)

with \(\gamma _\sigma =\gamma _\beta =100\) as hyperparameters.
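For concreteness, the unnormalized log prior implied by (19) can be written as a short Python function. The function names below are chosen for this sketch only; the default scale of 100 is the hyperparameter value stated above.

```python
import math

def log_half_normal(value, gamma=100.0):
    """Unnormalized log density of a half-normal prior with scale gamma."""
    if value <= 0.0:
        return -math.inf                      # indicator 1_{value > 0}
    return -value ** 2 / (2.0 * gamma ** 2)

def log_prior(beta, sigma, gamma_beta=100.0, gamma_sigma=100.0):
    """Sum of the independent half-normal log priors for beta and sigma."""
    return log_half_normal(beta, gamma_beta) + log_half_normal(sigma, gamma_sigma)
```

Returning negative infinity outside the support means that a Metropolis–Hastings proposal with a nonpositive \(\beta \) or \(\sigma \) is always rejected.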

Our goal is inference on all parameters in the hierarchical Bayesian model for animal movement in (14)–(19). We constructed an MCMC algorithm to draw samples from the posterior distribution of model parameters, conditioned on the observed telemetry data, using methods described in Sect. 3.1. We used variable-at-a-time Metropolis–Hastings updates (13) for the latent locations \((\tilde{\mathbf {x}}_r,\mathbf {x}_r)\) and used block Metropolis–Hastings updates to jointly update the movement parameters \(\beta \) and \(\sigma \) at each iteration of the MCMC algorithm. Random walk proposal distributions were specified for all parameters, and the variance of each proposal distribution was tuned adaptively using the log-adaptive procedure of Shaby and Wells (2010).
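To make the structure of a block random-walk update with adaptive tuning concrete, a minimal Python sketch follows. It is an assumption-laden illustration: the adaptation rule is a simplified stand-in that nudges the log proposal scale toward a target acceptance rate, not the log-adaptive procedure of Shaby and Wells (2010), and `adaptive_rw_metropolis`, `batch`, and `target` are hypothetical names and settings.

```python
import numpy as np

def adaptive_rw_metropolis(log_post, theta0, n_iter, rng, batch=50, target=0.3):
    """Block random-walk Metropolis with a simple batch-wise scale adaptation."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    log_step = 0.0                            # log of the proposal std. dev.
    accepted = 0
    samples = np.empty((n_iter, theta.size))
    for it in range(n_iter):
        proposal = theta + np.exp(log_step) * rng.standard_normal(theta.size)
        lp_prop = log_post(proposal)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
            accepted += 1
        samples[it] = theta
        if (it + 1) % batch == 0:
            k = (it + 1) // batch
            # Diminishing adaptation: widen if accepting too often, else shrink.
            log_step += (accepted / batch - target) / np.sqrt(k)
            accepted = 0
    return samples, np.exp(log_step)
```

Here `log_post` would return the log of the prior times the likelihood for \((\beta ,\sigma )\) given the current latent path, and negative infinity outside the parameter support.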

To initialize the MCMC algorithm for the sea lion analysis, we first chose starting values for movement parameters (\(\beta _0=0.001,\sigma _0=1\) km) and used a particle filtering algorithm (Cappé et al. 2007; Kantas et al. 2009) to provide a starting movement path \(\mathbf {x}_{1:T}\) constrained to be in water. We initialized \(\tilde{\mathbf {x}}_r=\mathbf {x}_r\) for each time point and then ran the MCMC algorithm for 200,000 iterations. Convergence was assessed visually, with chains for \(\beta \), \(\sigma \), and each \(\mathbf {x}_r\) showing good mixing. The entire procedure required 14 h on a single core of a 2.7 GHz Intel Xeon processor. Code to replicate this analysis is available upon request.

4.3 Results

The posterior mean for \(\log (\sigma )\), which controls the variance of the Brownian motion process on velocity, was \(\log (\hat{\sigma })=11.8\), with an equal-tailed 95\(\%\) credible interval of (11.2, 12.4). The posterior mean for \(\log (\beta )\), which controls autocorrelation beyond that implied by IBM, was \(\log (\hat{\beta })=-7.1\), with an equal-tailed 95\(\%\) credible interval of \((-7.6,-6.2)\). The small estimated value for \(\beta \) implies that this term may not be needed in the model, and that IBM could be an appropriate model for this sea lion’s movement.
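As a brief check of this interpretation, setting \(\beta =0\) in the mean of (17) gives

$$\begin{aligned} (2-\beta h)\mathbf {x}_{r-1}+(\beta h-1)\mathbf {x}_{r-2}\Big |_{\beta =0} = 2\mathbf {x}_{r-1}-\mathbf {x}_{r-2} = \mathbf {x}_{r-1}+(\mathbf {x}_{r-1}-\mathbf {x}_{r-2}), \end{aligned}$$

so the expected next position is the current position plus the most recent displacement, with no damping of velocity. On the original scale, the posterior mean of \(\log (\beta )\) corresponds to \(\exp (-7.1)\approx 8.3\times 10^{-4}\), in the same time units used for \(\beta \) in the model.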

Fig. 3

Ten sample paths from the posterior distribution of sea lion paths are shown in a, with each path plotted in a different color. These paths show a propensity of the sea lion to stay close to coastlines. A single path is shown in blue in b, with the telemetry observations shown as red points, with lines connecting the telemetry observation to the estimated location \(\mathbf {x}_{\tau _r}\). The imputed path between observed telemetry locations skirts islands and other barriers, as the movement model constrains the sea lion to be in the water or on the shoreline at all times (Color figure online).

Figure 3a shows 10 realizations of paths \(\mathbf {x}_{1:T}\) from the posterior distribution, and Fig. 3b shows one path realization together with the observed ARGOS telemetry locations \(\mathbf {s}_{1:n}\), with lines drawn from the telemetry locations to the realization of the individual’s location at the time of observation. Figure 4a shows 10 realizations from the posterior path distribution of the animal on December 11, 2010, as it navigated a narrow passage. Figure 4b–j shows the posterior distribution of the sea lion’s location \(\mathbf {x}_t\) at 15-min intervals. Our temporal discretization had a step size of \(h=5\) min. Thus, there are two time points in our latent representation of the movement process between each pair of successive time points shown in Fig. 4b–j. Using a coarser time discretization would speed up computation at the expense of realism, as linear interpolation (15) would result in paths that cross larger portions of land.

The spatial constraint is clear in both Figs. 3 and 4 and shows that our RSDE approach was successful in modeling realistic animal movement that is spatially constrained to occur within water (\(\mathcal {D}\)). The posterior distribution of \(\mathbf {x}_{1:T}|\mathbf {s}_{1:n}\) (Fig. 3a) estimates sea lion space use over the 30 days of observation and indicates that the individual spends a majority of its time near land, as expected for this species of pinniped.

Fig. 4

a Ten paths from the posterior distribution of sea lion locations as it navigates a narrow passage. b–j In total, 5000 samples from the posterior distribution of sea lion locations at 15-min intervals. The posterior distribution shows that the individual is constrained to be within water (shown in white) or on the shoreline.

5 Discussion

We developed an approach for modeling spatially constrained animal movement based on a numerical approximation to a stochastic differential equation. The base SDE is general enough to capture a range of realistic animal movement, and the two-step procedure in Sect. 2.2.1 leads to a computationally tractable transition density. Our approach to constraining movement is based on the reflected SDE literature and consists of projecting numerical approximations to the solution of the SDE onto the domain \(\mathcal {D}\).

Our approach for inference is computationally challenging, and future work will consider approaches that make inference more computationally efficient. The main computational burden for each iteration of the MCMC algorithm is projecting each latent \(\tilde{\mathbf {x}}_{1:T}\) onto \(\mathcal {D}\). We coded Algorithm 1 using C++, but there is still significant room for improving computational efficiency. Algorithm 1 assumes that \(\partial \mathcal {D}\) is given as a polygon or set of polygons, and checks each side of each polygon. When \(\mathcal {D}\) is large relative to the distance an animal can move between telemetry observations, Algorithm 1 could potentially be made more efficient by only considering a subset of polygon edges for each time point. An additional computational difficulty comes from the lack of conjugacy for the long latent time series \(\mathbf {x}_{1:T}\). Our approach is to use adaptively tuned Metropolis–Hastings steps for each time point. One possible future approach is to construct a joint proposal for all \(\tilde{\mathbf {x}}_{1:T}\) through a forward filtering, backward sampling algorithm (e.g., Cressie and Wikle 2011) ignoring constraints. This would provide an approach for block updates of \((\mathbf {x}_{1:T},\tilde{\mathbf {x}}_{1:T})\), which may improve mixing of the MCMC algorithm.
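The idea of restricting the search to nearby polygon edges can be sketched as follows. This Python example is not Algorithm 1; it only finds the nearest point on a polygon boundary, with an optional bounding-box prefilter controlled by a hypothetical `search_radius` argument, and it falls back to checking every edge whenever the prefilter cannot guarantee the correct answer.

```python
import numpy as np

def closest_point_on_segment(p, a, b):
    """Nearest point to p on the segment from a to b."""
    ab = b - a
    denom = ab @ ab
    if denom == 0.0:
        return a                              # degenerate edge
    t = np.clip(((p - a) @ ab) / denom, 0.0, 1.0)
    return a + t * ab

def project_to_boundary(p, vertices, search_radius=None):
    """Nearest point to p on a closed polygon with vertices of shape (m, 2)."""
    a = vertices
    b = np.roll(vertices, -1, axis=0)         # edge i runs from a[i] to b[i]
    idx = np.arange(len(a))
    if search_radius is not None:
        # Keep only edges whose padded bounding box contains p.
        lo = np.minimum(a, b) - search_radius
        hi = np.maximum(a, b) + search_radius
        keep = np.all((p >= lo) & (p <= hi), axis=1)
        if keep.any():
            idx = idx[keep]
    best, best_d = None, np.inf
    for i in idx:
        q = closest_point_on_segment(p, a[i], b[i])
        d = np.linalg.norm(p - q)
        if d < best_d:
            best, best_d = q, d
    if search_radius is not None and best_d > search_radius:
        # A discarded edge could be closer, so repeat with all edges.
        return project_to_boundary(p, vertices, None)
    return best
```

Any edge discarded by the prefilter lies farther than `search_radius` from the point, so the filtered search is exact whenever it returns a point within that radius. A natural choice of radius would be an upper bound on the distance the animal can travel in one time step.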

We note that there are other possible approaches to simulating RSDEs. One less common alternative for approximating the constrained SDE in (8)–(9) is to replace the unconstrained Gaussian increments (7) of the numerical approximation with truncated normal increments whose support is \(\mathcal {D}\). Cangelosi and Hooten (2009) used this approach and included a correction term in the mean of the truncated bivariate normal distribution to better capture the dynamics implied by the unconstrained SDE. Brillinger (2003) considered additional approaches to simulation, including specifying a mean function \({\varvec{\mu }}(\mathbf {x})\) that repels trajectories near the boundary \(\partial \mathcal {D}\). Russell et al. (2017) used a similar approach to constrain ant movement to lie within a nest.

Another approach to modeling movement constrained to lie within \(\mathcal {D}\) is to consider discrete space (gridded) approximations to the movement process (Hooten et al. 2010; Hanks et al. 2015; Avgar et al. 2016; Brost et al. 2015). A discrete support allows the spatial constraint on movement to be easily captured, but discrete space approaches can be computationally challenging to implement when the evaluation of the transition density requires the computation of all pairwise transitions from any grid cell to any other grid cell (e.g., Brost et al. 2015).

We assumed that animal locations are observed with error, which is the case for most telemetry data. If it can safely be assumed that observation locations have negligible error, then all observations will be within \(\mathcal {D}\). In this setting, the projection-based approach to inference developed in Sect. 3 could still be applied. This is true for SDE models that only model change in location, like potential function models (4), as well as SDE models that model change in velocity, like (2)–(3). An appealing alternative to the projection-based approach is to consider a truncated normal (TN) transition density with suitable location parameter \({\varvec{\mu }}_t\), covariance \({\varvec{\Sigma }}_t\), and support \(\mathcal {D}\)

$$\begin{aligned} \mathbf {x}_{t}|\mathbf {x}_{1:(t-1)} \sim \text {TN}({\varvec{\mu }}_t,{\varvec{\Sigma }}_t,\mathcal {D}). \end{aligned}$$
(20)

For example, the location and covariance parameters could be those defined by approximation (11) to SDE (8)–(9). The density function of the truncated normal distribution in (20) is

$$\begin{aligned} q_t\left( \mathbf {x}_{t}|\mathbf {x}_{1:(t-1)};{\varvec{\theta }}\right) =\frac{\exp \left( -\frac{1}{2}(\mathbf {x}_t-{\varvec{\mu }}_t)'{\varvec{\Sigma }}^{-1}_t(\mathbf {x}_t-{\varvec{\mu }}_t)\right) }{\int _{\mathcal {D}}\exp \left( -\frac{1}{2}\left( \mathbf {v}-{\varvec{\mu }}_t\right) '{\varvec{\Sigma }}^{-1}_t\left( \mathbf {v}-{\varvec{\mu }}_t\right) \right) d\mathbf {v}}, \end{aligned}$$
(21)

and fast approaches exist (Abramowitz and Stegun 2012; Genz and Bretz 2009) with accessible software (Meyer et al. 2016) for computing the normalizing constant when \(\mathcal {D}\) can be approximated as a polygon in \(\mathbb {R}^2\).
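To show how the normalizing constant enters the density, the following Python sketch evaluates the truncated normal log density in a deliberately simplified setting: \(\mathcal {D}\) is assumed to be an axis-aligned rectangle and the covariance is assumed to be \(\sigma ^2\mathbf {I}\), so the integral factorizes into one-dimensional normal probabilities. A general polygon would require the numerical methods cited above; the function names here are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_normal_logpdf(x, mu, sd, lower, upper):
    """Log density of N(mu, sd^2 I) truncated to the rectangle [lower, upper]."""
    log_density = 0.0
    for xi, mi, lo, hi in zip(x, mu, lower, upper):
        if not (lo <= xi <= hi):
            return -math.inf                  # outside the support D
        # Probability mass of the untruncated normal inside [lo, hi].
        mass = normal_cdf((hi - mi) / sd) - normal_cdf((lo - mi) / sd)
        z = (xi - mi) / sd
        log_density += (-0.5 * z * z
                        - math.log(sd * math.sqrt(2.0 * math.pi))
                        - math.log(mass))
    return log_density
```

When the location parameter sits on a straight boundary, half of the untruncated mass in that coordinate lies outside \(\mathcal {D}\), and the density inside \(\mathcal {D}\) is doubled relative to the untruncated normal.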

We have focused on SDE models for the time derivative of velocity. In some situations it is reasonable to model movement based solely on an SDE model for the time derivative of position (Brillinger et al. 2002; Preisler et al. 2013). The RSDE approach developed here could also be applied in this case, though there would be no need to do two-step numerical approximation (7). Instead, an Euler approximation or other numerical approximation to the SDE could be used (Kloeden and Platen 1992).
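A minimal sketch of this position-only case, assuming an SDE of the form \(d\mathbf {x}_t={\varvec{\mu }}(\mathbf {x}_t)dt+\sigma d\mathbf {W}_t\), is an Euler (Euler–Maruyama) step followed by projection onto \(\mathcal {D}\). The function `euler_maruyama`, its arguments, and the default identity projection are assumptions made for this illustration.

```python
import numpy as np

def euler_maruyama(x0, drift, sigma, h, n_steps, rng, project=lambda u: u):
    """Euler approximation of dx = drift(x) dt + sigma dW, projected each step."""
    x = np.empty((n_steps + 1, 2))
    x[0] = x0
    for k in range(n_steps):
        noise = sigma * np.sqrt(h) * rng.standard_normal(2)   # Brownian increment
        x[k + 1] = project(x[k] + drift(x[k]) * h + noise)    # step, then project
    return x
```

With the default `project`, the function returns the unconstrained Euler path; passing a projection onto \(\mathcal {D}\) gives a projected approximation in the same spirit as the two-step scheme used for the velocity model.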

Modeling movement without accounting for spatial constraints can lead to bias in movement parameter estimates and resulting inference. Our work and the work of others who have also considered constrained movement (Brillinger 2003; Cangelosi and Hooten 2009; Brost et al. 2015) provide approaches that formally account for constraints and lead to more realistic animal movement and space use.