2.1 Introduction to Movement Models

Many biological problems lend themselves well to mathematical models. Often we use these models to predict the behaviour of a population. We can attempt to predict only population size using ordinary differential equation models of the population dynamics, or attempt to predict spatial characteristics of the population through the use of partial differential equation models. In either case, certain simplifications are necessary. A key question, which must be addressed when dealing with population models, is how to obtain a model for the macroscopic behaviour of a population based on information about individuals in the population.

2.1.1 Measurements

As a first example, we consider the case of randomly moving individuals, and discuss how we may use information about these individuals to derive a model for the population. First, consider a random variable X t that represents the location of an individual at time t. On the population level we consider two statistical measures as illustrated in Fig. 2.1a:

Fig. 2.1

(a): schematic of individuals undergoing a random walk; the locations can be used to estimate a mean location and a mean squared displacement. (b): measurement of an individual movement path for speed, turning rate and turning angle distribution

(A) Population measurements:

(A.1) Mean location: \(\bar{X}_{t} = E(X_{t})\), and

(A.2) Mean quadratic variation: \((\sum _{i=1}^{n}(X_{t_{i}} -\bar{ X}_{t})^{2})/(n - 1) = V (X_{t})\)

These two measures represent characteristic values of the population based on averages of movement of their individuals. In many cases we can also consider characteristics of the individual particles themselves:

(B) Individual measurements:

(B.1) The mean speed, γ,

(B.2) The mean turning rate, μ, and

(B.3) The distribution of newly chosen directions, T(v, v′), where v and v′ are the new and previous velocities, respectively.

These measures correspond to the situation seen in Fig. 2.1b. The measure T(v, v′) is often referred to as a kernel, and can be described as the probability density of turning into velocity v given the previous velocity v′. For a homogeneous environment this will typically be a uniform distribution, but for directed environments it may not be. For example, a cancer cell moving within the brain is more likely to turn into alignment with the fibrous brain structures than to travel orthogonally to them. This means that the turning kernel has a higher probability in the direction of these fibres.
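As an illustration, a turning kernel on a small discrete set of directions can be written down explicitly. The von Mises-style exponential weighting below is our own choice for a fibre-aligned environment, not a model from the text (a sketch):

```python
import math

# Discrete set of movement directions (unit vectors at angles 0, 90, 180, 270 degrees).
angles = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

def turning_kernel(kappa, fibre_angle=0.0):
    """Return T[v][v'] on the discrete directions: the probability of turning
    into direction v.  Here T is independent of the previous direction v' and
    weights new directions by alignment with a fibre (kappa = 0: uniform)."""
    weights = [math.exp(kappa * math.cos(a - fibre_angle)) for a in angles]
    z = sum(weights)
    row = [w / z for w in weights]       # normalised distribution over v
    return [row[:] for _ in angles]      # same row for every previous v'

T_uniform = turning_kernel(kappa=0.0)    # homogeneous environment
T_aligned = turning_kernel(kappa=2.0)    # fibres along angle 0
```

Each row of the kernel sums to one; for the aligned kernel most of the probability mass sits on the fibre direction, as described above.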

The aim of this manuscript is to develop mathematical models which are based on the above observations. In particular, we are interested in the following questions:

Q1: How can we make a mathematical model for these types of measurements?

Q2: How are these models related?

2.1.2 Random Walk on a Grid

To derive a first and simple model on the population level, we consider a random walker on a one-dimensional grid [60, 45]. The walker starts at point 0 and has some probability 0 < q < 1 of moving to the right, and probability 1 − q of moving to the left. In this example, we assume that the probability of the random walker staying where it is equals zero. We let δ be the spatial step and τ be the time step. This situation is illustrated in Fig. 2.2a.

Fig. 2.2

(a): simple random walk with constant jump probabilities q and 1 − q. (b): random walk with variable jump probabilities T i ±

We now consider X n to be a random variable representing the position of a random walker that started at 0 after n discrete steps. After one step the expected value of X 1 is

$$\displaystyle{ E(X_{1}) =\sum _{y}y\,p(X_{1} = y) =\delta q + (-\delta )(1 - q) =\delta (2q - 1), }$$
(2.1)

where we are summing over the whole domain. Now if we want to compute E(X 2), we simply take

$$\displaystyle{E(X_{2}) = E(X_{1}) +\sum _{y}y\,p(X_{1} = y) =\delta (2q - 1) +\delta q + (-\delta )(1 - q) = 2\delta (2q - 1).}$$

We can recursively define the expectation after n steps E(X n ) to be

$$\displaystyle{ E(X_{n}) =\delta (2q - 1) + E(X_{n-1}) = n\delta (2q - 1). }$$
(2.2)

We now notice that if \(q = \frac{1} {2}\) in Eq. (2.2), we have that E(X n ) = 0. This makes sense, as we would expect to find no net displacement when the probabilities for moving left and right are equal. If however \(q > \frac{1} {2}\), then we have a higher probability of moving to the right, thus we would expect the net movement to be to the right. We see that in this case E(X n ) > 0, as expected. Conversely, when \(q < \frac{1} {2}\), we have E(X n ) < 0, and see net movement to the left.

We can also consider the variance of our random variable. This is computed using the following formula:

$$\displaystyle{ V (X_{1}) = E(X_{1}^{2}) - E(X_{ 1})^{2}. }$$
(2.3)

We have E(X 1) as computed in (2.1), so we easily find that E(X 1)2 = δ 2(2q − 1)2. We next compute

$$\displaystyle{E(X_{1}^{2}) =\sum _{y}y^{2}\,p(X_{1} = y) =\delta ^{2}q + (-\delta )^{2}(1 - q) =\delta ^{2}.}$$

Therefore,

$$\displaystyle{V (X_{1}) =\delta ^{2} -\delta ^{2}(2q - 1)^{2} = 4\delta ^{2}q(1 - q),}$$

and by the same argument as for the expectation,

$$\displaystyle{V (X_{n}) = 4n\delta ^{2}q(1 - q).}$$

These measurements are for the discrete time situation, where an individual performs n jumps, \(n \in \mathbb{N}\). How do these compare to the continuous time situation? If we consider a time step to have length τ, then t = n τ and \(n = \frac{t} {\tau }\). We then define a mean velocity c and a diffusion coefficient D as:

$$\displaystyle{ c = \frac{E(X_{t})} {t} = \frac{\delta } {\tau }(2q - 1),\qquad D = \frac{1} {2} \frac{V (X_{t})} {t} = \frac{2\delta ^{2}} {\tau } q(1 - q). }$$
(2.4)
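The measurement estimators (A.1) and (A.2), together with the formulas for E(X n ) and V(X n ), can be checked with a short Monte Carlo simulation (a sketch; the parameter values are illustrative):

```python
import random

# Simulate N independent random walkers on a grid (step delta per time step)
# and compare the estimators (A.1), (A.2) with the exact formulas
# E(X_n) = n*delta*(2q-1) and V(X_n) = 4*n*delta^2*q*(1-q).
random.seed(1)
q, delta, n_steps, N = 0.6, 1.0, 100, 10000

positions = []
for _ in range(N):
    x = 0.0
    for _ in range(n_steps):
        x += delta if random.random() < q else -delta
    positions.append(x)

# sample mean (A.1) and sample variance (A.2)
mean_est = sum(positions) / N
var_est = sum((x - mean_est) ** 2 for x in positions) / (N - 1)

mean_exact = n_steps * delta * (2 * q - 1)          # = 20.0
var_exact = 4 * n_steps * delta ** 2 * q * (1 - q)  # = 96.0
```

With 10,000 walkers the estimators agree with the exact values up to the expected statistical fluctuation.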

2.1.3 A Continuous Random Walk

To derive a mathematical description of the random walk from above, we introduce p(x, t) as the probability density for the location of the random walker, and we begin with the discrete case discussed above. To define an equation for p(x, t + τ), we seek the probability that an individual is found at x at time t + τ. The only way for an individual to arrive at position x at time t + τ is to come from the neighbouring grid point to the left or to the right at time t. We use the Master equation approach

$$\displaystyle{ p(x,t+\tau ) = qp(x-\delta,t) + (1 - q)p(x+\delta,t), }$$
(2.5)

where q, 1 − q are the probabilities for a jump to the right/left per unit of time τ, respectively. In order to determine the continuous limit of this discrete equation, we assume that τ and δ are small parameters. Then a formal Taylor expansion gives

$$\displaystyle\begin{array}{rcl} p +\tau p_{t} + \frac{\tau ^{2}} {2}p_{tt} + h.o.t.& =& q\left (\,p -\delta p_{x} + \frac{\delta ^{2}} {2}p_{xx} - h.o.t.\right ) {}\\ & & +(1 - q)\left (\,p +\delta p_{x} + \frac{\delta ^{2}} {2}p_{xx} + h.o.t.\right ). {}\\ \end{array}$$

Simplifying, we obtain

$$\displaystyle{ p_{t}(x,t) = \frac{\delta } {\tau }(1 - 2q)p_{x}(x,t) + \frac{\delta ^{2}} {2\tau }p_{xx}(x,t) + h.o.t.. }$$
(2.6)

We see that the dominating terms in Eq. (2.6) form the standard advection-diffusion equation

$$\displaystyle{p_{t}(x,t) + cp_{x}(x,t) = Dp_{xx}(x,t)}$$

with

$$\displaystyle{c = \frac{\delta } {\tau }(2q - 1)\quad \mbox{ and }\quad D = \frac{\delta ^{2}} {2\tau }.}$$
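The master equation (2.5) can also be iterated numerically; its mean and variance then reproduce the drift and diffusion measurements exactly, since no boundary mass is lost on a wide enough grid (a sketch with illustrative parameters):

```python
# Iterate the master equation p(x, t+tau) = q p(x-delta, t) + (1-q) p(x+delta, t)
# on a grid wide enough that no mass reaches the boundary, and compare the
# resulting mean and variance with n*delta*(2q-1) and 4*n*delta^2*q*(1-q).
q, delta, n_steps = 0.55, 1.0, 200
half = n_steps + 5                     # margin so no probability escapes
xs = [delta * i for i in range(-half, half + 1)]
p = [0.0] * len(xs)
p[half] = 1.0                          # walker starts at x = 0

for _ in range(n_steps):
    new = [0.0] * len(p)
    for i in range(1, len(p) - 1):
        new[i] = q * p[i - 1] + (1 - q) * p[i + 1]
    p = new

mean = sum(x * w for x, w in zip(xs, p))
var = sum((x - mean) ** 2 * w for x, w in zip(xs, p))
```

Here the agreement is exact (up to rounding), because the iteration computes the full probability distribution rather than a sample.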

At this stage we can study different possible limit scenarios for δ, τ → 0 and q → 1∕2. We can do this in a number of ways, and we present three choices below. Of course, there are many more choices of these scalings, but most of them will not lead to a useful limit equation. In other words, if δ, τ, q do not scale as indicated below, then this method is not appropriate.

  (a) \(\frac{\delta }{\tau } \rightarrow \alpha =\) constant. Then \(\frac{\delta ^{2}} {\tau } =\delta \frac{\delta }{\tau } \rightarrow 0\), which causes the diffusive term to vanish, and we are left with a simple transport equation

    $$\displaystyle{p_{t} + cp_{x} = 0.}$$
  (b) \(\frac{\delta ^{2}} {\tau } \rightarrow 2D =\) constant, then we can consider two cases:

    (b.1) If \(q = \frac{1} {2}\), then c = 0, and we obtain a pure diffusion equation

    $$\displaystyle{p_{t} = Dp_{xx}.}$$
    (b.2) If \(q \rightarrow \frac{1} {2}\) in such a way that \(\frac{\delta }{\tau }(2q - 1) \rightarrow c\), and \(\frac{\delta ^{2}} {2\tau } = \frac{D} {4q(1-q)} \rightarrow D\), then the scaling results in the advection-diffusion equation

    $$\displaystyle{ p_{t} + cp_{x} = Dp_{xx}, }$$
    (2.7)

    where c and D are given by the measurements (2.4)

    $$\displaystyle{c \approx \frac{E(X_{t})} {t},D \approx \frac{1} {2} \frac{V (X_{t})} {t}.}$$

Summary

  • When δ and τ scale in the same way, then we obtain a transport equation. This case is called drift dominated.

  • When δ 2 ∼ τ, we have the diffusion dominated case.

  • Only in the case where \(q -\frac{1} {2} \sim \delta\) do we get both terms, an advection and a diffusion term (mixed case).

2.1.4 Outline of This Manuscript

We have seen that population measurements lead us in a very natural way to drift-diffusion models of the type (2.7). If individual measurements are available, then the framework of transport equations becomes available, which we develop in the next sections. Of course there is a close relation between individual behaviour and population behaviour. This is reflected in our theory through the parabolic limit or drift-diffusion limit.

Above, we related population measurements to population drift and diffusion terms. Using the parabolic limit, we will be able to relate individual measurements to population drift and diffusion.

In Sect. 2.2 we consider the one-dimensional version of the transport equations. Individuals can only move up or down a linear feature, so that the movement is essentially one dimensional. The one-dimensional formulation has the advantage that most computations can be carried out and even explicit solutions can be found. We summarize results on invasions and pattern formation as well as applications to growing and interacting populations, chemotaxis, swarming and alignment.

In Sect. 2.3 we formally define transport equations and then define the mathematical setup. We introduce the basic assumptions (T1)–(T4) and we immediately explore the spectral properties of the turning operator.

Section 2.4 is devoted to the diffusion limit. Based on biological observations, we introduce a time and space scaling which, in the limit of macroscopic time and space scales, leads to a diffusion equation. The structure of the resulting diffusion tensor D is important: it can be isotropic (\(D =\alpha \mathbb{I}\)) or anisotropic. We find easy-to-check conditions for the isotropy of the diffusion tensor. In Sect. 2.4 we also consider examples of bacterial movement, amoeboid movement, myxobacteria and chemotaxis. Finally, we introduce the important biological concept of persistence, which leads us back to biological measurements. The persistence, also called the mean cosine, can easily be measured in many situations.

There are interesting further developments for individual movement in oriented habitats. Unfortunately, we are not able to include all these models and applications here, hence we chose to give a detailed list of references for further reading in Sect. 2.5.

2.2 Correlated Random Walk in One Dimension

The one-dimensional correlated random walk is an extension of the random walks studied earlier, as it allows for correlation of movement from one time step to the next, in particular correlation in velocity. These models are easy to understand, and they form a basis for the understanding of higher dimensional transport equations. In fact, many of the abstract methods which we introduce later for transport equations are easily illustrated in the one-dimensional context. However, the 1-D model is not only a motivating example; it is a valid random walk model in its own right, and it has been applied to many interesting biological problems. See for example the review article of Eftimie [10] on animal swarming models.

In the following sections we will introduce the model and various equivalent variations, we will discuss suitable boundary conditions, and we will write the model in an abstract framework, which will become important later.

2.2.1 The Goldstein-Kac Model in 1-D

Taylor [56] and Fuerth [15] developed the one-dimensional correlated random walk model in the same year. Goldstein [18] and Kac [39] formulated it as a partial differential equation, and this is where we start. Let u ±(x, t) denote the densities of right- and left-moving particles. The Goldstein-Kac model for correlated random walk is

$$\displaystyle{ \begin{array}{rcl} u_{t}^{+} +\gamma u_{x}^{+} & =& - \frac{\mu } {2}u^{+} + \frac{\mu } {2}u^{-} \\ u_{t}^{-}-\gamma u_{x}^{-}& =& \frac{\mu } {2}u^{+} - \frac{\mu } {2}u^{-}, \end{array} }$$
(2.8)

where γ denotes the (constant) particle speed and μ∕2 > 0 is the rate of switching directions from plus to minus or vice versa. We can also consider an equivalent formulation as a one-dimensional transport equation

$$\displaystyle{ \begin{array}{rcl} u_{t}^{+} +\gamma u_{x}^{+} & =& -\mu u^{+} + \frac{\mu } {2}(u^{+} + u^{-}) \\ u_{t}^{-}-\gamma u_{x}^{-}& =& -\mu u^{-} + \frac{\mu } {2}(u^{+} + u^{-}),\end{array} }$$
(2.9)

where now μ > 0 is the rate of directional changes; new directions are chosen as plus or minus with equal probability 1∕2. The systems (2.8) and (2.9) are equivalent; however, the second formulation allows us to consider the system in the transport equation framework and prepares it for the extension to higher dimensions.

Another equivalent formulation arises if we look at the total population \(u = u^{+} + u^{-}\) and the population flux \(v =\gamma (u^{+} - u^{-})\):

$$\displaystyle{ \begin{array}{rcl} u_{t} + v_{x}& =&0 \\ v_{t} +\gamma ^{2}u_{x}& =& -\mu v,\end{array} }$$
(2.10)

which is also known as the Cattaneo system [38]. This formulation will be more natural for scientists with experience in continuum mechanics, as the first equation is a conservation of mass equation, while the second equation can be seen as a momentum equation, where the flux adapts to the negative population gradient with a time factor of \(e^{-\mu t}\). See Joseph and Preziosi [38] for a detailed connection to continuum mechanics and to media with memory. Here, we stay with the interpretation of population models.

If we assume the solutions are twice continuously differentiable, we get yet another closely related equation. Indeed, differentiating the first equation of (2.10) by t and the second by x we get

$$\displaystyle{ \begin{array}{rcl} u_{tt} + v_{xt}& =&0 \\ v_{xt} +\gamma ^{2}u_{xx}& =& -\mu v_{x}, \end{array} }$$
(2.11)

which can be rearranged into an equation for u alone, making use of (2.10) in order to substitute v x  = −u t :

$$\displaystyle{ \frac{1} {\mu } u_{tt} + u_{t} = \frac{\gamma ^{2}} {\mu } u_{xx}, }$$
(2.12)

which is the telegraph equation. This equation can be derived for the electrical potential along a transatlantic telegraph cable; a quite astonishing relation to our original random walk model. In this equation we clearly see the relation to a diffusion equation: just imagine that μ → ∞, so that we lose the second time derivative term. At the same time we let γ → ∞ such that

$$\displaystyle{0 <\lim _{\gamma \rightarrow \infty,\mu \rightarrow \infty }\frac{\gamma ^{2}} {\mu } =: D < \infty.}$$

Then D becomes the diffusion coefficient and the parabolic limit equation reads

$$\displaystyle{ u_{t} = Du_{xx}. }$$
(2.13)

We see that the one-dimensional model for correlated random walk is in fact closely related to transport in media with memory as well as to transatlantic cables. This reinforces the universality of mathematical theories, and the fact that often unexpected relations can be found.
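The diffusion limit can be illustrated by simulating the velocity-jump process behind (2.8): for large μt the variance of the particle positions should grow like 2Dt with D = γ²∕μ (a Monte Carlo sketch; all parameter values are illustrative):

```python
import random

# Velocity-jump process behind (2.8): particles move with speed gamma and
# reorient at rate mu, choosing +/- with probability 1/2 each (so the
# effective reversal rate is mu/2).  For mu*t >> 1 the variance of the
# positions approaches 2*D*t with D = gamma^2/mu.
random.seed(2)
gamma, mu, T, N = 1.0, 2.0, 30.0, 10000

finals = []
for _ in range(N):
    x, t = 0.0, 0.0
    v = gamma if random.random() < 0.5 else -gamma
    while t < T:
        dt = random.expovariate(mu)      # time to next reorientation event
        step = min(dt, T - t)
        x += v * step
        t += step
        if random.random() < 0.5:        # new direction chosen uniformly
            v = -v
    finals.append(x)

mean = sum(finals) / N
var = sum((x - mean) ** 2 for x in finals) / (N - 1)
D = gamma ** 2 / mu                      # parabolic-limit diffusion coefficient
```

With these values 2DT = 30, and the simulated variance matches up to the known finite-time correction and sampling error.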

2.2.2 Boundary Conditions

It is an interesting exercise to find appropriate boundary conditions for these models. Let us focus on the correlated random walk model (2.8). Since the model equations are hyperbolic, we need to look at the characteristics. For the first equation the characteristics are x(t) = x 0 + γt, and for the second equation they are x(t) = x 0 − γt. Hence the variable u + needs a boundary condition on the left boundary, but none on the right boundary. In Fig. 2.3 we indicate the characteristics with arrows. Formally, we define the boundary of the domain Ω = [0, l] × [0, t) as the hyperbolic boundary

$$\displaystyle{\partial \varOmega =\partial \varOmega ^{+} \cup \partial \varOmega ^{-}}$$

with

$$\displaystyle{\partial \varOmega ^{+}:=\{ 0\} \times [0,t) \cup [0,l] \times \{ 0\},\qquad \partial \varOmega ^{-}:= [0,l] \times \{ 0\} \cup \{ l\} \times [0,t).}$$

Then u + needs boundary conditions at ∂ Ω + and u needs boundary conditions at ∂ Ω (see Fig. 2.3).

Fig. 2.3
figure 3

Illustration of the two parts of the hyperbolic boundary of Ω. Arrows indicate the directions of the characteristics

Both quantities require initial conditions at time t = 0:

$$\displaystyle{u^{+}(x,0) = u_{ 0}^{+}(x),\qquad u^{-}(x,0) = u_{ 0}^{-}(x).}$$

On the sides of the domain we can mimic the classical Dirichlet, Neumann and periodic boundary conditions.

  • Dirichlet boundary conditions describe a given concentration at the boundaries. In the hyperbolic case this means a given incoming concentration

    $$\displaystyle{u^{+}(0,t) =\alpha _{ 1}(t),\qquad u^{-}(l,t) =\alpha _{ 2}(t).}$$

    In the case of α 1 = α 2 = 0 we call these the homogeneous Dirichlet boundary conditions.

  • Neumann boundary conditions describe the flux at the boundary. Hence in our case

    $$\displaystyle{\gamma (u^{+}(0,t) - u^{-}(0,t)) =\beta _{ 1}(t),\qquad \gamma (u^{+}(l,t) - u^{-}(l,t)) =\beta _{ 2}(t).}$$

    In the no-flux case they become homogeneous Neumann boundary conditions

    $$\displaystyle{u^{+}(0,t) = u^{-}(0,t),\qquad u^{-}(l,t) = u^{+}(l,t).}$$
  • Periodic boundary conditions are as expected

    $$\displaystyle{u^{+}(0,t) = u^{+}(l,t),\qquad u^{-}(l,t) = u^{-}(0,t).}$$

The corresponding initial-boundary value problems for the correlated random walk as well as for the Cattaneo system and for the telegraph equation have been studied in great detail in [28], including results on existence, uniqueness, and positivity. One curious result is the fact that the Dirichlet problem regularizes, while the Neumann and periodic problems do not. In the Dirichlet problem singularities or jumps are washed out at the boundary, while in the Neumann case they are reflected and in the periodic case they are transported around the domain.
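As a quick numerical experiment, system (2.8) can be discretised with an upwind scheme; with homogeneous Neumann (no-flux, reflecting) boundary conditions the total mass must be conserved (a sketch; grid and parameters are illustrative):

```python
# Upwind discretisation of the Goldstein-Kac system (2.8) on [0, l] with
# homogeneous Neumann boundary conditions u^+ = u^- at both ends (incoming
# particles are the reflected outgoing ones).  Total mass is conserved.
gamma, mu, l = 1.0, 1.0, 10.0
nx = 200
dx = l / nx
dt = 0.5 * dx / gamma                        # CFL-stable time step

# initial condition: a bump of right-movers in the middle, no left-movers
up = [1.0 if 4.0 < (i + 0.5) * dx < 6.0 else 0.0 for i in range(nx)]
um = [0.0] * nx

mass0 = sum(a + b for a, b in zip(up, um)) * dx
for _ in range(400):
    new_up, new_um = up[:], um[:]
    for i in range(nx):
        # upwind transport: u^+ moves right, u^- moves left
        up_in = um[0] if i == 0 else up[i - 1]        # reflection at x = 0
        um_in = up[-1] if i == nx - 1 else um[i + 1]  # reflection at x = l
        new_up[i] = up[i] - gamma * dt / dx * (up[i] - up_in) \
            + dt * (mu / 2) * (um[i] - up[i])
        new_um[i] = um[i] - gamma * dt / dx * (um[i] - um_in) \
            + dt * (mu / 2) * (up[i] - um[i])
    up, um = new_up, new_um

mass = sum(a + b for a, b in zip(up, um)) * dx
```

The boundary coupling feeds the outgoing flux back as incoming flux, so the discrete mass balance telescopes to zero and the scheme also preserves positivity under the CFL condition.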

2.2.3 Abstract Formulation

The main part of this manuscript provides analysis of a generalization of the one-dimensional correlated random walk to higher dimensions. We will construct an abstract framework of function spaces and turning operators, and the one dimensional model will arise as a special case. To prepare this relation, we now formulate Eq. (2.8) as a differential equation in a Banach space. In fact, we use the (equivalent) system (2.9) and introduce an integral operator \(\mathcal{T}\) for the last term on the right hand sides:

$$\displaystyle{\mathcal{T}: \mathbb{R}^{2} \rightarrow \mathbb{R};\qquad \left (\begin{array}{c} u^{+} \\ u^{-} \end{array} \right )\mapsto \frac{1} {2}(u^{+}+u^{-}).}$$

Here it does not look like an integral operator, but the higher dimensional version will include an integration. In fact here the integration is over the discrete space V = { +γ, −γ}. The operator norm of this linear operator can be easily computed to be

$$\displaystyle{\|\mathcal{T}\|_{1} = \frac{1} {2},}$$

where the operator norm is defined as

$$\displaystyle{\|\mathcal{T}\|_{1} =\sup _{\|\phi \|_{1}=1}\|\mathcal{T}\phi \|_{1}.}$$

It will be important later that this norm is less than or equal to 1. The whole right hand side of the system (2.9) defines another operator, which we call the turning operator \(\mathcal{L}\):

$$\displaystyle{\mathcal{L}: \mathbb{R}^{2} \rightarrow \mathbb{R}^{2};\qquad \left (\begin{array}{c} u^{+} \\ u^{-} \end{array} \right )\mapsto \left (\begin{array}{c} -\mu u^{+} +\mu \mathcal{T} (u^{+},u^{-}) \\ -\mu u^{-} +\mu \mathcal{T} (u^{+},u^{-}) \end{array} \right ).}$$

If we write \(\mathcal{L}\) as a matrix,

$$\displaystyle{\mathcal{L} = \frac{\mu } {2}\left (\begin{array}{cc} - 1& 1\\ 1 & -1 \end{array} \right )}$$

we obtain eigenvalues of λ 1 = 0 and λ 2 = −μ. The zero eigenvalue corresponds to the fact that the total population size is conserved for Eq. (2.9). The corresponding eigenspace is spanned by the vector (1, 1)T. Hence the kernel of \(\mathcal{L}\) is given as

$$\displaystyle{\mbox{ ker}\mathcal{L} =\langle \left (\begin{array}{c} 1\\ 1 \end{array} \right )\rangle.}$$

The abstract formulation appears a bit staged, but it will form the framework for the multi-dimensional situation.
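The spectral claims can be verified directly on the 2 × 2 matrix (a sketch; the variable names are ours):

```python
# The turning operator L = (mu/2) * [[-1, 1], [1, -1]] has trace -mu and
# determinant 0, so its eigenvalues are 0 and -mu; the kernel is spanned
# by the constant vector (1, 1).
mu = 3.0
L = [[-mu / 2, mu / 2], [mu / 2, -mu / 2]]

tr = L[0][0] + L[1][1]
det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
# eigenvalues from the characteristic polynomial lambda^2 - tr*lambda + det
disc = (tr ** 2 - 4 * det) ** 0.5
lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2

# L applied to (1, 1) gives zero, confirming ker L = <(1, 1)>
Lv = [L[0][0] + L[0][1], L[1][0] + L[1][1]]
```

The zero eigenvalue is exactly the conservation of total mass noted above.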

2.2.4 Explicit Solution Using Bessel Functions

There exists an explicit solution of the correlated random walk model (2.8) in terms of Bessel functions. Based on Poincaré's Bessel function solutions [54] for the one-dimensional telegraph equation (2.12) on \(\mathbb{R}\), Hadeler [21] modified these solutions to find explicit solutions to the correlated random walk system (2.8) on \(\mathbb{R}\).

Let us recall some basic facts about the modified Bessel functions. For \(k \in \mathbb{N}\) we denote by J k the Bessel functions of the first kind, and by \(I_{k}(x):= i^{-k}J_{k}(ix)\) the modified Bessel functions with purely imaginary argument. For k = 0 and k = 1 we have the relations

$$\displaystyle{I_{0}(x) = J_{0}(ix) =\sum _{ k=0}^{\infty } \frac{1} {(k!)^{2}}\left (\frac{x} {2}\right )^{2k}\quad \mbox{ and}\quad I_{ 1}(x) = \frac{d} {dx}I_{0}(x).}$$
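The series for I 0, and I 1 as its term-by-term derivative, are easy to evaluate numerically (a sketch; the truncation length is our choice):

```python
import math

# Evaluate the series for I_0 given above, and I_1 = d/dx I_0 obtained by
# differentiating the series term by term.
def bessel_I0(x, terms=60):
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

def bessel_I1(x, terms=60):
    # d/dx (x/2)^(2k) = 2k * (x/2)^(2k-1) * 1/2
    return sum(2 * k * (x / 2) ** (2 * k - 1) / 2 / math.factorial(k) ** 2
               for k in range(1, terms))

# check I_1 = I_0' against a centred finite difference
x, h = 1.5, 1e-6
fd = (bessel_I0(x + h) - bessel_I0(x - h)) / (2 * h)
```

For moderate arguments the truncated series agrees with the finite-difference derivative to high accuracy, and both functions are positive for x > 0 as stated below.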

The functions I 0(x), I 1(x) and I 1(x)∕x are real analytic and positive for x > 0. For an initial condition u 0 ∈ L p, the solution u(t, x) = (u +(t, x), u −(t, x)) of (2.8) on \(\mathbb{R}\) can then be written as

$$\displaystyle\begin{array}{rcl} u^{+}(t,x)& =& u_{ 0}^{+}(x -\gamma t)e^{-\mu t/2} +\int _{ x-\gamma t}^{x+\gamma t}K(t,x,y)u_{ 0}^{-}(y)dy {}\\ & & +\int _{x-\gamma t}^{x+\gamma t}K_{ +}(t,x,y)u_{0}^{+}(y)dy {}\\ u^{-}(t,x)& =& u_{ 0}^{-}(x +\gamma t)e^{-\mu t/2} +\int _{ x-\gamma t}^{x+\gamma t}K(t,x,y)u_{ 0}^{+}(y)dy {}\\ & & +\int _{x-\gamma t}^{x+\gamma t}K_{ -}(t,x,y)u_{0}^{-}(y)dy {}\\ \end{array}$$

with integral kernels

$$\displaystyle\begin{array}{rcl} K(t,x,y)&:=& \frac{\mu e^{-\mu t/2}} {4\gamma } \,I_{0}\left ( \frac{\mu } {2\gamma }\sqrt{\gamma ^{2 } t^{2 } - (y - x)^{2}}\right ) {}\\ K_{\pm }(t,x,y)&:=& \frac{\mu e^{-\mu t/2}} {4\gamma } \,\frac{I_{1}\left ( \frac{\mu }{2\gamma }\sqrt{\gamma ^{2 } t^{2 } - (y - x)^{2}}\right )} {\sqrt{\gamma ^{2 } t^{2 } - (y - x)^{2}}} (\gamma t \mp (y - x)). {}\\ \end{array}$$

For further reading on where these solutions come from, see [28]. From this representation we see that solutions are in \(L^{\infty }([0,\infty ) \times \mathbb{R})\) for initial conditions in L . The integrals are absolutely continuous such that possible discontinuities can only travel along the characteristics xγ t = c and x +γ t = c.

Additionally, if u 0 is k–times differentiable then u is also. In general we conclude:

$$\displaystyle\begin{array}{rcl} u_{0} \in L^{\infty }(\mathbb{R})& \Longrightarrow& u \in L^{\infty }([0,\infty ) \times \mathbb{R}) {}\\ u_{0} \in C^{k}(\mathbb{R})& \Longrightarrow& u \in C^{k}([0,\infty ) \times \mathbb{R}). {}\\ \end{array}$$

2.2.5 Correlated Random Walk Models for Chemotaxis

Chemotaxis describes the active orientation of individuals, such as cells or bacteria, along gradients of a chemical signal, which is produced by the cells themselves. In many examples, such as Dictyostelium discoideum (DD) or Escherichia coli, this process leads to macroscopic cell aggregations. Chemotaxis is a prototype of self-organization, where the resulting pattern is more than the sum of its parts. Keller and Segel [40] began systematic modelling of chemotaxis in the 1970s, and since then a large amount of literature has been published. The models have been developed to accurately describe biological experiments, and they have inspired a generation of mathematicians working on finite time blow-up as well as spatial pattern formation. For further details we recommend the two review articles by Horstmann [37] and by Hillen and Painter [30].

We can model chemotactic behaviour via a correlated random walk model. The action of the chemical signal on the movement mechanics of cells is very different in bacterial cells versus amoeboid cells. In E. coli, for example, the chemical sensing receptors are internally coupled to the rotation mechanism of the flagella. If the cell encounters an increasing signal concentration, it prolongs straight movement and reduces reorientations. This in effect leads to oriented movement up a signal gradient [48, 14]. In our context this corresponds to a change in turning rate μ depending on the signal strength and its gradient. Amoeboid cells, however, move through treadmilling of an internal actin-myosin filament mechanism. Amoeboid cells are able to detect directions of increased chemical signal, and they can actively choose directions and adapt their speed. In this case, the turning rate as well as the speed are affected by the signal S [6]. In one dimension, the corresponding hyperbolic chemotaxis model reads

$$\displaystyle{ \begin{array}{rl} u_{t}^{+} + (\gamma (S)u^{+})_{x}& = -\mu ^{+}(S,S_{x})u^{+} +\mu ^{-}(S,S_{x})u^{-} \\ u_{t}^{-}- (\gamma (S)u^{-})_{x}& =\mu ^{+}(S,S_{x})u^{+} -\mu ^{-}(S,S_{x})u^{-} \\ \tau S_{t}& = D_{S}S_{xx} +\alpha (u^{+} + u^{-}) -\beta S,\end{array} }$$
(2.14)

where u ±(x, t) are, as before, the densities of right- and left-moving particles. The density of the chemical signal is given by S(x, t), and γ(S) and μ ±(S, S x ) are the signal-dependent speed and turning rates. Notice that here μ is used without a factor of 1∕2, so it is a turning rate (and not a rate of change of direction). The last equation in (2.14) describes diffusion, production and decay of the chemotactic signal S(x, t), where D S  > 0 is the diffusion coefficient of the signal, α > 0 is the rate of production by the total cell population \(u^{+} + u^{-}\), and β > 0 is a constant decay rate. The parameter τ > 0 is used to indicate that signal diffusion might happen on a faster (or slower) time scale than cell movement.

As in our derivation from (2.12) to (2.13) we can use scaling arguments to compute a parabolic limit (see [32, 33] for details):

$$\displaystyle{ u_{t} = \left (A(S,S_{x})u_{x} -\chi (S,S_{x})uS_{x}\right )_{x}. }$$
(2.15)

It is interesting to see how the diffusivity A and the chemotactic sensitivity χ depend on speed γ and turning rate μ. The diffusivity is

$$\displaystyle{ A(S,S_{x}) = \frac{\gamma ^{2}(S)} {\mu ^{+}(S,S_{x}) +\mu ^{-}(S,S_{x})} }$$
(2.16)

while the chemotactic flux is

$$\displaystyle{ \chi (S,S_{x})S_{x} = - \frac{\gamma (S)} {\mu ^{+}(S,S_{x}) +\mu ^{-}(S,S_{x})}\left (\gamma '(S)S_{x} + (\mu ^{+}(S,S_{ x}) -\mu ^{-}(S,S_{ x}))\right ). }$$
(2.17)
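The formulas (2.16) and (2.17) can be wrapped in small helper functions to explore how speed and turning rates shape the chemotactic response (a sketch; the function names and parameter values are ours):

```python
# Evaluate the diffusivity (2.16) and the chemotactic flux (2.17) for given
# speed gamma(S), its derivative, and turning rates mu^+/mu^-.
def diffusivity(gamma, mu_p, mu_m):
    return gamma ** 2 / (mu_p + mu_m)

def chemotactic_flux(gamma, dgamma_dS, Sx, mu_p, mu_m):
    return -gamma / (mu_p + mu_m) * (dgamma_dS * Sx + (mu_p - mu_m))

# constant speed, turning rate reduced when moving up the gradient (mu^+ < mu^-)
A = diffusivity(1.0, 0.5, 1.5)
flux = chemotactic_flux(gamma=1.0, dgamma_dS=0.0, Sx=1.0, mu_p=0.5, mu_m=1.5)
```

With γ constant and μ + < μ − on an increasing signal, the flux comes out positive, which is exactly aggregation mechanism 2 below.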

When χ > 0, we have positive taxis, which supports aggregation. Here we have two effects which can cause positive chemotactic flow:

  1. If γ = γ(S) and γ′(S) < 0, then particles slow down at high concentrations of S, which leads to aggregation there. Or, alternatively,

  2. If μ + < μ − for S x  > 0, then the turning rate is reduced when moving up the gradient of S, which also leads to aggregation.

Specifically, we study two examples.

Example 1 (And Homework)

Assume γ = const and

$$\displaystyle{\mu ^{\pm }(S,S_{ x}) = \frac{\gamma } {2A}\left (\gamma \mp \varphi (S)S_{x}\right )^{+}.}$$
  1. Describe a biological situation for the above choice of μ and γ. Does this choice correspond to the bacterial or the amoeboid case? Explain.

  2. Compute the diffusivity A and the chemotactic flux χ S x .

Example 2

For the second example, we consider the case where τ = D S  = 0 and γ(S) = S. Then the last equation of (2.14) can be solved as \(S = \frac{\alpha }{\beta }(u^{+} + u^{-})\), and the left hand side of the first equation of (2.14) becomes

$$\displaystyle{ \begin{array}{rcl} u_{t}^{+} + (\gamma (S)u^{+})_{x}& =&u_{t}^{+} + \left (\frac{\alpha }{\beta }(u^{+} + u^{-})u^{+}\right )_{x} \\ & =&u_{t}^{+} + \frac{\alpha }{\beta }\left ((u^{+})^{2}\right )_{x} + \frac{\alpha }{\beta }\left (u^{+}u^{-}\right )_{x}.\end{array} }$$
(2.18)

We see that the first two terms on the right hand side come from Burgers' equation. The standard form of Burgers' equation is \(u_{t} + (u^{2})_{x} = 0\), and it is well known that it has shock solutions [2]. Hence in this case we might expect shock solutions for the chemotaxis model. In [33] we use the method of viscosity solutions to further analyse the appearance of sharp gradients in chemotaxis invasion waves [see Fig. 2.4 (right)].
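The steepening mechanism can be seen numerically: a conservative upwind discretisation of \(u_{t} + (u^{2})_{x} = 0\) with decreasing initial data sharpens towards a shock (a sketch; grid and data are illustrative):

```python
import math

# Conservative upwind scheme for u_t + (u^2)_x = 0 with positive, decreasing
# initial data (a smooth front): the profile steepens towards a shock.
nx, dx, dt = 400, 0.05, 0.01
u = [1.0 / (1.0 + math.exp(i * dx - 10.0)) for i in range(nx)]

def max_slope(w):
    return max(abs(w[i + 1] - w[i]) / dx for i in range(len(w) - 1))

slope0 = max_slope(u)
for _ in range(300):
    new = u[:]
    for i in range(1, nx):
        # wave speed 2u > 0, so the upwind flux uses the left neighbour
        new[i] = u[i] - dt / dx * (u[i] ** 2 - u[i - 1] ** 2)
    u = new
slope1 = max_slope(u)
```

The characteristics of the top of the front travel faster than those of the bottom, so the maximal slope grows sharply by the final time, while the monotone scheme produces no overshoot.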

Fig. 2.4

Left: two aggregations evolve from a compactly supported initial condition for a case corresponding to Example 1. Right: occurrence of a sharp gradient due to Burgers' equation terms as in Example 2. In both simulations the initial signal concentration is linearly increasing from left to right (not shown)

2.2.6 Reaction Random Walk Systems

The above random walk models have been considered under (local) preservation of particles. However, these models can also be used to describe growing, shrinking and interacting populations through reaction random walk systems [22]. When using diffusion equations, the most common assumption is that species movement is independent of species growth and death, leading to a typical reaction-diffusion equation \(u_{t} = Du_{xx} + f(u)\). In the correlated random walk (2.8) the population has been split into right- and left-moving subpopulations, hence the inclusion of reaction terms must be done carefully. We follow Hadeler [19, 20, 21] and present three ideas from the literature:

  (i) A straightforward analogue of reaction-diffusion modelling was introduced in [36], leading to the reaction random walk system

    $$\displaystyle\begin{array}{rcl} u_{t}^{+} +\gamma u_{ x}^{+}& =& \frac{\mu } {2}(u^{-}- u^{+}) + \frac{1} {2}f(u^{+} + u^{-}) {}\\ u_{t}^{-}-\gamma u_{ x}^{-}& =& \frac{\mu } {2}(u^{+} - u^{-}) + \frac{1} {2}f(u^{+} + u^{-}). {}\\ \end{array}$$

    The reaction acts symmetrically between the two classes and newborn particles choose either direction with the same probability. Holmes [36] showed the existence of travelling waves for this model, while in [24] and [25] we developed Turing instabilities and Lyapunov functions, respectively.

  (ii) If the reaction is split into growth and death terms, like f(u) = u m(u) − u g(u), where m(u) denotes a birth rate and g(u) a death rate, then the above splitting is inappropriate, as death of a right-moving particle would then also reduce the left-moving population. Individuals can only be discounted from their own class. Hence the reaction random walk model needs to be modified in the form

    $$\displaystyle\begin{array}{rcl} u_{t}^{+} +\gamma u_{ x}^{+}& =& \mu (u^{-}- u^{+}) + \frac{1} {2}(u^{+} + u^{-})\,m(u^{+} + u^{-}) - u^{+}\,g(u^{+} + u^{-}) {}\\ u_{t}^{-}-\gamma u_{ x}^{-}& =& \mu (u^{+} - u^{-}) + \frac{1} {2}(u^{+} + u^{-})\,m(u^{+} + u^{-}) - u^{-}\,g(u^{+} + u^{-}). {}\\ \end{array}$$
  (iii) Here we consider the same reaction terms as in (ii), but additionally assume that the movement direction of newborn particles correlates with the direction of their mother through a parameter τ ∈ [0, 1]. The corresponding model equations are

    $$\displaystyle\begin{array}{rcl} u_{t}^{+} +\gamma u_{ x}^{+}& =& \mu (u^{-}- u^{+}) + (\tau u^{+} + (1-\tau )u^{-})\,m(u^{+} + u^{-}) - u^{+}\,g(u^{+} + u^{-}) {}\\ u_{t}^{-}-\gamma u_{ x}^{-}& =& \mu (u^{+} - u^{-}) + ((1-\tau )u^{+} +\tau u^{-})\,m(u^{+} + u^{-}) - u^{-}\,g(u^{+} + u^{-}). {}\\ \end{array}$$

    If τ = 1∕2 we have case (ii). For τ > 1∕2 the daughter particles tend to prefer the same direction as the mother and for τ < 1∕2 they prefer the opposite.
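Such systems can be discretised directly. The following minimal sketch (assuming an illustrative logistic reaction f(u) = u(1 − u) and made-up parameter values, not any specific choice from the literature) integrates case (i) with an upwind scheme on a periodic domain:

```python
import numpy as np

# Upwind scheme for the reaction random walk, case (i).
# gamma, mu, the grid and f(u) = u(1 - u) are illustrative assumptions.
gamma, mu = 1.0, 1.0
nx, length = 200, 20.0
dx = length / nx
dt = 0.5 * dx / gamma                 # CFL condition for the transport part

f = lambda u: u * (1.0 - u)
up = np.full(nx, 0.05)                # right-moving density u^+
um = np.full(nx, 0.05)                # left-moving density u^-

for _ in range(2000):
    u = up + um
    # upwind derivatives on a periodic grid
    up_x = (up - np.roll(up, 1)) / dx     # backward difference for right-movers
    um_x = (np.roll(um, -1) - um) / dx    # forward difference for left-movers
    up_new = up + dt * (-gamma * up_x + 0.5 * mu * (um - up) + 0.5 * f(u))
    um_new = um + dt * ( gamma * um_x + 0.5 * mu * (up - um) + 0.5 * f(u))
    up, um = up_new, um_new

# essentially 0: the total density has reached the carrying capacity 1
print(np.max(np.abs(up + um - 1.0)))
```

For spatially uniform initial data the transport terms vanish and the total density follows the logistic equation, which provides a simple consistency check of the scheme.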

Earlier we derived a Cattaneo system and a telegraph equation (2.12) from a correlated random walk (2.8). We can apply the same transformation to a reaction random walk model. The corresponding reaction Cattaneo model for the total population \(u = u^{+} + u^{-}\) and the flux \(v =\gamma (u^{+} - u^{-})\) then becomes

$$\displaystyle\begin{array}{rcl} u_{t} +\gamma v_{x}& =& f(u) {}\\ v_{t} +\gamma u_{x}& =& -h(u)\,v, {}\\ \end{array}$$

with

$$\displaystyle\begin{array}{rcl} f(u)& =& \left \{\begin{array}{ll} f(u) &\mbox{ in case (i)}\\ u\,m(u) - u\,g(u) &\mbox{ in case (ii) and (iii)}, \end{array} \right. {}\\ h(u)& =& \left \{\begin{array}{ll} \mu &\mbox{ in case (i)} \\ \mu + g(u) &\mbox{ in case (ii)} \\ \mu + (1 - 2\tau )m(u) + g(u)&\mbox{ in case (iii)}. \end{array} \right.{}\\ \end{array}$$

The boundary conditions transform as

  • homogeneous Dirichlet

    $$\displaystyle{u(t,0) = -v(t,0),\qquad u(t,l) = v(t,l).}$$
  • homogeneous Neumann

    $$\displaystyle{v(t,0) = 0,\qquad v(t,l) = 0.}$$
  • Periodic

    $$\displaystyle{u(t,0) = u(t,l),\qquad v(t,0) = v(t,l).}$$

To further investigate the relation to a telegraph equation, we focus on the case (i). The reaction Cattaneo system in this case is

$$\displaystyle\begin{array}{rcl} u_{t} +\gamma v_{x}& =& f(u) {}\\ v_{t} +\gamma u_{x}& =& -\mu \,v. {}\\ \end{array}$$

We assume that solutions are twice continuously differentiable and we differentiate the first equation with respect to t and the second with respect to x and we eliminate v to obtain the reaction telegraph equation

$$\displaystyle{u_{tt} + (\mu -f'(u))u_{t} =\gamma ^{2}u_{ xx} +\mu \, f(u).}$$

The above transformation applied to cases (ii) and (iii) would not lead to a single telegraph equation, unless τ = 1∕2 and g = 0, which is case (i) (see also [22, 21]).
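The elimination above is easy to verify symbolically; the following sketch uses sympy to check that the algebra of the derivation closes (writing f′(u)u_t as d/dt f(u)):

```python
import sympy as sp

x, t, gamma, mu = sp.symbols('x t gamma mu', positive=True)
u = sp.Function('u')(x, t)
f = sp.Function('f')

# Cattaneo system, case (i):  u_t + gamma v_x = f(u),  v_t + gamma u_x = -mu v
v_x  = (f(u) - u.diff(t)) / gamma          # solve the first equation for v_x
v_tx = -gamma * u.diff(x, 2) - mu * v_x    # x-derivative of the second equation

# differentiate the first equation in t and eliminate v:
u_tt = sp.diff(f(u), t) - gamma * v_tx

# residual of the reaction telegraph equation
# u_tt + (mu - f'(u)) u_t = gamma^2 u_xx + mu f(u)
residual = (u_tt + mu * u.diff(t) - sp.diff(f(u), t)
            - gamma**2 * u.diff(x, 2) - mu * f(u))
print(sp.simplify(residual))   # 0
```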

2.2.7 Correlated Random Walk Models for Swarming

The correlated random walk models have also been used to describe animal alignment and swarming. Pfistner [53] and Lutscher and Stevens [42] developed and analysed a non-local correlated random walk model for alignment and rippling of myxobacteria. Myxobacteria are rod-shaped bacteria that are known to form tightly aligned swarms which sometimes show spatial rippling. The following model of Pfistner [53] and Lutscher and Stevens [42] was able to explain these patterns:

$$\displaystyle\begin{array}{rcl} u_{t}^{+} +\gamma u_{ x}^{+}& =& -\mu ^{+}u^{+} +\mu ^{-}u^{-} {}\\ u_{t}^{-}-\gamma u_{ x}^{-}& =& \mu ^{+}u^{+} -\mu ^{-}u^{-} {}\\ \end{array}$$

with turning rates

$$\displaystyle{ \mu ^{\pm } = F\left (\int _{ -R}^{R}\alpha (r)u^{\pm }(t,x \pm r) +\beta (r)u^{\mp }(t,x \pm r)dr\right ). }$$
(2.19)

The turning rates depend non-locally on the surrounding population up to a sampling radius R > 0. The kernel α(r) describes the influence of individuals that move away from x, while the β-term describes the contribution of particles that move towards x. A careful choice of α(r) and β(r) enabled them to model alignment and rippling.
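A discretised version of the turning rates (2.19) can be sketched as follows; the kernels `alpha`, `beta` and the response `F` here are illustrative assumptions, not the specific choices used by Pfistner or Lutscher and Stevens:

```python
import numpy as np

# Discretisation of the non-local turning rates (2.19).
nx, length, R = 400, 20.0, 1.0
dx = length / nx
x = np.arange(nx) * dx
up = np.exp(-(x - 8.0)**2)            # some right-moving density u^+
um = np.exp(-(x - 12.0)**2)           # some left-moving density u^-

r = np.arange(-R, R + dx / 2, dx)     # sampling offsets within radius R
alpha = np.exp(-np.abs(r))            # assumed weight for same-direction neighbours
beta = 0.5 * np.exp(-np.abs(r))       # assumed weight for opposite-direction neighbours
F = lambda y: 1.0 + np.tanh(y)        # assumed monotone turning response

def turning_rate(u_same, u_opp, sign):
    """mu^+- (x) = F( int_{-R}^{R} alpha(r) u_same(x +- r) + beta(r) u_opp(x +- r) dr )."""
    y = np.zeros(nx)
    for ri, a, b in zip(r, alpha, beta):
        shift = int(round(sign * ri / dx))
        # np.roll(u, -shift)[i] = u[i + shift], i.e. u evaluated at x + sign*r (periodic)
        y += (a * np.roll(u_same, -shift) + b * np.roll(u_opp, -shift)) * dx
    return F(y)

mu_plus = turning_rate(up, um, +1)
mu_minus = turning_rate(um, up, -1)
```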

Fig. 2.5
figure 5

Illustration of the non-local swarming model of Eftimie. The closest area is repulsive, the intermediate region supports alignment and the far region is attractive

Eftimie et al. [11, 10] generalized this approach and separated the three major interaction modes of local repulsion, intermediate alignment and long range attraction. A schematic is shown in Fig. 2.5. They assume that the turning rate μ ±(y r ±, y al ±, y a ±) is a function of a repulsive signal y r ±, an alignment signal y al ± and an attractive signal y a ±. Each of these signals has a non-local form as in (2.19), where the corresponding weights α r , α al , α a and β r , β al , β a concentrate their weight in the corresponding spatial distances around x, as illustrated in Fig. 2.5. There are many possible choices of these weight functions, and Eftimie [10] presents a detailed review of relevant possibilities. Depending on the choice of these weight functions, the model is able to describe travelling pulses, travelling trains, static pulses, breathers, feathers, ripples and many more. The non-local correlated random walk model is a rich source of interesting patterns and mathematics, and its full mathematical analysis is only beginning. For more details see the excellent review of Eftimie [10].

2.3 Transport Equations

We can extend the ideas introduced in Sect. 2.2 to higher dimensions in the form of the transport equation. Transport equations are a powerful tool to derive mesoscopic models for the spatial spread of populations. By mesoscopic we denote an intermediate scale: properties of individual cells are used in the modelling, but cells are not represented as individual entities, as in individual based modelling; rather, they are represented as macroscopic densities. Transport equations are particularly useful if the movement velocity (= speed ⋅ direction) of the individuals is of importance. The theory of kinetic transport equations developed from the Boltzmann equation and the thermodynamics of diluted gases (see e.g. [3]) and has since been developed for biological populations as well [46, 52]. One major difference between physical and biological applications is the number of conserved quantities. While in ideal gas theory five quantities are conserved (mass, three momentum components, energy), in biological populations we often only conserve mass. Mathematically, the conserved quantities appear as linearly independent functions in the kernel of a so called turning operator. The kernel of the turning operator in gas theory is five dimensional, while in our applications it is one dimensional, and this kernel sets the stage for the mathematical details later. Hence this difference in the size of the kernel is, in a nutshell, the main difference between physical and biological applications. The rest is details, which we will present as fully as possible in this manuscript.

We need to distinguish two important cases. Case 1: the kernel of the turning operator contains only constant functions; and case 2: the kernel is spanned by a function that depends on the velocity. Such a function is called a Maxwellian in a physical context [52]. The first case allows for a quite general theory, as developed by Othmer and Hillen in [29, 47], while the second case is more complicated. We use the remainder of this manuscript to study case 1, where we explain the mathematical setup, derive the parabolic limit, and apply the method to chemotaxis. Case 2 is covered in the literature [4, 8, 52, 27, 31] and we plan a summary of that case in a forthcoming textbook [35].

2.3.1 The Mathematical Set-Up

We begin by parameterizing a population density p(x, v, t) by space x, velocity v and time t. This allows us to incorporate individual cell movement into the model, an important feature which distinguishes transport models from macroscopic models. As we are typically dealing with biological phenomena, we take t ≥ 0 and \(x \in \mathbb{R}^{n}\), with n = 2, 3. The case of n = 1 corresponds to the one-dimensional correlated random walk, which we discussed already in Sect. 2.2. The velocities v are taken from V, where \(V \subset \mathbb{R}^{n}\) and \(V = [s_{1},s_{2}] \times \mathbb{S}^{n-1}\) or \(V = s\mathbb{S}^{n-1}\). The general transport equation for a population density p(x, v, t) is thus

$$\displaystyle{ p_{t} + v \cdot \nabla p = -\mu p +\mu \int _{V }T(v,v')p(x,v',t)dv', }$$
(2.20)

where we omitted the arguments, except in the integral. The terms on the left hand side describe the particles’ movement in space, while the terms on the right hand side describe how the particles change direction. The parameter μ is the turning rate, which describes how often the particles change direction. As such, \(\frac{1} {\mu }\) is the mean run time: it represents how long a particle travels on average in a straight line before it changes direction. The distribution T(v, v′) inside the integral is called the turning kernel, or turning distribution, and describes the probability that a cell travelling in the direction v′ will turn into the direction v. As such, the first term on the right hand side describes cells turning out of velocity v, while the integral term describes cells turning into velocity v from all other directions v′ ∈ V. Together, these two terms are called the turning operator. Here we follow the theory as developed by Stroock [55], Othmer et al. [46] and Hillen and Othmer [29, 47].

Given the compact set V of possible velocities, we work in the function space L 2(V ) and we denote by \(\mathcal{K}\subset L^{2}(V )\) the cone of non-negative functions. Guided by the right hand side of Eq. (2.20), we define an integral operator \(\mathcal{T}: L^{2}(V ) \rightarrow L^{2}(V )\) as

$$\displaystyle{\mathcal{T}\phi (v) =\int _{V }T(v,v')\phi (v')dv'}$$

with adjoint

$$\displaystyle{\mathcal{T}^{{\ast}}\phi (v) =\int _{V }T(v',v)\phi (v')dv'.}$$

The integral kernel T and the integral operator \(\mathcal{T}\) set the stage for the theory. In the context of biological applications, we make the following general assumptions; a detailed explanation of each assumption follows the list.

Basic Assumptions

(T1) :

\(T(v,v') \geq 0\), \(\int _{V }T(v,v')dv = 1\), and \(\int _{V }\int _{V }T^{2}(v,v')\,dv\,dv' < \infty \).

(T2) :

There is a function \(u_{0} \in \mathcal{K}\setminus \{0\}\), a ρ > 0 and an N > 0 such that for all (v, v′) ∈ V × V, either

  1. (a)

    \(u_{0}(v) \leq T^{N}(v',v) \leq \rho \,u_{0}(v)\), or

  2. (b)

    \(u_{0}(v) \leq T^{N}(v,v') \leq \rho \,u_{0}(v)\),

where the N-th iterate of T is

$$\displaystyle{T^{N}(v',v) =\int _{V }\ldots \int _{V }T(v',w_{1})\ldots T(w_{N-1},v)dw_{1}\ldots dw_{N-1}.}$$
(T3) :

\(\|\mathcal{T}\|_{\langle 1\rangle ^{\perp }} < 1\), where \(L^{2}(V ) =\langle 1\rangle \oplus \langle 1\rangle ^{\perp }\).

(T4) :

\(\int _{V }T(v,v')dv' = 1\).

Assumption (T1) Assumption (T1) implies that T(⋅ , v′) is a non-negative probability density on V. The fact that \(T \in L^{2}(V \times V )\) implies that \(\mathcal{T}\) and \(\mathcal{T}^{{\ast}}\) are Hilbert-Schmidt operators, defined as follows [23]:

Definition 1

An integral operator \(\mathcal{T} f(v) =\int T(v,v')f(v')dv'\) with \(T \in L^{2}(V \times V )\) is called a Hilbert-Schmidt operator.

Hilbert-Schmidt operators have some compactness properties:

Theorem 1 ([23])

Hilbert-Schmidt operators are bounded and compact.

Furthermore, (T1) implies that \(\mathcal{T}\) and \(\mathcal{T}^{{\ast}}\) are positive operators.

Assumption (T2) We will show that assumption (T2a) ensures that \(\mathcal{T}^{{\ast}}\) is u 0-positive in the sense of Krasnoselskii [41], while (T2b) ensures that \(\mathcal{T}\) is u 0-positive. One of these is sufficient. Krasnoselskii defines u 0-positivity as follows.

Definition 2

Let X be a Banach space, \(\mathcal{K}\) the non-negative cone and L: X → X linear. Then

  1. (a)

    L is positive if \(L: \mathcal{K}\rightarrow \mathcal{K}.\)

  2. (b)

    Let L be positive. L is u 0-bounded from below if there is a fixed \(u_{0} \in \mathcal{K}\setminus \{0\}\) such that \(\forall \phi \in \mathcal{K}\setminus \{0\}\) \(\exists N > 0,\alpha > 0\) with

    $$\displaystyle{\alpha u_{0} \leq L^{N}\phi.}$$
  3. (c)

    Let L be positive. L is u 0-bounded from above if there is a fixed \(u_{0} \in \mathcal{K}\setminus \{0\}\) such that \(\forall \psi \in \mathcal{K}\setminus \{0\}\) \(\exists N > 0,\beta > 0\) with

    $$\displaystyle{L^{N}\psi \leq \beta u_{ 0}.}$$
  4. (d)

    L is u 0-positive if conditions (b) and (c) are both satisfied.

  5. (e)

    \(\mathcal{K}\) is reproducing if for all ϕ ∈ X there exist \(\phi ^{+},\phi ^{-}\in \mathcal{K}\) such that \(\phi =\phi ^{+} -\phi ^{-}\).

Using this definition, we can prove the following Lemma:

Lemma 1

Assumption (T2a) implies that \(\mathcal{T}^{{\ast}}\) is u 0 -positive, while (T2b) implies that \(\mathcal{T}\) is u 0 positive.

Proof

Consider \(\phi \in \mathcal{K}\). We compute the iterate

$$\displaystyle\begin{array}{rcl} \mathcal{T}^{{\ast}N}\phi (v)& =& \int _{V }\cdots \int _{V }T(v',w_{1})\cdots T(w_{N-1},v)\phi (v')dw_{1}\ldots dw_{N-1}dv' {}\\ & =& \int _{V }T^{N}(v',v)\phi (v')dv' {}\\ & \geq & \int _{V }u_{0}(v)\phi (v')dv' = u_{0}(v)\int _{V }\phi (v')dv' =\alpha u_{0}(v). {}\\ \end{array}$$

The last inequality is a direct consequence of (T2a). Similarly, we have

$$\displaystyle{\mathcal{T}^{{\ast}N}\phi (v) \leq \int _{V }\rho \,u_{0}(v)\phi (v')dv' =\rho \left (\int _{V }\phi (v')dv'\right )u_{0}(v) =\beta u_{0}(v).}$$

The second statement has a very similar proof. ⊓⊔

Condition (T2) has an interesting biological meaning. It is not assumed that the kernel T is positive. In fact, T is allowed to have support smaller than V, but some iterate of T must cover V. For example, if individuals are able to turn by up to 45° per turn, then they are able to reach any direction after 4 turns. In that case \(T^{4}\) would be \(u_{0}\)-positive. See Fig. 2.6 for an illustrative explanation. Assumption (T2) is more general than what is found in most publications on transport equations in biology, where it is almost always assumed that T > 0; here we can relax that assumption.
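This covering property is easy to check numerically; the following sketch discretises the circle of directions, builds a kernel that allows turns of at most 45°, and tests which iterate has full support:

```python
import numpy as np

# Discretise the circle of directions into 5-degree bins and build a turning
# kernel that allows directional changes of at most 45 degrees.
n = 72
dv = 2 * np.pi / n
angles = np.arange(n) * dv

# angular distance between directions, wrapped to [0, pi]
diff = np.abs((angles[:, None] - angles[None, :] + np.pi) % (2 * np.pi) - np.pi)
T = (diff <= np.pi / 4 + 1e-9).astype(float)
T /= T.sum(axis=0) * dv                  # normalise: int_V T(v, v') dv = 1

full_support = []
TN = T.copy()
for N in range(1, 5):
    full_support.append(bool(np.all(TN > 0)))   # does T^N cover all directions?
    TN = (T * dv) @ TN                          # next iterate T^(N+1)

print(full_support)                      # [False, False, False, True]
```

After four 45° turns, every direction on the circle is reachable, so the fourth iterate is strictly positive while the first three are not.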

Fig. 2.6
figure 6

Illustration of the iterates of a turning operator. On the left we indicate the support of a turning kernel that allows directional changes of up to 45°. On the right we indicate the range of the iterates \(T^{1},T^{2},T^{3},T^{4}\). After four turns, all directions are possible

The u 0 positivity is already sufficient to have a Krein-Rutman property:

Theorem 2 (Krasnoselskii, [41], Theorems 2.10, 2.11)

Let K be a reproducing non-negative cone in X. Let L be u 0 -positive. Let \(\varphi _{0} \in \mathcal{K}\) be an eigenfunction of L. Then

  1. (i)

    \(L\varphi _{0} =\lambda _{0}\varphi _{0}\) and \(\lambda _{0}\) is a simple, leading eigenvalue,

  2. (ii)

    φ 0 is unique in \(\mathcal{K}\) up to scalar multiples, and

  3. (iii)

    \(\vert \lambda _{0}\vert > \vert \lambda \vert \) for all other eigenvalues λ.

In our case we have

$$\displaystyle{\mathcal{T}^{{\ast}}1 =\int _{ V }T(v',v)1dv' = 1}$$

by (T1). Hence \(\varphi _{0} = 1 \in \mathcal{K}\) is the leading non-negative eigenfunction of \(\mathcal{T}^{{\ast}}\) with eigenvalue λ 0 = 1. All of the other eigenvalues are such that | λ |  < 1. We also have

$$\displaystyle{\mathcal{T} 1 =\int _{V }T(v,v')dv' = 1}$$

by (T4). This means that we also have that φ 0 = 1 is the leading non-negative eigenfunction of \(\mathcal{T}\).
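A small numerical check (with an assumed symmetric Gaussian-type kernel on the discretised circle, not a kernel from the text) confirms that the constant function is the leading eigenfunction with eigenvalue 1 and that the remaining spectrum lies strictly inside the unit disc:

```python
import numpy as np

# A discretised turning operator on the circle with a symmetric kernel
# t(|v - v'|); by symmetry both normalisations (T1) and (T4) hold, so the
# constant function should be an eigenfunction with eigenvalue 1.
n = 60
dv = 2 * np.pi / n
v = np.arange(n) * dv
d = np.abs((v[:, None] - v[None, :] + np.pi) % (2 * np.pi) - np.pi)
T = np.exp(-d**2)
T /= T.sum(axis=0) * dv              # int_V T(v, v') dv = 1

A = T * dv                           # discretised integral operator
ones = np.ones(n)
print(np.allclose(A @ ones, ones))   # True: phi_0 = 1 with lambda_0 = 1

eigs = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
print(eigs[0], eigs[1])              # leading eigenvalue 1, the rest strictly smaller
```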

Assumption (T3) Note that in Krasnoselskii’s theorem above it is assumed that there exists an eigenfunction in \(\mathcal{K}\). This is not always the case, and assumption (T3) ensures the existence of a spectral gap between the leading eigenvector φ 0 = 1 and the remainder of the spectrum. We will show later that if \(\mathcal{T}\) is a normal operator (or if \(\mathcal{T}^{{\ast}}\) is normal), then (T2) implies (T3).

Assumption (T4) Condition (T4) looks as natural as the second condition in (T1). It has, however, a very different meaning. The meaning of (T4) is that the eigenvalue equation

$$\displaystyle{\int _{V }T(v,v')\phi (v')dv' =\lambda \phi (v)}$$

has a constant solution ϕ(v) = 1 with eigenvalue λ = 1. This is a very special case that allows us to develop a full theory and to do the macroscopic scalings done later in this chapter. If the leading eigenfunction φ 0(v) is not constant the methods will change slightly, and particular care must be given to the resulting non-isotropic diffusion equations, which is discussed elsewhere [31, 35]. Both cases are equally important in terms of applications.

2.3.2 The Turning Operator

The turning operator describes the whole right hand side of (2.20) and is given by \(\mathcal{L}: L^{2}(V ) \rightarrow L^{2}(V )\):

$$\displaystyle{\mathcal{L}p(v) = -\mu p(v) +\mu \mathcal{T} p(v)}$$

with adjoint

$$\displaystyle{\mathcal{L}^{{\ast}}p(v) = -\mu p(v) +\mu \mathcal{T}^{{\ast}}p(v).}$$

We can now write down a result about the spectrum of the turning operator.

Lemma 2

Assume (T1)–(T4). Then 0 is a simple eigenvalue of \(\mathcal{L}^{{\ast}}\) and \(\mathcal{L}\) with leading eigenfunction φ 0 = 1. All other eigenvalues λ satisfy \(-2\mu <\mathrm{Re}\,\lambda < 0\). All other eigenfunctions have integral zero.

Proof

Both \(\mathcal{T}\) and \(\mathcal{T}^{{\ast}}\) have spectral radius 1, hence \(\mu \mathcal{T}\) has spectral radius μ. Since \(\mathcal{L} = -\mu \,\mathrm{id} +\mu \mathcal{T}\), each eigenvalue λ of \(\mathcal{L}\) has the form λ = μ(ν − 1) for an eigenvalue ν of \(\mathcal{T}\) with | ν | ≤ 1. For the eigenvalues other than 0 we therefore have

$$\displaystyle{-2\mu <\mathrm{ Re}\lambda < 0.}$$

If φφ 0 is another eigenfunction, then φ ∈ 〈1〉 ⊥  which implies

$$\displaystyle{0 =\int _{V }\varphi (v)1dv =\int _{V }\varphi (v)dv.}$$

⊓⊔

Condition (T3) allows us to introduce another constant, \(\mu _{2}\), which gives us information about the dissipativity of the turning operator. Consider \(\psi \in \langle 1\rangle ^{\perp }\); then

$$\displaystyle\begin{array}{rcl} \int _{V }\psi \mathcal{L}\psi dv& =& -\mu \int _{V }\psi ^{2}dv +\mu \int _{ V }\psi \mathcal{T}\psi dv {}\\ & \leq & -\mu (1 -\|\mathcal{T}\|_{\langle 1\rangle ^{\perp }})\int _{V }\psi ^{2}dv {}\\ & =& -\mu _{2}\|\psi \|_{2}^{2} {}\\ \end{array}$$

with \(\mu _{2} =\mu (1 -\|\mathcal{T}\|_{\langle 1\rangle ^{\perp }})\) and \(\|\mathcal{T}\|_{\langle 1\rangle ^{\perp }} < 1\).

2.3.3 Normal Operators

In this section we discuss what it means for an operator to be normal, and explore some of the consequences of this characteristic.

Definition 3

An operator A is defined to be normal if \(AA^{{\ast}} = A^{{\ast}}A\).

Theorem 3 ([5], p. 55, et. seq.)

If A is normal, then there exists a complete orthogonal set of eigenfunctions. A has a spectral representation \(A =\sum \lambda _{j}P_{j}\) , where \(\lambda _{j}\) are the eigenvalues and \(P_{j}\) the spectral projections.

If \(\mathcal{T}\) is normal, then we can choose an orthonormal basis ϕ n with ∥ ϕ n  ∥  = 1.

Lemma 3

If \(\mathcal{T}\) is normal, then (T3) follows from (T1) and (T2).

Proof

Consider the operator norm of \(\mathcal{T}\) on 〈1〉 ⊥ :

$$\displaystyle\begin{array}{rcl} \|\mathcal{T}\|_{\langle 1\rangle ^{\perp }}& =& \sup _{{ \phi \in \langle 1\rangle ^{\perp } \atop \|\phi \|_{2}=1} }\|\mathcal{T}\phi \|_{2} {}\\ & =& \sup _{\phi }\|\mathcal{T}\sum _{n=1}^{\infty }\alpha _{ n}\phi _{n}\|_{2} {}\\ & =& \sup _{\phi }\|\sum _{n=2}^{\infty }\alpha _{ n}\lambda _{n}\phi _{n}\|_{2} {}\\ & =& \sup _{\phi }\left (\sum _{n=2}^{\infty }\vert \alpha _{ n}\lambda _{n}\vert ^{2}\right )^{\frac{1} {2} } {}\\ & <& \sup _{\phi }\left (\sum _{n=2}^{\infty }\vert \alpha _{ n}\vert ^{2}\right )^{\frac{1} {2} } =\|\phi \| _{2} = 1. {}\\ \end{array}$$

Here \(\alpha _{1} = 0\) because \(\phi \in \langle 1\rangle ^{\perp }\), and \(\vert \lambda _{n}\vert < 1\) for all \(n \geq 2\), which gives the strict inequality. ⊓⊔

In our case we need to check if \(\mathcal{T}\) is normal:

$$\displaystyle\begin{array}{rcl} \mathcal{T}\mathcal{T}^{{\ast}}\phi & =& \mathcal{T}\left (\int _{ V }T(v,v')\phi (v')dv'\right ) {}\\ & =& \int _{V }\int _{V }T(v,v'')T(v',v'')\phi (v')dv'dv'' {}\\ \mathcal{T}^{{\ast}}\mathcal{T}\phi & =& \int _{ V }\int _{V }T(v'',v)T(v'',v')\phi (v')dv'dv''. {}\\ \end{array}$$

In order for our operator to be normal, we thus obtain the necessary symmetry condition

$$\displaystyle{\int _{V }T(v,v'')T(v',v'')dv'' =\int _{V }T(v'',v)T(v'',v')dv''.}$$

This is satisfied, for example, when T is a symmetric kernel of the form \(T(v,v') = T(v',v),\forall (v,v') \in V ^{2}\).
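The symmetry condition can be tested numerically; the sketch below compares both sides of the condition for an exactly symmetric kernel and for an assumed non-symmetric modification:

```python
import numpy as np

# Compare int T(v,v'')T(v',v'')dv''  with  int T(v'',v)T(v'',v')dv''.
n = 50
dv = 2 * np.pi / n
v = np.arange(n) * dv

def normality_gap(T):
    lhs = (T @ T.T) * dv          # int T(v, v'') T(v', v'') dv''
    rhs = (T.T @ T) * dv          # int T(v'', v) T(v'', v') dv''
    return np.max(np.abs(lhs - rhs))

d = np.abs((v[:, None] - v[None, :] + np.pi) % (2 * np.pi) - np.pi)
T_sym = np.exp(-d**2)
T_sym = (T_sym + T_sym.T) / 2                               # enforce exact symmetry
T_asym = T_sym * (1 + 0.3 * np.sin(v))[:, None]             # breaks the symmetry

print(normality_gap(T_sym))    # 0: symmetric kernels give normal operators
print(normality_gap(T_asym))   # order 1: the condition fails
```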

2.3.4 Important Examples

We now consider two important examples, and investigate how the theory discussed so far applies.

2.3.4.1 Example 1: Pearson Walk

For the first example, we will choose our space of directions to be a sphere of constant radius, i.e. \(V = s\mathbb{S}^{n-1}\). This means that our particles can choose any direction, and will travel with constant speed. We will choose the simplest turning kernel, which is constant and normalized: \(T(v,v') = \frac{1} {\vert V \vert }\).

We now check whether our four basic assumptions are satisfied for this simple choice of V and T.

(T1) :

\(T \geq 0\) ✓, \(\int _{V }T\,dv = 1\) ✓, \(\int _{V }\int _{V }T^{2}\,dv\,dv' = 1 < \infty \) ✓, and so the conditions of assumption (T1) are met.

(T2) :

Not only do we have that T ≥ 0, but we actually have the stronger condition T > 0. This implies that \(\mathcal{T}\) is u 0-positive. ✓

(T3) :

We have

$$\displaystyle{\mathcal{T}^{{\ast}}\phi =\int _{ V } \frac{1} {\vert V \vert }\phi (v')dv' = \mathcal{T}\phi =\int _{V } \frac{1} {\vert V \vert }\phi (v)dv.}$$

We can thus conclude that \(\mathcal{T}\) is self-adjoint and hence normal. By Lemma 3, (T3) is then satisfied. ✓

(T4) :

\(\int _{V }T\,dv' = 1\). ✓

The Pearson walk satisfies all assumptions (T1)–(T4), and it will form our prototype for the theory and scaling developed later.
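As a sanity check, the Pearson walk can be simulated directly. With exponential run times of rate μ and uniformly random directions, the mean squared displacement after K runs should be \(Ks^{2}E[\tau ^{2}] = 2Ks^{2}/\mu ^{2}\), which matches the diffusive behaviour derived later; a minimal Monte Carlo sketch:

```python
import numpy as np

# Monte Carlo Pearson walk in 2D: constant speed s, exponential run durations
# with rate mu, and a uniformly random new direction after each turn.
rng = np.random.default_rng(0)
s, mu = 1.0, 1.0
n_walkers, n_runs = 2000, 100

pos = np.zeros((n_walkers, 2))
for _ in range(n_runs):
    tau = rng.exponential(1.0 / mu, n_walkers)        # run durations
    theta = rng.uniform(0.0, 2.0 * np.pi, n_walkers)  # uniform directions
    pos[:, 0] += s * tau * np.cos(theta)
    pos[:, 1] += s * tau * np.sin(theta)

msd = np.mean(np.sum(pos**2, axis=1))
expected = 2.0 * n_runs * s**2 / mu**2   # K * s^2 * E[tau^2] with E[tau^2] = 2/mu^2
print(msd, expected)                     # the two agree up to Monte Carlo noise
```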

2.3.4.2 Example 2: Movement on Fibre Networks

There are many examples that arise naturally in biology where the particles in question, whether animals or cells, make their turning decisions based on their environment. For example, glioma cells invading the brain use the white matter tracts as highways for their movement [9, 17, 16, 59]. We also see this phenomenon in ecology, where wolves use paths cut into the forest for oil exploration to hunt more efficiently [43, 44]. In this example we thus consider such situations, where the turning kernel is given by an underlying anisotropy of the environment. We use unit vectors \(\theta \in \mathbb{S}^{n-1}\) to describe the anisotropies of the environment through a directional distribution q(x, θ) with

$$\displaystyle{\int _{\mathbb{S}^{n-1}}q(x,\theta )d\theta = 1\quad \mathrm{and}\quad q(x,\theta ) \geq 0.}$$

In the context of glioma growth, q(x, θ) denotes the distribution of nerve fibre directions at each location x [50]. In the example of wolf movement the function q would provide information on preferred movement directions due to roads or seismic lines [31]. We assume that individuals favour directions given by the environment and, for simplicity, we consider unit speed \(\vert v\vert = 1\), \(V = \mathbb{S}^{n-1}\). We also make the simplifying assumption that the previous travel direction does not matter, essentially neglecting inertia. Then T(v, v′, x) = q(x, v). The assumptions (T1)–(T4) relate to the v dependence only, hence in the following we suppress the x dependence in q, noting, however, that q would in general depend on x.

(T1) :

\(q \geq 0\) ✓, \(\int _{V }T(v,v')dv =\int _{V }q(v)dv = 1\) ✓, and \(\int _{V }\int _{V }q^{2}(v)\,dv\,dv' = \vert V \vert \cdot \int _{V }q^{2}(v)dv < \infty \), provided \(q \in L^{2}(\mathbb{S}^{n-1})\). ✓

(T2) :

We first compute the iterates:

$$\displaystyle\begin{array}{rcl} T^{N}(v',v)& =& \int _{V }\cdots \int _{V }T(v',w_{1})\cdots T(w_{N-1},v)dw_{1}\cdots dw_{N-1} {}\\ & =& \int _{V }\cdots \int _{V }q(v')q(w_{1})\cdots q(w_{N-1})dw_{1}\cdots dw_{N-1} {}\\ & =& q(v'). {}\\ \end{array}$$

Condition (T2a) therefore becomes:

$$\displaystyle{u_{0}(v) \leq q(v') \leq \rho u_{0}(v),}$$

which is satisfied only if q > 0.

The condition (T2b) becomes:

$$\displaystyle{u_{0}(v) \leq q(v) \leq \rho u_{0}(v),}$$

and so we have a weaker condition; it is satisfied by choosing \(u_{0} = q\) and ρ = 1.

(T3) :

Is \(\mathcal{T}\) normal? \(\mathcal{T}\) would be normal if

$$\displaystyle{\int _{V }q(v)q(v')dv'' =\int _{V }q(v'')q(v'')dv''}$$

which is equivalent to the condition

$$\displaystyle{\vert V \vert q(v)q(v') =\| q\|_{2}^{2}.}$$

We see that this is true if q = const., bringing us back to the Pearson case. In general then, T(v, v′) = q(v) is not normal. We must therefore do some more work in order to verify (T3). We can compute \(\|\mathcal{T}\|_{\langle 1\rangle ^{\perp }}\) directly:

$$\displaystyle\begin{array}{rcl} \|\mathcal{T}\|_{\langle 1\rangle ^{\perp }}& =& \sup _{{ \phi \in \langle 1\rangle ^{\perp } \atop \|\phi \|=1} }\left \vert \left \vert \int _{V }q(v)\phi (v')dv'\right \vert \right \vert {}\\ & =& \sup \left \vert \left \vert q(v)\int _{V }\phi (v')dv'\right \vert \right \vert {}\\ & =& 0. {}\\ \end{array}$$

Therefore on \(\langle 1\rangle ^{\perp }\) the operator \(\mathcal{T}\) is the zero operator. This satisfies assumption (T3), but it also shows that the splitting \(L^{2}(V ) =\langle 1\rangle \oplus \langle 1\rangle ^{\perp }\) is not a good choice here. Indeed, we will later see that we should choose \(L^{2}(V ) =\langle q\rangle \oplus \langle q\rangle ^{\perp }\).

Finally, we check condition (T4).

$$\displaystyle{\int _{V }T(v,v')dv' = q(v)\vert V \vert = 1,}$$

which is only true for q(v) = const.

So for this example, if T(v, v′) = q(v) is not constant, then it fails (T4), and (T3) holds only in the degenerate sense that \(\mathcal{T}\) vanishes on \(\langle 1\rangle ^{\perp }\), which is problematic.
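These observations can be confirmed numerically with an assumed non-constant fibre distribution q (the cosine profile below is purely illustrative):

```python
import numpy as np

# Kernel T(v, v') = q(v) with a non-constant direction distribution q:
# the operator annihilates <1>^perp, and (T4) fails.
n = 100
dv = 2 * np.pi / n
v = np.arange(n) * dv

q = 1.0 + 0.5 * np.cos(v)        # assumed non-constant fibre distribution
q /= q.sum() * dv                # normalise: int q dv = 1, so (T1) holds

T = np.tile(q[:, None], (1, n))  # kernel T(v, v') = q(v)
phi = np.sin(v)                  # test function with int phi dv = 0

print(np.max(np.abs((T * dv) @ phi)))       # ~0: T phi vanishes on <1>^perp
print(np.allclose((T * dv).sum(axis=1), 1))  # False: (T4) fails for non-constant q
```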

2.3.4.3 Example 3 (Homework) Symmetric Kernels

Check if symmetric kernels of the form a), b) or c) satisfy the assumptions (T1)–(T4):

$$\displaystyle\begin{array}{rcl} a)& \qquad & T(v,v') = t(\vert v - v'\vert ) {}\\ b)& \qquad & T(v,v') = t(v - v') {}\\ c)& \qquad & T(v,v') = t(v'). {}\\ \end{array}$$

2.3.5 Main Spectral Result

In this section, we summarize the results thus far into one main theorem and provide a proof of the missing pieces.

Theorem 4 ([29])

Assume (T1)–(T4). Then

  1. 1)

    0 is a simple leading eigenvalue of \(\mathcal{L}\) with unique eigenfunction φ 0 = 1,

  2. 2)

    All other eigenvalues λ are such that \(-2\mu <\mathrm{Re}\,\lambda \leq -\mu _{2} < 0\) and all other eigenfunctions have zero mass.

  3. 3)

    \(L^{2}(V ) =\langle 1\rangle \oplus \langle 1\rangle ^{\perp }\) and for all \(\psi \in \langle 1\rangle ^{\perp }\):

    $$\displaystyle{\int _{V }\psi \mathcal{L}\psi dv \leq -\mu _{2}\|\psi \|_{2}^{2},\quad \mbox{ where }\quad \mu _{ 2} =\mu {\Bigl ( 1 -\|\mathcal{T}\|_{\langle 1\rangle ^{\perp }}\Bigr )},}$$
  4. 4)

    \(\|\mathcal{L}\|\) has a lower and upper estimate

    $$\displaystyle{\mu _{2} \leq \|\mathcal{L}\|_{\mathcal{L}(L^{2}(V ),L^{2}(V ))} \leq 2\mu,}$$
  5. 5)

    \(\mathcal{L}_{\langle 1\rangle ^{\perp }}\) has a linear inverse \(\mathcal{F}\) (pseudo-inverse) with

    $$\displaystyle{\frac{1} {2\mu } \leq \|\mathcal{F}\|_{\langle 1\rangle ^{\perp }} \leq \frac{1} {\mu _{2}}.}$$

Proof

We have already verified parts 1)–3) earlier in this section, thus we now prove 4) and 5). To verify 4):

$$\displaystyle\begin{array}{rcl} \|\mathcal{L}\|_{\mathcal{L}(L^{2}(V ),L^{2}(V ))}& =& \sup _{{ \phi \in L^{2}(V ) \atop \|\phi \|=1} }\|\mathcal{L}\phi \|_{2} {}\\ & \leq & \sup _{\phi =\alpha +\phi ^{\perp }}\left (\mathop{\underbrace{\|\mathcal{L}\alpha \|_{2}}}\limits _{=0} +\| \mathcal{L}\phi ^{\perp }\|_{ 2}\right ) {}\\ & =& \sup _{\phi ^{\perp }\in \langle 1\rangle ^{\perp }}\|\mathcal{L}\phi ^{\perp }\|_{ 2} {}\\ & =& \sup _{\phi ^{\perp }\in \langle 1\rangle ^{\perp }}\|-\mu \phi ^{\perp } +\mu \mathcal{T}\phi ^{\perp }\|_{ 2} {}\\ & \leq & \sup _{\phi \in \langle 1\rangle ^{\perp }}\mu \|\phi ^{\perp }\|_{ 2} +\mu \| \mathcal{T}\phi ^{\perp }\|_{ 2} {}\\ & \leq & \sup _{\phi ^{\perp }\in \langle 1\rangle ^{\perp }}2\mu \|\phi ^{\perp }\|_{ 2} {}\\ \end{array}$$

and \(\forall \phi \in \langle 1\rangle ^{\perp }\), ∥ ϕ ∥ 2 = 1 we have

$$\displaystyle{\mu _{2}\|\phi \|_{2}^{2} \leq -\int \phi \mathcal{L}\phi dv{ \leq \atop \mathrm{H\ddot{o}lder}} \|\phi \|_{2} \cdot \|\mathcal{L}\phi \|_{2} \leq \|\mathcal{L}\|_{\mathcal{L}(L^{2}(V ),L^{2}(V ))},}$$

which implies \(\mu _{2} \leq \|\mathcal{L}\|\leq 2\mu\).

Part 5) follows directly from \(\mathcal{F} = \left (\mathcal{L}\vert _{\langle 1\rangle ^{\perp }}\right )^{-1}\). For example, if \(\mathcal{F}\phi = z\) and ϕ, z ∈ 〈1〉 ⊥ , then \(\mathcal{L}z =\phi\) and

$$\displaystyle{\|\phi \|=\| \mathcal{L}z\|}$$
$$\displaystyle\begin{array}{rcl} & \Rightarrow & \mu _{2}\|z\| \leq \|\phi \|\leq 2\mu \|z\| {}\\ & \Rightarrow & \frac{1} {2\mu }\|\phi \| \leq \| z\| \leq \frac{1} {\mu _{2}} \|\phi \| {}\\ & \Rightarrow & \frac{1} {2\mu }\|\phi \| \leq \|\mathcal{F}\phi \|\leq \frac{1} {\mu _{2}} \|\phi \|. {}\\ \end{array}$$

⊓⊔

2.3.6 Existence and Uniqueness

Since the transport equation as formulated in (2.20) is linear, we immediately get existence and uniqueness of solutions as follows. We denote the shift operator A: = −(v ⋅ ∇) with domain of definition

$$\displaystyle{D(A) =\{\phi \in L^{2}(\mathbb{R}^{n} \times V ):\phi (.,v) \in H^{1}(\mathbb{R}^{n})\}.}$$

The shift operator is skew-adjoint and, according to Stone’s theorem [7, 51], it generates a strongly continuous unitary group on \(L^{2}(\mathbb{R}^{n} \times V )\). The right hand side of (2.20) is given by the bounded operator \(\mathcal{L}\), hence it is a bounded perturbation of the shift group. Consequently, (2.20) also generates a strongly continuous solution group on \(L^{2}(\mathbb{R}^{n} \times V )\). Moreover, given initial conditions \(u_{0} \in D(A)\), a unique global solution exists in

$$\displaystyle{C^{1}([0,\infty ),L^{2}(\mathbb{R}^{n} \times V )) \cap C([0,\infty ),D(A)).}$$

2.4 The Formal Diffusion Limit

The computation of the diffusion limit, as presented here, is one of the standard methods for the analysis of transport equations. A transport equation is hyperbolic in type, as it is based on pieces of ballistic motion interspersed with directional changes. As the frequency of these changes and the speed become large, the movement looks, on a macroscopic scale, like diffusion (see Fig. 2.7). Mathematically, this macroscopic limit can be obtained via a formal asymptotic expansion in a small parameter ɛ, which describes the ratio of the microscopic to the macroscopic spatial scale. We will see that the above assumptions (T1)–(T4) allow us to obtain a well defined and uniformly parabolic limit equation, where the diffusivity is given by the turning kernel T. Before we present the scaling method in Sect. 2.4.2, we discuss realistic scaling arguments for E. coli bacteria in Sect. 2.4.1.

2.4.1 Scalings

Fig. 2.7
figure 7

Illustration of the relevant scalings for E. coli movement. While the time scales vary from seconds to minutes to hours, the spatial scale changes of the order of μm to mm to cm

We now consider the movement of E. coli bacteria as an example of different time and spatial scales [1, 14]. E. coli move by rotating their flagella, which are attached to the outside membrane of the bacterium. If most flagella rotate counterclockwise, they tend to align and propel the bacterium forward in a straight line. If many flagella rotate clockwise, then the alignment of the flagella is lost, they point in very different directions, which leads to a rotation of the cell. The clockwise-counterclockwise rotation of the flagella is controlled by an internal chemical signalling pathway, which is influenced by external signals [14]. As seen through a microscope, the bacteria perform a typical run-and-tumble movement, where longer periods of straight movement are interspersed with short moments of reorientation. On an individual scale, E. coli turn about once per second, hence the mean turning rate satisfies \(\mu \sim 1\,\mathrm{s}^{-1}\). From the point of view of the cell, we call this the time scale of turning \(\tau _{\mathrm{turn}} = \mathcal{O}(1)\). If observed over 50–100 turns, the trajectories appear directed, and a net displacement can easily be measured. We call this the intermediate drift time scale \(\tau _{\mathrm{drift}} \sim \mathcal{O}\left (\frac{1} {\varepsilon } \right )\), with \(\varepsilon \sim 10^{-2}\). If we allow for 2500–10,000 turns, then the trajectories look like diffusion and random movement. Hence we introduce a third time scale \(\tau _{\mathrm{diff}} \sim \mathcal{O}\left (\frac{1} {\varepsilon ^{2}} \right ).\) Just by the scale of observation, we identify three time scales: a time scale of turning τ turn, a drift time scale τ drift and a diffusion time scale τ diff.

Mathematically, we identify the three scales through nondimensionalization. This serves to remove dimension from the problem, thus simplifying the model. In many situations, this will also reduce the number of parameters which we are dealing with, and it often allows us to identify large and small parameter combinations. In the case of transport equations, as introduced in the previous section, we introduce

\(\tilde{v} = \frac{v} {s}\), where s is the characteristic speed; for E. coli it is about 10–20 μm s−1,

\(\tilde{x} = \frac{x} {L}\), where L is the characteristic length scale; for E. coli, bacterial colonies are of the order of 1 mm–1 cm, and

\(\tilde{t} = \frac{t} {\sigma }\), where σ is the macroscopic time scale of observation; in the bacterial case it is about 1–10 h.

If we apply these scalings, then the transport equation becomes

$$\displaystyle{\frac{1} {\sigma } \frac{\partial p} {\partial \tilde{t}} + \frac{s} {L}\tilde{v} \cdot \nabla _{\tilde{x}}p = -\mu p +\mu \int _{V }Tpdv'.}$$

Using the values which we identified for E. coli, we find

$$\displaystyle{\sigma \approx 1 - 10\,\mathrm{h} = 3600 - 36,000\,\mathrm{s} \sim 10^{4}\,\mathrm{s},}$$

and

$$\displaystyle{ \frac{s} {L} \approx \frac{10\upmu \mathrm{m}\mathrm{s}^{-1}} {10^{-3}\mathrm{m}} = \frac{10 \cdot 10^{-6}\,\mathrm{m}\mathrm{s}^{-1}} {10^{-3}\mathrm{m}} = 10^{-2}\mathrm{s}^{-1}.}$$

When ɛ = 10−2, we then have \(\frac{1} {\sigma } \sim \varepsilon ^{2}\) and \(\frac{s} {L} \sim \varepsilon\). Treating these relations as equalities (i.e. dropping the ∼ ), we obtain the resulting scaled transport equation:

$$\displaystyle{ \varepsilon ^{2}p_{ t} +\varepsilon v \cdot \nabla p = \mathcal{L}p. }$$
(2.21)

2.4.2 The Formal Diffusion Limit

To compute the formal diffusion limit, we will begin by studying a regular perturbation, or Hilbert expansion of p with respect to ɛ. This gives us

$$\displaystyle{ p(x,v,t) = p_{0}(x,v,t) +\varepsilon p_{1}(x,v,t) +\varepsilon ^{2}p_{ 2}(x,v,t) + h.o.t. }$$
(2.22)

We will begin by substituting this expansion into Eq. (2.21) and match orders of ɛ.

Order ɛ 0

$$\displaystyle{\mathcal{L}p_{0} = 0,}$$

which implies that p 0 is in the kernel of \(\mathcal{L}\), hence

$$\displaystyle{p_{0}(x,v,t) =\bar{ p}(x,t),}$$

which is independent of v. We get this from the first result of Theorem 4.

Order ɛ 1

$$\displaystyle{ v \cdot \nabla p_{0} = \mathcal{L}p_{1}. }$$
(2.23)

This equation can be solved for p 1 if v ⋅ ∇p 0 ∈ 〈1〉 ⊥ , so we need to check if this solvability condition is satisfied. Computing the following inner product of v ⋅ ∇p 0 and 1 we find:

$$\displaystyle{\int _{V }v \cdot \nabla p_{0}dv = \nabla \left (\mathop{\underbrace{\int _{V }vdv}}\limits _{{ =0\text{ due to} \atop \text{symmetry of V }} }\bar{p}\right ) = 0.}$$

Hence Eq. (2.23) can be solved as \(p_{1} = \mathcal{F}(v \cdot \nabla p_{0}) = \mathcal{F}(v \cdot \nabla \bar{p}).\)

Order ɛ 2

$$\displaystyle{ p_{0}{}_{t} + v \cdot \nabla p_{1} = \mathcal{L}p_{2}. }$$
(2.24)

This case is a bit more complicated to solve than the first two cases. Here we have two options for how to proceed: (a) integrate, or (b) use the solvability condition. In the case studied here, (a) and (b) are equivalent; in other, more general cases, however, we would choose option (a) and integrate (see Sect. 2.4.6).

If we integrate Eq. (2.24), we obtain

$$\displaystyle{\int _{V }\left (p_{0}{}_{t} + v \cdot \nabla p_{1}\right )dv = 0,}$$

since the right hand side integrates to 0. Plugging in the results from the order 0 and order 1 matching, this becomes

$$\displaystyle{\int _{V }\bar{p}_{t}(x,t)dv +\int _{V }v \cdot \nabla \mathcal{F}(v \cdot \nabla \bar{p}(x,t))dv = 0.}$$

Since \(\bar{p}_{t}\) does not depend on v, we can simplify the first term. Also, since ∇ is a spatial derivative, and the integral is over the velocity space, we can take the derivative out of the integral in the second term. This equation thus becomes

$$\displaystyle{\vert V \vert \bar{p}_{t}(x,t) + \nabla \cdot \int _{V }v\mathcal{F}vdv \cdot \nabla \bar{p}(x,t) = 0.}$$

We can simplify this to

$$\displaystyle{ \bar{p}_{t} = \nabla \cdot D\nabla \bar{p} }$$
(2.25)

where the diffusion tensor D is defined to be

$$\displaystyle{D = - \frac{1} {\vert V \vert }\int _{V }v\,(\mathcal{F}v)^{\mathrm{T}}dv = - \frac{1} {\vert V \vert }\int _{V }v \otimes \mathcal{F}vdv,}$$

where we use two equivalent notations for the outer (tensor) product. We can write this in index notation as well

$$\displaystyle{\nabla \cdot D\nabla =\sum _{ i,j=1}^{n}\partial _{ i}D^{ij}\partial _{ j},\quad \mbox{ with }\quad D^{ij} = - \frac{1} {\vert V \vert }\int _{V }v^{i}\mathcal{F}v^{j}dv.}$$

The components of the diffusion tensor D give the relative rates of diffusion in different directions, so this process allows the rate of spread to vary with direction. If D is a constant multiple of the identity matrix of the appropriate dimension, then the resulting diffusion is called isotropic. If instead the rates of spread do vary with direction, we have anisotropic diffusion.

2.4.2.1 Example: Pearson Walk

We can once again consider the Pearson walk as an example. Recall from before that for this example we choose \(V = s\mathbb{S}^{n-1}\) and \(T(v,v') = \frac{1} {\vert V \vert }\). We first must compute the inverse operator \(\mathcal{F}\). Given ϕ ∈ 〈1〉 ⊥ , we wish to find z ∈ 〈1〉 ⊥  such that \(\mathcal{L}z =\phi.\) We will use the fact that z ∈ 〈1〉 ⊥  implies \(\int _{V }z(v)dv = 0\).

Now if we apply the operator \(\mathcal{L}\), we have that \(\mathcal{L}z =\phi\) is equivalent with

$$\displaystyle{-\mu z(v) +\mu \mathop{\underbrace{ \int _{V } \frac{1} {\vert V \vert }z(v')dv'}}\limits _{=0} =\phi (v),}$$

and so \(z(v) = -\frac{1} {\mu } \phi (v)\). Hence

$$\displaystyle{\mathcal{F} = -\frac{1} {\mu } \quad \mbox{ as multiplication operator.}}$$

Then for this example, we find that the diffusion tensor is

$$\displaystyle{D = \frac{1} {\mu \vert V \vert }\int _{V }vv^{\mathrm{T}}dv.}$$

In order to have an explicit form for D, we must then compute

$$\displaystyle{\int _{V }vv^{\mathrm{T}}dv,\quad \mbox{ with }\quad V = s\mathbb{S}^{n-1}.}$$

For example, in 2-dimensions: \(V = s\mathbb{S}^{1}\), and \(v = s\left ({ \cos \phi \atop \sin \phi } \right )\). We can then explicitly compute

$$\displaystyle{vv^{\mathrm{T}} = s^{2}\left (\begin{array}{cc} \cos ^{2}\phi & \cos \phi \sin \phi \\ \cos \phi \sin \phi & \sin ^{2}\phi \end{array} \right ),}$$

and so

$$\displaystyle{D = \frac{s^{2}} {\mu \vert V \vert }\int _{0}^{2\pi }\left (\begin{array}{cc} \cos ^{2}\phi & \cos \phi \sin \phi \\ \cos \phi \sin \phi & \sin ^{2}\phi \end{array} \right )sd\phi.}$$

We can then solve by integrating componentwise. If we consider this tensor in three dimensions, then we have double integrals of trigonometric functions to solve. This is still possible, but tedious. In higher dimensions the integral becomes more and more cumbersome. In the proof of the next lemma we propose a clever use of the divergence theorem to compute the above integral in any dimension. As shown by Hillen in [26], this method generalizes to higher velocity moments as well.
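Before turning to the general lemma, the two-dimensional computation can be spot-checked numerically. The following sketch (numpy assumed; the values of s and μ are arbitrary sample choices) discretizes \(V = s\mathbb{S}^{1}\) and forms the Riemann sum for D:

```python
import numpy as np

s, mu = 2.0, 0.5                     # sample speed and turning rate (assumed values)
n_grid = 20000
phi = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
dphi = 2.0 * np.pi / n_grid
v = s * np.stack([np.cos(phi), np.sin(phi)])     # velocities on V = s*S^1

# D = (1/(mu*|V|)) * integral of v v^T over V, with dv = s*dphi and |V| = 2*pi*s
VVt = np.einsum('ik,jk->ij', v, v) * dphi * s    # Riemann sum of the integral of v v^T dv
D = VVt / (mu * 2.0 * np.pi * s)

print(D)    # approximately (s^2/(2*mu)) * I
```

The off-diagonal entries vanish and the diagonal entries agree with s²∕(2μ), the n = 2 case of the general formula below.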

Lemma 4

Let \(V = s\mathbb{S}^{n-1},\omega _{0} = \vert \mathbb{S}^{n-1}\vert \) , then |V | = s n−1 ω 0 and

$$\displaystyle{\int _{V }vv^{\mathrm{T}}dv = \frac{\omega _{0}s^{n+1}} {n} \mathbb{I},}$$

where \(\mathbb{I}\) is the n-dimensional identity matrix.

Proof

Since \(\int _{V }vv^{\mathrm{T}}dv\) is a tensor, we test it with two vectors \(a,b \in \mathbb{R}^{n}\) and use the summation convention over repeated indices, i.e.

$$\displaystyle{a_{i}b_{i} =\sum _{ i=1}^{n}a_{ i}b_{i}}$$

then

$$\displaystyle\begin{array}{rcl} a^{\mathrm{T}}\int _{ V }vv^{\mathrm{T}}dv\;b& = & \int _{ V }a_{i}v_{i}v_{j}b_{j}dv {}\\ & = & s\int _{V } \frac{v_{i}} {\vert v\vert }(a_{i}v_{j}b_{j})dv {}\\ & \mathop{\underbrace{=}}\limits _{{ \mathrm{divergence} \atop \mathrm{theorem}} }& s\int _{B_{s}(0)}\partial _{v_{i}}(a_{i}v_{j}b_{j})dv {}\\ & = & s\int _{B_{s}(0)}a_{i}b_{i}dv {}\\ & = & s\vert B_{s}(0)\vert a_{i}b_{i}, {}\\ \end{array}$$

where we used that v∕ | v | is the outer unit normal along \(V = \partial B_{s}(0)\).

We can compute | B s (0) | as follows

$$\displaystyle{\vert B_{s}(0)\vert = s^{n}\vert B_{ 1}(0)\vert = s^{n}\int _{ B_{1}(0)}dv = \frac{s^{n}} {n} \int _{B_{1}(0)}\mathrm{div}_{v}vdv.}$$

If we apply the divergence theorem again this becomes

$$\displaystyle{\frac{s^{n}} {n} \int _{\mathbb{S}^{n-1}}\sigma \cdot \sigma d\sigma = \frac{s^{n}} {n} \vert \mathbb{S}^{n-1}\vert = \frac{s^{n}} {n} \omega _{0},}$$

where \(\omega _{0} \equiv \vert \mathbb{S}^{n-1}\vert \). Then

$$\displaystyle{a^{\mathrm{T}}\int _{ V }vv^{\mathrm{T}}dv\;b = a^{T}\frac{s^{n}} {n} \vert \mathbb{S}^{n-1}\vert b = \frac{s^{n+1}} {n} \omega _{0}a^{\mathrm{T}}b}$$

for all vectors \(a,b \in \mathbb{R}^{n}\). We therefore obtain

$$\displaystyle{\int _{V }vv^{\mathrm{T}}dv = \frac{\omega _{0}s^{n+1}} {n} \mathbb{I}.}$$

⊓⊔
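The formula of Lemma 4 is also easy to spot-check by Monte Carlo integration over the sphere. A sketch (numpy assumed; n = 3 and s = 2 are arbitrary sample choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, s, N = 3, 2.0, 200_000

u = rng.normal(size=(N, n))
u /= np.linalg.norm(u, axis=1, keepdims=True)     # uniform samples on S^{n-1}
v = s * u                                          # samples on V = s*S^{n-1}

omega0 = 4.0 * np.pi                               # |S^2| for n = 3
V_measure = s ** (n - 1) * omega0                  # |V| = s^{n-1} * omega_0

# Monte Carlo estimate of the surface integral of v v^T over V
VVt = (v[:, :, None] * v[:, None, :]).mean(axis=0) * V_measure
expected = omega0 * s ** (n + 1) / n * np.eye(n)   # the formula of Lemma 4

print(np.round(VVt, 1))
print(np.round(expected, 1))
```

The estimate reproduces the diagonal value ω₀s⁴∕3 and near-zero off-diagonal entries, up to Monte Carlo error.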

Remarks

 

  1.

    For general symmetric V, there exists κ > 0 such that \(\int _{V }vv^{\mathrm{T}}dv =\kappa \mathbb{I}.\)

  2.

    In [26] explicit formulas for all higher velocity moments \(\int _{V }v^{i}v^{j}v^{k}dv\) were computed.

Returning now to our discussion of the Pearson walk example, we can explicitly compute the diffusion tensor using the above results, i.e.

$$\displaystyle{D = \frac{1} {\mu \vert V \vert }\int _{V }vv^{\mathrm{T}}dv = \frac{1} {\mu \vert V \vert }\frac{\omega _{0}s^{n+1}} {n} \mathbb{I},}$$

and since | V |  = s n−1 ω 0, this simplifies to

$$\displaystyle{D = \frac{s^{2}} {\mu n} \mathbb{I}.}$$

This diffusion tensor corresponds to isotropic diffusion, and so the use of the tensor is not necessary, and we can simply use a diffusion coefficient. This gives the isotropic diffusion equation

$$\displaystyle{\bar{p}_{t} = \frac{s^{2}} {\mu n} \varDelta \bar{p}.}$$
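The parabolic limit makes a quantitative prediction that can be tested against stochastic simulations of the Pearson walk: for times much longer than 1∕μ, the mean squared displacement should satisfy MSD(t) ≈ 2nDt = 2s²t∕μ. A minimal sketch (pure Python; all parameter values are illustrative assumptions):

```python
import math
import random

def pearson_walk(T, s, mu, rng):
    """One 2-D Pearson walk up to time T: exponential run times (rate mu),
    uniformly random new direction after each turn; returns the final position."""
    x = y = t = 0.0
    while t < T:
        dt = min(rng.expovariate(mu), T - t)      # duration of the next run
        theta = rng.uniform(0.0, 2.0 * math.pi)   # uniformly chosen direction
        x += s * dt * math.cos(theta)
        y += s * dt * math.sin(theta)
        t += dt
    return x, y

s, mu, n, T, N = 1.0, 1.0, 2, 50.0, 4000
rng = random.Random(42)
msd = sum(x * x + y * y for x, y in (pearson_walk(T, s, mu, rng) for _ in range(N))) / N

D = s ** 2 / (mu * n)                 # diffusion coefficient from the parabolic limit
print(msd, 2 * n * D * T)             # simulated MSD(T) vs. the prediction 2*n*D*T
```

The simulated value falls close to the prediction (the small deficit is the ballistic correction at early times, which is of order 1∕μ).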

2.4.3 Ellipticity of the Diffusion Tensor

The above limit construction leads to the diffusion-like equation (2.25), and the first question is under which conditions the operator ∇⋅ D∇ is uniformly elliptic, making (2.25) uniformly parabolic. We will see that here the condition (T3) and the corresponding constant μ 2 are important.

Lemma 5

Assume (T1)–(T4). The diffusion tensor D is uniformly elliptic, i.e.

$$\displaystyle{\exists \kappa > 0\,\text{such that}\,\varphi \cdot D\varphi \geq \kappa \vert \varphi \vert ^{2}\,\text{for all}\,\varphi \in \mathbb{R}^{n}.}$$

Proof

Let \(\varphi \in \mathbb{R}^{n}\) and compute

$$\displaystyle{\varphi \cdot D\varphi = - \frac{1} {\vert V \vert }\int _{V }(\varphi \cdot v)\mathcal{F}(\varphi \cdot v)dv.}$$

Since φ ⋅ v ∈ 〈1〉 ⊥ , we can apply \(\mathcal{F}\) i.e. there exists \(z = \mathcal{F}(\varphi \cdot v)\) and \(\mathcal{L}z =\varphi \cdot v\). Then

$$\displaystyle\begin{array}{rcl} \varphi \cdot D\varphi & =& - \frac{1} {\vert V \vert }\int _{V }\mathcal{L}z(v)z(v)dv {}\\ & \geq & \frac{\mu _{2}} {\vert V \vert }\|z(v)\|_{2}^{2}\mbox{ from our spectral result} {}\\ & =& \frac{\mu _{2}} {\vert V \vert }\int _{V }\left \vert \mathcal{F}\left ( \frac{\varphi } {\vert \varphi \vert }\cdot v\right )\right \vert ^{2}dv\vert \varphi \vert ^{2} {}\\ & \geq & c_{0} \frac{\mu _{2}} {\vert V \vert }\vert \varphi \vert ^{2} {}\\ \end{array}$$

with

$$\displaystyle{c_{0} =\min _{\vert \varphi \vert =1}\int _{V }\vert \mathcal{F}(\varphi \cdot v)\vert ^{2}dv > 0.}$$

Note that indeed c 0 > 0 since \(\|\mathcal{F}\|_{\langle 1\rangle ^{\perp }} > \frac{1} {2\mu }\). The integral \(\int _{V }\vert \mathcal{F}(\varphi \cdot v)\vert ^{2}dv\) does not depend on the choice of φ, since V is symmetric. ⊓ ⊔

Theorem 5 ([51, 12])

Assume (T1)–(T4). The differential operator ∇⋅ D∇ generates an analytic semigroup on \(L^{2}(\mathbb{R}^{n})\) . For \(p(0,.,v) \in L^{2}(\mathbb{R}^{n})\) and \(\bar{p}_{0}(x) =\int p(0,x,v)dv\) there exists a unique global solution \(\bar{p}(x,t)\) of

$$\displaystyle{\bar{p}_{t} = \nabla \cdot D\nabla \bar{p}}$$

with the following properties:

$$\displaystyle\begin{array}{rcl} (i)& \qquad & \bar{p} \in C([0,\infty ),L^{2}(\mathbb{R}^{n})) {}\\ (ii)& & \frac{\partial \bar{p}} {\partial t} \in C^{\infty }((0,\infty ) \times \mathbb{R}^{n}) {}\\ (iii)& & \|\bar{p}(.,t)\|_{\infty }\mbox{ is a decreasing function of }t. {}\\ \end{array}$$

Corollary 1 (Regularity, [57])

For each \(m \in \mathbb{N}\) and each 0 < ϑ < ∞ there exists a constant \(C_{0} = C_{0}(m,\vartheta,\|\bar{p}_{0}\|_{2})\) such that

$$\displaystyle{\|\bar{p}\|_{C^{m}((\vartheta,\infty )\times \mathbb{R}^{n})} \leq C_{0}.}$$

2.4.4 Graphical Representations of the Diffusion Tensor

There are two intuitive ways to graphically represent a diffusion tensor: ellipsoids and peanuts. Let D denote a three-dimensional diffusion tensor.

  1.

    The fundamental solution of the standard diffusion equation in \(\mathbb{R}^{n}\), i.e.

    $$\displaystyle{u_{t} = \nabla \cdot D\nabla u}$$

    is the multidimensional Gaussian distribution, which at each fixed time is of the form

    $$\displaystyle{G(x,\tilde{x}) = C\exp {\Bigl ( - (x -\tilde{ x})^{T}D^{-1}(x -\tilde{ x})\Bigr )}.}$$

    with an appropriate normalization constant C. This function describes the probability density of finding a random walker at a distance \(w = x -\tilde{ x}\) from a starting point \(\tilde{x}\). Hence the level sets of w T D −1 w describe locations of equal probability, which is the diffusion ellipsoid:

    $$\displaystyle{\mathcal{E}_{c}:=\{ w \in \mathbb{R}^{n}: w^{T}D^{-1}w = c\}}$$

    If the value of the constant is changed, we will obtain different ellipsoids, though all will be similar in the geometric sense. As such, the constant is often chosen to be equal to 1.

  2.

    The function from \(\mathbb{S}^{n-1} \rightarrow \mathbb{R}\) defined by \(\theta \mapsto \theta ^{\mathrm{T}}D\theta\) gives the apparent diffusion coefficient in direction θ, and hence also the mean squared displacement in that direction; its radial plot is called the peanut.

These objects are, in fact, not the same. While the probability level sets are ellipsoids, the apparent diffusion coefficient is typically peanut shaped, as can be seen for our examples in Fig. 2.8.

We chose examples of diffusion tensors in diagonal form. If they are not in diagonal form, then the ellipsoids or peanuts are rotated relative to the coordinate axes. The diffusion ellipsoid for a diagonal diffusion matrix D = diag(λ 1, λ 2, λ 3) is

$$\displaystyle{\mathcal{E}_{1} = \left \{w \in \mathbb{R}^{n}: \left (\frac{w_{1}} {\sqrt{\lambda _{1}}}\right )^{2} + \left (\frac{w_{2}} {\sqrt{\lambda _{2}}}\right )^{2} + \left (\frac{w_{3}} {\sqrt{\lambda _{3}}}\right )^{2} = 1\right \},}$$

which is clearly an ellipsoid. The peanut in this case is the map

$$\displaystyle{\theta \mapsto \lambda _{1}\theta _{1}^{2} +\lambda _{ 2}\theta _{2}^{2} +\lambda _{ 3}\theta _{3}^{2}.}$$

In Fig. 2.8 we consider

$$\displaystyle{D_{1}:= \left (\begin{array}{ccc} 5&0&0\\ 0 &3 &0 \\ 0&0&1 \end{array} \right ),\qquad D_{2} = \left (\begin{array}{ccc} 8&0& 0\\ 0 &1 & 0 \\ 0&0&0.2 \end{array} \right ).}$$
Fig. 2.8
figure 8

Left: diffusion ellipsoid. right: the corresponding peanut for the apparent diffusion in direction θ. Top row: example D 1, bottom row, example D 2
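The two representations for the example D₁ can be tabulated directly: the ellipsoid semi-axis along the eigenvector e_i is √λ_i, while the peanut value in that direction is λ_i itself, and off-axis directions mix the eigenvalues. A small numerical sketch (numpy assumed):

```python
import numpy as np

D1 = np.diag([5.0, 3.0, 1.0])

# diffusion ellipsoid {w : w^T D^{-1} w = 1}: semi-axis along e_i is sqrt(lambda_i)
semi_axes = np.sqrt(np.diag(D1))

def peanut(D, theta):
    """Apparent diffusion coefficient theta^T D theta for a unit direction theta."""
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)
    return float(theta @ D @ theta)

print(semi_axes)                               # ellipsoid semi-axes: sqrt(5), sqrt(3), 1
print([peanut(D1, e) for e in np.eye(3)])      # peanut along the axes: 5, 3, 1
print(peanut(D1, [1.0, 1.0, 0.0]))             # diagonal direction mixes eigenvalues: (5+3)/2
```

This makes the difference between the two pictures concrete: along the coordinate axes the peanut value is the square of the ellipsoid semi-axis.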

Having peanuts and ellipsoids, there is a nice way to visualize the condition of ellipticity of D.

Definition 4

D is uniformly elliptic, if there exists a constant κ > 0 such that

$$\displaystyle{ \theta ^{\mathrm{T}}D\theta \geq \kappa \vert \theta \vert ^{2}, }$$
(2.26)

for all vectors \(\theta \in \mathbb{R}^{n}\).

Lemma 6

  1.

    The diffusion tensor D is uniformly elliptic, iff the peanut of D contains a ball centered at the origin.

  2.

    The diffusion tensor D is uniformly elliptic, iff the ellipsoid of D contains a ball centered at the origin.

Proof

Let us consider the peanut case first. The map θ → κ | θ | 2 can be written as \(\theta \rightarrow \kappa \theta ^{T}\mathbb{I}\theta\) with the identity matrix \(\mathbb{I}\). Hence it is also a peanut. A very special peanut, in fact, since it is a ball of radius κ. Then condition (2.26) says that the peanut of D contains the peanut of \(\kappa \mathbb{I}\).

Related to the diffusion ellipsoid, we need to work a little more.

  • “⇒” Assume D is uniformly elliptic, and consider v with v T D −1 v = 1. Without restriction, we can study the level set of level 1. We claim:

  • Claim 1: \(\vert v\vert \geq \sqrt{\kappa }\).

  • To prove Claim 1. we need to show two more statements:

  • Claim 2: inf|ϕ|=1 ∥D ϕ∥≥κ.

  • Assume Claim 2 is not true. Then there exists ϕ 0 with | ϕ 0 |  = 1 such that ∥ D ϕ 0 ∥  < κ. However,

    $$\displaystyle{\kappa =\kappa \vert \phi _{0}\vert ^{2} \leq \phi _{ 0}D\phi _{0} \leq \vert \phi _{0}\vert \|D\phi _{0}\| <\kappa,}$$

    which is a contradiction. Hence Claim 2 is true.

  • Claim 3: \(\|D^{-1}\|_{op} \leq \frac{1} {\kappa }\).

  • Claim 2 implies that κ ∥ ϕ ∥ ≤ ∥ D ϕ ∥ , for all \(\phi \in \mathbb{R}^{n}\). Let z: = D ϕ, such that ϕ = D −1 z. Then

    $$\displaystyle{\kappa \|D^{-1}z\| \leq \| z\|\quad \Longrightarrow\quad \frac{\|D^{-1}z\|} {\vert z\vert } \leq \frac{1} {\kappa }.}$$

    Hence Claim 3 is true.

  • Finally, to prove Claim 1 we estimate:

    $$\displaystyle{1 = v^{T}D^{-1}v \leq \| D^{-1}\|_{ op}\|v\|^{2} \leq \frac{1} {\kappa } \vert v\vert ^{2}}$$

    Hence \(\vert v\vert \geq \sqrt{\kappa }\) and the ellipsoid \(\mathcal{E}_{1}\) contains a ball of radius \(\sqrt{\kappa }\).

  • “⇐” If the ellipsoid contains a ball of radius r, then it is non-degenerate and has n principal axes e i , with semi-axis lengths α i , i = 1,…,n. These can be arranged such that 0 < r ≤ α 1 ≤ α 2 ≤ ⋯ ≤ α n . The principal axis vectors are eigenvectors or generalized eigenvectors of D −1 with eigenvalues α i −2 , i = 1,…,n. Then D has the same eigenvectors and generalized eigenvectors with eigenvalues λ i  = α i 2 , i = 1,…,n. Then θ T D θ ≥ κ | θ | 2 for

    $$\displaystyle{\kappa:=\min \left \{\alpha _{i}^{2},\,i = 1,\ldots,n\right \} =\alpha _{ 1}^{2} \geq r^{2} > 0.}$$

⊓⊔

We show an illustration for the case of example D 1 in Fig. 2.9.

Fig. 2.9
figure 9

Left: the peanut of D 1 contains a ball. Right: the ellipsoid of case 1 contains a ball of radius 1

Fig. 2.10
figure 10

(a ): frequency plot of the population after 2000 iterations of a random walk. Notice that the highest number of individuals are found near the starting point (x = 0). (b ): plot showing the average frequency plot over 1000 runs of a population after 2000 iterations (solid line), and the corresponding Gaussian curve (dotted line)

2.4.4.1 An Anisotropic Random Walk

In order to get an approximation of the overall behaviour of a population in an anisotropic random walk, we consider an individual-based model in which each individual performs a random walk. We then show a frequency plot of where the individuals end up, and this frequency plot closely matches the solution of the corresponding diffusion model.

2.4.4.1.1 1-Dimensional Simulations

As a first simulation, we can perform a random walk in one dimension. It is important to note that in one dimension there is no such thing as anisotropy, since there is only one direction along which particles can diffuse. Additionally, the one-dimensional diffusion equation does not permit a diffusion tensor, and instead can only have a diffusion coefficient. For the one-dimensional random walk, we begin with 1000 particles at the origin. At each time step, a random decision is made with equal probability of moving to the left, moving to the right, or staying in place. Figure 2.10a shows the results of such a simulation after 2000 time steps. There is a higher concentration of individuals near the starting point (x = 0); however, for a single simulation the results are very noisy. As such, it is better to consider an average frequency plot over many runs. Figure 2.10b shows the frequency plot that results when the average frequency over 1000 runs is computed (the solid line in the plot). We then use the variance of the points in this distribution to plot the corresponding Gaussian curve (the dotted line). It is clear from this that the distribution of particles after a one-dimensional random walk very closely approximates a Gaussian distribution.
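A simulation in this spirit takes only a few lines. The sketch below (pure Python; particle and step counts as in the text) checks the variance of the final positions against the theoretical value: each step has variance E[X²] = 2∕3, so after 2000 steps the population variance should be near 2000 · 2∕3 ≈ 1333:

```python
import random
from statistics import pvariance

rng = random.Random(0)
N, steps = 1000, 2000

# each particle moves -1, 0, or +1 per time step with equal probability
final = [sum(rng.choice((-1, 0, 1)) for _ in range(steps)) for _ in range(N)]

mean = sum(final) / N
var = pvariance(final)
print(mean, var)     # mean near 0; variance near 2000 * 2/3
```

Binning `final` into a histogram and overlaying a Gaussian with this mean and variance reproduces Fig. 2.10b.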

2.4.4.1.2 2-Dimensional Simulations

As discussed above, when we are in higher dimensions, we can replace our diffusion coefficient with a matrix. When this matrix is a constant multiple of the identity, particles are spreading out with equal rates in all directions, corresponding to isotropic diffusion. When this matrix is not a constant multiple of the identity, particles spread at different rates in different directions. This is anisotropic diffusion. To get a better idea about what anisotropic diffusion looks like, we can compare the results of an isotropic random walk to those of an anisotropic random walk, both in two dimensions.

For an isotropic random walk in two dimensions, we started 1000 individuals at the origin and again let them spread out following some given set of rules. To determine each individual’s next step, a random angle was generated and a constant step size was assigned. The distribution of these individuals after 100 time steps is shown in the left column of the top row of Fig. 2.11. We see that the distribution looks approximately circular with the highest concentration found where the particles began [at (0, 0)]. A frequency plot of this data would be noisy, just as in one dimension, so for the frequency plot we considered the average over 150 runs. The result is shown in the right hand column of the top row of Fig. 2.11. This frequency plot shows a roughly Gaussian distribution, as was seen in one dimension.

Simulation of an anisotropic random walk in two dimensions followed a similar procedure with one notable difference. Instead of choosing a fixed step size, individuals could move further in certain directions. We began with 1000 individuals at the origin, just as we did in the isotropic case, and once again allowed 100 iterations. A random angle was generated for each individual, however the step size depended on this angle. A dominant direction was chosen along which individuals could move further, corresponding to the dominant eigenvector of the diffusion tensor. The step size was then determined by the diffusion ellipsoid. In this case, the dominant direction was chosen to be the positive and negative y-axis. Not surprisingly then, the resulting distribution showed an ellipsoidal shape with the highest concentration found at the origin. More spread occurred in the chosen dominant direction. Such a distribution can be seen in the first column of the second row of Fig. 2.11. The frequency plot for the average distribution over 150 runs is shown in the right hand column of the second row of Fig. 2.11.
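The anisotropic walk just described can be sketched as follows (pure Python; the eigenvalues and the ellipsoid-based step rule are illustrative assumptions, with the dominant direction along the y-axis):

```python
import math
import random

rng = random.Random(0)
lam_x, lam_y = 1.0, 4.0        # assumed eigenvalues; dominant spread along y
N, steps = 1000, 100

def step_length(phi):
    """Radius of the diffusion ellipsoid {w : w^T D^{-1} w = 1} in direction phi."""
    c, s = math.cos(phi), math.sin(phi)
    return 1.0 / math.sqrt(c * c / lam_x + s * s / lam_y)

positions = []
for _ in range(N):
    x = y = 0.0
    for _ in range(steps):
        phi = rng.uniform(0.0, 2.0 * math.pi)   # random angle; step set by the ellipsoid
        r = step_length(phi)
        x += r * math.cos(phi)
        y += r * math.sin(phi)
    positions.append((x, y))

var_x = sum(x * x for x, _ in positions) / N
var_y = sum(y * y for _, y in positions) / N
print(var_x, var_y)    # spread is visibly larger along the dominant y-direction
```

A scatter plot of `positions` gives the stretched, ellipsoidal cloud of the bottom row of Fig. 2.11.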

Fig. 2.11
figure 11

Top row: (a ). Plot showing the distribution of 1000 particles after 100 iterations of an isotropic random walk. (b ). Frequency plot showing the average frequency over 150 runs of the isotropic random walk as described in part (a ). Bottom row: (a ). Plot showing the distribution of 1000 particles after 100 iterations of an anisotropic random walk. (b ). Frequency plot showing the average frequency over 150 runs of the anisotropic random walk as described in part (a )

An exercise such as this allows us to visualize anisotropic diffusion. When individuals have a preference for diffusing in a given direction, we end up with a distribution that is “stretched” in that direction. For biological situations, such as cancer cells travelling along white matter tracts in the brain, this means we would expect to see the cells spread more along this particular direction than in perpendicular directions. This results in the irregular shapes of spread commonly seen in cancer models. As such, anisotropic diffusion tensors provide a valuable tool for building more accurate and useful models.

2.4.5 Anisotropic vs. Isotropic Diffusion

Now, depending on the form of the diffusion tensor D, we can obtain either anisotropic or isotropic diffusion. As mentioned before, we call the diffusion isotropic if \(D =\alpha \mathbb{I}\) for some α > 0; otherwise the diffusion is called anisotropic. For isotropic diffusion the rate of spread is the same in all directions, and the resulting distributions are spherical in nature. Anisotropic diffusion occurs when the rate of diffusion varies with direction. This arises in many biological problems where animals have certain preferred directions of motion. The rates of spread in these directions are effectively higher, and the resulting distributions are ellipsoidal in nature, aligned with the dominant direction of spread.

In this section we will derive criteria on the turning kernel T and on the turning operator \(\mathcal{L}\) that ensure that the corresponding parabolic limit is isotropic. For this we introduce the expected velocity

$$\displaystyle{ \bar{v}(v):=\int _{V }T(v,v')v'dv' }$$
(2.27)

For the Pearson walk, with \(V = s\mathbb{S}^{n-1}\), and \(T(v,v') = \frac{1} {\vert V \vert }\) we find an expected velocity of

$$\displaystyle{\bar{v}(v) =\int _{V } \frac{1} {\vert V \vert }v'dv' = 0.}$$

More generally, if T has the form T(v), then \(\bar{v}(v) = 0\) as well.

Also, if we integrate the expected velocity, then we get zero by condition (T1):

$$\displaystyle{\int _{V }\bar{v}(v)dv =\int _{V }\int _{V }T(v,v')v'dv'dv = 0.}$$

To decide if the diffusion limit is isotropic or anisotropic we compare three statements:

(St1) :

There exists an orthonormal basis \(\{e_{1},\ldots,e_{n}\} \subset \mathbb{R}^{n}\) such that the coordinate mappings \(\phi _{i}: V \rightarrow \mathbb{R},\phi _{i}(v) = v_{i}\) are eigenfunctions of \(\mathcal{L}\) with common eigenvalue λ ∈ (−2μ, 0), for all i = 1,…,n.

(St2) :

The expected velocity is parallel to v, i.e.

$$\displaystyle{\bar{v}(v)\,\|\;v\quad \mbox{ and}\quad \gamma:= \frac{\bar{v}(v) \cdot v} {\vert v\vert ^{2}} }$$

is the adjoint persistence with γ ∈ (−1, 1).

(St3) :

There exists a diffusion coefficient d > 0 such that \(D = d\mathbb{I}\) (isotropic).

Theorem 6

Assume (T1)–(T4) and that V is symmetric w.r.t. SO(n), where SO(n) is the special orthogonal group of size n. Then we have the implications

$$\displaystyle{(St1)\quad \Leftrightarrow \quad (St2)\quad \Rightarrow (St3).}$$

The constants λ,γ,d are related as

$$\displaystyle{\gamma = \frac{\lambda +\mu } {\mu },\quad d = -\frac{K_{V }} {\vert V \vert \lambda } = \frac{K_{V }} {\vert V \vert \mu (1-\gamma )},}$$

where K V is given by

$$\displaystyle{\int vv^{T}dv = K_{ V }\mathbb{I}.}$$

Moreover, if there is a matrix M such that \(\bar{v}(v) = Mv\) for all v ∈ V then all three statements are equivalent.

Proof

(St1) ⇔ (St2):

$$\displaystyle\begin{array}{rcl} \mbox{ (St1)}& \Leftrightarrow & \mathcal{L}v_{i} =\lambda v_{i},\qquad \forall i {}\\ & \Leftrightarrow & -\mu v_{i} +\mu (\bar{v}(v))_{i} =\lambda v_{i} {}\\ & \Leftrightarrow & (\bar{v}(v))_{i} =\gamma v_{i},\qquad \gamma = \frac{\lambda +\mu } {\mu } {}\\ & \Leftrightarrow & \mbox{ (St2)} {}\\ \end{array}$$

(St1) ⇒ (St3): The coordinate mappings ϕ i are eigenfunctions of \(\mathcal{L}\) and ϕ i  ∈ 〈1〉 ⊥ . Hence ϕ i are also eigenfunctions of \(\mathcal{F}\) with eigenvalue λ −1 for each i = 1,…,n. Then

$$\displaystyle\begin{array}{rcl} e_{k}De_{j}& =& - \frac{1} {\vert V \vert }\int _{V }v_{k}\mathcal{F}v_{j}dv {}\\ & =& - \frac{1} {\vert V \vert }\frac{1} {\lambda } \int _{V }v_{k}v_{j}dv {}\\ & =& -\frac{K_{V }} {\vert V \vert \lambda }\delta _{kj} {}\\ \end{array}$$

(St3) ⇒ (St1) see Hillen and Othmer [29] ⊓ ⊔

2.4.5.1 Examples

Example 1, Pearson Walk As seen earlier, for the Pearson walk we have \(\bar{v}(v) = 0\) and consequently also γ = 0. Still, statement (St2) is true and we find isotropic diffusion with diffusion coefficient

$$\displaystyle{d = \frac{K_{V }} {\vert V \vert \mu } = \frac{s^{2}} {n\mu }.}$$

Example 2, Symmetric T We again assume \(V = s\mathbb{S}^{n-1}\), but now T is symmetric, of the form T(v, v′) = t( | v − v′ | ). The expected velocity is

$$\displaystyle{\bar{v}(v) =\int _{V }T(v,v')v'dv' =\int _{V }t(\vert v - v'\vert )v'dv',}$$

which is not entirely trivial to compute. To do this, we consider a given v ∈ V. Since \(V = s\mathbb{S}^{n-1}\) is a sphere of radius s, the level sets

$$\displaystyle{\varGamma _{a}:=\{ v' \in V: \vert v - v'\vert = a\}}$$

are circles on \(s\mathbb{S}^{n-1}\) surrounding v, for a ∈ (−1, 1). Then on Γ a we have t( | v − v′ | ) = t(a). Then we can split our integral

$$\displaystyle\begin{array}{rcl} \int _{V }t(\vert v - v'\vert )v'dv'& =& \int _{-1}^{1}\int _{ \varGamma _{a}}t(\vert v - v'\vert )v'dv'da {}\\ & =& \int _{-1}^{1}t(a)\int _{\varGamma _{ a}}v'dv'da {}\\ & =& \int _{-1}^{1}t(a)da\;c_{ 1}\;v {}\\ & =& c_{2}\;v {}\\ \end{array}$$

where we use the fact that the symmetric integral \(\int _{\varGamma _{ a}}v'dv'\) is in direction v and c 1, c 2 are appropriate constants (note c 1 can be negative). Hence \(\bar{v}(v)\) is parallel to v, and statement (St2) holds. Hence the diffusion limit is isotropic.
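The parallelism of \(\bar{v}(v)\) and v for such symmetric kernels can be verified numerically in two dimensions. A sketch (pure Python; the kernel t(a) = e^{−a}, normalized so that the kernel integrates to one as in (T1), is an assumed example):

```python
import math

s, M = 1.0, 2000
grid = [2.0 * math.pi * k / M for k in range(M)]   # discretization of V = s*S^1

def t_kernel(a):
    # any kernel depending only on the distance |v - v'|; an assumed example
    return math.exp(-a)

def expected_velocity(angle):
    """Discretized v_bar(v): the integral of T(v,v') v' dv', with T normalized
    so that the kernel sums to one over v' (mimicking condition (T1))."""
    vx, vy = s * math.cos(angle), s * math.sin(angle)
    ex = ey = norm = 0.0
    for a in grid:
        wx, wy = s * math.cos(a), s * math.sin(a)
        w = t_kernel(math.hypot(vx - wx, vy - wy))
        ex += w * wx
        ey += w * wy
        norm += w
    return ex / norm, ey / norm

angle = 0.7
bx, by = expected_velocity(angle)
cross = bx * math.sin(angle) - by * math.cos(angle)   # component perpendicular to v
print(bx, by, cross)    # cross vanishes: v_bar(v) is parallel to v
```

The perpendicular component vanishes up to discretization error, and the parallel component is positive here, since this kernel favours directions near v (persistence γ > 0).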

Example 3, Nonisotropic For this example, we will consider a constant kernel T, perturbed by a second order correction term

$$\displaystyle{T(v,v') = \frac{1} {\vert V \vert } + v^{\mathrm{T}}\mathcal{M}v',\quad \mbox{ with }\quad \mathcal{M}\in \mathbb{R}^{n\times n}\ \mbox{ and }\ V = s\mathbb{S}^{n-1}.}$$

Then we have

$$\displaystyle{D = \frac{s^{2}} {n\mu } \left (\mathbb{I} + \frac{\vert V \vert s^{2}} {n} \mathcal{M}\left (\mathbb{I} -\frac{\vert V \vert s^{2}} {n} \mathcal{M}\right )^{-1}\right ),}$$

which is non- isotropic (see details in [29]).

Example 4, Chemotaxis

For our last example, we will define T to be

$$\displaystyle{T(v,v') = \frac{1} {\vert V \vert } +\varepsilon Q(v,v',S)\nabla S}$$

which, as we will derive in the next section, gives a chemotaxis model with

$$\displaystyle{D = \frac{s^{2}} {n\mu } \mathbb{I}\quad \mbox{ and}\quad \chi (S) = \frac{1} {\vert V \vert }\int _{V }\int _{V }vQ(v,v',S)dv'dv.}$$

For many more examples, see [47].

2.4.6 Chemotaxis

In the case of chemotaxis, the turning rate and the turning kernel might depend on the signal S(x, t). We study these as perturbations (see [47]). Note that we cannot use v for the signal concentration, since it is used for the velocities. Hence here we use S.

$$\displaystyle{T(v,v',S(\cdot )) = T_{0}(v,v') +\varepsilon ^{k}T_{ 1}(v,v',S(\cdot )),}$$
$$\displaystyle{\mu (v,S(\cdot )) =\mu _{0} +\varepsilon ^{\ell}\mu _{1}(v,S(\cdot )),}$$

and study the four pairwise combinations for k, ℓ = 0, 1. We assume that T 0 satisfies (T1)–(T4), and that for T 1 we have

$$\displaystyle{T_{1} \in L^{2},\qquad \int _{ V }T_{1}(v,v',S(\cdot ))dv = 0,}$$
$$\displaystyle{\vert T_{1}(v,v',S)\vert \leq T_{0}(v,v',S).}$$

Consider then the example generated when

$$\displaystyle{T(v,v',S(\cdot )) = T_{0}(v,v') +\varepsilon \alpha (S)(v \cdot \nabla S)}$$

which says that it is more likely to choose a new direction aligned with ∇S. Then

$$\displaystyle\begin{array}{rcl} \mathcal{L}\varphi (v)& =& -\mu \varphi (v) +\mu \int _{V }T(v,v')\varphi (v')dv' +\varepsilon \mu \alpha (S)\int _{V }(v \cdot \nabla S)\varphi (v')dv' {}\\ & =& \mathcal{L}_{0}\varphi (v) +\varepsilon \mu \alpha (S)(v \cdot \nabla S)\bar{\varphi }(x,t), {}\\ & =& \mathcal{L}_{0}\varphi +\varepsilon \mathcal{L}_{1}\varphi {}\\ & & {}\\ \end{array}$$

where \(\bar{\varphi }=\int _{V }\varphi dv\), and \(\mathcal{L}_{1}\varphi =\mu \alpha (S)(v \cdot \nabla S)\bar{\varphi }(x,t).\) Because of the perturbed structure of the right hand side, we cannot directly apply the theory from above. Instead, we again compare orders of ɛ. The scaled transport equation is now

$$\displaystyle{\varepsilon ^{2}p_{ t} +\varepsilon v \cdot \nabla p = \mathcal{L}_{0}p +\varepsilon \mathcal{L}_{1}p}$$

Order ɛ 0

$$\displaystyle{0 = \mathcal{L}_{0}p_{0} \Rightarrow p_{0} = p_{0}(x,t)}$$

Order ɛ 1

$$\displaystyle{v \cdot \nabla p_{0} = \mathcal{L}_{0}p_{1} + \mathcal{L}_{1}p_{0}}$$

which is equivalent with

$$\displaystyle{v \cdot \nabla p_{0}(x,t) -\mu \alpha (S)(v \cdot \nabla S)\bar{p}_{0} = \mathcal{L}_{0}p_{1}.}$$

Since \(\bar{p}_{0} =\int _{V }p_{0}(x,t)dv = \vert V \vert p_{0}\) we can write this as

$$\displaystyle{v \cdot \nabla p_{0} -\mu \alpha (S)(v \cdot \nabla S)\vert V \vert p_{0} = \mathcal{L}_{0}p_{1}.}$$

To solve for p 1, we need to check that the left hand side is in the correct space so that we may invert our operator. We thus check the solvability condition

$$\displaystyle{\int _{V }vdv \cdot \nabla p_{0} -\mu \alpha (S)\vert V \vert \int _{V }vdv \cdot \nabla Sp_{0} = 0,}$$

which is true due to the symmetry of V. Then

$$\displaystyle{p_{1} = \mathcal{F}_{0}{\Bigl (v \cdot \nabla p_{0} -\mu \vert V \vert \alpha (S)(v \cdot \nabla S)p_{0}\Bigr )},}$$

where \(\mathcal{F}_{0}\) is the pseudo inverse of the unperturbed part \(\mathcal{L}_{0}\).

\(\varepsilon ^{2}\):

$$\displaystyle{p_{0}{}_{t} + v \cdot \nabla p_{1} = \mathcal{L}_{0}p_{2} +\mu \alpha (S)(v \cdot \nabla S)\bar{p}_{1}}$$

We integrate this last equation to obtain

$$\displaystyle\begin{array}{rcl} & & \vert V \vert p_{0}{}_{t} +\int _{V }v \cdot \nabla \mathcal{F}_{0}\left (v \cdot \nabla p_{0} -\mu \vert V \vert \alpha (S)(v \cdot \nabla S)p_{0}\right )dv {}\\ & =& 0 +\mu \alpha (S)\mathop{\underbrace{\int _{V }v \cdot \nabla Sdv}}\limits _{=0}\bar{p}_{1}. {}\\ \end{array}$$

Hence

$$\displaystyle{\vert V \vert p_{0}{}_{t} + \nabla \cdot \int _{V }v\mathcal{F}_{0}vdv \cdot \nabla p_{0} -\mu \vert V \vert \nabla \cdot \int _{V }v\mathcal{F}_{0}vdv \cdot \alpha (S)\nabla Sp_{0} = 0.}$$

We arrive at a (possibly anisotropic) chemotaxis equation

$$\displaystyle{p_{0}{}_{t} = \nabla \left (D\nabla p_{0} -\mu \vert V \vert \alpha (S)p_{0}D\nabla S\right )}$$

where

$$\displaystyle{D = - \frac{1} {\vert V \vert }\int _{V }v\mathcal{F}_{0}vdv.}$$

Notice that the diffusion tensor D appears in both terms. This means that the chemotaxis term carries the same anisotropy as the diffusion term, as it should: the cells move in a given (possibly anisotropic) environment, and both movement terms should be affected by this anisotropy.

Finally, if we consider the Pearson walk with \(T_{0}(v,v') = \frac{1} {\vert V \vert }\), for which \(D = \frac{s^{2}} {n\mu } \mathbb{I}\), then we obtain the classical (isotropic) chemotaxis model

$$\displaystyle{p_{0}{}_{t} = \nabla (d\nabla p_{0} -\chi (S)p_{0}\nabla S)}$$

with \(d = \frac{s^{2}} {n\mu }\) and \(\chi (S) = \frac{\vert V \vert \alpha (S)s^{2}} {n}\).

2.4.6.1 Other Cases

We considered an order-ɛ perturbation of T in detail in the previous section. One can also consider order-one perturbations, as well as perturbations of μ. Doing so leads to technical challenges that we skip in this manuscript; for details we refer to [47]. Here we simply list the corresponding examples.

Examples

  1.

    In case of bacterial movement, bacteria tend to turn more often if they move down a gradient and less often if they move up a gradient. This can be expressed through a perturbed turning rate

    $$\displaystyle{ \mu (S) =\mu _{0}(1 -\varepsilon b(S)(v \cdot \nabla S)). }$$
    (2.28)

    If we combine this with the Pearson walk for T = 1∕ | V | , then we obtain a chemotaxis model

    $$\displaystyle{p_{0,t} = \nabla (d\nabla p_{0} -\chi (S)p_{0}\nabla S),}$$

    with

    $$\displaystyle{\chi (S) = \frac{s^{2}} {n} b(S).}$$

    The function b(S) describes the signal sensing mechanism of the cells. Here we see how this term enters the chemotaxis model.

  2.

    Amoebae are able to modify their turning rate as well as actively choose a favourable direction. This can be modelled by using a perturbed turning rate as in (2.28) together with a perturbed turning kernel as above. In a special case we consider

    $$\displaystyle{T(v,v',S) = \frac{1} {\vert V \vert }{\Bigl (1 +\varepsilon a(S)(v \cdot \nabla S)\Bigr )}.}$$

    Then we obtain a chemotaxis model with chemotactic sensitivity

    $$\displaystyle{\chi (S) = \frac{s^{2}} {n} (a(S) + b(S)),}$$

    hence both effects combine in a linear way.

  3.

    If myxobacteria encounter a stream of myxobacteria moving in a given direction b, then they turn into that direction as well. This can be expressed through a special kernel

    $$\displaystyle{T(v,v') =\kappa (v \cdot b)(v' \cdot b).}$$

    In addition we consider the perturbed turning rate (2.28). The parabolic limit is of chemotaxis form

    $$\displaystyle{p_{0,t} = \nabla (D\nabla p_{0} - V (\,p_{0},S)\nabla S)}$$

    with nonisotropic diffusion

    $$\displaystyle{D = \frac{s^{2}} {\mu _{0}n}\left (\mathbb{I} + \frac{\vert V \vert s^{2}} {n} \kappa bb^{T}\left (\mathbb{I} -\frac{\vert V \vert s^{2}} {n} \kappa bb^{T}\right )^{-1}\right ).}$$

    Unfortunately, we have not been able to give a biological interpretation of this diffusion tensor.

  4.

    It is also possible to include volume constraints into the transport equation framework. For example choosing

    $$\displaystyle{\mu (S) =\mu _{0}{\Bigl (1 -\varepsilon b(S)(v \cdot \nabla S)\beta (\int pdv)\Bigr )},}$$

    where β is a decreasing function. Then

    $$\displaystyle{p_{0,t} = \nabla (d\nabla p_{0} - p_{0}\beta (\,p_{0})\chi (S)\nabla S),}$$

    which is the volume filling chemotaxis model as introduced by Hillen and Painter [28].

2.4.7 Persistence

An important biological quantity is the persistence. It indicates the tendency of particles to keep their heading when they turn. A particle which never changes direction, i.e. performs ballistic motion, has persistence 1, while a Brownian particle has persistence 0. Persistence is easily defined in the context of transport models. Consider a given incoming velocity v′. Then the expected outgoing velocity is

$$\displaystyle{\hat{v}(v'):=\int _{V }T(v,v')vdv}$$

and the average outgoing speed is

$$\displaystyle{\hat{s}:=\int _{V }T(v,v')\|v\|dv.}$$

The index of persistence ψ α is defined as

$$\displaystyle{\psi _{\alpha }(v') = \frac{\hat{v} \cdot v'} {\hat{s}s'} \mathrm{where}\;s' =\| v'\|.}$$

Hence the parameter γ, which we introduced in Theorem 6, is the persistence of the adjoint turning operator, or the adjoint persistence.

Exercise

It is an interesting exercise to find out under which conditions γ = ψ α holds. This is certainly true for a symmetric kernel, but is it also true for normal kernels?

2.4.7.1 Example

Assume that turning depends only on the relative angle

$$\displaystyle{\theta:=\mathrm{ arccos}\left (\frac{v \cdot v'} {\|v\|\|v'\|} \right ).}$$

Then T(v, v′) = h(θ(v, v′)) = h(θ − θ′), with h(−θ) = h(θ). For example, in 2-dimensions, with s = 1, we have \(v = \left ({ \cos \theta \atop \sin \theta } \right )\) and for normalization we need

$$\displaystyle{\int _{V }T(v,v')dv =\int _{ 0}^{2\pi }h(\theta -\theta ')d\theta = 1.}$$

This is equivalent to

$$\displaystyle{\int _{-\theta '}^{2\pi -\theta '}h(\alpha )d\alpha = 2\int _{ 0}^{\pi }h(\alpha )d\alpha = 1.}$$

The expected outgoing velocity is

$$\displaystyle\begin{array}{rcl} \hat{v}(\theta ')& =& \int h(\theta -\theta ')\left ({ \cos \theta \atop \sin \theta } \right )d\theta,\mbox{ and with }\alpha:=\theta -\theta ' {}\\ & =& \int h(\alpha )\left ({ \cos (\alpha +\theta ') \atop \sin (\alpha +\theta ')} \right )d\alpha {}\\ & =& \int h(\alpha )\left ({ \cos \alpha \cos \theta ' -\sin \alpha \sin \theta ' \atop \sin \alpha \cos \theta ' +\cos \alpha \sin \theta '} \right )d\alpha {}\\ & =& \left ({ \cos \theta '\int h(\alpha )\cos \alpha d\alpha -\sin \theta '\int \mathop{\underbrace{h(\alpha )\sin \alpha }}\limits _{=0}d\alpha \atop \cos \theta '\int \mathop{\underbrace{h(\alpha )\sin \alpha }}\limits _{=0}d\alpha +\sin \theta '\int h(\alpha )\cos \alpha d\alpha } \right ) {}\\ & =& \int h(\alpha )\cos \alpha d\alpha \left ({ \cos \theta ' \atop \sin \theta '} \right ). {}\\ \end{array}$$

Then the persistence is given as

$$\displaystyle\begin{array}{rcl} \psi _{\alpha }& =& \hat{v}(\theta ') \cdot v' =\int h(\alpha )\cos \alpha d\alpha \,(\cos \theta ',\;\sin \theta ')\left ({ \cos \theta ' \atop \sin \theta '} \right ) {}\\ & =& \int h(\alpha )\cos \alpha d\alpha, {}\\ \end{array}$$

where we can see why the persistence is sometimes called the mean cosine.
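The computation above can be verified numerically. The sketch below uses a hypothetical von Mises-type kernel \(h(\alpha ) = e^{\kappa \cos \alpha }/(2\pi I_{0}(\kappa ))\) (even and normalized, with an assumed concentration κ = 2), evaluates \(\hat{v}(\theta ')\) by quadrature, and compares it with the predicted multiple of (cos θ′, sin θ′), i.e. the mean cosine.

```python
import numpy as np

kappa = 2.0                       # hypothetical concentration parameter
N = 400
alpha = 2 * np.pi * np.arange(N) / N
dalpha = 2 * np.pi / N

# even, normalized turning-angle kernel: h(a) = exp(kappa cos a) / (2 pi I_0(kappa))
h = np.exp(kappa * np.cos(alpha)) / (2 * np.pi * np.i0(kappa))

theta_p = 0.7                     # arbitrary incoming direction theta'

# expected outgoing velocity: v_hat = int h(theta - theta') (cos theta, sin theta) dtheta
h_shift = np.exp(kappa * np.cos(alpha - theta_p)) / (2 * np.pi * np.i0(kappa))
v_hat = dalpha * np.array([np.sum(h_shift * np.cos(alpha)),
                           np.sum(h_shift * np.sin(alpha))])

# predicted: (int h(a) cos a da) * (cos theta', sin theta') -- the mean cosine
psi = np.sum(h * np.cos(alpha)) * dalpha
print(v_hat)
print(psi * np.array([np.cos(theta_p), np.sin(theta_p)]))
```

The equispaced quadrature is spectrally accurate for periodic integrands, so the two vectors agree to machine precision, and 0 < ψ < 1 as expected for a forward-biased kernel.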

It is similar in 3-dimensions, where we normalize as:

$$\displaystyle{2\pi \int _{0}^{\pi }h(\theta )\sin \theta d\theta = 1}$$

and the persistence turns out to be (we skip the details):

$$\displaystyle{\psi _{\alpha } = 2\pi \int _{0}^{\pi }h(\theta )\cos \theta \sin \theta d\theta.}$$

Again this is a mean cosine, now using the correct θ-component of the two-dimensional surface element in 3-D: sin θ dθ.

Persistence indices are easy to measure based on the above formulas: one follows individual particle tracks and computes the mean cosine over all turns. It has been found that the slime mold Dictyostelium discoideum has a persistence of about ψ α  = 0.7, whereas the persistence of E. coli bacteria is about ψ α  = 0.33.
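The track-based measurement can be sketched as follows: simulate a correlated random walk with unit speed and normally distributed turning angles (hypothetical spread σ = 0.5, chosen for illustration), then estimate the persistence from the recorded positions alone as the mean cosine between consecutive steps. For Gaussian turns the expected value is \(E[\cos \varDelta \theta ] = e^{-\sigma ^{2}/2}\).

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 0.5, 20000             # hypothetical turning-angle spread, track length

# simulate a unit-speed correlated random walk in 2-D
turns = rng.normal(0.0, sigma, n)        # turning angle at each step
headings = np.cumsum(turns)
steps = np.stack([np.cos(headings), np.sin(headings)], axis=1)
pos = np.vstack([np.zeros(2), np.cumsum(steps, axis=0)])  # recorded positions

# measurement: recover velocities from positions, mean cosine over all turns
vel = np.diff(pos, axis=0)                      # unit step vectors
cosines = np.sum(vel[:-1] * vel[1:], axis=1)    # cos of each turning angle
psi_est = cosines.mean()

print(psi_est)                    # close to exp(-sigma**2 / 2)
```

With σ = 0.5 the estimate lies near e^{−0.125} ≈ 0.88; larger σ drives the estimate toward the Brownian limit 0.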

2.4.8 Summary and Conclusions

In this section we considered the parabolic limit of transport equations in the case of a constant equilibrium distribution. The general conditions (T1)–(T4) allowed us to develop a full theory, including a classification into isotropic and anisotropic diffusion and including standard chemotaxis models. However, some important examples such as T(v, v′) = q(v) are not included, and the question of what to do with these cases remains. We recommend the following original publications for extended theory and applications to glioma growth and wolf movement.

2.5 Further Reading for Transport Equations in Oriented Habitats

Transport equations for movement in oriented habitats fall outside the theory developed here. In fact, as we have seen in the example of Sect. 2.3.4.2, these models do not satisfy condition (T4), and condition (T3) is also problematic. Hence the mathematical framework needs to be changed accordingly. The key is a splitting of the space L 2(V ) into the kernel of L and its orthogonal complement. We cannot develop this theory here, hence we refer to the pertinent literature for further reading on theory and applications. While these applications are not discussed here, the material of this chapter should provide the reader with the tools needed to understand the further readings.

  • In [27, 49] we introduced a transport equation model for the migratory movement of mesenchymal cells in collagen tissues. Careful modelling and simulations revealed effects such as network formation and finger-like invasions of cancer metastases.

  • In [31] the theory is formally extended to the case of oriented habitats. We not only consider the parabolic scaling, but we also discuss alternative scaling methods such as the hyperbolic scaling and the moment closure method, and we discuss their relations. One example in [31] is an application to wolf movement along seismic lines.

  • In [50, 13] the transport equation framework is developed for application to brain tumor spread (glioma, glioblastoma multiforme). We make extensive use of a new magnetic resonance imaging technique called diffusion tensor imaging (DTI). The nonisotropic diffusion limit of a transport model allows us to include DTI data in a glioma spread model. The above manuscripts contain the modelling details, and we are currently working on model validation with clinical data.

  • In [34] we develop a fully measure-valued solution theory for transport equations. Measure-valued solutions arise naturally in highly aligned tissues, where the classical L 1 or L 2 theories for transport equations are no longer sufficient. We were able to use this framework to identify non-classical pointwise steady states, which explain observed network structures.

  • In [58] we give a full analysis of the one-dimensional transport model for movement in oriented habitats.