6.1 Weak Definition of a Stochastic Process

This chapter is devoted to further topics in the theory of stochastic processes and their applications. We start with a weaker definition of a stochastic process that is sufficient in the study of stationary processes. We said before that a stochastic process is a function u of both a variable ω in a probability space and a continuous parameter t, making u a random variable for each t and a function of t for each ω. We made statements about the kind of function of t that was obtained for each ω. The definition here is less specific about what happens for each ω.

Consider a collection of random variables u(t, ω) ∈ ℂ parametrized by t.

Definition.

We say that u(t, ω) is a real-valued stochastic process if for every finite set of points t 1, …, t n , the joint distribution of u(t 1, ω), …, u(t n , ω) is known:

$$\displaystyle{ F_{t_{1},\ldots ,t_{n}}(y_{1},\ldots ,y_{n}) = P(u(t_{1}) \leq y_{1},\ldots ,u(t_{n}) \leq y_{n}). }$$

The family of functions \(F_{t_{1},\ldots ,t_{n}}(y_{1},\ldots ,y_{n})\) must satisfy some natural requirements:

  1. 1.

    F ≥ 0.

  2. 2.

    F(∞, …, ∞) = 1 and F(−∞, …, −∞) = 0.

  3. 3.

    \(F_{t_{1},\ldots ,t_{n}}(y_{1},\ldots ,y_{m},\infty ,\ldots ,\infty ) = F_{t_{1},\ldots ,t_{m}}(y_{1},\ldots ,y_{m})\).

  4. 4.

    If (i 1, …, i n ) is a permutation of (1, …, n), then

    $$\displaystyle{ F_{t_{i_{ 1}},\ldots ,t_{i_{n}}}(y_{i_{1}},\ldots ,y_{i_{n}}) = F_{t_{1},\ldots ,t_{n}}(y_{1},\ldots ,y_{n}). }$$

This definition has a natural extension to complex-valued processes, in which one assumes that one knows the joint distribution of the real and imaginary parts of u.

A moment of u(t, ω) of order q is an object of the form

$$\displaystyle{ M_{i_{1},\ldots ,i_{n}} = E[{u}^{i_{1} }(t_{1})\cdots {u}^{i_{n} }(t_{n})],\;\;\;\sum _{j=1}^{n}i_{ j} = q. }$$

If a stochastic process has finite moments of order q, it is a process of order q. The moment

$$\displaystyle{ E[u(t,\omega )] = m(t) }$$

is the mean of u at t. The function

$$\displaystyle{ E\left [(u(t_{1},\omega ) - m(t_{1}))\overline{(u(t_{2},\omega ) - m(t_{2}))}\right ] = R(t_{1},t_{2}) }$$

is the covariance of u. Let us list the properties of the covariance of u:

  1. 1.

    \(R(t_{1},t_{2}) = \overline{R(t_{2},t_{1})}\).

  2. 2.

    R(t 1, t 1) ≥ 0.

  3. 3.

    \(\vert R(t_{1},t_{2})\vert \leq \sqrt{R(t_{1 } , t_{1 } )R(t_{2 } , t_{2 } )}\).

  4. 4.

    For all t 1, …, t n and all z 1, …, z n  ∈ ℂ,

    $$\displaystyle{ \sum _{i=1}^{n}\sum _{ j=1}^{n}R(t_{ i},t_{j})z_{i}\overline{z_{j}} \geq 0. }$$

The first three properties are easy to establish; the fourth is proved as follows: For any choice of complex numbers z j , the sum

$$\displaystyle{ \sum _{i=1}^{n}\sum _{ j=1}^{n}R(t_{ i},t_{j})z_{i}\overline{z_{j}} }$$

is by definition equal to

$$\displaystyle{ E\left [{\left \vert \sum _{j=1}^{n}\left (u(t_{ j}) - m(t_{j})\right )z_{j}\right \vert }^{2}\right ] \geq 0 }$$

(i.e., to the expected value of a nonnegative quantity).
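
Property 4 says that the matrix with entries R(t i , t j ) is nonnegative definite for any choice of points t 1, …, t n . As a small numerical sketch (the covariance \(R(t_{1},t_{2}) = {e}^{-\vert t_{1}-t_{2}\vert }\) used here is a hypothetical trial function, not one derived in the text), one can verify this by checking that such a matrix has no negative eigenvalues:

```python
import numpy as np

# Hypothetical trial covariance: R(t1, t2) = exp(-|t1 - t2|).
def R(t1, t2):
    return np.exp(-abs(t1 - t2))

# Property 4: for any points t_1, ..., t_n the matrix R(t_i, t_j)
# must be nonnegative definite, i.e., have no negative eigenvalues.
t = np.sort(np.random.uniform(0.0, 10.0, size=8))
M = np.array([[R(ti, tj) for tj in t] for ti in t])

eigvals = np.linalg.eigvalsh(M)  # M is symmetric, so eigvalsh applies
print("smallest eigenvalue:", eigvals.min())  # nonnegative up to roundoff
```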

Definition.

A process is stationary in the strict sense if for every t 1, …, t n and every T ∈ ℝ,

$$\displaystyle{ F_{t_{1},\ldots ,t_{n}}(y_{1},\ldots ,y_{n}) = F_{t_{1}+T,\ldots ,t_{n}+T}(y_{1},\ldots ,y_{n}). }$$

For a stochastic process that is stationary in this sense, all moments are constant in time, and in particular, m(t) = m and R(t 1, t 2) = R(t 1 + T, t 2 + T) for all T. Choose T =  − t 2; then R(t 1, t 2) = R(t 1 − t 2, 0), and it becomes reasonable to define

$$\displaystyle{ R(t_{1} - t_{2}) = R(t_{1},t_{2}), }$$

where the function R on the left side, which has only one argument, is also called R in the hope that there is no ambiguity. Note that R(T) = R(t + T, t).

The above properties become, for the new function R,

  1. 1.

    \(R(t) = \overline{R(-t)}\).

  2. 2.

    R(0) ≥ 0.

  3. 3.

     | R(t) | ≤ R(0).

  4. 4.

    For all t 1, …, t n and all z 1, …, z n  ∈ ℂ,

    $$\displaystyle{ \sum _{i=1}^{n}\sum _{ j=1}^{n}R(t_{ i} - t_{j})z_{i}\overline{z_{j}} \geq 0. }$$
    (6.1)

Definition.

A stochastic process is stationary in the wide sense if it has a constant mean and its covariance depends only on the difference between the arguments, i.e.,

  1. 1.

    m(t) = m.

  2. 2.

    R(t 1, t 2) = R(t 1 − t 2).

If a stochastic process is stationary in the wide sense and Gaussian, then it is stationary in the strict sense (because a Gaussian process is fully determined by its mean and covariances). Brownian motion is not stationary. White noise is stationary (but ill defined without appeal to distributions).

We now consider some instances of processes that are stationary in the wide sense. Pick ξ ∈ ℂ to be a random variable and h(t) a nonrandom function of time, and consider the process u(t, ω) = ξh(t). Assume for simplicity that h(t) is differentiable, and determine when a process of this type is stationary in the wide sense. Its mean is

$$\displaystyle{ m(t) = E[\xi h(t)] = h(t)E[\xi ], }$$

which is constant if and only if h(t) is constant or E[ξ] = 0. Suppose E[ξ] = 0. The covariance

$$\displaystyle{ R(t_{1},t_{2}) = E[\xi h(t_{1})\overline{\xi }\,\overline{h(t_{2})}] = E[\xi \overline{\xi }]h(t_{1})\overline{h(t_{2})} }$$

must depend only on the difference t 1 − t 2. Consider the special case t 1 = t 2 = t. In this case, the covariance \(E[\xi \overline{\xi }]h(t)\overline{h(t)}\) must be R(0); hence \(h(t)\overline{h(t)}\) must be constant. Therefore, h(t) is of the form

$$\displaystyle{ h(t) = A{e}^{i\phi (t)}, }$$
(6.2)

where A is a constant and ϕ(t) a function of t that remains to be determined. Now we narrow the possibilities some more. Suppose A ≠ 0. Then

$$\displaystyle{ R(t_{1} - t_{2}) = \vert A{\vert }^{2}E[\xi \overline{\xi }]{e}^{i\phi (t_{1})-i\phi (t_{2})}. }$$

Set t 1 − t 2 = T and t 2 = t. Then

$$\displaystyle{ R(T) = \vert A{\vert }^{2}E[\xi \overline{\xi }]{e}^{i[\phi (t+T)-\phi (t)]} }$$

for all t, T. Since \(R(T) = \overline{R(-T)}\), we see that

$$\displaystyle{ \frac{\phi (t + T) - 2\phi (t) +\phi (t - T)} {{T}^{2}} = 0. }$$

Letting T → 0 gives ϕ′′(t) = 0 for all t, so ϕ(t) = λt + β, where λ, β are constants; also e iβ is a constant. We have therefore shown that the process u(t, ω) = ξh(t) is stationary in the wide sense if h(t) = Ce iλt (where C, λ are constants) and E[ξ] = 0.
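
A short numerical sketch of this conclusion (with arbitrary choices C = 1, λ = 2, and ξ a standard complex Gaussian, so that E[ξ] = 0 and E[ | ξ | 2] = 1) estimates the mean and covariance of u(t, ω) = ξe iλt by Monte Carlo and shows that the covariance estimate depends only on t 1 − t 2:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, N = 2.0, 200_000

# xi: complex random variable with E[xi] = 0 and E[|xi|^2] = 1
xi = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

def u(t):
    return xi * np.exp(1j * lam * t)  # one sample of u(t, .) per realization of xi

# The mean should be approximately 0 for every t.
print(abs(u(1.3).mean()), abs(u(7.0).mean()))

# The covariance E[u(t1) conj(u(t2))] should depend only on T = t1 - t2
# and equal E[|xi|^2] e^{i lam T} = e^{i lam T} here.
for t1, t2 in [(1.0, 0.0), (5.0, 4.0), (2.5, 0.5)]:
    Rhat = (u(t1) * np.conj(u(t2))).mean()
    print(t1 - t2, Rhat, np.exp(1j * lam * (t1 - t2)))
```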

6.2 Covariance and Spectrum

In the last section, we presented an example of a stationary stochastic process in the wide sense, given by u(t, ω) = ξe iλt, where ξ is a random variable with mean 0. This stochastic process has a covariance of the form

$$\displaystyle{ R(T) = R(t_{1},t_{2}) = R(t_{1} - t_{2}) = E[\vert \xi {\vert }^{2}]{e}^{i\lambda T}, }$$

where T = t 1 − t 2. Now we want to generalize this example. First, we try to construct a process of the form

$$\displaystyle{ u(t,\omega ) =\xi _{1}{e}^{i\lambda _{1}t} +\xi _{ 2}{e}^{i\lambda _{2}t}, }$$

with λ 1 ≠ λ 2. Then \(E[u] = E[\xi _{1}]{e}^{i\lambda _{1}t} + E[\xi _{2}]{e}^{i\lambda _{2}t}\), which is independent of t if E[ξ 1] = E[ξ 2] = 0. The covariance is

$$\begin{array}{lr} E\left [(\xi _{1}{e}^{i\lambda _{1}t_{1} } +\xi _{2}{e}^{i\lambda _{2}t_{1} })(\overline{\xi _{1}}{e}^{-i\lambda _{1}t_{2} } + \overline{\xi }_{2}{e}^{-i\lambda _{2}t_{2} })\right ] \\ = E\left [\vert \xi _{1}{\vert }^{2}{e}^{i\lambda _{1}T} + \vert \xi _{2}{\vert }^{2}{e}^{i\lambda _{2}T} +\xi _{1}\overline{\xi }_{2}{e}^{i\lambda _{1}t_{1}-i\lambda _{2}t_{2} } + \overline{\xi }_{1}\xi _{2}{e}^{i\lambda _{2}t_{1}-i\lambda _{1}t_{2} }\right ], \end{array}$$

which can be stationary only if \(E[\xi _{1}\overline{\xi }_{2}] = 0\). Then u(t, ω) is stationary and

$$\displaystyle{ R(T) = E[\vert \xi _{1}{\vert }^{2}]{e}^{i\lambda _{1}T} + E[\vert \xi _{ 2}{\vert }^{2}]{e}^{i\lambda _{2}T}. }$$

More generally, a process \(u =\sum _{j}\xi _{j}{e}^{i\lambda _{j}t}\) is stationary in the wide sense if \(E[\xi _{j}\overline{\xi _{k}}] = 0\) when j ≠ k and E[ξ j ] = 0 for all j. In this case,

$$\displaystyle{ R(T) =\sum E\left [\vert \xi _{j}{\vert }^{2}\right ]{e}^{i\lambda _{j}T}. }$$

This expression can be rewritten in a more useful form as a Stieltjes integral. Recall that when q is a nondecreasing function of x, the Stieltjes integral of a function h with respect to q is defined to be

$$\displaystyle{ \int h\,dq =\lim _{\text{max}\{x_{i+1}-x_{i}\}\rightarrow 0}\sum h(x_{i}^{{\ast}})[q(x_{ i+1}) - q(x_{i})], }$$

where x i  ≤ x i  ∗  ≤ x i + 1. If q is differentiable, then

$$\displaystyle{ \int _{a}^{b}h\,dq =\int _{ a}^{b}hq^{\prime}\,dx. }$$

Suppose q(x) is the step function

$$\displaystyle{ q(x) = \left \{\begin{array}{@{}l@{\quad }l@{}} 0,\quad &x < c,\\ q_{ 0}\quad &x \geq c, \end{array} \right. }$$

with a ≤ c ≤ b. Then \(\int _{a}^{b}h\,dq = h(c)q_{0}\) if h is continuous at c. We define the function G = G(k) by

$$\displaystyle{ G(k) =\sum _{\begin{array}{c}\{j\vert \lambda _{j}\leq k\}\end{array}}E[\vert \xi _{j}{\vert }^{2}]; }$$

i.e., G(k) is the sum of the expected values of the squares of the amplitudes of the complex exponentials with frequencies less than or equal to k. Then R(T) becomes

$$\displaystyle{ R(T) =\int _{ -\infty }^{+\infty }{e}^{ikT}dG(k). }$$
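
For a finite sum of modes, G is a step function with a jump of size E[ | ξ j | 2] at each λ j , and the Stieltjes integral reduces to the finite sum above. A minimal sketch (with made-up frequencies and jump sizes) of this correspondence:

```python
import numpy as np

# Hypothetical spectrum: G jumps at the frequencies lam with sizes E[|xi_j|^2].
lam   = np.array([-1.0, 0.5, 2.0])
jumps = np.array([ 0.3, 1.0, 0.7])  # E[|xi_j|^2] >= 0, so G is nondecreasing

def R(T):
    # R(T) = int e^{ikT} dG(k); for a step function G this is a finite sum.
    return np.sum(jumps * np.exp(1j * lam * T))

print(R(0.0))                    # total jump of G, i.e., E[|u|^2]
print(R(1.7), np.conj(R(-1.7)))  # the property R(T) = conj(R(-T))
```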

We shall now see that under some technical assumptions, this relation holds for all stochastic processes that are stationary in the wide sense. Indeed, we have the following theorem.

Theorem 6.1 (Khinchin). 

  1. 1.

    If R(T) is the covariance of a stochastic process u(t,ω), stationary in the wide sense such that

    $$\displaystyle{ \lim _{h\rightarrow 0}E\left [\vert u(t + h) - u(t){\vert }^{2}\right ] = 0, }$$

    then R(T) = ∫e ikT dG(k) for some nondecreasing function G(k).

  2. 2.

    If a function R(T) can be written as ∫e ikT dG(k) for some nondecreasing function G, then there exists a stochastic process, stationary in the wide sense, satisfying the condition in part (1) of the theorem, that has R(T) as its covariance.

Khinchin’s theorem follows from the inequalities we have proved for R; indeed, one can show (but we will not do so here) that a function that satisfies these inequalities is the Fourier transform of a nonnegative measure. If it so happens that dG(k) = g(k) dk, then R(T) = ∫e ikT g(k) dk, and g(k) is called the spectral density of the process. Thus, Khinchin’s theorem states that the covariance function is the Fourier transform of the spectral density. Hence, if we know R(T), we can compute the spectral density by

$$\displaystyle{ g(k) = \frac{1} {2\pi }\int _{-\infty }^{+\infty }{e}^{-ikT}R(T)\,dT. }$$
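
As a sketch of this inversion (using the trial covariance \(R(T) = {e}^{-\vert T\vert }\), which is not taken from the text but for which the integral can be evaluated in closed form, giving g(k) = 1 ∕ (π(1 + k 2))), one can compare a crude numerical quadrature with the exact density:

```python
import numpy as np

# Trial covariance (hypothetical): R(T) = exp(-|T|).
T = np.arange(-40.0, 40.0, 0.001)
R = np.exp(-np.abs(T))
dT = T[1] - T[0]

def g_numeric(k):
    # g(k) = (1/2pi) int e^{-ikT} R(T) dT, approximated by a Riemann sum
    return (np.exp(-1j * k * T) * R).real.sum() * dT / (2 * np.pi)

for k in [0.0, 0.5, 2.0]:
    print(k, g_numeric(k), 1.0 / (np.pi * (1 + k**2)))  # exact density for comparison
```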

For a nonrandom periodic function, one can define an energy per wave number k as the squared amplitude of the kth Fourier coefficient; for a nonrandom aperiodic function, one can define the energy per wave number as the squared magnitude of the Fourier transform. The samples of a stationary stochastic process do not have Fourier transforms in the usual sense, because they do not tend to zero at ±∞, but one can still define an average energy per wave number for a stationary stochastic process by the Fourier transform of the covariance.

Example.

Consider white noise, the derivative (in a sense we have not discussed) of Brownian motion. One can show that R(T) = δ(T) (see the exercises). Its spectral density (interpreted carefully) is ϕ(k) = 1 ∕ 2π; thus, all frequencies have the same amplitude. The adjective “white” comes from the fact that in white light, all frequencies are present with the same amplitude. A stationary random function that is not white noise is called colored noise.

6.3 The Inertial Spectrum of Turbulence

To illustrate these constructions, we now derive the spectrum of fully developed turbulence. We do not write down the equations of motion; the only properties of these equations that will be used here are that (a) they are nonlinear, and (b) energy dissipation by viscosity is proportional to an integral over the domain of the sum of the squares of the derivatives of the velocity field (a quantitative description of this property will be given below).

Consider turbulence in a fluid, far from any solid boundaries, with the Reynolds number Re = Uℓ 0 ∕ ν very large, where U is a typical velocity difference in the flow, ℓ 0 is a length scale for the flow, and ν is the viscosity; the dimensionless number Re is large when the velocity differences are large and the viscosity is small, which are the circumstances when turbulence appears; U is chosen to be a typical velocity difference rather than a typical velocity because a velocity component common to the whole flow field is not relevant when one is studying turbulence. The large scales of turbulent flow are typically driven by large-scale forcing (e.g., in the case of meteorology, by the rotation of the earth around its axis and around the sun); turbulence is characterized by the transfer of energy from large scales to smaller scales at which the energy is dissipated. One usually assumes that as the energy moves to large wave numbers k (i.e., small scales), the specifics of the forcing are forgotten and the flow can be viewed as approximately homogeneous (translation-invariant) and isotropic (rotation-invariant) at small scales, and that the properties of the flow at small scales are universal (i.e., independent of specific geometry and forcing). One further assumes that the solutions of the equations of fluid mechanics can be viewed as random; how nonrandom equations produce solutions that can be viewed as random is an interesting question that we will not discuss here.

Assume that the velocity field is homogeneous, i.e., statistically translation-invariant in space (not in time, as was implicitly assumed in the previous section through the choice of the letter t for the parameter). The velocity field in three space dimensions is a vector quantity: u = (u 1, u 2, u 3). Each of these components is a function of the three spatial variables x 1, x 2, x 3. A Fourier transform in three-dimensional space can be defined and is a function of three Fourier variables k 1, k 2, k 3 that correspond to each of the spatial variables, and we write k = (k 1, k 2, k 3). One can define a covariance matrix

$$\displaystyle{ R_{ij}(\mathbf{r}) = E[u_{i}(\mathbf{x})u_{j}(\mathbf{x + r})], }$$

where r is a three-component vector; then Khinchin’s theorem becomes

$$\displaystyle{ R_{ii}(\mathbf{r}) =\int _{ -\infty }^{\infty }{e}^{i\mathbf{k}\cdot \mathbf{r}}\,dG_{ ii}(\mathbf{k}), }$$
(6.3)

where k = (k 1, k 2, k 3), kr is the ordinary Euclidean inner product, and the functions G ii are nondecreasing. Without loss of generality in what follows, one can write dG ii (k) = g ii (k) dk 1dk 2dk 3 (this is so because all we will care about is the dimensions of the various quantities, which are not affected by a possible lack of smoothness). Setting r = 0 in Eq. (6.3) and summing over i, we find that

$$\displaystyle{E[u_{1}^{2} + u_{ 2}^{2} + u_{ 3}^{2}] =\int _{ -\infty }^{\infty }(g_{ 11} + g_{22} + g_{33})dk_{1}\,dk_{2}\,dk_{3}.}$$

We define the left-hand side of this equation to be the specific energy (i.e., energy per unit volume) of the flow and denote it by E[u 2]. Splitting the integration into an integration in a polar variable k and integrations over angular variables, one can write

$$\displaystyle{E[{u}^{2}] =\int _{ 0}^{\infty }E(k)dk,}$$

with

$$\displaystyle{ E(k) =\int _{k_{1}^{2}+k_{2}^{2}+k_{3}^{2}={k}^{2}}\left (g_{11} + g_{22} + g_{33}\right )dS(\mathbf{k}), }$$

where dS(k) is an element of area on a sphere of radius k. We define E(k) to be the energy spectrum; it is a function only of \(k = \sqrt{k_{1 }^{2 } + k_{2 }^{2 } + k_{3 }^{2}}\). The energy spectrum can be thought of as the portion of the energy that can be imputed to motion with wave numbers of magnitude k.

The kinetic energy of the flow is proportional to the square of the velocity, whereas energy dissipation is proportional to the square of the derivatives of the velocity; in spectral variables (i.e., after Fourier transformation), differentiation becomes multiplication by k, the Fourier variable. A calculation, which we skip, shows that D, the energy dissipation per unit volume, can be written as

$$\displaystyle{D =\int _{ 0}^{\infty }{k}^{2}E(k)dk,}$$

where E(k) is the energy spectrum. This calculation requires some use of the equations of motion, and this is the only place where those equations are made use of in the argument of this section.

It is plausible that when Re is large, the kinetic energy resides in a range of k’s disjoint from the range of k’s where the dissipation is taking place, and indeed, experimental data show it to be so; specifically, there exist wave numbers k 1 and k 2 such that

$$\displaystyle{ \int _{0}^{k_{1} }E(k)\,dk \sim \int _{0}^{\infty }E(k)\,dk,\;\;\int _{ k_{2}}^{\infty }{k}^{2}E(k)\,dk \sim \int _{ 0}^{\infty }{k}^{2}E(k)\,dk, }$$

with k 1 ≪ k 2. This observation roughly divides the spectrum into three pieces: (a) the range between 0 and k 1, the energy range, where most of the energy resides; what happens in this range depends on the boundary and initial conditions and must be determined separately for each turbulent flow; (b) the dissipation range, k > k 2, where the energy is dissipated; and (c) the intermediate range between k 1 and k 2; this range is the conduit through which turbulence moves energy from the energy range to the dissipation range, and it is responsible for the enhanced dissipation produced by turbulence (see Fig. 6.1). One can hope that the properties of turbulence in the intermediate range are universal, i.e., independent of the particular flow one is studying. The nonlinearity of the equations couples the energy range to the intermediate range, and if one can find the universal properties of the intermediate range, one can use them to compute in the energy range. We now determine these universal properties.

Fig. 6.1. Sketch of the energy, inertial, and dissipation ranges in turbulence.

We will be relying on dimensional analysis (see Chap. 1). The spectrum in the intermediate range E(k) is a function of k, the viscosity ν, the length scale of the turbulence ℓ 0, the amplitude U of the typical velocity difference in the flow, and the rate of energy dissipation ε. This last variable belongs here because energy is transferred from the low-k domain through the intermediate range into the large-k domain, where it is dissipated; the fact that ε belongs in the list was the brilliant insight of Kolmogorov.

Our basic units are the units of length and of time. Suppose the former is reduced by a factor L and the latter by a factor T. The dimension of the viscosity is L 2 ∕ T, that of ε is L 2 ∕ T 3, that of k is 1 ∕ L, and the equation E[u 2] = ∫E(k) dk shows that the dimension of E is L 3 ∕ T 2. Dimensional analysis yields \(E(k){\epsilon }^{-2/3}{k}^{5/3} = \Phi (Re,\ell _{0}k)\) for some unknown function Φ of the two large arguments Re and ℓ 0 k; Re is large because this is the condition for fully developed turbulence to appear, and ℓ 0 k is large in the intermediate range of scales. If the function Φ has a finite nonzero limit C as its arguments grow (an assumption of complete similarity), one can deduce \(E(k) = C{\epsilon }^{2/3}{k}^{-5/3}\), which is the famous Kolmogorov–Obukhov scaling law for the intermediate range of fully developed turbulence, the cornerstone of turbulence theory. Note that the viscosity has dropped out from this result, leading to the conclusion that the dynamics of the intermediate range are purely inertial, i.e., independent of viscosity; this is why the intermediate range is usually called the inertial range.
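
The exponents 2 ∕ 3 and − 5 ∕ 3 follow from elementary linear algebra on the dimension exponents: writing \(E(k) \propto {\epsilon }^{a}{k}^{b}\) and matching powers of L and T gives two linear equations. A small sketch of this bookkeeping (the factor Φ is dimensionless and does not enter):

```python
import numpy as np

# Dimensions as exponent vectors (powers of length L and time T):
# [epsilon] = L^2 T^-3, [k] = L^-1, [E(k)] = L^3 T^-2.
eps = np.array([2.0, -3.0])
k   = np.array([-1.0, 0.0])
E   = np.array([3.0, -2.0])

# Solve a*[eps] + b*[k] = [E] for the exponents a, b in E(k) ~ eps^a k^b.
A = np.column_stack([eps, k])
a, b = np.linalg.solve(A, E)
print(a, b)  # expected: a = 2/3, b = -5/3
```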

This law is not fully satisfactory for various reasons, and a number of correction schemes have been proposed over the years. In recent years, it has been argued that the unknown function Φ behaves, as its arguments tend to infinity, like \(C(Re){(\ell _{0}k)}^{-d/\log (Re)}\Phi _{0}(Re,\ell _{0}k)\), where it is Φ 0 that tends to a nonzero constant as its arguments grow, C(Re) is a function of Re, and d is a positive constant; the exponent − d ∕ log(Re) is an anomalous exponent. This is an assumption of incomplete similarity, which leads, for large Re and ℓ 0 k, to the relation

$$\displaystyle{ E(k) = C{(Re)\epsilon }^{2/3}{k}^{-5/3}{(\ell_{ 0}k)}^{-d/\log (Re)}. }$$

The exponent − 5 ∕ 3 is corrected by the small quantity − d ∕ log(Re); this quantity is a function of the Reynolds number Re, but its variation with Re is slow. However, this correction violates the assumption that the intermediate range is purely inertial. Other proposals for the anomalous exponent, without a dependence on Re, have also been made.

6.4 Time Series

Suppose we are observing a stochastic process u(t, ω), have been observing it long enough to know that it is stationary and to determine its temporal covariances, and suppose we are given observed values U(s) of u(t, ω) for s ≤ t (we denote observed values by capital letters). The question we address in this section is how to predict a value for u(t + T, ω) based on the information we have. For simplicity, we shall do so only for a stationary random sequence.

Definition.

A stationary random sequence is a collection u(t, ω) of random variables for t = 0, 1, 2, 3, … as well as for t = − 1, − 2, − 3, … such that the joint distribution of every subset is known, subject to the obvious compatibility conditions, and such that all the distributions are invariant under the transformation t → t + T for T an integer. Such sequences are also known as time series.

Assume E[u(t)] = 0. The covariance is

$$\displaystyle{ R(T) = E[u(t + T)\overline{u(t)}], }$$

where T ∈ ℤ; the covariance satisfies, as before, the following conditions:

  1. 1.

    R(0) ≥ 0.

  2. 2.

     | R(T) | ≤ R(0).

  3. 3.

    \(R(T) = \overline{R(-T)}\).

  4. 4.

    \(\sum _{i,j}R(i - j)z_{i}\overline{z_{j}} \geq 0\).

If u(t, ω) = ξ(ω)h(t) is stationary, we can repeat the arguments in Sect. 6.1. Since R(0) = E[ | u | 2] = E[ | ξ | 2] | h(t) | 2, we see that h(t) = Ae iϕ(t) for t = 0, ± 1, …. Since \(R(1) = \overline{R(-1)}\), we obtain

$$\displaystyle{ \phi (t + 1) -\phi (t) = -(\phi (t - 1) -\phi (t))\, \mathbin{\rm mod}\,\,\,2\pi }$$

for t = 0, ± 1, …. Setting ϕ(0) = α and ϕ(0) − ϕ( − 1) = λ, we find by induction that ϕ(t) = α + λt  mod  2π. Consequently, h(t) = Ae i(α + λt) = Ce iλt for all integers t, where C = Ae iα is a possibly complex constant and λ is a real constant.

Define a periodic function g of the argument k by

$$\displaystyle{ g(k) = \frac{1} {2\pi }\sum _{T=-\infty }^{+\infty }R(T){e}^{-iTk}, }$$

where T takes on integer values. Note that if R(T) does not converge rapidly enough to 0 as | T | increases, g may not be smooth. Then \(R(T) =\int _{ -\pi }^{\pi }{e}^{iTk}g(k)\,dk\). (The factor 2π of Fourier theory is broken up here differently from how we did it before.)

One can show that if R(T) is a covariance for a time series, then g ≥ 0. Conversely, if R(T) is given for all integers T, and if \(\frac{1} {2\pi }\sum _{T}R(T){e}^{-iTk} \geq 0\), then there exists a time series for which R(T) is the covariance. This is Khinchin’s theorem for a time series.

Consider the problem of finding an estimate for u(t + m, ω) when one has values u(t − n), u(t − (n − 1)), …, u(t − 1). Nothing is assumed here about the mechanism that produces these values; all we are going to use is the assumed fact that the time series is stationary, and that we know the covariance. If the covariance vanishes whenever T ≠ 0, then the u(t) are uncorrelated, and no useful prediction can be made. We would like to find a random variable \(\hat{u}(t + m,\omega )\) with m = 0, 1, 2, … such that

$$\displaystyle{ E\left [\vert u(t + m,\omega ) -\hat{ u}(t + m,\omega ){\vert }^{2}\right ] }$$

is as small as possible. We know from Sect. 2.3 that

$$\displaystyle{ \hat{u}(t + m,\omega ) = E[u(t + m,\omega )\vert u(t - 1),u(t - 2),\ldots ,u(t - n)]. }$$

The way to evaluate \(\hat{u}\) is to find a basis {ϕ i } in the space of functions of {u(t − n), …, u(t − 1)}, expand \(\hat{u}\) in this basis, i.e.,

$$\displaystyle{ \hat{u} =\sum _{ j=1}^{n}a_{ j}\phi _{j}(u(t - 1),\ldots ,u(t - n)), }$$

and calculate the coefficients a j of the expansion. This is hard in general. We simplify the problem by looking only for the best approximation in the span of {u(t − 1), …, u(t − n)}, i.e., we look for a random variable

$$\displaystyle{ \hat{u}(t + m,\omega ) =\sum _{ j=1}^{n}a_{ j}u(t - j,\omega ). }$$

This is called linear prediction. The span L of the u(t − j, ω) is a closed linear space; therefore, the best linear prediction minimizes

$$\displaystyle{ E\left [\vert u(t + m,\omega ) -\hat{ u}(t + m,\omega ){\vert }^{2}\right ] }$$

for \(\hat{u}\) in L. What we have to do is to find \(\{a_{j}\}_{j=1}^{n}\) such that

$$\displaystyle{ E\left [{\left \vert u(t + m,\omega ) -\sum _{j=1}^{n}a_{ j}u(t - j,\omega )\right \vert }^{2}\right ] }$$

is as small as possible. We have

$$\displaystyle\begin{array}{llllll} E[\vert u & - \hat{u}{\vert }^{2}] \\ & = E\left [\left (u(t + m) -\sum _{j}a_{j}u(t - j)\right )\overline{\left (u(t + m) -\sum _{l}a_{l}u(t - l)\right )}\right ] \\ & = E\left [u(t + m)\overline{u(t + m)} -\sum _{l}\overline{a_{l}}u(t + m)\overline{u(t - l)}\right. \\ & \left.-\sum _{j}a_{j}\overline{u(t + m)}u(t - j) +\sum _{j}\sum _{l}a_{j}\overline{a_{l}}u(t - j)\overline{u(t - l)}\right ] & \\ & = R(0) - 2\mathrm{Re}\left (\sum _{j}\overline{a_{j}}R(m + j)\right ) +\sum _{j}\sum _{l}a_{j}\overline{a_{l}}R(l - j), & \\ \end{array}$$

which is minimized when

$$\displaystyle{ \frac{1} {2}\,\frac{\partial E\left [\vert u -\hat{ u}{\vert }^{2}\right ]} {\partial \overline{a_{j}}} = -R(m + j) +\sum _{ l=1}^{n}a_{ l}R(j - l) = 0 }$$
(6.4)

for j = 1, …, n. Here we use the fact that if \(q(x,y) = Q(x + iy,x - iy) = Q(z,\bar{z})\) is real, then q x  = q y  = 0 if and only if \(Q_{\bar{z}} = 0\) or Q z  = 0 (see also Exercise 6, Chap. 1). The uniqueness of the solution of the system (6.4) and the fact that this procedure gives a minimum are guaranteed by the orthogonal projection theorem for closed linear spaces (see Sect. 1.1). The problem of prediction for time series has been reduced (in the linear approximation) to the solution of n linear equations in n unknowns. This concludes our general discussion of prediction for time series.
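
The system (6.4) is easy to set up and solve numerically once R is known. The following sketch does so for a hypothetical covariance sequence \(R(T) = {e}^{-\vert T\vert }\cos T\) (chosen only for illustration; its spectral density can be checked to be positive, so it is a legitimate covariance):

```python
import numpy as np

def predictor_coeffs(R, n, m):
    """Solve the normal equations (6.4):
        sum_l a_l R(j - l) = R(m + j),  j = 1, ..., n,
    for the coefficients of the best linear predictor
        u_hat(t + m) = sum_j a_j u(t - j)."""
    A = np.array([[R(j - l) for l in range(1, n + 1)] for j in range(1, n + 1)])
    b = np.array([R(m + j) for j in range(1, n + 1)])
    return np.linalg.solve(A, b)

# Hypothetical covariance sequence (not from the text): a damped oscillation.
R = lambda T: np.exp(-abs(T)) * np.cos(T)

a = predictor_coeffs(R, n=5, m=1)
print(a)  # coefficients of u(t-1), ..., u(t-5) in the best linear prediction of u(t+1)
```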

We now turn to a special case in which this linear system of equations can be solved analytically with the help of complex variables. The reader not familiar with contour integration should fast forward at this point to the next section. Rewrite (6.4) in terms of the Fourier transform. The spectral representation of R(T) is

$$\displaystyle{ R(T) =\int _{ -\pi }^{\pi }{e}^{ikT}g(k)\,dk. }$$

Then (6.4) becomes

$$\displaystyle{ \int _{-\pi }^{\pi }\left (-{e}^{i(j+m)k} +\sum _{ l=1}^{n}a_{ l}{e}^{i(j-l)k}\right )g(k)\,dk = 0. }$$

Moving e ijk outside the parentheses, we get

$$\displaystyle{ \int _{-\pi }^{\pi }{e}^{ijk}\left ({e}^{imk} -\sum _{ l=1}^{n}a_{ l}{e}^{-ilk}\right )g(k)\,dk = 0. }$$
(6.5)

So far, (6.5) is just a reformulation of (6.4). To continue, we need an explicit representation of g(k). Consider the special case R(T) = Ca  | T |  for T = 0, ± 1, ± 2, …, where C > 0 and 0 < a < 1. Is R the covariance of a stationary process? It certainly satisfies conditions 1, 2, 3. To check condition 4, we compute

$$\displaystyle\begin{array}{lll} g(k)& = \frac{1} {2\pi }\sum _{n=-\infty }^{\infty }R(n){e}^{-ink} & \\ & = \frac{C} {2\pi } \left [\sum _{n=1}^{\infty }{(a{e}^{-ik})}^{n} + 1 +\sum _{ n=1}^{\infty }{(a{e}^{ik})}^{n}\right ]& \\ & = \frac{C} {2\pi } \left [ \frac{a{e}^{-ik}} {1-a{e}^{-ik}} + 1 + \frac{a{e}^{ik}} {1-a{e}^{ik}}\right ] & \\ & = \frac{C} {2\pi } \frac{1-{a}^{2}} {(1-a{e}^{-ik})(1-a{e}^{ik})} >0. & \\ \end{array}$$

This shows that R(T) is the Fourier transform of a nonnegative function, and consequently the covariance of a stationary process.
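
The closed form just obtained can be checked by truncating the series numerically (a sketch with arbitrary values a = 0.6 and C = 1):

```python
import numpy as np

a, C = 0.6, 1.0
k = np.linspace(-np.pi, np.pi, 9)

# Truncated series (C/2pi) * sum_{|n| <= N} a^{|n|} e^{-ink}
N = 200
n = np.arange(-N, N + 1)
g_series = np.array([(C / (2 * np.pi)) * np.sum(a**np.abs(n) * np.exp(-1j * n * kk))
                     for kk in k]).real

# Closed form derived above: (C/2pi) (1 - a^2) / |1 - a e^{ik}|^2
g_exact = (C / (2 * np.pi)) * (1 - a**2) / np.abs(1 - a * np.exp(1j * k))**2

print(np.max(np.abs(g_series - g_exact)))  # tiny; note that g_exact > 0 everywhere
```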

Assume for simplicity that C(1 − a 2) ∕ (2π) = 1. We solve (6.5) using complex variables. Let e ik = z. Then \(\bar{z} = {z}^{-1}\), dk = dz ∕ (iz), and (6.5) becomes

$$\displaystyle{ \frac{1} {2\pi }\int _{\vert z\vert =1}{z}^{j}\left ({z}^{m} -\sum _{\ell =1}^{n}a_{\ell}{z}^{-\ell}\right ) \frac{1} {(z - a)\left (\frac{1} {z} - a\right )} \frac{dz} {iz} = 0 }$$

for j = 1, 2, …, n. We must therefore determine a 1, …, a n such that

$$\displaystyle{ \sum _{\ell=1}^{n}a_{\ell} \frac{1} {2\pi i}\int _{\vert z\vert =1}\frac{{z}^{j-\ell}{(1 - az)}^{-1}} {z - a} \,dz = \frac{1} {2\pi i}\int _{\vert z\vert =1}\frac{{z}^{j+m}{(1 - az)}^{-1}} {z - a} \,dz. }$$

We find the coefficients recursively by comparing two consecutive values of j, starting from the back. Let j = n and j = n − 1. Using residue theory, we get

$$\displaystyle{ \sum _{\ell=1}^{n} \frac{a_{\ell}{a}^{n-\ell}} {1 - {a}^{2}} = \frac{{a}^{n+m}} {1 - {a}^{2}}, }$$
$$\displaystyle{ \sum _{\ell=1}^{n-1}\frac{a_{\ell}{a}^{n-1-\ell}} {1 - {a}^{2}} + a_{n}\left [ \frac{{a}^{-1}} {1 - {a}^{2}} + \frac{{(1 - a \cdot 0)}^{-1}} {0 - a} \right ] = \frac{{a}^{n-1+m}} {1 - {a}^{2}}. }$$

Multiplying the last equation by a and subtracting, we get a n  = 0. This simplifies the next step with j = n − 1 and j = n − 2 substantially, and using similar arguments, we obtain a n − 1 = 0. In the last step,

$$\displaystyle{ \frac{a_{1}} {2\pi i}\int _{\vert z\vert =1}\frac{z} {z} \frac{{(1 - az)}^{-1}} {z - a} \,dz = \frac{1} {2\pi i}\int _{\vert z\vert =1}\frac{{z}^{1+m}{(1 - az)}^{-1}} {z - a} \,dz, }$$

which yields \(a_{1}{(1 - {a}^{2})}^{-1} = {a}^{1+m}{(1 - {a}^{2})}^{-1}\), or \(a_{1} = {a}^{1+m}\). We have therefore shown that if R(T) = Ca  | T |  with 0 < a < 1, then the best approximation of u(t + m, ω) for m = 0, 1, … is \({a}^{1+m}u(t - 1,\omega )\). This is intuitively obvious: the correlation between variables decays like a to the power of the distance between them, so the predictive power of the last-measured quantity decays in the same way.
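
The result \(a_{1} = {a}^{1+m}\), \(a_{2} = \cdots = a_{n} = 0\) can be confirmed by solving the normal equations (6.4) directly (a numerical sketch with arbitrary choices of a, C, n, and m):

```python
import numpy as np

a, C, n, m = 0.7, 2.0, 6, 3
R = lambda T: C * a**abs(T)

# Normal equations (6.4): sum_l a_l R(j - l) = R(m + j), j = 1, ..., n.
A = np.array([[R(j - l) for l in range(1, n + 1)] for j in range(1, n + 1)])
b = np.array([R(m + j) for j in range(1, n + 1)])
coeffs = np.linalg.solve(A, b)

print(coeffs)      # expected: [a**(1+m), 0, 0, ..., 0]
print(a**(1 + m))  # here 0.7**4
```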

6.5 Random Measures and Random Fourier Transforms

We showed previously that the covariance of a wide-sense stationary stochastic process can be written as the Fourier transform of a spectral density. We now use this fact to find useful representations for the process itself, including a stochastic generalization of the Fourier transform that does not require that the process have samples to which the Fourier transform can be applied individually. These representations will be convolutions of nonrandom functions with certain simple processes.

The reader may wish to know that the material in the present section will not be used in the remainder of the book, and therefore can be skipped on a first reading.

Given a probability space (Ω, B, P), consider the set of random variables f(ω), where ω is in Ω, such that \(E[f\bar{f}] < \infty \). We refer to this set as L 2(Ω, B, P). We now construct a one-to-one mapping L 2(Ω, B, P) → L 2(A, μ), where A is a subset of the t-axis and μ is a measure on A. Consider \(\mathcal{A}\), an algebra of subsets of A, i.e., a collection of sets with the property that if the sets A i are in \(\mathcal{A}\), then so are their complements, as well as their finite unions and intersections; an algebra is much like a σ-algebra, with the exception that we do not require that the union of a countably infinite family of subsets belong to the algebra, a detail that is important in a rigorous analysis, but which we will disregard here.

Consider the triple \((A,\mathcal{A},\mu )\), where μ is a rule that to each subset \(A_{i} \in \mathcal{A}\) assigns a number such that

  1. 1.

    μ(A i ) ≥ 0.

  2. 2.

    μ(A i ) is finite.

  3. 3.

    μ(∅) = 0.

  4. 4.

    A i  ∩ A j  = ∅ ⇒ μ(A i  ∪ A j ) = μ(A i ) + μ(A j ).

(Again, note that we are concerned only with finitely many A i .) Next, construct a random variable ρ = ρ(A i , ω), where \(A_{i} \in \mathcal{A}\) and ω ∈ Ω (recall that a random variable is a function defined on Ω), that has the following properties:

  1. 1.

    A i  ∩ A j  = ∅ ⇒ ρ(A i  ∪ A j , ω) = ρ(A i , ω) + ρ(A j , ω).

  2. 2.

    ρ(A i , ω) is square integrable, i.e., \(E[\rho (A_{i},\omega )\bar{\rho }(A_{i},\omega )] < \infty \).

  3. 3.

    ρ(∅, ω) = 0.

  4. 4.

    \(A_{i},A_{j} \subset A \Rightarrow E[\rho (A_{i},\omega )\bar{\rho }(A_{j},\omega )] =\mu (A_{i} \cap A_{j})\).

The properties listed above imply that μ(A i ) ≥ 0 for all \(A_{i} \in \mathcal{A}\), since

$$\displaystyle{ \mu (A_{i}) =\mu (A_{i} \cap A_{i}) = E[\rho (A_{i},\omega )\bar{\rho }(A_{i},\omega )] \geq 0. }$$

We call μ the structure function of ρ. Just as a stochastic process is a function of both ω and t, a random measure is a function of both ω and the subsets A i of A.

Now define \(\chi _{A_{i}} = \chi _{A_{i}}(t)\), the characteristic function of the subset A i of the t-axis, to be

$$\displaystyle{ \chi _{A_{i}}(t) = \left \{\begin{array}{@{}l@{\quad }l@{}} 1,\quad &t \in A_{i}, \\ 0,\quad &\text{otherwise,} \end{array} \right. }$$

and consider a function q(t) of the form

$$\displaystyle{ q(t) =\sum c_{i}\chi _{A_{i}}(t). }$$

We consider the case in which {A i } is a finite partition of A, i.e., there are only finitely many A i , A i  ∩ A j  = ∅ for i ≠ j, and ⋃A i  = A. Thus, q(t) takes on only a finite number of values. To this function q(t) assign the random variable

$$\displaystyle{ f(\omega ) =\sum c_{i}\rho (A_{i},\omega ). }$$

Hence, each characteristic function of a subset is replaced by the random variable that the random measure assigns to the same subset; thus, this substitution transforms a function of t into a function of ω (i.e., into a random variable).

Now consider the product \(q_{1}(t)\overline{q_{2}}(t)\) of two functions of the form

$$\displaystyle{ q_{1} =\sum _{ j=1}^{n}c_{ j}\chi _{A_{j}}(t),\;\;q_{2} =\sum _{ k=1}^{m}d_{ k}\chi _{B_{k}}(t), }$$

where {B i } is another finite partition of A. It is not necessary for n and m to be equal. There is a finite number of intersections of the A j and B k , and on each of these subsets, the product

$$\displaystyle\begin{array}{rcl} q_{1}\overline{q_{2}}& = \left (\sum _{j=1}^{n}c_{j}\chi _{A_{j}}\right )\left (\sum _{k=1}^{m}\overline{d_{k}}\chi _{B_{k}}\right )& \\ & =\sum _{ j=1}^{n}\sum _{k=1}^{m}c_{j}\overline{d_{k}}\chi _{A_{j}\cap B_{k}}, & \\ \end{array}$$

takes on a constant value \(c_{j}\overline{d_{k}}\). Thus, the same construction allows us to assign a random variable \(f_{1}\overline{f_{2}}\) to the product \(q_{1}\overline{q_{2}}\). Since

$$\displaystyle{ f_{1}(\omega ) =\sum c_{j}\rho (A_{j},\omega ),\;\;f_{2}(\omega ) =\sum d_{k}\rho (B_{k},\omega ), }$$

we conclude that

$$\displaystyle\begin{array}{rcl} E[f_{1}\overline{f_{2}}]& = E\left [\sum _{j=1}^{n}\sum _{k=1}^{m}c_{j}\overline{d_{k}}\rho (A_{j},\omega )\bar{\rho }(B_{k},\omega )\right ]& \\ & =\sum _{ j=1}^{n}\sum _{k=1}^{m}c_{j}\overline{d_{k}}E\left [\rho (A_{j},\omega )\bar{\rho }(B_{k},\omega )\right ]& \\ & =\sum _{ j=1}^{n}\sum _{k=1}^{m}c_{j}\overline{d_{k}}\mu (A_{j} \cap B_{k}) & \\ & =\int q_{1}\overline{q_{2}}\mu (dt). &\end{array}$$
(6.6)

Thus we have established a mapping between random variables with finite mean squares and functions of time with finite square integrals (i.e., between random variables f(ω) such that \(E[f\bar{f}]\) is finite and functions q(t) such that \(\int q(t)\overline{q}(t)\mu (dt)\) is finite). Although we have defined the mapping only for functions \(q(t) =\sum c_{i}\chi _{A_{i}}(t)\), an argument that we omit enables us to extend the mapping to all random variables and functions of t with the square integrability properties listed above.

Example.

We now show in detail how this construction works for a very special case. Say we are given a probability space (Ω, B, P) and three subsets of the t-axis: A 1 = [0, 1), A 2 = [1, 3), and \(A_{3} = [3,3\frac{1} {2}]\). Each A i is assigned a real-valued random variable ρ i (ω) = ρ(A i , ω) that has mean 0 and variance equal to the length of A i . For example, ρ 1(ω) has mean 0 and variance 1, and so forth. The variables ρ 1, ρ 2, and ρ 3 are independent, so E[ρ i ρ j ] = 0 for i ≠ j, while E[ρ i 2] is the length of the ith interval. Moreover,

$$\displaystyle\begin{array}{rcl} & \chi _{1}(t) = \left \{\begin{array}{@{}l@{\quad }l@{}} 1,\quad &0 \leq t < 1,\\ 0,\quad &\text{elsewhere,} \end{array} \right.& \\ & \chi _{2}(t) = \left \{\begin{array}{@{}l@{\quad }l@{}} 1,\quad &1 \leq t < 3,\\ 0,\quad &\text{elsewhere,} \end{array} \right.& \\ & \chi _{3}(t) = \left \{\begin{array}{@{}l@{\quad }l@{}} 1,\quad &3 \leq t \leq 3\tfrac{1} {2},\\ 0,\quad &\text{elsewhere,} \end{array} \right.& \\ \end{array}$$

where ∫χ i χ j dt = 0 for i ≠ j and ∫χ i 2dt is the length of the ith interval.

Now take a function of the form q 1(t) = ∑ i c i χ i (t), where the c i are constants. Clearly,

$$\displaystyle{ f_{1}(\omega ) =\sum _{ i=1}^{3}c_{ i}\rho _{i}(\omega ). }$$

Suppose we have another function q 2(t) on the same partition:

$$\displaystyle{ q_{2}(t) =\sum _{ j=1}^{3}d_{ j}\chi _{j}(t) \rightarrow f_{2}(\omega ) =\sum _{ j=1}^{3}d_{ j}\rho _{j}(\omega ). }$$

Then

$$\displaystyle\begin{array}{rcl} E[f_{1}\overline{f_{2}}]& = E\left [\sum _{i=1}^{3}\sum _{j=1}^{3}c_{i}\overline{d_{j}}\rho _{i}\rho _{j}\right ]& \\ & =\sum _{ j=1}^{3}c_{j}\overline{d_{j}}E\left [\rho _{j}^{2}\right ] & \\ & =\sum _{ j=1}^{3}c_{j}\overline{d_{j}}\mu (A_{j}), & \\ \end{array}$$

where μ(A j ) is the length of A j . Notice also that

$$\displaystyle\begin{array}{rcl} \int _{0}^{3\frac{1} {2} }q_{1}(t)\overline{q_{2}}(t)\,dt& =\int _{ 0}^{3\frac{1} {2} }\sum _{i=1}^{3}\sum _{j=1}^{3}c_{i}\overline{d_{j}}\chi _{i}(t)\chi _{j}(t)\,dt& \\ & =\sum _{j}c_{j}\overline{d_{j}}\mu (A_{j}), & \\ \end{array}$$

which verifies that q(t) → f(ω), so \(E[f_{1}\overline{f_{2}}] =\int q_{1}(t)\overline{q_{2}}(t)\mu (dt)\) as in (6.6).
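
A Monte Carlo sketch of this example (with arbitrary coefficients c i and d j , and with the ρ i taken, for concreteness, to be independent Gaussians whose variances are the interval lengths) reproduces the identity (6.6) up to sampling error:

```python
import numpy as np

rng = np.random.default_rng(1)
lengths = np.array([1.0, 2.0, 0.5])   # lengths of A_1 = [0,1), A_2 = [1,3), A_3 = [3,3.5]
c = np.array([2.0, -1.0, 0.5])        # coefficients of q_1 (arbitrary)
d = np.array([1.0,  3.0, -2.0])       # coefficients of q_2 (arbitrary)

# rho_i: independent, mean 0, variance = length of A_i (one column per interval)
N = 400_000
rho = rng.standard_normal((N, 3)) * np.sqrt(lengths)

f1 = rho @ c   # f_1(omega) = sum_i c_i rho_i(omega)
f2 = rho @ d   # f_2(omega) = sum_j d_j rho_j(omega)

print((f1 * f2).mean())         # Monte Carlo estimate of E[f_1 f_2]
print(np.sum(c * d * lengths))  # exact value: int q_1 q_2 mu(dt) = sum_j c_j d_j mu(A_j)
```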

Now approximate every square integrable function q on A by a step function, construct the corresponding random variable, and take the limit, as the approximation improves, of the sequence of random variables obtained in this way. This makes for a mapping of square integrable functions on A onto random variables with finite mean squares. This mapping can be written as

$$\displaystyle{ f(\omega ) =\int q(s)\rho (ds,\omega ) }$$

(the right-hand side is an integral with respect to the measure ρ), where the variable t has been replaced by s for convenience. Now view a stochastic process u as a family of random variables labeled by the parameter t (i.e., there is a random variable u for every value of t) and apply the representation just derived at each value of t. Therefore,

$$\displaystyle{ u(t,\omega ) =\int q(t,s)\rho (ds,\omega ). }$$

Assume that u(t, ω) is stationary in the wide sense. Then the covariance of u is

$$\displaystyle\begin{array}{lll} R(t_{1} - t_{2})& = E[u(t_{1},\omega )\overline{u}(t_{2},\omega )] & \\ & = E\left [\int q(t_{1},s_{1})\rho (ds_{1})\int \bar{q}(t_{2},s_{2})\bar{\rho }(ds_{2})\right ]& \\ & = E\left [\int q(t_{1},s_{1})\bar{q}(t_{2},s_{2})\rho (ds_{1})\bar{\rho }(ds_{2})\right ]& \\ & =\int q(t_{1},s_{1})\bar{q}(t_{2},s_{2})E[\rho (ds_{1})\bar{\rho }(ds_{2})]& \\ & =\int q(t_{1},s)\bar{q}(t_{2},s)\mu (ds). & \\ \end{array}$$

One can show that the converse is also true: if the last equation holds, then u(t, ω) = ∫q(t, s)ρ(ds, ω) with \(E[\rho (ds)\bar{\rho }(ds)] =\mu (ds)\). Note that in all of the above, equality holds in a mean square (L 2) sense, and little can be said about the higher moments.

Example.

If u = u(t, ω) is a wide-sense stationary stochastic process, then it follows from Khinchin’s theorem that

$$\displaystyle\begin{array}{rcl} R(T)& = E[u(t + T,\omega )\overline{u(t,\omega )}]&\end{array}$$
(6.7)
$$\displaystyle\begin{array}{rcl} & =\int {e}^{ikT}dG(k).&\end{array}$$
(6.8)

Conversely, if \(E[\rho (dk)\overline{\rho (dk)}] = dG(k)\), we see that if

$$\displaystyle{u(t,\omega ) =\int {e}^{ikt}\rho (dk,\omega ),}$$

then

$$\displaystyle\begin{array}{lll} E[u(t + T,\omega )\overline{u(t,\omega )}]& =\int {e}^{ik(t+T-t)}E[\rho (dk)\overline{\rho (dk)}]& \\ & =\int {e}^{ikT}dG(k). & \\ \end{array}$$

We have just shown that dG(k) is the energy density in the interval dk. This random measure ρ(dk) is the stochastic Fourier transform of u. The inverse Fourier transform does not exist in the usual sense (i.e., ∫u(t, ω)e  − iktdt for each ω does not exist), but for the representation u(t, ω) = ∫e iktρ(dk, ω) to hold, it is sufficient for E[ | u(t) | 2] to exist for each t.

One can summarize the construction of the stochastic Fourier transform as follows: For the ordinary Fourier transform, the Parseval identity is a consequence of the definitions. To generalize the Fourier transform, we started from a general form of Parseval’s identity and found a generalized version of the Fourier transform that satisfies it.

Example.

Suppose dG(k) = g(k) dk. Then

$$\displaystyle{ \int {e}^{ik(t_{2}-t_{1})}dG(k) =\int {e}^{ikt_{2} }\sqrt{g(k)}{e}^{-ikt_{1} }\sqrt{g(k)}\,dk. }$$

Recall that g(k) ≥ 0. Write \(\sqrt{g(k)} =\hat{ h}(k) =\widehat{ h(t)}(k)\), where h(t) is the inverse Fourier transform of \(\hat{h}(k)\), \(\hat{h}(k) = \frac{1} {\sqrt{2\pi }}\int h(t){e}^{-ikt}dt\). Then

$$\displaystyle\begin{array}{lll}{ e}^{-ikt_{2} }\sqrt{g(k)}& = {e}^{-ikt_{2}} \frac{1} {\sqrt{2\pi }}\int h(t){e}^{-ikt}dt& \\ & = \frac{1} {\sqrt{2\pi }}\int h(t){e}^{-ik(t+t_{2})}dt & \\ & = \frac{1} {\sqrt{2\pi }}\int h(t - t_{2}){e}^{-ikt}dt & \\ & =\widehat{ h(t - t_{2})}(k), & \\ \end{array}$$

where the (k) at the very end is there to remind you that \(\widehat{h(t - t_{2})}\) is a function of k. Since the Fourier transform preserves inner products, we find that

$$\displaystyle{ R(t_{1},t_{2}) =\int \bar{h}(t - t_{1})h(t - t_{2})\,dt, }$$

and by changing t to s, we obtain

$$\displaystyle{ R(t_{1},t_{2}) =\int \bar{h}(s - t_{1})h(s - t_{2})\mu (ds), }$$

where μ(ds) = ds. Applying our representation, we get \(u(t,\omega ) =\int \bar{ h}(s - t)\rho (ds)\), where E[ | ρ(ds) | 2] = ds. The random measure constructed as increments of Brownian motion at instants ds apart has this property. Thus, any wide-sense stationary stochastic process with dG(k) = g(k) dk can be approximated as a sum of translates (in time) of a fixed function, each translate multiplied by independent Gaussian random variables. This is the moving average representation.
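
A discrete sketch of the moving average representation (with assumed ingredients: a Gaussian kernel h chosen only for illustration, and the measure ρ(ds) approximated by independent Gaussian increments of variance Δs on a finite grid): the sample covariance of the resulting process should match the discretized integral \(\int \bar{h}(s - t_{1})h(s - t_{2})\,ds\).

```python
import numpy as np

rng = np.random.default_rng(2)
ds = 0.1
s = np.arange(-8.0, 8.0, ds)   # grid carrying the increments rho(ds)
h = lambda x: np.exp(-x**2)    # hypothetical (real) kernel h

t1, t2 = 0.3, 1.1
N = 20_000                     # number of independent realizations

# Independent Gaussian increments with E[rho(ds)^2] = ds, one row per realization.
incr = rng.standard_normal((N, s.size)) * np.sqrt(ds)

u_t1 = incr @ h(s - t1)        # u(t1, omega) ~ int h(s - t1) rho(ds, omega)
u_t2 = incr @ h(s - t2)

print((u_t1 * u_t2).mean())                # sample covariance R(t1, t2)
print(np.sum(h(s - t1) * h(s - t2)) * ds)  # discretized int h(s - t1) h(s - t2) ds
```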

6.6 Exercises

  1. 1.

    Find some way to show nonrigorously that the covariance function of white noise is a delta function. Suggestion: Approximate Brownian motion by a random walk with Gaussian increments of nonzero length, find the time series of the difference quotients of this walk, calculate its covariance, and take a formal limit.

  2. 2.

    Consider the stochastic process u = ξcos(t), where ξ is a random variable with mean 0 and variance 1. Find the mean and the covariance functions. Obviously, this is not a stationary process. However, cos(t) = (e it + e  − it) ∕ 2. How do you reconcile this with the construction we have of stationary processes as sums of exponentials?

  3. 3.

    Consider the differential equation (u 2) x  = εu xx on the real line, with the boundary conditions u( − ∞) = u 0, u( + ∞) =  − u 0, where ε and u 0 are constants. Assume that u is a velocity, with dimension L ∕ T, where L is the dimension of length and T the dimension of time. Find the dimension of ε. Because of the boundary conditions, u does not have a usual Fourier transform, but one can define one by taking the Fourier transform of u′ and dividing it by ik. Let \(\hat{u}(k)\) be this Fourier transform of u. Define the energy spectrum by \(E(k) = \vert \hat{u}(k){\vert }^{2}\). Find the dimension of E(k); show that the dimensionless quantity E(k)k 2 ∕ u 0 2 must be a function of the variable kε ∕ u 0. Assume complete similarity, and deduce that as you pass to the limit ε → 0, the spectrum converges to E(k) = C ∕ k 2 for some constant C.

  4. 4.

    Consider the wide-sense stationary stochastic process u = ξe it, where ξ is a Gaussian variable with mean 0 and variance 1. What is its stochastic Fourier transform? What is the measure ρ(dk)?

  5. 5.

    Consider a stochastic process of the form \(u(\omega ,t) =\sum _{j}\xi _{j}{e}^{i\lambda _{j}t}\), where the sum is finite and the ξ j are independent random variables with mean 0 and variance v j . Calculate the limit as T → ∞ of the random variable \((1/T)\int _{-T}^{T}\vert u(\omega ,s){\vert }^{2}\,ds\). How is it related to the spectrum as we have defined it? What is the limit of \((1/T)\int _{-T}^{T}u\,ds\)?

  6. 6.

    Suppose you have to construct on the computer (for example, for the purpose of modeling the random transport of pollutants) a Gaussian stationary stochastic process with mean 0 and a given covariance function R(t 1 − t 2). Propose a construction.

  7. 7.

    Show that there is no stationary (in the wide sense) stochastic process u = u(ω, t) that satisfies (for each ω) the differential equation y′′ + 4y = 0 as well as the initial condition y(t = 0) = 1.

  8. 8.

    Let η be a random variable. Its characteristic function is defined as ϕ(λ) = E[e iλη]. Show that ϕ(0) = 1 and that | ϕ(λ) | ≤ 1 for all λ. Show that if \(\phi _{1},\phi _{2},\ldots ,\phi _{n}\) are the characteristic functions of independent random variables \(\eta _{1},\ldots ,\eta _{n}\), then the characteristic function of the sum of these variables is the product of the ϕ i .

  9. 9.

    Show that if ϕ(λ) is the characteristic function of η, then

    $$\displaystyle{ E[{\eta }^{n}] = {(-i)}^{n}\frac{{d}^{n}} {d{\lambda }^{n}}\phi (0), }$$

    provided both sides of the equation make sense. Use this fact to show that if ξ i , i = 1, , n, are Gaussian variables with mean 0, not necessarily independent, then

    $$\displaystyle{ E[\xi _{1}\xi _{2}\cdots \xi _{n}] = \left \{\begin{array}{@{}l@{\quad }l@{}} \Sigma \Pi E[\xi _{i_{k}}\xi _{j_{k}}],\quad &n\;\mathrm{even}, \\ 0, \quad &n\;\mathrm{odd.} \end{array} \right. }$$

    On the right-hand side, i k and j k are two of the indices, the product is over a partition of the n indices into disjoint groups of two, and the sum is over all such partitions (this is Wick’s theorem). Hints: Consider the variable Σλ j ξ j ; its moments can be calculated from the derivatives of its characteristic function. By assumption, this variable is Gaussian and its characteristic function, i.e., the Fourier transform of its density, is given by a formula we have derived.

  10. 10.

    Consider the following functions R(T); which ones are the covariances of some wide-sense stationary stochastic process, and why? (here T = t 1 − t 2, as usual):

    1. 1.

      \(R(T) = {e}^{-{T}^{2} }\).

    2. 2.

      \(R = T{e}^{-{T}^{2} }\).

    3. 3.

      \(R = {e}^{-{T}^{2}/2 }({T}^{2} - 1)\).

    4. 4.

      \(R = {e}^{-{T}^{2}/2 }(1 - {T}^{2})\).