22.1 Processes with Finite Second Moments

Let {ξ(t), −∞<t<∞} be a random process for which there exist the moments a(t)=E ξ(t) and R(t,u)=E ξ(t)ξ(u). Since it is always possible to study the process ξ(t)−a(t) instead of ξ(t), we can assume without loss of generality that a(t)≡0.

Definition 22.1.1

The function R(t,u) is said to be the covariance function of the process ξ(t).

Definition 22.1.2

A function R(t,u) is said to be nonnegative (positive) definite if, for any k, any u_1,…,u_k and any a_1,…,a_k not all equal to zero,

$$\sum_{i,j} a_ia_jR (u_i,u_j)\ge0\quad(>0). $$

It is evident that the covariance function R(t,u) is nonnegative definite, because

$$\sum_{i,j} a_i a_j R(u_i,u_j)=\mathbf{E} \biggl(\sum_i a_i\xi(u_i) \biggr)^2 \ge0. $$

Definition 22.1.3

A process ξ(t) is said to be unpredictable if no nontrivial linear combination of the variables ξ(u_1),…,ξ(u_k) is zero with probability 1, i.e. if there exist no u_1,…,u_k and a_1,…,a_k, not all equal to zero, such that

$$\mathbf{P} \biggl(\sum_i a_i \xi(u_i)=0 \biggr)=1. $$

If R(t,u) is the covariance function of an unpredictable process, then R(t,u) is positive definite. We will see below that the converse assertion is also true in a certain sense.

Unpredictability means that ξ(t_k) cannot be represented, with probability 1, as a linear combination of the ξ(t_j), j<k.

Example 22.1.1

The process \(\xi(t)=\sum_{k=1}^{N}\xi_{k} g_{k}(t)\), where the g_k(t) are linearly independent and the ξ_k are independent, is not unpredictable, because from ξ(t_1),…,ξ(t_N) we can determine the values ξ(t) for all other t.

Consider the Hilbert space L_2 of all random variables η on \(\langle\varOmega,\mathfrak{F},\mathbf{P}\rangle\) having finite second moments and E η=0, endowed with the inner product (η_1,η_2)=E η_1η_2 and the corresponding distance ∥η_1−η_2∥=[E(η_1−η_2)^2]^{1/2}. Convergence in L_2 is obviously convergence in mean quadratic.

A random process ξ(t) may be thought of as a curve in L 2.

Definition 22.1.4

A random process ξ(t) is said to be wide sense stationary if the function R(t,u)=:R(t−u) depends on the difference t−u only. The function R(s) is called nonnegative (positive) definite if the function R(t,t+s) is of the respective type. For brevity, we will often call wide sense stationary processes simply stationary.

For the Wiener process, R(t,u)=E w(t)w(u)=min(t,u), so that w(t) cannot be stationary. But the process ξ(t)=w(t+1)−w(t) will already be stationary.
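Indeed, a direct computation using E w(a)w(b)=min(a,b) shows that, for 0≤s≤1,

$$\mathbf{E}\xi(t)\xi(t+s)=\min(t+1,t+s+1)-\min(t+1,t+s)-\min(t,t+s+1)+\min(t,t+s)=1-s, $$

while E ξ(t)ξ(t+s)=0 for s≥1 (the increments are then independent), so that R(t,t+s)=max(0,1−|s|) depends on s only.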

It is obvious that, for a stationary process, the function R(s) is even and \(\mathbf{E}\xi^{2}(t)=R(0)={\rm const}\). For simplicity’s sake, put R(0)=1. Then, by the Cauchy–Bunjakovsky inequality,

$$\bigl|R(s)\bigr|=\bigl|\mathbf{E}\xi(t)\xi(t+s)\bigr|\le \bigl[\mathbf{E}\xi ^2(t) \mathbf{E}\xi^2(t+s) \bigr]^{1/2}=R(0)=1. $$

Theorem 22.1.1

(1) A process ξ(t) is continuous in mean quadratic (\(\xi(t+\varDelta)\stackrel{(2)}{\longrightarrow} \xi(t)\) as Δ→0) if and only if the function R(u) is continuous at zero.

(2) If the function R(u) is continuous at zero, then it is continuous everywhere.

Proof

$$\begin{aligned} (1)\quad \bigl\|\xi(t+\varDelta)-\xi(t)\bigr\|^2 =&\mathbf{E} \bigl(\xi(t+\varDelta)- \xi (t) \bigr)^2=2R(0)-2R(\varDelta). \\(2) \qquad R(t+\varDelta)-R(t) =&\mathbf{E} \bigl(\xi(t+\varDelta)\xi(0)-\xi(t) \xi (0) \bigr) \\=& \bigl(\xi(0),\xi(t+\varDelta)-\xi(t) \bigr)\le\bigl\|\xi(t+\varDelta )-\xi(t)\bigr\| \\=&\sqrt{2 \bigl(R(0)-R(\varDelta) \bigr)}. \end{aligned}$$
(22.1.1)

The theorem is proved. □

A process ξ(t) continuous in mean quadratic will be stochastically continuous, as we can see from Chaps. 6 and 18. The continuity in mean quadratic does not, however, imply path-wise continuity. The reader can verify this by considering the example of the process

$$\xi(t)=\eta(t+1)-\eta(t)-1, $$

where η(t) is the Poisson process with parameter 1. For that process the covariance function

$$R(s)=\max \bigl(0,1-|s| \bigr) $$

is continuous, although the trajectories of ξ(t) are not. If

$$ \bigl|R(\varDelta)-R(0) \bigr|<c\varDelta^{1+\varepsilon} $$
(22.1.2)

for some ε>0 then, by the Kolmogorov theorem (see Theorem 18.2.1), ξ(t) has a continuous modification. From this it follows, in particular, that if R(t) is twice differentiable at the point t=0, then the trajectories of ξ(t) may be assumed continuous. Indeed, in that case, since R(t) is even, one has

$$R'(0)=0\quad\mbox{and}\quad R(\varDelta)-R(0)\sim \frac{1}{2}R''(0)\varDelta^2. $$

On the whole, the smoother the covariance function is at zero, the smoother the trajectories of ξ(t) are.

Assume that the trajectories of ξ(t) are measurable (for example, belong to the space D).

Theorem 22.1.2

(The simplest ergodic theorem)

If

$$ R(s)\to0\quad\mathit{as}\ s\to\infty, $$
(22.1.3)

then

$$\zeta_T:=\frac{1}{T}\int_0^T \xi(t)\,dt\stackrel {(2)}{\longrightarrow} 0. $$

Proof

Clearly,

$$\|\zeta_T\|^2=\frac{1}{T^2}\int_0^T \int_0^T R(t-u)\,dt\,du. $$

Since R(s) is even,

$$J:=\int_0^T\int_0^T R(t-u)\,dt\,du=2\int_0^T\int _u^T R(t-u)\,dt\,du. $$

Making the orthogonal change of variables \(v=(t-u)/{\sqrt{2}}\), \(s=(t+u)/{\sqrt{2}}\) (its Jacobian equals 1), we obtain

$$J=2\int_0^{T/\sqrt{2}}R \bigl(v\sqrt{2} \bigr) \bigl(T\sqrt{2}-2v \bigr)\,dv \le2\sqrt{2}\,T\int_0^{T/\sqrt{2}} \bigl|R \bigl(v\sqrt{2} \bigr) \bigr|\,dv =2T\int_0^T \bigl|R(z) \bigr|\,dz, $$

so that, by (22.1.3),

$$\|\zeta_T\|^2=\frac{J}{T^2}\le\frac{2}{T}\int_0^T \bigl|R(z) \bigr|\,dz\to0 \quad\mbox{as } T\to\infty, $$

since the Cesàro averages of a function vanishing at infinity tend to zero as well.

The theorem is proved. □
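The following minimal numerical sketch (an illustration added here, not part of the original argument) shows the effect described by Theorem 22.1.2. It simulates a discretised stationary Gaussian process with the covariance function R(s)=e^{−|s|}, which obviously satisfies (22.1.3), and prints the time averages ζ_T for growing horizons T; the Ornstein–Uhlenbeck-type recursion, the grid step and the horizons are arbitrary choices made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ou(T, dt):
    """Simulate a discretised stationary Gaussian process on [0, T] with
    covariance R(s) = exp(-|s|), via the exact AR(1) (Ornstein-Uhlenbeck) recursion."""
    n = int(T / dt)
    phi = np.exp(-dt)                 # one-step correlation exp(-dt)
    sigma = np.sqrt(1.0 - phi ** 2)   # innovation standard deviation keeping Var = 1
    x = np.empty(n + 1)
    x[0] = rng.standard_normal()      # start from the stationary distribution
    for k in range(n):
        x[k + 1] = phi * x[k] + sigma * rng.standard_normal()
    return x

for T in (10, 100, 1000):
    path = simulate_ou(T, dt=0.01)
    zeta_T = path.mean()              # discrete analogue of (1/T) * int_0^T xi(t) dt
    print(f"T = {T:5d}:  zeta_T = {zeta_T: .4f}")
```

As T grows, the printed values of ζ_T concentrate near zero, in agreement with the theorem.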

Example 22.1.2

The stationary white noise process ξ(t) is defined as a process with independent values, i.e. a process such that, for any t_1,…,t_n, the variables ξ(t_1),…,ξ(t_n) are independent. For such a process,

$$R(s)=\mathbf{E}\xi(t)\xi(t+s)=0\quad\mbox{for } s\ne0, $$

and thus condition (22.1.3) is met. However, one cannot apply Theorem 22.1.2 here, for the trajectories of ξ(t) will be non-measurable with probability 1 (for example, the set B={t:ξ(t)>0} is non-measurable with probability 1).

Definition 22.1.5

A process ξ(t) is said to be strict sense stationary if, for any t_1,…,t_k, the distribution of (ξ(t_1+u),ξ(t_2+u),…,ξ(t_k+u)) is independent of u.

It is obvious that if ξ(t) is a strict sense stationary process then

$$\mathbf{E}\xi(t)\xi(u)=\mathbf{E}\xi(t-u)\xi(0)=R(t-u), $$

and ξ(t) will be wide sense stationary. The converse is, of course, not true. However, there exists a class of processes for which both concepts of stationarity coincide.

22.2 Gaussian Processes

Definition 22.2.1

A process ξ(t) is said to be Gaussian if its finite-dimensional distributions are normal.

We again assume that E ξ(t)=0 and R(t,u)=E ξ(t)ξ(u).

The finite-dimensional distributions are completely determined by the ch.f.s (λ=(λ_1,…,λ_k), ξ=(ξ(t_1),…,ξ(t_k)))

$$\mathbf{E}e^{i(\lambda,\xi)}=\mathbf{E}e^{i\sum_j \lambda_j\xi(t_j)}=e^{-\frac{1}{2}\lambda R\lambda^T}, $$

where R=∥R(t_i,t_j)∥ and the superscript T stands for transposition, so that

$$\lambda R\lambda^T=\sum_{i,j} \lambda_i\lambda_j R(t_i,t_j). $$

Thus for a Gaussian process the finite-dimensional distributions are completely determined by the covariance function R(t,u).

We saw that for an unpredictable process ξ(t), the function R(t,u) is positive definite. A converse assertion may be stated in the following form.

Theorem 22.2.1

If the function R(t,u) is positive definite, then there exists an unpredictable Gaussian process with the covariance function R(t,u).

Proof

For arbitrary t_1,…,t_k, define the finite-dimensional distribution of the vector (ξ(t_1),…,ξ(t_k)) via the density

$$p_{t_1,\ldots,t_k}(x_1,\ldots,x_k)= \frac{\sqrt{|A|}}{(2\pi)^{k/2}}\exp \biggl\{-\frac{1}{2}xAx^T \biggr\}, $$

where A is the matrix inverse to the covariance matrix R=∥R(t_i,t_j)∥ (see Sect. 7.6) and |A| is the determinant of A. These distributions will clearly be consistent, because the covariance matrices are consistent (the matrix for ξ(t_1),…,ξ(t_{k−1}) is a submatrix of R). It remains to make use of the Kolmogorov theorem. The theorem is proved. □

Example 22.2.1

Let w(t) be the standard Wiener process. The process

$$w^0(t)=w(t)-tw(1), \quad t\in[0,1], $$

is called the Brownian bridge (its “ends are fixed”: w^0(0)=w^0(1)=0). The covariance function of w^0(t) is equal to

$$R(t,u)=\mathbf{E} \bigl(w(t)-tw(1) \bigr) \bigl(w(u)-uw(1) \bigr)=t(1-u) $$

for t≤u.
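Indeed, expanding the product and using E w(a)w(b)=min(a,b) and E w^2(1)=1, we get

$$\mathbf{E}w^0(t)w^0(u)=\min(t,u)-ut-tu+tu=\min(t,u)-tu, $$

which equals t(1−u) when t≤u (and u(1−t) when u≤t).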

A Gaussian wide sense stationary process ξ(t) is strict sense stationary. This immediately follows from the fact that for R(t,u)=R(t−u) the finite-dimensional distributions of ξ(t) become invariant with respect to time shift:

$$p_{t_1,\ldots,t_k}(x_1,\ldots,x_k)=p_{t_1+u,\ldots,t_k+u}(x_1, \ldots,x_k) $$

since ∥R(t_i+u,t_j+u)∥=∥R(t_i,t_j)∥.

If ξ(t) is a Gaussian process, then conditions ensuring the smoothness of its trajectories can be substantially relaxed in comparison with (22.1.2).

For simplicity's sake, let the Gaussian process ξ(t) be stationary.

Theorem 22.2.2

If, for h<1,

$$\bigl|R(h)-R(0)\bigr|<c \biggl(\log\frac{1}{h} \biggr)^{-\alpha},\quad \alpha>3,\ c<\infty, $$

then the trajectories of ξ(t) can be assumed continuous.

Proof

We make use of Theorem 18.2.2 and put \(\varepsilon(h)= (\log\frac{1}{h} )^{-\beta}\) for 1<β<(α−1)/2 (we take logarithms to the base 2). Then

$$\sum_{n=1}^\infty\varepsilon \bigl(2^{-n} \bigr)=\sum_{n=1}^\infty n^{-\beta} <\infty, $$

and, by (22.1.1), the increment ξ(t+h)−ξ(t) is normally distributed with zero mean and variance 2(R(0)−R(h))≤2c(log(1/h))^{−α}, so that

$$\mathbf{P} \bigl( \bigl|\xi(t+h)-\xi(t) \bigr|>\varepsilon(h) \bigr) =2 \biggl[1-\varPhi \biggl(\frac{\varepsilon(h)}{\sqrt{2 (R(0)-R(h) )}} \biggr) \biggr] \le2 \biggl[1-\varPhi \biggl(\frac{1}{\sqrt{2c}} \biggl(\log\frac{1}{h} \biggr)^{\alpha/2-\beta} \biggr) \biggr]. $$
(22.2.1)

Since the argument of Φ increases unboundedly as h→0, γ=α−2β>1, and by (19.3.1)

$$1-\varPhi(x)\sim\frac{1}{\sqrt{2\pi}x}e^{-x^2/2} \quad\mbox {as } x\to \infty, $$

we see that the right-hand side of (22.2.1) does not exceed

$$q(h):=c_1 \biggl(\log\frac{1}{h} \biggr)^{\beta-\alpha/2} \exp \biggl\{{-}c_2 \biggl(\log\frac{1}{h} \biggr)^{\alpha-2\beta} \biggr\}, $$

so that

$$\sum_{n=1}^\infty2^nq \bigl(2^{-n}\bigr)=c_1\sum_{n=1}^\infty n^{-\gamma/2}\exp\bigl\{{-}c_2n^\gamma+n\ln2\bigr\}<\infty, $$

because c 2>0 and γ>1. The conditions of Theorem 18.2.2 are met, and so Theorem 22.2.2 is proved. □

22.3 Prediction Problem

Suppose the distribution of a process ξ(t) is known, and one is given the trajectory of ξ(t) on a set B⊂(−∞,t], B being either an interval or a finite collection of points. What can be said about the value ξ(t+u)? Our aim will be to find a random variable ζ which is \(\mathfrak{F}_{B}=\sigma (\xi(v),\, v\in B )\)-measurable (and called a prediction) and such that E(ξ(t+u)−ζ)^2 assumes the smallest possible value. The answer to that problem is actually known (see Sect. 4.8):

$$\zeta=\mathbf{E} \bigl(\xi(t+u)\big|\mathfrak{F}_B \bigr). $$

Let ξ(t) be a Gaussian process, B={t_1,…,t_k}, t_1<t_2<⋯<t_k<t_0=t+u, A=(σ^2)^{−1}=∥a_{ij}∥ and σ^2=∥E ξ(t_i)ξ(t_j)∥_{i,j=1,…,k,0}. Then the distribution of the vector (ξ(t_1),…,ξ(t_k),ξ(t_0)) has the density

$$f(x_1,\ldots,x_k,x_0)=\frac{\sqrt{|A|}}{(2\pi)^{(k+1)/2}} \exp \biggl\{-\frac{1}{2}\sum_{i,j} x_ix_ja_{ij} \biggr\}, $$

and the conditional distribution of ξ(t_0) given ξ(t_1),…,ξ(t_k) has density equal to the ratio

$$\frac{f(x_1,\ldots,x_k,x_0)}{\int_{-\infty}^{\infty}f(x_1,\ldots ,x_k,x_0)\,dx_0}. $$

The exponential part of this ratio has the form

$$\exp \Biggl\{{-}\frac{a_{00}x_0^2}{2}-\sum_{j=1}^k x_0x_ja_{j0} \Biggr\}. $$

This means that the conditional distribution under consideration is the normal law \(\boldsymbol {\Phi }_{\alpha,d^{2}}\), where

$$\alpha=-\sum_j \frac{x_ja_{j0}}{a_{00}},\qquad d^2=\frac{1}{a_{00}}. $$
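Indeed, completing the square in x_0 in the exponent gives

$$-\frac{a_{00}x_0^2}{2}-x_0\sum_{j=1}^k x_ja_{j0} =-\frac{a_{00}}{2} \biggl(x_0+\sum_{j=1}^k\frac{x_ja_{j0}}{a_{00}} \biggr)^2 +\frac{1}{2a_{00}} \biggl(\sum_{j=1}^k x_ja_{j0} \biggr)^2, $$

and the last term, which does not depend on x_0, is absorbed into the normalising constant of the conditional density.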

Thus, in our case the best prediction ζ is equal to

$$\zeta=-\sum_{j=1}^k\frac{\xi(t_j)a_{0j}}{a_{00}}. $$

The mean quadratic error of this prediction equals \(\sqrt{1/a_{00}}\).

We have obtained a linear prediction. In the general (non-Gaussian) case, the best prediction is usually not linear.

Consider now the problem of the best linear prediction in the case of an arbitrary process ξ(t) with finite second moments. For simplicity's sake we assume again that B={t_1,…,t_k}.

Denote by H(ξ) the subspace of L_2 generated by the random variables ξ(t), −∞<t<∞, and by H_B(ξ) the subspace of H(ξ) generated (spanned) by ξ(t_1),…,ξ(t_k). Elements of H_B(ξ) have the form

$$\sum _{j=1}^k a_j \xi(t_j). $$

The existence and the form of the best linear prediction in this case are established by the following assertion.

Theorem 22.3.1

There exists a unique point ζ∈H_B(ξ) (the projection of ξ(t+u) onto H_B(ξ), see Fig. 22.1) such that

$$ \xi(t+u)-\zeta\perp H_B(\xi). $$
(22.3.1)

Relation (22.3.1) is equivalent to

$$ \bigl\|\xi(t+u)-\zeta\bigr\|=\min _{\theta\in H_B(\xi)}\bigl\|\xi(t+u)-\theta\bigr\|. $$
(22.3.2)

Explicit formulas for the coefficients a_j in the representation ζ=∑a_j ξ(t_j) are given in the proof.

Fig. 22.1 Illustration to Theorem 22.3.1: the point ζ is the projection of ξ(t+u) onto H_B(ξ)

Proof

Relation (22.3.1) is equivalent to the equations

$$\bigl(\xi(t+u)-\zeta,\xi(t_j)\bigr)=0,\quad j=1,\ldots,k. $$

Substituting here

$$\zeta=\sum_{l=1}^ka_l \xi(t_l)\in H_B(\xi), $$

we obtain

$$ R(t+u,t_j)=\sum_{l=1}^k a_lR(t_j,t_l),\quad j=1,\ldots,k, $$
(22.3.3)

or, in vector form, R_{t+u}=aR, where

$$R_{t+u}= \bigl(R(t+u,t_1),\ldots,R(t+u,t_k) \bigr),\qquad a=(a_1,\ldots,a_k),\qquad R=\bigl\|R(t_i,t_j)\bigr\|_{i,j=1}^k. $$

If the process ξ(t) is unpredictable, then the matrix R is non-degenerate and Eq. (22.3.3) has a unique solution:

$$ a=R_{t+u}R^{-1}. $$
(22.3.4)

If ξ(t) is not unpredictable, then either R^{−1} still exists and then (22.3.4) holds, or R is degenerate. In the latter case, one has to choose from the collection ξ(t_1),…,ξ(t_k) only l<k linearly independent elements, for which all the above remains true after replacing k with l.

The equivalence of (22.3.1) and (22.3.2) follows from the following considerations. Let θ be any other element of H B (ξ). Then

$$\eta:=\theta-\zeta\in H_B(\xi),\qquad\eta\perp \xi(t+u)-\zeta, $$

so that

$$\bigl\|\xi(t+u)-\theta\bigr\|^2=\bigl\|\xi(t+u)-\zeta\bigr\|^2+\|\eta\|^2\ge \bigl\|\xi(t+u)-\zeta\bigr\|^2. $$

The theorem is proved. □

Remark 22.3.1

It can happen (in the case where the process ξ(t) is not unpredictable) that ξ(t+u)∈H_B(ξ). Then the error of the prediction ζ will be equal to zero.
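To illustrate formulas (22.3.3)–(22.3.4) numerically (a sketch added here, not part of the original text), one can compute the coefficients a_j and the prediction error for a concrete stationary covariance function. Below, the choice R(s)=e^{−|s|}, the observation points and the prediction point are arbitrary assumptions made only for the example.

```python
import numpy as np

def R(s):
    """Illustrative stationary covariance function (an assumption for this example)."""
    return np.exp(-np.abs(s))

t_obs = np.array([0.0, 0.5, 1.0, 1.5])   # observation points t_1, ..., t_k (the set B)
t_pred = 2.0                              # the point t + u to be predicted

# Covariance matrix R = ||R(t_i, t_j)|| and the vector R_{t+u} = (R(t+u, t_1), ..., R(t+u, t_k)).
R_mat = R(t_obs[:, None] - t_obs[None, :])
R_vec = R(t_pred - t_obs)

# (22.3.4): a = R_{t+u} R^{-1}; solving the linear system (22.3.3) avoids explicit inversion.
a = np.linalg.solve(R_mat, R_vec)

# The prediction is zeta = sum_j a_j xi(t_j); by the orthogonality relation (22.3.1),
# its mean squared error equals R(0) - sum_j a_j R(t+u, t_j).
mse = R(0.0) - a @ R_vec

print("coefficients a_j:     ", a)
print("mean quadratic error: ", np.sqrt(mse))
```

For this exponential covariance the solution puts all the weight on the nearest observation point ξ(t_k), which reflects the Markov property of the corresponding Gaussian process.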