A stochastic process represents a system, usually evolving along time, which incorporates an element of randomness, as opposed to a deterministic process.

Independent sequences of random variables, Markov chains and martingales have been presented above. In the present chapter, we investigate more general real stochastic processes, indexed by \(\mathbb {T}\subset \mathbb {R}\), with \(t\in \mathbb {T}\) representing time, in a wide sense. Depending on the context, when \(\mathbb {T}=\mathbb {N}\) or \(\mathbb {R}_+\), they are called sequences of random variables (in short, random sequences), stochastic processes with discrete or continuous time, signals, time series, etc.

First, we generalize to stochastic processes the notions of distributions, distribution functions, etc., already encountered in the previous chapters for random sequences. Then, we study some typical families of stochastic processes, with a stress on their asymptotic behavior: stationary or ergodic processes, ARMA processes, processes with independent increments such as the Brownian motion, and point processes, especially renewal and Poisson processes. Jump Markov processes and semi-Markov processes will be investigated in the final chapter.

4.1 General Notions

We will first define general stochastic elements, and then extend to stochastic processes notions such as distributions and stopping times, and results such as the law of large numbers and the central limit theorem.

Let \((\Omega , \mathcal {F})\) and \((\mathbf {E},\mathcal {E})\) be two measurable spaces. In probability theory, a measurable function \(X : (\Omega , \mathcal {F}) \longrightarrow (\mathbf {E}, \mathcal {E})\), that is, such that \(X^{-1}(\mathcal {E})\subset \mathcal { F}\), is called a stochastic element.

The stochastic elements may take values in any measurable space. If \(\mathbf {E}=\mathbb {R}\) (\(\mathbb {R}^d,\) d > 1, \(\mathbb {R}^{\scriptstyle \mathbb {N}} =\mathbb {R} \times \mathbb {R} \times \cdots \), \(\mathbb {R}^{\scriptstyle \mathbb {R}}\)), then X is a real random variable (random vector, random sequence or stochastic process with discrete time, stochastic process with continuous time).

It is worth noticing that even in the general theory of stochastic processes, \(\mathbb {T}\) is typically a subset of \(\mathbb {R}\), as we will assume thereafter. For any fixed ω ∈ Ω, the function \(t\mapsto X_t(\omega )\) is called a trajectory or realization of the process. The value of this function is determined by the result of a random phenomenon at the time t of its observation. For example, one may observe the tossing of a coin, the fluctuations of the successive generations of a population, the daily temperature at a given place, etc. If the trajectories of the process are continuous functions (on the left or on the right), the process itself is said to be continuous (on the left or on the right).

In general, if \((\mathbf {E},{\mathcal E} )= (\mathbb {R}^{\mathbb {T}}, {\mathcal B} (\mathbb {R}^{\mathbb {T}}))\), where \(\mathbb {T}\) is any convenient set of indices, a stochastic element can be represented as a family of real random variables \({{\mathbf X}}= (X_t)_{t\in \mathbb {T}}\). It can also be regarded as a function of two variables

$$\displaystyle \begin{aligned} \mathbf{X}:\begin{array}[t]{ccc}\Omega\times \mathbb{T}&\longrightarrow& \mathbb{R}\\(\omega,t)&\longrightarrow&X_t(\omega)\end{array} \end{aligned}$$

such that X t is a real random variable defined on \((\Omega ,{\mathcal F})\) for any fixed t. If \(\mathbb {T}\) is a totally ordered enumerable set (such as \(\mathbb {N}\) or \(\mathbb {Z}\)), then X is called a time series. If \(\mathbb {T}=\mathbb {R}^d\), with d > 1, or one of its subsets, the process is a multidimensional process. If \(\mathbf {E}=(\mathbb {R}^d)^{\mathbb {T}}\), with d > 1, it is a multivariate or vector process.

The canonical space of a real process \({\mathbf X}=(X_t)_{t\in \mathbb {T}}\) with distribution \(\mathbb {P} _{\mathbf X}\) is the triple \((\mathbb {R}^{\mathbb {T}}, {\mathcal B}(\mathbb {R}^{\mathbb {T}}),\mathbb {P} _{\mathbf X})\). Generally, we will consider that the real stochastic processes are defined on their canonical spaces.

Note that a function \(\theta _s:\mathbb {R}^{\mathbb {T}}\longrightarrow \mathbb {R}^{\mathbb {T}}\) is called a translation (or shift) operator on \(\mathbb {R}^{\mathbb {T}}\) if and only if \((\theta _s x)_t = x_{t+s}\) for all \(s,t\in \mathbb {T}\) and all \(x=(x_t)\in \mathbb {R}^{\mathbb {T}}\). When \(\mathbb {T}=\mathbb {N}\), θ n is the n-th iterate of the one-step translation operator θ 1 (also denoted by θ, see Definition 3.21 above).

A stochastic process X defined on \((\Omega ,{\mathcal F},\mathbb {P} )\) is said to be an L p process if \(X_t\in L^p(\Omega ,{\mathcal F},\mathbb {P} )\) for all \(t\in \mathbb {T},\) for \(p\in \mathbb {N}^*\). For p = 1 it is also said to be integrable, and for p = 2 to be a second order process.

Here are some classical examples of stochastic processes: sinusoidal signals, Gaussian processes, and ARMA processes.

▹ Example 4.1 (Sinusoidal Signal)

For the process defined by

$$\displaystyle \begin{aligned}X_t=A\cos{}(2\pi\nu t+\varphi), \quad t\in\mathbb{R}, \end{aligned}$$

φ is called the phase, A the amplitude and ν the frequency. These parameters can be either constant or random. For instance, if A and ν are real constants and φ is a random variable, the signal is a monochromatic wave with random phase. \(\lhd \)
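The monochromatic wave with random phase can be simulated directly. The following sketch is ours, not the text's: the amplitude A = 1, the frequency ν = 2 and the sampling grid are arbitrary choices, and the phase is drawn uniformly on [−π, π].

```python
import numpy as np

# Simulation sketch of X_t = A cos(2*pi*nu*t + phi) with constant A and nu
# and a random phase phi ~ U[-pi, pi]; all numerical choices are ours.
rng = np.random.default_rng(0)

def sinusoidal_signal(t, A=1.0, nu=2.0, phi=None):
    """One realization t -> X_t(omega); the phase is drawn once per trajectory."""
    if phi is None:
        phi = rng.uniform(-np.pi, np.pi)
    return A * np.cos(2 * np.pi * nu * t + phi)

t = np.linspace(0.0, 1.0, 1000)
x = sinusoidal_signal(t)                  # one trajectory on [0, 1]

# Many independent draws of X_0; their empirical mean should be close to 0,
# since E cos(phi) = 0 for a uniform phase.
x0_samples = np.array([sinusoidal_signal(0.0) for _ in range(20000)])
```

Each call with `phi=None` produces a different trajectory, matching the idea that the randomness lies only in the phase.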

▹ Example 4.2 (Gaussian White Noise)

Let \((\varepsilon _t)_{t\in \mathbb {Z}}\) be such that \(\varepsilon _t\sim {\mathcal N}(0,1)\) for all \(t\in \mathbb {Z}\) and \(\mathbb {E}\, (\varepsilon _{t_1}\varepsilon _{t_2})=0\) for all t 1 ≠ t 2. The process (ε t) is called a Gaussian white noise. \(\lhd \)

▹ Example 4.3 (Gaussian Process)

Let \({{\mathbf X}}=(X_t)_{t\in \mathbb {T}}\) be such that the vector \((X_{t_1},\ldots ,X_{t_n})\) is a Gaussian vector for all integers n and all t 1, …, t n. The process X is said to be Gaussian. \(\lhd \)

Definition 4.4

A sequence of random variables \(X=(X_n)_{n\in \scriptstyle \mathbb {Z}}\) is called an auto-regressive moving-average process with orders p and q, or ARMA(p, q), if

$$\displaystyle \begin{aligned}X_n+\sum_{i=1}^pa_iX_{n-i}=\varepsilon_n+\sum_{j=1}^qb_j\varepsilon_{n-j},\quad n\in\mathbb{Z}, \end{aligned}$$

where ε is a Gaussian white noise, p and q are integers, and \(a_1, \ldots , a_p\) and \(b_1, \ldots , b_q\) are all real numbers, with \(a_p\neq 0\) and \(b_q\neq 0\).

This type of process is often used for modeling signals. If q = 0, the process is said to be an auto-regressive process, or AR(p). The latter models, for example, economic data depending linearly on the p past values, up to a fluctuation (the noise). If \(a_1 = -1\) and p = 1, it is a random walk. If p = 0, the process is said to be a moving-average process, or MA(q).
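The defining recursion can be simulated directly. This is a minimal sketch under our own conventions: the noise is standard Gaussian, and the burn-in length used to forget the arbitrary initial values is an assumption, not from the text.

```python
import numpy as np

# Simulate X_n + sum_i a_i X_{n-i} = eps_n + sum_j b_j eps_{n-j} driven by a
# Gaussian white noise, discarding a burn-in segment (our choice) so that the
# returned segment is close to the stationary regime.
rng = np.random.default_rng(1)

def simulate_arma(a, b, n, burn=500):
    """a = (a_1,...,a_p), b = (b_1,...,b_q); returns X_0,...,X_{n-1}."""
    p, q = len(a), len(b)
    eps = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for k in range(max(p, q), n + burn):
        x[k] = (eps[k]
                + sum(b[j] * eps[k - 1 - j] for j in range(q))
                - sum(a[i] * x[k - 1 - i] for i in range(p)))
    return x[burn:]

# AR(1) with a_1 = -0.5, i.e. X_n = 0.5 X_{n-1} + eps_n: its lag-1
# autocorrelation should be close to 0.5.
x = simulate_arma(a=[-0.5], b=[], n=20000)
lag1_corr = np.corrcoef(x[:-1], x[1:])[0, 1]
```

For an AR(1) with coefficient 0.5, the theoretical lag-1 autocorrelation is 0.5, which the empirical estimate recovers up to sampling error.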

The ARMA processes can also be indexed by \(\mathbb {N}\), by fixing the value of X 0, …, X p, as shown in Exercise 4.1.

▹ Example 4.5 (MA(1) Process)

Let ε be a Gaussian white noise with variance σ 2 and let (X n) be a sequence of random variables such that \(X_n = b\varepsilon _{n-1} + \varepsilon _n\), for n > 0, with X 0 = ε 0. The vector (X 0, …, X n) is a linear transform of the Gaussian vector (ε 0, …, ε n), and hence is a Gaussian vector too, which is centered. For n ≥ 1 and m ≥ 1,

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{C}\mathrm{ov}\,(X_n,X_m)&\displaystyle =&\displaystyle b^2\mathbb{E}\,\varepsilon_{n-1}\varepsilon_{m-1}+ b(\mathbb{E}\,\varepsilon_{n-1}\varepsilon_m+\mathbb{E}\,\varepsilon_n\varepsilon_{m-1})+\mathbb{E}\,\varepsilon_n\varepsilon_m\\ &\displaystyle =&\displaystyle \begin{cases}0&\displaystyle \mbox{if }|n-m|\geq 2,\\ b\sigma^2&\displaystyle \mbox{if }|n-m|=1,\\ (b^2+1)\sigma^2&\displaystyle \mbox{if }n=m.\end{cases} \end{array} \end{aligned} $$

Finally, we compute \(\mathbb {V}\mathrm {ar}\, X_0=\sigma ^2\), \(\mathbb {C}\mathrm {ov}\,(X_0,X_1)=b\sigma ^2\) and \(\mathbb {C}\mathrm {ov}\, (X_0,X_m)=0\) for all m > 1. \(\lhd \)
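The covariances computed above can be checked empirically. The sketch below is ours: the values b = 0.7 and σ = 2 are arbitrary, and the covariances are estimated by averaging over one long trajectory.

```python
import numpy as np

# Empirical check of the MA(1) covariances: Var X_n = (b^2+1) sigma^2,
# Cov(X_n, X_{n+1}) = b sigma^2, Cov(X_n, X_{n+m}) = 0 for m >= 2.
# The numerical values of b and sigma are our choices.
rng = np.random.default_rng(2)
b, sigma, n = 0.7, 2.0, 200000
eps = sigma * rng.standard_normal(n + 1)
x = b * eps[:-1] + eps[1:]            # X_n = b*eps_{n-1} + eps_n, centered

var = np.mean(x * x)                  # estimates (b^2 + 1) sigma^2
cov1 = np.mean(x[:-1] * x[1:])        # estimates b sigma^2
cov2 = np.mean(x[:-2] * x[2:])        # estimates 0
```

The three estimates match the closed-form covariances of Example 4.5 up to Monte Carlo error.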

Definition 4.6

Let \((\Omega , \mathcal {F})\) be a measurable space.

  1.

    If \({\mathcal F}_t\) is a σ-algebra included in \({\mathcal F}\) for all \(t\in \mathbb {T}\), and if \({\mathcal F}_s\subset {\mathcal F}_t\) for all s < t, then \({\mathbf F}=({\mathcal F}_t)_{t\in \mathbb {T}}\) is called a filtration of \((\Omega , \mathcal {F})\). Especially, if \({{\mathbf X}}= (X_t)_{t\in \mathbb {T}}\) is a stochastic process, the filtration \({\mathbf F}=(\sigma (X_s; s\le t))_{t\in \mathbb {T}}\) is called the natural filtration of X.

  2.

    A stochastic process X is said to be adapted to a filtration \((\mathcal {F}_t)_{t\in \mathbb {T}}\) if for all \(t\in \mathbb {T}\), the random variable X t is \(\mathcal {F}_t\)-measurable.

Obviously, every process is adapted to its natural filtration. For investigating stochastic processes, it is necessary to extend the probability space \((\Omega , \mathcal {F}, {\mathbb {P} } )\) by including a filtration \({\mathbf F}=({\mathcal F}_t)_{t\in \mathbb {T}}\); then \((\Omega , \mathcal {F}, {\mathbf F}, {\mathbb {P} } )\) is called a stochastic basis. Unless otherwise stated, the filtration will be the natural filtration of the studied process.

Theorem-Definition 4.7

Let F be a filtration.

A random variable \(T: (\Omega , \mathcal {F})\to \overline {{\mathbb {R}}}_+\) such that \((T\le t)\in {\mathcal F}_t\) , for all \(t\in {\mathbb {R}}_+\) , is called an F -stopping time. Then, the family of events

$$\displaystyle \begin{aligned}{\mathcal F}_T=\{A\in{\mathcal F}\,:\,A\cap (T\le t) \in {\mathcal F}_t,\ t\in\mathbb{R}_+\}\end{aligned}$$

is a σ-algebra, called the σ-algebra of events prior to T.

A stopping time adapted to the natural filtration of some stochastic process is said to be adapted to this process. The properties of the stopping times taking values in \(\mathbb {R}_+\) derive directly from the properties of stopping times taking values in \(\mathbb {N}\) studied in Chap. 2. Note that for any stopping time T, and for the translation operator θ s for \(s\in \mathbb {R}_+\), we have (T ∘ θ s = t + s) = (T = t).

Definition 4.8

Let \({{\mathbf X}}=(X_t)_{t\in \mathbb {T}}\) be a stochastic process defined on a probability space \((\Omega ,{\mathcal F}, \mathbb {P} )\), where \(\mathbb {T}\subset \mathbb {R}\).

  1.

    The probability \(\mathbb {P} _{{\mathbf X}} = \mathbb {P} \circ {{\mathbf X}}^{-1}\) defined on \((\mathbb {R}^{\mathbb {T}}, {\mathcal B} (\mathbb {R}^{\mathbb {T}}))\) by

    $$\displaystyle \begin{aligned}\mathbb{P} _{{\mathbf X}} (B)= \mathbb{P} ({{\mathbf X}}\in B),\quad B\in {\mathcal B} (\mathbb{R}^{\mathbb{T}}), \end{aligned}$$

    is called the distribution of X.

  2.

    The probabilities \(\mathbb {P} _{t_1,\ldots ,t_n}\) defined by

    $$\displaystyle \begin{aligned}\mathbb{P} _{t_1,\ldots ,t_n}(B_1\times \ldots \times B_n)= \mathbb{P} (X_{t_1}\in B_1,\ldots ,X_{t_n}\in B_n),\end{aligned}$$

    for t 1 < ⋯ < t n, \(t_i\in \mathbb {T}\), are called the finite dimensional distributions of X. More generally, the restriction of \(\mathbb {P} _{{\mathbf X}}\) to \(\mathbb {R}^{\mathbb {S}}\) for any \(\mathbb {S}\subset \mathbb {T}\) is called the marginal distribution of X on \(\mathbb {S}\).

  3.

    The functions \(F_{t_1,\ldots ,t_n}\) defined by

    $$\displaystyle \begin{aligned}F_{t_1,\ldots ,t_n}(x_1, \ldots , x_n)= \mathbb{P} (X_{t_1}\le x_1,\ldots ,X_{t_n}\le x_n),\end{aligned}$$

    for t 1 < ⋯ < t n, \(t_i\in \mathbb {T}\) and \(x_i\in \mathbb {R}\), are called the finite dimensional distribution functions of X.

▹ Example 4.9 (Finite Dimensional Distribution Functions)

Let X be a positive random variable with distribution function F, and let \({\mathbf X}\) be the stochastic process defined by X t = X − (t ∧ X) for t ≥ 0. Its one-dimensional distribution functions are given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} F_t(x)&\displaystyle =&\displaystyle P(X_t \le x)= P(X_t \le x, X\le t)+P(X_t\le x, X>t)\\ &\displaystyle =&\displaystyle \mathbb{P} (0\le x, X\le t)+ \mathbb{P} (X-t\le x, X>t)\\&\displaystyle =&\displaystyle F(t) + F(t+x)-F(t)= F(t+x),\quad x\ge 0, \end{array} \end{aligned} $$

for t ≥ 0. Since, for x ≥ 0, X t ≤ x if and only if X ≤ t + x, its two-dimensional distribution functions are

$$\displaystyle \begin{aligned} F_{t_1,t_2}(x_1,x_2)= \mathbb{P} \big(X\le (x_1+t_1)\wedge (x_2+t_2)\big)= F\big((x_1+t_1)\wedge (x_2+t_2)\big),\quad x_1\ge 0,\ x_2\ge 0, \end{aligned} $$

for t 1 ≥ 0 and t 2 ≥ 0. \(\lhd \)
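The one-dimensional identity F_t(x) = F(t + x) can be checked by Monte Carlo. In the sketch below, the choice of an exponential distribution for X (rate 1) and the values of t and x are ours.

```python
import numpy as np

# Monte Carlo check of F_t(x) = F(t + x) for X_t = X - (t ∧ X), with X
# exponential of rate 1 (an arbitrary choice for the experiment).
rng = np.random.default_rng(3)
X = rng.exponential(1.0, 200000)
t, x = 0.5, 1.0
Xt = X - np.minimum(t, X)                 # X_t = (X - t)^+
empirical = np.mean(Xt <= x)              # estimates F_t(x)
theoretical = 1.0 - np.exp(-(t + x))      # F(t + x) for Exp(1)
```

The empirical proportion agrees with F(t + x) up to the usual binomial sampling error.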

For a given family of distribution functions, does a stochastic process on some probability space exist with these functions as distribution functions? The answer may be positive, for instance under the conditions given by the following theorem, which we state without proof.

Theorem 4.10 (Kolmogorov)

Let \((F_{t_1,\ldots ,t_n})\) , for t 1 < ⋯ < t n in \(\mathbb {T}\) , be a family of distribution functions satisfying for any \((x_1, \ldots , x_n)\in \mathbb {R}^n\) the two following coherence conditions:

  1.

    for any permutation (i 1, …, i n) of (1, …, n),

    $$\displaystyle \begin{aligned}F_{t_1,\ldots ,t_n}(x_1, \ldots , x_n)= F_{t_{i_1},\ldots ,t_{i_n}}(x_{i_1}, \ldots , x_{i_n}); \end{aligned}$$
  2.

    for all \( k\in \llbracket 1, n-1\rrbracket \) ,

    $$\displaystyle \begin{aligned}F_{t_1,\ldots ,t_{k},\ldots ,t_n}(x_1, \ldots ,x_{k},+\infty,\ldots ,+\infty)= F_{t_1,\ldots ,t_{k}}(x_1, \ldots , x_{k}).\end{aligned}$$

Then, there exists a stochastic process \({{\mathbf X}}=(X_t)_{t\in \mathbb {T}}\) defined on some probability space \((\Omega , {\mathcal F}, \mathbb {P} )\) such that

$$\displaystyle \begin{aligned} \mathbb{P} (X_{t_1}\le x_1,\ldots ,X_{t_n}\le x_n) = F_{t_1,\ldots ,t_n}(x_1, \ldots , x_n). \end{aligned}$$

For stochastic processes, the notion of equivalence takes the following form.

Definition 4.11

Let \({{\mathbf X}}=(X_t)_{t\in \mathbb {T}}\) and \({\mathbf Y}=(Y_t)_{t\in \mathbb {T}}\) be two stochastic processes both defined on the same probability space \((\Omega , \mathcal {F}, \mathbb {P} )\) and taking values in \((\mathbb {R}^d,{\mathcal B}(\mathbb {R}^d))\). They are said to be:

  1.

    weakly stochastically equivalent if for all (t 1, …, t n) \(\in \mathbb {T}^n\), all \((B_1,\ldots ,B_n)\in {\mathcal B}(\mathbb {R}^d)^n\) and all \(n\in \mathbb {N}^*\),

    $$\displaystyle \begin{aligned} \mathbb{P} (X_{t_1}\in B_1,\ldots ,X_{t_n}\in B_n)= \mathbb{P} (Y_{t_1}\in B_1,\ldots ,Y_{t_n}\in B_n). \end{aligned} $$
    (4.1)
  2.

    stochastically equivalent if

    $$\displaystyle \begin{aligned} \mathbb{P} (X_t=Y_t)=1,\quad t\in\mathbb{T}. \end{aligned} $$
    (4.2)

    Then, Y is called a version of X.

  3.

    indistinguishable if

    $$\displaystyle \begin{aligned} \mathbb{P} ( X_t=Y_t, \forall t\in \mathbb{T})=1. \end{aligned} $$
    (4.3)

The trajectories of two indistinguishable processes are a.s. equal. This latter property is stronger than the two former ones; precisely

$$\displaystyle \begin{aligned} (4.3) \Longrightarrow (4.2) \Longrightarrow (4.1). \end{aligned}$$

The converse implications do not hold, as the following example shows.

▹ Example 4.12

Let Z be a continuous positive random variable. Let X and Y be two processes indexed by \(\mathbb {R}_+\), defined by X t = 0 for all t and \(Y_t={\mathbb 1}_{\{Z=t\}}\). These two processes are stochastically equivalent, but not indistinguishable. Indeed, \({\mathbb {P} } (X_t\neq Y_t)={\mathbb {P} } (Z=t)=0\), for all t ≥ 0, but \({\mathbb {P} } (X_t=Y_t, \forall t\ge 0)=0\). \(\lhd \)

Definition 4.13

A stochastic process \({{\mathbf X}}= (X_t)_{t\in \scriptstyle \mathbb {R}}\) is said to be stochastically continuous at \(s\in \mathbb {R}\) if, for all ε > 0,

$$\displaystyle \begin{aligned} \lim_{t\to s}\mathbb{P} (| X_t-X_s |> \varepsilon)= 0. \end{aligned}$$

Note that stochastic continuity does not imply continuity of the trajectories of the process.

Most of the probabilistic notions defined for sequences of random variables indexed by \(\mathbb {N}\) in Chap. 1 extend naturally to stochastic processes indexed by \(\mathbb {R}_+\).

Definition 4.14

Let X be a stochastic process. The quantities

$$\displaystyle \begin{aligned} \mathbb{E}\, (X_{t_1}^{n_1}\ldots X_{t_d}^{n_d}),\quad d\in\mathbb{N}^*,\ n_i\in\mathbb{N}^*,\ \sum_{i=1}^dn_i=n, \end{aligned}$$

are called the order n moments of X, when they are finite.

Especially, \(\mathbb {E}\, X_t\) is called the mean value at time t and \(\mathbb {V}\mathrm {ar}\, X_t\) the instantaneous variance. The latter is also called power, by analogy with an electric voltage X t across a resistor, for which \(\mathbb {E}\,( X_t^2)\) is the instantaneous power.

The natural extension of the covariance matrix for finite families is the covariance function.

Definition 4.15

Let X be a stochastic process. The function \(R_{{\mathbf X}}\colon \mathbb {T}\times \mathbb {T}\longrightarrow \overline {\mathbb {R}}\) defined by

$$\displaystyle \begin{aligned} R_{{\mathbf X}}(t_1,t_2)=\mathbb{C}\mathrm{ov}\,(X_{t_1},X_{t_2})=\mathbb{E}\, (X_{t_1}X_{t_2})-(\mathbb{E}\, X_{t_1})(\mathbb{E}\, X_{t_2}), \end{aligned}$$

is called the covariance function of the process.

4.1.1 Properties of Covariance Functions

  1.

    If R X takes values in \(\mathbb {R}\), then X is a second order process, and \(\mathbb {V}\mathrm {ar}\, X_t=R_{{\mathbf X}}(t,t)\).

  2.

    A covariance function is a positive semi-definite function. Indeed, for all \((c_1,\ldots ,c_n)\in \mathbb {R}^n\), since covariance is bilinear,

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{i=1}^n\sum_{j=1}^nc_ic_jR_{{\mathbf X}}(t_i,t_j)&\displaystyle =&\displaystyle \mathbb{C}\mathrm{ov}\,(\sum_{i=1}^nc_iX_{t_i},\sum_{j=1}^nc_jX_{t_j}) \\&\displaystyle =&\displaystyle \mathbb{V}\mathrm{ar}\,\Big(\sum_{i=1}^nc_iX_{t_i}\Big)\ge 0. \end{array} \end{aligned} $$
  3.

    For a centered process,

    $$\displaystyle \begin{aligned} R_{{\mathbf X}}(t_1,t_2)^2\le R_{{\mathbf X}}(t_1,t_1)R_{{\mathbf X}}(t_2,t_2),\end{aligned}$$

    by the Cauchy-Schwarz inequality.
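Property 2 can be illustrated numerically: evaluating a covariance function on any finite grid of times yields a positive semi-definite matrix. The sketch below uses R(t 1, t 2) = t 1 ∧ t 2, the covariance of the Brownian motion mentioned in the introduction; the grid is our choice.

```python
import numpy as np

# A covariance function evaluated on a grid gives a positive semi-definite
# matrix: here R(t1, t2) = min(t1, t2), the Brownian-motion covariance.
t = np.linspace(0.1, 2.0, 20)
R = np.minimum.outer(t, t)                # matrix R(t_i, t_j)
eigenvalues = np.linalg.eigvalsh(R)       # all should be >= 0 (up to rounding)

# The quadratic form sum_i sum_j c_i c_j R(t_i, t_j) is a variance, hence >= 0.
c = np.linspace(-1.0, 1.0, 20)
quadratic_form = c @ R @ c
```

The smallest eigenvalue is non-negative up to floating-point rounding, matching the positive semi-definiteness argument via the variance of a linear combination.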

The moments of a stochastic process are obtained by averaging over Ω. They are called space averages. Another notion of average exists for processes, on the set of indices \(\mathbb {T}\); here we take \(\mathbb {T}=\mathbb {N}\) for simplification.

Definition 4.16

Let \({{\mathbf X}}=(X_n)_{n\in \scriptstyle \mathbb {N}}\) be a second order random sequence. The random variable

$$\displaystyle \begin{aligned}\overline{{\mathbf X}}=\lim_{N\rightarrow +\infty}\frac{1}{N}\sum_{n=1}^NX_n,\end{aligned}$$

is called the time mean of X and the random variable

$$\displaystyle \begin{aligned}\overline{{{\mathbf X}}({m})}=\lim_{N\rightarrow +\infty}\frac{1}{N}\sum_{n=1}^NX_nX_{n+m} \end{aligned}$$

is called the (mean) time power of X at \(m\in \mathbb {N}^*\). The limit is taken in the sense of mean square convergence.

The time mean is linear, as is the expectation. If \({\mathbf Y} = a {\mathbf X}\), then \(\overline {\mathbf Y(m)}=a^2\overline {{\mathbf X}(m)}\), as for the variance. In the next section, we will make precise the links between space and time averages.

The different notions of convergence defined in Chap. 1 for random sequences indexed by \(\mathbb {N}\) extend to stochastic processes indexed by \(\mathbb {R}_+\) in a natural way. The next extension of the law of large numbers was proven for randomly indexed sequences in Theorem 1.93 in Chap. 1; the proof below is an interesting alternative. It is then completed by a central limit theorem.

Theorem 4.17

Let (X n) be an integrable i.i.d. random sequence. If \((N_t)_{t\in \scriptstyle \mathbb {R}_+}\) is a stochastic process taking values in \(\mathbb {N}^*\) , a.s. finite for all t, independent of (X n) and converging to infinity a.s. when t tends to infinity, then

$$\displaystyle \begin{aligned}\frac 1{N_t}({X_1+\cdots+X_{N_t}})\stackrel{\mathrm{a.s.}}{\longrightarrow}\mathbb{E}\, X_1,\quad t\rightarrow +\infty. \end{aligned}$$

Proof

Set \(\overline {X}_n= (X_1+\dots +X_n)/n\) for n ≥ 1. By the strong law of large numbers, \(\overline {X}_n \) converges to \( \mathbb {E}\, X_1\) almost surely, that is, on Ω ∖ A, where \(A =\{\omega : \overline {X}_n(\omega ) \not \rightarrow {\mathbb {E}\,} X_1\}\). Note that \((\overline {X}_{N_t(\omega )}(\omega ))\) is a sub-sequence of \((\overline {X}_n(\omega ))\) whenever \(N_t(\omega )\rightarrow +\infty \). Set \(B = \{\omega : N_t(\omega ) \not \rightarrow +\infty \}\) and \(C =\{\omega : \overline {X}_{N_t(\omega )}(\omega ) \not \rightarrow {\mathbb {E}\,} X_1\}\). Then C ⊂ A ∪ B, and since \(\mathbb {P} (A)=\mathbb {P} (B)=0\), the proof is completed. □
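The randomly indexed law of large numbers can be observed in simulation. In the sketch below, the distributional choices are ours: X n i.i.d. exponential with mean 1, and N t taken as a Poisson variable of mean t plus one (so that it takes values in \(\mathbb {N}^*\)), independent of the sequence.

```python
import numpy as np

# Illustration of the randomly indexed strong law: with E X_1 = 1, the mean
# of X_1, ..., X_{N_t} should be close to 1 for large t. The choices
# X_n ~ Exp(1) and N_t ~ Poisson(t) + 1 are assumptions for the experiment.
rng = np.random.default_rng(4)

def randomly_indexed_mean(t):
    n_t = rng.poisson(t) + 1          # random index, values in N*, finite a.s.
    x = rng.exponential(1.0, n_t)     # X_1, ..., X_{N_t}, independent of N_t
    return x.mean()

m = randomly_indexed_mean(100000.0)
```

For t = 10^5, the random index is about 10^5, so the empirical mean is within a few thousandths of \(\mathbb {E}\, X_1 = 1\).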

Theorem 4.18 (Anscombe)

Let \((X_n)_{n\in \scriptstyle \mathbb {N}^*}\) be an i.i.d. random sequence with centered distribution with finite variance σ 2. If \((N_t)_{t\in \scriptstyle \mathbb {R}_+}\) is a stochastic process taking values in \(\mathbb {N}^*\) , a.s. finite for all t, independent of (X n) and converging to infinity a.s. when t tends to infinity, then

$$\displaystyle \begin{aligned}\frac 1{\sigma\sqrt{N_t}}(X_1+\cdots+X_{N_t})\stackrel{\mathcal{D}}{\longrightarrow}{\mathcal N}(0,1),\quad t\rightarrow +\infty. \end{aligned}$$

Proof

Set S n = X 1 + ⋯ + X n. Since N t is independent of (X n), conditioning on the value of N t yields

$$\displaystyle \begin{aligned} \mathbb{E}\,(e^{i tS_{N_t}/\sigma\sqrt{N_t}})=\sum_{n\ge 1}\mathbb{P} (N_t=n)\,\mathbb{E}\,(e^{i tS_{n}/\sigma\sqrt{n}}), \end{aligned}$$

so

$$\displaystyle \begin{aligned} | \mathbb{E}\,(e^{i tS_{N_t}/\sigma\sqrt{N_t}})-e^{-t^2/2} |\le\sum_{n\ge 1}\mathbb{P} (N_t=n)| \mathbb{E}\,(e^{i tS_{n}/\sigma\sqrt{n}})-e^{-t^2/2} |. \end{aligned}$$

On the one hand, due to the central limit theorem, \(S_n/\sigma \sqrt {n}\) converges in distribution to a standard Gaussian variable, so, for all ε > 0, we obtain \(| \mathbb {E}\,(e^{i tS_{n}/\sigma \sqrt {n}})-e^{-t^2/2} |<\varepsilon \) for n > n ε. Therefore,

$$\displaystyle \begin{aligned} \sum_{n>n_{\varepsilon}}\mathbb{P} (N_t=n)| \mathbb{E}\,(e^{i tS_{n}/\sigma\sqrt{n}})-e^{-t^2/2} | \le\varepsilon\mathbb{P} (N_t>n_{\varepsilon})\le\varepsilon. \end{aligned}$$

On the other hand,

$$\displaystyle \begin{aligned} \sum_{n=1}^{n_{\varepsilon}}\mathbb{P} (N_t=n)| \mathbb{E}\,(e^{i tS_{n}/\sigma\sqrt{n}})-e^{-t^2/2} | \le 2\mathbb{P} (N_t\le n_{\varepsilon}), \end{aligned}$$

and the result follows because N t converges to infinity and \(\mathbb {P} (N_t\le n_{\varepsilon })\) tends to zero when t tends to infinity. □
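Anscombe's theorem can also be illustrated numerically. The sketch below uses our own choices: centered steps uniform on [−1, 1] (so σ² = 1∕3) and a random index N t distributed as a Poisson variable of mean t plus one, independent of the steps.

```python
import numpy as np

# Simulation sketch of Anscombe's theorem: the normalized randomly indexed
# sums S_{N_t} / (sigma * sqrt(N_t)) should look standard Gaussian for
# large t. The step and index distributions are our assumptions.
rng = np.random.default_rng(5)
sigma = np.sqrt(1.0 / 3.0)            # std of a U[-1, 1] step

def normalized_sum(t):
    n_t = rng.poisson(t) + 1          # random index in N*, independent of steps
    s = rng.uniform(-1.0, 1.0, n_t).sum()
    return s / (sigma * np.sqrt(n_t))

samples = np.array([normalized_sum(500.0) for _ in range(4000)])
```

The empirical mean and standard deviation of the normalized sums are close to 0 and 1, as the standard Gaussian limit predicts.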

Results of the same type can be stated for more general functionals, as for example the ergodic theorems.

4.2 Stationarity and Ergodicity

We will study here classical properties of processes taking values in \(\mathbb {R}\). First, a stochastic process is stationary if it is invariant by translation of time, that is to say, if it has no internal clock.

Definition 4.19

Let \({{\mathbf X}}=(X_t)_{t\in \mathbb {T}}\) be a stochastic process. It is said to be strictly stationary if

$$\displaystyle \begin{aligned}(X_{t_1},\ldots ,X_{t_n})\sim(X_{t_1+s},\ldots ,X_{t_n+s}),\quad (s,t_1,\ldots ,t_n)\in\mathbb{T}^{n+1}. \end{aligned}$$

▹ Example 4.20 (Some Stationary Sequences)

An i.i.d. random sequence is stationary. An ergodic Markov chain whose initial distribution is the stationary distribution is stationary. \(\lhd \)

Ergodicity, a notion of invariance on the space Ω, has been introduced for Markov chains in Chap. 3. It extends to general random sequences as follows.

Definition 4.21

A random sequence (X n) is said to be strictly ergodic:

  1.

    at order 1 if for any real number x the random variable

    $$\displaystyle \begin{aligned}\lim_{N\rightarrow +\infty}\frac{1}{N}\sum_{n=1}^N{\mathbb 1}_{(X_n\le x)} \end{aligned}$$

    is constant;

  2.

    at order 2 if for all real numbers x 1 and x 2 the random variable

    $$\displaystyle \begin{aligned}\lim_{N\rightarrow +\infty}\frac{1}{N}\sum_{n=1}^N{\mathbb 1}_{(X_n\le x_1,\,X_{n+m}\le x_2)} \end{aligned}$$

    is constant for all \(m\in \mathbb {N}\).

Since \(\mathbb {E}\,{\mathbb 1}_{(X_n\le x)}=\mathbb {P} (X_n\le x)\), a random sequence is ergodic at order 1 (order 2) if the distributions of the random variables X n (of the pairs (X n, X n+m)) can be obtained by time average.

▹ Example 4.22 (Markov Chains)

Any aperiodic positive recurrent Markov chain is ergodic, as seen in Chap. 3. \(\lhd \)

▹ Example 4.23

The stochastic process modeling the sound coming out of a tape recorder, with no ageing but with parasitic noise, is ergodic. Indeed, the observation of one long-time recording is similar to the observation of several short-time recordings. This process is also stationary. \(\lhd \)

The following convergence result is of paramount importance in statistics. We state it without proof.

Theorem 4.24 (Ergodic)

If (X n) is a strictly stationary and ergodic random sequence and if \(g:\mathbb {R}\longrightarrow \mathbb {R}\) is a Borel function such that \(g(X_0)\) is integrable, then

$$\displaystyle \begin{aligned} \frac{1}{N}\sum_{n=0}^{N-1}g(X_n)\stackrel{\mathrm{a.s.}}{\longrightarrow}\mathbb{E}\, g(X_0),\quad N\to +\infty. \end{aligned}$$
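The theorem is easy to observe on the simplest stationary and ergodic sequence, an i.i.d. one. The sketch below is ours: standard Gaussian X n and g(x) = x², so that \(\mathbb {E}\, g(X_0)=1\).

```python
import numpy as np

# Ergodic theorem for an i.i.d. (hence stationary and ergodic) standard
# Gaussian sequence with g(x) = x^2: the time average of g(X_n) should
# approach E g(X_0) = 1. The choice of distribution and g is ours.
rng = np.random.default_rng(6)
x = rng.standard_normal(100000)
time_average = np.mean(x ** 2)        # (1/N) sum g(X_n)
```

The time average recovers the space average \(\mathbb {E}\, g(X_0)\) up to Monte Carlo error, which is the statistical content of the theorem.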

In ergodic theory, the strict stationarity and ergodicity of a stochastic process are expressed using set transformations.

Let \((E, \mathcal {E},\mu )\) be a measure space. A measurable function S : E → E is called a transformation of E; we denote by Sx the image of an element x ∈ E and we set S −1B = {x ∈ E : Sx ∈ B}, for \(B\in \mathcal {E}\).

A set B of \(\mathcal {E}\) is said to be S-invariant if S −1(B) = B. The collection of S-invariant sets constitutes a σ-field, denoted by \(\mathcal {J}\). The measure μ is said to be S-invariant if μ(S −1(B)) = μ(B) for all \(B\in \mathcal {E}\). The transformation is said to be strictly ergodic if for all \(B\in {\mathcal E},\)

$$\displaystyle \begin{aligned} S^{-1}(B)=B \Longrightarrow\ \mu(B)=0\mbox{ or }1.\end{aligned} $$

The following convergence result, stated without proof, is also called the pointwise ergodic theorem.

Theorem 4.25 (Birkhoff)

Let μ be a finite measure on a measurable space \((E, \mathcal {E})\) . If μ is S-invariant for some transformation S : E → E, then, for all μ-integrable functions \(f:(E, \mathcal {E},\mu )\rightarrow \mathbb {R}\) ,

$$\displaystyle \begin{aligned} \frac{1}{n}\sum_{i=0}^{n-1}f(S^ix)\longrightarrow \overline{f}(x) \quad \mu-a.e.,\quad x\in E,\end{aligned} $$

where \(\overline {f}:E\to \mathbb {R}\) is a μ-integrable function such that \(\overline {f}(x)=\overline {f}(Sx)\) μ-a.e., and

$$\displaystyle \begin{aligned} \int_E \overline{f}(t)d\mu(t) = \int_E f(t)d\mu(t).\end{aligned} $$

If, moreover, S is ergodic, then

$$\displaystyle \begin{aligned} \overline{f}(x)=\frac{1}{\mu(E)}\int_E f(t)d\mu(t),\quad \mu\mbox{-a.e.}\end{aligned} $$

Note that if μ is a probability, then by definition, \(\overline {f}\) is the conditional expectation of f given \(\mathcal {J}\), or \(\overline {f}=\mathbb {E}\, (f\mid \mathcal {J})\). If S is ergodic, then \(\overline {f}={\mathbb {E}\,} f\) a.s.
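A classical concrete instance of Birkhoff's theorem is the irrational rotation of the circle. The sketch below is ours: S(x) = x + α (mod 1) on E = [0, 1) with Lebesgue measure, which is S-invariant, and S is ergodic for irrational α; with f(x) = x, the Birkhoff averages tend to \(\int _E f\,d\mu = 1/2\).

```python
import numpy as np

# Birkhoff averages along the orbit of the irrational rotation
# S(x) = x + alpha (mod 1), alpha = sqrt(2) - 1 irrational; Lebesgue
# measure is S-invariant and S is ergodic, so for f(x) = x the averages
# (1/n) sum f(S^i x0) converge to the integral 1/2.
alpha = np.sqrt(2.0) - 1.0
n = 100000
x0 = 0.1
orbit = (x0 + alpha * np.arange(n)) % 1.0   # x0, S x0, S^2 x0, ...
birkhoff_average = orbit.mean()             # (1/n) sum_{i<n} f(S^i x0)
```

By equidistribution of the orbit, the average is already within a small fraction of a percent of 1∕2 for n = 10^5.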

▹ Example 4.26 (Interpretation in Physics)

Let us observe at discrete times a system evolving in a Euclidean space E.

Let x 0, x 1, x 2, … be the points successively occupied by the system. Let S transform the elements of E through x n = Sx n−1, for n ≥ 1. Let S n denote the n-th iterate of S, that is x n = S nx 0. Thus, the sequence (x 0, x 1, x 2, …, x n, … ) can be written (x 0, Sx 0, S 2x 0, …, S nx 0, … ) and is called an orbit of the point x 0 (or trajectory).

Let \(f:E\rightarrow \mathbb {R}\) be a function whose values f(x) express a physical measurement (speed, temperature, pressure, etc.) at x ∈ E. From former experiments, the measurement f(x) is known to be subject to error, and the empirical mean [f(x 0) + f(x 1) + ⋯ + f(x n−1)]∕n is known to give a better approximation of the true measured quantity. For n large enough, this mean is close to the limit

$$\displaystyle \begin{aligned} \lim_{n\to +\infty}\frac{1}{n}\sum_{i=0}^{n-1}f(S^ix), \end{aligned}$$

which should be equal to the true physical quantity. \(\lhd \)

Any stochastic process \({\mathbf X}=(X_t)_{t\in \mathbb {T}}\) on \((\Omega , {\mathcal F},\mathbb {P} )\), taking values in E, can be considered as defined by

$$\displaystyle \begin{aligned} X_t(\omega)=X(S_t\omega),\quad \omega\in\Omega,\ t\in \mathbb{T}, \end{aligned}$$

where X :  Ω → E is a random variable and \((S_t)_{t\in \mathbb {T}}\) is a group of transformations of Ω. If \(\mathbb {T}=\mathbb {N}\), then S n is the n-th iterate of some transformation S :  Ω → Ω. Clearly, the process is strictly stationary if

$$\displaystyle \begin{aligned} \mathbb{P} ((S_t)^{-1}(A))=\mathbb{P} (A),\quad A\in{\mathcal F},\ t\in\mathbb{T}, \end{aligned}$$

and is strictly ergodic at order 1 if

$$\displaystyle \begin{aligned}\forall A\in{\mathcal F},\quad (S_t)^{-1}(A)=A,\ t\in\mathbb{T}\ \Longrightarrow\ \mathbb{P} (A)=0\mbox{ or }1. \end{aligned}$$

Any real valued process \({\mathbf X}=(X_t)_{t\in \mathbb {T}}\) can be defined on its canonical space \((\mathbb {R}^{\mathbb {T}}, {\mathcal B}(\mathbb {R}^{\mathbb {T}}),\mathbb {P} _{\mathbf X})\) by setting

$$\displaystyle \begin{aligned} X_t(\omega)=\omega(t),\quad \omega\in \mathbb{R}^{\mathbb{T}},\ t\in \mathbb{T}. \end{aligned}$$

Strict stationarity thus appears as a property of invariance of the probability \(\mathbb {P} _{\mathbf X}\) with respect to translation operators, that is \(\mathbb {P} _{\mathbf X}\circ \theta _t^{-1}=\mathbb {P} _{\mathbf X}\) for all \(t\in \mathbb {T}\), or

$$\displaystyle \begin{aligned} \mathbb{P} _{\mathbf X}(\theta_t^{-1}(B))=\mathbb{P} _{\mathbf X}(B),\quad B\in{\mathcal B}(\mathbb{R}^{\mathbb{T}}),\ t\in\mathbb{T}. \end{aligned}$$

Strict ergodicity can be expressed using θ t-invariant sets. Precisely, the process is ergodic if

$$\displaystyle \begin{aligned}\forall B\in{\mathcal B}(\mathbb{R}^{\mathbb{T}}),\quad \theta_t^{-1}(B)=B,\ t\in\mathbb{T}\ \Longrightarrow\ \mathbb{P} _{\mathbf X}(B)=0\mbox{ or }1. \end{aligned}$$

For a real stationary and integrable random sequence (X n), Birkhoff’s ergodic theorem yields

$$\displaystyle \begin{aligned} \frac{1}{n}\sum_{i=0}^{n-1}X_i\stackrel{a.s.}{\longrightarrow} \overline{X}= {\mathbb{E}\,} (X_0\mid \mathcal{J}), \end{aligned}$$

where \(\mathcal {J}\) is the σ-field of θ 1-invariant sets. For an ergodic sequence (that is, such that θ 1-invariant events have probability 0 or 1), the limit is \(\overline {X}= \mathbb {E}\, X_0\).

Note that, according to the Kolmogorov zero-one law, any i.i.d. sequence is stationary and ergodic.

Further, the entropy rate of a random sequence has been defined in Chap. 1. The entropy rate of a continuous time stochastic process indexed by \(\mathbb {R}_+\) is defined similarly.

Definition 4.27

Let \({{\mathbf X}}=(X_t)_{t\in \mathbb {R}_+}\) be a stochastic process such that the marginal distribution of X = (X t)t ∈ [0,T] has a density \(p^{{\mathbf X}}_T\) with respect either to the Lebesgue or to the counting measure, for all T. The entropy up to time T of X is the entropy of its marginal distribution, that is

$$\displaystyle \begin{aligned}\mathbb{H} _T({\mathbf X} )=-\mathbb{E}\, \log p^{{\mathbf X}}_T(X). \end{aligned}$$

If \({\mathbb {H} _T({\mathbf X} )}/{T}\) has a limit when T tends to infinity, this limit is called the entropy rate of the process and is denoted by \( \mathbb {H} ({\mathbf X})\).

The next result, stated here for stationary and ergodic sequences is of paramount importance in information theory. Note that its proof is based on Birkhoff’s ergodic theorem.

Theorem 4.28 (Ergodic Theorem of Information Theory)

Let \({\mathbf X}=(X_t)_{t\in \mathbb {T}}\) , with \(\mathbb {T}=\mathbb {R}_+\) or \(\mathbb {N}\) , be a stationary and ergodic stochastic process. Suppose that the marginal distribution of X = (X t)t ∈ [0,T] has a density \(p^{{\mathbf X}}_T\) with respect to the Lebesgue or counting measure, for all T. If the entropy rate \(\mathbb {H} ({\mathbf X})\) is finite, then \(- \log p^{{\mathbf X}}_T(X)/T\) converges in mean and a.s. to \(\mathbb {H} ({\mathbf X})\).

Explicit expressions of \(\mathbb {H} ({\mathbf X})\) for jump Markov and semi-Markov processes will be developed in the next chapter.

Other notions of stationarity exist, less strict and more convenient for applications; we define here only first and second order stationarity.

Definition 4.29

A second order process X is said to be weakly stationary:

  1.

    at order 1 if \(\mathbb {E}\, X_{t_1}=\mathbb {E}\, X_{t_2}=m_X\) for all \((t_1,t_2)\in \mathbb {T}^2\).

  2.

    at order 2 if, moreover, R X(t 1, t 2) = R X(t 1 + τ, t 2 + τ) for all \(\tau \in \mathbb {T}\) and \((t_1,t_2)\in \mathbb {T}^2\).

The mean and the variance of X t are then constant in t, but the X t do not necessarily have the same distribution.

▹ Example 4.30 (Sinusoidal Signal (Continuation of Example 4.1))

Let X be as defined in Example 4.1. Suppose that ν is constant, and that \(\varphi \sim {\mathcal U}[-\pi ,\pi ]\) and A are independent variables. Then \(\mathbb {E}\, X_t=0\) and \(R_{{\mathbf X}}(t_1,t_2)=\mathbb {E}\,(A^2)\cos [2\pi \nu (t_1-t_2)]/2\); hence, this sinusoidal signal is second order stationary. It is here even strictly stationary, since the uniform phase is invariant modulo 2π under time shifts. \(\lhd \)
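The weak stationarity of the sinusoidal signal can be checked by Monte Carlo simulation. A sketch assuming \(A\sim {\mathcal U}[0,2]\) (a hypothetical amplitude law, so that \(\mathbb {E}\,(A^2)=4/3\)) and a fixed frequency ν:

```python
import math
import random

random.seed(0)
nu = 1.5        # fixed frequency (assumption: X_t = A cos(2 pi nu t + phi))
N = 100_000     # Monte Carlo sample size

def sample_pair(t1, t2):
    # One realization of (X_{t1}, X_{t2}) with A ~ U[0, 2] (hypothetical
    # amplitude law) and phi ~ U[-pi, pi] independent of A
    A = random.uniform(0.0, 2.0)
    phi = random.uniform(-math.pi, math.pi)
    return (A * math.cos(2 * math.pi * nu * t1 + phi),
            A * math.cos(2 * math.pi * nu * t2 + phi))

def mc_cov(t1, t2):
    # Monte Carlo estimate of R_X(t1, t2) = E(X_{t1} X_{t2})
    return sum(x1 * x2 for x1, x2 in
               (sample_pair(t1, t2) for _ in range(N))) / N

c1, c2 = mc_cov(0.3, 0.7), mc_cov(1.1, 1.5)   # two pairs with the same gap 0.4
# Theory: E(A^2) cos(2 pi nu (t1 - t2)) / 2, with E(A^2) = 4/3 for A ~ U[0, 2]
theory = (4.0 / 3.0) * math.cos(2 * math.pi * nu * 0.4) / 2.0
print(c1, c2, theory)
```

Both estimated covariances agree with the theoretical value, illustrating that \(R_{{\mathbf X}}\) depends only on the difference of the times.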

For a second order process, strict stationarity implies weak stationarity. If X is a Gaussian process, the converse holds true. Indeed, the mean vector \(\mathbb {E}\,(X_{t_1},\ldots ,X_{t_n})=M \) is then constant and the covariance matrix \(\Gamma _{X_{t_1},\ldots ,X_{t_n}}\) is a matrix valued function of the differences \(t_j - t_i\). The distribution of \((X_{t_1},\ldots ,X_{t_n})\) depends only on M and Γ.

Further, the covariance function of a weakly stationary process can be written \(R_{{\mathbf X}}(t_1, t_2) = r_{{\mathbf X}}(t_2 - t_1)\), where \(r_{{\mathbf X}}\) is an even function of only one variable, called the auto-correlation function of the process. The faster X fluctuates, the faster \(r_{{\mathbf X}}\) decreases.

If X is a weakly stationary second order process, the Cauchy-Schwarz inequality yields \(|r_{{\mathbf X}}(t_2 - t_1)|\le r_{{\mathbf X}}(0)\). The function \(r_{{\mathbf X}}\) is positive definite; if, moreover, it is continuous, then it has a spectral representation—by Bochner’s theorem, meaning that it is the Fourier transform of a bounded positive measure. For example, if \(\mathbb {T}=\mathbb {R}\) (\(\mathbb {Z}\)),

$$\displaystyle \begin{aligned}r_{{{\mathbf X}}}(t)=\int_\Lambda e^{i \lambda t}\mu(d\lambda) \end{aligned}$$

where \(\Lambda =\mathbb {R}\) ([0, 2π]) and μ is called the spectral measure of the process. If μ is absolutely continuous with respect to the Lebesgue measure, its density is called the spectral density of the process. The term “spectral” refers to what is connected to frequencies.

Then the integral representation of the process follows—by Karhunen’s theorem,

$$\displaystyle \begin{aligned} X_t=\int_\Lambda e^{i \lambda t}Z(d\lambda), \end{aligned}$$

where Z is a second order process with independent increments—see next section, indexed by Λ.

▹ Example 4.31 (White Noise)

Let \(\boldsymbol {\varepsilon }=(\varepsilon _n)_{n\in \scriptstyle \mathbb {Z}}\) be a centered process, weakly stationary to the order 2, with \(\mathbb {V}\mathrm {ar}\,\varepsilon _n=1\) and

$$\displaystyle \begin{aligned}\mathbb{E}\, (\varepsilon_{n}\varepsilon_{m})=0=\int_0^{2\pi}\frac{1}{2\pi}e^{i \lambda (n-m)}d\lambda,\quad m\ne n. \end{aligned}$$

Therefore, the spectral density of ε is constant. All the frequencies are represented with the same power in the signal, hence its name: white noise, by analogy to white light from an incandescent body. \(\lhd \)
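The flat spectrum corresponds to a covariance concentrated at lag 0, which can be illustrated by estimating the autocorrelation of a simulated white noise. A sketch using an i.i.d. ±1 sequence (an arbitrary choice of centered unit-variance noise):

```python
import random

random.seed(0)
n = 100_000
# i.i.d. centered +-1 noise: a discrete white noise with unit variance
eps = [random.choice((-1.0, 1.0)) for _ in range(n)]

def r_hat(m):
    # Empirical autocorrelation r(m) = E(eps_k eps_{k+m})
    return sum(eps[k] * eps[k + m] for k in range(n - m)) / (n - m)

r0, r1, r5 = r_hat(0), r_hat(1), r_hat(5)
print(r0, r1, r5)   # 1 at lag 0, nearly 0 at the other lags
```

The vanishing of all non-zero lags is exactly what makes the spectral density constant.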

▹ Example 4.32 (ARMA Processes)

An MA(q) process X is obtained from a white noise ε by composition with the function \(f(x_1,\ldots ,x_n)=\sum _{k=1}^qb_kx_{n-k}\), called a linear filter. We compute \(\mathbb {E}\,(X_nX_{n+m})=\sum _{k=1}^qb_kb_{m+k}\), with \(b_j=0\) for j > q. The spectral density of X is obtained from that of ε, that is

$$\displaystyle \begin{aligned}h_{{\mathbf X}}(\lambda)=\frac{1}{2\pi}\Big|\sum_{k=1}^qb_ke^{-ik\lambda}\Big|{}^2. \end{aligned}$$

In the same way, an AR(p) process Y is obtained by recursive filtering of order p from ε, and

$$\displaystyle \begin{aligned}h_{\mathbf Y}(\lambda)=\frac{1}{2\pi}\Big|\sum_{k=1}^pa_ke^{-ik\lambda}\Big|{}^{-2}.\end{aligned} $$

Finally, the spectral density of an ARMA(p, q) process Z is

$$\displaystyle \begin{aligned} h_{\mathbf Z}(\lambda)=\frac{1}{2\pi}\frac{|\sum_{k=1}^qb_ke^{-ik\lambda}|{}^2}{|\sum_{k=1}^pa_ke^{-ik\lambda}|{}^2},\end{aligned} $$

for λ ∈ [0, 2π]. \(\lhd \)
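The covariance and spectral density formulas for an MA(q) process can be cross-checked numerically. A sketch with hypothetical coefficients b 1 = 0.8, b 2 = −0.5 and a Gaussian white noise:

```python
import cmath
import math
import random

random.seed(0)
q = 2
b = {1: 0.8, 2: -0.5}   # hypothetical MA(2) coefficients b_1, b_2
n = 100_000

eps = [random.gauss(0.0, 1.0) for _ in range(n + q)]
# X_i = sum_{k=1}^q b_k eps_{i-k}, indices shifted by q to stay valid
X = [sum(b[k] * eps[i + q - k] for k in range(1, q + 1)) for i in range(n)]

def cov_emp(m):
    return sum(X[i] * X[i + m] for i in range(n - m)) / (n - m)

def cov_theory(m):
    # E(X_i X_{i+m}) = sum_k b_k b_{m+k}, with b_j = 0 for j > q
    return sum(b[k] * b.get(m + k, 0.0) for k in range(1, q + 1))

def h_X(lam):
    # spectral density (1/2 pi) |sum_k b_k e^{-ik lam}|^2
    s = sum(b[k] * cmath.exp(-1j * k * lam) for k in range(1, q + 1))
    return abs(s) ** 2 / (2 * math.pi)

# Integrating h_X over [0, 2 pi] recovers the variance cov_theory(0)
steps = 2000
integral = sum(h_X(2 * math.pi * j / steps)
               for j in range(steps)) * 2 * math.pi / steps
c_emp, c_th = cov_emp(1), cov_theory(1)
print(c_emp, c_th, integral, cov_theory(0))
```

The empirical lag-1 covariance matches \(b_1b_2\), and the integral of the spectral density returns the variance, as the spectral representation requires.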

Just like strict stationarity, strict ergodicity is rarely satisfied in applications; hence, weaker notions of ergodicity are considered.

Definition 4.33

A random sequence \((X_n)\) is said to be:

  1.

    first order (or mean) ergodic if the random variable \(\overline {{\mathbf X}}\) is a.s. constant.

  2.

    second order ergodic if, moreover, for all \(m\in \mathbb {N}\), the random variable \(\overline {{{\mathbf X}}(m)}\) is a.s. constant.

Mean ergodicity is satisfied under the following simple condition.

Proposition 4.34

A random sequence \((X_n)\) is first order ergodic if and only if its covariance function \(R_{{\mathbf X}}\) is summable.

Proof

We give the proof for a centered sequence X. We compute

$$\displaystyle \begin{aligned} \mathbb{E}\,\Big(\frac{1}{N}\sum_{n=0}^{N-1}X_n\Big)^2= \frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}R_{{\mathbf X}}(n,m).\end{aligned} $$

If the sequence is first order ergodic, \(\frac 1N\sum _{n=0}^{N-1}X_n \) converges in L 2 to a constant (necessarily 0 for a centered sequence), so

$$\displaystyle \begin{aligned}\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}R_{{\mathbf X}}(n,m)\longrightarrow 0,\quad N\rightarrow +\infty.\end{aligned} $$

Conversely, if \(R_{{\mathbf X}}\) is summable,

$$\displaystyle \begin{aligned}\mathbb{E}\,\Big(\frac{1}{N}\sum_{n=0}^{N-1}X_n\Big)^2=\frac{1}{N^2}\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}R_{{\mathbf X}}(n,m)\longrightarrow 0,\quad N\rightarrow +\infty, \end{aligned}$$

so \(\frac 1N\sum _{n=0}^{N-1}X_n \) converges in L 2 to 0, and the sequence is first order ergodic. □
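Proposition 4.34 can be illustrated on an MA(1) sequence, whose covariance function vanishes beyond lag 1 and is therefore summable; a sketch with Gaussian noise:

```python
import random

random.seed(0)
N = 200_000
eps = [random.gauss(0.0, 1.0) for _ in range(N + 1)]
# Centered MA(1) sequence X_n = (eps_n + eps_{n+1})/2, whose covariance
# function vanishes for |n - m| > 1 and is therefore summable
X = [(eps[i] + eps[i + 1]) / 2.0 for i in range(N)]

avg = sum(X) / N   # the time average (1/N) sum X_n
print(avg)
```

The time average is close to the a.s. constant limit 0, as first order ergodicity predicts.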

Stationarity and ergodicity are not equivalent, as shown by the following examples.

▹ Example 4.35

A process X constant with respect to t, equal to some random variable V , is obviously stationary. Since \(\overline {{\mathbf X}}=V\), it is first order ergodic only if V is a.s. constant.

By linearity of integration, the sum of two ergodic processes is ergodic too. The sum of two independent second order stationary processes is second order stationary too. In contrast, the sum of two dependent second order stationary processes is not stationary in general.

Finally, let X be a process and let V be a random variable independent of X. If X is stationary, the processes \((VX_t)\) and \((X_t + V)\) are stationary too; in contrast, even if X is ergodic, \((VX_t)\) and \((X_t + V)\) are ergodic only if V is a.s. constant. \(\lhd \)

If the process is only weakly stationary and ergodic, the ergodic theorem does not hold in general. Still, the following result holds true: in this case, the mean power is equal to the instantaneous power.

Proposition 4.36

If (X n) is stationary and ergodic to the order 2, then \( \mathbb {E}\, X_n=\overline {{\mathbf X}}\) for \( n\in \mathbb {N}\) and \(\overline {{{\mathbf X}}(m)}=r_{{\mathbf X}}(m)+\overline {{\mathbf X}}^2\) a.s. for \( m\in \mathbb {N}\).

Proof

The random variable \(\overline {{\mathbf X}}\) is a.s. constant, and all the X n have the same distribution, so \(\mathbb {E}\, X_n=\overline {{\mathbf X}}\) by linearity. In the same way, since \(\mathbb {E}\, X_n=\mathbb {E}\, X_{n+m}=\overline {{\mathbf X}}\) and \(r_{{\mathbf X}}(m)=\mathbb {E}\,(X_nX_{n+m})-\mathbb {E}\, X_n\mathbb {E}\, X_{n+m}\) does not depend on n, we can write

$$\displaystyle \begin{aligned}r_{{\mathbf X}}(m)=\frac{1}{N}\sum_{n=0}^{N-1}\mathbb{E}\,(X_nX_{n+m})- \left(\frac{1}{N}\sum_{n=0}^{N-1}\overline{{\mathbf X}}\right)^2, \end{aligned}$$

and \(r_{{\mathbf X}}(m)=\overline {{{\mathbf X}}(m)}-\overline {{\mathbf X}}^2\) a.s. □

Thus, the moments of the marginal distributions of weakly stationary and ergodic sequences appear to be characterized by time averaging, that is by the knowledge of one—and only one—full trajectory. This explains the importance of ergodicity in the statistical study of marginal distributions.

4.3 Processes with Independent Increments

Numerous types of processes have independent increments, among which we will present the Brownian motion, Poisson processes, compound Poisson processes, etc.

Definition 4.37

Let \({{\mathbf X}}=(X_t)_{t\in \mathbb {T}}\) be a stochastic process adapted to a filtration \(({\mathcal F}_t)\), where \(\mathbb {T}=\mathbb {R}\) or \(\mathbb {T}=\mathbb {N}\).

  1.

    The process is said to have independent increments if for all s < t in \(\mathbb {T}\), the random variable \(X_t - X_s\) is independent of the σ-algebra \({\mathcal F}_s\).

  2.

    The process is said to have stationary increments if the distribution of \(X_t - X_s\), for s < t in \(\mathbb {T}\), depends only on t − s.

  3.

    A process with independent and stationary increments is said to be homogeneous (with respect to time).

When \(\mathbb {T}=\mathbb {N}\) and X is a process with independent increments, the random variables \(X_{t_0},X_{t_1}-X_{t_0},\ldots , X_{t_n}-X_{t_{n-1}}\) are independent for all \(n\in {\mathbb {N}}\) and \(t_0 < t_1 < \cdots < t_n\). Since \(X_n=\sum _{i=1}^n(X_{i}-X_{i-1})\), knowing a process with independent increments is equivalent to knowing its increments. If X is a homogeneous process, then necessarily \(X_0 = 0\) a.s.

Let us now present a typical process with independent increments, the Brownian motion. It is an example of a Gaussian, ergodic, non-stationary process with independent increments. A trajectory of a Brownian motion is shown in Fig. 4.1.

Fig. 4.1
figure 1

A trajectory of a Brownian motion

Definition 4.38

The process \({\mathbf W}=(W_t)_{t\in \mathbb {R}_+}\) taking values in \(\mathbb {R}\), with independent increments and such that \(W_0 = 0\), and for all \(0 \le t_1 < t_2\),

$$\displaystyle \begin{aligned}W_{t_2}-W_{t_1}\sim{\mathcal N}(0,t_2-t_1), \end{aligned}$$

is called a standard Brownian motion (or Wiener process).

The process \({\mathbf W}=(W_t)_{t\in \mathbb {R}_+}\) taking values in \(\mathbb {R}\), with independent increments and such that \(W_0 = 0\), and for all \(0 \le t_1 < t_2\),

$$\displaystyle \begin{aligned}W_{t_2}-W_{t_1}\sim{\mathcal N}(\nu(t_2-t_1),\sigma^2(t_2-t_1)), \end{aligned}$$

is called a Brownian motion with drift parameter ν and diffusion parameter σ 2.

Considering the random walk \((S_n)\) of Example 1.77, with p = 1∕2 and steps of size ±s performed every T time units, yields an elementary construction of a Brownian motion. Indeed, setting

$$\displaystyle \begin{aligned}X_t=S_{[t/T]},\quad t\in \mathbb{R}_+ \end{aligned}$$

defines a continuous time process. If t = nT, then \(\mathbb {E}\, X_t=0\) and \(\mathbb {V}\mathrm {ar}\, X_t=ts^2/T\). Let now t be fixed. If both s and T tend to 0, then the variance of \(X_t\) remains fixed and non null if and only if \(s\approx \sqrt {T}\).

Let us set \(s^2 = \sigma^2 T\) where \(\sigma ^2\in \mathbb {R}_+^*\) and define a process W by

$$\displaystyle \begin{aligned} W_t(\omega)=\lim_{T\rightarrow 0}X_t(\omega), \quad t\in\mathbb{R}_+,\ \omega\in\Omega. \end{aligned}$$

Taking the limit yields \(\mathbb {E}\, W_t=0\) and \(\mathbb {V}\mathrm {ar}\, W_t=\sigma ^2t\).

Let us show that \(W_t\sim {\mathcal N}(0,\sigma ^2t)\) by determining its distribution function at any \(w\in \mathbb {R}_+\). Set \(r = w/s\) and \(T = t/n\). If w and t are fixed and T tends to 0, since \(s\approx \sqrt {T}\), we get \(r\approx \sqrt {n}\). Since \(\mathbb {P} [X_{nT}=(2k-n)s]=\binom {n}{k} p^k(1-p)^{n-k}\), we obtain by applying the de Moivre-Laplace theorem

$$\displaystyle \begin{aligned}\mathbb{P} (S_n\leq rs)\approx\int_{-\infty}^{{r/\sqrt{n}}} {1\over \sqrt{2\pi }}e^{-u^2/2}\,du,\end{aligned}$$

or, since \(r/\sqrt {n}=w/\sqrt {\sigma ^2t}\),

$$\displaystyle \begin{aligned}\mathbb{P} (W_t\leq w)\approx \int_{-\infty}^{{w/\sqrt{\sigma^2t}}} {1\over \sqrt{2\pi }}e^{-u^2/2}\,du.\end{aligned}$$

Let us show that if \(0 \le t_1 < t_2 < t_3\), then \(W_{t_2}-W_{t_1}\) and \(W_{t_3}-W_{t_2}\) are independent. If \(0 < n_1 < n_2 < n_3\), then the number of “heads” obtained between the \(n_1\)-th and the \(n_2\)-th tosses is independent of the number of “heads” obtained between the \(n_2\)-th and \(n_3\)-th tosses. Hence \(S_{n_2}-S_{n_1}\) and \(S_{n_3}-S_{n_2}\) are independent, and taking the limit yields the result.

Finally, let us compute the covariance function of W. If \(t_1 < t_2\), then \(W_{t_2}-W_{t_1}\) and \(W_{t_1}-W_{0}\) are independent. But \(W_0 = 0\), so

$$\displaystyle \begin{aligned}\mathbb{E}\,[(W_{t_2}-W_{t_1})W_{t_1}]=[\mathbb{E}\,(W_{t_2}-W_{t_1})]\mathbb{E}\, W_{t_1}=0. \end{aligned}$$

Since we also have \(\mathbb {E}\,[(W_{t_2}-W_{t_1})W_{t_1}]=\mathbb {E}\,(W_{t_2}W_{t_1})-\mathbb {E}\, (W_{t_1}^2),\) we obtain \(\mathbb {E}\,(W_{t_2}W_{t_1})=\mathbb {E}\, W_{t_1}^2=\sigma ^2t_1,\) or \(R_{\mathbf W}(t_1, t_2) = \sigma^2(t_1 \wedge t_2)\).

Note that \(S_{[nt]}/\sqrt {n}\), where \((S_n)\) is the walk with unit steps, gives an approximation of \(W_t\) (with σ 2 = 1) that can be simulated as a sum of Bernoulli variables.
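The approximation \(S_{[nt]}/\sqrt {n}\) is straightforward to simulate; a sketch checking that the simulated values at t = 1 have mean close to 0 and variance close to t (with σ 2 = 1):

```python
import random

random.seed(0)

def approx_W(t, n):
    # S_[nt] / sqrt(n): partial sum of [nt] symmetric +-1 steps, rescaled
    k = int(n * t)
    return sum(random.choice((-1.0, 1.0)) for _ in range(k)) / n ** 0.5

M, n, t = 10_000, 400, 1.0
vals = [approx_W(t, n) for _ in range(M)]
mean = sum(vals) / M
var = sum(v * v for v in vals) / M - mean ** 2
print(mean, var)   # approximately 0 and t
```

The histogram of `vals` would likewise approach the \({\mathcal N}(0,t)\) density, by the de Moivre-Laplace theorem used above.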

4.4 Point Processes on the Line

A point process is a stochastic process consisting of a finite or enumerable family of points set at random in an arbitrary space, for example gravel on a road, stars in a part of the sky, times of occurrences of failures of a given system, positions of one of the bases A, C, G, T in a DNA sequence, etc.

In a mathematical sense, a point can be multiple. Even if the space to which the considered points belong can be any topological space, we will here consider \({\mathbb {R}}^d,\) for d ≥ 1. After some general notions on point processes we will present the renewal process and the Poisson process on \({\mathbb {R}}_+\).

4.4.1 Basics on General Point Processes

Point processes are naturally defined through random point measures.

Definition 4.39

Let μ be a measure on \((\mathbb {R}^d,\mathcal {B}({\mathbb {R}}^d))\) and let \((x_i)_{i\in I}\) be a sequence of points in \({\mathbb {R}}^d\), with \(I\subset \mathbb {N}\). If

$$\displaystyle \begin{aligned} \mu = \sum_{i\in I} \delta_{x_i}, \end{aligned}$$

and if \(\mu (K) < +\infty \) for all compact subsets K of \({\mathbb {R}}^d\), then μ is called a point measure on \({\mathbb {R}}^d\).

A point measure is a discrete measure. The multiplicity of \(x\in {\mathbb {R}}^d\) is μ({x}). When μ({x}) = 0 or 1 for all \(x\in {\mathbb {R}}^d\), then μ is said to be simple. If μ is simple, the measure μ(A) is equal to the number of points belonging to A, for all \(A\in \mathcal {B}({\mathbb {R}}^d)\).

Definition 4.40

A function \(\mu :\Omega \times {\mathcal B}(\mathbb {R}^d)\longrightarrow \overline {\mathbb {R}}\) such that μ(ω, ⋅) is a point measure on \(\mathbb {R}^d\) for all ω is called a random point measure on \(\mathbb {R}^d\).

When μ(ω, {x}) = 0 or 1 for all ω and x, the random measure μ is also said to be simple.

▹ Example 4.41

Let n points in \({\mathbb {R}}^d\) be set at random positions X 1, …, X n. A random point measure on \(\mathbb {R}^d\) is defined by setting for all \(A\in \mathcal { B}({\mathbb {R}}^d)\),

$$\displaystyle \begin{aligned} \mu (A)= \sum_{i=1}^n \delta_{X_i} (A), \end{aligned}$$

the random number of points belonging to A. \(\lhd \)

▹ Example 4.42

Let \((T_n)_{n\in \mathbb {N}^*}\) be the sequence of times in \({\mathbb {R}}_+\) of failures of a system. Then setting for all s < t in \( {\mathbb {R}}_+\),

$$\displaystyle \begin{aligned}\mu ([s,t])= \sum_{i\ge 1} \delta_{T_i} ([s,t]),\end{aligned}$$

the random number of failures observed in the time interval [s, t], defines a random point measure on \(\mathbb {R}_+\). \(\lhd \)

Let \(M_p({\mathbb {R}}^d)\) denote the set of all point measures on \({\mathbb {R}}^d\).

Definition 4.43

A function \({\mathbf N} : \Omega \longrightarrow M_p({\mathbb {R}}^d)\) such that N(⋅, A) are random variables for all \(A\in \mathcal {B} ({\mathbb {R}}^d)\) is called a point process.

The variables N(⋅, A), taking values in \(\overline {{\mathbb {N}}}\), are called the counting random variables of the point process N, and the measure m defined by

$$\displaystyle \begin{aligned} m(A)={\mathbb{E}\,}[ N(\cdot,A)],\quad A\in \mathcal{B}({\mathbb{R}}^d), \end{aligned}$$

is its mean measure.

Each point process N is associated with the random point measure μ defined by

$$\displaystyle \begin{aligned} \mu(\omega,A)=N(\omega,A),\quad \omega\in\Omega,\ A\in \mathcal{B}({\mathbb{R}}^d). \end{aligned}$$

It is said to be simple if μ is simple.

Thus, N(ω, A) counts the number of points of the process belonging to A for the outcome ω ∈ Ω. Note that m(A) can be infinite even when N(⋅, A) is a.s. finite.

If m has a density \(\lambda :{\mathbb {R}}^d\longrightarrow {\mathbb {R}}_+\), that is m(dx) = λ(x)dx or \({\mathbb {P} }[N(\cdot ,dx)=1]=\lambda (x)dx\), or

$$\displaystyle \begin{aligned} m(A)=\int_A \lambda (x) dx,\quad A\in \mathcal{B}({\mathbb{R}}^d), \end{aligned}$$

then the function λ, called the intensity of the process N, is locally summable—meaning that λ is integrable over all bounded rectangles of \(\mathbb {R}^d\).

Let us present some properties of integration with respect to a point measure.

Let \(f:\mathbb {R}^d\longrightarrow \mathbb {R}\) be a Borel function and μ a point measure on \(\mathbb {R}^d\). The integral of f with respect to μ is

$$\displaystyle \begin{aligned}\mu (f)=\int_{\mathbb{R}^d}fd\mu=\sum_{i\in I} \int_{{\mathbb{R}}^d} fd\delta_{x_i}=\sum_{i\in I} \delta_{x_i} (f) =\sum_{i\in I}f(x_i). \end{aligned}$$

The mean is

$$\displaystyle \begin{aligned} {\mathbb{E}\,} N(f)=\int_{{\mathbb{R}}^d} f dm, \end{aligned}$$

and the Laplace functional associated with N is defined as

$$\displaystyle \begin{aligned} L_N(f) ={\mathbb{E}\,} \left( \exp\left[-\int_{{\mathbb{R}}^d}f(x)N(\cdot,dx)\right]\right). \end{aligned}$$

Especially, if f is positive and if \(N=\sum _{n\ge 1} \delta _{X_n}\), then

$$\displaystyle \begin{aligned} L_N(f) ={\mathbb{E}\,} \Big(\exp\Big[-\sum _{n\ge 1} f(X_n)\Big]\Big). \end{aligned}$$

The Laplace functional characterizes the distribution of a point process. Indeed, the distribution of a point process N is given by its finite-dimensional distributions, that is by the distributions of all the random vectors (N(⋅, A 1), …, N(⋅, A n)) for n ≥ 1 and \(A_1,\ldots ,A_n \in \mathcal {B}({\mathbb {R}}^d)\). The function

$$\displaystyle \begin{aligned} (\lambda_1,\ldots,\lambda_n)\longmapsto {\mathbb{E}\,} \Big(\exp\Big[-\sum_{i=1}^n\lambda_i N(\cdot,A_i)\Big]\Big) \end{aligned}$$

is precisely the Laplace transform of the distribution of (N(⋅, A 1), …, N(⋅, A n)) that characterizes its distribution.

4.4.2 Renewal Processes

Renewal processes are point processes defined on \(\mathbb {R}_+\), modeling many experiments in applied probability—in reliability, queues, insurance, risk theory… They are also valuable theoretical tools for investigating more complex processes, such as regenerative, Markov or semi-Markov processes. A renewal process can be regarded as a random walk with positive increments; the times between occurring events are i.i.d. It is not a Markovian process but a semi-Markov process, which is studied by specific methods.

Definition 4.44

Let (X n) be an i.i.d. sequence of positive variables. Set

$$\displaystyle \begin{aligned} S_0=0\quad \mbox{and}\quad S_{n}=X_1+\cdots+ X_{n}, \quad n\ge 1. \end{aligned} $$
(4.4)

The random sequence \((S_n)\) is called a renewal process. The random times \(S_n\) are called renewal times. The associated counting process is defined by \({\mathbf N}=(N_t)_{t\in \scriptstyle \mathbb {R}_+}\), where

$$\displaystyle \begin{aligned} N_t=\sum_{n\ge 0}\mathbb{1}_{(S_n\leq t)},\quad t\in\mathbb{R}_+. \end{aligned} $$
(4.5)

Generally, \((X_n)\) is the sequence of inter-arrival times of some sequence of events, and \(N_t\) counts the number of events in the time interval [0, t]. Note that \((N_t = n) = (S_{n-1} \leq t < S_n)\) and that \(N_0 = 1\). A trajectory of a renewal process is shown in Fig. 4.2.

Fig. 4.2
figure 2

A trajectory of a renewal process, on (N t = n)

When each of the variables X n has an exponential distribution with parameter λ, the counting process N is called a Poisson process with parameter λ, in which case it is usual to set N 0 = 0; the Poisson processes will be especially investigated in the next section.

A counting process can also be regarded as a simple point process. Indeed,

$$\displaystyle \begin{aligned} N(\cdot,A)= \sum_{n\ge 0}\delta_{S_n}(A),\quad A\in {\mathcal B} ({\mathbb{R}}_+), \end{aligned} $$
(4.6)

defines a point process, and we obtain for A = [0, t],

$$\displaystyle \begin{aligned}N_t=N(\cdot,[0,t]),\quad t\ge 0.\end{aligned}$$

▹ Example 4.45 (A Renewal Process in Reliability)

A new component begins operating at time S 0 = 0. Let X 1 denote its lifetime. When it fails, it is automatically and instantly replaced by a new identical component. When the latter fails after a time X 2, it is renewed, and so on.

If (X n) is supposed to be i.i.d., then (4.4) defines a renewal process (S n) whose distribution is that of the sum of the life durations of the components. The counting process \((N_t)_{t\in \scriptstyle \mathbb {R}_+}\) gives the number of components used in [0, t], of which the last component still works at time t. \(\lhd \)

The expectation of the counting process at time t (or expected number of renewals) is

$$\displaystyle \begin{aligned} m(t)=\mathbb{E}\, N_t=\sum_{n\ge 0}\mathbb{P} (S_n\leq t), \end{aligned}$$

and m is called the renewal function. If the variables \(X_n\) are not degenerate, this function is well defined. Accordingly, we suppose in the sequel that F(0) < 1, where F is the distribution function of the \(X_n\). The distribution function of \(S_n\) is the n-th Lebesgue-Stieltjes convolution of F, that is

$$\displaystyle \begin{aligned}F^{*(n)}(t)=\int_{\mathbb{R}_+} F^{*(n-1)}(t-x)dF(x), \end{aligned}$$

with \(F^{*(0)}(t)=1\) for \(t\ge 0\) and \(F^{*(1)}(t) = F(t)\), so that

$$\displaystyle \begin{aligned} m(t) =\sum_{n\ge 0}F^{*(n)}(t),\quad t\in\mathbb{R}_+. \end{aligned} $$
(4.7)
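For exponential inter-arrival times, the series (4.7) can be summed explicitly: m(t) = 1 + λt, the term n = 0 accounting for \(S_0 = 0\). A Monte Carlo sketch of this identity (the values of λ and t are arbitrary):

```python
import random

random.seed(0)
lam, t = 2.0, 3.0   # arbitrary intensity and horizon

def count_renewals():
    # N_t: renewal times S_0 = 0, S_1, ... in [0, t]; S_0 counts, so N_0 = 1
    s, count = 0.0, 1
    while True:
        s += random.expovariate(lam)
        if s > t:
            return count
        count += 1

M = 100_000
m_hat = sum(count_renewals() for _ in range(M)) / M
print(m_hat)   # close to m(t) = 1 + lam * t = 7
```

The same simulation scheme works for any inter-arrival distribution, even when the series (4.7) has no closed form.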

The mean measure m of the point process (4.6) is given by

$$\displaystyle \begin{aligned}m(A)={\mathbb{E}\,} N(\cdot,A),\quad A\in {\mathcal B} ({\mathbb{R}}_+), \end{aligned}$$

and we have m(t) = m([0, t]) for \( t\in \mathbb {R}_+.\) Note that we use the same notation for both the renewal function and the mean measure.

When F is absolutely continuous with respect to the Lebesgue measure, the derivative of the renewal function λ(t) = m′(t) is called the renewal density (or renewal rate) of the process.

Proposition 4.46

The renewal function is increasing and finite.

Proof

For all s > 0, we have N t+s ≥ N t, so m is increasing.

Assume that F(t) < 1 for t > 0. Then F ∗(n)(t) ≤ [F(t)]n for all n, and hence

$$\displaystyle \begin{aligned} m(t)\le 1+F(t)+ [F(t)]^2+\cdots\le\frac{1}{1-F(t)}, \end{aligned}$$

so m(t) is finite. The general case is omitted. □

Relation (4.7) implies straightforwardly that m(t) is a solution of

$$\displaystyle \begin{aligned}m(t)=1+\int_0^tm(t-x)dF(x),\quad t\in\mathbb{R}_+. \end{aligned}$$
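The equation above can be solved numerically by discretizing the convolution. A sketch using a trapezoidal rule for exponential inter-arrival times, where the exact solution m(t) = 1 + λt is available for comparison:

```python
import math

# Trapezoidal discretization of m(t) = 1 + int_0^t m(t - x) dF(x) for
# F = Exp(lam) with density f; the exact solution is m(t) = 1 + lam * t
lam, T, h = 2.0, 3.0, 0.002
n = int(round(T / h))
f = [lam * math.exp(-lam * i * h) for i in range(n + 1)]

m = [1.0] * (n + 1)   # m(0) = 1
for i in range(1, n + 1):
    # solve for the implicit m[i] appearing at the x = 0 endpoint of the rule
    s = sum(m[i - j] * f[j] for j in range(1, i)) + 0.5 * m[0] * f[i]
    m[i] = (1.0 + h * s) / (1.0 - 0.5 * h * f[0])

print(m[n])   # close to 1 + lam * T = 7
```

The same scheme applies to any F with a density, which is how renewal functions are computed in practice when no closed form exists.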

This equation is a particular case of the scalar renewal equation

$$\displaystyle \begin{aligned} h=g+F*h, \end{aligned} $$
(4.8)

where h and g are functions bounded on the finite intervals of \(\mathbb {R}_+\). The solution of this equation is determined by use of the renewal function m, as follows.

Proposition 4.47

If \(g\colon \mathbb {R}_+\longrightarrow \mathbb {R}\) is bounded on the finite intervals of \(\mathbb {R}_+\) , then (4.8) has a unique solution \(h\colon \mathbb {R}_+\longrightarrow \mathbb {R}\) that is bounded on the finite intervals of \(\mathbb {R}_+\) , given by

$$\displaystyle \begin{aligned}h(t)=m*g(t)=\int_0^tg(t-x)dm(x), \end{aligned}$$

where m is defined by (4.7), and extended to \(\mathbb {R}_-\) by 0.

Proof

We deduce from (4.7) that

$$\displaystyle \begin{aligned}F*(m*g)= F*g + F^{*(2)}*g+\cdots = m*g-g. \end{aligned}$$

Thus, m ∗ g is a solution of (4.8).

Suppose that \(k\colon \mathbb {R}_+\longrightarrow \mathbb {R}\) is another solution of (4.8) bounded on the finite intervals of \(\mathbb {R}_+\). Then h − k = F ∗ (h − k), and iterating, h − k = F ∗(n) ∗ (h − k) for all n. Since F ∗(n)(t) tends to zero when n tends to infinity and h − k is bounded on finite intervals, it follows that k = h. □

Many extensions of renewal processes exist.

If S 0 is a nonnegative variable not identically zero, independent of (X n) and with distribution function F 0 different from the renewal distribution function F, then the process (S n) is said to be delayed or modified. When

$$\displaystyle \begin{aligned} F_0(x)=\frac{1}{\mu}\int_0^x[1-F(u)]du,\quad x>0, \end{aligned}$$

where \(\mu \) denotes the mean of F, the delayed renewal process is said to be stationary.

Definition 4.48

Let (Y n) and (Z n) be two i.i.d. nonnegative independent random sequences, with respective distribution functions G and H. The associated process defined by

$$\displaystyle \begin{aligned} S_0=0\quad \mbox{and}\quad S_n=S_{n-1}+Y_n+Z_n,\quad n\ge 1, \end{aligned}$$

is called an alternated renewal process.

Such an alternated renewal process (S n) is shown to be a renewal process in the sense of Definition 4.44 by setting X n = Y n + Z n and F = G ∗ H. A trajectory of an alternated renewal process is shown in Fig. 4.3.

Fig. 4.3
figure 3

A trajectory of an alternated renewal process

Still another extension is the stopped renewal process, also called transient renewal process.

Definition 4.49

Let (X n) be an i.i.d. random sequence taking values in \(\overline {\mathbb {R}}_+\), with defective distribution function F. The associated renewal process defined by (4.4) is called a stopped renewal process.

The life duration of this process is \( T=\sum _{n=1}^NX_n, \) where N is the number of events at the stopping time of the process, defined by

$$\displaystyle \begin{aligned} (N=n)=(X_1<+\infty,\ldots ,X_n<+\infty, X_{n+1}=+\infty). \end{aligned}$$

▹ Example 4.50

If N has a geometric distribution on \(\mathbb {N}^*\) with parameter q, then the distribution function of T is

$$\displaystyle \begin{aligned} F_T(t)=\sum_{n\ge 1}F^{*(n)}(t)(1-q)^{n-1}q. \end{aligned}$$

Indeed, \(\mathbb {P} (N=n)= (1-q)^{n-1}q\) for n ≥ 1, and the distribution function of T follows by Point 2. of Proposition 1.73. \(\lhd \)

▹ Example 4.51 (Risk Process in Insurance)

Let u > 0 be the initial capital of an insurance company. Let (S n) be the sequence of times at which accidents occur, and let N = (N t) denote the associated counting process. Let (Y n) be the sequence of compensations paid at each accident. The capital of the company at time t is \(U_t=u+ct-\sum _{n=1}^{N_t}Y_n\), where c is the rate of subscriptions in [0, t]. A trajectory of such a process is shown in Fig. 4.4. The time until ruin is T.

Fig. 4.4
figure 4

A trajectory of a risk process

The related problems are ruin within a given time interval, that is \(\mathbb {P} (U_t\le 0)\) and its limit \(\mathbb {P} (\lim _{t\to +\infty }U_t\le 0)\), and the mean viability of the company, that is \(\mathbb {E}\, U_t\ge 0\). Different approaches exist for solving these issues, involving either renewal theory or martingale theory. Note that a particular case of risk process, the Cramér-Lundberg process, will be presented in the next section. \(\lhd \)

Renewal processes can also be considered in the vector case. Another extension, the Markov renewal process, will also be studied below.

4.4.3 Poisson Processes

The renewal process whose counting process is a Poisson process is the most used in modeling real experiments. We keep the notation of the preceding section.

Definition 4.52

If \(X_n\sim {\mathcal E}(\lambda )\) for \(n\in \mathbb {N}^*\), the counting process \({\mathbf N}=(N_t)_{t\in \mathbb {R}_+}\) defined by (4.5), with \(N_0 = 0\), is called a homogeneous Poisson process with intensity (or parameter) λ.

Thanks to the absence of memory of the exponential distribution, the probability that an event occurs for the first time after time s + t given that it did not occur before time t is equal to the probability that it occurs after time s. More generally, the following result holds true.

Theorem 4.53

A Poisson process N with intensity λ is homogeneous—with independent and stationary increments—and satisfies \(N_t \sim {\mathcal P}(\lambda t)\) for \(t\in \mathbb {R}_+\).

Proof

Let us show first that \(N_t\sim {\mathcal P}(\lambda t)\) for all t > 0. We have

$$\displaystyle \begin{aligned} \mathbb{P} (N_t=k)=\mathbb{P} (S_k\leq t)-\mathbb{P} (S_{k+1}\leq t). \end{aligned}$$

According to Example 1.66, S n ∼ γ(n, λ), so

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{P} (S_n\leq x)=\int_0^x{1\over(n-1)!}\lambda^ne^{-\lambda t}t^{n-1}\,dt =1-e^{-\lambda x}\sum_{k=0}^{n-1}{(\lambda x)^k\over k!}, \end{array} \end{aligned} $$

and hence \(\mathbb {P} (N_t =k)=e^{-\lambda t}{(\lambda t)^k}/{k!}\).

Let us now show that the increments of the process are independent. We begin by determining the joint distribution of \(N_s\) and \(N_t\), for 0 ≤ s ≤ t. Setting n = k + l and using the density \(\lambda ^n e^{-\lambda x_n}\) of \((S_1,\ldots ,S_n)\) on \(E_n\),

$$\displaystyle \begin{aligned}\mathbb{P} (N_s=k,N_t-N_s=l)=\mathbb{P} (S_k\leq s<S_{k+1},\, S_n\leq t<S_{n+1}) =\int_{E_n\cap\{x_k\leq s<x_{k+1},\,x_n\leq t\}}\lambda^n e^{-\lambda x_n}e^{-\lambda(t-x_n)}\,dx_1\cdots dx_n, \end{aligned}$$

where \(E_n = \{0 < x_1 < \cdots < x_n\}\). Thus, by Fubini’s theorem,

$$\displaystyle \begin{aligned}\mathbb{P} (N_s=k,N_t-N_s=l)=\lambda^n e^{-\lambda t}\int_{E}dx_1\cdots dx_k\int_{F}dx_{k+1}\cdots dx_n =\lambda^n e^{-\lambda t}\,\frac{s^k}{k!}\,\frac{(t-s)^l}{l!}, \end{aligned}$$

where \(E = \{0 < x_1 < \cdots < x_k < s\}\) and \(F = \{s < x_{k+1} < \cdots < x_n < t\}\). Therefore,

$$\displaystyle \begin{aligned}\mathbb{P} (N_s=k,N_t-N_s=l)=e^{-\lambda s} \frac{(\lambda s)^k}{k!} \cdot e^{-\lambda (t-s)}\frac{[\lambda (t-s)]^l }{l!}, \end{aligned}$$

and the result follows. □
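Theorem 4.53 can be checked by simulating the exponential inter-arrival times directly; a sketch comparing the empirical mean, variance and \(\mathbb {P} (N_t=0)\) with their Poisson values (the parameters are arbitrary):

```python
import math
import random

random.seed(0)
lam, t = 1.5, 2.0   # arbitrary parameters

def N_t():
    # Arrivals with Exp(lam) inter-arrival times counted on [0, t], N_0 = 0
    s, k = random.expovariate(lam), 0
    while s <= t:
        k += 1
        s += random.expovariate(lam)
    return k

M = 100_000
counts = [N_t() for _ in range(M)]
mean = sum(counts) / M
var = sum(c * c for c in counts) / M - mean ** 2
p0 = counts.count(0) / M
print(mean, var, p0)   # mean and variance close to lam * t = 3
```

The equality of mean and variance, and \(\mathbb {P} (N_t=0)=e^{-\lambda t}\), are both signatures of the \({\mathcal P}(\lambda t)\) distribution.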

Note that Theorem 4.53 can also be taken as an alternative definition of the Poisson process, under the following form.

Definition 4.54

A stochastic process \({\mathbf N}=(N_t)_{t\in \scriptstyle \mathbb {R}_+}\) is a homogeneous Poisson process if it is a process with independent and stationary increments such that \(N_{t}-N_{s}\sim {\mathcal P}(\lambda (t-s))\) for all 0 ≤ s < t.

Indeed, the associated renewal process can be defined by \(S_0 = 0\) and then \(S_n\) recursively through the relation \((N_t = n) = (S_n \leq t < S_{n+1})\). Setting \(X_n = S_n - S_{n-1}\), we get \(\mathbb {P} (X_1>t)=\mathbb {P} (N_t =0)=e^{-\lambda t}\), and hence \(X_1\sim {\mathcal E}(\lambda )\).

The Poisson process can also be defined more qualitatively, as follows.

Definition 4.55

A stochastic process \({\mathbf N}=(N_t)_{t\in \scriptstyle \mathbb {R}_+}\) is a Poisson process if it is a process with independent increments such that \(t\mapsto N_t(\omega )\) is for almost all ω an increasing step function with jumps of size 1.

Let us now define the compound Poisson process.

Definition 4.56

Let N be a (homogeneous) Poisson process. Let (Y n) be an i.i.d. random sequence with finite mean and variance, and independent of N. The stochastic process ξ defined on \(\mathbb {R}_+\) by

$$\displaystyle \begin{aligned} \xi_t=\sum_{n=1}^{N_t}Y_n,\quad t\ge 0, \end{aligned} $$
(4.9)

with ξ t = 0 if N t = 0, is called a (homogeneous) compound Poisson process.

The compound Poisson process has independent increments.
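A compound Poisson trajectory is simulated by drawing the jump times and the jumps separately. A sketch with hypothetical uniform jumps \(Y_n \sim {\mathcal U}[0,1]\), checking \(\mathbb {E}\,\xi _t=\mathbb {E}\, N_t\,\mathbb {E}\, Y_1\), the identity used in Example 4.57 below:

```python
import random

random.seed(0)
lam, t = 2.0, 1.5   # arbitrary parameters

def xi_t():
    # One realization of xi_t: jumps at the Poisson arrival times in [0, t],
    # with hypothetical jump law Y_n ~ U[0, 1] (mean 1/2)
    s, total = random.expovariate(lam), 0.0
    while s <= t:
        total += random.uniform(0.0, 1.0)
        s += random.expovariate(lam)
    return total

M = 100_000
m_hat = sum(xi_t() for _ in range(M)) / M
print(m_hat)   # close to E(N_t) E(Y_1) = lam * t / 2 = 1.5
```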

▹ Example 4.57 (Cramér-Lundberg Process)

With the notation of Example 4.51, we suppose here that (Y n) is i.i.d. with distribution function G with mean μ and that N is a homogeneous Poisson process with intensity λ independent of (Y n). Thus, the process ξ defined by (4.9) is a homogeneous compound Poisson process, and (U t) is called a Cramér-Lundberg process.

We compute \(\mathbb {E}\, U_t=u+ct-\mathbb {E}\, N_t\mathbb {E}\, Y_1=u+ct-\lambda \mu t\). This gives a condition of viability of the company, namely c − λμ > 0. The probability of ruin before time t is \(r(t)=\mathbb {P} (U_t\le 0)=\mathbb {P} (\xi _t\ge u+ct)\). Since the distribution function of ξ t is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{P} (\xi_t\le x)&\displaystyle =&\displaystyle \sum_{n\ge 0}\mathbb{P} \Big(\sum_{i=1}^nY_i\le x,N_t=n\Big) =\sum_{n\ge 0}\mathbb{P} \Big(\sum_{i=1}^nY_i\le x\Big)\mathbb{P} (N_t=n)\\ &\displaystyle =&\displaystyle \sum_{n\ge 0}e^{-\lambda t}\frac{(\lambda t)^n}{n!}G^{*(n)}(x), \end{array} \end{aligned} $$

we get

$$\displaystyle \begin{aligned}r(t)=1-\sum_{n\ge 0}e^{-\lambda t}\frac{(\lambda t)^n}{n!}G^{*(n)}(u+ct). \end{aligned}$$
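When the claims are exponential, \(G^{*(n)}\) is a gamma distribution function and the ruin probability before t can be evaluated by truncating the series. A sketch with hypothetical parameters, cross-checked by Monte Carlo simulation of ξ t:

```python
import math
import random

random.seed(0)
# Hypothetical parameters; claims Y_n ~ Exp(1), so G^{*(n)} is a Gamma(n, 1) cdf
lam, c, u, t = 1.0, 0.5, 1.0, 2.0
x = u + c * t

def G_conv(n, y):
    # n-fold convolution of the Exp(1) distribution function
    if n == 0:
        return 1.0 if y >= 0 else 0.0
    return 1.0 - math.exp(-y) * sum(y ** k / math.factorial(k) for k in range(n))

# Truncated series for r(t) = P(xi_t >= u + ct)
r_series = sum(math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)
               * (1.0 - G_conv(n, x)) for n in range(100))

def xi_t():
    # Compound Poisson with Exp(1) claims
    s, total = random.expovariate(lam), 0.0
    while s <= t:
        total += random.expovariate(1.0)
        s += random.expovariate(lam)
    return total

M = 50_000
r_mc = sum(xi_t() >= x for _ in range(M)) / M
print(r_series, r_mc)
```

The two values agree up to the Monte Carlo fluctuation.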

The probability of ruin of the company—in other words the probability that the life duration T of the process is finite—is

$$\displaystyle \begin{aligned} \mathbb{P} (\lim_{t\to+\infty}U_t\le 0)=\mathbb{P} (T<+\infty)=\lim_{t\to+\infty}r(t), \end{aligned}$$

but this quantity remains difficult to compute under this form. \(\lhd \)

4.4.4 Asymptotic Results for Renewal Processes

Throughout this section, \((S_n)\) will be a renewal process such that \(0<\mu ={\mathbb {E}\,} X_1<+\infty \), with counting process \((N_t)_{t\in \scriptstyle \mathbb {R}_+}\). Recall that \(m(t)=\mathbb {E}\, N_t\) defines the renewal function of the process, and that we suppose F(0) < 1, where F is the distribution function of \(X_n\).

Proposition 4.58

The following statements hold true:

  1.

    \(S_n\) tends a.s. to infinity when n tends to infinity.

  2.

    \(S_n\) and \(N_t\) are a.s. finite for all \(n\in \mathbb {N}\) and \(t\in {\mathbb {R}}^{*}_+\).

  3.

    \(N_t\) tends a.s. to infinity and m(t) tends to infinity when t tends to infinity.

Proof

  1. 1.

    The law of large numbers implies that S nn tends a.s. to μ, so S n tends a.s. to infinity.

  2. 2.

    Since (S n ≤ t) = (N t ≥ n + 1), it follows from Point 1 that N t is a.s. finite.

    Moreover, \((S_n< +\infty )=\cap _{i=1}^n (X_i< +\infty )\) for \(n\in \mathbb {N}^*\), and X n is \(\mathbb {P} \)-a.s. finite, so S n is a.s. finite for all \(n\in \mathbb {N}^*\).

  3. 3.

    We compute

    $$\displaystyle \begin{aligned}\mathbb{P} (\lim_{t\rightarrow +\infty}N_t< +\infty)= \mathbb{P} [\cup_{n\ge 1}(X_n=+\infty)]\le\sum_{n\ge 1}\mathbb{P} (X_n=+\infty)=0. \end{aligned}$$

    Therefore, N t tends a.s. to infinity, from which it follows that m(t) tends to infinity. □

Proposition 4.59

The following convergence holds true,

$$\displaystyle \begin{aligned} \frac{1}{t} N_t \stackrel{a.s.}{\longrightarrow} \frac{1}{\mu },\quad t\rightarrow +\infty.\end{aligned}$$

The induced convergence of m(t)∕t to 1∕μ is known as the elementary renewal theorem.

Proof

Thanks to the law of large numbers, S nn tends a.s. to μ. Moreover, N t is a.s. finite and tends a.s. to infinity when t tends to infinity. Thanks to Theorem 4.17, \(S_{N_t}/N_t\) tends a.s. to μ, and the inequality

$$\displaystyle \begin{aligned} \frac{S_{N_t}}{N_t}\le \frac{t}{N_t}<\frac{S_{N_t+1}}{N_t+1}\frac{N_t+1}{N_t} \end{aligned}$$

yields the result. □
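The convergence N t∕t → 1∕μ is easy to observe numerically. A minimal Python sketch (the inter-arrival distribution and the horizon are arbitrary illustrative choices):

```python
import random

def renewal_count(t, draw, rng):
    """N_t: number of renewal epochs S_1, S_2, ... falling in [0, t]."""
    n, s = 0, draw(rng)
    while s <= t:
        n += 1
        s += draw(rng)
    return n

# Uniform(0, 2) inter-arrival times: mu = 1, so N_t / t should be close to 1/mu = 1.
rng = random.Random(42)
t = 20000.0
rate = renewal_count(t, lambda r: r.uniform(0.0, 2.0), rng) / t
```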

▹ Example 4.60 (Cramér-Lundberg Process (Continuation of Example 4.57))

We have

$$\displaystyle \begin{aligned} \frac{1}{t}U_t=\frac{u}{t}+c-\frac{N_t}{t}\frac{1}{N_t}\sum_{n=1}^{N_t}Y_n. \end{aligned}$$

According to Proposition 4.59, N t∕t tends a.s. to λ, since the inter-arrival times of N have mean 1∕λ. Hence, thanks to Theorem 4.17, U t∕t tends a.s. to c − λμ when t tends to infinity. \(\lhd \)

The next result is an extension of the central limit theorem to renewal processes.

Theorem 4.61

If \(0<\sigma ^2=\mathbb {V}\mathrm {ar}\, X_1<+\infty \) , then

$$\displaystyle \begin{aligned} \frac{N_t-t/\mu}{\sqrt{t\sigma^2/\mu^3}} \stackrel{\mathcal{L}}{\longrightarrow} {\mathcal N}(0,1),\quad t\rightarrow +\infty. \end{aligned}$$

Proof

Set

$$\displaystyle \begin{aligned}Z_t=\sqrt{t}\sqrt{\mu^3\over \sigma^2}\left({N_t\over t}-{1\over\mu}\right). \end{aligned}$$

We compute \(\mathbb {P} (Z_t\leq x)=\mathbb {P} (N_t\leq {t/\mu }+x\sqrt {t\sigma ^2/\mu ^3}).\) If n t denotes the integer part of \(t/ \mu +x\sqrt {t\sigma ^2/\mu ^3},\) then

$$\displaystyle \begin{aligned}\mathbb{P} (Z_t\leq x)=\mathbb{P} (S_{n_t}\geq t)=\mathbb{P} \left(\frac{S_{n_t}-n_t\mu}{\sigma\sqrt{n_t}}\geq \frac{t-n_t\mu}{\sigma\sqrt{n_t}}\right). \end{aligned}$$

By the central limit theorem, \(({S_{n_t}-n_t\mu })/{\sigma \sqrt {n_t}}\) tends in distribution to a standard Gaussian variable. Moreover, \(n_t\approx x\sqrt {t\sigma ^2/\mu ^3}+{t/\mu }\) when t tends to infinity. Hence \(t-n_t\mu \approx -x\sqrt {t\sigma ^2/\mu }\) and \(\sigma \sqrt {n_t}\approx \sigma \sqrt {t/ \mu }\), so \(({t-n_t\mu })/{\sigma \sqrt {n_t}}\approx -x,\) and the conclusion follows. □
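This Gaussian limit can be observed empirically. The Python sketch below samples the normalized statistic for a Poisson process (so that μ = σ = 1∕λ) and checks that its empirical mean and variance are close to 0 and 1; all parameter values are illustrative.

```python
import math
import random

def normalized_counts(lam=1.0, t=500.0, reps=2000, seed=7):
    """Samples of (N_t - t/mu) / sqrt(t sigma^2 / mu^3) for exponential
    inter-arrivals with rate lam, so that mu = sigma = 1/lam."""
    rng = random.Random(seed)
    mu = sigma = 1.0 / lam
    scale = math.sqrt(t * sigma ** 2 / mu ** 3)
    out = []
    for _ in range(reps):
        n, s = 0, rng.expovariate(lam)
        while s <= t:            # count arrivals up to time t
            n += 1
            s += rng.expovariate(lam)
        out.append((n - t / mu) / scale)
    return out

zs = normalized_counts()
mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs) - mean ** 2
```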

We state without proof the following two renewal theorems. A distribution function F is said to be arithmetic with period δ if the distribution is concentrated on \(\{x_0+n\delta \,:\,n\in \mathbb {N}\}\) for some \(x_0\ge 0\).

Theorem 4.62 (Blackwell’s Renewal)

If F is a non-arithmetic distribution function on \(\mathbb {R}_+\) , then, for all h > 0,

$$\displaystyle \begin{aligned} m(t)-m(t-h) \longrightarrow \frac{h}{\mu},\quad t\rightarrow +\infty. \end{aligned} $$
(4.10)

If F is arithmetic with period δ, the above result remains valid provided that h is a multiple of δ.

▹ Example 4.63 (Poisson Process)

In this case, the inter-arrival times are exponential, \(F(x)=1-e^{-\lambda x}\) for x ≥ 0, whose Laplace transform is \(\widetilde {F}(s)= \lambda /(\lambda +s)\). Moreover, N 0 = 0, so the Laplace transform of m is

$$\displaystyle \begin{aligned}\widetilde{m}(s)=\sum_{n\ge 1}(\widetilde{F}(s))^n=\frac{1}{1-\widetilde{F}(s)}-1= \frac{\lambda}{s}. \end{aligned}$$

Inverting the Laplace transform yields m(t) = λt. Thus, the relation m(t) − m(t − h) = λh holds for all t > h. \(\lhd \)

In order to state the key renewal theorem, let us introduce the direct Riemann integrable functions.

Definition 4.64

Let g be a function defined on \(\overline {\mathbb {R}}_+\). Let \(\overline {m}_n(a)\) denote the supremum and \( \underline {m}_n(a)\) the infimum of g on [(n − 1)a, na], for all a > 0. Then g is said to be direct Riemann integrable if \(\sum _{n\geq 1} \underline {m}_n(a)\) and \(\sum _{n\geq 1}\overline {m}_n(a)\) are finite for all a > 0 and if \(\lim _{a\rightarrow 0}a\sum _{n=1}^{\infty } \underline {m}_n(a)=\lim _{a\rightarrow 0}a \sum _{n=1}^{\infty } \overline {m}_n(a)\).

▹ Example 4.65

Any nonnegative, decreasing function g integrable over \(\mathbb {R}_+\) is direct Riemann integrable. Indeed, \(\overline {m}_n(a)=g((n-1)a)\) and \( \underline {m}_n(a)=g(na)\), so the sum telescopes:

$$\displaystyle \begin{aligned}\sum_{n\geq 1}(\overline{m}_n(a)-\underline{m}_n(a))\le g(0). \end{aligned}$$

Any nonnegative function integrable over \(\mathbb {R}_+\) and with compact support is also direct Riemann integrable. \(\lhd \)
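For the decreasing case the squeezing can be watched numerically: for g(x) = e −x, the normalized upper and lower sums bracket \(\int _0^{+\infty }g(x)dx=1\) and come together as a → 0. A rough Python sketch (the truncation point is an arbitrary cut-off where g is negligible):

```python
import math

def riemann_sums(g, a, x_max=60.0):
    """For decreasing g, the sup over [(n-1)a, na] is g((n-1)a) and the inf
    is g(na). Returns (a * lower sum, a * upper sum), truncated at x_max."""
    n_max = int(x_max / a)
    upper = a * sum(g((n - 1) * a) for n in range(1, n_max + 1))
    lower = a * sum(g(n * a) for n in range(1, n_max + 1))
    return lower, upper

lo_s, up_s = riemann_sums(lambda x: math.exp(-x), 0.01)
```

For decreasing g the gap between the two sums is a·g(0), so it vanishes linearly in a.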

Theorem 4.66 (Key Renewal)

If F is a non-arithmetic distribution function on \(\mathbb {R}_+\) and if \(g : {\mathbb {R}}_+\longrightarrow {\mathbb {R}}_+\) is direct Riemann integrable, then

$$\displaystyle \begin{aligned} \int_0^t g(t-x)dm(x) \longrightarrow\frac{1}{\mu} \int_0^{+\infty} g(x)dx,\quad t\rightarrow +\infty. \end{aligned} $$
(4.11)

If F is arithmetic with period δ and if \(\sum _{k\geq 0}g(x+k\delta )<+\infty \), then

$$\displaystyle \begin{aligned} m*g(x+n\delta) \longrightarrow\frac{\delta}{\mu} \sum_{k\geq 0}g(x+k\delta), \quad n \rightarrow +\infty. \end{aligned}$$

Note that Blackwell’s renewal theorem and the key renewal theorem are equivalent in the sense that (4.10) and (4.11) are equivalent.

For stopped renewal processes, the key renewal theorem takes the following form.

Proposition 4.67

Let F be a defective distribution function on \(\mathbb {R}_+\) . If \(g : {\mathbb {R}}_+\longrightarrow {\mathbb {R}}_+\) is direct Riemann integrable and such that \(g(+\infty )=\lim _{t\to +\infty }g(t)\) exists, then the solution of the renewal equation (4.8) p. 24 satisfies

$$\displaystyle \begin{aligned} h(t)=m*g(t)\longrightarrow\frac{g(+\infty)}{q},\quad t\to+\infty, \end{aligned}$$

where q = 1 − F(+∞).

Proof

According to Proposition 4.47, h(t) = m ∗ g(t). According to relation (4.7) p. 23, the limit of m(t) when t tends to infinity is

$$\displaystyle \begin{aligned} 1+F(+\infty)+F(+\infty)^2+\cdots=\frac{1}{1-F(+\infty)}=\frac{1}{q}, \end{aligned}$$

and the result follows. □

▹ Example 4.68 (Cramér-Lundberg Process (Continuation of Example 4.57))

Let us determine the probability of ruin of the Cramér-Lundberg process. Set \(\zeta (u) = \mathbb {P} (T= +\infty )= 1- \mathbb {P} (T< +\infty ),\) where T is the time of ruin of the process. We compute

$$\displaystyle \begin{aligned} \begin{array}{rcl} \zeta (u) &\displaystyle =&\displaystyle \int_0^{+\infty} \int_0^{u+cs}\mathbb{P} (S_1\in ds, Y_1\in dy, T\circ\theta_s =+\infty)\\ &\displaystyle =&\displaystyle \int_0^{+\infty} \int_0^{u+cs}\mathbb{P} (S_1\in ds)\mathbb{P} ( Y_1\in dy, T\circ\theta_s =+\infty\mid S_1=s)\\ &\displaystyle =&\displaystyle \int_0^{+\infty} \mathbb{P} (S_1\in ds) \int_0^{u+cs}\mathbb{P} (T \circ \theta_s=+\infty\mid S_1=s, Y_1=y)\mathbb{P} (Y_1\in dy) \\ &\displaystyle =&\displaystyle \int_0^{+\infty} \lambda e^{-\lambda s}ds\int_0^{u+cs}\zeta (u+cs-y)dG(y). \end{array} \end{aligned} $$

By the change of variable v = u + cs, we get

$$\displaystyle \begin{aligned}\zeta (u) = \lambda_0 \int_u^{+\infty} e^{-\lambda_0 (v-u)}dv\int_0^v \zeta (v-y)dG(y)=\lambda_0 e^{\lambda_0 u} g(u), \end{aligned}$$

where

$$\displaystyle \begin{aligned}g(u)= \int_u^{+\infty} e^{-\lambda_0 v}\Big(\int_0^v \zeta (v-y)dG(y)\Big)dv, \end{aligned}$$

and λ 0 = λ∕c. Differentiating this relation gives

$$\displaystyle \begin{aligned}\zeta' (u)=\lambda_0 e^{\lambda_0 u}[\lambda_0 g(u)+ g'(u)] =\lambda_0 \zeta(u)-\lambda_0 G*\zeta(u)= \lambda_0 [1-G]*\zeta (u).\end{aligned}$$

Integrating the above differential equation on [0, u] yields

$$\displaystyle \begin{aligned} \zeta (u) = \zeta (0) + \lambda_0\int_0^u \zeta (u-y)[1-G(y)]dy,\quad u\ge 0. \end{aligned} $$
(4.12)

This equation is a renewal equation with defective distribution function L with density λ 0[1 − G(y)], where L(+∞) = λ 0μ < 1. The case L(+∞) = 1 is excluded because then ζ(u) = 0 for all \(u\in \mathbb {R}_+\). Thanks to the key renewal theorem for stopped renewal processes,

$$\displaystyle \begin{aligned}\zeta (+\infty)= \frac{\zeta (0)}{1-L(+\infty)},\end{aligned}$$

or, finally, since ζ(+∞) = 1,

$$\displaystyle \begin{aligned}\zeta (0)= 1- \frac{\lambda \mu}{c}.\end{aligned}$$

This allows the computation of ζ(u) for all \(u\in \mathbb {R}_+\) through (4.12).

We have considered only nonnegative Y 1. The result remains valid for any variable Y 1: it is sufficient to take −∞ and u + cs instead of 0 and u + cs as bounds of the second integral in the above computation of ζ(u). \(\lhd \)
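For exponential claims, equation (4.12) also has the classical explicit solution \(\zeta (u)=1-(\lambda \mu /c)\,e^{-(1/\mu -\lambda /c)u}\), which provides a check for a direct numerical treatment of (4.12). Below is a Python sketch using a trapezoidal discretization of the convolution; the step size and parameter values are arbitrary choices.

```python
import math

def survival_numeric(lam, mu, c, u_max=4.0, h=0.005):
    """Trapezoidal solution of zeta(u) = zeta(0) + lam0 int_0^u zeta(u-y)(1-G(y)) dy
    with exponential claims G(y) = 1 - exp(-y/mu) and lam0 = lam/c."""
    lam0 = lam / c
    zeta0 = 1.0 - lam * mu / c          # zeta(0) = 1 - lam mu / c
    n = int(round(u_max / h))
    tail = [math.exp(-j * h / mu) for j in range(n + 1)]   # 1 - G at grid points
    z = [zeta0] * (n + 1)
    for i in range(1, n + 1):
        # trapezoid rule; the j = 0 endpoint involves z[i] itself, solved implicitly
        s = 0.5 * z[0] * tail[i] + sum(z[i - j] * tail[j] for j in range(1, i))
        z[i] = (zeta0 + lam0 * h * s) / (1.0 - 0.5 * lam0 * h)
    return z   # z[i] approximates zeta(i * h)

def survival_exact(lam, mu, c, u):
    # classical closed form for exponential claims (used here as a cross-check)
    return 1.0 - (lam * mu / c) * math.exp(-(1.0 / mu - lam / c) * u)
```

With λ = μ = 1 and c = 2, ζ(0) = 1∕2 and ζ(u) = 1 − e −u∕2∕2, and the numerical solution matches to a few digits.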

4.5 Exercises

∇ Exercise 4.1 (The AR(1) Process on \(\mathbb {N}\))

Let \((\varepsilon _n)_{n\in \scriptstyle \mathbb {N}}\) be a Gaussian white noise with variance 1. Let a ∈ ] − 1, 1[. An AR(1) process on \(\mathbb {N}\) is defined by setting X n = aX n−1 + ε n for n > 0 and X 0 = ε 0.

  1. 1.
    1. a.

      Write X n as a function of ε 0, …, ε n and determine its distribution.

    2. b.

      Determine the characteristic function of X n.

    3. c.

      Give the distribution of (X 0, …, X n).

  2. 2.

    Show that:

    1. a.

      (X n) converges in distribution and give the limit.

    2. b.

      (X 0 + ⋯ + X n)∕n converges in probability to 0;

    3. c.

      \( (X_0+\cdots +X_n)/\sqrt {n}\) converges in distribution; give the limit.

Solution

  1. 1.
    1. a.

For n ≥ 0, we have \(X_n=\sum _{p=0}^na^{n-p}\varepsilon _p\), and hence \(\mathbb {E}\, X_n=0\) and

      $$\displaystyle \begin{aligned} \mathbb{V}\mathrm{ar}\, X_n=a^{2n}+\cdots+a^2+1= \frac{1-a^{2(n+1)}}{1-a^2}. \end{aligned}$$
    2. b.

Since (ε 0, …, ε n) is a Gaussian vector, X n is a Gaussian variable, so the characteristic function of X n is

$$\displaystyle \begin{aligned} \phi_{X_n}(t)= \exp\Big(-\frac{1-a^{2(n+1)}}{2(1-a^2)}t^2\Big). \end{aligned}$$
    3. c.

The vector (X 0, …, X n) is a linear transform of the standard Gaussian vector (ε 0, …, ε n), so is a Gaussian vector too, with mean (0, …, 0) and covariance matrix given by

      $$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{C}\mathrm{ov}\,(X_j,X_{j+k})&\displaystyle =&\displaystyle \mathbb{E}\,( X_jX_{j+k})=\mathbb{E}\,\Big[\big(\sum_{p=0}^ja^{j-p} \varepsilon_p\big)\big(\sum_{q=0}^{j+k}a^{j+k-q}\varepsilon_q\big)\Big]\\ &\displaystyle =&\displaystyle a^{k}\sum_{l=0}^{j}a^{2l}= \frac{1-a^{2(j+1)}}{1-a^2}a^k. \end{array} \end{aligned} $$
  2. 2.
    1. a.

Clearly, \(\phi _{X_n}(t)\) tends to \( \exp [-t^2/2(1-a^2)]\), so X n converges in distribution to a random variable with distribution \({\mathcal N}(0,{1}/{(1-a^2)}).\)

    2. b.

      We compute

      $$\displaystyle \begin{aligned} \begin{array}{rcl}\frac 1n (X_0+\cdots+X_n)&\displaystyle =&\displaystyle \frac{X_0}{n}+\frac 1n \sum_{i=1}^n(aX_{i-1}+\varepsilon_i) =\frac an \sum_{i=0}^{n-1}X_i+\frac 1n\sum_{i=0}^n\varepsilon_i\\ &\displaystyle =&\displaystyle \frac an (X_0+\cdots+X_n)-\frac an X_n+\frac 1n \sum_{i=0}^n\varepsilon_i. \end{array} \end{aligned} $$

      We know that \(X_n\sim {\mathcal N}(0,{(1-a^{2(n+1)})}/{(1-a^2)})\), so X nn converges to 0 in probability. Moreover, by the strong law of large numbers, \(\sum _{i=0}^n\varepsilon _i/n\) converges a.s. to 0. Therefore, (1 − a)(X 0 + ⋯ + X n)∕n converges to 0 in probability, and hence (X 0 + ⋯ + X n)∕n too.

    3. c.

      In the same way, by the central limit theorem, \(\sum _{i=0}^n\varepsilon _i/\sqrt {n}\) converges in distribution to \({\mathcal N}(0,1)\), so \((1-a) (X_0+\cdots +X_n)/\sqrt {n}\) too, from which it follows that \((X_0+\cdots +X_n)/ \sqrt {n}\) converges in distribution to \({\mathcal N}(0,{1}/{(1-a)^2}).\)
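The closed form for \(\mathbb {V}\mathrm {ar}\, X_n\) in 1.a can be cross-checked against the recursion \(\mathbb {V}\mathrm {ar}\, X_n=a^2\,\mathbb {V}\mathrm {ar}\, X_{n-1}+1\), which follows from the independence of ε n and X n−1. A minimal Python sketch:

```python
def ar1_variance(a, n):
    """Var X_n for X_0 = eps_0 and X_k = a X_{k-1} + eps_k with Var eps_k = 1:
    iterate Var X_k = a^2 Var X_{k-1} + 1."""
    v = 1.0                      # Var X_0 = Var eps_0 = 1
    for _ in range(n):
        v = a * a * v + 1.0
    return v
```

The iteration reproduces (1 − a^{2(n+1)})∕(1 − a^2) exactly, and tends to the limit variance 1∕(1 − a^2) of question 2.a.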

∇ Exercise 4.2 (Generalization of AR and MA Processes)

Let (γ n) be a sequence of i.i.d. standard random variables. Let θ ∈ ] − 1, 1[.

  1. 1.

    Set V n = γ 1 + θγ 2 + ⋯ + θ n−1γ n, for \(n\in \mathbb {N}^*.\)

    1. a.

      Show that V n converges in square mean—use the Cauchy criterion.

    2. b.

      Set V =∑i≥1θ i−1γ i. Show that V n tends a.s. to V .

  2. 2.

    Let X 0 be a random variable independent of (γ n). Set X n = θX n−1 + γ n for n ≥ 1.

    1. a.

      Show that V n and X n − θ nX 0 have the same distribution for n ≥ 1.

    2. b.

      Let ρ denote the distribution of V. Compute the mean and the variance of ρ. Show that X n tends in distribution to ρ.

    3. c.

      Assume that X 0 ∼ ρ. Show that X n ∼ ρ for all n.

Solution

  1. 1.
    1. a.

      We have

      $$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{E}\,(V_m-V_{n+m})^2&\displaystyle =&\displaystyle \mathbb{E}\,(\theta^m \gamma_{m+1}+\cdots + \theta^{n+m-1}\gamma_{n+m})^2\\ &\displaystyle =&\displaystyle \sum_{i=0}^{n-1}\theta^{2(m+i)}\mathbb{E}\,(\gamma^2_{m+i})+ 2\sum_{i<j}\theta^{m+i}\theta^{m+j}\mathbb{E}\,(\gamma_{m+i}\gamma_{m+j}), \end{array} \end{aligned} $$

      so that \(\mathbb {E}\,(V_m-V_{n+m})^2=\sum _{i=0}^{n-1}\theta ^{2(m+i)}\), which converges to 0.

    2. b.

      According to Proposition 1.80, it is sufficient to show that \(\mathbb {P} ({\overline {\lim }}|V_n-V|>\varepsilon )=0\), or, using Borel-Cantelli lemma, that \(\sum _{n\geq 0}\mathbb {P} (|V_n-V|>\varepsilon )\) is finite for all ε > 0.

      Chebyshev’s inequality gives \(\mathbb {P} (|V_n-V|>\varepsilon ) \leq {\mathbb {E}\,[(V_n-V)^2]/\varepsilon ^2}\). Moreover,

$$\displaystyle \begin{aligned}\mathbb{E}\,[(V_n-V)^2]=\mathbb{E}\,[(\theta^{n}\gamma_{n+1}+\theta^{n+1}\gamma_{n+2}+\cdots)^2] =\frac{\theta^{2n}}{1-\theta^2}, \end{aligned}$$

      so \(\mathbb {P} (|V_n-V|>\varepsilon ) \leq {\theta ^{2n}/(1-\theta ^2)\varepsilon ^2},\) and the sum of the series is finite.

  2. 2.
    1. a.

      We can write X n − θ nX 0 = θ n−1γ 1 + ⋯ + θγ n−1 + γ n, from which the result follows, because all the γ i have the same distribution.

    2. b.

      The sequence (V n) converges to V in square mean—so also in mean, hence \(\mathbb {E}\, V=\lim _{n\rightarrow +\infty }\mathbb {E}\, V_n=0\) and

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{V}\mathrm{ar}\, V=\mathbb{E}\, (V^2)&\displaystyle =&\displaystyle \lim_{n\rightarrow +\infty}\mathbb{E}\,( V_n^2)=\lim_{n\rightarrow +\infty} \sum_{i=0}^{n-1}\theta^{2i}\mathbb{E}\,(\gamma_{i+1}^2) \\&\displaystyle =&\displaystyle \lim_{n\rightarrow +\infty}\frac{1-\theta^{2n}}{1-\theta^2}=\frac{1}{1-\theta^2}. \end{array} \end{aligned} $$

      Since X n − θ nX 0 ∼ V n, we can write X n = θ nX 0 + U n, where U n is a variable with the same distribution as V n. Therefore, (U n) converges in distribution to ρ too, and since (θ nX 0) converges a.s. to 0, (X n) converges in distribution to ρ.

    3. c.

We have X 0 ∼ V =∑n≥1θ n−1γ n ∼∑n≥1θ n−1γ n+1, so θX 0 ∼∑n≥1θ nγ n+1, with the right-hand side independent of γ 1. It follows that X 1 = θX 0 + γ 1 ∼∑n≥0θ nγ n+1, that is X 1 ∼ V ∼ ρ. The result follows by induction.

Note that if \(\gamma _n\sim {\mathcal N}(0,1)\), then (X n) is an AR(1) process and (V n) is an MA(n − 1) process on \(\mathbb {N}\). △
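The two variance computations in this solution are consistent with each other: since V − V n is independent of V n, \(\mathbb {V}\mathrm {ar}\, V_n+\mathbb {E}\,[(V_n-V)^2]=\mathbb {V}\mathrm {ar}\, V=1/(1-\theta ^2)\). A small Python check (unit-variance γ i, as in the statement):

```python
def vn_variance(theta, n):
    """Var V_n = sum_{i=0}^{n-1} theta^{2i}, for V_n = gamma_1 + theta gamma_2
    + ... + theta^{n-1} gamma_n with i.i.d. unit-variance gamma_i."""
    return sum(theta ** (2 * i) for i in range(n))

def tail_mse(theta, n):
    """E[(V_n - V)^2] = sum_{i >= n} theta^{2i} = theta^{2n} / (1 - theta^2)."""
    return theta ** (2 * n) / (1.0 - theta ** 2)
```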

∇ Exercise 4.3 (Sinusoidal Signals and Stationarity)

Let X be the stochastic process defined in Example 4.1. The variables ν, A and φ are not supposed to be constant, unless specifically stated. The variable φ takes values in [0, 2π[.

  1. 1.

    Assume that A = 1 and \(\varphi \sim {\mathcal U}(0,2\pi )\).

    1. a.

      Show that if ν is a.s. constant, then X is strictly stationary to the order 1.

    2. b.

      Show that if ν is continuous with density f and is independent of φ, then X is weakly stationary to the order 2; determine its spectral density h.

  2. 2.

    Suppose that ν and φ are constant.

    1. a.

      Give a necessary and sufficient condition on A for X to be weakly stationary to the order 1.

    2. b.

      Can X be weakly stationary to the order 2?

    3. c.

Let S = X + Y, where \(Y_t=B\sin {}(\nu t+\varphi )\). Give a necessary and sufficient condition on A and B for S to be weakly stationary to the order 1, and then to the order 2.

  3. 3.

    Suppose that ν is a.s. constant, that A is nonnegative and that A and φ are independent.

    1. a.

      Give a necessary and sufficient condition on φ for X to be weakly stationary to the order 1 and then 2.

    2. b.

      Give a necessary and sufficient condition on φ for X to be strictly stationary to the order 1.

    3. c.

      Let Z be the stochastic process defined by

      $$\displaystyle \begin{aligned}Z_t=A\cos{}(\nu t+\varphi)+B\sin{}(\nu t+\varphi),\end{aligned}$$

      where \(\varphi \sim {\mathcal U}(0,2\pi )\) is independent of A and B. Show that Z is weakly stationary to the order 2.

Solution

  1. 1.
    1. a.

Setting ψ = ντ + φ, we obtain \(X_{t+\tau }=\cos {}(\nu t+\psi )\). Since \(\varphi \sim {\mathcal U}[0, 2\pi ]\) and ντ is constant for a fixed τ, we have \(\psi \sim {\mathcal U}[\nu \tau ,\nu \tau + 2\pi ]\); since the cosine is 2π-periodic, ψ mod 2π \(\sim {\mathcal U}[0,2\pi ]\), and hence X t ∼ X t+τ.

    2. b.

      We have \(\mathbb {E}\, X_t=0\) and \(2\mathbb {E}\,(X_tX_{t+\tau })=\mathbb {E}\,(\cos [\nu (2t+\tau )+2\varphi ])+\mathbb {E}\,[\cos {}(\nu \tau )]\). We compute

      $$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle {\mathbb{E}\,(\cos[\nu(2t+\tau)+2\varphi])=}\\&\displaystyle =&\displaystyle \mathbb{E}\,(\cos[\nu(2t+\tau)])\mathbb{E}\,[\cos (2\varphi)]-\mathbb{E}\,(\sin[\nu(2t+\tau)])\mathbb{E}\,[\sin (2\varphi)]=0, \end{array} \end{aligned} $$

and \(\mathbb {E}\,[\cos {}(2\varphi )]=\mathbb {E}\,[\sin (2\varphi )]=0\), so X is weakly stationary to the order 2, with \(r(\tau )=\frac {1}{2}\int _{\scriptstyle \mathbb {R}}\cos {}(\lambda \tau )f(\lambda )d\lambda \). Since r is even and real,

      $$\displaystyle \begin{aligned}r(\tau)=\int_{\scriptstyle\mathbb{R}}e^{i \lambda\tau}h(\lambda)d\lambda= \int_{\scriptstyle\mathbb{R}_+}2\cos{}(\lambda\tau)h(\lambda)d\lambda,\end{aligned}$$

      and hence h(λ) = [f(−λ) + f(λ)]∕4.

  2. 2.
    1. a.

      We have \(\mathbb {E}\, X_t=[\cos {}(\nu t)]\mathbb {E}\, A\), so X is weakly stationary to the order 1 if and only if A is centered.

    2. b.

We compute \(2\mathbb {E}\,(X_tX_{t+\tau })=[\cos {}(2\nu t+\nu \tau )+\cos {}(\nu \tau )]\mathbb {E}\, (A^2).\) This is a function of τ only if \(\mathbb {E}\,( A^2)=0\), that is to say if A is null. Then the signal itself is null.

    3. c.

We have \(\mathbb {E}\, S_t=[\cos {}(\nu t)]\mathbb {E}\, A+[\sin {}(\nu t)]\mathbb {E}\, B\), which is constant in t if and only if A and B are centered. Moreover,

$$\displaystyle \begin{aligned} \begin{array}{rcl}2\mathbb{E}\,(S_tS_{t+\tau})&\displaystyle =&\displaystyle (\mathbb{V}\mathrm{ar}\, A+\mathbb{V}\mathrm{ar}\, B)\cos{}(\nu \tau)+(\mathbb{V}\mathrm{ar}\, A-\mathbb{V}\mathrm{ar}\, B)\cos{}(2\nu t+\nu \tau)\\ &\displaystyle &\displaystyle +2\,\mathbb{C}\mathrm{ov}\,(A,B)\sin{}(2\nu t+\nu \tau), \end{array} \end{aligned} $$

so S is weakly stationary to the order 2 if and only if A and B are uncorrelated and have the same variance.

  3. 3.
    1. a.

      We have \(\mathbb {E}\, X_t=[\cos {}(\nu t)\mathbb {E}\,(\cos \varphi )-\sin {}(\nu t)\mathbb {E}\,(\sin \varphi )]\mathbb {E}\, A\), which is constant in t if \(\mathbb {E}\,(\cos \varphi )=\mathbb {E}\,(\sin \varphi )=0\). Similarly,

      $$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle {2\mathbb{E}\,(X_tX_{t+\tau})=}\\ &\displaystyle =&\displaystyle [\cos{}(\nu\tau)+\cos{}(2\nu t+\nu \tau)\mathbb{E}\,(\cos 2\varphi) -\sin{}(2\nu t+\nu\tau)\mathbb{E}\,(\sin 2\varphi)]\mathbb{E}\, (A^2) \end{array} \end{aligned} $$

      so X is weakly stationary to the order 2 if \(\mathbb {E}\,(\cos \varphi )=\mathbb {E}\,(\sin \varphi )=0\) and \(\mathbb {E}\,(\cos 2\varphi )=\mathbb {E}\,(\sin 2\varphi )=0\).

    2. b.

      The variables \(X_{t+\tau }=A\cos {}(\nu t+\varphi +\nu \tau )\) and \(X_t=A\cos {}(\nu t+\varphi )\) have the same distribution if (A, φ + ντ) ∼ (A, φ) for all τ. This condition is fulfilled if and only if \(\varphi \sim {\mathcal U}(0,2\pi )\) and is independent of A.

    3. c.

      Since \(\mathbb {E}\, [\cos {}(\nu t+\varphi )]=\mathbb {E}\, [\sin {}(\nu t+\varphi )]=0\), the process is centered. Set \(X_t=A\cos {}(\nu t+\varphi )\) and \(Y_t=B\sin {}(\nu t+\varphi )\). We have

      $$\displaystyle \begin{aligned} \mathbb{E}\,(Z_tZ_{t+\tau})=\mathbb{E}\,(X_tX_{t+\tau})+\mathbb{E}\,(Y_tY_{t+\tau}) +\mathbb{E}\,(X_tY_{t+\tau})+\mathbb{E}\,(Y_tX_{t+\tau}). \end{aligned}$$

      We compute \( 2\mathbb {E}\,(X_tX_{t+\tau })=\cos {}(\nu \tau )\mathbb {E}\, (A^2)\) and \(2\mathbb {E}\,(Y_tY_{t+\tau })=\cos {}(\nu \tau )\mathbb {E}\, (B^2)\), and also \(2\mathbb {E}\,(X_tY_{t+\tau })=\sin {}(\nu \tau )\mathbb {E}\, (AB)=-2\mathbb {E}\,(Y_tX_{t+\tau }), \) so Z is indeed weakly stationary to the order 2. △
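The stationarity computations can be illustrated by simulation in the simplest setting of question 1, A = 1 and \(\varphi \sim {\mathcal U}(0,2\pi )\): the empirical covariance of \(X_t=\cos {}(\nu t+\varphi )\) depends on τ only, with value \(\cos {}(\nu \tau )/2\). A Python sketch with arbitrary ν, t and τ:

```python
import math
import random

def cov_xx(t, tau, nu=2.0, nsim=200000, seed=3):
    """Monte Carlo estimate of E[X_t X_{t+tau}] for X_t = cos(nu t + phi),
    phi ~ U(0, 2 pi); the target value is cos(nu tau)/2, whatever t is."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(nsim):
        phi = rng.uniform(0.0, 2.0 * math.pi)
        acc += math.cos(nu * t + phi) * math.cos(nu * (t + tau) + phi)
    return acc / nsim
```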

∇ Exercise 4.4 (Alternated Renewal Process and Availability)

Consider a component starting operating at time S 0 = 0. When it fails, it is replaced by a new one; when that one fails, it is replaced in turn, and so on. Suppose that the sequence (X n) of operating durations of the successive components is i.i.d., and that the sequence (Y n) of replacement times is i.i.d. too and independent of (X n).

  1. 1.

    Show that (S n), defined by S 0 = 0 and S n = S n−1 + X n + Y n for n ≥ 1, is a renewal process.

  2. 2.
    1. a.

      Write the event E t=“the system is in good shape at time t” as a function of (X n) and (S n).

    2. b.

      Infer from a. the instantaneous availability \(A(t)=\mathbb {P} (E_t)\).

    3. c.

      If \(\mathbb {E}\, X_1+\mathbb {E}\, Y_1 < +\infty \), compute the limit availability A =limt→+A(t).

Solution

  1. 1.

    The sequence (T n) = (X n + Y n) is i.i.d., and, according to Definition 4.44, (S n) is indeed a renewal process.

  2. 2.
    1. a.

      We can write

$$\displaystyle \begin{aligned} E_t = (X_1>t)\bigcup \Big[\bigcup_{n\ge 1}(S_n \le t)\cap(X_{n+1}>t-S_n)\Big]. \end{aligned}$$
    2. b.

      Let F and G denote the respective distribution functions of X 1 and Y 1. The distribution function of T n is

      $$\displaystyle \begin{aligned} H(t)=F*G(t)=\int_0^tF(t-x)dG(x), \quad t\ge 0. \end{aligned}$$

      Therefore, setting R(t) = 1 − F(t),

      $$\displaystyle \begin{aligned} A(t)= {\mathbb{P} }(E_t)= {\mathbb{P} }(X_1>t)+\sum_{n\ge 1}{\mathbb{P} } (S_n\le t, X_{n+1}>t-S_n). \end{aligned}$$

We compute, for n ≥ 1,

$$\displaystyle \begin{aligned} {\mathbb{P} } (S_n\le t, X_{n+1}>t-S_n)=\int_0^t R(t-y)dH^{*(n)}(y). \end{aligned}$$

Thus

$$\displaystyle \begin{aligned} A(t)=R(t)+\int_0^t R(t-y)dM(y)=R*m(t),\end{aligned}$$

where \(M(t)=\sum _{n\ge 1}H^{*(n)}(t)\) and \(m(t)=\mathbb {1}_{\scriptstyle \mathbb {R}_+}(t)+M(t)\). Finally, the key renewal theorem yields

$$\displaystyle \begin{aligned} A=\lim_{t\rightarrow +\infty} A(t)= \frac{{\mathbb{E}\,} X_1}{{\mathbb{E}\,} X_1 + {\mathbb{E}\,} Y_1}. \end{aligned}$$

    Note that the same problem will be modelled by a semi-Markov process in Exercise 5.5. △
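The limit availability can also be observed on a single long trajectory, since the fraction of up-time over [0, T] tends to the same limit \({\mathbb {E}\,}X_1/({\mathbb {E}\,}X_1+{\mathbb {E}\,}Y_1)\). A Python sketch, assuming exponential operating and replacement durations (an arbitrary illustrative choice):

```python
import random

def uptime_fraction(mean_up=2.0, mean_down=1.0, horizon=50000.0, seed=11):
    """Fraction of [0, horizon] spent operating, for an alternating renewal
    process with exponential up and down durations; the long-run value
    should be mean_up / (mean_up + mean_down)."""
    rng = random.Random(seed)
    t = up = 0.0
    while t < horizon:
        x = rng.expovariate(1.0 / mean_up)        # operating duration
        up += min(x, horizon - t)                 # clip at the horizon
        t += x
        if t >= horizon:
            break
        t += rng.expovariate(1.0 / mean_down)     # replacement duration
    return up / horizon
```

With means 2 and 1 the limit availability is 2∕3.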

∇ Exercise 4.5 (Poisson Process)

The notation is that of Sect. 4.4.3.

  1. 1.

    Set U n = S nn, for \(n\in \mathbb {N}^*\). Show that (U n) and (n 1∕2(λU n − 1)) converge in distribution and give the limits.

  2. 2.

    Set V n = N nn, for \(n\in \mathbb {N}^*\). Determine the characteristic function of V n. Show that (V n) and ((nλ)1∕2(V n − λ)) converge in distribution; give the limits.

Solution

  1. 1.

    Thanks to the law of large numbers, U n converges a.s. to 1∕λ, and, thanks to the central limit theorem, n 1∕2(λU n − 1) converges in distribution to the standard normal distribution.

  2. 2.

    We compute

    $$\displaystyle \begin{aligned}\mathbb{E}\,(e^{i uN_t})=\sum_{k\geq 0}e^{i uk}\mathbb{P} (N_t=k)= e^{-\lambda t}\sum_{k\geq 0}\frac{(\lambda te^{i u})^k}{k\,!}=e^{-\lambda t(1-e^{i u})}. \end{aligned}$$

Hence \(\phi _{V_n}(u)=\mathbb {E}\,(e^{i uN_n/n})=e^{-\lambda n(1-e^{i u/n})}\) converges to e iuλ, and V n converges in distribution to λ. Moreover, \(Z_n=\sqrt {n}(V_n-\lambda )/\sqrt {\lambda }={N_n/\sqrt {n\lambda }}- \sqrt {n\lambda },\) so \(\phi _{Z_n}(u)=\exp [-\lambda n(1-e^{i u/\sqrt {n\lambda }})-iu\sqrt {n\lambda }],\) which converges to \(e^{-u^2/2}\), and Z n converges in distribution to the standard normal distribution. △

∇ Exercise 4.6 (Superposition and Decomposition of Poisson Processes)

  1. 1.

    Let N and \(\mathbf {\widetilde {N}}\) be two independent Poisson processes with respective intensities λ > 0 and μ > 0. Show that the process K, defined by \(K_t=N_t + \widetilde N_t\), is a Poisson process, called superposition. Give its intensity.

  2. 2.

    Let N be a Poisson process with intensity λ > 0, whose arrivals are of two different types, A and B, with respective probabilities p and 1 − p independent of the arrivals times. Show that the counting process \({\mathbf M}=(M_t)_{t\in \scriptstyle \mathbb {R}_+}\) of the arrivals of type A is a Poisson process. Give its intensity.

Solution

  1. 1.

    We can write \(K_t-K_s=(N_t-N_s)+(\widetilde N_t-\widetilde N_s)\), so K is indeed a process with independent increments. Moreover,

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \mathbb{P} (K_t=n)&\displaystyle =&\displaystyle \sum_{m=0}^n\mathbb{P} (N_t=m)\mathbb{P} (\widetilde N_t=n-m)\\ &\displaystyle =&\displaystyle \sum_{m=0}^ne^{-\lambda t}\frac{(\lambda t)^m}{m!}e^{-\mu t} \frac{(\mu t)^{n-m}}{(n-m)!}= e^{-(\lambda+\mu)t}\frac{[(\lambda+\mu)t]^n}{n!}. \end{array} \end{aligned} $$

    One can show similarly that \(K_{t+s}-K_t\sim {\mathcal P}((\lambda + \mu )s)\). Proposition 4.54 yields that K is a Poisson process, with intensity λ + μ.

  2. 2.

By definition, the process N is the counting process of the renewal process associated with the arrival times of A and B. The corresponding inter-arrival times are i.i.d.; in particular, the inter-arrival times of the type A arrivals are i.i.d., and the associated counting process is M. Moreover,

    $$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathbb{P} } (M_t=k)&\displaystyle =&\displaystyle \sum_{n\ge k}{\mathbb{P} } (M_t=k\mid N_t=n){\mathbb{P} }(N_t=n)\\ &\displaystyle =&\displaystyle \sum _{n\ge k} \binom{n}{k}p^k(1-p)^{n-k}e^{-\lambda t}{(\lambda t)^n \over n!} =e^{-p\lambda t}{(p\lambda t)^k \over k!}. \end{array} \end{aligned} $$

One can show similarly that \(M_{t+s}-M_t\sim {\mathcal P}(p\lambda s)\) so that M is a Poisson process with intensity pλ. △
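The thinning computation can be checked by simulation: keep each arrival of a simulated Poisson process with probability p, and compare the empirical mean and variance of M t with pλt (for a Poisson distribution the two coincide). A Python sketch with illustrative parameters:

```python
import random

def thinned_counts(lam=2.0, p=0.3, t=4.0, nsim=50000, seed=5):
    """Samples of M_t: each arrival of a Poisson(lam) process on [0, t] is
    kept with probability p; M_t should be Poisson with mean p*lam*t."""
    rng = random.Random(seed)
    out = []
    for _ in range(nsim):
        m, s = 0, rng.expovariate(lam)
        while s <= t:
            if rng.random() < p:    # type-A arrival
                m += 1
            s += rng.expovariate(lam)
        out.append(m)
    return out

ms = thinned_counts()
mean = sum(ms) / len(ms)
var = sum(m * m for m in ms) / len(ms) - mean ** 2
```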