15.1 Definitions, Simplest Properties, and Examples

In Chap. 13 we considered sequences of dependent random variables X 0,X 1,… forming Markov chains. Dependence was described there in terms of transition probabilities determining the distribution of X n+1 given X n . That enabled us to investigate rather completely the properties of Markov chains.

In this chapter we consider another type of sequence of dependent random variables. Now dependence will be characterised only by the mean value of X n+1 given the whole “history” X 0,…,X n . It turns out that one can also obtain rather general results for such sequences.

Let a probability space \(\langle\varOmega, \mathfrak{F}, \mathbf {P}\rangle\) be given together with a sequence of random variables X 0,X 1,… defined on it and an increasing family (or flow) of σ-algebras \(\{ \mathfrak{F}_{n} \}_{n \ge0}\): \(\mathfrak{F}_{0} \subseteq\mathfrak{F}_{1}\subseteq\cdots\subseteq \mathfrak{F}_{n} \subseteq\cdots\subseteq\mathfrak{F}\).

Definition 15.1.1

A sequence of pairs \(\{ X_{n} ,\mathfrak{F}_{n};\, n\ge0 \}\) is called a stochastic sequence if, for each n≥0, X n is \(\mathfrak{F}_{n}\)-measurable. A stochastic sequence is said to be a martingale (one also says that {X n } is a martingale with respect to the flow of σ-algebras \(\{\mathfrak{F}_{n}\}\)) if, for every n≥0,

  1. (1)
    $$\begin{aligned} \mathbf{E}|X_n|<\infty, \end{aligned}$$
    (15.1.1)
  2. (2)

    X n is measurable with respect to \(\mathfrak{F}_{n}\),

  3. (3)
    $$\begin{aligned} \mathbf{E} (X_{n+1}\mid\mathfrak{F}_n )=X_n. \end{aligned}$$
    (15.1.2)

A stochastic sequence \(\{ X_{n}, \mathfrak{F}_{n} ; \, n \ge0 \}\) is called a submartingale (supermartingale) if conditions (1)–(3) hold with the sign “=” replaced in (15.1.2) with “≥” (“≤”, respectively).

We will say that a sequence {X n } forms a martingale (submartingale, supermartingale) if, for \(\mathfrak{F}_{n} = \sigma(X_{0}, \ldots, X_{n})\), the pairs \(\{ X_{n}, \mathfrak{F}_{n} \}\) form a sequence with the same name. Submartingales and supermartingales are often called semimartingales.

It is evident that relation (15.1.2) persists if we replace X n+1 on its left-hand side with X m for any m>n. Indeed, by virtue of the properties of conditional expectations,

$$\mathbf{E}(X_m | \mathfrak{F}_n) = \mathbf{E} \bigl[ \mathbf{E}(X_{m} | \mathfrak{F}_{m-1}) \big| \mathfrak{F}_n \bigr] = \mathbf{E}(X_{m-1} | \mathfrak{F}_n) = \cdots= X_n . $$

A similar assertion holds for semimartingales.

If {X n } is a martingale, then E(X n+1|σ(X 0,…,X n ))=X n , and, by a property of conditional expectations,

$$\mathbf{E}\bigl(X_{n+1} \big| \sigma(X_n )\bigr) = \mathbf{E} \bigl[\mathbf{E}\bigl(X_{n+1} \big| \sigma(X_0, \ldots, X_n )\bigr) \big| \sigma (X_n) \bigr] = \mathbf{E}\bigl(X_{n} \big| \sigma(X_n ) \bigr) = X_n . $$

So, for martingales, as for Markov chains, we have

$$\mathbf{E}\bigl(X_{n+1} \big| \sigma(X_0, \ldots, X_n )\bigr) = \mathbf {E}\bigl(X_{n+1} \big| \sigma(X_n)\bigr) . $$

The similarity, however, is limited to this relation: for a martingale, the corresponding equality for the conditional distributions need not hold, and instead the additional condition

$$\mathbf{E}\bigl(X_{n+1} \big| \sigma(X_n )\bigr) = X_n $$

is imposed.

Example 15.1.1

Let ξ n , n≥0, be independent. Then X n =ξ 0+⋯+ξ n form a martingale (submartingale, supermartingale) if E ξ n =0 (E ξ n ≥0, E ξ n ≤0). It is obvious that X n also form a Markov chain. The same is true of \(X_{n} = \prod_{k=0}^{n} \xi_{k}\) if E ξ n =1.
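
The martingale property implies in particular that E X n stays constant in n. The following minimal simulation sketch (Python with NumPy; the uniform increment distributions and sample sizes are illustrative assumptions, not part of the example) checks this constancy for both the additive and the multiplicative martingale of Example 15.1.1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 100_000, 20

# Additive martingale: partial sums of independent increments with E xi = 0.
xi = rng.uniform(-1.0, 1.0, size=(n_paths, n_steps))
X_add = xi.cumsum(axis=1)

# Multiplicative martingale: partial products of independent factors with E eta = 1.
eta = rng.uniform(0.5, 1.5, size=(n_paths, n_steps))
X_mul = eta.cumprod(axis=1)

# For a martingale E X_n is constant in n (here 0 and 1, up to Monte Carlo error).
cols = [0, 4, 9, 19]
print("E X_n, sums:    ", np.round(X_add.mean(axis=0)[cols], 3))
print("E X_n, products:", np.round(X_mul.mean(axis=0)[cols], 3))
```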

Example 15.1.2

Let ξ n , n≥0, be independent. Then

$$X_n = \sum_{k=1}^n \xi_{k-1} \xi_k,\quad n \ge1,\qquad X_0 = \xi_0, $$

form a martingale if E ξ n =0, because

$$\mathbf{E}\bigl(X_{n+1} \big| \sigma(X_0, \ldots, X_n)\bigr) = X_n + \mathbf{E}\bigl(\xi_n \xi_{n+1} \big| \sigma(\xi _n)\bigr) = X_n. $$

Clearly, {X n } is not a Markov chain here. An example of a sequence which is a Markov chain but not a martingale can be obtained, say, if we consider a random walk on a segment with reflection at the endpoints (see Example 13.1.1).

As well as {0,1,…} we will use other sets of indices for X n , for example, {−∞<n<∞} or {n≤−1}, and also sets of integers including infinite values ±∞, say, {0≤n≤∞}. We will denote these sets by a common symbol \(\mathcal{N}\) and write martingales (semimartingales) as \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\). By \(\mathfrak {F}_{-\infty}\) we will understand the σ-algebra \(\bigcap_{n \in\mathcal{N}} \mathfrak{F}_{n}\), and by \(\mathfrak{F}_{\infty}\) the σ-algebra \(\sigma (\bigcup_{n \in\mathcal{N}} \mathfrak{F}_{n} )\) generated by \(\bigcup_{n \in\mathcal{N}} \mathfrak{F}_{n}\), so that \(\mathfrak{F}_{-\infty} \subseteq\mathfrak{F}_{n}\subseteq\mathfrak {F}_{\infty}\subseteq\mathfrak{F}\) for any \(n \in\mathcal{N}\).

Definition 15.1.2

A stochastic sequence \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\) is called a martingale (submartingale, supermartingale), if the conditions of Definition 15.1.1 hold for any \(n \in\mathcal{N}\).

If \(\{ X_{n} ,\mathfrak{F}_{n};\, n \in\mathcal{N}\}\) is a martingale and the left boundary n 0 of \(\mathcal{N}\) is finite (for example, \(\mathcal{N}= \{ 0, 1, \ldots\}\)), then the martingale \(\{ X_{n} ,\mathfrak{F}_{n} \}\) can always be extended “to the whole axis” by setting \(\mathfrak{F}_{n} := \mathfrak {F}_{n_{0}}\) and \(X_{n} := X_{n_{0}}\) for n<n 0. The same holds for the right boundary as well. Therefore if a martingale (semimartingale) \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\) is given, then without loss of generality we can always assume that one is actually given a martingale (semimartingale) \(\{ X_{n}, \mathfrak{F}_{n} ;\, {-}\infty\le n \le \infty\}\).

Example 15.1.3

Let \(\{ \mathfrak{F}_{n},\, {-}\infty\le n \le \infty\}\) be a given sequence of increasing σ-algebras, and ξ a random variable on \(\langle\varOmega, \mathfrak{F}, \mathbf{P}\rangle\), E|ξ|<∞. Then \(\{ X_{n}, \mathfrak{F}_{n} ; {-}\infty\leq n \leq\infty\}\) with \(X_{n} = \mathbf{E}(\xi| \mathfrak{F}_{n} )\) forms a martingale.

Indeed, by the property of conditional expectations, for any m≤∞ and m>n,

$$\mathbf{E}(X_m | \mathfrak{F}_n) = \mathbf{E}\bigl[ \mathbf{E}(\xi| \mathfrak{F}_m ) \big| \mathfrak{F}_n \bigr] = \mathbf{E}(\xi| \mathfrak{F}_n) = X_n . $$

Definition 15.1.3

The martingale of Example 15.1.3 is called a martingale generated by the random variable ξ (and the family \(\{ \mathfrak{F}_{n} \}\)).

Definition 15.1.4

A set \(\mathcal{N}_{+}\) is called the right closure of \(\mathcal{N}\) if:

  1. (1)

    \(\mathcal{N}_{+} = \mathcal{N}\) when the maximal element of \(\mathcal{N}\) is finite;

  2. (2)

    \(\mathcal{N}_{+} = \mathcal{N}\cup\{ \infty\}\) if \(\mathcal{N}\) is not bounded from the right.

If \(\mathcal{N}= \mathcal{N}_{+}\) then we say that \(\mathcal{N}\) is right closed. A martingale (semimartingale) \(\{ X_{n}, \mathfrak{F};\, n \in\mathcal{N}\}\) is said to be right closed if \(\mathcal{N}\) is right closed.

Lemma 15.1.1

A martingale \(\{ X_{n}, \mathfrak{F};\, n \in\mathcal{N}\}\) is generated by a random variable if and only if it is right closed.

Proof

The proof of the lemma is trivial. In one direction it follows from Example 15.1.3, and in the other from the equality

$$\mathbf{E}(X_N | \mathfrak{F}_n) = X_n , \quad N = \sup \{ k;\, k\in\mathcal{N}\}, $$

which implies that \(\{ X_{n}, \mathfrak{F}\}\) is generated by X N . The lemma is proved. □

Now we consider an interesting and more concrete example of a martingale generated by a random variable.

Example 15.1.4

Let ξ 1,ξ 2,… be independent and identically distributed and assume E|ξ 1|<∞. Set

$$S_n = \xi_1 + \cdots+ \xi_n , \quad X_{-n} = {S_n }/n ,\quad \mathfrak{F}_{-n} = \sigma(S_n, S_{n+1}, \ldots) = \sigma(S_n, \xi _{n+1}, \ldots ) . $$

Then \(\mathfrak{F}_{-n} \subset\mathfrak{F}_{-n+1}\) and, for any 1≤k≤n, by symmetry

$$\mathbf{E}(\xi_k | \mathfrak{F}_{-n}) = \mathbf{E}( \xi_1 | \mathfrak{F}_{-n}) . $$

From this it follows that

$$S_n = \mathbf{E}(S_n | \mathfrak{F}_{-n}) = \sum_{k=1}^n \mathbf{E}(\xi_k | \mathfrak{F}_{-n}) = n \mathbf{E}(\xi_1 | \mathfrak{F}_{-n}) ,\quad \frac{S_n}{n} = \mathbf{E}(\xi_1 | \mathfrak{F}_{-n}) . $$

This means that \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \le-1 \}\) forms a martingale generated by ξ 1.

We will now obtain a series of auxiliary assertions giving the simplest properties of martingales and semimartingales. When considering semimartingales, we will confine ourselves to submartingales only, since the corresponding properties of supermartingales will follow immediately if one considers the sequence Y n =−X n , where {X n } is a submartingale.

Lemma 15.1.2

  1. (1)

The property that \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in\mathcal {N}\}\) is a martingale is equivalent to invariability in m≥n of the set functions (integrals)

    $$ \mathbf{E}(X_m ;\, A) = \mathbf{E}(X_n ; A) $$
    (15.1.3)

    for any \(A \in\mathfrak{F}_{n}\). In particular, E X m =const.

  2. (2)

The property that \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in\mathcal {N}\}\) is a submartingale is equivalent to the monotone increase in m≥n of the set functions

    $$ \mathbf{E}(X_m ; A) \ge\mathbf{E}(X_n ; A) $$
    (15.1.4)

    for every \(A \in\mathfrak{F}_{n}\). In particular, E X m ↑.

Proof

The proof follows immediately from the definitions. If (15.1.3) holds then, by the definition of conditional expectation, \(X_{n} = \mathbf{E}(X_{m} | \mathfrak{F}_{n })\), and vice versa. Now let (15.1.4) hold. Put \(Y_{n} = \mathbf{E}(X_{m} | \mathfrak {F}_{n})\). Then (15.1.4) implies that E(Y n ;A)≥E(X n ;A) and E(Y n −X n ;A)≥0 for any \(A \in\mathfrak{F}_{n}\). From this it follows that \(Y_{n} = \mathbf{E} (X_{m} | \mathfrak{F}_{n }) \ge X_{n}\) with probability 1. The converse assertion can be obtained as easily as the direct one. The lemma is proved. □

Lemma 15.1.3

Let \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\) be a martingale, g(x) be a convex function, and E|g(X n )|<∞. Then \(\{ g(X_{n}), \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\) is a submartingale.

If, in addition, g(x) is nondecreasing, then the assertion of the lemma remains true when \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in \mathcal{N}\}\) is a submartingale.

Proof

The proof of both assertions follows immediately from Jensen’s inequality

$$\begin{aligned} \mathbf{E} \bigl(g (X_{n+1}) \big|\mathfrak{F}_n \bigr) \ge g \bigl(\mathbf{E}(X_{n+1} |\mathfrak{F}_n) \bigr) \ge g \bigl(\mathbf{E}(X_n |\mathfrak {F}_n) \bigr). \end{aligned}$$

 □

Clearly, the function g(x)=|x|p for p≥1 satisfies the conditions of the first part of the lemma, and the function g(x)=e λx for λ>0 meets the conditions of the second part of the lemma.

Lemma 15.1.4

Let \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\) be a right closed submartingale. Then, for X n (a)=max{X n ,a} and any a, \(\{ X_{n}(a), \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\) is a uniformly integrable submartingale.

If \(\{ X_{n}, \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\) is a right closed martingale, then it is uniformly integrable.

Proof

Let \(N := \sup\{ k: k\in\mathcal{N}\}\). Then, by Lemma 15.1.3, \(\{ X_{n}(a), \mathfrak{F}_{n} ;\; n \in\mathcal {N}\}\) is a submartingale. Hence, for any c>0,

$$c \mathbf{P}\bigl(X_n(a) > c\bigr) \le\mathbf{E} \bigl(X_n(a) ;\, X_n(a) > c\bigr) \le \mathbf{E} \bigl(X_N(a) ;\, X_n(a) > c\bigr) \le \mathbf{E}X_N^+(a) $$

(here X +=max(0,X)) and so

$$\mathbf{P}\bigl(X_n(a) > c\bigr) \le\frac{1}{c} \mathbf{E} \bigl(X_N^+(a)\bigr) \to0 , $$

uniformly in n as c→∞. Therefore we get the required uniform integrability:

$$\sup_n \mathbf{E}\bigl(X_n(a) ;\, X_n(a) > c\bigr) \le\sup_n \mathbf {E} \bigl(X_N(a) ;\, X_n(a) > c\bigr) \to0 , $$

since sup n P(X n (a)>c)→0 as c→∞ (see Lemma A3.2.3 in Appendix 3; by truncating at the level a we avoided estimating the “negative tails”).

If \(\{ X_{n} , \mathfrak{F}_{n} ;\, n \in\mathcal{N}\}\) is a martingale, then its uniform integrability will follow from the first assertion of the lemma applied to the submartingale \(\{ |X_{n} | , \mathfrak{F}_{n} ;\, n \in\mathcal{N} \}\). The lemma is proved. □

The nature of martingales can be clarified to some extent by the following example.

Example 15.1.5

Let ξ 1,ξ 2,… be an arbitrary sequence of random variables, E|ξ k |<∞, \(\mathfrak{F}_{n} = \sigma (\xi_{1}, \ldots, \xi_{n})\) for n≥1, \(\mathfrak{F}_{0} = (\varnothing, \varOmega)\) (the trivial σ-algebra),

$$S_n = \sum _{k=1}^n \xi_k , \qquad Z_n = \sum _{k=1}^n \mathbf{E}(\xi_k | \mathfrak{F}_{k-1}) , \qquad X_n =S_n -Z_n . $$

Then \(\{ X_{n} , \mathfrak{F}_{n} ;\, n \ge1 \}\) is a martingale. This is a consequence of the fact that

$$\mathbf{E}(S_{n+1} - Z_{n+1} | \mathfrak{F}_n ) = \mathbf{E}\bigl(X_n + \xi_{n+1} - \mathbf{E}( \xi_{n+1} | \mathfrak{F}_n) \big| \mathfrak{F}_n \bigr) = X_n . $$

In other words, for an arbitrary sequence {ξ n }, the sequence S n can be “compensated” by a so-called “predictable” (in the sense that its value is determined by S 1,…,S n−1) sequence Z n so that S n Z n will be a martingale.
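
A short sketch of this decomposition (Python with NumPy). The dependent sequence used here, with conditional mean 0.5|ξ k−1|, is a hypothetical choice made only because E(ξ k | F k−1) is available in closed form; the compensated sums S n −Z n then have constant mean, as a martingale should, while S n itself drifts.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 100_000, 30

# A dependent sequence with an explicitly computable conditional mean:
# xi_k = 0.5*|xi_{k-1}| + eps_k, eps_k ~ N(0,1), so E(xi_k | F_{k-1}) = 0.5*|xi_{k-1}|.
xi = np.zeros((n_paths, n_steps + 1))          # column 0 plays the role of xi_0 = 0
cond_mean = np.zeros_like(xi)                  # increments of the compensator Z_n
for k in range(1, n_steps + 1):
    cond_mean[:, k] = 0.5 * np.abs(xi[:, k - 1])
    xi[:, k] = cond_mean[:, k] + rng.standard_normal(n_paths)

S = xi[:, 1:].cumsum(axis=1)                   # S_n = xi_1 + ... + xi_n
Z = cond_mean[:, 1:].cumsum(axis=1)            # Z_n = sum of E(xi_k | F_{k-1})
X = S - Z                                      # the compensated sums

cols = [0, 9, 29]
print("E S_n:", np.round(S.mean(axis=0)[cols], 3))   # drifts upward
print("E X_n:", np.round(X.mean(axis=0)[cols], 3))   # stays ~ 0, as for a martingale
```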

15.2 The Martingale Property and Random Change of Time. Wald’s Identity

Throughout this section we assume that \(\mathcal{N}= \{ n \ge0 \}\). Recall the definition of a stopping time.

Definition 15.2.1

A random variable ν will be called a stopping time or a Markov time (with respect to an increasing family of σ-algebras \(\{ \mathfrak{F}_{n} ;\, n \ge0 \}\)) if, for any n≥0, \(\{ \nu\le n \} \in\mathfrak{F}_{n} \).

It is obvious that a constant ν≡m is a stopping time. If ν is a stopping time, then, for any fixed m, ν(m)=min(ν,m) is also a stopping time, since for n≥m we have

$$\nu(m) \le m \le n,\quad \bigl\{ \nu(m) \le n \bigr\} = \varOmega\in \mathfrak{F}_n, $$

and if n<m then

$$\bigl\{ \nu(m) \le n \bigr\} = \{ \nu\le n \} \in\mathfrak{F}_n. $$

If ν is a stopping time, then

$$\{ \nu= n \} = \{ \nu\le n \} - \{ \nu\le n-1 \} \in\mathfrak {F}_n, \quad \{ \nu\ge n \} = \varOmega- \{ \nu\le n - 1 \} \in\mathfrak{F}_{n-1}. $$

Conversely, if \(\{ \nu= n \} \in\mathfrak{F}_{n}\), then \(\{ \nu\le n \} \in \mathfrak{F}_{n}\) and therefore ν is a stopping time.

Let a martingale \(\{ X_{n} , \mathfrak{F}_{n} ;\, n \ge0 \}\) be given. A typical example of a stopping time is the time ν at which X n first hits a given measurable set B:

$$\nu= \inf\{ n \ge0 : X_n \in B \} $$

(ν=∞ if all X n B). Indeed,

$$\{ \nu= n \} = \{ X_0 \notin B , \ldots, X_{n-1} \notin B, \, X_n \in B \}\in\mathfrak{F}_n . $$

If ν is a proper stopping time (P(ν<∞)=1), then X ν is a random variable, since

$$X_{\nu} = \sum _{n=0}^{\infty} X_n {\rm I}_{\{\nu=n \}} . $$

By \(\mathfrak{F}_{\nu}\) we will denote the σ-algebra of sets \(A \in\mathfrak{F}\) such that \(A \cap\{\nu=n \} \in\mathfrak{F}_{n}\), n=0,1,… This σ-algebra can be thought of as being generated by the events {νn}∩B n , n=0,1,…, where \(B_{n} \in \mathfrak{F}_{n}\). Clearly, ν and X ν are \(\mathfrak {F}_{\nu}\)-measurable. If ν 1 and ν 2 are two stopping times, then \(\{\nu_{2} \ge\nu_{1} \} \in\mathfrak{F}_{\nu_{1}}\) and \(\{\nu_{2} \ge\nu_{1} \} \in \mathfrak{F}_{\nu_{2}}\), since {ν 2ν 1}=⋃ n [{ν 2=n}∩{ν 1n}].
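
A small sketch (Python with NumPy) of how the first hitting time ν and the stopped value X ν can be computed path by path on a simulated walk; the set B={x≥3}, the symmetric ±1 steps and the finite horizon are illustrative assumptions, and paths that do not reach B within the horizon are treated as ν=∞.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, horizon = 50_000, 200
B_level = 3                                    # B = {x >= 3}, an illustrative choice

steps = rng.choice(np.array([-1, 1], dtype=np.int8), size=(n_paths, horizon))
X = np.concatenate([np.zeros((n_paths, 1), dtype=np.int32),
                    steps.cumsum(axis=1, dtype=np.int32)], axis=1)   # X_0 = 0 in column 0

hit = X >= B_level
reached = hit.any(axis=1)                      # {nu < infinity} within the simulated horizon
nu = np.where(reached, hit.argmax(axis=1), -1) # -1 stands in for nu = infinity

X_nu = X[np.flatnonzero(reached), nu[reached]] # stopped value X_nu on {nu < infinity}
print("fraction of paths with nu <= horizon:", round(float(reached.mean()), 3))
print("values of X_nu:", np.unique(X_nu), "(a +-1 walk cannot jump over the level)")
```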

We already know that if \(\{ X_{n} , \mathfrak{F}_{n} \}\) is a martingale then E X n is constant for all n. Will this property remain valid for E X ν if ν is a stopping time? From Wald’s identity we know that this is the case for the martingale from Example 15.1.1. In the general case one has the following.

Theorem 15.2.1

(Doob)

Let \(\{ X_{n} , \mathfrak{F}_{n} ;\, n \ge0 \}\) be a martingale (submartingale) and ν 1,ν 2 be stopping times such that

$$\begin{aligned} \mathbf{E}|X_{\nu_i}| < \infty, \quad i = 1, 2, \end{aligned}$$
(15.2.1)
$$\begin{aligned} \liminf_{n \to\infty} \mathbf{E}\bigl(|X_n| ;\, \nu_2 \ge n \bigr) = 0 . \end{aligned}$$
(15.2.2)

Then, on the set {ν 2ν 1},

$$ \mathbf{E}(X_{\nu_2 } | \mathfrak{F}_{\nu_1}) = X_{\nu_1} \quad ( \ge X_{\nu_1 }) . $$
(15.2.3)

This theorem extends the martingale (submartingale) property to random time.

Corollary 15.2.1

If ν 2=ν≥0 is an arbitrary stopping time, then putting ν 1=n (also a stopping time) we have that, on the set {ν≥n},

$$\mathbf{E}(X_{\nu} | \mathfrak{F}_n) = X_n , \qquad\mathbf {E}X_{\nu} = \mathbf{E}X_0 , $$

or, which is the same, for any \(A \in\mathfrak{F}_{n} \cap\{ \nu\ge n \}\),

$$\mathbf{E}(X_{\nu} ;\, A) = \mathbf{E}(X_n ;\, A) . $$

For submartingales substitute “=” by “≥”.

Proof of Theorem 15.2.1

To prove (15.2.3) it suffices to show that, for any \(A \in \mathfrak{F}_{\nu_{1}}\),

$$ \mathbf{E}\bigl(X_{\nu_2} ;\, A \cap\{ \nu_2 \ge\nu_1 \}\bigr) = \mathbf {E} \bigl(X_{\nu_1} ;\, A \cap\{ \nu_2 \ge\nu_1 \} \bigr) . $$
(15.2.4)

Since the random variables ν i are discrete, we just have to establish (15.2.4) for sets \(A_{n} = A \cap\{ \nu_{1} = n \} \in\mathfrak{F}_{n}\), n=0,1,… , i.e. to establish the equality

$$ \mathbf{E}\bigl(X_{\nu_2} ;\, A_n \cap\{ \nu_2 \ge n \}\bigr) = \mathbf {E}\bigl(X_n ;\, A_n \cap\{ \nu_2 \ge n \}\bigr) . $$
(15.2.5)

Thus the proof is reduced to the case ν 1=n. We have

$$\begin{aligned} \mathbf{E}\bigl(X_n ;\, A_n \cap\{ \nu_2 \ge n \}\bigr) = &\mathbf{E}\bigl(X_n ;\, A_n \cap \{ \nu_2 = n \}\bigr) + \mathbf{E}\bigl(X_n ;\, A_n \cap\{ \nu_2 \ge n + 1 \}\bigr) \\=& \mathbf{E}\bigl(X_{\nu_2} ;\, A_n \cap\{ \nu_2 = n \}\bigr) + \mathbf{E}\bigl(X_{n+1} ;\, A_n \cap\{ \nu_2 \ge n + 1 \}\bigr) . \end{aligned}$$

Here we used the fact that \(\{ \nu_{2} \ge n+1 \} \in\mathfrak{F}_{n}\) and the martingale property (15.1.3).

Applying this equality m−n times we obtain that

$$ \mathbf{E}\bigl(X_n ;\, A_n \cap\{ \nu_2 \ge n \}\bigr) = \mathbf{E}\bigl(X_{\nu_2} ;\, A_n \cap\{ n \le\nu_2 < m \}\bigr) + \mathbf{E}\bigl(X_m ;\, A_n \cap\{ \nu_2 \ge m \}\bigr) . $$
(15.2.6)

By (15.2.2) the last expression converges to zero for some sequence m→∞.

Since

$$A_{n,m} := A_n \cap\{ n \le\nu_2 < m \} \uparrow B_n = A_n \cap\{ n \le\nu_2 \} , $$

by the property of integrals and by virtue of (15.2.6),

$$\mathbf{E}\bigl(X_{\nu_2} ;\, A_n \cap\{ n \le \nu_2 \}\bigr) = \lim_{m \to\infty} \mathbf{E}(X_{\nu_2} ;\, A_{n,m}) = \mathbf{E}\bigl(X_n ;\, A_n \cap \{ \nu_2 \ge n \}\bigr) . $$

Thus we proved equality (15.2.5) and hence Theorem 15.2.1 for martingales. The proof for submartingales can be obtained by simply changing the equality signs in certain places to inequalities. The theorem is proved. □

The conditions of Theorem 15.2.1 are far from always being met, even in rather simple cases. Consider, for instance, a fair game (see Examples 4.2.3 and 4.4.5) versus an infinitely rich adversary, in which z+S n is the fortune of the first gambler after n plays (given he has not been ruined yet). Here z>0, \(S_{n} = \sum_{k=1}^{n} \xi_{k}\), P(ξ k =±1)=1/2, η(z)=min{k:S k =−z} is obviously a Markov (stopping) time, and the sequence {S n ; n≥0}, S 0=0, is a martingale, but S η(z)=−z. Hence E S η(z)=−z≠E S n =0, and equality (15.2.5) does not hold for ν 1=0, ν 2=η(z), z>0, n>0. In this example, this means that condition (15.2.2) is not satisfied (this is related to the fact that E η(z)=∞).
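
The failure of optional stopping in this fair-game example can be seen numerically. The sketch below (Python with NumPy; the values of z, the horizon m and the number of paths are arbitrary illustrative choices) stops the walk at the bounded time min(η(z),m): the mean of the stopped value is still ≈0, while on the event {η(z)≤m} the stopped value is exactly −z, which is what forces condition (15.2.2) to fail as m→∞.

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, m, z = 100_000, 400, 5               # truncation horizon m and ruin level z (illustrative)

steps = rng.choice(np.array([-1, 1], dtype=np.int8), size=(n_paths, m))
S = steps.cumsum(axis=1, dtype=np.int32)      # S_1, ..., S_m along the columns

hit = (S == -z)
ruined = hit.any(axis=1)                      # the event {eta(z) <= m}
stop_col = np.where(ruined, hit.argmax(axis=1), m - 1)
S_stopped = S[np.arange(n_paths), stop_col]   # S at the bounded time min(eta(z), m)

# Optional stopping does hold for the bounded time min(eta(z), m): the mean stays ~ 0 ...
print("E S_{min(eta,m)} ~", round(float(S_stopped.mean()), 3))
# ... but on {eta(z) <= m} the stopped value is exactly -z, and eta(z) < infinity a.s.,
# so E S_{eta(z)} = -z != 0; condition (15.2.2) is what fails here.
print("P(eta <= m) ~", round(float(ruined.mean()), 3),
      "  stopped values on that event:", np.unique(S_stopped[ruined]))
```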

Conditions (15.2.1) and (15.2.2) of Theorem 15.2.1 can, generally speaking, be rather hard to verify. Therefore the following statements are useful in applications.

Put for brevity

$$\xi_n := X_n - X_{n-1} ,\qquad\xi_0 := X_0 ,\qquad Y_n := \sum_{k=0}^n |\xi_k| ,\quad n = 0, 1, \ldots $$

Lemma 15.2.1

The condition

$$ \mathbf{E}Y_{\nu} < \infty $$
(15.2.7)

is sufficient for (15.2.1) and (15.2.2) (with ν i =ν).

Proof

The proof is almost evident since |X ν |≤Y ν and

$$\mathbf{E}(|X_n| ;\, \nu> n ) \le\mathbf{E}(Y_{\nu} ;\, \nu> n). $$

Because P(ν>n)→0 and E Y ν <∞, it remains to use the property of integrals by which E(η; A n )→0 if E|η|<∞ and P(A n )→0. □

We introduce the following notation:

$$a_n := \mathbf{E}\bigl(|\xi_n| \; \big| \mathfrak{F}_{n-1}\bigr) , \quad\sigma _n^2 :=\mathbf{E}\bigl( \xi_n^2 \big| \mathfrak{F}_{n-1}\bigr) , \quad n=0, 1, 2, \ldots, $$

where \(\mathfrak{F}_{-1}\) can be taken to be the trivial σ-algebra.

Theorem 15.2.2

Let {X n ; n≥0} be a martingale (submartingale) and ν be a stopping time (with respect to \(\{ \mathfrak{F}_{n} = \sigma(X_{0}, \ldots, X_{n}) \}\)).

  1. (1)

    If

    $$ \mathbf{E}\nu< \infty $$
    (15.2.8)

    and, for all n≥0, on the set \(\{ \nu\ge n \} \in\mathfrak{F}_{n-1}\) one has

    $$ a_n \le c = {\rm const} , $$
    (15.2.9)

    then

    $$ \mathbf{E}|X_{\nu}| < \infty,\quad \mathbf{E}X_{\nu} = \mathbf {E}X_0 \quad(\ge\mathbf{E} X_0) . $$
    (15.2.10)
  2. (2)

    If, in addition, \(\mathbf{E}\sigma_{n}^{2} = \mathbf{E}\xi_{n}^{2} < \infty\) then

    $$ \mathbf{E}X_{\nu}^2 = \mathbf{E}\sum _{k=1}^{\nu} \sigma_k^2. $$
    (15.2.11)

Proof

By virtue of Theorem 15.2.1, Corollary 15.2.1 and Lemma 15.2.1, to prove (15.2.10) it suffices to verify that conditions (15.2.8) and (15.2.9) imply (15.2.7). Quite similarly to the proof of Theorem 4.4.1, we have

$$\mathbf{E}|Y_{\nu}| = \sum _{n=0}^{\infty} \Biggl(\,\sum _{k=0}^n \mathbf{E}\bigl(|\xi_k| ; \, \nu= n\bigr) \Biggr) = \sum _{k=0}^{\infty}\, \sum _{n=k}^{\infty} \mathbf{E}\bigl(|\xi_k| ;\, \nu= n\bigr) = \sum _{k=0}^{\infty} \mathbf{E}\bigl(|\xi_k| ; \, \nu\ge k\bigr) . $$

Here \(\{ \nu\ge k \} = \varOmega\setminus\{ \nu\le k-1 \} \in \mathfrak{F}_{k-1}\). Therefore, by condition (15.2.9),

$$\mathbf{E}(|\xi_k| ;\, \nu\ge k) = \mathbf{E}\bigl(\mathbf{E}\bigl(| \xi_k| \; \big| \mathfrak{F}_{k-1}\bigr) ;\, \nu \ge k\bigr) \le c \, \mathbf{P}(\nu\ge k) . $$

This means that

$$\mathbf{E}Y_{\nu} \le c \sum _{k=0}^{\infty} \mathbf{P}(\nu \ge k) = c \, \mathbf{E}\nu< \infty. $$

Now we will prove (15.2.11). Set \(Z_{n} := X_{n}^{2} - \sum_{0}^{n} \sigma_{k}^{2}\). One can easily see that Z n is a martingale, since

$$\mathbf{E} \bigl(X_{n+1}^2 - X_n^2 - \sigma_{n+1}^2 \big| \mathfrak{F}_n\bigr) = \mathbf {E}\bigl(2 X_n \xi_{n+1} + \xi_{n+1}^2 - \sigma_{n+1}^2 \big| \mathfrak{F}_n \bigr) = 0. $$

It is also clear that E|Z n |<∞ and ν(n)=min(ν,n) is a stopping time. By virtue of Lemma 15.2.1, conditions (15.2.1) and (15.2.2) always hold for the pair {Z k }, ν(n). Therefore, by the first part of the theorem,

$$ \mathbf{E}Z_{\nu(n)} = 0 ,\qquad\mathbf{E}X_{\nu(n)}^2 = \mathbf{E} \sum _{k=1}^{\nu(n)} \sigma_k^2 . $$
(15.2.12)

It remains to verify that

$$ \lim_{n \to\infty} \mathbf{E}X_{\nu(n)}^2 = \mathbf{E}X_{\nu}^2 , \qquad \lim_{n \to\infty} \mathbf{E}\sum _{k=1}^{\nu(n)} \sigma _k^2 = \mathbf{E} \sum _{k=1}^{\nu} \sigma_k^2 . $$
(15.2.13)

The second equality follows from the monotone convergence theorem (ν(n)↑ν, \(\sigma_{k}^{2} \ge0\)). That theorem implies the former equality as well, for \(X_{\nu(n)}^{2} \stackrel{\mathit {a}.\mathit{s}.}{\longrightarrow}X_{\nu}^{2}\) and \(X_{\nu(n)}^{2} {\uparrow}\). To verify the latter claim, note that \(\{ X_{n}^{2} , \mathfrak{F}_{n} ;\, n \ge0 \}\) is a submartingale, and therefore, for any \(A \in\mathfrak{F}_{n}\),

$$\begin{aligned} \mathbf{E}\bigl(X_{\nu(n)}^2 ;\, A\bigr) =& \mathbf{E}\bigl(X_{\nu}^2 ;\, A \cap\{\nu\le n\}\bigr) + \mathbf{E}\bigl(X_n^2 ;\, A \cap\{ \nu> n \}\bigr) \\\le& \mathbf{E}\bigl(X_{\nu}^2 ;\, A \cap\{ \nu\le n \}\bigr) + \mathbf{E}\bigl(\mathbf{E}\bigl(X_{n+1}^2 \big| \mathfrak{F}_n\bigr) ;\, A \cap\{ \nu> n \}\bigr) \\=& \mathbf{E}\bigl(X_{\nu}^2 ;\, A \cap\{ \nu< n+1 \} \bigr) + \mathbf{E}\bigl(X_{n+1}^2 ;\, A \cap\{ \nu\ge n+1 \}\bigr) \\=& \mathbf{E}\bigl(X_{\nu(n+1)}^2 ;\, A\bigr). \end{aligned}$$

Thus (15.2.12) and (15.2.13) imply (15.2.11), and the theorem is completely proved. □

The main assertion of Theorem 15.2.2 for martingales (submartingales):

$$ \mathbf{E}X_{\nu} = \mathbf{E}X_0 \quad( \geq\mathbf{E}X_0) $$
(15.2.14)

was obtained as a consequence of Theorem 15.2.1. However, we could get it directly from some rather transparent relations which, moreover, enable one to extend it to improper stopping times ν.

A stopping time ν is called improper if 0<P(ν<∞)=1−P(ν=∞)<1. To give an example of an improper stopping time, consider independent identically distributed random variables ξ k , a=E ξ k <0, \(X_{n} = \sum^{n}_{k=1} \xi_{k}\), and put

$$\nu= \eta(x) := \min\{ k \geq1 : X_k > x \}, \quad x \geq0 . $$

Here ν is finite only for such trajectories {X k } that sup k X k >x. If the last inequality does not hold, we put ν=∞. Clearly,

$$\mathbf{P}(\nu= \infty) = \mathbf{P}\Bigl(\sup _k X_k \leq x\Bigr) > 0. $$

Thus, for an arbitrary (possibly improper) stopping time, we have

$$ \displaystyle\mathbf{E}(X_{\nu} ; \, \nu< \infty) = \sum ^{\infty}_{k=0} \mathbf{E}(X_k ; \, \nu= k) =\sum ^{\infty}_{k=0} \bigl[ \mathbf{E}(X_k ; \, \nu\geq k) - \mathbf{E}(X_k ; \, \nu \geq k+1) \bigr] . $$
(15.2.15)

Assume now that changing the order of summation is justified here. Then, by virtue of the relation \(\{ \nu\geq k+1\}\in\mathfrak{F}_{k}\), we get

$$ \begin{aligned}[b] \mathbf{E}(X_{\nu} ; \, \nu< \infty) &= \mathbf{E}X_0+\sum^{\infty}_{k=0}\mathbf{E}(X_{k+1}-X_k;\, \nu\geq k+1)\\&= \mathbf{E}X_0 + \sum^{\infty}_{k=0} \mathbf {E}\mathrm{I}(\nu \geq k + 1) \mathbf{E}(X_{k+1} - X_k | \mathfrak{F}_k). \end{aligned}$$
(15.2.16)

Since for martingales (submartingales) the factors \(\mathbf{E}(X_{k+1} - X_{k} | \mathfrak{F}_{k})= 0\) (≥0), we obtain the following.

Theorem 15.2.3

If the change of the order of summation in (15.2.15) and (15.2.16) is legitimate then, for martingales (submartingales),

$$ \mathbf{E}(X_{\nu} ;\, \nu< \infty) = \mathbf{E}X_0 \quad( \geq\mathbf{E}X_0) . $$
(15.2.17)

Assumptions (15.2.8) and (15.2.9) of Theorem 15.2.2 are nothing else but conditions ensuring the absolute convergence of the series in (15.2.15) (see the proof of Theorem 15.2.2) and (15.2.16), because the sum of the absolute values of the terms in (15.2.16) is dominated by

$$\sum ^{\infty}_{k=1} a_k \mathbf{P}( \nu\geq k + 1) \leq a \mathbf{E}\nu< \infty, $$

where, as before, \(a_{k} = \mathbf{E} (|\xi_{k}| \mid \mathfrak{F}_{k-1} )\) with ξ k =X k X k−1. This justifies the change of the order of summation.

There is still another way of proving (15.2.17) based on (15.2.15) specifying a simple condition ensuring the required justification. First note that identity (15.2.17) assumes that the expectation E(X ν ; ν<∞) exists, i.e. both values \(\mathbf{E}(X^{\pm}_{\nu};\,\nu<\infty)\) are finite, where x ±=max(±x,0).

Theorem 15.2.4

1. Let \(\{X_{n},\mathfrak{F}_{n}\}\) be a martingale. Then the condition

$$ \lim_{n\to\infty}\mathbf{E}(X_n;\, \nu>n)=0 $$
(15.2.18)

is necessary and sufficient for the relation

$$ \lim_{n\to\infty}\mathbf{E}(X_\nu;\,\nu \leq n)=\mathbf{E}X_0. $$
(15.2.19)

A necessary and sufficient condition for (15.2.17) is that (15.2.18) holds and at least one of the values \(\mathbf{E}(X_{\nu}^{\pm};\,\nu<\infty)\) is finite.

2. If \(\{ X_{n} , \mathfrak{F}_{n} \}\) is a supermartingale and

$$ \liminf_{n \to\infty} \mathbf{E}(X_n ; \, \nu> n)\geq0, $$
(15.2.20)

then

$$\limsup_{n \to\infty} \mathbf{E}(X_\nu ;\,\nu\leq n) \leq \mathbf{E}X_0. $$

If, in addition, at least one of the values \(\mathbf{E}(X^{\pm}_{\nu};\,\nu<\infty)\) is finite then

$$\mathbf{E}(X_{\nu} ; \, \nu< \infty)\leq\mathbf{E}X_0. $$

3. If, in conditions (15.2.18) and (15.2.20), we replace the quantity E(X n ; ν>n) with E(X n ; ν≥n), the first two assertions of the theorem will remain true.

The corresponding symmetric assertions hold for submartingales.

Proof

As we have already mentioned, for martingales, E(ξ k ; ν≥k)=0. Therefore, by virtue of (15.2.18)

$$\begin{aligned} \mathbf{E}X_0 =& \lim_{n \to\infty} \Biggl[ \mathbf{E}X_0 + \sum^n_{k=1} \mathbf{E}(\xi_k ; \, \nu\geq k) - \mathbf{E}(X_n ; \, \nu \geq n+1) \Biggr]. \end{aligned}$$

Here

$$\begin{aligned} \sum_{k=1}^n\mathbf{E}( \xi_k;\,\nu\geq k) =&\sum_{k=1}^n \mathbf {E}(X_k;\,\nu \geq k)-\sum_{k=1}^n \mathbf{E}(X_{k-1};\,\nu\geq k) \\=&\sum_{k=1}^n\mathbf{E}(X_k; \,\nu\geq k)-\sum_{k=1}^{n-1} \mathbf{E}(X_k;\,\nu\geq k+1). \end{aligned}$$

Hence

$$\begin{aligned} \mathbf{E}X_0 =& \lim_{n \to\infty} \sum ^n_{k=0} \bigl[ \mathbf {E}(X_k ; \,\nu \geq k) - \mathbf{E}(X_k ; \, \nu\geq k+1) \bigr] \\=& \lim_{n \to\infty} \sum^n_{k=0} \mathbf{E}(X_k ; \, \nu=k) = \lim_{n\to\infty} \mathbf{E}(X_{\nu} ; \, \nu\leq n) . \end{aligned}$$

These equalities also imply the necessity of condition (15.2.18).

If at least one of the values \(\mathbf{E}(X^{\pm}_{\nu};\,\nu<\infty)\) is finite, then by the monotone convergence theorem

$$\begin{aligned} \lim_{n\to\infty}\mathbf{E}(X_\nu;\,\nu\leq n) =&\lim _{n\to\infty}\mathbf{E}\bigl(X_\nu^+;\,\nu\leq n\bigr)- \lim _{n\to\infty}\mathbf{E}\bigl(X_\nu^-;\,\nu\leq n\bigr) \\=&\mathbf{E}\bigl(X_\nu^+;\,\nu<\infty\bigr)-\mathbf{E} \bigl(X_\nu^-;\,\nu <\infty\bigr)=\mathbf{E} (X_\nu;\,\nu <\infty). \end{aligned}$$

The third assertion of the theorem follows from the fact that the stopping time ν(n)=min(ν,n) satisfies the conditions of the first part of the theorem (or those of Theorems 15.2.1 and 15.2.3), and therefore, for the martingale {X n },

$$\mathbf{E}X_0=\mathbf{E}X_{\nu(n)}=\mathbf{E}(X_\nu; \,\nu <n)+\mathbf{E}(X_n;\,\nu\geq n), $$

so that (15.2.19) implies the convergence E(X n ; ν≥n)→0 and vice versa.

The proof for semimartingales is similar. The theorem is proved. □

That assertions (15.2.17) and (15.2.19) are, generally speaking, not equivalent even when (15.2.18) holds (i.e., lim n→∞ E(X ν ;ν≤n)=E(X ν ;ν<∞) is not always the case), can be illustrated by the following example. Let ξ k be independent random variables with

$$\mathbf{P}\bigl(\xi_k=3^k\bigr)=\mathbf{P}\bigl( \xi_k=-3^k\bigr)=1/2, $$

ν be independent of {ξ k }, and P(ν=k)=2^{−k}, k=1,2,… . Then X 0=0, X k =X k−1+ξ k for k≥1 is a martingale,

$$\mathbf{E}X_n=0,\quad\mathbf{P}(\nu<\infty)=1,\quad\mathbf {E}(X_n;\,\nu>n)=\mathbf{E} X_n\mathbf{P}(\nu>n)=0 $$

by independence, and condition (15.2.18) is satisfied. By virtue of (15.2.19), this means that lim n→∞ E(X ν ; ν≤n)=0 (one can also verify this directly). On the other hand, the expectation E(X ν ; ν<∞)=E X ν is not defined, since \(\mathbf{E} X_{\nu}^{+}=\mathbf{E}X_{\nu}^{-}=\infty\). Indeed, clearly

$$\mathbf{E}X_{\nu}^{+}=\sum_{k=1}^{\infty}2^{-k}\,\mathbf{E}X_{k}^{+}\ge \sum_{k=1}^{\infty}2^{-k}\,\frac{1}{2} \biggl(3^{k}-\sum_{j=1}^{k-1}3^{j} \biggr)\ge \frac{1}{4}\sum_{k=1}^{\infty} \biggl(\frac{3}{2} \biggr)^{k}=\infty. $$

By symmetry, we also have \(\mathbf{E}X_{\nu}^{-}=\infty\).

Corollary 15.2.2

1. If \(\{ X_{n}, \mathfrak{F}_{n} \}\) is a nonnegative martingale, then condition (15.2.18) is necessary and sufficient for (15.2.17).

2. If \(\{ X_{n}, \mathfrak{F}_{n} \}\) is a nonnegative supermartingale and ν is an arbitrary stopping time, then

$$ \mathbf{E}(X_{\nu} ; \, \nu< \infty) \leq \mathbf{E}X_0 . $$
(15.2.21)

Proof

The assertion follows in an obvious way from Theorem 15.2.4 since one has \(\mathbf{E}(X^{-}_{\nu};\,\nu<\infty)=0\). □

Theorem 15.2.2 implies the already known Wald’s identity (see Theorem 4.4.3) supplemented with another useful statement.

Theorem 15.2.5

(Wald’s identity)

Let ζ 1,ζ 2,… be independent identically distributed random variables, S n =ζ 1+⋯+ζ n , S 0=0, and assume E ζ 1=a. Let, further, ν be a stopping time with E ν<∞. Then

$$ \mathbf{E}S_{\nu} = a \mathbf{E}\nu. $$
(15.2.22)

If, moreover, \(\sigma^{2} = \operatorname{Var}\zeta_{k} < \infty\), then

$$ \mathbf{E} [S_{\nu} - \nu a ]^2 = \sigma^2 \mathbf{E}\nu. $$
(15.2.23)

Proof

It is clear that X n =S n na forms a martingale and conditions (15.2.8) and (15.2.9) are met. Therefore E X ν =E X 0=0, which is equivalent to (15.2.22), and \(\mathbf{E}X_{\nu}^{2} = \mathbf{E}\nu\sigma ^{2}\), which is equivalent to (15.2.23). □
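
Both identities (15.2.22) and (15.2.23) are easy to check by simulation. The following sketch (Python with NumPy) uses ζ k uniformly distributed on (−1, 2) and the stopping time ν=min{n : S n >10}; these choices, and the finite horizon that stands in for an unbounded one, are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, horizon, level = 50_000, 200, 10.0   # horizon is ample: P(nu > horizon) is negligible
a, var = 0.5, 0.75                            # mean and variance of zeta ~ Uniform(-1, 2)

zeta = rng.uniform(-1.0, 2.0, size=(n_paths, horizon))
S = zeta.cumsum(axis=1)
nu_idx = (S > level).argmax(axis=1)           # first index with S_n > level
nu = nu_idx + 1.0                             # the stopping time nu itself
S_nu = S[np.arange(n_paths), nu_idx]

print("E S_nu           :", round(float(S_nu.mean()), 3),
      "   a * E nu       :", round(float(a * nu.mean()), 3))
print("E (S_nu - nu*a)^2:", round(float(((S_nu - nu * a) ** 2).mean()), 3),
      "   sigma^2 * E nu :", round(float(var * nu.mean()), 3))
```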

Example 15.2.1

Consider a generalised renewal process (see Sect. 10.6) S(t)=S η(t), where \(S_{n}=\sum_{j=1}^{n}\xi_{j}\) (in this example we follow the notation of Chap. 10 and change the meaning of the notation S n from the above), η(t)=min{k:T k >t}, \(T_{n}=\sum_{j=1}^{n}\tau_{j}\) and (τ j ,ξ j ) are independent vectors distributed as (τ,ξ), τ>0. Set a ξ =E ξ, a=E τ, \(\sigma^{2}_{\xi}=\operatorname{Var}\xi\) and \(\sigma^{2}=\operatorname {Var}\tau\). As we know from Wald’s identity in Sect. 4.4,

$$\mathbf{E}\eta(t)=\frac{t+\mathbf{E}\chi(t)}{a},\qquad\mathbf{E} S(t)=a_\xi \mathbf{E}\eta(t), $$

where E χ(t)=o(t) as t→∞ (see Theorem 10.1.1) and, in the non-lattice case, \(\mathbf{E}\chi(t)\to\frac{\sigma ^{2}+a^{2}}{2a}\) if σ 2<∞ (see Theorem 10.4.3).

We now find \(\operatorname{Var}\eta(t)\) and \(\operatorname {Var}S(t)\). Omitting for brevity’s sake the argument t, we can write

$$\begin{aligned} a^2\operatorname{Var}\eta(t) =&a^2\operatorname{Var} \eta=\mathbf {E}(a\eta-a\mathbf{E}\eta)^2= \mathbf{E}(a \eta-T_\eta+T_\eta-a\mathbf{E}\eta)^2 \\=&\mathbf{E}(T_\eta-a\eta)^2+\mathbf{E}(T_\eta-a \mathbf{E}\eta)^2- 2\mathbf{E}(T_\eta-a\eta) (T_\eta-a\mathbf{E}\eta). \end{aligned}$$

The first summand on the right-hand side is equal to

$$\sigma^2\mathbf{E}\eta=\frac{\sigma^2 t}{a}+O(1) $$

by Theorem 15.2.3. The second summand equals, by (10.4.8) (χ(t)=T η(t)t),

$$\mathbf{E} \bigl(t+\chi(t)-a\mathbf{E}\eta \bigr)^2= \mathbf{E} \bigl(\chi(t)-\mathbf{E}\chi(t) \bigr)^2\leq\mathbf {E} \chi^2(t)=o(t). $$

The last summand, by the Cauchy–Bunjakovsky inequality, is also o(t). Finally, we get

$$\operatorname{Var}\eta(t)=\frac{\sigma^2 t}{a^3}+o(t). $$

Consider now (with r=a ξ /a; ζ j =ξ j −rτ j , E ζ j =0)

$$\begin{aligned} \operatorname{Var}S(t) =&\mathbf{E}(S_\eta-a_\xi \mathbf{E}\eta)^2= \mathbf{E} \bigl[S_\eta-rT_\eta+r(T_\eta-a \mathbf{E}\eta) \bigr]^2 \\=&\mathbf{E} \Biggl(\sum_{j=1}^\eta \zeta_j\Biggr)^2+r^2\mathbf{E}(T_\eta -a\mathbf{E}\eta)^2+ 2r\mathbf{E} \Biggl(\sum _{j=1}^\eta\zeta_j \Biggr) (T_\eta-a\mathbf {E}\eta). \end{aligned}$$

The first term on the right-hand side is equal to

$$\mathbf{E}\eta\operatorname{Var}\zeta=\frac{t\operatorname {Var}\zeta}{a}+O(1) $$

by Theorem 15.2.3. The second term has already been estimated above. Therefore, as before, the sum of the last two terms is o(t). Thus

$$\operatorname{Var}S(t)=\frac{t}{a}\,\mathbf{E}(\xi-r \tau)^2+o(t). $$

This corresponds to the scaling used in Theorem 10.6.2.
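
A simulation sketch of these two asymptotic variance formulas (Python with NumPy). The choices τ∼Exp(1), ξ=2τ+N(0, 0.5²) and the value of t are hypothetical; with them a=1, σ²=1, r=2 and E(ξ−rτ)²=0.25, so the predictions are Var η(t)≈t and Var S(t)≈0.25 t, up to o(t) terms.

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_terms, t = 10_000, 800, 500.0      # n_terms is ample: P(T_{n_terms} <= t) ~ 0

tau = rng.exponential(1.0, size=(n_paths, n_terms))            # a = 1, Var tau = 1
xi = 2.0 * tau + rng.normal(0.0, 0.5, size=(n_paths, n_terms)) # a_xi = 2, r = a_xi / a = 2

T = tau.cumsum(axis=1)
eta_idx = (T > t).argmax(axis=1)              # eta(t) = min{k : T_k > t}, 0-based index
eta = eta_idx + 1
S_t = xi.cumsum(axis=1)[np.arange(n_paths), eta_idx]           # S(t) = S_{eta(t)}

# Predictions of Example 15.2.1: Var eta(t) ~ sigma^2 t / a^3 = t,
# Var S(t) ~ (t/a) E(xi - r*tau)^2 = 0.25 t   (both up to o(t)).
print("Var eta(t):", round(float(eta.var()), 1), "   prediction:", t)
print("Var S(t)  :", round(float(S_t.var()), 1), "   prediction:", 0.25 * t)
```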

Example 15.2.2

Examples 4.4.4 and 4.5.5 referring to the fair game situation with P(ζ k =±1)=1/2 and ν=min{k:S k =z 2 or S k =−z 1} (z 1 and z 2 being the capitals of the gamblers) can also illustrate the use of Theorem 15.2.5.

Now consider the case p=P(ζ k =1)≠1/2. The sequence \(X_{n} = (q/p)^{S_{n}}\), n≥0, q=1−p is a martingale, since

$$\mathbf{E}({q}/{p})^{\zeta_k} = p ({q}/{p}) + q ({p}/{q}) = 1. $$

By Theorem 15.2.5 (the probabilities P 1 and P 2 were defined in Example 4.4.5),

$$\mathbf{E}X_{\nu} = \mathbf{E}X_0 = 1 , \qquad P_1 ({q}/{p})^{z_2} + P_2 ({q}/{p})^{-z_1} = 1 . $$

From this relation and equality P 1+P 2=1 we have

$$P_1 = \frac{({q}/{p})^{-z_1} - 1}{({q}/{p})^{-z_1} - ({q}/{p})^{z_2}} , \qquad P_2 = 1 - P_1 . $$

Using Wald’s identity again, we also obtain that

$$\mathbf{E}\nu= \frac{\mathbf{E}S_{\nu}}{\mathbf{E}\zeta_1} = \frac{P_1 z_2 - P_2 z_1}{p - q} . $$

Note that these equalities could have been obtained by elementary methods, but this would require lengthy calculations.
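
The formulas for P 1 and E ν are straightforward to evaluate and to compare with a Monte Carlo estimate. In the sketch below (Python with NumPy) the values p=0.55, z 1=4, z 2=6 and the simulation horizon are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
p, z1, z2 = 0.55, 4, 6                         # illustrative parameters
q = 1 - p
x = q / p

P1 = (x ** (-z1) - 1) / (x ** (-z1) - x ** z2) # P(S_nu = z2), from the display above
E_nu = (P1 * z2 - (1 - P1) * z1) / (p - q)     # Wald's identity
print("formula:   P1 =", round(P1, 4), "  E nu =", round(E_nu, 3))

n_paths, horizon = 50_000, 400                 # horizon ample: exit before it is ~ certain
steps = (rng.random((n_paths, horizon)) < p).astype(np.int8) * 2 - 1
S = steps.cumsum(axis=1, dtype=np.int32)

exit_mask = (S >= z2) | (S <= -z1)
nu_idx = exit_mask.argmax(axis=1)              # first exit time (0-based)
won = S[np.arange(n_paths), nu_idx] >= z2
print("simulated: P1 ~", round(float(won.mean()), 4),
      "  E nu ~", round(float(nu_idx.mean() + 1), 3))
```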

In the cases when the nature of S ν is simple enough, the assertions of the type of Theorems 15.2.1–15.2.2 enable one to obtain (or estimate) the distribution of the random variable ν itself. In such situations, the following assertion is rather helpful.

Suppose that the conditions of Theorem 15.2.5 are met, but, instead of conditions on the moments of ζ n , the Cramér condition (cf. Chap. 9) is assumed to be satisfied:

$$\psi(\lambda) := \mathbf{E}e^{\lambda\zeta} < \infty $$

for some λ≠0.

In other words, if

$$\lambda_+ := \sup\bigl(\lambda: \psi(\lambda) < \infty\bigr) \geq0 , \qquad \lambda_- := \inf \bigl(\lambda: \psi(\lambda) < \infty\bigr) \leq0 , $$

then λ +λ >0. Everywhere in what follows we will only consider the values

$$\lambda\in B:=\bigl\{\psi(\lambda)<\infty\bigr\}\subseteq[\lambda_- , \lambda_+] $$

for which ψ′(λ)<∞. For such λ, the positive martingale

$$X_n = \frac{e^{\lambda S_n}}{\psi^n(\lambda)} , \qquad X_0 = 1 , $$

is well-defined so that E X n =1.

Theorem 15.2.6

Let ν be an arbitrary stopping time and λB. Then

$$ \mathbf{E} \biggl( \frac{e^{\lambda S_{\nu}}}{\psi(\lambda)^{\nu}} ; \, \nu< \infty \biggr) \leq1 $$
(15.2.24)

and, for any s>1 and r>1 such that 1/r+1/s=1,

$$ \mathbf{E}\bigl(e^{\lambda S_{\nu}} ; \, \nu< \infty\bigr) \leq \bigl\{ \mathbf{E}\bigl[ \psi^{{r \nu}/{s}} (\lambda s); \, \nu< \infty\bigr] \bigr\}^{1/r} . $$
(15.2.25)

A necessary and sufficient condition for

$$ \mathbf{E} \biggl( \frac{e^{\lambda S_{\nu}}}{\psi(\lambda)^{\nu}} ; \, \nu< \infty \biggr) = 1 $$
(15.2.26)

is that

$$ \lim_{n \to\infty} \mathbf{E} \biggl( \frac{e^{\lambda S_n}}{\psi (\lambda)^n} ; \, \nu> n \biggr) = 0 . $$
(15.2.27)

Remark 15.2.1

Relation (15.2.26) is known as the fundamental Wald identity. In the literature it is usually considered for a.s. finite ν (when P(ν<∞)=1) being in that case an extension of the obvious equality \(\mathbf{E}e^{\lambda S_{n}} = \psi^{n} (\lambda)\) to the case of random ν. Originally, identity (15.2.26) was established by A. Wald in the special case where ν is the exit time of the sequence {S n } from a finite interval (see Corollary 15.2.3), and was accompanied by rather restrictive conditions. Later, these conditions were removed (see e.g. [13]). Below we will obtain a more general assertion for the problem on the first exit of the trajectory {S n } from a strip with curvilinear boundaries.

Remark 15.2.2

The fundamental Wald identity shows that, although the nature of a stopping time could be quite general, there exists a stiff functional constraint (15.2.26) on the joint distribution of ν and S ν (the distribution of ζ k is assumed to be known). In the cases where one of these variables can somehow be “computed” or “eliminated” (see Examples 15.2.2–15.2.4) Wald’s identity turns into an explicit formula for the Laplace transform of the distribution of the other variable. If ν and S ν prove to be independent (which rarely happens), then (15.2.26) gives the relationship

$$\mathbf{E}e^{\lambda S_{\nu}} = \bigl[\mathbf{E}\psi(\lambda)^{-\nu} \bigr]^{-1} $$

between the Laplace transforms of the distributions of ν and S ν .

Proof of Theorem 15.2.6

As we have already noted, for

$$X_n = e^{\lambda S_n} \psi^{-n} (\lambda),\qquad \mathfrak{F}_n = \sigma(\zeta_1, \ldots, \zeta_n), $$

\(\{ X_{n} , \mathfrak{F}_{n} ; \, n \geq0 \}\) is a positive martingale with X 0=1 and E X n =1. Corollary 15.2.2 immediately implies (15.2.24).

Inequality (15.2.25) is a consequence of Hölder’s inequality and (15.2.24):

$$\begin{aligned} \mathbf{E}\bigl(e^{(\lambda/s) S_{\nu}} ; \nu< \infty\bigr) =& \mathbf {E} \biggl[ \biggl( \frac{e^{\lambda S_{\nu}}}{\psi^{\nu}(\lambda)} \biggr)^{1/s} \psi^{\nu/s}( \lambda) ; \nu< \infty \biggr]\\\leq&\bigl[ \mathbf{E}\bigl( \psi^{{\nu r}/{s}} ( \lambda) ; \, \nu< \infty\bigr) \bigr]^{1/r} . \end{aligned}$$

The last assertion of the theorem (concerning the identity (15.2.26)) follows from Theorem 15.2.4. □

We now consider several important special cases. Note that ψ(λ) is a convex function (ψ″(λ)>0), ψ(0)=1, and therefore there exists a unique point λ 0 at which ψ(λ) attains its minimum value ψ(λ 0)≤1 (see also Sect. 9.1).

Corollary 15.2.3

Assume that we are given a sequence g(n) such that

$$g^+ (n) := \max \bigl(0,g(n) \bigr)=o(n) \quad\mbox{\textit{as} } n \to\infty. $$

If S n ≤g(n) holds on the set {ν>n}, then (15.2.26) holds for λ∈(λ 0,λ +]∩B, B={λ:ψ(λ)<∞}.

The random variable ν=ν g =inf{k≥1:S k >g(k)} for g(k)=o(k) obviously satisfies the conditions of Corollary 15.2.3. For stopping times ν g one could also consider the case g(n)/n→c≥0 as n→∞, which can be reduced to the case g(n)=o(n) by introducing the random variables

$$\zeta^*_k := \zeta_k - c,\qquad S_k^* := \sum^k_{j=1} \zeta^*_j, $$

for which \(\nu_{g} = \inf\{ k \geq1 : S^{*}_{k} > g(k) - c k \}\).

Proof of Corollary 15.2.3

For λ>λ 0, λB, we have

$$\begin{aligned} \mathbf{E} \biggl(\frac{e^{\lambda S_n}}{\psi^n (\lambda)}; \nu> n \biggr) \le& \psi^{-n} ( \lambda) \mathbf{E} \bigl(e^{\lambda S_n}; S_n \le g (n) \bigr) \\=& \psi^{-n} (\lambda) \mathbf{E} \bigl(e^{(\lambda- \lambda_0) S_n} \cdot e^{\lambda_0 S_n}; S_n \le g (n) \bigr) \\\leq& \psi^{-n} (\lambda) e^{(\lambda- \lambda_0) g (n)} \mathbf{E} \bigl(e^{\lambda_0 S_n}; S_n \le g (n) \bigr) \\\le& \psi^{-n} (\lambda) e^{(\lambda- \lambda_0) g^+(n)} \mathbf{E} e^{\lambda_0 S_n} = \biggl(\frac{\psi(\lambda_0)}{\psi(\lambda)} \biggr)^n e^{(\lambda- \lambda_0) g^+ (n)} \to0 \end{aligned}$$

as n→∞, because (λλ 0)g +(n)=o(n). It remains to use Theorem 15.2.6. The corollary is proved. □

We now return to Theorem 15.2.6 for arbitrary stopping times. It turns out that, based on the Cramér transform introduced in Sect. 9.1, one can complement its assertions without using any martingale techniques.

Together with the original distribution P of the sequence \(\{\zeta_{k}\}_{k=1}^{\infty}\) we introduce the family of distributions P λ of this sequence in \(\langle\mathbb{R}^{\infty}, \mathfrak{B}^{\infty}\rangle\) (see Sect. 5.5) generated by the finite-dimensional distributions

$$\mathbf{P}_{\lambda} (\zeta_1 \in d x_1, \ldots, \zeta_n \in d x_n) = \frac{e^{\lambda(x_1 + \cdots+ x_n)}}{\psi^n(\lambda)}\, \mathbf{P}(\zeta_1 \in d x_1, \ldots, \zeta_n \in d x_n),\quad n=1,2,\ldots $$

This is the Cramér transform of the distribution P.

Theorem 15.2.7

Let ν be an arbitrary stopping time. Then, for any λB,

$$ \mathbf{E} \biggl(\frac{e^{\lambda S_{\nu}}}{\psi^{\nu} (\lambda )}; \nu< \infty \biggr) = \mathbf{P}_{\lambda} (\nu< \infty). $$
(15.2.28)

Proof

Since {ν=n}∈σ(ζ 1,…,ζ n ), there exists a Borel set \(D_{n}\subset\mathbb{R}^{n}\), such that

$$\{\nu=n\}= \bigl\{(\zeta_1,\ldots,\zeta_n)\in D_n \bigr\}. $$

Further,

$$\mathbf{E} \biggl(\frac{e^{\lambda S_{\nu}}}{\psi^{\nu} (\lambda )}; \nu< \infty \biggr) = \sum ^{\infty}_{n=0} \mathbf{E} \biggl(\frac {e^{\lambda S_n}}{ \psi^n (\lambda)}; \nu= n \biggr), $$

where

$$\begin{aligned} \mathbf{E} \biggl(\frac{e^{\lambda S_n}}{\psi^n (\lambda)}; \nu= n \biggr) =& \int _{(x_1, \ldots, x_n) \in D_n} \frac{e^{\lambda(x_1 + \cdots+ x_n)}}{\psi^n(\lambda)} \mathbf{P}(\zeta_1 \in d x_1, \ldots, \zeta_n \in dx_n) \\=&\int _{(x_1, \ldots, x_n) \in D_n} \mathbf{P}_{\lambda} (\zeta _1 \in d x_1, \ldots, \zeta_n \in d x_n) = \mathbf{P}_{\lambda} (\nu= n). \end{aligned}$$

This proves the theorem. □

For a given function g(n), consider now the stopping time

$$\nu=\nu_g=\inf \bigl\{k:S_k\ge g(k) \bigr\} $$

(cf. Corollary 15.2.3). The assertion of Theorem 15.2.7 can be obtained in that case in the following way. Denote by E λ the expectation with respect to the distribution P λ .

Corollary 15.2.4

1. If g +(n)=max(0,g(n))=o(n) as n→∞ and λ∈(λ 0,λ +]∩B, then one has P λ (ν g <∞)=1 in relation (15.2.28).

2. If g(n)≥0 and λ<λ 0, then P λ (ν g <∞)<1.

3. For λ=λ 0, the distribution \(\mathbf{P}_{\lambda _{0}}\) of the variable ν can either be proper (when one has \(\mathbf{P}_{\lambda_{0}}(\nu_{g}<\infty)=1\)) or improper \((\mathbf{P}_{\lambda_{0}}(\nu_{g}<\infty)<1)\). If λ 0∈(λ −,λ +), g(n)<(1−ε)σ(2n ln ln n)1/2 for all n≥n 0, starting from some n 0, and \(\sigma^{2}=\mathbf{E}_{\lambda_{0}}\zeta^{2}_{1}\), then \(\mathbf{P}_{\lambda_{0}}(\nu_{g}<\infty)=1\).

But if λ 0∈(λ −,λ +), g(n)≥0, and g(n)≥(1+ε)σ(2n ln ln n)1/2 for n≥n 0, then \(\mathbf{P}_{\lambda_{0}}(\nu_{g}<\infty)<1\) (we exclude the trivial case ζ k ≡0).

Proof

Since \(\mathbf{E}_{\lambda}\zeta_{k}=\frac{\psi'(\lambda)}{\psi(\lambda )}\), the expectation E λ ζ k is of the same sign as the difference λλ 0, and \(\mathbf{E}_{\lambda_{0}}\zeta_{k}=0\) (ψ′(λ 0)=0 if λ 0∈(λ ,λ +)). Hence the first assertion follows from the relations

$$\mathbf{P}_\lambda(\nu=\infty)=\mathbf{P}_\lambda \bigl(X_n<g(n) \mbox { for all } n\bigr)\le\mathbf{P}_\lambda \bigl(X_n<g^+(n)\bigr)\to0 $$

as n→∞ by the law of large numbers for the sums \(X_{n}=\sum_{k=1}^{n} \zeta_{k}\), since E λ ζ k >0.

The second assertion is a consequence of the strong law of large numbers since E λ ζ k <0 and hence P λ (ν=∞)=P λ (sup n X n ≤0)>0.

The last assertion of the corollary follows from the law of the iterated logarithm which we prove in Sect. 20.2. The corollary is proved. □

The condition g(n)≥0 of part 2 of the corollary can clearly be weakened to the condition g(n)=o(n), P(ν>n)>0 for any n>0. The same is true for part 3.

An assertion similar to Corollary 15.2.4 is also true for the (stopping) time \(\nu_{g_{-}, g_{+}}\) of the first passage of one of the two boundaries g ±(n)=o(n):

$$\nu_{g_-, g_+}:=\inf \bigl\{k\geq1: S_k\ge g_+(k)\mbox{ or } S_k\le g_-(k) \bigr\}. $$

Corollary 15.2.5

For λB∖{λ 0}, we have \(\mathbf{P}_{\lambda}(\nu_{g_{-}, g_{+}}<\infty)=1\).

If λ=λ 0∈(λ ,λ +), then the P λ -distribution of ν may be either proper or improper.

If, for some n 0>2,

$$g_\pm(n)\lessgtr\pm(1-\varepsilon)\sigma\sqrt{2n\ln\ln n} $$

for nn 0 then \(\mathbf{P}_{\lambda_{0}}(\nu_{g_{-},g_{+}}<\infty)=1\).

If g ±(n)≷0 and, additionally,

$$g_\pm(n)\gtrless\pm(1+\varepsilon)\sigma\sqrt{2n\ln\ln n} $$

for nn 0 then \(\mathbf{P}_{\lambda_{0}}(\nu_{g_{-},g_{+}}<\infty)<1\).

Proof

The first assertion follows from Corollary 15.2.4 applied to the sequences {±X n }. The second is a consequence of the law of the iterated logarithm from Sect. 20.2. □

We now consider several relations following from Corollaries 15.2.3, 15.2.4 and 15.2.5 (from identity (15.2.26)) for the random variables ν=ν g and \(\nu=\nu_{g_{-},g_{+}}\).

Let a<0 and ψ(λ +)≥1. Since ψ′(0)=a<0 and the function ψ(λ) is convex, the equation ψ(λ)=1 will have a unique root μ>0 in the domain λ>0. Setting λ=μ in (15.2.26) we obtain the following.

Corollary 15.2.6

If a<0 and ψ(λ +)≥1 then, for the stopping times ν=ν g and \(\nu= \nu_{g_{-}, g_{+}}\), we have the equality

$$\mathbf{E}\bigl(e^{\mu S_{\nu}} ; \, \nu< \infty\bigr) = 1. $$

Remark 15.2.3

For an x>0, put (as in Chap. 10) η(x):=inf{k:S k >x}. Since S η(x)=x+χ(x), where χ(x):=S η(x)−x is the value of overshoot over the level x, Corollary 15.2.6 implies

$$ \mathbf{E} \bigl(e^{\mu(x+\chi(x))};\,\eta(x)<\infty \bigr)=1. $$
(15.2.29)

Note that P(η(x)<∞)=P(S>x), where S=sup k≥0 S k . Therefore, Theorem 12.7.4 and (15.2.29) imply that, as x→∞,

$$ e^{\mu x}\mathbf{P} \bigl(\eta(x)<\infty \bigr)= \bigl[\mathbf{E} \bigl(e^{\mu\chi(x)}\,\big|\,\eta(x)<\infty \bigr) \bigr]^{-1}\to c. $$
(15.2.30)

The last convergence relation corresponds to the fact that the limiting conditional distribution (as x→∞) G of χ(x) exists given η(x)<∞. If we denote by χ a random variable with the distribution G then (15.2.30) will mean that c=[Ee μχ]−1<1. This provides an interpretation of the constant c that is different from the one in Theorem 12.7.4.

In Corollary 15.2.6 we “eliminated” the “component” ψ ν(λ) in identity (15.2.26). “Elimination” of the other component \(e^{\lambda S_{\nu}}\) is possible only in some special cases of random walks, such as the so-called skip-free walks (see Sect. 12.8) or walks with exponentially (or geometrically) distributed \(\zeta^{+}_{k} = \max(0, \zeta_{k})\) or \(\zeta^{-}_{k} = - \min(0, \zeta_{k})\). We will illustrate this with two examples.

Example 15.2.3

We return to the ruin problem discussed in Example 15.2.2. In that case, Corollary 15.2.4 gives, for g (n):=−z 1 and g +(n)=z 2, that

$$e^{\lambda z_2} \mathbf{E}\bigl(\psi(\lambda)^{-\nu} ;\, S_{\nu} = z_2\bigr) + e^{-\lambda z_1} \mathbf{E}\bigl(\psi( \lambda)^{-\nu} ;\, S_{\nu} = - z_1\bigr) = 1 . $$

In particular, for z 1=z 2=z and p=1/2, we have by symmetry that

$$ \mathbf{E}\bigl(\psi(\lambda)^{-\nu} ;\, S_{\nu} = z\bigr) = \frac{1}{e^{\lambda z} + e^{-\lambda z}} ,\qquad \mathbf{E}\bigl(\psi( \lambda)^{-\nu}\bigr) = \frac{2}{e^{\lambda z} + e^{-\lambda z}} . $$
(15.2.31)

Let λ(s) be the unique positive solution of the equation sψ(λ)=1, s∈(0,1). Since here \(\psi(\lambda) = \frac{1}{2} (e^{\lambda} + e^{-\lambda})\), solving the quadratic equation yields

$$e^{\lambda(s)} = \frac{1 + \sqrt{1 - s^2}}{s}. $$

Identity (15.2.31) now gives

$$\mathbf{E}s^{\nu} = \frac{2}{e^{\lambda(s) z} + e^{-\lambda(s) z}} . $$

We obtain an explicit form of the generating function of the random variable ν, which enables us to find the probabilities P(ν=n), n=1,2,… by expanding elementary functions into series.
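
One can also verify the formula for E s^ν numerically without expanding it into a series: the sketch below (Python with NumPy; the value z=3, the horizon and the chosen values of s are illustrative) compares the closed-form generating function with the Monte Carlo average of s^ν over simulated paths.

```python
import numpy as np

rng = np.random.default_rng(7)
z = 3                                         # boundaries at +-z (illustrative)
n_paths, horizon = 100_000, 300               # exit of (-z, z) before the horizon is ~ certain

steps = rng.choice(np.array([-1, 1], dtype=np.int8), size=(n_paths, horizon))
S = steps.cumsum(axis=1, dtype=np.int32)
nu = (np.abs(S) >= z).argmax(axis=1) + 1      # exit time nu

for s in (0.5, 0.8, 0.95):
    lam = np.log((1 + np.sqrt(1 - s ** 2)) / s)        # e^{lambda(s)} solves s*psi(lambda) = 1
    gen_formula = 2.0 / (np.exp(lam * z) + np.exp(-lam * z))
    gen_mc = np.mean(s ** nu)
    print(f"s = {s}:  formula {gen_formula:.4f}   Monte Carlo {gen_mc:.4f}")
```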

Example 15.2.4

Simple explicit formulas can also be obtained from Wald’s identity in the problem with one boundary, where ν=ν g , g(n)=z. In that case, the class of distributions of ζ k could be wider than in Example 15.2.3. Suppose that one of the two following conditions holds (cf. Sect. 12.8).

  1. 1.

The walk is arithmetic and skip-free, i.e. the ζ k are integers, P(ζ k =1)>0 and P(ζ k ≥2)=0.

  2. 2.

    The walk is right exponential, i.e.

    $$ \mathbf{P}(\zeta_k > t) = c e^{- \alpha t} $$
    (15.2.32)

    either for all t>0 or for t=0,1,2,… if the walk is integer-valued (the geometric distribution).

The random variable ν g will be proper if and only if E ζ k =ψ′(0)≥0 (see Chaps. 9 and 12). For skip-free random walks, Wald’s identity (15.2.26) yields (g(n)=z>0, S ν =z)

$$ e^{\lambda z} \mathbf{E}\psi^{-\nu} (\lambda) = 1 , \quad\lambda> \lambda_0 . $$
(15.2.33)

For s≤1, the equation ψ(λ)=s −1 (cf. Example 15.2.3) has in the domain λ>λ 0 a unique solution λ(s). Therefore identity (15.2.33) can be written as

$$ \mathbf{E}s^{\nu} = e^{- z \lambda(s)}. $$
(15.2.34)

This statement implies a series of results from Chaps. 9 and 12. Many properties of the distribution of ν:=ν z can be derived from this identity, in particular, the asymptotics of P(ν z =n) as z→∞, n→∞. We already know one of the ways to find this asymptotics. It consists of using Theorem 12.8.4, which implies

$$ \mathbf{P}(\nu_z=n)=\frac{z}{n} \mathbf{P}(S_n=z), $$
(15.2.35)

and the local Theorem 9.3.4 providing the asymptotics of P(S n =z). Using relation (15.2.34) and the inversion formula is an alternative approach to studying the asymptotics of P(ν z =n). If we use the inversion formula, there will arise an integral of the form

$$ \int_{|s|=1}s^{-n}e^{-z\lambda(s)}\,ds, $$
(15.2.36)

where the integrand s −n e −zλ(s) , after the change of variable λ(s)=λ (or s=ψ(λ)−1), takes the form

$$\exp- \bigl\{z\lambda-n\ln\psi(\lambda) \bigr\}. $$

The integrand in the inversion formula for the probability P(S n =z) has the same form. This probability has already been studied quite well (see Theorem 9.3.4); its exponential part has the form e −nΛ(α), where α=z/n, Λ(α)=sup λ (αλ−lnψ(λ)) is the large deviation rate function (see Sect. 9.1 and the footnote for Definition 9.1.1). A more detailed study of the inversion formula (15.2.36) allows us to obtain (15.2.35).

Similar relations can be obtained for random walks with exponential right distribution tails. Let, for example, (15.2.32) hold for all t>0. Then the conditional distribution P(S ν >z+t|ν=n,S n−1=x) coincides with the distribution

$$\mathbf{P}(\zeta_n > z - x + t | \zeta_n > z - x) = e^{- \alpha t} $$

and clearly depends neither on n nor on x. This means that ν and S ν are independent, S ν =z+γ, \(\gamma \mathbin {{\subset }\hspace {-.7em}{=}}{\boldsymbol{\Gamma}}_{\alpha}\),

$$\mathbf{E}\psi(\lambda)^{-\nu} = \frac{1}{\mathbf{E}e ^{(z + \gamma) \lambda}}= e^{- \lambda z} \frac{\alpha- \lambda}{\alpha} , \quad\lambda_0 < \lambda< \alpha;\qquad \mathbf{E}s^{\nu} = e^{-z \lambda(s)} \frac{\alpha- \lambda(s)}{\alpha} , $$

where λ(s) is, as before, the only solution to the equation ψ(λ)=s −1 in the domain λ>λ 0. This implies the same results as (15.2.34).

If P(ζ k >t)=c 1 e αt and P(ζ k <−t)=c 2 e βt, t>0, then, in the problem with two boundaries, we obtain for \(\nu= \nu_{g_{-}, g_{+}}\), g +(n)=z 2 and g (n)=−z 1 in exactly the same way from (15.2.26) that

$$\frac{\alpha e^{\lambda z_2}}{\alpha- \lambda} \mathbf{E}\bigl(\psi^{-\nu}(\lambda) ; \, S_{\nu} \geq z_2\bigr) + \frac{\beta e^{-\lambda z_1}}{\beta+ \lambda} \, \mathbf{E} \bigl(\psi^{-\nu}(\lambda) ; \, S_{\nu} \leq- z_1 \bigr) = 1 , \quad\lambda\in(- \beta, \alpha). $$

15.3 Inequalities

15.3.1 Inequalities for Martingales

First of all we note that the property E X n ≤1 of the sequence \(X_{n} = {e^{\lambda S_{n}}}{\psi_{0}(\lambda)^{-n}}\) forming a supermartingale for an appropriate function ψ 0(λ) remains true when we replace n with a stopping time ν (an analogue of inequality (15.2.24)) in a much more general case than that of Theorem 15.2.6. Namely, ζ k may be dependent.

Let, as before, \(\{\mathfrak{F}_{n}\}\) be an increasing sequence of σ-algebras, and ζ n be \(\mathfrak{F}_{n}\)-measurable random variables. Suppose that a.s.

$$ \mathbf{E}\bigl(e^{\lambda\zeta_n} \big| \mathfrak{F}_{n-1} \bigr) \leq\psi _0(\lambda) . $$
(15.3.1)

This condition is always met if a.s.

$$\mathbf{P}(\zeta_n \geq x | \mathfrak{F}_{n-1}) \leq G(x) , \qquad\psi_0(\lambda) = - \int e^{\lambda x} \, d G(x) < \infty. $$

In that case the sequence \(X_{n} = e^{\lambda S_{n}} \psi^{-n}_{0}(\lambda)\) forms a supermartingale:

$$\mathbf{E}(X_n | \mathfrak{F}_{n-1}) \leq X_{n-1} , \quad\mathbf {E}X_n \leq1 . $$

Theorem 15.3.1

Let (15.3.1) hold and ν be a stopping time. Then inequalities (15.2.24) and (15.2.25) will hold true with ψ replaced by ψ 0.

Proof

The proof of the theorem repeats almost verbatim that of Theorem 15.2.6. □

Now we will obtain inequalities for the distribution of

$$\overline{X}_n = \max_{k \leq n} X_k \quad{\mbox{and}} \quad X_n^* = \max_{k \leq n} |X_k| , $$

X n being an arbitrary submartingale.

Theorem 15.3.2

(Doob)

Let \(\{ X_{n}, \mathfrak{F}_{n} ; \, n \geq0 \}\) be a nonnegative submartingale. Then, for all x≥0 and n≥0,

$$\mathbf{P}(\overline{X}_n > x) \leq\frac{1}{x} \mathbf{E}X_n . $$

Proof

Let

$$\nu= \eta(x): = \inf\{ k \geq0 : \ X_k > x \} , \qquad\nu(n) := \min( \nu, n) . $$

It is obvious that n and ν(n) are stopping times, ν(n)≤n, and therefore, by Theorem 15.2.1 (see (15.2.3) for ν 2=n, ν 1=ν(n)),

$$\mathbf{E}X_n \geq\mathbf{E}X_{\nu(n)} . $$

Observing that \(\{ \overline{X}_{n} > x \} = \{ X_{\nu(n)} > x \}\), we have from Chebyshev’s inequality that

$$\mathbf{P}(\overline{X}_n > x) = \mathbf{P}(X_{\nu(n)} > x ) \leq \frac{1}{x} \mathbf{E}X_{\nu(n)} \leq\frac{1}{x} \mathbf{E}X_n. $$

The theorem is proved. □
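
A quick numerical illustration of Doob’s inequality (Python with NumPy). Here the nonnegative submartingale is X k =|S k | for a Gaussian random walk; the choice of increments, of n and of the level x are illustrative assumptions, and the bound E X n /x is typically far from tight.

```python
import numpy as np

rng = np.random.default_rng(8)
n_paths, n, x = 100_000, 100, 25.0            # illustrative sizes and level

zeta = rng.standard_normal((n_paths, n))
S = zeta.cumsum(axis=1)
X = np.abs(S)                                 # |martingale| is a nonnegative submartingale

lhs = float((X.max(axis=1) > x).mean())       # P(max_{k<=n} X_k > x)
rhs = float(X[:, -1].mean() / x)              # E X_n / x
print(f"P(max_k X_k > x) ~ {lhs:.4f}   <=   E X_n / x ~ {rhs:.4f}")
```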

Theorem 15.3.2 implies the following.

Theorem 15.3.3

(The second Kolmogorov inequality)

Let \(\{ X_{n}, \mathfrak{F}_{n} ; \, n \geq0 \}\) be a martingale with a finite second moment \(\mathbf{E}X^{2}_{n}\). Then \(\{ X^{2}_{n}, \mathfrak{F}_{n} ; \, n \geq0 \}\) is a submartingale and by Theorem 15.3.2

$$\mathbf{P}\bigl(X^*_n > x\bigr) \leq\frac{1}{x^2} \mathbf{E}X^2_n. $$

Originally A.N. Kolmogorov established this inequality for sums X n =ξ 1+⋯+ξ n of independent random variables ξ n . Theorem 15.3.3 extends Kolmogorov’s proof to the case of submartingales and refines Chebyshev’s inequality.

The following generalisation of Theorem 15.3.3 is also valid.

Theorem 15.3.4

If \(\{ X_{n}, \mathfrak{F}_{n} ; \,n \geq0 \}\) is a martingale and E|X n |p<∞, p≥1, then \(\{ |X_{n}|^{p}, \mathfrak{F}_{n} ; \, n \geq0 \}\) forms a nonnegative submartingale and, for all x>0,

$$\mathbf{P}\bigl(X^*_n \geq x\bigr) \leq\frac{1}{x^p} \mathbf{E}|X_n|^p. $$

If \(\{ X_{n}, \mathfrak{F}_{n} ; \, n \geq0\}\) is a submartingale, \(\mathbf{E}e^{\lambda X_{n}} < \infty\), λ>0, then \(\{ e^{\lambda X_{n}} , \mathfrak {F}_{n} ; \, n\ge0 \}\) also forms a nonnegative submartingale,

$$\mathbf{P}(\overline{X}_n \geq x) \leq e^{-\lambda x} \mathbf {E}e^{\lambda X_n} . $$

Both Theorem 15.3.4 and Theorem 15.3.3 immediately follow from Lemma 15.1.3 and Theorem 15.3.2.

If \(X_{n} = S_{n} = \sum^{n}_{k=1} \zeta_{k}\), where ζ k are independent, identically distributed and satisfy the Cramér condition: λ +=sup{λ:ψ(λ)<∞}>0, then, with the help of the fundamental Wald identity, one can obtain sharper inequalities for \(\mathbf{P}(\overline{X}_{n} > x)\) in the case a=E ζ k <0.

Recall that, in the case a=ψ′(0)<0, the function \(\psi (\lambda) = \mathbf{E}e^{\lambda\zeta_{k}}\) decreases in a neighbourhood of λ=0, and, provided that ψ(λ +)≥1, the equation ψ(λ)=1 has a unique solution μ in the domain λ>0.

Let ζ be a random variable having the same distribution as ζ k . Put

$$\psi_+ := \sup_{t > 0} \mathbf{E}\bigl(e^{\mu(\zeta- t)} \big| \zeta> t\bigr) , \qquad \psi_- := \inf_{t>0} \mathbf{E} \bigl(e^{\mu(\zeta- t)} \big| \zeta> t\bigr) . $$

If, for instance, P(ζ>t)=ce αt for t>0 (in this case necessarily α>μ in (15.2.32)), then

$$\mathbf{P}(\zeta- t > v | \zeta> t) = \frac{\mathbf{P}(\zeta> t + v)}{\mathbf{P}(\zeta> t)} = e^{- \alpha v} , \qquad \psi_+ = \psi_- = \frac{\alpha}{\alpha- \mu} . $$

A similar equality holds for integer-valued ζ with a geometric distribution.

For other distributions, one has ψ +>ψ .

Under the above conditions, one has the following assertion which supplements Theorem 12.7.4 for the distribution of the random variable S=sup k S k .

Theorem 15.3.5

If a=E ζ<0 then

$$ \psi^{-1}_+ e^{-\mu x} \leq\mathbf{P}(S > x) \leq\psi^{-1}_- e^{-\mu x},\quad x>0. $$
(15.3.2)

This theorem implies that, in the case of exponential right tails of the distribution of ζ (see (15.2.32)), inequalities (15.3.2) become the exact equality

$$\mathbf{P}(S > x) = \frac{\alpha- \mu}{\alpha} e^{- \mu x} . $$

(The same result was obtained in Example 12.5.1.) This means that inequalities (15.3.2) are unimprovable. Since \(\overline{S}_{n} = \max_{k \leq n} S_{k} \leq S\), relation (15.3.2) implies that, for any n,

$$\mathbf{P}(\overline{S}_n > x) \leq\psi^{-1}_- e^{- \mu x} . $$
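For a concrete check of (15.3.2) one may take ζ=Y−b with Y exponentially distributed, so that the right tail of ζ is exponential and the two bounds coincide. The sketch below (an illustration added here, not from the original text; it assumes numpy and hypothetical values α=1, b=2) finds μ from ψ(μ)=1 by bisection and compares a Monte Carlo estimate of P(S>x) with (α−μ)α −1 e −μx.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, b = 1.0, 2.0                       # zeta = Y - b, Y ~ Exp(alpha); E zeta = 1/alpha - b < 0

def psi(lam):                             # psi(lam) = E exp(lam * zeta), valid for lam < alpha
    return np.exp(-lam * b) * alpha / (alpha - lam)

lo, hi = 1e-9, alpha - 1e-9               # bisection for the positive root of psi(lam) = 1
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if psi(mid) < 1.0 else (lo, mid)
mu = 0.5 * (lo + hi)

n_paths, n_steps, x = 20_000, 1_000, 3.0
zeta = rng.exponential(1.0 / alpha, size=(n_paths, n_steps)) - b
S_sup = np.maximum(np.cumsum(zeta, axis=1).max(axis=1), 0.0)   # sup_k S_k, truncated at n_steps

print(f"mu = {mu:.4f}")
print(f"Monte Carlo P(S > x) ~ {np.mean(S_sup > x):.5f}")
print(f"(alpha - mu)/alpha * exp(-mu*x) = {(alpha - mu) / alpha * np.exp(-mu * x):.5f}")
```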

Proof of Theorem 15.3.5

Set ν:=∞ if S=sup k≥0 S k ≤x, and put ν:=η(x)=min{k: S k >x} otherwise. Further, let χ(x):=S η(x) −x be the excess of the level x. We have

$$\begin{aligned} \mathbf{P}\bigl(\chi(x) > v ; \, \nu< \infty\bigr) =& \sum ^{\infty}_{k=1} \int ^x_{-\infty} \mathbf{P}( \overline{S}_{k-1} \leq x, \, S_{k-1} \in du, \, \zeta_k > x - u + v) \\ =& \sum ^{\infty}_{k=1} \int ^x_{-\infty} \mathbf{P}(\overline{S}_{k-1} \leq x, \, S_{k-1} \in du, \, \zeta_k > x - u)\\&{}\times \mathbf{P}(\zeta_k > x - u + v | \zeta_k > x - u) , \\ \mathbf{E}\bigl(e^{\mu\chi(x)} ; \, \nu< \infty\bigr) \leq& \sum ^{\infty}_{k=1} \int ^x_{- \infty} \mathbf{P}( \overline{S}_{k-1} \leq x,\, S_{k-1} \in d u,\, \zeta_k > x - u) \, \psi_+ \\=& \psi_+ \sum ^{\infty}_{k=1} \ \mathbf{P}( \nu= k) = \psi_+ \mathbf{P}(\nu< \infty) . \end{aligned}$$

Similarly,

$$\mathbf{E}\bigl(e^{\mu\chi(x)} ; \, \nu< \infty\bigr) \ge\psi_- \mathbf {P}(\nu< \infty ) . $$

Next, by Corollary 15.2.6,

$$1 = \mathbf{E}\bigl(e^{\mu S_{\nu}} ; \, \nu< \infty\bigr) = e^{\mu x} \mathbf{E}\bigl(e^{\mu\chi(x)} ; \, \nu< \infty\bigr) \leq e^{\mu x} \psi_+ \, \mathbf{P}(\nu< \infty) . $$

Because P(ν<∞)=P(S>x), we get from this the right inequality of Theorem 15.3.5. The left inequality is obtained in the same way. The theorem is proved. □

Remark 15.3.1

We proved Theorem 15.3.5 with the help of the fundamental Wald identity. But there is a direct proof based on the following relations:

$$ \begin{aligned}[b] \psi^n(\lambda) = \mathbf{E}e^{\lambda S_n} &\geq \sum^n_{k=1} \mathbf{E}\bigl(e^{(S_k + S_n - S_k) \lambda} ; \,\nu= k\bigr) \\&= \sum^n_{k=1} \mathbf{E}\bigl(e^{(x + \chi(x)) \lambda} e^{(S_n - S_k) \lambda} ; \, \nu= k\bigr) . \end{aligned} $$
(15.3.3)

Here the random variables e λχ(x)I(ν=k) and S n −S k are independent and, as before,

$$\mathbf{E}\bigl(e^{\lambda\chi(x)} ;\, \nu= k\bigr) \geq\psi_- \,\mathbf {P}(\nu= k) . $$

Therefore, for all λ such that ψ(λ)≤1,

$$\psi^n(\lambda) \geq e^{\lambda x} \psi_- \sum ^n_{k=1} \psi^{n-k}(\lambda) \, \mathbf{P}(\nu= k) \geq\psi_- e^{\lambda x} \psi^n (\lambda) \, \mathbf{P}(\nu\leq n) . $$

Hence we obtain

$$\mathbf{P}(\overline{S}_n > x) = \mathbf{P}(\nu\leq n) \leq \psi^{-1}_- e^{- \lambda x} . $$

Since the right-hand side does not depend on n, the same inequality also holds for P(S>x). The lower bound is obtained in a similar way. One just has to show that, in the original equality (cf. (15.3.3))

$$\psi^n(\lambda) = \sum ^n_{k=1} \mathbf{E}\bigl(e^{\lambda S_n} ; \, \nu= k\bigr) + \mathbf{E} \bigl(e^{\lambda S_n} ; \, \nu> n\bigr), $$

one has \(\mathbf{E}(e^{\lambda S_{n}} ; \, \nu> n) = o (1)\) as n→∞ for λ=μ, which we did in Sect. 15.2.

15.3.2 Inequalities for the Number of Crossings of a Strip

We now return to arbitrary submartingales X n and prove an inequality that will be necessary for the convergence theorems of the next section. It concerns the number of crossings of a strip by the sequence X n . Let a<b be given numbers. Set ν 0=0,

$$\begin{aligned} \nu_1&:=\min\{n>0:X_n\le a\}, &\quad \nu_2&:=\min\{n>\nu_1:X_n\ge b\},\\ &\ \ \vdots & &\ \ \vdots\\ \nu_{2k-1}&:=\min\{n>\nu_{2k-2}:X_n\le a\}, &\quad \nu_{2k}&:=\min\{n>\nu_{2k-1}:X_n\ge b\}. \end{aligned} $$

We put ν m :=∞ if the path {X n } for n>ν m−1 never crosses the corresponding level. Using this notation, one can define the number of upcrossings of the strip (interval) [a,b] by the trajectory X 0,…,X n as the random variable

$$\nu(a, b ;\, n) := \max\{ k :\, \nu_{2k} \le n \} $$

(with ν(a,b; n):=0 if ν 2>n). Set (a) + :=max(0,a).

Theorem 15.3.6

(Doob)

Let \(\{X_{n}, \mathfrak{F}_{n} ; \, n \ge0 \}\) be a submartingale. Then, for all n,

$$ \mathbf{E}\nu(a, b ; n) \le\frac{\mathbf{E}(X_n - a)^+}{b - a} . $$
(15.3.4)

It is clear that inequality (15.3.4) only requires that the submartingale \(\{X_{k} ,\mathfrak{F}_{k} ;\, 0 \le k \le n \}\) be given.

Proof

The random variable ν(a,b; n) coincides with the number of upcrossings of the interval [0,ba] by the sequence (X n a)+. Now \(\{(X_{n} - a)^{+} , \mathfrak{F}_{n} ; \, n \ge0 \}\) is a nonnegative submartingale (see Example 15.1.4) and therefore, without loss of generality, one can assume that a=0 and X n ≥0, and aim to prove that

$$\mathbf{E}\nu(0, b ;\, n) \le\frac{\mathbf{E}X_n}{b} . $$

Let

$$\eta_j := \begin{cases} 1 & \text{if}\ \nu_k < j \le\nu_{k+1}\ \text{for some odd}\ k ,\\ 0 & \text{otherwise}. \end{cases} $$

In Fig. 15.1, ν 1=2, ν 2=5, ν 3=8; η j =0 for j≤2, η j =1 for 3≤j≤5 etc. It is not hard to see (using the Abel transform) that (with X 0=0, η 0=1)

$$\eta_0 X_0 + \sum _1^n \eta_j (X_j - X_{j-1}) = \sum _0^{n-1} X_j (\eta_j - \eta_{j+1}) + \eta_n X_n \ge b \nu(0, b ; n) . $$

Moreover (here \(\mathcal{N}_{1}\) denotes the set of odd numbers),

$$\{ \eta_j = 1 \} =\bigcup _{k \in\mathcal{N}_1} \{ \nu_k < j \le\nu_{k+1} \} =\bigcup _{k \in\mathcal{N}_1 } \bigl[\{ \nu_k \le j-1 \} - \{ \nu_{k+1} \le j-1 \}\bigr] \in \mathfrak{F}_{j-1} . $$

Therefore, since η j is \(\mathfrak{F}_{j-1}\)-measurable, 0≤η j ≤1, and \(\mathbf{E}(X_{j} | \mathfrak{F}_{j-1}) - X_{j-1} \ge0\), we obtain

$$b\, \mathbf{E}\nu(0, b ;\, n) \le\mathbf{E}\eta_0 X_0 + \sum _1^n \mathbf{E}\bigl[ \eta_j\, \mathbf{E}(X_j - X_{j-1} \mid\mathfrak{F}_{j-1}) \bigr] \le\mathbf{E}X_0 + \sum _1^n \mathbf{E}(X_j - X_{j-1}) = \mathbf{E}X_n . $$

The theorem is proved. □

Fig. 15.1 Illustration to the proof of Theorem 15.3.6 showing the locations of the random times ν 1, ν 2, and ν 3 (here a=0)
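Doob's inequality (15.3.4) can likewise be checked by direct counting of upcrossings. The sketch below (an added illustration, assuming numpy; the submartingale |S n | and the levels a=1, b=3 are arbitrary choices) compares the empirical mean number of upcrossings with the bound E(X n −a)+/(b−a).

```python
import numpy as np

def upcrossings(path, a, b):
    """Number of upcrossings of [a, b] by the finite trajectory X_0, ..., X_n."""
    count, waiting_for_b = 0, False
    for v in path:
        if not waiting_for_b and v <= a:
            waiting_for_b = True                      # reached level a (a time nu_{2k-1})
        elif waiting_for_b and v >= b:
            waiting_for_b, count = False, count + 1   # completed an upcrossing (a time nu_{2k})
    return count

rng = np.random.default_rng(2)
n_steps, n_paths, a, b = 200, 20_000, 1.0, 3.0

steps = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
S = np.cumsum(steps, axis=1)
X = np.abs(np.hstack([np.zeros((n_paths, 1)), S]))    # X_0 = 0, X_k = |S_k|: nonnegative submartingale

mean_upcrossings = np.mean([upcrossings(p, a, b) for p in X])
doob_bound = np.mean(np.maximum(X[:, -1] - a, 0.0)) / (b - a)
print(f"E nu(a, b; n) ~ {mean_upcrossings:.3f}  <=  E(X_n - a)^+/(b - a) ~ {doob_bound:.3f}")
```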

15.4 Convergence Theorems

Theorem 15.4.1

(Doob’s martingale convergence theorem)

Let

$$\{X_n , \mathfrak{F}_n ; \, {-}\infty< n < \infty\} $$

be a submartingale. Then

  1. (1)

    The limit X −∞:=lim n→−∞ X n exists a.s., \(\mathbf{E}X_{-\infty}^{+} < \infty\), and the process \(\{X_{n} , \mathfrak{F}_{n} ; \, {-}\infty\le n < \infty\}\) is a submartingale.

  2. (2)

    If \(\sup_{n} \mathbf{E}X_{n}^{+} < \infty\) then X ∞:=lim n→∞ X n exists a.s. and \(\mathbf{E}X_{\infty}^{+} < \infty\). If, moreover, sup n E|X n |<∞ then E|X ∞|<∞.

  3. (3)

    The random sequence \(\{X_{n} , \mathfrak{F}_{n} ; \, -\infty\le n \le\infty\}\) forms a submartingale if and only if the sequence \(\{ X_{n}^{+} \}\) is uniformly integrable.

Proof

(1) Since

$$\{ \limsup X_n > \liminf X_n \} = \bigcup_{a < b;\ a, b\ \text{rational}} \{ \limsup X_n > b > a > \liminf X_n \} $$

(here the limits are taken as n→−∞), the assumption on divergence with positive probability

$$\mathbf{P}(\limsup X_n > \liminf X_n ) > 0 $$

means that there exist rational numbers a<b such that

$$ \mathbf{P}(\limsup X_n > b > a > \liminf X_n ) > 0 . $$
(15.4.1)

Let ν(a,b; m) be the number of upcrossings of the interval [a,b] by the sequence Y 1=X −m ,…,Y m =X −1 and ν(a,b)=lim m→∞ ν(a,b;m). Then (15.4.1) means that

$$ \mathbf{P}\bigl(\nu(a,b) = \infty\bigr) > 0 . $$
(15.4.2)

By Theorem 15.3.6 (applied to the sequence Y 1,…,Y m ),

$$\begin{aligned}[cc] \mathbf{E}\nu(a,b ;\, m) \le\frac{\mathbf{E}(X_{-1} - a)^+}{b - a} \le\frac{\mathbf{E}X_{-1}^+ + |a|}{b - a} , \end{aligned}$$
(15.4.3)

and hence, letting m→∞,
$$\begin{aligned}[cc] \mathbf{E}\nu(a,b) \le\frac{\mathbf{E}X_{-1}^+ + |a|}{b - a} . \end{aligned}$$
(15.4.4)

Inequality (15.4.4) contradicts (15.4.2) and hence proves that

$$\mathbf{P}(\limsup X_n = \liminf X_n) = 1. $$

Moreover, by the Fatou–Lebesgue theorem (\(X_{-\infty}^{+} := \liminf X_{n}^{+}\)),

$$ \mathbf{E}X_{-\infty}^+ \le\liminf\mathbf{E}X_n^+ \le \mathbf{E}X_{-1}^+ < \infty. $$
(15.4.5)

Here the second inequality follows from the fact that \(\{ X_{n}^{+}, \mathfrak{F}_{n} \}\) is also a submartingale (see Lemma 15.1.3) and therefore \(\mathbf{E}X_{n}^{+} \uparrow\).

By Lemma 15.1.2, to prove that \(\{X_{n}, \mathfrak{F}_{n} ;\, -\infty \le n < \infty\}\) is a submartingale, it suffices to verify that, for any \(A \in\mathfrak{F}_{-\infty} \subset\mathfrak{F}\),

$$ \mathbf{E}(X_{-\infty} ;\, A) \le\mathbf{E}(X_n ;\, A) . $$
(15.4.6)

Set X n (a):=max(X n ,a). By Lemma 15.1.4, \(\{ X_{n}(a), \mathfrak{F}_{n} ; \, n \le0 \}\) is a uniformly integrable submartingale. Therefore, for any −∞<k<n,

$$ \begin{aligned}[c] \mathbf{E}\bigl(X_{k}(a) ;\, A\bigr) &\le\mathbf{E} \bigl(X_n(a) ;\, A\bigr) ,\\ \mathbf{E}\bigl(X_{-\infty}(a) ; \, A\bigr) &= \lim _{k \to-\infty} \mathbf {E}\bigl(X_k(a) ; \, A\bigr) \le \mathbf{E}\bigl(X_n(a) ; \, A\bigr) . \end{aligned} $$
(15.4.7)

Letting a→−∞ we obtain (15.4.6) from the monotone convergence theorem.

(2) The second assertion of the theorem is proved in the same way. One just has to replace the right-hand sides of (15.4.3) and (15.4.4) with \(\mathbf{E}X_{n}^{+}\) and \(\sup_{n} \mathbf{E}X_{n}^{+}\), respectively. Instead of (15.4.5) we get (the limits here are as n→∞)

$$\mathbf{E}X_{\infty}^+ \le\liminf\mathbf{E}X_n^+ < \infty, $$

and if sup n E|X n |<∞ then

$$\mathbf{E}|X_{\infty}| \le\liminf\mathbf{E}|X_n| < \infty. $$

(3) The last assertion of the theorem is proved in exactly the same way as the first one—the uniform integrability enables us to deduce along with (15.4.7) that, for any \(A \in\mathfrak{F}_{n}\),

$$\mathbf{E}\bigl(X_{\infty}(a) ; \, A\bigr) = \lim_{k \to\infty} \mathbf {E}\bigl(X_k(a) ; \, A\bigr) \ge\mathbf{E} \bigl(X_n(a) ; \, A\bigr). $$

The converse part of the third assertion of the theorem follows from Lemma 15.1.4. The theorem is proved. □

Now we will obtain some consequences of Theorem 15.4.1.

So far (see Sect. 4.8), while studying convergence of conditional expectations, we dealt with expectations of the form \(\mathbf{E}(X_{n} | \mathfrak{F})\). Now we can obtain from Theorem 15.4.1 a useful theorem on convergence of conditional expectations of another type.

Theorem 15.4.2

(Lévy)

Let a nondecreasing family \(\mathfrak{F}_{1} \subseteq\mathfrak{F}_{2} \subseteq\cdots \subseteq\mathfrak{F}\) of σ-algebras and a random variable ξ, with E|ξ|<∞, be given on a probability space \(\langle\varOmega ,\mathfrak{F} ,\mathbf{P}\rangle\). Let, as before, \(\mathfrak{F}_{\infty} := \sigma(\bigcup_{n} \mathfrak{F}_{n})\) be the σ-algebra generated by events from \(\mathfrak{F}_{1}, \mathfrak{F}_{2}, \ldots\) . Then, as n→∞,

$$ \mathbf{E}(\xi| \mathfrak{F}_n) \stackrel{ \mathit{a}.\mathit {s}.}{\longrightarrow}\mathbf{E}(\xi| \mathfrak{F}_{\infty}) . $$
(15.4.8)

Proof

Set \(X_{n} := \mathbf{E}(\xi| \mathfrak{F}_{n})\). We already know (see Example 15.1.3) that the sequence \(\{ X_{n}, \mathfrak{F}_{n} ; \, 1 \le n \le\infty\}\) is a martingale and therefore, by Theorem 15.4.1, the limit lim n→∞ X n =X (∞) exists a.s. It remains to prove that \(X_{(\infty)} = \mathbf{E}(\xi| \mathfrak{F}_{\infty})\) (i.e., that X (∞)=X ∞). Since \(\{ X_{n}, \mathfrak{F}_{n} ; \, 1 \le n \le\infty\} \) is by Lemma 15.1.4 a uniformly integrable martingale,

$$\mathbf{E}(X_{(\infty)} ; \, A) = \lim_{n \to\infty} \mathbf {E}(X_n ; \, A) =\lim_{n\to\infty}\mathbf{E}\bigl( \mathbf{E}(\xi|\mathfrak{F}_n);\, A\bigr)=\mathbf{E}(\xi;\,A) $$

for \(A \in\mathfrak{F}_{k}\) and any k=1,2,… This means that the left- and right-hand sides of the last relation, being finite measures, coincide on the algebra \(\bigcup_{n=1}^{\infty} \mathfrak {F}_{n}\). By the theorem on extension of a measure (see Appendix 1), they will coincide for all \(A \in\sigma(\bigcup_{n=1}^{\infty} \mathfrak {F}_{n}) = \mathfrak{F}_{\infty}\). Therefore, by the definition of conditional expectation,

$$X_{(\infty)} = \mathbf{E}(\xi| \mathfrak{F}_{\infty}) = X_{\infty} . $$

The theorem is proved. □

We could also note that the uniform integrability of \(\{ X_{n}, \mathfrak{F}_{n} ;\, 1 \le n \le\infty\}\) implies that \(\stackrel{\mathit{a}.\mathit{s}.}{\longrightarrow}\) in (15.4.8) can be replaced by \(\stackrel{(1)}{\longrightarrow}\).
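Theorem 15.4.2 admits a transparent numerical illustration: take ξ=f(U) for U uniform on [0,1] and let F n be generated by the first n binary digits of U; then E(ξ|F n ) is the average of f over the dyadic interval of rank n containing U, and it converges to f(U). The sketch below (added here for illustration; it assumes numpy, and the function f is an arbitrary choice) evaluates these conditional expectations at a single sample point by Monte Carlo averaging.

```python
import numpy as np

rng = np.random.default_rng(6)
f = lambda u: np.sin(2 * np.pi * u) + u ** 2      # an arbitrary integrable function, xi = f(U)

u = rng.random()                                   # one sample point omega (one realisation of U)
for n in (1, 2, 5, 10, 15):
    left = np.floor(u * 2 ** n) / 2 ** n           # dyadic interval of rank n containing u
    width = 2.0 ** (-n)
    sample = left + width * rng.random(200_000)    # uniform points in that interval
    cond_exp = f(sample).mean()                    # ~ E(xi | F_n) evaluated at omega
    print(f"n = {n:2d}:  E(xi | F_n) ~ {cond_exp:+.5f}   (xi = f(U) = {f(u):+.5f})")
```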

Theorem 15.4.1 implies the strong law of large numbers. Indeed, turn to our Example 15.1.4. By Theorem 15.4.1, the limit X −∞=lim n→−∞ X n =lim n→∞ n −1 S n exists a.s. and is measurable with respect to the tail (trivial) σ-algebra, and therefore it is constant with probability 1. Since E X −∞=E ξ 1, we have \(n^{-1}{S_{n}}\stackrel{\mathit {a}.\mathit{s}.}{\longrightarrow}\mathbf{E}\xi_{1}\).

One can also obtain some extensions of the theorems on series convergence of Chap. 11 to the case of dependent variables. Let

$$X_n = S_n = \sum_{k=1}^n \xi_k $$

and X n form a submartingale (\(\mathbf{E}(\xi_{n+1} | \mathfrak {F}_{n}) \ge0\)). Let, moreover, E|X n |<c for all n and for some c<∞. Then the limit S ∞=lim n→∞ S n exists a.s. (Like Theorem 15.4.1, this assertion is a generalisation of the monotone convergence theorem. The crucial role is played here by the condition that E|X n | is bounded.) In particular, if ξ k are independent, E ξ k =0, and the variances \(\sigma_{k}^{2}\) of ξ k are such that \(\sum_{k=1}^{\infty} \sigma_{k}^{2} < \sigma^{2} <\infty\), then

$$\mathbf{E}|X_n | \le\bigl(\mathbf{E}X_n^2 \bigr)^{1/2} \le \Biggl(\, \sum _{k=1}^n \sigma_k^2 \Biggr)^{1/2} \le \sigma< \infty, $$

and therefore \(S_{n}\stackrel{\mathit{a}.\mathit{s}.}{\longrightarrow }S_{\infty}\). Thus we obtain, as a consequence, the Kolmogorov theorem on series convergence.
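As a quick illustration of the last statement (added here; it assumes numpy and takes ξ k =ε k /k with independent signs ε k =±1, so that Σσ k 2 =Σk −2 <∞), the partial sums below visibly stabilise along each simulated path.

```python
import numpy as np

rng = np.random.default_rng(3)
n_terms, n_paths = 100_000, 3
k = np.arange(1, n_terms + 1)

for i in range(n_paths):
    xi = rng.choice([-1.0, 1.0], size=n_terms) / k   # E xi_k = 0, Var xi_k = 1/k^2, summable
    S = np.cumsum(xi)
    print(f"path {i}:  S_1000 = {S[999]:+.5f},  S_10000 = {S[9_999]:+.5f},  S_100000 = {S[-1]:+.5f}")
```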

Example 15.4.1

Consider a branching process {Z n } (see Sect. 7.7). We know that Z n admits a representation

$$Z_n = \zeta_1 + \cdots+\zeta_{Z_{n-1}} , $$

where the ζ k are identically distributed integer-valued random variables independent of each other and of Z n−1, ζ k being the number of descendants of the k-th particle from the (n−1)-th generation. Assuming that Z 0=1 and setting μ:=E ζ k , we obtain

$$\mathbf{E}(Z_{n} | Z_{n-1}) = \mu Z_{n-1} , \qquad\mathbf{E}Z_n = \mu\mathbf{E}Z_{n-1} = \mu^n . $$

This implies that X n =Z n /μ n is a martingale, because

$$\mathbf{E}(X_n | X_{n-1}) = \mu^{1-n} Z_{n-1} = X_{n-1} . $$

For branching processes we have the following.

Theorem 15.4.3

The sequence X n =μ −n Z n converges almost surely to a proper random variable X with E X<∞. The ch.f. φ(λ) of the random variable X satisfies the equation

$$\varphi(\mu\lambda)=p\bigl(\varphi(\lambda)\bigr), $$

where \(p(v)=\mathbf{E}v^{\zeta_{k}}\).

Theorem 15.4.3 means that μ −n Z n has a proper limiting distribution as n→∞.

Proof

Since X n ≥0 and E X n =1, the first assertion follows immediately from Theorem 15.4.1.

Since \(\mathbf{E}z^{Z_{n}}\) is equal to the n-th iteration of the function f(z), for the ch.f. of Z n we have (φ η (λ):=E e iλη)

$$\varphi_{X_n}(\lambda) = \varphi_{Z_n}\bigl(\lambda\mu^{-n}\bigr) = p\bigl(\varphi_{Z_{n-1}}\bigl(\lambda\mu^{-n}\bigr)\bigr) = p\bigl(\varphi_{X_{n-1}}(\lambda/\mu)\bigr) . $$

Because X n X and the function p is continuous, from this we obtain the equation for the ch.f. of the limiting distribution X:

$$\varphi(\lambda)=p \biggl(\varphi \biggl(\frac{\lambda}{\mu} \biggr) \biggr). $$

The theorem is proved. □

In Sect. 7.7 we established that in the case μ≤1 the process Z n becomes extinct with probability 1 and therefore P(X=0)=1. We verify now that, for μ>1, the distribution of X is nondegenerate (not concentrated at zero). It suffices to prove that {X n ,0≤n≤∞} forms a martingale and consequently

$$\mathbf{E}X = \mathbf{E}X_n \ne0. $$

By Theorem 15.4.1, it suffices to verify that the sequence X n is uniformly integrable. To simplify the reasoning, we suppose that \(\operatorname{Var}(\zeta _{k} )= \sigma^{2} < \infty\) and show that then \(\mathbf{E}X_{n}^{2} < c <\infty\) (this certainly implies the required uniform integrability of X n , see Sect. 6.1). One can directly verify the identity

$$Z_n^2 - \mu^{2n} = \sum _{k=1}^n \ \bigl[Z_k^2 - ( \mu Z_{k-1})^2 \bigr] \mu^{2n-2k} . $$

Since \(\mathbf{E} [Z_{k}^{2} - (\mu Z_{k-1})^{2} | Z_{k-1} ] = \sigma^{2}Z_{k-1}\) (recall that \(\operatorname{Var}(\eta)= \mathbf{E}(\eta^{2} - (\mathbf{E}\eta)^{2})\)) and \(\mathbf{E}Z_{k-1} = \mu^{k-1}\), we have

$$\mathbf{E}Z_n^2 - \mu^{2n} = \sigma^2 \sum _{k=1}^n \mu^{k-1} \mu^{2n-2k} , \qquad \operatorname{Var}(X_n) = \frac{\mathbf{E}Z_n^2 - \mu^{2n}}{\mu^{2n}} = \frac{\sigma^2 (1 - \mu^{-n})}{\mu(\mu- 1)} , $$

so that \(\mathbf{E}X_n^2 \le1 + \sigma^2/(\mu(\mu-1)) < \infty\).

Thus we have proved that X is a nondegenerate random variable,

$$\mathbf{E}X = 1,\quad\operatorname{Var}(X_n) \to\frac{\sigma ^2}{\mu(\mu- 1)}. $$

From the last relation one can easily obtain that \(\operatorname{Var}( X) = \frac{\sigma^{2}}{\mu(\mu- 1)}\). To this end one can, say, prove that X n is a Cauchy sequence in mean quadratic and hence (see Theorem 6.1.3) \(X_{n} \stackrel{(2)}{\longrightarrow} X\).
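The limit theorems of this example are easy to observe in simulation. The sketch below (an added illustration, assuming numpy; the Poisson(μ) offspring law with μ=1.5 is a hypothetical choice, for which σ 2 =μ) exploits the fact that a sum of Z independent Poisson(μ) variables is Poisson(μZ), tracks X n =Z n /μ n , and compares the empirical variance with σ 2/(μ(μ−1)).

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma2, n_gen, n_paths = 1.5, 1.5, 25, 50_000   # Poisson(mu) offspring: variance sigma2 = mu

Z = np.ones(n_paths, dtype=np.int64)                # Z_0 = 1 in every path
for _ in range(n_gen):
    Z = rng.poisson(mu * Z)                         # sum of Z i.i.d. Poisson(mu) counts is Poisson(mu*Z)

X = Z / mu ** n_gen                                 # the martingale X_n = Z_n / mu^n
print(f"E X_n ~ {X.mean():.3f}    (stays near 1)")
print(f"Var X_n ~ {X.var():.3f}  (limit sigma^2/(mu(mu-1)) = {sigma2 / (mu * (mu - 1)):.3f})")
print(f"extinct fraction ~ {(Z == 0).mean():.3f}")
```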

15.5 Boundedness of the Moments of Stochastic Sequences

When one uses convergence theorems for martingales, conditions ensuring boundedness of the moments of stochastic sequences \(\{ X_{n}, \mathfrak{F}_{n}\}\) are of significant interest (recall that the boundedness of E X n is one of the crucial conditions for convergence of submartingales). The boundedness of the moments, in turn, ensures that X n is stochastically bounded, i.e., that sup n P(X n >N)→0 as N→∞. The last boundedness is also of independent interest in the cases where one is not able to prove, for the sequence {X n }, convergence or any other ergodic properties.

For simplicity’s sake, we confine ourselves to considering nonnegative sequences X n ≥0. Of course, if we could prove convergence of the distributions of X n to a limiting distribution, as was the case for Markov chains or submartingales in Theorem 15.4.1, then we would have a more detailed description of the asymptotic behaviour of X n as n→∞. This convergence, however, requires that the sequence X n satisfies stronger constraints than will be used below.

The basic and rather natural elements of the boundedness conditions to be considered below are: the boundedness of the moments of ξ n =X n −X n−1 of the respective orders and the presence of a negative “drift” \(\mathbf{E}(\xi_{n} | {\mathfrak{F}}_{n-1})\) in the domain X n−1>N for sufficiently large N. Such a property has already been utilised for Markov chains; see Corollary 13.7.1 (otherwise the trajectory of X n may go to ∞).

Let us begin with exponential moments. The simplest conditions ensuring the boundedness of \(\sup_{n} \mathbf{E}e^{\lambda X_{n}}\) for some λ>0 are as follows: for all n≥1 and some λ>0 and N<∞,

$$\begin{aligned}[cc] \mathbf{E}\bigl(e^{\lambda\xi_n} \big| \mathfrak{F}_{n-1} \bigr)\, \mathrm{I}(X_{n-1} > N) \leq\beta(\lambda) < 1 , \end{aligned}$$
(15.5.1)
$$\begin{aligned}[cc] \mathbf{E}\bigl(e^{\lambda\xi_n} \big| \mathfrak{F}_{n-1} \bigr)\, \mathrm{I}(X_{n-1} \leq N) \leq\psi(\lambda) < \infty. \end{aligned}$$
(15.5.2)

Theorem 15.5.1

If conditions (15.5.1) and (15.5.2) hold then

$$ \mathbf{E}\bigl(e^{\lambda X_n} \big| \mathfrak{F}_0 \bigr) \leq\beta^n(\lambda) e^{\lambda X_0} + \frac{\psi(\lambda) \, e^{\lambda N}}{1 - \beta(\lambda)} . $$
(15.5.3)

Proof

Denote by A n the left-hand side of (15.5.3). Then, by virtue of (15.5.1) and (15.5.2), we obtain

$$\begin{aligned} A_n =& \mathbf{E}\bigl\{ \mathbf{E}\bigl[e^{\lambda X_n} \bigl( \mathrm{I}(X_{n-1} > N) + \mathrm{I}(X_{n-1} \leq N)\bigr) \big| \mathfrak{F}_{n-1}\bigr] \big| \mathfrak {F}_0 \bigr\} \\\leq& \mathbf{E}\bigl[e^{\lambda X_{n-1}} \bigl(\beta(\lambda) \, \mathrm {I}(X_{n-1} > N) + \psi(\lambda) \, \mathrm{I}(X_{n-1} \leq N) \bigr) \big| \mathfrak {F}_0\bigr] \\\leq& \beta(\lambda) A_{n-1} + e^{\lambda N} \psi(\lambda) . \end{aligned}$$

This immediately implies that

$$A_n \leq A_0 \beta^n(\lambda) + e^{\lambda N} \psi(\lambda) \sum^{n-1}_{k=0} \beta^k(\lambda) \leq A_0 \beta^n(\lambda) + \frac{e^{\lambda N} \psi(\lambda)}{1 - \beta(\lambda)} . $$

The theorem is proved. □

The conditions

$$\begin{aligned}[cc] \mathbf{E}(\xi_n \mid\mathfrak{F}_{n-1})\, \mathrm{I}(X_{n-1} > N) \leq- \varepsilon\, \mathrm{I}(X_{n-1} > N) \quad\textit{for some}\ \varepsilon>0 , \end{aligned}$$
(15.5.4)
$$\begin{aligned}[cc] \mathbf{E}\bigl(e^{\lambda|\xi_n|} \big| \mathfrak{F}_{n-1} \bigr) \leq\psi _1(\lambda) < \infty \quad\textit{for some}\ \lambda>0 \end{aligned}$$
(15.5.5)

are sufficient for (15.5.1) and (15.5.2).

The first of conditions (15.5.4) and (15.5.5) means that Y n :=(X n +εn) I(X n−1>N) is a supermartingale.

We now prove sufficiency of (15.5.4) and (15.5.5). That (15.5.2) holds is clear. Further, make use of the inequality

$$e^x \leq1 + x + \frac{x^2}{2} e^{|x|} , $$

which follows from the Taylor formula for e x with the remainder in the Lagrange form:

$$e^x=1+x+\frac{x^2}{2}\,e^{\theta x},\quad\theta\in[0,1]. $$

Then, on the set {X n−1>N}, one has

$$\mathbf{E}\bigl(e^{\lambda\xi_n} \big| {\mathfrak{F}}_{n-1}\bigr) \leq1 - \lambda\varepsilon+ \frac{\lambda^2}{2} \mathbf{E}\bigl(\xi^2_n e^{\lambda|\xi_n|} \big| {\mathfrak{F}}_{n-1}\bigr). $$

Since \(x^{2} < e^{\lambda x/2}\) for all sufficiently large x, the Hölder inequality together with (15.5.5) implies that

$$\mathbf{E}\bigl(\xi^2_n e^{{\lambda|\xi_n|}/{2}} \big| {\mathfrak {F}}_{n-1}\bigr) \leq\psi_2 (\lambda) < \infty. $$

This implies that, for sufficiently small λ, one has on the set {X n−1>N} the inequality

$$\mathbf{E}\bigl(e^{\lambda\xi_n} \big| {\mathfrak{F}}_{n-1}\bigr) \leq1 - \lambda\varepsilon+ \frac{\lambda^2}{2} \psi_2(\lambda) =: \beta( \lambda) \leq 1 - \frac{\lambda\varepsilon}{2} < 1 . $$

This proves (15.5.1). □
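Conditions (15.5.4) and (15.5.5) hold, for instance, for the recursion X n =(X n−1 +ζ n )+ with increments ζ n having a negative mean and a finite exponential moment. The sketch below (an added illustration, assuming numpy; ζ=Exp(1)−2 and λ=0.2 are hypothetical choices) shows E e λX n settling down to a finite level, in agreement with Theorem 15.5.1.

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_steps, lam = 50_000, 300, 0.2     # hypothetical parameters

# X_n = (X_{n-1} + zeta_n)^+ with zeta = Exp(1) - 2: negative drift, finite exponential moments
X = np.zeros(n_paths)
for n in range(1, n_steps + 1):
    X = np.maximum(X + rng.exponential(1.0, size=n_paths) - 2.0, 0.0)
    if n in (1, 10, 100, 300):
        print(f"n = {n:3d}:  E exp(lam * X_n) ~ {np.mean(np.exp(lam * X)):.4f}")
```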

Corollary 15.5.1

If, in addition to the conditions of Theorem 15.5.1, the distribution of X n converges to a limiting distribution: P(X n <t)⇒P(X<t), then

$$\mathbf{E}e^{\lambda X} \leq\frac{e^{\lambda N} \psi(\lambda)}{1 - \beta(\lambda)} . $$

The corollary follows

from the Fatou–Lebesgue theorem (see also Lemma 6.1.1):

$$\begin{aligned} \mathbf{E}e^{\lambda X} \leq\liminf_{n \to\infty} \mathbf {E}e^{\lambda X_n} . \end{aligned}$$

 □

We now obtain bounds for “conventional” moments. Set

$$M^u(n) := \mathbf{E}X^u_n , \qquad m(u) := \sup_n \operatorname*{ess\,sup}_{\{X_{n-1} > N\}} \mathbf{E}\bigl(\xi^u_n \big| \mathfrak{F}_{n-1}\bigr) . $$

Theorem 15.5.2

Assume that \(\mathbf{E}X^{s}_{0} < \infty\) for some s>1 and there exist N≥0 and ε>0 such that

$$\begin{aligned}[cc] m(1) \leq- \varepsilon, \end{aligned}$$
(15.5.6)
$$\begin{aligned}[cc] m(s) < c < \infty. \end{aligned}$$
(15.5.7)

Then

$$ \liminf_{n \to\infty} M^{s-1}(n) < \infty. $$
(15.5.8)

If, moreover,

$$ M^s(n + 1) > M^s(n) - c_1 $$
(15.5.9)

for some c 1>0, then

$$ \sup_n M^{s-1}(n) < \infty. $$
(15.5.10)

Corollary 15.5.2

If conditions (15.5.6) and (15.5.7) are met and the distribution of X n converges weakly to a limiting distribution: P(X n <t)⇒P(X<t), then E X s−1<∞.

This assertion follows from the Fatou–Lebesgue theorem

(see also Lemma 6.1.1), which implies

$$\begin{aligned} \mathbf{E}X^{s-1} \leq\liminf_{n \to\infty} \mathbf{E}X^{s-1}_n. \end{aligned}$$

 □

The assertion of Corollary 15.5.2 is unimprovable. One can see this from the example of the sequence X n =(X n−1+ζ n )+, where \(\zeta_{k}\stackrel{d}{=}\zeta\) are independent and identically distributed. If E ζ k <0 then the limiting distribution of X n coincides with the distribution of S=sup k S k (see Sect. 12.4). From factorisation identities one can derive that E S s−1 is finite if and only if E(ζ +)s<∞. An outline of the proof is as follows. Theorem 12.3.2 implies that \(\mathbf{E} S^{k} = c \, \mathbf{E}(\chi^{k}_{+} ; \, \eta_{+} < \infty)\), c=const<∞. It follows from Corollary 12.2.2 that

$$1 - \mathbf{E}\bigl(e^{i \lambda\chi_+} ; \, \eta_+ < \infty\bigr) = \bigl(1 - \mathbf{E}e^{i \lambda\zeta}\bigr) \int^{\infty}_0 e^{- i \lambda x} \, d H (x) , $$

where H(x) is the renewal function for the random variable \(-\chi^{0}_{-} \geq0\). Since

$$a_1 + b_1 x \leq H(x) \leq a_2 + b_2 x $$

(see Theorem 10.1.1 and Lemma 10.1.1; a i , b i are constants), integrating the convolution

$$\mathbf{P}(\chi_+ > x , \, \eta_+ < \infty) = \int^{\infty}_0 \mathbf{P}(\zeta> v + x) \, d H(v) $$

by parts we verify that, as x→∞, the left-hand side has the same order of magnitude as \(\int^{\infty}_{0} \mathbf{P}(\zeta> v + x) \, dv\). Hence the required statement follows.

We now return to Theorem 15.5.2. Note that in all of the most popular problems the sequence M s−1(n) behaves “regularly”: either it is bounded or M s−1(n)→∞. Assertion (15.5.8) means that, under the conditions of Theorem 15.5.2, the second possibility is excluded. Condition (15.5.9) ensuring (15.5.10) is also rather broad.

Proof of Theorem 15.5.2

For simplicity’s sake, let s>1 be an integer. We have

$$\begin{aligned} \mathbf{E}\bigl(X^s_n ;\, X_{n-1} > N\bigr) =& \int ^{\infty}_N \mathbf{E}\bigl((x + \xi_n)^s ; \, X_{n-1} \in dx\bigr) \\=& \sum ^s_{l=0} \binom{s }{ l} \int ^{\infty}_N x^l \mathbf{E}\bigl(\xi_n^{s-l} ; \, X_{n-1} \in dx\bigr) . \end{aligned}$$

If we replace \(\xi^{s-l}_{n}\) for sl≥2 with |ξ n |sl then the right-hand side can only increase. Therefore,

$$\mathbf{E}\bigl(X^s_n ; \, X_{n-1} > N\bigr) \leq\sum^s_{l=0} \binom{s}{ l} m( s- l) M^l_N (n - 1) , $$

where

$$M^l_N(n) = \mathbf{E}\bigl(X^l_n ; \, X_n > N\bigr) . $$

The moments \(M^{s}(n) = \mathbf{E}X^{s}_{n}\) satisfy the inequalities

$$ \begin{aligned}[b] M^s(n) \leq& \mathbf{E}\bigl[\bigl( N + | \xi_n |\bigr)^s ; \, X_{n-1} \leq N\bigr] + \sum^s_{l=0} \binom{s }{ l} m(s-l) M^l_N( n - 1) \\\leq& 2^s \bigl[N^s + c\bigr] + \sum^s_{l=0} \binom{s }{ l} m(s - l) M^l_N(n - 1) . \end{aligned} $$
(15.5.11)

Suppose now that (15.5.8) does not hold: M s−1(n)→∞. Then all the more M s(n)→∞ and there exists a subsequence n′ such that M s(n′)>M s(n′−1). Since \(M^{l}(n)\le[M^{l+1}(n)]^{l/(l+1)}\), we obtain from (15.5.6) and (15.5.11) that

$$\begin{aligned} M^s\bigl(n'\bigr) \leq& \mbox{const} + M^s\bigl(n' - 1\bigr) + s M^{s-1} \bigl(n' - 1\bigr) m(1) + o \bigl(M^{s-1} \bigl(n' - 1\bigr)\bigr) \\\leq& M^s\bigl(n' - 1\bigr) - \frac{1}{2} s \varepsilon M^{s-1}\bigl(n' - 1\bigr) \end{aligned}$$

for sufficiently large n′. This contradicts the assumption that M s(n)→∞ and hence proves (15.5.8).

We now prove (15.5.10). If this relation is not true then there exists a sequence n′ such that M s−1(n′)→∞ and M s(n′)>M s(n′−1)−c 1. It remains to make use of the above argument.

We leave the proof for a non-integer s>1 to the reader (the changes are elementary). The theorem is proved. □
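The effect described by Theorem 15.5.2 can also be seen numerically. In the sketch below (an added illustration, assuming numpy; the increments ζ=T−2 with T Pareto of index 4 on [1,∞) are a hypothetical choice with negative mean and E|ζ|3 <∞, i.e. s=3), the moments E X n s−1 =E X n 2 of the recursion X n =(X n−1 +ζ n )+ remain bounded in n.

```python
import numpy as np

rng = np.random.default_rng(7)
n_paths, n_steps, s = 100_000, 400, 3

# zeta = T - 2, T = Pareto(index 4) on [1, inf): E zeta = 4/3 - 2 < 0 and E|zeta|^3 < infinity
X = np.zeros(n_paths)
for n in range(1, n_steps + 1):
    T = rng.pareto(4.0, size=n_paths) + 1.0
    X = np.maximum(X + T - 2.0, 0.0)
    if n in (1, 10, 100, 400):
        print(f"n = {n:3d}:  E X_n^(s-1) = E X_n^2 ~ {np.mean(X ** (s - 1)):.3f}")
```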

Remark 15.5.1

(1) The assertions of Theorems 15.5.1 and 15.5.2 will remain valid if one requires inequalities (15.5.4) or \(\mathbf{E}(\xi_{n} + \varepsilon| {\mathfrak{F}}_{n-1})\,\mathrm{I}(X_{n-1} > N) \leq0\) to hold not for all n, but only for nn 0 for some n 0>1.

(2) As in Theorem 15.5.1, condition (15.5.6) means that the sequence of random variables (X n +εn) I(X n−1>N) forms a supermartingale.

(3) The conditions of Theorems 15.5.1 and 15.5.2 may be weakened by replacing them with “averaged” conditions. Consider, for instance, condition (15.5.1). By integrating it over the set {X n−1>x>N} we obtain

$$\mathbf{E}\bigl(e^{\lambda\xi_n} ;\, X_{n-1} > x\bigr) \leq\beta( \lambda) \mathbf{P}(X_{n-1} > x) $$

or, which is the same,

$$ \mathbf{E}\bigl(e^{\lambda\xi_n} \big| X_{n-1} > x \bigr) \leq\beta(\lambda) . $$
(15.5.12)

The converse assertion that (15.5.12) for all x>N implies relation (15.5.1) is obviously false, so that condition (15.5.12) is weaker than (15.5.1). A similar remark is true for condition (15.5.4).

One has the following generalisations of Theorems 15.5.1 and 15.5.2 to the case of “averaged conditions”.

Theorem 15.5.1A

Let, for some λ>0, N>0 and all xN,

$$\mathbf{E}\bigl(e^{\lambda\xi_n} \big| X_{n-1} > x\bigr) \leq\beta( \lambda) < 1 , \qquad \mathbf{E}\bigl(e^{\lambda\xi_n} ; \, X_{n-1} \leq N \bigr) \leq\psi(\lambda ) < \infty. $$

Then

$$\mathbf{E}e^{\lambda X_n} \leq\beta^n(\lambda) \, \mathbf{E}e^{\lambda X_0} + \frac{e^{\lambda N} \psi(\lambda)}{1 - \beta(\lambda)} . $$

Put

$$\overline{m}(u) := \sup_n \sup_{x > N} \mathbf{E}\bigl(\xi^u_n \big| X_{n-1} > x\bigr) . $$

Theorem 15.5.2A

Let \(\mathbf{E}X^{s}_{0} < \infty\) and there exist N≥0 and ε>0 such that

$$\overline{m}(1) \leq- \varepsilon, \qquad\overline{m}(s) < \infty, \qquad \mathbf{E}\bigl(|\xi_n|^s ; \, X_{n-1} \leq N \bigr) < c < \infty. $$

Then (15.5.8) holds true. If, in addition, (15.5.9) is valid, then (15.5.10) is true.

The proofs of Theorems 15.5.1A and 15.5.2A

are quite similar to those of Theorems 15.5.1 and 15.5.2. The only additional element in both cases is integration by parts. We will illustrate this with the proof of Theorem 15.5.1A. Consider

$$\begin{aligned} \mathbf{E}\bigl(e^{\lambda X_n} ; \, X_{n-1} > N\bigr) =& \int ^{\infty}_N e^{\lambda x}\, \mathbf{E}\bigl(e^{\lambda\xi_n} ; \, X_{n-1} \in dx\bigr) = - \int ^{\infty}_N e^{\lambda x}\, d_x \mathbf{E}\bigl(e^{\lambda\xi_n} ; \, X_{n-1} > x\bigr) \\ \le& \beta(\lambda) \biggl[ e^{\lambda N}\, \mathbf{P}(X_{n-1} > N) + \lambda\int ^{\infty}_N e^{\lambda x}\, \mathbf{P}(X_{n-1} > x)\, dx \biggr] = \beta(\lambda)\, \mathbf{E}\bigl(e^{\lambda X_{n-1}} ; \, X_{n-1} > N\bigr) . \end{aligned}$$

From this we find that

$$\begin{aligned} \beta_n(\lambda) :=& \mathbf{E}e^{\lambda X_n} \leq \mathbf{E} \bigl(e^{\lambda(X_{n-1} + \xi_n)} ; \, X_{n-1} \leq N\bigr) + \mathbf{E} \bigl(e^{\lambda X_n} ; X_{n - 1} > N\bigr) \\\le& e^{\lambda N} \psi(\lambda) + \beta(\lambda) \, \mathbf{E} \bigl(e^{\lambda X_{n-1}} ; \, X_{n-1} > N\bigr) \\\le& e^{\lambda N} \psi(\lambda) - \mathbf{P}(X_{n-1} \leq N) \beta(\lambda) + \beta(\lambda) \beta_{n-1}(\lambda) ; \\\beta_n(\lambda) \leq& \beta^n(\lambda) \beta_0(\lambda) + \frac{e^{\lambda N} \psi(\lambda)}{1 - \beta(\lambda)} . \end{aligned}$$

 □

Note that Theorem 13.7.2 and Corollary 13.7.1 on “positive recurrence” can also be referred to as theorems on boundedness of stochastic sequences.