
21.1 Definitions and General Properties

Markov processes in discrete time (Markov chains) were considered in Chap. 13. Recall that their main property was independence of the “future” of the process of its “past” given its “present” is fixed. The same principle underlies the definition of Markov processes in the general case.

21.1.1 Definition and Basic Properties

Let \(\langle\varOmega,\mathfrak{F},\mathbf{P}\rangle\) be a probability space and {ξ(t)=ξ(t,ω), t≥0} a random process given on it. Set

$$\mathfrak{F}_t :=\sigma\bigl(\xi(u);\,u \le t\bigr),\qquad\mathfrak {F}_{[t,\infty)} :=\sigma\bigl(\xi (u);\, u \ge t\bigr), $$

so that the variable ξ(u) is \(\mathfrak{F}_{t}\)-measurable for \(u \le t\) and \(\mathfrak{F}_{[t,\infty)}\)-measurable for \(u \ge t\). The σ-algebra \(\sigma(\mathfrak{F}_{t} ,\mathfrak{F}_{[t,\infty)})\) is generated by the variables ξ(u) for all u and may coincide with \(\mathfrak{F}\) in the case of the sample probability space.

Definition 21.1.1

We say that ξ(t) is a Markov process if, for any t, \(A \in\mathfrak{F}_{t}\), and \(B \in\mathfrak{F}_{[t,\infty)}\), we have

$$ \mathbf{P}\bigl(AB \big| \xi(t)\bigr) =\mathbf{P}\bigl(A \big| \xi(t)\bigr)\mathbf {P}\bigl(B \big| \xi (t)\bigr). $$
(21.1.1)

This expresses precisely the fact that the future is independent of the past when the present is fixed (conditional independence of \(\mathfrak{F}_{t}\) and \(\mathfrak{F}_{[t,\infty)}\) given ξ(t)).

We will now show that the above definition is equivalent to the following.

Definition 21.1.2

We say that ξ(t) is a Markov process if, for any bounded \(\mathfrak{F}_{[t,\infty)}\)-measurable random variable η,

$$ \mathbf{E}(\eta| \mathfrak{F}_t)=\mathbf{E}\bigl( \eta\big| \xi(t)\bigr). $$
(21.1.2)

It suffices to take η to be functions of the form η=f(ξ(s)) for \(s \ge t\).

Proof of the equivalence

Let (21.1.1) hold. By the monotone convergence theorem it suffices to prove (21.1.2) for simple functions η. To this end it suffices, in turn, to prove (21.1.2) for η=I B , the indicator of the set \(B \in\mathfrak{F}_{[t ,\infty)}\). Let \(A \in\mathfrak{F}_{t}\). Then, by (21.1.1),

$$ \begin{aligned}[b] \mathbf{P}(AB)& = \mathbf{E}\mathbf{P}\bigl(AB \big| \xi(t)\bigr) =\mathbf{E}\bigl[\mathbf{P}\bigl(A \big| \xi(t)\bigr)\mathbf{P}\bigl(B \big| \xi(t)\bigr)\bigr] \\& = \mathbf{E}\mathbf{E}\bigl[\mathrm{I}_A \mathbf{P}\bigl(B \big| \xi(t)\bigr) \big| \xi(t)\bigr] = \mathbf{E}\bigl[ \mathrm{I}_A \mathbf{P}\bigl(B \big| \xi(t)\bigr)\bigr]. \end{aligned} $$
(21.1.3)

On the other hand,

$$ \mathbf{P}(AB) =\mathbf{E}[\mathrm{I}_A \mathrm{I}_B] =\mathbf{E}\bigl[ \mathrm{I}_A \mathbf{P}(B | \mathfrak{F}_t)\bigr]. $$
(21.1.4)

Because (21.1.3) and (21.1.4) hold for any \(A \in \mathfrak{F}_{t}\), this means that \(\mathbf{P}(B | \mathfrak{F}_{t} ) =\mathbf{P}(B | \xi(t))\).

Conversely, let (21.1.2) hold. Then, for \(A \in\mathfrak{F}_{t}\) and \(B \in\mathfrak{F}_{[t ,\infty)}\), we have

$$\begin{aligned} \mathbf{P}\bigl(AB \big| \xi(t)\bigr) =& \mathbf{E}\bigl[ \mathbf{E}( \mathrm{I}_A \mathrm{I}_B | \mathfrak {F}_t )\big| \xi(t)\bigr] = \mathbf{E}\bigl[\mathrm{I}_A \mathbf{E}( \mathrm{I}_B | \mathfrak {F}_t )\big| \xi(t)\bigr] \\=& \mathbf{E}\bigl[\mathrm{I}_A \mathbf{E}\bigl( \mathrm{I}_B \big| \xi(t) \bigr)\big| \xi (t)\bigr]= \mathbf{P}\bigl(B \big| \xi(t)\bigr) \mathbf{P}\bigl(A \big| \xi(t)\bigr) . \end{aligned}$$

 □

It remains to verify that it suffices to take η=f(ξ(s)), \(s \ge t\), in (21.1.2). In order to do this, we need one more equivalent definition of a Markov process.

Definition 21.1.3

We say that ξ(t) is a Markov process if, for any bounded function f and any \(t_1 < t_2 < \cdots < t_n \le t\),

$$ \mathbf{E}\bigl(f\bigl(\xi(t)\bigr) \big| \xi(t_1 ), \ldots,\xi(t_n)\bigr) = \mathbf {E}\bigl(f\bigl(\xi(t)\bigr) \big| \xi(t_n)\bigr). $$
(21.1.5)

Proof of the equivalence

Relation (21.1.5) follows in an obvious way from (21.1.2). Now assume that (21.1.5) holds. Then, for any \(A \in \sigma(\xi(t_1),\ldots,\xi(t_n))\),

$$ \mathbf{E}\bigl(f\bigl(\xi(t)\bigr) ;\, A\bigr)=\mathbf{E}\bigl[ \mathbf{E}\bigl(f\bigl(\xi(t)\bigr) \big| \xi (t_n)\bigr);\,A\bigr]. $$
(21.1.6)

Both parts of (21.1.6) are measures coinciding on the algebra of cylinder sets. Therefore, by the theorem on uniqueness of extension of a measure, they coincide on the σ-algebra generated by these sets, i.e. on \(\mathfrak{F}_{t_{n} }\). In other words, (21.1.6) holds for any \(A \in\mathfrak{F}_{t_{n} }\), which is equivalent to the equality

$$\mathbf{E}\bigl[f \bigl(\xi(t)\bigr) \big| \mathfrak{F}_{t_n }\bigr] = \mathbf{E}\bigl[f\bigl(\xi (t)\bigr) \big| \xi(t_n )\bigr] $$

for any \(t_n \le t\). Relation (21.1.2) for η=f(ξ(t)) is proved. □

We now prove that in (21.1.2) it suffices to take η=f(ξ(s)), \(s \ge t\). Let \(t \le u_1 < \cdots < u_n\). We prove that then (21.1.2) is true for

$$ \eta= \prod_{i=1}^n f_i \bigl(\xi(u_i)\bigr). $$
(21.1.7)

We will make use of induction and assume that equality (21.1.2) holds for the functions

$$\gamma=\prod_{i=1}^{n-1} f_i \bigl(\xi(u_i)\bigr) $$

(for n=1 relation (21.1.2) is true). Then, putting \(g(\xi(u_{n-1})) :=\mathbf{E}[f_n (\xi(u_n )) \mid \xi(u_{n-1})]\), we obtain

$$\begin{aligned} \mathbf{E}(\eta| \mathfrak{F}_t ) =& \mathbf{E}\bigl[ \mathbf{E}(\eta| \mathfrak{F}_{u_{n-1}} ) \big| \mathfrak{F}_t \bigr] = \mathbf{E}\bigl[\gamma \mathbf{E}\bigl(f_n \bigl( \xi(u_n )\bigr) \big| \mathfrak {F}_{u_{n-1}} \bigr)\big| \mathfrak{F}_t\bigr] \\=& \mathbf{E}\bigl[\gamma \mathbf{E}\bigl(f_n \bigl(\xi(u_n )\bigr) \big| \xi(u_{n-1})\bigr) \big| \mathfrak{F}_t\bigr] =\mathbf{E}\bigl[\gamma g \bigl( \xi(u_{n-1} )\bigr) \big| \mathfrak{F}_t\bigr] . \end{aligned}$$

By the induction hypothesis this implies that

$$\mathbf{E}(\eta| \mathfrak{F}_t )=\mathbf{E}\bigl[\gamma g \bigl(\xi (u_{n-1} )\bigr) \big| \xi(t)\bigr] $$

and, therefore, that \(\mathbf{E}(\eta| \mathfrak{F}_{t} )\) is σ(ξ(t))-measurable and

$$\mathbf{E}\bigl(\eta\big| \xi(t) \bigr)=\mathbf{E}\bigl( \mathbf{E}(\eta| \mathfrak{F}_t )\big| \xi(t)\bigr) =\mathbf{E}(\eta| \mathfrak{F}_t ). $$

We proved that (21.1.2) holds for \(\sigma(\xi(u_1),\ldots,\xi(u_n))\)-measurable functions of the form (21.1.7). By passing to the limit we establish first that (21.1.2) holds for simple functions, and then that it holds for any bounded \(\mathfrak{F}_{[t ,\infty)}\)-measurable function. □

21.1.2 Transition Probability

We saw that, for a Markov process ξ(t), the conditional probability

$$\mathbf{P}\bigl(\xi(t) \in B \big| \mathfrak{F}_s \bigr) = \mathbf{P}\bigl(\xi(t) \in B \big| \xi(s)\bigr) \quad\mbox{for } t>s $$

is a Borel function of ξ(s) which we will denote by

$$P \bigl(s,\xi(s) ;t,B\bigr) := \mathbf{P}\bigl(\xi(t) \in B \big| \xi(s)\bigr) . $$

One can say that P(s,x;t,B) as a function of B and x is the conditional distribution (see Sect. 4.9) of ξ(t) given that ξ(s)=x. By the Markov property, it satisfies the relation (s<u<t)

$$ P (s,x; t,B) =\int P (s,x; u,dy) P (u,y; t,B), $$
(21.1.8)

which follows from the equality

$$\mathbf{P}\bigl(\xi(t) \in B \big| \xi(s)\bigr) =\mathbf{E}\bigl[\mathbf{P}\bigl(\xi(t) \in B \big| \mathfrak{F}_u \bigr) \big| \xi(s)\bigr] =\mathbf{E}\bigl[P\bigl(u,\xi(u); t,B\bigr) \big| \xi(s)\bigr],\quad s<u<t. $$

Equation (21.1.8) is called the Chapman–Kolmogorov equation.

The function P(s,x;t,B) can be used in an analytic definition of a Markov process. First we need to clarify what properties a function P x,B (s,t) should possess in order that there exists a Markov process ξ(t) for which

$$P_{x,B} (s,t) =P (s,x; t,B). $$

Let \(\langle\mathcal{X},\mathfrak{B}_{\mathcal{X}}\rangle\) be a measurable space.

Definition 21.1.4

A function P x,B (s,t) is said to be a transition function on \(\langle\mathcal{X},\mathfrak{ B}_{\mathcal{X}}\rangle\) if it satisfies the following conditions:

(1) As a function of B, \(P_{x,B}(s,t)\) is a probability distribution for each \(s \le t\), \(x\in\mathcal{X}\).

(2) \(P_{x,B}(s,t)\) is measurable in x for each \(s \le t\) and \(B \in\mathfrak{B}_{\mathcal{X}}\).

(3) For 0≤s<u<t and all x and B,

$$P_{x,B} (s,t) =\int P_{x,dy} (s,u) P_{y,B} (u,t) $$

(the Chapman–Kolmogorov equation).

(4) \(P_{x,B}(s,t)=\mathrm{I}_B(x)\) for s=t.

Here properties (1) and (2) ensure that P x,B (s,t) can be a conditional distribution (cf. Sect. 4.9).

Now define, with the help of \(P_{x,B}(s,t)\), the finite-dimensional distributions of a process ξ(t) with the initial condition ξ(0)=a by the formula

$$\mathbf{P}\bigl(\xi(t_1)\in B_1,\ldots,\xi(t_n)\in B_n\bigr) =\int_{B_1} P_{a,dy_1}(0,t_1)\int_{B_2} P_{y_1,dy_2}(t_1,t_2)\cdots\int_{B_n} P_{y_{n-1},dy_n}(t_{n-1},t_n). $$
(21.1.9)

By virtue of properties (3) and (4), these distributions are consistent and therefore, by the Kolmogorov theorem, define a process ξ(t) in \(\langle\mathbb{R}^{T} ,\mathfrak{B}_{\mathbb{R}}^{T} \rangle\), where T=[0,∞).

By formula (21.1.9) and rule (21.1.5),

$$\mathbf{P}\bigl(\xi(t_n)\in B_n \big| \xi(t_1),\ldots,\xi(t_{n-1})\bigr) =P_{\xi(t_{n-1}),B_n}(t_{n-1},t_n) =\mathbf{P}\bigl(\xi(t_n)\in B_n \big| \xi(t_{n-1})\bigr). $$

We could also verify this equality in a more formal way using the fact that the integrals of both sides over the set \(\{\xi(t_1)\in B_1,\ldots,\xi(t_{n-1})\in B_{n-1}\}\) coincide.

Thus, by virtue of Definition 21.1.3, we have constructed a Markov process ξ(t) for which

$$P (s ,x; t,B) =P_{x,B} (s,t). $$

This function will also be called the transition function (or transition probability) of the process ξ(t).

Definition 21.1.5

A Markov process ξ(t) is said to be homogeneous if P(s,x;t,B), as a function of s and t, depends on the difference ts only:

$$P (s ,x; t,B)=P (t-s; x,B). $$

This is the probability of transition during a time interval of length ts from x to B. If

$$P (u; x,B) =\int_B p(u; x,y) \,dy $$

then the function p(u;x,y) is said to be a transition density.

It is not hard to see that the Wiener and Poisson processes are both homogeneous Markov processes. For example, for the Wiener process,

$$p(u;x,y) =\frac{1}{\sqrt{2\pi u}} e^{-{(x-y)^2}/{2u}} . $$

21.2 Markov Processes with Countable State Spaces. Examples

21.2.1 Basic Properties of the Process

Assume without loss of generality that the “discrete state space” \(\mathcal{X}\) coincides with the set of integers {0,1,2,…}. For simplicity’s sake we will only consider homogeneous Markov processes.

The transition function of such a process is determined by the collection of functions P(t;i,j)=p ij (t) which form a stochastic matrix P(t)=∥p ij (t)∥ (with p ij (t)≥0, ∑ j p ij (t)=1). Chapman–Kolmogorov’s equation now takes the form

$$p_{ij} (t+s) =\sum_k p_{ik} (t) p_{kj} (s), $$

or, which is the same, in the matrix form,

$$ P(t+s) =P (t) P(s) =P (s) P (t) . $$
(21.2.1)

In what follows, we consider only stochastically continuous processes for which \(\xi(t+s)\stackrel{p}{\to}\xi(t)\) as s→0, which is equivalent in the case under consideration to each of the following three relations:

$$ \mathbf{P}\bigl(\xi(t+s) \ne\xi(t)\bigr) \to0 ,\qquad P(t+s) \to P(t),\qquad P (s) \to P (0) \equiv E $$
(21.2.2)

as s→0 (component-wise; E is the unit matrix).

We will also assume that convergence in (21.2.2) is uniform (for a finite \(\mathcal{X}\) this is always the case).

According to the separability requirement, we will assume that ξ(t) cannot change its state in “zero time” more than once (thus excluding the effects illustrated in Example 18.1.1, i.e. assuming that if ξ(t)=j then, with probability 1, ξ(t+s)=j for s∈[0,τ), τ=τ(ω)>0). In that case, the trajectories of the processes will be piece-wise constant (right-continuous for definiteness), i.e. the time axis is divided into half-intervals [0,τ 1), [τ 1,τ 1+τ 2),… , on which ξ(t) is constant. Put

$$q_j (t) :=\mathbf{P}\bigl(\xi(u) =j ,\,0 \le u<t \big| \xi(0) =j\bigr) = \mathbf {P}(\tau_1 \ge t). $$

Theorem 21.2.1

Under the above assumptions (stochastic continuity and separability),

$$q_i (t) =e^{-q_i t} , $$

where q i <∞; moreover, q i >0 if \(p_{ii} (t) \not\equiv1\). There exist the limits

$$ \lim_{t \to0} \frac{1-p_{ii} (t)}{t} =q_i ,\qquad\lim_{t \to0} \frac{p_{ij} (t)}{t} =q_{ij},\quad i \ne j, $$
(21.2.3)

where \(\sum_{j: j\ne i} q_{ij} =q_i\).

Proof

By the Markov property,

$$q_i (t+s) =q_i (t) q_i (s), $$

and q i (t)↓. Therefore there exists a unique solution \(q_{i} (t) =e^{-q_{i} t} \) of this equation, where q i <∞, since P(τ 1>0)=1 and q i >0, because q i (t)<1 when \(p_{ii} (t) \not\equiv1\).

Let further \(0<t_0<t_1<\cdots<t_n<t\). Since the events

$$\bigl\{ \xi(u) =i\ \mbox{for } u\le t_r,\xi(t_{r+1}) =j \bigr\}, \quad r=0,\ldots,n-1;\ j\ne i , $$

are disjoint,

$$ p_{ii} (t) =q_i (t) +\sum _{r=0}^{n-1} \sum_{j: j\ne i} q_i (t_r ) p_{ij} (t_{r+1} -t_r) p_{ji} (t-t_{r+1}). $$
(21.2.4)

Here, by condition (21.2.2), \(p_{ji}(t-t_{r+1})<\varepsilon_t\) for all j≠i, and \(\varepsilon_t\to0\) as t→0, so that the sum in (21.2.4) does not exceed

$$\varepsilon_t \sum_{r=0}^{n-1} \sum_{j: j\ne i} q_i (t_r ) p_{ij} (t_{r+1} -t_r) \le\varepsilon_t \bigl(1-q_i (t)\bigr). $$

Together with the obvious inequality \(p_{ii}(t)\ge q_i(t)\) this gives

$$1-q_i (t) \ge1-p_{ii} (t) \ge\bigl(1-q_i (t) \bigr) (1-\varepsilon_t) $$

(i.e. the asymptotic behaviour of \(1-q_i(t)\) and \(1-p_{ii}(t)\) as t→0 is identical). This implies the second assertion of the theorem (i.e., the first relation in (21.2.3)).

Now let t r :=rt/n. Consider the transition probabilities

$$\begin{aligned} p_{ij} (t) \ge& \sum_{r=0}^{n-1} q_i (t_r) p_{ij}({t}/{n}) q_j (t-t_{r+1}) \\\ge& (1-\varepsilon_t) p_{ij} ( {t}/{n}) \sum _{r=0}^{n-1} e^{-q_i rt/{n}} \ge(1- \varepsilon_t) p_{ij} ( {t}/{n}) \frac{(1-e^{-q_i t})n}{{q_i }{t}} . \end{aligned}$$

This implies that

$$p_{ij} (t) \ge(1-\varepsilon_t) \biggl( \frac{1-e^{-q_i t}}{q_i } \biggr) \limsup_{\delta\to0} \frac{p_{ij} (\delta)}{\delta}, $$

and that the upper limit on the right-hand side is bounded. Passing to the limit as t→0, we obtain

$$\liminf_{t\to0} \frac{p_{ij} (t )}{t} \ge\limsup _{\delta\to0} \frac{p_{ij} (\delta)}{\delta}. $$

Since \(\sum_{j: j\ne i} p_{ij}(t)=1-p_{ii}(t)\), we have \(\sum_{j: j\ne i} q_{ij} =q_i\). The theorem is proved. □

The theorem shows that the quantities

$$p_{ij} =\frac{q_{ij}}{q_i} ,\quad j \ne i , \qquad p_{ii} =0 $$

form a stochastic matrix and give the probabilities of transition from i to j during an infinitesimal time interval Δ given the process ξ(t) left the state i during that time interval:

$$\mathbf{P} \bigl(\xi(t+\varDelta) =j \big| \xi(t) =i ,\, \xi (t+\varDelta) \ne i \bigr) = \frac{p_{ij} (\varDelta)}{1-p_{ii} (\varDelta)} \to\frac{q_{ij}}{q_i} $$

as Δ→0.

Thus the evolution of ξ(t) can be thought of as follows. If \(\xi(0)=X_0\), then ξ(t) stays at \(X_0\) for a random time \(\tau_{1} \mathbin {{\subset }\hspace {-.7em}{=}}\boldsymbol {\Gamma }_{q_{X_{0} }}\). Then ξ(t) passes to a state \(X_1\) with probability \(p_{X_{0} X_{1}}\). Further, \(\xi(t)=X_1\) over the time interval \([\tau_1,\tau_1+\tau_2)\), \(\tau_{2}\mathbin {{\subset }\hspace {-.7em}{=}}\boldsymbol {\Gamma }_{q_{X_{1} }}\), after which the system changes its state to \(X_2\) and so on. It is clear that \(X_0,X_1,\ldots\) is a homogeneous Markov chain with the transition matrix \(\|p_{ij}\|\). Therefore the further study of ξ(t) can be reduced in many aspects to that of the Markov chain \(\{X_n;\,n\ge0\}\), which was carried out in detail in Chap. 13.
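The two-step description just given (an exponential sojourn with parameter \(q_i\), then a jump drawn from \(\|p_{ij}\|\)) translates directly into a simulation procedure. The following is a minimal Python sketch; the three-state generator Q and all numerical values are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 3-state generator Q (an assumption, not from the text):
# q_ij >= 0 off the diagonal, q_ii = -q_i, every row sums to 0.
Q = np.array([[-2.0, 1.0, 1.0],
              [0.5, -1.0, 0.5],
              [1.0, 3.0, -4.0]])

def simulate_ctmc(Q, i0, t_max):
    """Simulate xi(t): stay in state i an exponential time with
    parameter q_i, then jump to j != i with probability q_ij / q_i."""
    i, t = i0, 0.0
    jumps = [(0.0, i0)]                   # (jump epoch, new state)
    while True:
        q_i = -Q[i, i]
        t += rng.exponential(1.0 / q_i)   # sojourn time ~ Gamma_{q_i}
        if t >= t_max:
            return jumps
        p = np.maximum(Q[i], 0.0) / q_i   # embedded-chain row p_ij
        i = rng.choice(len(p), p=p)
        jumps.append((t, i))

print(simulate_ctmc(Q, i0=0, t_max=10.0)[:5])
```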

We see that the evolution of ξ(t) is completely specified by the quantities q ij and q i forming the matrix

$$ Q=\| q_{ij} \| =\lim_{t \to0} \frac{P(t) -P(0)}{t} , $$
(21.2.5)

where we put q ii :=−q i , so that ∑ j q ij =0. We can also justify this claim using an analytical approach. To simplify the technical side of the exposition, we will assume, where it is needed, that the entries of the matrix Q are bounded and convergence in (21.2.3) is uniform in i.

Denote by e A the matrix-valued function

$$e^A =E +\sum_{k=1}^{\infty} \frac{1}{k!} {A^k} . $$

Theorem 21.2.2

The transition probabilities p ij (t) satisfy the systems of differential equations

$$\begin{aligned} P' (t) =&P(t) Q, \end{aligned}$$
(21.2.6)
$$\begin{aligned} P' (t) =&Q P(t). \end{aligned}$$
(21.2.7)

Each of the systems (21.2.6) and (21.2.7) has a unique solution

$$P (t) =e^{Qt}. $$

It is clear that the solution can be obtained immediately by formally integrating equation (21.2.6).

Proof

By virtue of (21.2.1), (21.2.2) and (21.2.5),

$$ P' (t) =\lim_{s\to0} \frac{P(t+s ) -P(t)}{s} =\lim_{s \to0} P(t) \frac{P(s) -E}{s} =P(t) Q. $$
(21.2.8)

In the same way we obtain, from the equality

$$P(t+s) -P(t) =\bigl(P(s) -E\bigr)P(t), $$

the second equation, (21.2.7). The passage to the limit is justified by the assumptions we made.

Further, it follows from (21.2.6) that the function P(t) is infinitely differentiable, and

$$P^{(k)} (t) =P(t) Q^k ,\qquad P(t) =\sum_{k=0}^{\infty} \frac{t^k}{k!} P^{(k)} (0) =E +\sum_{k=1}^{\infty} \frac{(Qt)^k}{k!} =e^{Qt} . $$

The theorem is proved. □
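The solution P(t)=e^{Qt} and the properties (21.2.1), (21.2.6) and (21.2.7) are easy to check numerically, e.g. with SciPy's matrix exponential. A sketch, again with an assumed illustrative generator:

```python
import numpy as np
from scipy.linalg import expm

# Illustrative bounded generator (an assumption, not from the text).
Q = np.array([[-2.0, 1.0, 1.0],
              [0.5, -1.0, 0.5],
              [1.0, 3.0, -4.0]])

P = lambda u: expm(Q * u)              # candidate solution P(u) = e^{Qu}
t, s, h = 0.7, 0.4, 1e-6

# Semigroup / Chapman-Kolmogorov property (21.2.1): P(t+s) = P(t)P(s).
assert np.allclose(P(t + s), P(t) @ P(s))

# Backward (21.2.6) and forward (21.2.7) equations, via a finite difference.
dP = (P(t + h) - P(t)) / h
assert np.allclose(dP, P(t) @ Q, atol=1e-4)
assert np.allclose(dP, Q @ P(t), atol=1e-4)
print("P(t) = expm(Q t) satisfies (21.2.1), (21.2.6), (21.2.7)")
```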

Because of the derivation method, (21.2.6) is called the backward Kolmogorov equation, and (21.2.7) is known as the forward Kolmogorov equation (the time increment is taken after or before the basic time interval).

The difference between these equations becomes even more apparent in the case of inhomogeneous Markov processes, when the transition probabilities

$$\mathbf{P}\bigl(\xi(t)=j\big| \xi(s)=i\bigr)=p_{ij} (s,t), \quad s \le t, $$

depend on two time arguments: s and t. In that case, (21.2.1) becomes the equality P(s,t+u)=P(s,t)P(t,t+u), and the backward and forward equations have the form

$$\frac{\partial P(s,t)}{\partial s} = -Q(s) P(s,t), \qquad \frac{\partial P(s,t)}{\partial t}= P(s,t) Q(t), $$

respectively, where

$$Q(t)=\lim _{u\to0}\frac{P(t,t+u)-E}{u}. $$

The reader can derive these relations independently.

What are the general conditions for existence of a stationary limiting distribution? We can use here an approach similar to that employed in Chap. 13.

Let \(\xi^{(i)}(t)\) be a process with the initial value \(\xi^{(i)}(0)=i\) and right-continuous trajectories. For a given \(i_0\), put

$$\nu_0 :=\inf\bigl\{t\ge0: \xi^{(i)} (t) =i_0 \bigr\},\qquad \nu_k :=\inf\bigl\{t\ge\nu_{k-1}+1: \xi^{(i)} (t) =i_0 \bigr\},\quad k\ge1. $$

Here in the second formula we consider the values \(t\ge\nu_{k-1}+1\), since for \(t\ge\nu_{k-1}\) we would have \(\nu_k \equiv\nu_{k-1}\). Clearly, \(\mathbf{P}(\nu_k -\nu_{k-1}=1)>0\), and \(\mathbf{P}(\nu_k -\nu_{k-1}\in(t,t+h))>0\) for any t≥1 and h>0 provided that \(p_{i_0 i_0} (t) \not\equiv1\).

Note also that the variables ν k , k=0,1,… , are not defined for all elementary outcomes. We put ν 0=∞ if ξ (i)(t)≠i 0 for all t≥0. A similar convention is used for ν k , k≥1. The following ergodic theorem holds.

Theorem 21.2.3

Let there exist a state i 0 such that E ν 1<∞ and P(ν (i)<∞)=1 for all \(i \in\mathcal{X}_{0} \subset\mathcal{X}\). Then there exist the limits

$$ \lim_{t \to\infty} p_{ij} (t) = p_j $$
(21.2.9)

which are independent of \(i \in\mathcal{X}_{0}\).

Proof

As was the case for Markov chains, the epochs ν 1,ν 2,… divide the time axis into independent cycles of the same nature, each of them being completed when the system returns for the first time (after one time unit) to the state i 0. Consider the renewal process generated by the sums ν k , k=0,1,… , of independent random variables ν 0, ν k ν k−1, k=1,2,… . Let

$$\eta(t) :=\min\{ k: \nu_k >t \} ,\qquad\gamma(t) :=t- \nu_{\eta(t) -1} , \qquad H (t) :=\sum_{k=0}^{\infty} \mathbf{P}(\nu_k <t) . $$

The event A dv :={γ(t)∈[v,v+dv)} can be represented as the intersection of the events

$$B_{dv} := \bigcup_{k \ge0} \bigl\{ \nu_k \in(t-v-dv,t-v] \bigr\} \in \mathfrak{F}_{t-v} $$

and \(C_{v}:=\{\xi(u)\ne i_{0}\mbox{ for } u\in[t-v+1,t]\}\in\mathfrak{F}_{[t-v,\infty)}\). We have

$$\begin{aligned} p_{ij} (t) =& \int_0^t \mathbf{P}\bigl(\xi^{(i)} (t) =j,\,\gamma(t) \in[v ,v+dv)\bigr) = \int _0^t \mathbf{P}\bigl(\xi^{(i)}(t) =j,\, B_{dv} C_v \bigr) \\=& \int_0^t \mathbf{E}\bigl[ \mathrm{I}_{B_{dv}} \mathbf{P}\bigl(\xi^{(i)} (t)=j,\, C_v\big| \mathfrak{F}_{t-v} \bigr)\bigr]\\=&\int _0^t \mathbf{E} \bigl[\mathrm{I}_{B_{dv}} \mathbf{P}\bigl(\xi^{(i)} (t)=j,C_v\big| \xi(t-v)\bigr) \bigr]. \end{aligned}$$

On the set B dv , one has ξ(tv)=i 0, and hence the probability inside the last integral is equal to

$$\mathbf{P} \bigl(\xi^{(i_0)} (v)=j,\, \xi(u) \ne i_0 \mbox{ for } u\in[1,v] \bigr) =: g(v) $$

and is independent of t and i. Since P(B dv )=dH(tv), one has

$$p_{ij} (t) =\int_0^t g(v) \mathbf{P}(B_{dv}) =\int_0^t g(v) \,dH(t-v). $$

By the key renewal theorem, as t→∞, this integral converges to

$$\frac{1}{\mathbf{E}\nu_1} \int_0^{\infty} g(v)\,dv. $$

The existence of the last integral follows from the inequality g(v)≤P(ν 1>v). The theorem is proved. □

Theorem 21.2.4

If the stationary distribution

$$P =\lim_{t \to\infty} P(t) $$

exists with all the rows of the matrix P being identical, then it is a unique solution of the equation

$$ PQ=0. $$
(21.2.10)

It is evident that Eq. (21.2.10) is obtained by setting P′(t)=0 in (21.2.6). Equation (21.2.7) gives the trivial equality QP=0.

Proof

Equation (21.2.10) is obtained by passing to the limit in (21.2.8) first as t→∞ and then as s→0. Now assume that P 1 is a solution of (21.2.10), i.e. P 1 Q=0. Then P 1 P(t)=P 1 for t<1, since

$$P_1\bigl(P(t)-P(0)\bigr)=P_1\sum _{k=1}^{\infty}\frac{Q^k t^k }{k!} =0. $$

Further, \(P_1=P_1 P^k(t) =P_1 P(kt)\), \(P(kt)\to P\) as \(k\to\infty\), and hence \(P_1=P_1 P=P\). The theorem is proved. □

Now consider a Markov chain {X n } in discrete time with transition probabilities p ij =q ij /q i , ij, p ii =0. Suppose that this chain is ergodic (see Theorem 13.4.1). Then its stationary probabilities {π j } satisfy Eqs. (13.4.2). Now note that Eq. (21.2.10) can be written in the form

$$p_j \ q_j = \sum_k p_k q_k p_{kj} $$

which has an obvious solution \(p_j =c\pi_j /q_j\), c=const. Therefore, if

$$ \sum\frac{\pi_j}{q_j} < \infty $$
(21.2.11)

then there exists a solution to (21.2.10) given by

$$ p_j = \frac{\pi_j}{q_j} \biggl( \sum \frac{\pi_j}{q_j} \biggr)^{- 1} . $$
(21.2.12)

In Sects. 21.4 and 21.5 we will derive the ergodic theorem for processes of a more general form than the one in the present section. That theorem will imply, in particular, that ergodicity of {X n } and convergence (21.2.11) imply (21.2.9). Recall that, for ergodicity of {X n }, it suffices, in turn, that Eqs. (13.4.2) have a solution {π j }. Thus the existence of solution (21.2.12) implies the ergodicity of ξ(t).
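The passage from the embedded chain {X_n} to the process ξ(t) via formula (21.2.12) can also be checked numerically. A sketch with an assumed illustrative generator: compute π for the jump chain, form \(p_j \propto \pi_j/q_j\), and compare with the rows of \(e^{Qt}\) for large t.

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-2.0, 1.0, 1.0],        # illustrative generator (assumption)
              [0.5, -1.0, 0.5],
              [1.0, 3.0, -4.0]])
q = -np.diag(Q)                        # exit rates q_i

# Embedded jump chain: p_ij = q_ij / q_i for j != i, p_ii = 0.
P_jump = np.maximum(Q, 0.0) / q[:, None]

# Stationary probabilities pi of {X_n}: left eigenvector for eigenvalue 1.
w, v = np.linalg.eig(P_jump.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi /= pi.sum()

# Formula (21.2.12): p_j proportional to pi_j / q_j.
p = pi / q
p /= p.sum()

# p solves pQ = 0 and is the limit of the rows of P(t) = e^{Qt}.
assert np.allclose(p @ Q, 0.0, atol=1e-12)
assert np.allclose(expm(Q * 50.0)[0], p, atol=1e-8)
print(p)
```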

21.2.2 Examples

Example 21.2.1

The Poisson process ξ(t) with parameter λ is a Markov process for which \(q_i =\lambda\), \(q_{i,i+1}=\lambda\), and \(p_{i,i+1}=1\), \(i=0,1,\ldots\) For this process, the stationary distribution \(p=(p_0,p_1,\ldots)\) does not exist (each trajectory goes to infinity).

Example 21.2.2

Birth-and-death processes. These are processes for which, for i≥1,

$$q_{i,i+1} =\lambda_i ,\qquad q_{i,i-1} =\mu_i ,\qquad q_{ij} =0 \quad\mbox{for } |j-i|>1,\qquad q_i =\lambda_i +\mu_i , $$

so that

$$p_{i,i+1} =\frac{\lambda_i}{\lambda_i +\mu_i} ,\qquad p_{i,i-1} =\frac{\mu_i}{\lambda_i +\mu_i} $$

are probabilities of birth and death, respectively, of a particle in a certain population given that the population consisted of i particles and changed its composition. For i=0 one should put μ_0:=0. Establishing conditions for the existence of a stationary regime is a rather difficult problem (related mainly to finding conditions under which the trajectory escapes to infinity). If the stationary regime exists, then according to Theorem 21.2.4 the stationary probabilities \(p_j\) can be uniquely determined from the recursive relations (see Eq. (21.2.10); in our case \(q_{ii} =-q_i =-(\lambda_i +\mu_i)\))

$$\lambda_0 p_0 =\mu_1 p_1 ,\qquad (\lambda_j +\mu_j) p_j =\lambda_{j-1} p_{j-1} +\mu_{j+1} p_{j+1} ,\quad j\ge1, $$
(21.2.13)

and condition \(\sum p_j =1\).

Example 21.2.3

The telephone lines problem from queueing theory. Suppose we are given a system consisting of infinitely many communication channels which are used for telephone conversations. The probability that, for a busy channel, the transmitted conversation terminates during a small time interval (t,t+Δ) is equal to μΔ+o(Δ). The probability that a request for a new conversation (a new call) arrives during the same time interval is λΔ+o(Δ). Thus the “arrival flow” of calls is nothing else but the Poisson process with parameter λ, and the number ξ(t) of busy channels at time t is the value of the birth-and-death process for which \(\lambda_i =\lambda\) and \(\mu_i =i\mu\).

In that case, it is not hard to verify with the help of Theorem 21.2.3 that there always exists a stationary limiting distribution, for which Eqs. (21.2.13) have the form

$$\lambda p_0 =\mu p_1 ,\qquad (\lambda+k\mu) p_k =\lambda p_{k-1} +(k+1)\mu p_{k+1} ,\quad k\ge1. $$
(21.2.14)

From this we get that

$$ p_1 =p_0 \frac{\lambda}{\mu} ,\quad p_2 =\frac{p_0}{2} \biggl(\frac{\lambda }{\mu} \biggr)^2 ,\quad\ldots,\quad p_k = \biggl( \frac{\lambda}{\mu } \biggr)^k \frac{p_0 }{k!}, $$
(21.2.15)

so that \(p_0 =e^{-\lambda/\mu}\), and the limiting distribution will be the Poisson law with parameter λ/μ.

If the number of channels n is finite, the calls which find all the lines busy will be rejected, and in (21.2.13) one has to put λ n =0, p n+1=p n+2=⋯=0. In that case, the last equation in (21.2.14) will have the form μnp n =λp n−1. Since the formulas (21.2.15) will remain true for kn, we obtain the so-called Erlang formulas for the stationary distribution:

$$p_k = \biggl( \frac{\lambda}{\mu} \biggr)^k \frac{1}{k!} \Biggl[ \,\sum_{j=0}^n \frac{1}{j!} \biggl( \frac{\lambda}{\mu} \biggr)^j \Biggr]^{-1} $$

(the truncated Poisson distribution).
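A short sketch computing the Erlang formulas; the number of lines and the ratio λ/μ are illustrative assumptions.

```python
from math import factorial

def erlang_stationary(lam, mu, n):
    """Stationary probabilities p_0, ..., p_n for n lines: the Poisson
    weights (lam/mu)^k / k! of (21.2.15), truncated and renormalised."""
    rho = lam / mu
    w = [rho**k / factorial(k) for k in range(n + 1)]
    s = sum(w)
    return [x / s for x in w]

# Illustrative numbers (assumptions): n = 10 lines, lam/mu = 4.
p = erlang_stationary(lam=4.0, mu=1.0, n=10)
print(p[10])   # probability that all 10 lines are busy (loss probability)
```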

The next example will be considered in a separate section.

21.3 Branching Processes

The essence of the mathematical model describing a branching process remains roughly the same as in Sect. 7.7.2. A continuous time branching process can be defined as follows. Let ξ (i)(t) denote the number of particles at time t with the initial condition ξ (i)(0)=i. Each particle, independently of all others, splits during the time interval (t,t+Δ) with probability μΔ+o(Δ) into a random number η≠1 of particles (if η=0, we say that the particle dies). Thus,

$$ \xi^{(i)} (t) =\xi_1^{(1)} (t) + \cdots+ \xi_i^{(1)} (t), $$
(21.3.1)

where \(\xi_{k}^{(1)} (t)\) are independent and distributed as ξ (1)(t). Moreover,

$$p_{ij}(\varDelta) =i\mu\varDelta h_{j-i+1} +o(\varDelta) \quad\mbox{for } j\ne i,\qquad p_{ii}(\varDelta) =1-i\mu\varDelta+o(\varDelta), $$
(21.3.2)

where \(h_k =\mathbf{P}(\eta=k)\), so that here \(q_{ij} =i\mu h_{j-i+1}\) for j≠i, \(q_{ii} =-i\mu\).

By formula (21.3.2), iμΔ is the principal part of the probability that at least one particle will split. Clearly, the state 0 is absorbing. It will not be absorbing any more if one considers processes with immigration when a Poisson process (with intensity λ) of “outside” particles is added to the process ξ (i)(t). Then

$$\begin{aligned} p_{ij}(\varDelta) =&i\mu\varDelta h_{j-i+1}+o(\varDelta)\quad\mbox{for } j-i\ne0,1, \\p_{i,i+1}(\varDelta) =&\varDelta(i\mu h_2+\lambda)+o(\varDelta). \end{aligned}$$

We return to the branching process (21.3.1), (21.3.2). By (21.3.1) we have

$$r^{(i)}(t,z):=\mathbf{E}z^{\xi^{(i)}(t)}= \bigl[\mathbf{E}z^{\xi^{(1)}(t)} \bigr]^i= r^{i}(t,z)=\sum_{k=0}^\infty z^k p_{ik}(t), $$

where

$$ r(t,z):=\mathbf{E}z^{\xi^{(1)}(t)}=\sum _{k=0}^{\infty}z^k p_{1k}(t). $$
(21.3.3)

Equation (21.2.7) implies

$${p'}_{1k} (t) =\sum_{l=0}^{\infty} q_{1l} p_{lk} (t) . $$

Therefore, differentiating (21.3.3) with respect to t, we find that

$$ \begin{aligned}[b] r_t'(t,z) &=\sum_{k=0}^{\infty} z^k p'_{1k} (t) = \sum_{k=0}^{\infty} \sum_{l=0}^{\infty} q_{1l} p_{lk} (t) z^k \\&= \sum_{l=0}^{\infty} q_{1l} \sum_{k=0}^{\infty} z^k p_{lk} (t) = \sum_{l=0}^{\infty} q_{1l} r^l (t,z). \end{aligned} $$
(21.3.4)

But \(q_{1l} =\mu h_l\) for l≠1, \(q_{11}=-\mu\), and putting

$$f(s) :=\sum_{l=0}^{\infty} q_{1l} s^l = \mu\bigl(\mathbf{E}s^{\eta} -s\bigr) =\mu \Biggl( \,\sum _{l=0}^{\infty} h_l s^l-s \Biggr) , $$

we can write (21.3.4) in the form

$$r_t'(t,z) =f\bigl(r (t,z)\bigr). $$

We have obtained a differential equation for r=r(t,z) (equivalent to (21.3.2)) which is more convenient to write in the form

$$\frac{d r}{f(r)}=dt,\qquad t=\int _{r(0,z)}^{r(t,z)}\frac{dy}{f(y)}= \int _z^{r(t,z)}\frac{dy}{f(y)}. $$

Consider the behaviour of the function f 1(y)=E y ηy on [0,1]. Clearly, f 1(0)=P(η=0), f 1(1)=0, and

$$f'_1 (1) =\mathbf{E}\eta-1,\qquad f''_1 (y) =\mathbf{E}\eta(\eta-1) y^{\eta-2} >0. $$

Consequently, the function f 1(y) is convex and has no zeros in (0,1) if E η≤1. When E η>1, there exists a point q∈(0,1) such that f 1(q)=0, \(f'_{1} (q) <0\) (see Fig. 21.1), and \(f_{1} (y) =(y-q) f'_{1} (q) +O((y-q)^{2})\) in the vicinity of this point.

Fig. 21.1 The form of the plot of the function \(f_1\). The smaller root q of the equation \(f_1(q)=0\) gives the probability of the eventual extinction of the branching process

Thus, if E η>1 and z<q (so that r→q), then, by virtue of the representation

$$\frac{1}{f_1 (y)} =\frac{1}{(y-q) f'_1 (q)} +O(1), $$

we obtain

$$t =\int_z^{r} \frac{dy}{f(y)} = \frac{1}{\mu f_1' (q)} \ln \biggl( \frac{r -q}{z-q} \biggr) +O(1). $$

This implies that, as t→∞,

$$ \begin{aligned}[c] &r (t,z) -q =(z-q) e^{\mu t f'_1 (q) +O(1)} \sim(z-q) e^{\mu t f'_1 (q)} ,\\&r (t,z) =q +O \bigl(e^{-\alpha t} \bigr),\quad \alpha=-\mu f'_1 (q) >0. \end{aligned} $$
(21.3.5)

In particular, the extinction probability

$$p_{10} (t) =r (t,0) =q +O\bigl(e^{-\alpha t}\bigr) $$

converges exponentially fast to q, p 10(∞)=q. Comparing our results with those from Sect. 7.7, the reader can see that the extinction probability for a discrete time branching process had the same value (we could also come to this conclusion directly). Since p k0(t)=[p 10(t)]k, one has p k0(∞)=q k.
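Numerically, q can be found by iterating the offspring generating function from s=0: the iterates increase monotonically to the smallest fixed point of E s^η = s. A sketch with an assumed offspring law:

```python
def extinction_prob(h, n_iter=200):
    """Extinction probability: the smallest root q of E s^eta = s on [0,1],
    found by iterating s -> E s^eta from s = 0 (the iterates increase to q).
    h[k] = P(eta = k); recall that h[1] = 0 here."""
    s = 0.0
    for _ in range(n_iter):
        s = sum(hk * s**k for k, hk in enumerate(h))
    return s

# Assumed offspring law: P(eta = 0) = 0.3, P(eta = 2) = 0.7, E eta = 1.4 > 1.
# The smaller root of 0.3 + 0.7 s^2 = s is q = 3/7.
print(extinction_prob([0.3, 0.0, 0.7]))   # ~ 0.428571
```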

It follows from (21.3.5) that the remaining “probability mass” of the distribution of ξ(t) quickly moves to infinity as t→∞.

If E η<1, the above argument remains valid with q replaced with 1, so that the extinction probability is p 10(∞)=p k0(∞)=1.

If E η=1 (and \(\mathbf{E}\eta(\eta-1)<\infty\)), then \(f_1'(1)=0\) and, near y=1, \(f_1 (y) =\frac{(1-y)^2}{2}\mathbf{E}\eta(\eta-1)(1+o(1))\), so that

$$t =\int_z^{r} \frac{dy}{\mu f_1 (y)} \sim\frac{2}{\mu \mathbf{E}\eta(\eta-1)} \biggl( \frac{1}{1-r} -\frac{1}{1-z} \biggr),\qquad 1-r(t,z) \sim\frac{2}{\mu t\, \mathbf{E}\eta(\eta-1)}\quad \mbox{as } t\to\infty. $$

Thus the extinction probability r(t,0)=p 10(t) also tends to 1 in this case.

21.4 Semi-Markov Processes

21.4.1 Semi-Markov Processes on the States of a Chain

Semi-Markov processes can be described as follows. Let an aperiodic discrete time irreducible Markov chain {X n } with the state space \(\mathcal{X}= \{ 0, 1, 2,\dots\}\) be given. To each state i we put into correspondence the distribution F i (t) of a positive random variable ζ (i):

$$F_i (t) = \mathbf{P}\bigl(\zeta^{(i)} < t \bigr). $$

Consider sequences \(\zeta^{(i)}_{1}, \zeta^{(i)}_{2}, \ldots\), \(\zeta^{(i)}_{j}\stackrel{d}{=}\zeta^{(i)}\), of independent random variables with the distribution \(F_i\), independent of the chain {X n } and of each other. Let, moreover, the distribution of the initial random vector (X 0,ζ 0), \(X_{0} \in \mathcal{X}\), ζ 0≥0, be given. The evolution of the semi-Markov process ξ(u) is described as follows:

$$ \begin{aligned}[c] \xi(u) &= X_0 \quad {\mbox{for }} 0 \leq u < \zeta_0, \\\xi(u) &= X_1 \quad {\mbox{for }} \zeta_0 \leq u < \zeta_0+\zeta _1^{(X_1)}, \\\xi(u) &= X_2 \quad {\mbox{for }} \zeta_0 + \zeta_1^{(X_1)} \leq u < \zeta_0 + \zeta_1^{(X_1)} + \zeta_2^{(X_2)}, \\&\cdots , \\\xi(u) &= X_n \quad {\mbox{for }} Z_{n-1} \leq u < Z_n, \ Z_n = \zeta_0 + \zeta_1^{(X_1)} + \cdots+ \zeta_n^{(X_n)}, \end{aligned} $$
(21.4.1)

and so on. Thus, upon entering state X n =j, the trajectory of ξ(u) remains in that state for a random time \(\zeta_{n}^{(X_{n})} = \zeta_{n}^{(j)}\), then switches to state X n+1 and so on. It is evident that such a process is, generally speaking, not Markovian. It will be a Markov process only if

$$1 - F_i (t) = e^{- q_i t}, \quad q_i > 0, $$

and will then coincide with the process described in Sect. 21.2.

If the distribution F i is not exponential, then, given the value ξ(t)=i, the time between t and the next jump epoch will depend on the epoch of the preceding jump of ξ(⋅), because

$$\mathbf{P}\bigl(\zeta^{(i)} > v + u\big | \zeta^{(i)} > v\bigr) = \frac{1 - F_i (v + u)}{1 - F_i (v)} $$

for non-exponential F i depends on v. It is this property that means that the process is non-Markovian, for fixing the “present” (i.e. the value of ξ(t)) does not make the “future” of the process ξ(u) independent of the “past” (i.e. of the trajectory of ξ(u) for u<t).

The process ξ(t) can be “complemented” to a Markov one by adding to it the component χ(t) whose value gives the time u for which the trajectory ξ(t+u), u≥0, will remain in the current state ξ(t). In other words, χ(t) is the excess of level t for the random walk Z 0,Z 1,… (see Fig. 21.2):

$$\chi(t) = Z_{\nu(t)+1} - t, \qquad\nu(t) = \max\{ k : Z_k \leq t \}. $$
Fig. 21.2 The trajectories of the semi-Markov process ξ(t) and of the residual sojourn time process χ(t)

The process χ(t) is Markovian and has “saw-like” trajectories deterministic inside the intervals (Z k ,Z k+1). The process X(t)=(ξ(t),χ(t)) is obviously Markovian, since the value of X(t) uniquely determines the law of evolution of the process X(t+u) for u≥0 whatever the “history” X(v), v<t, is. Similarly, we could consider the Markov process Y(t)=(ξ(t),γ(t)), where γ(t) is the defect of level t for the walk Z 0,Z 1,… :

$$\gamma(t) = t - Z_{\nu(t)}. $$
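The construction (21.4.1), together with the definition of χ(t), is straightforward to simulate. A minimal sketch for an assumed two-state chain with non-exponential sojourn distributions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed two-state embedded chain and non-exponential sojourn laws F_0, F_1.
P = np.array([[0.2, 0.8],
              [0.6, 0.4]])              # transition matrix of {X_n}
sample_sojourn = [
    lambda: rng.uniform(0.5, 1.5),      # zeta^(0)
    lambda: rng.gamma(2.0, 0.5),        # zeta^(1)
]

def semi_markov_state(t, x0=0):
    """(xi(t), chi(t)) for the process (21.4.1), started at X_0 = x0
    with zeta_0 distributed as zeta^(x0)."""
    x, Z = x0, sample_sojourn[x0]()     # Z = epoch ending the current sojourn
    while Z <= t:
        x = rng.choice(2, p=P[x])       # next state X_{n+1}
        Z += sample_sojourn[x]()        # plus its sojourn time zeta^(X_{n+1})
    return x, Z - t                     # chi(t) = residual sojourn time

print(semi_markov_state(1000.0))
```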

21.4.2 The Ergodic Theorem

In the sequel, we will distinguish between the following two cases.

(A) The arithmetic case when the possible values of ζ (i), i=0,1,…, are multiples of a certain value h which can be assumed without loss of generality to be equal to 1. In that case we will also assume that the g.c.d. of the possible values of the sums of the variables ζ (i) is also equal to h=1. This is clearly equivalent to assuming that the g.c.d. of the possible values of recurrence times θ (i) of ξ(t) to the state i is equal to 1 for any fixed i.

(NA) The non-arithmetic case, when condition (A) does not hold.

Put a i :=E ζ (i).

Theorem 21.4.1

Let the Markov chain {X n } be ergodic (satisfy the conditions of Theorem 13.4.1) and {π j } be the stationary distribution of that chain. Then, in the non-arithmetic case (NA), for any initial distribution (ζ 0,X 0) there exists the limit

$$ \lim _{t \to\infty} \mathbf{P}\bigl(\xi(t) = i, \ \chi(t) > v \bigr) = \frac{\pi_i}{\sum\pi_j a_j}\int^{\infty}_v\mathbf{P} \bigl(\zeta ^{(i)}>u\bigr)\, du. $$
(21.4.2)

In the arithmetic case (A), (21.4.2) holds for integer-valued v (the integral becomes a sum in that case). It follows from (21.4.2) that the following limit exists

$$\lim _{t\to\infty}\mathbf{P}\bigl(\xi(t)=i\bigr)=\frac{\pi_i a_i}{\sum\pi _j a_j}. $$

Proof

For definiteness we restrict ourselves to the non-arithmetic case (NA). In Sect. 13.4 we considered the times τ (i) between consecutive visits of {X n } to state i. These times could be called “embedded”, as well as the chain {X n } itself in regard to the process ξ(t). Along with the times τ (i), we will need the “real” times θ (i) between the visits of the process ξ(t) to the state i. Let, for instance, X 1=1. Then

$$\theta^{(1)}\stackrel{d}{ =} \zeta_1^{(X_1)} + \zeta_2^{(X_2)} + \cdots+ \zeta_{\tau}^ {(X_{\tau})}, $$

where τ=τ (1). For definiteness and to reduce notation, we fix for the moment the value i=1 and put θ (1)=:θ. Let first

$$ \zeta_0 \stackrel{d}{=} \zeta^{(1)}, \quad X_0 = 1. $$
(21.4.3)

Then the whole trajectory of the process X(t) for t≥0 will be divided into identically distributed independent cycles by the epochs when the process hits the state ξ(t)=1. We denote the lengths of these cycles by θ 1,θ 2… ; they are independent and identically distributed. We show that

$$ \mathbf{E}\theta= \frac{1}{\pi_1} \sum a_j \pi_j. $$
(21.4.4)

Denote by θ(n) the “real” time spent on n transitions of the governing chain {X n }. Then

$$ \theta_1 + \cdots+ \theta_{\eta(n) - 1} \leq \theta(n) \leq\theta _1 + \cdots+ \theta_{\eta(n)}, $$
(21.4.5)

where η(n):=min{k:T k >n}, \(T_{k} = \sum^{k}_{j=1}\tau_{j}\), τ j are independent and distributed as τ. We prove that, as n→∞,

$$ \mathbf{E}\theta(n) \sim n \pi_1 \mathbf{E}\theta. $$
(21.4.6)

By Wald’s identity and (21.4.5),

$$ \mathbf{E}\theta(n) \leq\mathbf{E}\theta\mathbf{E}\eta(n), $$
(21.4.7)

where \(\mathbf{E}\eta(n)\sim n/\mathbf{E}\tau =n\pi_1\).

Now we bound from below the expectation E θ(n). Put \(m:=\lfloor(1-\varepsilon)\pi_1 n\rfloor\), \(\varTheta_{n} := \sum^{n}_{j=1} \theta_{j}\). Then

$$\begin{aligned} \mathbf{E}\theta(n) \geq&\mathbf{E}\bigl(\theta(n); \, \eta(n) > m\bigr) \\\geq&\mathbf{E}\bigl(\varTheta_m;\, \eta (n)>m\bigr)=m \mathbf{E}\theta-\mathbf{E}\bigl(\varTheta_m; \, \eta(n) \leq m \bigr). \end{aligned}$$
(21.4.8)

Here the random variable Θ m /m≥0 possesses the properties

$${\varTheta_m}/{m}\stackrel{p}{\to}\mathbf{E}\theta\quad\mbox {as } m\to\infty, \qquad \mathbf{E} ({\varTheta_m}/{m}) = \mathbf{E} \theta. $$

Therefore it satisfies the conditions of part 4 of Lemma 6.1.1 and is uniformly integrable. This, in turn, by Lemma 6.1.2 and convergence P(η(n)≤m)→0 means that the last term on the right-hand side of (21.4.8) is o(m). By virtue of (21.4.8), since ε>0 is arbitrary, we obtain that

$$\liminf_{n\to\infty} n^{-1} \mathbf{E}\theta(n) \geq \pi_1 \mathbf{E}\theta. $$

This together with (21.4.7) proves (21.4.6).

Now we will calculate the value of E θ(n) using another approach. The variable θ(n) admits the representation

$$\theta(n) = \sum_j \bigl(\zeta_1^{(j)} + \cdots+ \zeta^{(j)}_{N (j,n)}\bigr), $$

where N(j,n) is the number of visits of the trajectory of {X k } to the state j during the first n steps. Since \(\{ \zeta_{k}^{(j)} \}^{\infty}_{k=1}\) and N(j,n) are independent for each j, we have

$$\mathbf{E}\theta(n) = \sum_j a_j \mathbf{E}N (j, n), \qquad \mathbf{E}N (j, n) = \sum^n_{k=1} p_{1 j} (k). $$

Because p 1j (k)→π j as k→∞, one has

$$\lim_{n\to\infty} n^{-1} \mathbf{E}N(j,n)= \pi_j. $$

Moreover,

$$\pi_j = \sum\pi_l p_{l j} (k) \geq \pi_1 p_{1 j}(k) $$

and, therefore,

$$p_{1 j} (k) \leq{\pi_j}/{\pi_1}. $$

Hence

$$n^{-1}\mathbf{E}N (j, n) \leq{\pi_j}/{\pi_1}, $$

and in the case when ∑ j a j π j <∞, the series ∑ j a j n −1 E N(j,n) converges uniformly in n. Consequently, the following limit exists

$$\lim_{n \to\infty} n^{-1} \mathbf{E}\theta(n) = \sum _j a_j \pi_j. $$

Comparing this with (21.4.6) we obtain (21.4.4). If E θ=∞ then clearly E θ(n)=∞ and ∑ j a j π j =∞, and vice versa, if ∑ j a j π j =∞ then E θ=∞.

Consider now the random walk {Θ k }. To the k-th cycle there correspond T k transitions. Therefore, by the total probability formula,

$$\mathbf{P}\bigl(\xi(t) = 1, \chi(t) > v\bigr) = \sum ^{\infty}_{k=1} \int^t_0 \mathbf{P}\bigl(\varTheta_k \in d u, \, \zeta^{(1)}_{T_k +1} > t - u + v\bigr), $$

where \(\zeta^{(1)}_{T_k+1}\) is independent of Θ k and distributed as ζ (1) (see Lemma 11.2.1 or the strong Markov property). Therefore, denoting by \(H_{\theta} (u) := \sum^{\infty}_{k=1} \mathbf{P}(\varTheta_{k} < u)\) the renewal function for the sequence {Θ k }, we obtain for the non-arithmetic case (NA), by virtue of the renewal theorem (see Theorem 10.4.1 and (10.4.2)), that, as t→∞,

$$\mathbf{P}\bigl(\xi(t) =1,\, \chi(t) >v\bigr) =\int^t_0 \mathbf{P}\bigl(\zeta^{(1)} >t-u+v\bigr)\,dH_{\theta}(u) \to\frac{1}{\mathbf{E}\theta} \int^{\infty}_v \mathbf{P}\bigl(\zeta^{(1)} >u\bigr)\,du =\frac{\pi_1}{\sum\pi_j a_j} \int^{\infty}_v \mathbf{P}\bigl(\zeta^{(1)} >u\bigr)\,du . $$
(21.4.9)

We have proved assertion (21.4.2) for i=1 and initial conditions (21.4.3). The transition to arbitrary initial conditions is quite obvious and is done in exactly the same way as in the proof of the ergodic theorems of Chap. 13.

If ∑a i π i =∞ then, as we have already observed, E θ=∞ and, by the renewal theorem and (21.4.9), one has P(ξ(t)=1, χ(t)>v)→0 as t→∞. It remains to note that instead of i=1 we can fix any other value of i. The theorem is proved. □

In the same way we could also prove that

$$\lim_{t\to\infty} \mathbf{P}\bigl(\xi(t) =i,\ \gamma(t) >u,\ \chi(t) >v \bigr) =\frac{\pi_i}{\sum\pi_j a_j} \int^{\infty}_{u+v} \mathbf{P}\bigl(\zeta^{(i)} >w\bigr)\,dw $$

(see Theorem 10.4.3).

21.4.3 Semi-Markov Processes on Chain Transitions

Along with the semi-Markov processes ξ(t) described at the beginning of the present section, one sometimes considers semi-Markov processes “given on the transitions” of the chain {X n }. In that case, the distributions F ij of random variables ζ (ij)>0 are given and, similarly to (21.4.1), for the initial condition (X 0,X 1,ζ 0) one puts

$$ \begin{aligned}[c] \xi(u) &:= (X_0, X_1) \quad{\mbox{for }} 0 \leq u < \zeta_0\\\xi(u) &:= (X_1, X_2) \quad{\mbox{for }} \zeta_0 \leq u < \zeta_0 + \zeta_1^{(X_0, X_1)} \\\xi(u) &:= (X_2, X_3) \quad{\mbox{for }} \zeta_0 + \zeta_1^{(X_0, X_1)} \leq u < \zeta_0 + \zeta_1^{(X_0, X_1)} + \zeta_2^{(X_1, X_2)} , \end{aligned} $$
(21.4.10)

and so on. Although at first glance this is a very general model, it can be completely reduced to the semi-Markov processes (21.4.1). To that end, one has to notice that the “two-dimensional” sequence \(Y_n =(X_n ,X_{n+1})\), n=0,1,… , also forms a Markov chain. Its transition probabilities have the form

$$p_{(ij)(kl)} =\mathbf{P}\bigl(Y_{n+1} =(k,l) \big| Y_n =(i,j)\bigr) = \begin{cases} p_{jl} & \mbox{if } k=j, \\ 0 & \mbox{if } k\ne j, \end{cases} $$

so that if the chain {X n } is ergodic, then {Y n } is also ergodic and

$$p_{(i j)(k l)}(n) \to\pi_k p_{k l}. $$

This enables one to restate Theorem 21.4.1 easily for the semi-Markov processes (21.4.10) given on the transitions of the Markov chain {X n }, since the process (21.4.10) will be an ordinary semi-Markov process given on the chain {Y n }.

Corollary 21.4.1

If the chain {X n } is ergodic then, in the non-arithmetic case,

$$\lim_{t\to\infty} \mathbf{P}\bigl(\xi(t) =(i,j),\ \chi(t) >v\bigr) =\frac{\pi_i p_{ij}}{\sum_{k,l} \pi_k p_{kl} a_{kl}} \int^{\infty}_v \mathbf{P}\bigl(\zeta^{(ij)} >u\bigr)\,du ,\qquad a_{kl} :=\mathbf{E}\zeta^{(kl)} . $$

In the arithmetic case v must be a multiple of the lattice span.

We will make one more remark which could be helpful when studying semi-Markov processes and which concerns the so-called semi-Markov renewal functions H ij (t). Denote by T ij (n) the epoch (in the “real time”) of the n-th jump of the process ξ(t) from state i to j. Put

$$H_{i j} (t) := \sum^{\infty}_{n=1} \mathbf{P}\bigl(T_{i j} (n) < t\bigr). $$

If ν ij (t) is the number of jumps from state i to j during the time interval [0,t), then clearly H ij (t)=E ν ij (t).

Set Δf(t):=f(t+Δ)−f(t), Δ>0.

Corollary 21.4.2

In the non-arithmetic case,

$$ \lim_{t \to\infty} \varDelta H_{i j} (t) = \frac{\pi_i p_{i j} \varDelta}{ \sum_l a_l \pi_l}. $$
(21.4.11)

In the arithmetic case Δ must be a multiple of the lattice span.

Proof

Denote by \(\nu^{(k)}_{i j} (u)\) the number of transitions of the process ξ(t) from i to j during the time interval (0,u) given the initial condition (k,0). Then, by the total probability formula,

$$\mathbf{E}\varDelta\nu_{i j} (t)=\int^{\varDelta}_0 \sum^{\infty }_{k = 0} \mathbf{P}\bigl(\xi(t) = k, \,\chi(t)\in du\bigr) \mathbf{E}\nu_{i j}^{(k)}(\varDelta-u). $$

Since \(\nu^{(k)}_{i j} (u) \leq\nu^{(i)}_{i j} (u)\), by Theorem 21.4.1 one has

$$ h_{i j} (\varDelta):=\lim_{t \to\infty} \mathbf{E}\varDelta\nu _{i j}(t) = \frac{1}{\sum_l a_l\pi_l}\sum ^{\infty}_{k=0} \pi_k \int ^{\varDelta}_0 \mathbf{P} \bigl(\zeta^{(k)} > u \bigr) \mathbf{E}\nu^{(k)}_{ij} (\varDelta-u)\, du. $$
(21.4.12)

Further,

$$\mathbf{P}\bigl(\zeta^{(i)} < \varDelta- u\bigr) \leq F_i ( \varDelta) \to0 $$

as Δ→0; moreover, for k≠i an i→j transition within time Δ−u requires first reaching i and then completing an i-sojourn shorter than Δ, while for k=i any transition beyond the immediate jump (which is an i→j transition with probability \(p_{ij}\)) requires completing a full i-sojourn shorter than Δ. It follows from the aforesaid that

$$\mathbf{E}\nu^{(k)}_{i j} (\varDelta- u) = o \bigl(F_i (\varDelta)\bigr), \qquad \mathbf{E}\nu^{(i)}_{i j} (\varDelta- u) = p_{i j} + o \bigl(F_i (\varDelta)\bigr). $$

Therefore,

$$ h_{i j} (\varDelta) = \frac{\pi_i p_{i j} \varDelta}{\sum_l a_l \pi_l} + o ( \varDelta). $$
(21.4.13)

Further, from the equality

$$H_{ij}(t+2\varDelta)-H_{ij}(t)=\varDelta H_{ij}(t)+\varDelta H_{ij}(t+\varDelta) $$

we obtain that h ij (2Δ)=2h ij (Δ), which means that h ij (Δ) is linear. Together with (21.4.13) this proves (21.4.11). The corollary is proved. □

The class of processes for which ergodicity can be proved by the same methods as those used for semi-Markov processes and in Chap. 13 can be somewhat extended. For this broader class of processes we will prove in the next section the ergodic theorem, and also the laws of large numbers and the central limit theorem for integrals of such processes.

21.5 Regenerative Processes

21.5.1 Regenerative Processes. The Ergodic Theorem

Let X(t) and X 0(t); t≥0, be processes given in the space D(0,∞) of functions without discontinuities of the second type (the state space of these processes could be any metric space, not necessarily the real line). The process X(t) is said to be regenerative if it possesses the following properties:

(1) There exists a state x 0 which is visited by the process X with probability 1. After each such visit, the evolution of the process starts anew as if it were the original process X(t) starting at the state X(0)=x 0. We will denote this new process by X 0(t) where X 0(0)=x 0. To state this property more precisely, we introduce the time τ 0 of the first visit to x 0 by X:

$$\tau_0 := \inf\bigl\{ t \geq0: X (t) = x_0 \bigr\}. $$

However, it is not clear from this definition whether τ 0 is a random variable. For definiteness, assume that the process X is such that for τ 0 one has

$$\{ \tau_0 > t \} = \bigcup_n \bigcap _{t_k \in S} \bigl\{ \bigl|X (t_k) - x_0 \bigr| > {1}/{n}\bigr\}, $$

where S is a countable set everywhere dense in [0,t]. In that case the set {τ 0>t} is clearly an event and τ 0 is a random variable. The above stated property means that τ 0 is a proper random variable: P(τ 0<∞)=1, and that the distribution of X(τ 0+u), u≥0, coincides with that of X 0(u), u≥0, whatever the “history” of the process X(t), tτ 0.

(2) The recurrence time τ of the state x 0 has finite expectation E τ<∞, τ:=inf{t:X 0(t)=x 0}.

The aforesaid means that the evolution of the process is split into independent identically distributed cycles by its visits to the state x 0. The visit times to x 0 are called regeneration times. The behaviour of the process inside the cycles may be arbitrary, and no further conditions, including Markovity, are imposed.

We introduce the so-called “taboo probability”

$$P (t, B) := \mathbf{P}\bigl(X_0 (t) \in B,\, \tau> t\bigr). $$

We will assume that, as a function of t, P(t,B) is measurable and Riemann integrable.

Theorem 21.5.1

Let X(t) be a regenerative process and the random variable τ be non-lattice. Then, for any Borel set B, as t→∞,

$$\mathbf{P}\bigl(X (t) \in B\bigr) \to\boldsymbol{\pi}(B) = \frac{1}{\mathbf {E}\tau} \int_0^{\infty} \mathbf{P}(u, B)\, du. $$

If τ is a lattice variable (which is the case for processes X(t) in discrete time), the assertion holds true with the following obvious changes: t→∞ along the lattice and the integral is replaced with a sum.

Proof

Let T 0:=0, T k :=τ 1+⋯+τ k be the epoch of the k-th regeneration of the process X 0(t), and

$$H (u) := \sum^{\infty}_{k =0} \mathbf{P}( T_k < u) $$

(\(\tau_{k}\stackrel{d}{=}\tau\) are independent). Then, using the total probability formula and the key renewal theorem, we obtain, as t→∞,

$$\begin{aligned} \mathbf{P}\bigl(X_0 (t) \in B\bigr) = & \sum ^{\infty}_{k = 0}\int^t_0 \mathbf{P}( T_k \in du)\, P(t - u, B) \\=& \int^t_0 d H (u) \, P (t - u, B) \to \frac{1}{\mathbf{E}\tau} \int^{\infty}_0 P(u,B)\, du = \boldsymbol{\pi}(B). \end{aligned}$$

For the process X(t) one gets

$$\mathbf{P}\bigl(X (t) \in B\bigr) = \int^t_0 \mathbf{P}(\tau_0 \in d u) \mathbf {P}\bigl(X_0 (t - u) \in B\bigr) \to\boldsymbol{\pi}(B). $$

The theorem is proved. □

21.5.2 The Laws of Large Numbers and Central Limit Theorem for Integrals of Regenerative Processes

Consider a measurable mapping \(f: \mathcal{X}\to\mathbb{R}\) of the state space \(\mathcal{X}\) of a process X(t) to the real line \(\mathbb{R}\). As in Sect. 21.4.2, for the sake of simplicity, we can assume that \(\mathcal{X}=\mathbb{R}\) and the trajectories of X(t) lie in the space D(0,∞) of functions without discontinuities of the second kind. In this case the paths f(X(u)), u≥0, will be measurable functions, for which the integral

$$S(t)=\int_0^t f \bigl(X(u) \bigr)\,du $$

is well defined. For such integrals we have the following law of large numbers. Set

$$\zeta:=\int_0^\tau f \bigl(X_0(u) \bigr)\,du, \qquad a:=\mathbf{E}\tau. $$

Theorem 21.5.2

Let the conditions of Theorem 21.5.1 be satisfied and there exist a ζ :=E ζ. Then, as t→∞,

$$\frac{S(t)}{t}\stackrel{p}{\to}\frac{a_\zeta}{a}. $$

For conditions of existence of E ζ, see Theorem 21.5.4 below.

Proof

The proof of the theorem largely repeats that of the similar assertion (Theorem 13.8.1) for sums of random variables defined on a Markov chain. Divide the domain u≥0 into half-intervals

$$(0,T_0],\quad(T_{k-1},T_k],\quad k\geq1,\quad T_0=\tau_0, $$

where T k are the epochs of hitting the state x 0 by the process X(t), τ k =T k T k−1 for k≥1 are independent and distributed as τ. Then the random variables

$$\zeta_k=\int_{T_{k-1}}^{T_k}f \bigl(X(u) \bigr)\,du,\quad k\geq1 $$

are independent, distributed as ζ, and have finite expectation a ζ . The integral S(t) can be represented as

$$S(t)=z_0+\sum_{k=1}^{\nu(t)} \zeta_k+z_t, $$

where

$$\nu(t):=\max\{k:T_k\leq t\},\qquad z_0:=\int _0^{T_0}f \bigl(X(u) \bigr)\,du,\qquad z_t:=\int_{T_{\nu(t)}}^t f \bigl(X(u) \bigr)\,du. $$

Since τ 0 is a proper random variable, z 0 is a proper random variable as well, and hence \({z_{0}}/{t}\stackrel{\mathit{a}.\mathit {s}.}{\longrightarrow}0\) as t→∞. Further,

$$z_t\stackrel{d}{=}\int_0^{\gamma(t)}f \bigl(X_0(u) \bigr)\,du, $$

where γ(t)=tT ν(t) has a proper limiting distribution as t→∞ (see Chap. 10), so \({z_{t}}/{t}\stackrel{p}{\to}0\) as t→∞. The sum \(S_{\nu(t)}=\sum_{k=1}^{\nu(t)}\zeta_{k}\) is nothing else but the generalised renewal process studied in Chaps. 10 and 11. By Theorem 11.5.2, as t→∞,

$$\frac{S_{\nu(t)}}{t}\stackrel{p}{\to}\frac{a_\zeta}{a}. $$

The theorem is proved. □
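A simple illustration of Theorem 21.5.2 is a two-state Markov process viewed as a regenerative process. The sketch below, with assumed rates, estimates S(t)/t and can be compared with the value a_ζ/a computed by hand:

```python
import numpy as np

rng = np.random.default_rng(2)

# A two-state Markov process is the simplest regenerative process:
# visits to x_0 = 0 split its trajectory into i.i.d. cycles.
q0, q1 = 1.0, 3.0        # exponential exit rates (illustrative assumptions)
f = lambda x: x          # integrand; S(t)/t then estimates P(X(infty) = 1)

def time_average(t_max):
    x, t, S = 0, 0.0, 0.0
    while t < t_max:
        hold = min(rng.exponential(1.0 / (q0 if x == 0 else q1)), t_max - t)
        S += f(x) * hold          # integral of f(X(u)) over this sojourn
        t += hold
        x = 1 - x
    return S / t_max

# Theorem 21.5.2: S(t)/t -> a_zeta / a = (1/q1) / (1/q0 + 1/q1) = 0.25 here.
print(time_average(50000.0))
```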

In order to prove the strong law of large numbers we need a somewhat more restrictive condition than that in Theorem 21.5.2. Put

$$\zeta^*:=\int_0^\tau \bigl|f \bigl(X_0(u) \bigr) \bigr|\,du. $$

Theorem 21.5.3

Let the conditions of Theorem 21.5.1 be satisfied and E ζ <∞. Then

$$\frac{S(t)}{t}\stackrel{\mathit{a}.\mathit{s}.}{\longrightarrow } \frac{a_\zeta}{a}. $$

The proof

essentially repeats (as was the case for Theorem 21.5.2) that of the law of large numbers for sums of random variables defined on a Markov chain (see Theorem 13.8.3). One only needs to use, instead of (13.8.18), the relation

$$\sup_{T_k\leq u\leq T_{k+1}} \biggl|\,\int_{T_k}^u f \bigl(X(v) \bigr)\,dv \biggr|\leq \zeta_k^*=\int_{T_k}^{T_{k+1}} \bigl|f \bigl(X(v) \bigr) \bigr|\,dv $$

and the fact that \(\mathbf{E}\,\zeta_{k}^{*}<\infty\). The theorem is proved. □

Here an analogue of Theorem 13.8.2, in which the conditions for the existence of E ζ and E ζ  are elucidated, is the following.

Theorem 21.5.4

(Generalisation of Wald’s identity)

Let the conditions of Theorem 21.5.1 be met and there exist

$$\mathbf{E}\bigl|f \bigl(X(\infty) \bigr) \bigr|= \int \bigl|f(x) \bigr|\boldsymbol{\pi}(dx), $$

where X(∞) is a random variable with the stationary distribution  π. Then there exist

$$\mathbf{E}\zeta^*=\mathbf{E}\tau \mathbf{E}\bigl|f \bigl(X(\infty) \bigr) \bigr|,\qquad \mathbf{E}\zeta=\mathbf{E}\tau \mathbf{E}f \bigl(X(\infty) \bigr). $$

The proof of Theorem 21.5.4

repeats, with obvious changes, that of Theorem 13.8.2. □

Theorem 21.5.5

(The central limit theorem)

Let the conditions of Theorem 21.5.1 be met and E τ 2<∞, E ζ 2<∞. Then

$$\frac{S(t)-rt}{d\sqrt{t/a}}\mathbin {{\subset }\hspace {-.65em}{\Rightarrow }}\boldsymbol {\Phi }_{0,1},\quad t\to\infty, $$

where \(r=a_\zeta/a\), \(d^2 =\mathbf{D}(\zeta-r\tau)\).

The proof, as in the case of Theorems 21.5.2–21.5.4, repeats, up to evident changes, that of Theorem 13.8.4. □

Here an analogue of Theorem 13.8.5 (on the conditions of existence of variance and on an identity for a −1 d 2) looks more complicated than under the conditions of Sect. 13.8 and is omitted.

21.6 Diffusion Processes

Now we will consider an important class of Markov processes with continuous trajectories.

Definition 21.6.1

A homogeneous Markov process ξ(t) with state space \(\langle\mathbb{R},\mathfrak{B}\rangle\) and the transition function P(t,x,B) is said to be a diffusion process if, for some finite functions a(x) and b 2(x)>0,

(1) \(\lim_{\varDelta\to0} \frac{1}{ \varDelta} \int(y-x) P(\varDelta ,x,dy) =a(x)\);

(2) \(\lim_{\varDelta\to0} \frac{1}{ \varDelta} \int(y-x)^{2} P (\varDelta,x,dy) =b^{2} (x)\);

(3) for some δ>0 and c<∞,

$$\int|y-x|^{2+\delta} P (\varDelta,x,dy)< c\varDelta^{1+\delta/2}. $$

Put Δξ(t):=ξ(t+Δ)−ξ(t). Then the above conditions can be written in the form:

$$\begin{aligned} &\mathbf{E}\bigl(\varDelta\xi(t) \big| \xi(t)=x\bigr) =a(x)\varDelta+o(\varDelta), \\ &\mathbf{E}\bigl(\bigl(\varDelta\xi(t)\bigr)^2 \big| \xi(t)=x\bigr) =b^2(x)\varDelta+o(\varDelta), \\ &\mathbf{E}\bigl(\bigl|\varDelta\xi(t)\bigr|^{2+\delta} \big| \xi(t)=x\bigr) \le c\varDelta^{1+\delta/2}. \end{aligned}$$

The coefficients a(x) and b(x) are called the shift and diffusion coefficients, respectively. Condition (3) is an analogue of the Lyapunov condition. It could be replaced with a Lindeberg type condition:

(3a) \(\mathbf{E}[(\varDelta\xi(t))^2;\ |\varDelta\xi(t)|>\varepsilon]=o(\varDelta)\) for any ε>0 as Δ→0.

It follows immediately from condition (3) and the Kolmogorov theorem that a diffusion process ξ(t) can be thought of as a process with continuous trajectories.

The standard Wiener process w(t) is a diffusion process, since in that case

$$\mathbf{E}\bigl(\varDelta w(t) \big| w(t)=x\bigr) =0,\qquad \mathbf{E}\bigl(\bigl(\varDelta w(t)\bigr)^2 \big| w(t)=x\bigr) =\varDelta,\qquad \mathbf{E}\bigl(\varDelta w(t)\bigr)^4 =3\varDelta^2 . $$

Therefore the Wiener process has zero shift and a constant diffusion coefficient. Clearly, the process w(t)+at will have shift a and the same diffusion coefficient.
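Conditions (1) and (2) suggest the standard Euler–Maruyama scheme for simulating a diffusion with given shift and diffusion coefficients: over a step of length dt the increment is taken to be a(x)dt + b(x)√dt · N(0,1). A sketch with assumed coefficients:

```python
import numpy as np

rng = np.random.default_rng(3)

def euler_maruyama(a, b, x0, t_max, n):
    """Approximate a diffusion with shift a(x) and diffusion b(x):
    each step adds a(x) dt + b(x) sqrt(dt) N(0,1), which reproduces
    conditions (1)-(2) to first order in dt."""
    dt = t_max / n
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        x[k + 1] = (x[k] + a(x[k]) * dt
                    + b(x[k]) * np.sqrt(dt) * rng.standard_normal())
    return x

# Wiener process with shift a (illustrative choice): a(x) = 0.5, b(x) = 1.
path = euler_maruyama(lambda x: 0.5, lambda x: 1.0, x0=0.0, t_max=1.0, n=1000)
print(path[-1])
```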

We saw in Sect. 21.2 that the “local” characteristic Q of a Markov process ξ(t) with a discrete state space \(\mathcal{X}\) specifies uniquely the evolution law of the process. A similar situation takes place for diffusion processes: the distribution of the process is determined uniquely by the coefficients a(x) and b(x). The way to establishing this fact again lies via the Chapman–Kolmogorov equation.

Theorem 21.6.1

If the transition probability P(t;x,B) of a diffusion process is twice continuously differentiable with respect to x, then P(t;x,B) is differentiable with respect to t and satisfies the equation

$$ \frac{\partial P}{\partial t} = a\frac{\partial P}{\partial x} + \frac{b^2 }{2} \frac{\partial^2 P}{\partial x^2 } $$
(21.6.1)

with the initial condition

$$ P(0;x,B)=\mathrm{I}_B (x). $$
(21.6.2)

Remark 21.6.1

The conditions of the theorem on smoothness of the transition function P can actually be proved under the assumption that a and b are continuous, \(b\ge b_0 >0\), \(|a|\le c(|x|+1)\) and \(b^2 \le c(|x|+1)\).

Proof of Theorem 21.6.1

For brevity’s sake denote by \(P'_{t}\), \(P'_{x}\), and \(P''_{x}\) the partial derivatives \(\frac{\partial P}{\partial t}\), \(\frac{\partial P}{\partial x}\) and \(\frac{\partial^{2} P}{\partial x^{2} }\), respectively, and make use of the relation

$$P(t;y,B) =P(t;x,B) +(y-x) P'_x (t;x,B) +\frac{(y-x)^2}{2} P''_x (t;y_x ,B), $$
(21.6.3)

where \(y_x\) is a point lying between x and y.

Then by the Chapman–Kolmogorov equation

$$ \begin{aligned}[b] P(t+\varDelta;x,B) -P(t;x,B) & =\int P(\varDelta;x ,dy)\bigl[P(t;y,B) -P(t;x,B)\bigr] \\& =a(x) P'_x \varDelta+\frac{b^2 (x)}{2} P''_x \varDelta+o(\varDelta) +R , \end{aligned} $$
(21.6.4)

where

$$R=\int\frac{(y-x)^2}{2} \bigl[P''_x (t;y_x ,B) -P''_x (t;x,B) \bigr] P(\varDelta;x ,dy) =\int _{|y-x| \le\varepsilon} +\int _{|y-x| > \varepsilon} . $$

The first integral, by virtue of the continuity of \(P''_{x}\), does not exceed

$$\delta(\varepsilon) \biggl[ \frac{b^2 (x)}{2} \varDelta+o(\varDelta ) \biggr], $$

where δ(ε)→0 as ε→0; the second integral is o(Δ) by condition (3a). Since ε is arbitrary, one has R=o(Δ) and it follows from the above that

$$P'_t=\lim_{\varDelta\to0} \frac{P(t+\varDelta;x,B) -P(t;x,B)}{\varDelta} = a(x) P'_x +\frac{b^2 (x)}{2} P''_x . $$

This proves (21.6.1). The theorem is proved. □

It is known from the theory of differential equations that, under wide assumptions about the coefficients a and b and for B=(−∞,z), the Cauchy problem (21.6.1)–(21.6.2) has a unique solution P which is infinitely differentiable with respect to t, x and z. From this it follows that P(t;x,B) has a density p(t;x,z) which is the fundamental solution of (21.6.1).

It is also not difficult to derive from Theorem 21.6.1 that, along with P(t;x,B), the function

$$u(t,x) =\int g (z) P(t;x,dz) =\mathbf{E} \bigl[g \bigl(\xi^{(x)} (t) \bigr)\bigr] $$

will also satisfy Eq. (21.6.1) for any smooth function g with a compact support, ξ (x)(t) being the diffusion process with the initial value ξ (x)(0)=x.

In the proof of Theorem 21.6.1 we considered (see (21.6.4)) the time increment Δ preceding the main time interval. In this connection Eqs. (21.6.1) are called backward Kolmogorov equations. Forward equations can be derived in a similar way.

Theorem 21.6.2

(Forward Kolmogorov equations)

Let the transition density p(t;x,y) be such that the derivatives

$$\frac{\partial}{\partial y} \bigl[a(y) p(t;x,y)\bigr] \quad\mathit{and}\quad \frac{\partial^2 }{\partial y^2 } \bigl[b^2 (y) p(t;x,y)\bigr] $$

exist and are continuous. Then p(t,x,y) satisfies the equation

$$ Dp := \frac{\partial p}{\partial t} + \frac{\partial}{\partial y} \bigl[a(y) p(t;x,y)\bigr] -\frac{1}{2}\frac{\partial^2 }{\partial y^2 } \bigl[b^2 (y) p(t;x,y)\bigr]=0. $$
(21.6.5)

Proof

Let g(y) be a smooth function with a bounded support,

$$u(t,x) :=\mathbf{E}g \bigl(\xi^{(x)} (t)\bigr) =\int g (y) p(t;x,y) \,dy. $$

Then

$$ \begin{aligned}[b] &u(t+\varDelta,x) -u(t,x)\\& \quad{}= \int p(t; x,z) \biggl[\,\int p(\varDelta; z ,y) g (y) \,dy - \int p(\varDelta; z, y) g (z) \,dy \biggr] \,dz . \end{aligned} $$
(21.6.6)

Expanding the difference g(y)−g(z) into a series, we obtain in the same way as in the proof of Theorem 21.6.1 that, by virtue of properties (1)–(3), the expression in the brackets is

$$\biggl[ a(z) g' (z) +\frac{b^2 (z)}{2} g'' (z) \biggr] \varDelta +o(\varDelta) . $$

This implies that there exists the derivative

$$\frac{\partial u}{\partial t} =\int p(t;x,z) \biggl[ a(z) g' (z) + \frac{b^2 (z)}{2} g'' (z) \biggr] \,dz. $$

Integrating by parts and noting that \(\frac{\partial u}{\partial t} =\int p'_t (t;x,z) g(z)\,dz\), we get

$$\int \biggl\{ p'_t (t;x,z) +\frac{\partial }{\partial z} \bigl[a(z) p(t;x,z)\bigr] - \frac{1}{2} \frac{\partial^2 }{\partial z^2 } \bigl[b^2 (z) p(t;x,z)\bigr] \biggr\}g(z)\,dz =0 $$

or, which is the same,

$$\int D p(t;x,z) g (z) \,dz =0. $$

Since g is arbitrary, (21.6.5) follows. The theorem is proved. □

As in the case of discrete \(\mathcal{X}\), the difference between the forward and backward Kolmogorov equations becomes more apparent for non-homogeneous diffusion processes, when the transition probabilities P(s,x;t,B) depend on two time variables, while a and b in conditions (1)–(3) are functions of s and x. Then the backward Kolmogorov equation (for densities) will relate the derivatives of the transition densities p(s,x;t,y) with respect to the first two variables, while the forward equation will hold for the derivatives with respect to the last two variables.

We return to homogeneous diffusion processes. One can study conditions ensuring the existence of the limiting stationary distribution of ξ (x)(t) as t→∞ which is independent of x using the same approach as in Sect. 21.2. Theorem 21.2.3 will remain valid (one simply has to replace i 0 in it with x 0, in agreement with the notation of the present section). The proof of Theorem 21.2.3 also remains valid, but will need a somewhat more precise argument (in the new situation, on the event B dv one has ξ(tv)∈dx 0 instead of ξ(tv)=x 0).

If the stationary distribution density

$$ \lim_{t \to\infty} p(t;x,y) =p(y) $$
(21.6.7)

exists, how could one find it? Since the dependence of p(t;x,y) on t and x vanishes as t→∞, the backward Kolmogorov equations turn into the identity 0=0 as t→∞. Turning to the forward equations and passing in (21.6.6) to the limit first as t→∞ and then as Δ→0, we come, using the same argument as in the proof of Theorem 21.2.4, to the following conclusion.

Corollary 21.6.1

If (21.6.7) and the conditions of Theorem 21.6.2 hold, then the stationary density p(y) satisfies the equation

$$-\bigl[a(y) p(y)\bigr]' +\frac{1}{2} \bigl[b^2 (y) p(y)\bigr]'' =0 $$

(which is obtained from (21.6.5) if we put \(\frac {\partial p}{\partial t} =0 \)).
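When, additionally, \(b^2(y)p(y)\) and its derivative vanish as |y|→∞ (an assumption made for this sketch), the equation can be integrated explicitly. One integration gives

$$-a(y) p(y) +\frac{1}{2} \bigl[b^2 (y) p(y)\bigr]' =C , $$

and the decay at infinity forces C=0, whence

$$p(y) =\frac{C_1}{b^2 (y)} \exp \biggl\{ \int_0^y \frac{2a(u)}{b^2 (u)}\,du \biggr\} , $$

where \(C_1\) is a normalising constant. For the Ornstein–Uhlenbeck coefficients a(y)=ay (a<0), b(y)≡σ of Example 21.6.1 below, this yields the normal density with variance \(\sigma^2/(2|a|)\).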

Example 21.6.1

The Ornstein–Uhlenbeck process

$$\xi^{(x)} (t) =x e^{at} +\sigma e^{at}w \biggl( \frac{1-e^{-2at}}{2a} \biggr) , $$

where w(u) is the standard Wiener process, is a homogeneous diffusion process with the transition density

$$ p(t;x,y) =\frac{1}{\sqrt{2 \pi} \sigma(t)} \exp \biggl\{ -\frac{(y-xe^{at})^2 }{2 \sigma^2(t)} \biggr\} , \qquad\sigma^2 (t) =\frac{\sigma^2 }{ 2a }\bigl(e^{2at} -1\bigr). $$
(21.6.8)

We leave it to the reader to verify that this process has coefficients a(x)=ax, b(x)=σ=const, and that function (21.6.8) satisfies the forward and backward equations. For a<0, there exists a stationary process (the definition is given in the next chapter)

$$\xi(t) =\sigma e^{at} w \biggl( -\frac{e^{-2at}}{2a} \biggr) , $$

of which the density (which does not depend on t) is equal to

$$p(y) =\lim_{t \to\infty} p(t;x,y) = \frac{1}{\sqrt{2 \pi} \sigma (\infty)} \exp \biggl\{ {-}\frac{y^2 }{2 \sigma^2 (\infty)} \biggr\} , \qquad\sigma^2(\infty) =-\frac{\sigma^2 }{2a}. $$
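Since (21.6.8) gives the transition law in closed form, the Ornstein–Uhlenbeck process can be simulated without discretisation error and the stationary variance σ²/(2|a|) checked empirically. A sketch with assumed parameter values:

```python
import numpy as np

rng = np.random.default_rng(4)

a, sigma = -0.8, 1.3     # a < 0; both values are illustrative assumptions

def ou_step(x, dt):
    """Exact transition (21.6.8): given xi(t) = x, xi(t + dt) is normal
    with mean x e^{a dt} and variance sigma^2 (e^{2 a dt} - 1) / (2a)."""
    var = sigma**2 * (np.exp(2 * a * dt) - 1.0) / (2 * a)
    return x * np.exp(a * dt) + np.sqrt(var) * rng.standard_normal()

x, xs = 0.0, []
for _ in range(100000):
    x = ou_step(x, dt=0.05)
    xs.append(x)

# Empirical variance vs the stationary value sigma^2 / (2|a|).
print(np.var(xs), sigma**2 / (2 * abs(a)))
```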

In conclusion of this section we will consider the problem, important for various applications, of finding the probability that the trajectory of a diffusion process will not leave a given strip. For simplicity’s sake we confine ourselves to considering this problem for the Wiener process. Let c>0 and d<0.

Put

$$\begin{aligned} U(t;x,B) := & \mathbf{P}\bigl(w^{(x)} (u) \in(d,c) \ \mbox{for all } u \in [0,t];\ w^{(x)} (t) \in B \bigr) \\=& \mathbf{P} \Bigl(\, \sup_{u \le t} w^{(x)} (u) <c ,\, \inf_{u \le t} w^{(x)} (u) >d ,\, w^{(x)} (t) \in B \Bigr) . \end{aligned}$$

Leaving out the verification of the fact that the function U is twice continuously differentiable, we will only prove the following proposition.

Theorem 21.6.3

The function U satisfies Eq. (21.6.1) with the initial condition

$$ U(0;x,B) =\mathrm{I}_B (x) $$
(21.6.9)

and boundary conditions

$$ U(t;c,B) =U(t;d,B) =0. $$
(21.6.10)

Proof

First of all note that the function U(t;x,B) for x∈(d,c) satisfies conditions (1)–(3) imposed on the transition function P(t;x,B). Indeed, consider, for instance, property (1).

We have to verify that

$$ \int_d^c (y-x) U( \varDelta;x,dy)=\varDelta a(x) +o(\varDelta) $$
(21.6.11)

(with a(x)=0 in our case). But U(t,x,B)=P(t;x,B)−V(t;x,B), where

$$V(t;x,B) =\mathbf{P} \Bigl( \Bigl\{\, \sup_{u\le t} w^{(x)} (u) \ge c \ \mbox{or}\ \inf_{u\le t} w^{(x)} (u) \le d \Bigr\} \cap\bigl\{ w^{(x)} (t) \in B \bigr\} \Bigr) , $$

and

$$\int_d^c (y-x) V(\varDelta;x,dy) \le(c-d) \Bigl[ \mathbf{P} \Bigl(\,\sup_{u\le\varDelta} w^{(x)} (u) \ge c \Bigr) +\mathbf{P} \Bigl(\,\inf_{u\le\varDelta} w^{(x)} (u) \le d \Bigr) \Bigr]. $$

The first probability in the brackets is given, as we know (see (20.2.1) and Theorem 19.2.2), by the value

$$2\mathbf{P}\bigl( w^{(x)} (\varDelta) >c\bigr) =2\mathbf{P} \biggl( w(1) >\frac{c-x}{\sqrt{\varDelta}} \biggr) \sim\frac{2}{\sqrt{2\pi} z} e^{-{z^2 }/2}, \quad z=\frac{c-x}{\sqrt{\varDelta}}. $$

For any x<c and k>0, it is \(o(\varDelta^k)\). The same holds for the second probability. Therefore (21.6.11) is proved. In the same way one can verify properties (2) and (3).

Further, since, by the total probability formula, for x∈(d,c),

$$U(t+\varDelta;x,B) =\int_d^c U( \varDelta;x,dy) U(t;y,B), $$

using an expansion of the form (21.6.3) for the function U, we obtain in the same way as in (21.6.4) that

$$\begin{aligned} U(t+\varDelta;x,B) - U(t;x,B) =& \int U(\varDelta;x,dy) \bigl[U(t;y,B) - U(t;x,B) \bigr] \\=& a(x) \frac{\partial U}{ \partial x} \varDelta + \frac{b^2 (x)}{2} \frac{\partial^2 U}{ \partial x^2 } \varDelta +o(\varDelta). \end{aligned}$$

This implies that \(\frac{\partial U}{ \partial t}\) exists and that Eq. (21.6.1) holds for the function U.

That the boundary and initial conditions are met is obvious. The theorem is proved. □

The reader can verify that the function

$$u(t;x,y) :=\frac{\partial}{ \partial y} U\bigl(t;x,(-\infty,y)\bigr) ,\quad y \in(d,c) , $$

playing the role of the fundamental solution to the boundary problem (21.6.9)–(21.6.10) (the function u satisfies (21.6.1) with the boundary conditions (21.6.10) and the initial conditions degenerating into the δ-function), is equal to

$$\begin{aligned} u(t;x,y) =& \frac{1}{\sqrt{2\pi t}} \Biggl[\, \sum_{k=-\infty }^{\infty} \exp \biggl\{ {-}\frac{[y+2k(c-d)]^2}{2t} \biggr\} \\& {}-\sum_{k=0}^{\infty} \exp \biggl\{ -\frac{[y-2c-2k(c-d)]^2}{2t} \biggr\} \\& {}-\sum_{k=0}^{\infty} \exp \biggl\{ -\frac{[y-2d-2k(c-d)]^2}{2t} \biggr\} \Biggr] . \end{aligned}$$

This expression can also be obtained directly from probabilistic considerations (see, e.g., [32]).