Abstract
This chapter presents the fundamentals of the theory of general Markov processes in continuous time. Section 21.1 contains the definitions and a discussion of the Markov property and transition functions, and derives the Chapman–Kolmogorov equation. Section 21.2 studies Markov processes in countable state spaces, deriving systems of backward and forward differential equations for transition probabilities. It also establishes the ergodic theorem and contains examples illustrating the presented theory. Section 21.3 deals with continuous time branching processes. Then the elements of the general theory of semi-Markov processes are presented in Sect. 21.4, including the ergodic theorem and some other related results for such processes. Section 21.5 discusses the so-called regenerative processes, establishing their ergodicity and the Laws of Large Numbers and Central Limit Theorem for integrals of functions of their trajectories. Section 21.6 is devoted to diffusion processes. It begins with the classical definition of diffusion, derives the forward and backward Kolmogorov equations for the transition probability function of a diffusion process, and gives a couple of examples of using the equations to compute important characteristics of the respective processes.
21.1 Definitions and General Properties
Markov processes in discrete time (Markov chains) were considered in Chap. 13. Recall that their main property was the independence of the "future" of the process from its "past" given that its "present" is fixed. The same principle underlies the definition of Markov processes in the general case.
21.1.1 Definition and Basic Properties
Let \(\langle\varOmega,\mathfrak{F},\mathbf{P}\rangle\) be a probability space and {ξ(t)=ξ(t,ω), t≥0} a random process given on it. Set
$$\mathfrak{F}_{t} :=\sigma\bigl(\xi(u);\ u\le t\bigr),\qquad \mathfrak{F}_{[t,\infty)} :=\sigma\bigl(\xi(u);\ u\ge t\bigr),$$
so that the variable ξ(u) is \(\mathfrak{F}_{t}\)-measurable for u≤t and \(\mathfrak{F}_{[t,\infty)}\)-measurable for u≥t. The σ-algebra \(\sigma(\mathfrak{F}_{t} ,\mathfrak{F}_{[t,\infty)})\) is generated by the variables ξ(u) for all u and may coincide with \(\mathfrak{F}\) in the case of the sample probability space.
Definition 21.1.1
We say that ξ(t) is a Markov process if, for any t, \(A \in\mathfrak{F}_{t}\), and \(B \in\mathfrak{F}_{[t,\infty)}\), we have
$$\mathbf{P}\bigl(AB\mid \xi(t)\bigr)=\mathbf{P}\bigl(A\mid \xi(t)\bigr)\,\mathbf{P}\bigl(B\mid \xi(t)\bigr). \tag{21.1.1}$$
This expresses precisely the fact that the future is independent of the past when the present is fixed (conditional independence of \(\mathfrak{F}_{t}\) and \(\mathfrak{F}_{[t,\infty)}\) given ξ(t)).
We will now show that the above definition is equivalent to the following.
Definition 21.1.2
We say that ξ(t) is a Markov process if, for any bounded \(\mathfrak{F}_{[t,\infty)}\)-measurable random variable η,
$$\mathbf{E}\bigl(\eta\mid \mathfrak{F}_{t}\bigr)=\mathbf{E}\bigl(\eta\mid \xi(t)\bigr). \tag{21.1.2}$$
It suffices to take η to be functions of the form η=f(ξ(s)) for s≥t.
Proof of the equivalence
Let (21.1.1) hold. By the monotone convergence theorem it suffices to prove (21.1.2) for simple functions η. To this end it suffices, in turn, to prove (21.1.2) for η=I B , the indicator of the set \(B \in\mathfrak{F}_{[t ,\infty)}\). Let \(A \in\mathfrak{F}_{t}\). Then, by (21.1.1),
On the other hand,
Because (21.1.3) and (21.1.4) hold for any \(A \in \mathfrak{F}_{t}\), this means that \(\mathbf{P}(B | \mathfrak{F}_{t} ) =\mathbf{P}(B | \xi(t))\).
Conversely, let (21.1.2) hold. Then, for \(A \in\mathfrak{F}_{t}\) and \(B \in\mathfrak{F}_{[t ,\infty)}\), we have
□
It remains to verify that it suffices to take η=f(ξ(s)), s≥t, in (21.1.2). In order to do this, we need one more equivalent definition of a Markov process.
Definition 21.1.3
We say that ξ(t) is a Markov process if, for any bounded function f and any \(t_{1}<t_{2}<\cdots<t_{n}\le t\),
$$\mathbf{E}\bigl[f(\xi(t))\mid \xi(t_{1}),\ldots,\xi(t_{n})\bigr]=\mathbf{E}\bigl[f(\xi(t))\mid \xi(t_{n})\bigr]. \tag{21.1.5}$$
Proof of the equivalence
Relation (21.1.5) follows in an obvious way from (21.1.2). Now assume that (21.1.5) holds. Then, for any A∈σ(ξ(t 1),…,ξ(t n )),
Both parts of (21.1.6) are measures coinciding on the algebra of cylinder sets. Therefore, by the theorem on uniqueness of extension of a measure, they coincide on the σ-algebra generated by these sets, i.e. on \(\mathfrak{F}_{t_{n} }\). In other words, (21.1.6) holds for any \(A \in\mathfrak{F}_{t_{n} }\), which is equivalent to the equality
for any t n ≤t. Relation (21.1.2) for η=f(ξ(t)) is proved. □
We now prove that in (21.1.2) it suffices to take η=f(ξ(s)), s≥t. Let t≤u 1<⋯<u n . We prove that then (21.1.2) is true for
We will make use of induction and assume that equality (21.1.2) holds for the functions
(for n=1 relation (21.1.2) is true). Then, putting g(u n−1):=E[f n (ξ(u n ))|ξ(u n−1)], we obtain
By the induction hypothesis this implies that
and, therefore, that \(\mathbf{E}(\eta| \mathfrak{F}_{t} )\) is σ(ξ(t))-measurable and
We proved that (21.1.2) holds for σ(ξ(u 1),…,ξ(u n ))-measurable functions of the form (21.1.7). By passing to the limit we establish first that (21.1.2) holds for simple functions, and then that it holds for any \(\mathfrak{F}_{[t ,\infty)}\)-measurable functions. □
21.1.2 Transition Probability
We saw that, for a Markov process ξ(t), the conditional probability
$$\mathbf{P}\bigl(\xi(t)\in B\mid \xi(s)\bigr),\qquad s<t,$$
is a Borel function of ξ(s), which we will denote by
$$P\bigl(s,\xi(s);t,B\bigr).$$
One can say that P(s,x;t,B) as a function of B and x is the conditional distribution (see Sect. 4.9) of ξ(t) given that ξ(s)=x. By the Markov property, it satisfies the relation (for s<u<t)
$$P(s,x;t,B)=\int P(s,x;u,dy)\,P(u,y;t,B), \tag{21.1.8}$$
which follows from the equality
$$\mathbf{P}\bigl(\xi(t)\in B\mid \xi(s)\bigr)=\mathbf{E}\bigl[\mathbf{P}\bigl(\xi(t)\in B\mid \xi(u)\bigr)\bigm|\xi(s)\bigr].$$
Equation (21.1.8) is called the Chapman–Kolmogorov equation.
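As an illustration (not part of the text), the Chapman–Kolmogorov equation can be verified numerically for the simplest two-state homogeneous chain, whose transition function is available in closed form; the rates a, b below are arbitrary illustrative parameters.

```python
import math

def two_state_P(t, a=1.0, b=2.0):
    """Transition matrix P(t) for the two-state chain with jump rates
    a (state 0 -> 1) and b (state 1 -> 0); closed-form solution."""
    s = a + b
    e = math.exp(-s * t)
    p00 = b / s + a / s * e
    p11 = a / s + b / s * e
    return [[p00, 1.0 - p00], [1.0 - p11, p11]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Chapman–Kolmogorov: P(s + t) = P(s) P(t)
lhs = two_state_P(0.7)
rhs = matmul(two_state_P(0.3), two_state_P(0.4))
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12 for i in range(2) for j in range(2))
```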
The function P(s,x;t,B) can be used in an analytic definition of a Markov process. First we need to clarify what properties a function P x,B (s,t) should possess in order that there exists a Markov process ξ(t) for which
Let \(\langle\mathcal{X},\mathfrak{B}_{\mathcal{X}}\rangle\) be a measurable space.
Definition 21.1.4
A function P x,B (s,t) is said to be a transition function on \(\langle\mathcal{X},\mathfrak{ B}_{\mathcal{X}}\rangle\) if it satisfies the following conditions:
-
(1)
As a function of B, P x,B (s,t) is a probability distribution for each s≤t, \(x\in\mathcal{X}\).
-
(2)
P x,B (s,t) is measurable in x for each s≤t and \(B \in\mathfrak{ B}_{\mathcal{X}}\).
-
(3)
For 0≤s<u<t and all x and B,
$$P_{x,B} (s,t) =\int P_{x,dy} (s,u) P_{y,B} (u,t) $$(the Chapman–Kolmogorov equation).
-
(4)
P x,B (s,t)=I B (x) for s=t.
Here properties (1) and (2) ensure that P x,B (s,t) can be a conditional distribution (cf. Sect. 4.9).
Now define, with the help of P x,B (s,t), the finite-dimensional distributions of a process ξ(t) with the initial condition ξ(0)=a by the formula
By virtue of properties (3) and (4), these distributions are consistent and therefore by the Kolmogorov theorem define a process ξ(t) in \(\langle\mathbb{R}^{ T } ,\mathfrak{ B}_{R}^{T} \rangle\), where T=[0,∞).
By formula (21.1.9) and rule (21.1.5),
We could also verify this equality in a more formal way using the fact that the integrals of both sides over the set {ξ(t 1)∈B 1,…,ξ(t n−1)∈B n−1} coincide.
Thus, by virtue of Definition 21.1.3, we have constructed a Markov process ξ(t) for which
This function will also be called the transition function (or transition probability) of the process ξ(t).
Definition 21.1.5
A Markov process ξ(t) is said to be homogeneous if P(s,x;t,B), as a function of s and t, depends on the difference t−s only:
$$P(s,x;t,B)=P(t-s;x,B).$$
This is the probability of transition from x to B during a time interval of length t−s. If
$$P(u;x,B)=\int_{B} p(u;x,y)\,dy,$$
then the function p(u;x,y) is said to be a transition density.
It is not hard to see that the Wiener and Poisson processes are both homogeneous Markov processes. For example, for the Wiener process,
$$p(u;x,y)=\frac{1}{\sqrt{2\pi u}}\,e^{-(y-x)^{2}/(2u)}.$$
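As a numerical illustration (not from the text), the Wiener transition density satisfies the density form of the Chapman–Kolmogorov equation, \(\int p(s;x,z)\,p(t;z,y)\,dz=p(s+t;x,y)\); the check below uses a simple Riemann sum with illustrative values of s, t, x, y.

```python
import math

def wiener_density(u, x, y):
    """Transition density of the Wiener process: Gaussian with mean x, variance u."""
    return math.exp(-(y - x) ** 2 / (2 * u)) / math.sqrt(2 * math.pi * u)

# Chapman–Kolmogorov for densities, checked by numerical integration over z
s, t, x, y = 0.5, 1.5, 0.0, 1.0
h = 0.005
zs = [-10.0 + h * k for k in range(4001)]       # grid covering the Gaussian mass
conv = sum(wiener_density(s, x, z) * wiener_density(t, z, y) for z in zs) * h
assert abs(conv - wiener_density(s + t, x, y)) < 1e-6
```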
21.2 Markov Processes with Countable State Spaces. Examples
21.2.1 Basic Properties of the Process
Assume without loss of generality that the “discrete state space” \(\mathcal{X}\) coincides with the set of integers {0,1,2,…}. For simplicity’s sake we will only consider homogeneous Markov processes.
The transition function of such a process is determined by the collection of functions \(P(t;i,j)=p_{ij}(t)\) which form a stochastic matrix \(P(t)=\|p_{ij}(t)\|\) (with \(p_{ij}(t)\ge 0\), \(\sum_{j}p_{ij}(t)=1\)). The Chapman–Kolmogorov equation now takes the form
$$p_{ij}(t+s)=\sum_{k}p_{ik}(t)\,p_{kj}(s)$$
or, which is the same, in matrix form,
$$P(t+s)=P(t)\,P(s). \tag{21.2.1}$$
In what follows, we consider only stochastically continuous processes, for which \(\xi(t+s)\stackrel{p}{\to}\xi(t)\) as s→0; in the case under consideration, this is equivalent to each of the following three relations:
$$p_{ii}(s)\to 1,\qquad p_{ij}(s)\to 0\ \ (j\ne i),\qquad P(s)\to E \tag{21.2.2}$$
as s→0 (component-wise; E is the unit matrix).
We will also assume that convergence in (21.2.2) is uniform (for a finite \(\mathcal{X}\) this is always the case).
In accordance with the separability requirement, we will assume that ξ(t) cannot change its state in "zero time" more than once (thus excluding the effects illustrated in Example 18.1.1, i.e. assuming that if ξ(t)=j then, with probability 1, ξ(t+s)=j for s∈[0,τ), τ=τ(ω)>0). In that case, the trajectories of the process are piecewise constant (right-continuous for definiteness), i.e. the time axis is divided into half-intervals \([0,\tau_{1})\), \([\tau_{1},\tau_{1}+\tau_{2})\), … , on which ξ(t) is constant. Put
$$q_{i}(t):=\mathbf{P}\bigl(\tau_{1}>t\mid \xi(0)=i\bigr).$$
Theorem 21.2.1
Under the above assumptions (stochastic continuity and separability),
$$\mathbf{P}\bigl(\tau_{1}>t\mid \xi(0)=i\bigr)=e^{-q_{i}t},$$
where \(q_{i}<\infty\); moreover, \(q_{i}>0\) if \(p_{ii}(t)\not\equiv 1\). There exist the limits
$$q_{i}=\lim_{t\to 0}\frac{1-p_{ii}(t)}{t},\qquad q_{ij}=\lim_{t\to 0}\frac{p_{ij}(t)}{t}\quad (j\ne i), \tag{21.2.3}$$
where \(\sum_{j:j\ne i}q_{ij}=q_{i}\).
Proof
By the Markov property,
$$q_{i}(t+s)=q_{i}(t)\,q_{i}(s),$$
and \(q_{i}(t)\downarrow\). Therefore this equation has a unique solution \(q_{i}(t)=e^{-q_{i}t}\), in which \(q_{i}<\infty\) since \(\mathbf{P}(\tau_{1}>0)=1\), and \(q_{i}>0\) because \(q_{i}(t)<1\) when \(p_{ii}(t)\not\equiv 1\).
Let further 0<t 0<t 1⋯<t n <t. Since the events
are disjoint,
Here, by condition (21.2.2), p ji (t−t r+1)<ε t for all j≠i, and ε t →0 as t→0, so that the sum in (21.2.4) does not exceed
Together with the obvious inequality p ii (t)≥q i (t) this gives
(i.e. the asymptotic behaviour of \(1-q_{i}(t)\) and \(1-p_{ii}(t)\) as \(t\to 0\) is identical). This implies the second assertion of the theorem (i.e., the first relation in (21.2.3)).
Now let t r :=rt/n. Consider the transition probabilities
This implies that
and that the upper limit on the right-hand side is bounded. Passing to the limit as t→0, we obtain
Since ∑ j:j≠i p ij (t)=1−p ii (t), we have ∑ j:j≠i q ij =q i . The theorem is proved. □
The theorem shows that the quantities
$$p_{ij}:=\frac{q_{ij}}{q_{i}},\qquad j\ne i,$$
form a stochastic matrix and give the probabilities of transition from i to j during an infinitesimal time interval Δ given the process ξ(t) left the state i during that time interval:
as Δ→0.
Thus the evolution of ξ(t) can be thought of as follows. If ξ(0)=X 0, then ξ(t) stays at X 0 for a random time τ 1 having the exponential distribution \(\boldsymbol{\Gamma}_{q_{X_{0}}}\) with parameter \(q_{X_{0}}\). Then ξ(t) passes to a state X 1 with probability \(p_{X_{0} X_{1}}\). Further, ξ(t)=X 1 over the time interval \([\tau_{1},\tau_{1}+\tau_{2})\), where τ 2 has the distribution \(\boldsymbol{\Gamma}_{q_{X_{1}}}\), after which the system changes its state to X 2, and so on. It is clear that X 0,X 1,… is a homogeneous Markov chain with the transition matrix \(\|p_{ij}\|\). Therefore the further study of ξ(t) can in many respects be reduced to that of the Markov chain {X n ; n≥0}, which was carried out in detail in Chap. 13.
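This two-step evolution (exponential holding times governed by the q_i, jumps governed by the embedded chain) translates directly into a simulation scheme. The following sketch is illustrative and not from the text; the two-state rates and the long-run check against the stationary value 2/3 are assumptions chosen for the example.

```python
import random

def simulate_ctmc(q, p, x0, t_max, rng):
    """Simulate a CTMC: in state i, wait an exponential time with rate q[i],
    then jump according to the embedded-chain probabilities p[i] (dict j -> p_ij)."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        t += rng.expovariate(q[x])
        if t >= t_max:
            break
        states, probs = zip(*sorted(p[x].items()))
        x = rng.choices(states, probs)[0]
        path.append((t, x))
    return path

# illustrative two-state chain: q_0 = 1, q_1 = 2, jumps 0 <-> 1 with probability 1
rng = random.Random(0)
t_max = 10000.0
path = simulate_ctmc([1.0, 2.0], [{1: 1.0}, {0: 1.0}], 0, t_max, rng)

# time-average occupation of state 0; theory: (pi_0/q_0)/(pi_0/q_0 + pi_1/q_1) = 2/3
occ = 0.0
for (t0, s), (t1, _) in zip(path, path[1:]):
    if s == 0:
        occ += t1 - t0
if path[-1][1] == 0:
    occ += t_max - path[-1][0]
assert abs(occ / t_max - 2 / 3) < 0.03
```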
We see that the evolution of ξ(t) is completely specified by the quantities q ij and q i forming the matrix
$$Q=\|q_{ij}\|,$$
where we put q ii :=−q i , so that ∑ j q ij =0. We can also justify this claim using an analytic approach. To simplify the technical side of the exposition, we will assume, where needed, that the entries of the matrix Q are bounded and that convergence in (21.2.3) is uniform in i.
Denote by \(e^{A}\) the matrix-valued function
$$e^{A}:=\sum_{k=0}^{\infty}\frac{A^{k}}{k!}.$$
Theorem 21.2.2
The transition probabilities p ij (t) satisfy the systems of differential equations
$$P'(t)=QP(t) \tag{21.2.6}$$
and
$$P'(t)=P(t)Q. \tag{21.2.7}$$
Each of the systems (21.2.6) and (21.2.7) has a unique solution
$$P(t)=e^{Qt}.$$
It is clear that the solution can be obtained immediately by formally integrating equation (21.2.6).
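The solution \(P(t)=e^{Qt}\) and the backward equation can be checked numerically; the sketch below (illustrative, not from the text) computes the matrix exponential by its truncated power series for a small assumed generator and verifies \(P'(t)=QP(t)\) by a central difference.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(Q, t, terms=60):
    """P(t) = e^{Qt}, computed from the truncated power series sum_k (Qt)^k / k!."""
    n = len(Q)
    P = [[float(i == j) for j in range(n)] for i in range(n)]  # (Qt)^0 / 0! = identity
    term = [row[:] for row in P]
    Qt = [[v * t for v in row] for row in Q]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, Qt)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

Q = [[-1.0, 1.0], [2.0, -2.0]]        # a generator: rows sum to 0, q_ii = -q_i
P = mat_exp(Q, 0.5)
assert all(abs(sum(row) - 1.0) < 1e-10 for row in P)   # P(t) is stochastic

# backward equation P'(t) = Q P(t), checked by a central difference in t
h = 1e-5
Ph, Pm = mat_exp(Q, 0.5 + h), mat_exp(Q, 0.5 - h)
D = [[(Ph[i][j] - Pm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
QP = mat_mul(Q, P)
assert all(abs(D[i][j] - QP[i][j]) < 1e-6 for i in range(2) for j in range(2))
```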
Proof
By virtue of (21.2.1), (21.2.2) and (21.2.5),
In the same way we obtain, from the equality
the second equation in (21.2.7). The passage to the limit is justified by the assumptions we made.
Further, it follows from (21.2.6) that the function P(t) is infinitely differentiable, and
The theorem is proved. □
Because of the derivation method, (21.2.6) is called the backward Kolmogorov equation, and (21.2.7) is known as the forward Kolmogorov equation (the time increment is taken after or before the basic time interval).
The difference between these equations becomes even more apparent in the case of inhomogeneous Markov processes, when the transition probabilities
depend on two time arguments: s and t. In that case, (21.2.1) becomes the equality P(s,t+u)=P(s,t)P(t,t+u), and the backward and forward equations have the form
respectively, where
The reader can derive these relations independently.
What are the general conditions for existence of a stationary limiting distribution? We can use here an approach similar to that employed in Chap. 13.
Let ξ (i)(t) be a process with the initial value ξ (i)(0)=i and right-continuous trajectories. For a given i 0, put
Here in the second formula we consider the values t≥ν k−1+1, since for t≥ν k−1 we would have ν k ≡ν k−1. Clearly, P(ν k −ν k−1=1)>0, and P(ν k −ν k−1∈(t,t+h))>0 for any t≥1 and h>0 provided that \(p_{i_{0} i_{0}} (t) \not\equiv1\).
Note also that the variables ν k , k=0,1,… , are not defined for all elementary outcomes. We put ν 0=∞ if ξ (i)(t)≠i 0 for all t≥0. A similar convention is used for ν k , k≥1. The following ergodic theorem holds.
Theorem 21.2.3
Let there exist a state i 0 such that E ν 1<∞ and P(ν (i)<∞)=1 for all \(i \in\mathcal{X}_{0} \subset\mathcal{X}\). Then there exist the limits
which are independent of \(i \in\mathcal{X}_{0}\).
Proof
As was the case for Markov chains, the epochs ν 1,ν 2,… divide the time axis into independent cycles of the same nature, each of them being completed when the system returns for the first time (after one time unit) to the state i 0. Consider the renewal process generated by the sums ν k , k=0,1,… , of independent random variables ν 0, ν k −ν k−1, k=1,2,… . Let
The event A dv :={γ(t)∈[v,v+dv)} can be represented as the intersection of the events
and \(C_{v}:=\{\xi(u)\ne i_{0}\mbox{ for } u\in[t-v+1,t]\}\in\mathfrak{F}_{[t-v,\infty)}\). We have
On the set B dv , one has ξ(t−v)=i 0, and hence the probability inside the last integral is equal to
and is independent of t and i. Since P(B dv )=dH(t−v), one has
By the key renewal theorem, as t→∞, this integral converges to
The existence of the last integral follows from the inequality g(v)≤P(ν 1>v). The theorem is proved. □
Theorem 21.2.4
If the stationary distribution
$$P:=\lim_{t\to\infty}P(t)$$
exists, with all the rows of the matrix P being identical, then it is the unique solution of the equation
$$PQ=0. \tag{21.2.10}$$
It is evident that Eq. (21.2.10) is obtained by setting P′(t)=0 in the forward equation (21.2.7). The backward equation (21.2.6) gives only the trivial equality QP=0, which holds automatically since all rows of P are identical and ∑ j q ij =0.
Proof
Equation (21.2.10) is obtained by passing to the limit in (21.2.8) first as t→∞ and then as s→0. Now assume that P 1 is a solution of (21.2.10), i.e. P 1 Q=0. Then P 1 P(t)=P 1 for t<1, since
Further, P 1=P 1 P k(t)=P 1 P(kt), P(kt)→P as k→∞, and hence P 1=P 1 P=P. The theorem is proved. □
Now consider a Markov chain {X n } in discrete time with transition probabilities p ij =q ij /q i , i≠j, p ii =0. Suppose that this chain is ergodic (see Theorem 13.4.1). Then its stationary probabilities {π j } satisfy Eqs. (13.4.2). Now note that Eq. (21.2.10) can be written in the form
$$p_{j}q_{j}=\sum_{i:i\ne j}p_{i}q_{ij}=\sum_{i:i\ne j}p_{i}q_{i}p_{ij},$$
which has an obvious solution p j =cπ j /q j , c=const. Therefore, if
$$\sum_{j}\frac{\pi_{j}}{q_{j}}<\infty, \tag{21.2.11}$$
then there exists a solution to (21.2.10) given by
$$p_{j}=\frac{\pi_{j}/q_{j}}{\sum_{i}\pi_{i}/q_{i}}. \tag{21.2.12}$$
In Sects. 21.4 and 21.5 we will derive the ergodic theorem for processes of a more general form than the one in the present section. That theorem will imply, in particular, that ergodicity of {X n } and convergence (21.2.11) imply (21.2.9). Recall that, for ergodicity of {X n }, it suffices, in turn, that Eqs. (13.4.2) have a solution {π j }. Thus the existence of solution (21.2.12) implies the ergodicity of ξ(t).
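The recipe p j ∝ π j /q j can be tried out numerically: the sketch below (illustrative, not from the text; the 3-state generator is an assumed example) finds the stationary distribution of the embedded chain by power iteration and checks that the resulting vector solves the stationary equation.

```python
def stationary_from_embedded(Q, iters=2000):
    """Stationary distribution of a finite CTMC via (21.2.12): p_j proportional
    to pi_j / q_j, where pi is the stationary distribution of the embedded chain
    p_ij = q_ij / q_i (found here by power iteration)."""
    n = len(Q)
    q = [-Q[i][i] for i in range(n)]
    P = [[Q[i][j] / q[i] if j != i else 0.0 for j in range(n)] for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    p = [pi[j] / q[j] for j in range(n)]
    s = sum(p)
    return [v / s for v in p]

# illustrative 3-state generator (rows sum to 0)
Q = [[-2.0, 1.0, 1.0], [1.0, -3.0, 2.0], [1.0, 1.0, -2.0]]
p = stationary_from_embedded(Q)
assert abs(sum(p) - 1.0) < 1e-12
# p solves the stationary equation pQ = 0
assert all(abs(sum(p[i] * Q[i][j] for i in range(3))) < 1e-8 for j in range(3))
```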
21.2.2 Examples
Example 21.2.1
The Poisson process ξ(t) with parameter λ is a Markov process for which q i =λ, q i,i+1=λ, and p i,i+1=1, i=0,1,… . For this process, the stationary distribution p=(p 0,p 1,…) does not exist (each trajectory goes to infinity).
Example 21.2.2
Birth-and-death processes. These are processes for which, for i≥1,
$$q_{i,i+1}=\lambda_{i},\qquad q_{i,i-1}=\mu_{i},\qquad q_{i}=\lambda_{i}+\mu_{i},$$
so that
$$p_{i,i+1}=\frac{\lambda_{i}}{\lambda_{i}+\mu_{i}},\qquad p_{i,i-1}=\frac{\mu_{i}}{\lambda_{i}+\mu_{i}}$$
are the probabilities of birth and death, respectively, of a particle in a certain population, given that the population consisted of i particles and changed its composition. For i=0 one should put μ 0:=0. Establishing conditions for the existence of a stationary regime is a rather difficult problem (related mainly to finding conditions under which the trajectory escapes to infinity). If the stationary regime exists then, according to Theorem 21.2.4, the stationary probabilities p j can be uniquely determined from the recursive relations (see Eq. (21.2.10); in our case q ii =−q i =−(λ i +μ i ))
$$\lambda_{j-1}p_{j-1}+\mu_{j+1}p_{j+1}=(\lambda_{j}+\mu_{j})p_{j},\qquad j\ge 0,\ \lambda_{-1}:=0, \tag{21.2.13}$$
and the condition ∑p j =1.
Example 21.2.3
The telephone lines problem from queueing theory. Suppose we are given a system consisting of infinitely many communication channels which are used for telephone conversations. The probability that, for a busy channel, the transmitted conversation terminates during a small time interval (t,t+Δ) is equal to λΔ+o(Δ). The probability that a request for a new conversation (a new call) arrives during the same time interval is μΔ+o(Δ). Thus the “arrival flow” of calls is nothing else but the Poisson process with parameter λ, and the number ξ(t) of busy channels at time t is the value of the birth-and-death process for which λ i =λ and μ i =iμ.
In that case, it is not hard to verify with the help of Theorem 21.2.3 that there always exists a stationary limiting distribution, for which Eqs. (21.2.13) take the form
$$\mu k\,p_{k}=\lambda p_{k-1},\qquad k=1,2,\ldots \tag{21.2.14}$$
From this we get that
$$p_{k}=p_{0}\,\frac{(\lambda/\mu)^{k}}{k!}, \tag{21.2.15}$$
so that p 0=e −λ/μ, and the limiting distribution will be the Poisson law with parameter λ/μ.
If the number of channels n is finite, the calls which find all the lines busy will be rejected, and in (21.2.13) one has to put λ n =0, p n+1=p n+2=⋯=0. In that case, the last equation in (21.2.14) will have the form μnp n =λp n−1. Since formulas (21.2.15) remain true for k≤n, we obtain the so-called Erlang formulas for the stationary distribution:
$$p_{k}=\frac{(\lambda/\mu)^{k}/k!}{\sum_{j=0}^{n}(\lambda/\mu)^{j}/j!},\qquad k=0,1,\ldots,n$$
(the truncated Poisson distribution).
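The truncated Poisson distribution is easy to tabulate; the sketch below (illustrative, with an assumed load λ/μ = 2 and n = 4 channels) computes it and checks the balance relations kμ p k = λ p k−1.

```python
import math

def erlang_stationary(rho, n):
    """Truncated Poisson (Erlang) distribution for n channels, rho = lambda/mu."""
    w = [rho ** k / math.factorial(k) for k in range(n + 1)]
    s = sum(w)
    return [v / s for v in w]

p = erlang_stationary(2.0, 4)          # offered load lambda/mu = 2, n = 4 channels
assert abs(sum(p) - 1.0) < 1e-12
# balance relations: k * mu * p_k = lambda * p_{k-1}; here mu = 1, lambda = 2
assert all(abs(k * p[k] - 2.0 * p[k - 1]) < 1e-12 for k in range(1, 5))
# p[4] is the probability that an arriving call finds all channels busy
```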
The next example will be considered in a separate section.
21.3 Branching Processes
The essence of the mathematical model describing a branching process remains roughly the same as in Sect. 7.7.2. A continuous time branching process can be defined as follows. Let ξ (i)(t) denote the number of particles at time t under the initial condition ξ (i)(0)=i. Each particle, independently of all others, splits during the time interval (t,t+Δ) with probability μΔ+o(Δ) into a random number η≠1 of particles (if η=0, we say that the particle dies). Thus,
$$\xi^{(i)}(t)=\sum_{k=1}^{i}\xi_{k}^{(1)}(t), \tag{21.3.1}$$
where \(\xi_{k}^{(1)}(t)\) are independent and distributed as ξ (1)(t). Moreover, with h k :=P(η=k),
$$\mathbf{P}\bigl(\xi^{(i)}(\Delta)=j\bigr)=i\mu h_{j-i+1}\Delta+o(\Delta),\qquad j\ne i, \tag{21.3.2}$$
so that here q ij =iμh j−i+1, q ii =−iμ.
By formula (21.3.2), iμΔ is the principal part of the probability that at least one particle will split. Clearly, the state 0 is absorbing. It will not be absorbing any more if one considers processes with immigration when a Poisson process (with intensity λ) of “outside” particles is added to the process ξ (i)(t). Then
We return to the branching process (21.3.1), (21.3.2). By (21.3.1) we have
where
Equation (21.2.7) implies
Therefore, differentiating (21.3.3) with respect to t, we find that
But q 1l =μp l for l≠1, q 11=−μ, and putting
we can write (21.3.4) in the form
We have obtained a differential equation for r=r(t,z) (equivalent to (21.3.2)) which is more convenient to write in the form
Consider the behaviour of the function \(f_{1}(y)=\mathbf{E}\,y^{\eta}-y\) on [0,1]. Clearly, \(f_{1}(0)=\mathbf{P}(\eta=0)\), \(f_{1}(1)=0\), and
$$f_{1}''(y)=\mathbf{E}\,\eta(\eta-1)y^{\eta-2}\ge 0.$$
Consequently, the function f 1(y) is convex and has no zeros in (0,1) if E η≤1. When E η>1, there exists a point q∈(0,1) such that f 1(q)=0, \(f'_{1} (q) <0\) (see Fig. 21.1), and \(f_{1} (y) =(y-q) f'_{1} (q) +O((y-q)^{2})\) in the vicinity of this point.
Thus if E η>1, z<q and r↑q, then, by virtue of the representation
we obtain
This implies that, as t→∞,
In particular, the extinction probability
$$p_{10}(t)=r(t,0)$$
converges exponentially fast to q, p 10(∞)=q. Comparing this with the results of Sect. 7.7, the reader can see that the extinction probability for the discrete time branching process has the same value (we could also come to this conclusion directly). Since \(p_{k0}(t)=[p_{10}(t)]^{k}\), one has \(p_{k0}(\infty)=q^{k}\).
It follows from (21.3.5) that the remaining “probability mass” of the distribution of ξ(t) quickly moves to infinity as t→∞.
If E η<1, the above argument remains valid with q replaced with 1, so that the extinction probability is p 10(∞)=p k0(∞)=1.
If E η=1, then
Thus the extinction probability r(t,0)=p 10(t) also tends to 1 in this case.
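The extinction probability q, the smallest fixed point of \(q=\mathbf{E}\,q^{\eta}\) in [0,1], can be computed by monotone iteration from 0. The sketch below is illustrative (the offspring distributions are assumed examples); in the supercritical case P(η=0)=1/4, P(η=2)=3/4 the fixed-point equation 0.25+0.75q²=q has roots 1/3 and 1, so q=1/3.

```python
def extinction_prob(probs, tol=1e-12, max_iter=100000):
    """Smallest fixed point in [0,1] of q = E q^eta, found by iterating from 0
    (probs[k] = P(eta = k)); this is the extinction probability."""
    q = 0.0
    for _ in range(max_iter):
        q_new = sum(p * q ** k for k, p in enumerate(probs))
        if abs(q_new - q) < tol:
            return q_new
        q = q_new
    return q

# supercritical: P(eta=0) = 1/4, P(eta=2) = 3/4, E eta = 3/2 > 1; here q = 1/3
q = extinction_prob([0.25, 0.0, 0.75])
assert abs(q - 1 / 3) < 1e-9
# subcritical: P(eta=0) = 3/4, P(eta=2) = 1/4, E eta = 1/2 < 1; extinction is certain
assert abs(extinction_prob([0.75, 0.0, 0.25]) - 1.0) < 1e-9
```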
21.4 Semi-Markov Processes
21.4.1 Semi-Markov Processes on the States of a Chain
Semi-Markov processes can be described as follows. Let an aperiodic irreducible discrete time Markov chain {X n } with the state space \(\mathcal{X}= \{ 0, 1, 2,\dots\}\) be given. To each state i we put into correspondence the distribution F i (t) of a positive random variable ζ (i):
$$F_{i}(t)=\mathbf{P}\bigl(\zeta^{(i)}<t\bigr).$$
Consider sequences \(\zeta^{(i)}_{1}, \zeta^{(i)}_{2}, \ldots\), \(\zeta^{(i)}_{j}\stackrel{d}{=}\zeta^{(i)}\), of independent random variables with the distribution F i , independent of the chain {X n } and of each other. Let, moreover, the distribution of the initial random vector (X 0,ζ 0), \(X_{0} \in \mathcal{X}\), ζ 0≥0, be given. The evolution of the semi-Markov process ξ(u) is described as follows:
and so on. Thus, upon entering state X n =j, the trajectory of ξ(u) remains in that state for a random time \(\zeta_{n}^{(X_{n})} = \zeta_{n}^{(j)}\), then switches to state X n+1 and so on. It is evident that such a process is, generally speaking, not Markovian. It will be a Markov process only if
and will then coincide with the process described in Sect. 21.2.
If the distribution F i is not exponential, then, given the value ξ(t)=i, the time between t and the next jump epoch will depend on the epoch of the preceding jump of ξ(⋅), because
for non-exponential F i depends on v. It is this property that makes the process non-Markovian: fixing the "present" (i.e. the value of ξ(t)) does not make the "future" of the process ξ(u) independent of its "past" (i.e. of the trajectory of ξ(u) for u<t).
The process ξ(t) can be "complemented" to a Markov process by adding to it the component χ(t) whose value gives the time u for which the trajectory ξ(t+u), u≥0, will remain in the current state ξ(t). In other words, χ(t) is the excess of level t for the random walk Z 0,Z 1,… (see Fig. 21.2):
The process χ(t) is Markovian and has “saw-like” trajectories deterministic inside the intervals (Z k ,Z k+1). The process X(t)=(ξ(t),χ(t)) is obviously Markovian, since the value of X(t) uniquely determines the law of evolution of the process X(t+u) for u≥0 whatever the “history” X(v), v<t, is. Similarly, we could consider the Markov process Y(t)=(ξ(t),γ(t)), where γ(t) is the defect of level t for the walk Z 0,Z 1,… :
21.4.2 The Ergodic Theorem
In the sequel, we will distinguish between the following two cases.
(A) The arithmetic case when the possible values of ζ (i), i=0,1,…, are multiples of a certain value h which can be assumed without loss of generality to be equal to 1. In that case we will also assume that the g.c.d. of the possible values of the sums of the variables ζ (i) is also equal to h=1. This is clearly equivalent to assuming that the g.c.d. of the possible values of recurrence times θ (i) of ξ(t) to the state i is equal to 1 for any fixed i.
(NA) The non-arithmetic case, when condition (A) does not hold.
Put a i :=E ζ (i).
Theorem 21.4.1
Let the Markov chain {X n } be ergodic (satisfy the conditions of Theorem 13.4.1) and {π j } be the stationary distribution of that chain. Then, in the non-arithmetic case (NA), for any initial distribution of (ζ 0,X 0), there exists the limit
$$\lim_{t\to\infty}\mathbf{P}\bigl(\xi(t)=j,\ \chi(t)>v\bigr)=\frac{\pi_{j}\int_{v}^{\infty}(1-F_{j}(u))\,du}{\sum_{i}a_{i}\pi_{i}}. \tag{21.4.2}$$
In the arithmetic case (A), (21.4.2) holds for integer-valued v (the integral becomes a sum in that case). It follows from (21.4.2) (with v=0) that the limit
$$p_{j}:=\lim_{t\to\infty}\mathbf{P}\bigl(\xi(t)=j\bigr)=\frac{a_{j}\pi_{j}}{\sum_{i}a_{i}\pi_{i}}$$
exists.
Proof
For definiteness we restrict ourselves to the non-arithmetic case (NA). In Sect. 13.4 we considered the times τ (i) between consecutive visits of {X n } to state i. These times could be called “embedded”, as well as the chain {X n } itself in regard to the process ξ(t). Along with the times τ (i), we will need the “real” times θ (i) between the visits of the process ξ(t) to the state i. Let, for instance, X 1=1. Then
where τ=τ (1). For definiteness and to reduce notation, we fix for the moment the value i=1 and put θ (1)=:θ. Let first
Then the whole trajectory of the process X(t) for t≥0 will be divided into identically distributed independent cycles by the epochs when the process hits the state ξ(t)=1. We denote the lengths of these cycles by θ 1,θ 2… ; they are independent and identically distributed. We show that
Denote by θ(n) the “real” time spent on n transitions of the governing chain {X n }. Then
where η(n):=min{k:T k >n}, \(T_{k} = \sum^{k}_{j=1}\tau_{j}\), τ j are independent and distributed as τ. We prove that, as n→∞,
By Wald’s identity and (21.4.5),
where E η(n)∼n/E τ=nπ 1.
Now we bound from below the expectation E θ(n). Put m:=⌊nπ 1−εn⌋, \(\varTheta_{n} := \sum^{n}_{j=1} \theta_{j}\). Then
Here the random variable Θ m /m≥0 possesses the properties
Therefore it satisfies the conditions of part 4 of Lemma 6.1.1 and is uniformly integrable. This, in turn, by Lemma 6.1.2 and convergence P(η(n)≤m)→0 means that the last term on the right-hand side of (21.4.8) is o(m). By virtue of (21.4.8), since ε>0 is arbitrary, we obtain that
This together with (21.4.7) proves (21.4.6).
Now we will calculate the value of E θ(n) using another approach. The variable θ(n) admits the representation
where N(j,n) is the number of visits of the trajectory of {X k } to the state j during the first n steps. Since \(\{ \zeta_{k}^{(j)} \}^{\infty}_{k=1}\) and N(j,n) are independent for each j, we have
Because p 1j (k)→π j as k→∞, one has
Moreover,
and, therefore,
Hence
and in the case when ∑ j a j π j <∞, the series ∑ j a j n −1 E N(j,n) converges uniformly in n. Consequently, the following limit exists
Comparing this with (21.4.6) we obtain (21.4.4). If E θ=∞ then clearly E θ(n)=∞ and ∑ j a j π j =∞, and vice versa, if ∑ j a j π j =∞ then E θ=∞.
Consider now the random walk {Θ k }. To the k-th cycle there correspond T k transitions. Therefore, by the total probability formula,
where \(\zeta^{(1)}_{T_{k+1}}\) is independent of Θ k and distributed as ζ (1) (see Lemma 11.2.1 or the strong Markov property). Therefore, denoting by \(H_{\theta} (u) := \sum^{\infty}_{k=1} \mathbf{P}(\varTheta_{k} < u)\) the renewal function for the sequence {Θ k }, we obtain for the non-arithmetic case (NA), by virtue of the renewal theorem (see Theorem 10.4.1 and (10.4.2)), that, as t→∞,
We have proved assertion (21.4.2) for i=1 and initial conditions (21.4.3). The transition to arbitrary initial conditions is quite obvious and is done in exactly the same way as in the proof of the ergodic theorems of Chap. 13.
If ∑a i π i =∞ then, as we have already observed, E θ=∞ and, by the renewal theorem and (21.4.9), one has P(ξ(t)=1, χ(t)>v)→0 as t→∞. It remains to note that instead of i=1 we can fix any other value of i. The theorem is proved. □
In the same way we could also prove that
(see Theorem 10.4.3).
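The limiting occupation probabilities \(p_j=a_j\pi_j/\sum_i a_i\pi_i\) weight the embedded-chain frequencies by the mean holding times. A small simulation sketch (illustrative, not from the text; the two alternating states and their holding distributions are assumptions) makes this visible:

```python
import random

rng = random.Random(1)
# embedded chain alternates 0 -> 1 -> 0 (pi_0 = pi_1 = 1/2); holding times:
# Uniform(0, 2) in state 0 (a_0 = 1) and the constant 3 in state 1 (a_1 = 3)
n = 20000
time0 = sum(rng.uniform(0.0, 2.0) for _ in range(n))
time1 = 3.0 * n
frac0 = time0 / (time0 + time1)
# limiting occupation of state 0: a_0 pi_0 / (a_0 pi_0 + a_1 pi_1) = 1/4
assert abs(frac0 - 0.25) < 0.02
```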
21.4.3 Semi-Markov Processes on Chain Transitions
Along with the semi-Markov processes ξ(t) described at the beginning of the present section, one sometimes considers semi-Markov processes “given on the transitions” of the chain {X n }. In that case, the distributions F ij of random variables ζ (ij)>0 are given and, similarly to (21.4.1), for the initial condition (X 0,X 1,ζ 0) one puts
and so on. Although at first glance this is a very general model, it can be completely reduced to the semi-Markov processes (21.4.1). To that end, one has to notice that the “two-dimensional” sequence Y n =(X n ,X n+1), n=0,1,… , also forms a Markov chain. Its transition probabilities have the form
so that if the chain {X n } is ergodic, then {Y n } is also ergodic and
This enables one to restate Theorem 21.4.1 easily for the semi-Markov processes (21.4.10) given on the transitions of the Markov chain {X n }, since the process (21.4.10) will be an ordinary semi-Markov process given on the chain {Y n }.
Corollary 21.4.1
If the chain {X n } is ergodic then, in the non-arithmetic case,
In the arithmetic case v must be a multiple of the lattice span.
We will make one more remark which could be helpful when studying semi-Markov processes and which concerns the so-called semi-Markov renewal functions H ij (t). Denote by T ij (n) the epoch (in the “real time”) of the n-th jump of the process ξ(t) from state i to j. Put
If ν ij (t) is the number of jumps from state i to j during the time interval [0,t), then clearly H ij (t)=E ν ij (t).
Set Δf(t):=f(t+Δ)−f(t), Δ>0.
Corollary 21.4.2
In the non-arithmetic case,
In the arithmetic case v must be a multiple of the lattice span.
Proof
Denote by \(\nu^{(k)}_{i j} (u)\) the number of transitions of the process ξ(t) from i to j during the time interval (0,u) given the initial condition (k,0). Then, by the total probability formula,
Since \(\nu^{(k)}_{i j} (u) \leq\nu^{(i)}_{i j} (u)\), by Theorem 21.4.1 one has
Further,
as Δ→0, and
It follows from the aforesaid that
Therefore,
Further, from the equality
we obtain that h ij (2Δ)=2h ij (Δ), which means that h ij (Δ) is linear. Together with (21.4.13) this proves (21.4.11). The corollary is proved. □
The class of processes for which ergodicity can be established by the methods used for semi-Markov processes (and in Chap. 13) can be somewhat extended. For this broader class of processes we will prove, in the next section, the ergodic theorem, together with the laws of large numbers and the central limit theorem for integrals of such processes.
21.5 Regenerative Processes
21.5.1 Regenerative Processes. The Ergodic Theorem
Let X(t) and X 0(t), t≥0, be processes given in the space D(0,∞) of functions without discontinuities of the second kind (the state space of these processes could be any metric space, not necessarily the real line). The process X(t) is said to be regenerative if it possesses the following properties:
(1) There exists a state x 0 which is visited by the process X with probability 1. After each such visit, the evolution of the process starts anew as if it were the original process X(t) starting at the state X(0)=x 0. We will denote this new process by X 0(t) where X 0(0)=x 0. To state this property more precisely, we introduce the time τ 0 of the first visit to x 0 by X:
However, it is not clear from this definition whether τ 0 is a random variable. For definiteness, assume that the process X is such that for τ 0 one has
where S is a countable set everywhere dense in [0,t]. In that case the set {τ 0>t} is clearly an event and τ 0 is a random variable. The above stated property means that τ 0 is a proper random variable: P(τ 0<∞)=1, and that the distribution of X(τ 0+u), u≥0, coincides with that of X 0(u), u≥0, whatever the “history” of the process X(t), t≤τ 0.
(2) The recurrence time τ of the state x 0 has finite expectation E τ<∞, τ:=inf{t:X 0(t)=x 0}.
The aforesaid means that the evolution of the process splits into independent identically distributed cycles between its consecutive visits to the state x 0. The visit epochs are called regeneration times. The behaviour of the process inside the cycles may be arbitrary; no further conditions (in particular, no Markov property) are imposed.
We introduce the so-called "taboo probability"
$$P(t,B):=\mathbf{P}\bigl(X_{0}(t)\in B,\ \tau>t\bigr).$$
We will assume that, as a function of t, P(t,B) is measurable and Riemann integrable.
Theorem 21.5.1
Let X(t) be a regenerative process and the random variable τ be non-lattice. Then, for any Borel set B, as t→∞,

$$\mathbf{P}\bigl(X(t)\in B\bigr)\to\frac{1}{\mathbf{E}\,\tau}\int_{0}^{\infty}P(u,B)\,du. $$
If τ is a lattice variable (which is the case for processes X(t) in discrete time), the assertion holds true with the following obvious changes: t→∞ along the lattice and the integral is replaced with a sum.
Proof
Let T 0:=0, T k :=τ 1+⋯+τ k be the epoch of the k-th regeneration of the process X 0(t), and
(\(\tau_{k}\stackrel{d}{=}\tau\) are independent). Then, using the total probability formula and the key renewal theorem, we obtain, as t→∞,
For the process X(t) one gets
The theorem is proved. □
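As a numerical illustration (not from the text), take the simplest regenerative process: the age A(t)=t−T of a renewal process, regenerating at the renewal epochs (x 0=0). Inside a cycle A(u)=u deterministically, so the taboo probability is P(u,[0,x])=P(τ>u) for u≤x, and the theorem gives P(A(t)≤x)→(1/E τ)∫ 0 x P(τ>u) du. The sketch below (all parameters chosen for illustration) uses τ∼U(0,2), for which E τ=1 and the limit at x=1 equals 1−1/4=0.75:

```python
import random

# Monte Carlo check of the ergodic theorem for the age process
# A(t) = t - T_{nu(t)} of a renewal process with tau ~ U(0, 2).
# Within a cycle A(u) = u, so P(u, [0, x]) = P(tau > u) for u <= x,
# and the limit of P(A(t) <= x) is int_0^x (1 - u/2) du = x - x^2/4.

random.seed(1)

def age_at(t):
    """Age t - T_{nu(t)} of a U(0,2)-renewal process at time t."""
    s = 0.0
    while True:
        tau = random.uniform(0.0, 2.0)
        if s + tau > t:
            return t - s
        s += tau

t, x, n = 50.0, 1.0, 5000
hits = sum(age_at(t) <= x for _ in range(n))
print(hits / n)  # should be close to x - x**2/4 = 0.75
```

With n=5000 independent runs the standard error is about 0.006, so the empirical frequency should agree with the theoretical limit to two decimal places.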
21.5.2 The Laws of Large Numbers and Central Limit Theorem for Integrals of Regenerative Processes
Consider a measurable mapping \(f: \mathcal{X}\to\mathbb{R}\) of the state space \(\mathcal{X}\) of a process X(t) to the real line \(\mathbb{R}\). As in Sect. 21.4.2, for the sake of simplicity, we can assume that \(\mathcal{X}=\mathbb{R}\) and the trajectories of X(t) lie in the space D(0,∞) of functions without discontinuities of the second kind. In this case the paths f(X(u)), u≥0, will be measurable functions, for which the integral

$$S(t):=\int_{0}^{t}f\bigl(X(u)\bigr)\,du $$
is well defined. For such integrals we have the following law of large numbers. Set

$$\zeta:=\int_{0}^{\tau}f\bigl(X_{0}(u)\bigr)\,du,\qquad a:=\mathbf{E}\,\tau. $$
Theorem 21.5.2
Let the conditions of Theorem 21.5.1 be satisfied and let there exist \(a_{\zeta}:=\mathbf{E}\,\zeta\). Then, as t→∞,

$$\frac{S(t)}{t}\stackrel{p}{\to}\frac{a_{\zeta}}{a}. $$
For conditions of existence of E ζ, see Theorem 21.5.4 below.
Proof
The proof of the theorem largely repeats that of the similar assertion (Theorem 13.8.1) for sums of random variables defined on a Markov chain. Divide the domain u≥0 into the half-intervals

$$[T_{k-1},T_{k}),\qquad k=1,2,\ldots, $$
where T k are the epochs of hitting the state x 0 by the process X(t), and τ k =T k −T k−1 for k≥1 are independent and distributed as τ. Then the random variables

$$\zeta_{k}:=\int_{T_{k-1}}^{T_{k}}f\bigl(X(u)\bigr)\,du $$
are independent, distributed as ζ, and have finite expectation \(a_{\zeta}\). The integral S(t) can be represented as

$$S(t)=z_{0}+S_{\nu(t)}+z_{t}, $$

where

$$z_{0}:=\int_{0}^{T_{0}}f\bigl(X(u)\bigr)\,du,\qquad S_{\nu(t)}:=\sum_{k=1}^{\nu(t)}\zeta_{k},\qquad z_{t}:=\int_{T_{\nu(t)}}^{t}f\bigl(X(u)\bigr)\,du,\qquad \nu(t):=\max\{k: T_{k}\le t\}. $$
Since τ 0 is a proper random variable, z 0 is a proper random variable as well, and hence \({z_{0}}/{t}\stackrel{\mathit{a}.\mathit {s}.}{\longrightarrow}0\) as t→∞. Further,
where γ(t)=t−T ν(t) has a proper limiting distribution as t→∞ (see Chap. 10), so \({z_{t}}/{t}\stackrel{p}{\to}0\) as t→∞. The sum \(S_{\nu(t)}=\sum_{k=1}^{\nu(t)}\zeta_{k}\) is nothing else but the generalised renewal process studied in Chaps. 10 and 11. By Theorem 11.5.2, as t→∞,

$$\frac{S_{\nu(t)}}{t}\stackrel{p}{\to}\frac{a_{\zeta}}{a}. $$
The theorem is proved. □
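The convergence S(t)/t → a ζ /a is easy to observe numerically. The following sketch (an illustration with assumed parameters, not from the book) again uses the age process of a U(0,2)-renewal process with f(x)=x: over a full cycle of length τ the integral of the age equals τ 2/2, so a=E τ=1 and a ζ =E τ 2/2=2/3:

```python
import random

# LLN check for S(t) = int_0^t f(X(u)) du with f(x) = x and X the
# age process of a U(0,2)-renewal process.  Over a full cycle of
# length tau the integral of the age equals tau^2 / 2, hence
# a = E tau = 1,  a_zeta = E tau^2 / 2 = 2/3,  and S(t)/t -> 2/3.
# The process starts at a regeneration epoch, so z_0 = 0 here.

random.seed(2)
t_max = 20000.0
s_int = 0.0            # accumulated integral S(t)
t = 0.0
while True:
    tau = random.uniform(0.0, 2.0)
    if t + tau > t_max:
        s_int += (t_max - t) ** 2 / 2   # incomplete last cycle z_t
        break
    s_int += tau ** 2 / 2               # complete cycle contribution
    t += tau
print(s_int / t_max)   # close to a_zeta / a = 2/3
```

With t=20000 (about 20000 cycles) the fluctuation of S(t)/t is of order 0.002, so the printed value should match 2/3 to two decimals.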
In order to prove the strong law of large numbers we need a somewhat more restrictive condition than that in Theorem 21.5.2. Put

$$\zeta^{*}:=\int_{0}^{\tau}\bigl|f\bigl(X_{0}(u)\bigr)\bigr|\,du. $$
Theorem 21.5.3
Let the conditions of Theorem 21.5.1 be satisfied and E ζ ∗<∞. Then

$$\frac{S(t)}{t}\stackrel{\mathit{a}.\mathit{s}.}{\longrightarrow}\frac{a_{\zeta}}{a}\quad\mbox{as }t\to\infty. $$
The proof
essentially repeats (as was the case for Theorem 21.5.2) that of the law of large numbers for sums of random variables defined on a Markov chain (see Theorem 13.8.3). One only needs to use, instead of (13.8.18), the relation
and the fact that \(\mathbf{E}\,\zeta_{k}^{*}<\infty\). The theorem is proved. □
An analogue of Theorem 13.8.2, which elucidates the conditions for the existence of E ζ ∗ and E ζ, is the following.
Theorem 21.5.4
(Generalisation of Wald’s identity)
Let the conditions of Theorem 21.5.1 be met and there exist

$$\mathbf{E}\,\tau\quad\mbox{and}\quad\mathbf{E}\,\bigl|f\bigl(X(\infty)\bigr)\bigr|, $$

where X(∞) is a random variable with the stationary distribution π. Then there exist

$$\mathbf{E}\,\zeta^{*}=\mathbf{E}\,\tau\cdot\mathbf{E}\,\bigl|f\bigl(X(\infty)\bigr)\bigr|\quad\mbox{and}\quad \mathbf{E}\,\zeta=\mathbf{E}\,\tau\cdot\mathbf{E}\,f\bigl(X(\infty)\bigr). $$
The proof of Theorem 21.5.4
repeats, with obvious changes, that of Theorem 13.8.2. □
Theorem 21.5.5
(The central limit theorem)
Let the conditions of Theorem 21.5.1 be met and E τ 2<∞, E ζ 2<∞. Then, as t→∞, the distribution of

$$\frac{S(t)-rt}{d\sqrt{t/a}} $$

converges to the standard normal law,
where r=a ζ /a, d 2=D(ζ−rτ).
The proof, as in the case of Theorems 21.5.2–21.5.4, repeats, up to evident changes, that of Theorem 13.8.4. □
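The normalisation in the theorem can be illustrated by simulation (a sketch with assumed parameters, not from the book). For the age process of a Poisson(1) renewal process with f(x)=x one has ζ=τ 2/2 with τ∼Exp(1), whence a=1, r=a ζ /a=1 and d 2=D(ζ−rτ)=E(τ 2/2−τ)2=6−6+2=2; the normalised integral should then be approximately standard normal for large t:

```python
import math
import random

# CLT check: X is the age process of a Poisson(1) renewal process,
# f(x) = x, so over a cycle zeta = tau^2/2 with tau ~ Exp(1).
# Then a = 1, r = a_zeta/a = 1 and d^2 = Var(zeta - r*tau) = 2.
# Each replica starts at a regeneration epoch (z_0 = 0).

random.seed(3)
t_max, m = 200.0, 3000
a, r, d = 1.0, 1.0, math.sqrt(2.0)

def normalised_s():
    t, s = 0.0, 0.0
    while True:
        tau = random.expovariate(1.0)
        if t + tau > t_max:
            s += (t_max - t) ** 2 / 2   # incomplete last cycle
            break
        s += tau ** 2 / 2
        t += tau
    return (s - r * t_max) / (d * math.sqrt(t_max / a))

z = [normalised_s() for _ in range(m)]
mean = sum(z) / m
std = math.sqrt(sum((v - mean) ** 2 for v in z) / (m - 1))
print(round(mean, 2), round(std, 2))  # approx 0 and 1
```

The empirical mean and standard deviation of the 3000 normalised values should be close to 0 and 1; the residual bias of order 1/√t is visible but small at t=200.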
Here an analogue of Theorem 13.8.5 (on the conditions of existence of variance and on an identity for a −1 d 2) looks more complicated than under the conditions of Sect. 13.8 and is omitted.
21.6 Diffusion Processes
Now we will consider an important class of Markov processes with continuous trajectories.
Definition 21.6.1
A homogeneous Markov process ξ(t) with state space \(\langle\mathbb{R},\mathfrak{B}\rangle\) and the transition function P(t,x,B) is said to be a diffusion process if, for some finite functions a(x) and b 2(x)>0,
(1) \(\lim_{\varDelta\to0} \frac{1}{ \varDelta} \int(y-x) P(\varDelta ,x,dy) =a(x)\),
(2) \(\lim_{\varDelta\to0} \frac{1}{ \varDelta} \int(y-x)^{2} P (\varDelta,x,dy) =b^{2} (x)\),
(3) for some δ>0 and c<∞,
$$\int|y-x|^{2+\delta} P (\varDelta,x,dy)< c\varDelta^{1+\delta/2}. $$
Put Δξ(t):=ξ(t+Δ)−ξ(t). Then the above conditions can be written in the form:

$$\mathbf{E}\bigl[\varDelta\xi(t)\,\big|\,\xi(t)=x\bigr]=a(x)\varDelta+o(\varDelta),\qquad \mathbf{E}\bigl[\bigl(\varDelta\xi(t)\bigr)^{2}\,\big|\,\xi(t)=x\bigr]=b^{2}(x)\varDelta+o(\varDelta), $$

$$\mathbf{E}\bigl[\bigl|\varDelta\xi(t)\bigr|^{2+\delta}\,\big|\,\xi(t)=x\bigr]< c\varDelta^{1+\delta/2}. $$
The coefficients a(x) and b(x) are called the shift and diffusion coefficients, respectively. Condition (3) is an analogue of the Lyapunov condition. It could be replaced with a Lindeberg type condition:
(3a) \(\mathbf{E}\bigl[(\varDelta\xi(t))^{2};\ |\varDelta\xi(t)|>\varepsilon\bigr]=o(\varDelta)\) for any ε>0 as Δ→0.
It follows immediately from condition (3) and the Kolmogorov theorem that a diffusion process ξ(t) can be thought of as a process with continuous trajectories.
The standard Wiener process w(t) is a diffusion process, since in that case

$$\mathbf{E}\,\varDelta w(t)=0,\qquad \mathbf{E}\bigl(\varDelta w(t)\bigr)^{2}=\varDelta,\qquad \mathbf{E}\bigl(\varDelta w(t)\bigr)^{4}=3\varDelta^{2}. $$
Therefore the Wiener process has zero shift and a constant diffusion coefficient. Clearly, the process w(t)+at will have shift a and the same diffusion coefficient.
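Conditions (1)–(2) for w(t)+at can be checked empirically by averaging increments over a small Δ. A minimal sketch (parameters chosen for illustration) estimates the shift and diffusion coefficients from simulated increments:

```python
import math
import random

# Empirical check of conditions (1)-(2) for xi(t) = w(t) + a*t:
# over a small step Delta, E[Delta xi] ~ a*Delta and
# E[(Delta xi)^2] ~ Delta, i.e. shift a and diffusion coefficient 1.

random.seed(4)
a, delta, n = 1.0, 0.01, 200000
incs = [a * delta + math.sqrt(delta) * random.gauss(0.0, 1.0)
        for _ in range(n)]
shift = sum(incs) / (n * delta)                 # estimates a(x) = a
diff2 = sum(v * v for v in incs) / (n * delta)  # estimates b^2(x) = 1
print(round(shift, 1), round(diff2, 1))  # approx a = 1.0 and b^2 = 1.0
```

Note that the second-moment estimate contains the O(Δ) correction a 2Δ, which vanishes as Δ→0, in accordance with the definition.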
We saw in Sect. 21.2 that the “local” characteristic Q of a Markov process ξ(t) with a discrete state space \(\mathcal{X}\) specifies uniquely the evolution law of the process. A similar situation takes place for diffusion processes: the distribution of the process is determined uniquely by the coefficients a(x) and b(x). The way to establish this fact again goes via the Chapman–Kolmogorov equation.
Theorem 21.6.1
If the transition probability P(t;x,B) of a diffusion process is twice continuously differentiable with respect to x, then P(t;x,B) is differentiable with respect to t and satisfies the equation

$$\frac{\partial P}{\partial t}=a(x)\frac{\partial P}{\partial x}+\frac{b^{2}(x)}{2}\,\frac{\partial^{2} P}{\partial x^{2}} \qquad(21.6.1) $$
with the initial condition

$$P(0;x,B)=\mathbf{1}(x\in B). \qquad(21.6.2) $$
Remark 21.6.1
The conditions of the theorem on smoothness of the transition function P can actually be proved under the assumption that a and b are continuous, b≥b 0>0, |a|≤c(|x|+1) and b 2≤c(|x|+1).
Proof of Theorem 21.6.1
For brevity’s sake denote by \(P'_{t}\), \(P'_{x}\), and \(P''_{x}\) the partial derivatives \(\frac{\partial P}{\partial t}\), \(\frac{\partial P}{\partial x}\) and \(\frac{\partial^{2} P}{\partial x^{2} }\), respectively, and make use of the relation

$$P(t;y,B)=P(t;x,B)+(y-x)P'_{x}+\frac{(y-x)^{2}}{2}\,P''_{x}+o\bigl((y-x)^{2}\bigr)\quad\mbox{as }y\to x. \qquad(21.6.3) $$
Then by the Chapman–Kolmogorov equation
where
The first integral, by virtue of the continuity of \(P''_{x}\), does not exceed
where δ(ε)→0 as ε→0; the second integral is o(Δ) by condition (3a). Since ε is arbitrary, one has R=o(Δ) and it follows from the above that
This proves (21.6.1). The theorem is proved. □
It is known from the theory of differential equations that, under wide assumptions about the coefficients a and b and for B=(−∞,z), the Cauchy problem (21.6.1)–(21.6.2) has a unique solution P which is infinitely many times differentiable with respect to t, x and z. From this it follows that P(t;x,B) has a density p(t;x,z) which is the fundamental solution of (21.6.1).
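For the Wiener process (a=0, b=1) Eq. (21.6.1) reduces to the heat equation ∂P/∂t=½ ∂ 2 P/∂x 2, whose solution for B=(−∞,z) is P(t;x,B)=Φ((z−x)/√t). The following sketch (illustrative grid parameters, explicit finite-difference scheme; not from the book) checks this numerically:

```python
import math

# Finite-difference check of the backward equation for the Wiener
# process (a = 0, b = 1): dP/dt = (1/2) P_xx with the exact solution
# P(t; x, (-inf, z)) = Phi((z - x)/sqrt(t)).  Grid: x = -5 + 0.05*i.

dx, dt, t_end = 0.05, 0.001, 1.0
n = 201                      # grid points x = -5, -4.95, ..., 5
iz = 110                     # grid index of x = z = 0.5
# initial condition (21.6.2): P(0; x) = 1 for x < z, 0 for x > z
p = [1.0 if i < iz else (0.5 if i == iz else 0.0) for i in range(n)]

r = 0.5 * dt / dx ** 2       # explicit scheme; stable for r <= 1/2
for _ in range(int(t_end / dt)):
    q = p[:]
    for i in range(1, n - 1):
        q[i] = p[i] + r * (p[i - 1] - 2.0 * p[i] + p[i + 1])
    p = q                    # boundary values stay at 1 and 0

# exact solution at x = 0 (grid index 100): Phi(0.5/sqrt(t_end))
exact = 0.5 * (1.0 + math.erf(0.5 / math.sqrt(2.0 * t_end)))
print(round(p[100], 3), round(exact, 3))  # both near 0.691
```

The discontinuous initial condition is smoothed out immediately, and at t=1 the scheme reproduces the Gaussian distribution function to about three decimal places.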
It is also not difficult to derive from Theorem 21.6.1 that, along with P(t;x,B), the function

$$\mathbf{E}\,g\bigl(\xi^{(x)}(t)\bigr)=\int g(z)\,p(t;x,z)\,dz $$
will also satisfy Eq. (21.6.1) for any smooth function g with a compact support, ξ (x)(t) being the diffusion process with the initial value ξ (x)(0)=x.
In the proof of Theorem 21.6.1 we considered (see (21.6.4)) the time increment Δ preceding the main time interval. In this connection Eqs. (21.6.1) are called backward Kolmogorov equations. Forward equations can be derived in a similar way.
Theorem 21.6.2
(Forward Kolmogorov equations)
Let the transition density p(t;x,y) be such that the derivatives

$$\frac{\partial}{\partial y}\bigl(a(y)\,p(t;x,y)\bigr)\quad\mbox{and}\quad \frac{\partial^{2}}{\partial y^{2}}\bigl(b^{2}(y)\,p(t;x,y)\bigr) $$

exist and are continuous. Then p(t;x,y) satisfies the equation

$$\frac{\partial p}{\partial t}=-\frac{\partial}{\partial y}\bigl(a(y)\,p\bigr)+\frac{1}{2}\,\frac{\partial^{2}}{\partial y^{2}}\bigl(b^{2}(y)\,p\bigr). \qquad(21.6.5) $$
Proof
Let g(y) be a smooth function with a bounded support,
Then
Expanding the difference g(y)−g(z) into a series, we obtain in the same way as in the proof of Theorem 21.4.1 that, by virtue of properties (1)–(3), the expression in the brackets is
This implies that there exists the derivative
Integrating by parts we get
or, which is the same,
Since g is arbitrary, (21.6.5) follows. The theorem is proved. □
As in the case of discrete \(\mathcal{X}\), the difference between the forward and backward Kolmogorov equations becomes more transparent for non-homogeneous diffusion processes, when the transition probabilities P(s,x;t,B) depend on two time variables, while a and b in conditions (1)–(3) are functions of s and x. Then the backward Kolmogorov equation (for densities) will relate the derivatives of the transition densities p(s,x;t,y) with respect to the first two variables, while the forward equation will hold for the derivatives with respect to the last two variables.
We return to homogeneous diffusion processes. One can study conditions ensuring the existence of the limiting stationary distribution of ξ (x)(t) as t→∞ which is independent of x using the same approach as in Sect. 21.2. Theorem 21.2.3 will remain valid (one simply has to replace i 0 in it with x 0, in agreement with the notation of the present section). The proof of Theorem 21.2.3 also remains valid, but will need a somewhat more precise argument (in the new situation, on the event B dv one has ξ(t−v)∈dx 0 instead of ξ(t−v)=x 0).
If the stationary distribution density

$$p(y):=\lim_{t\to\infty}p(t;x,y) \qquad(21.6.7) $$
exists, how could one find it? Since the dependence of p(t;x,y) on t and x vanishes as t→∞, the backward Kolmogorov equations turn into the identity 0=0 as t→∞. Turning to the forward equations and passing in (21.6.6) to the limit first as t→∞ and then as Δ→0, we arrive, using the same argument as in the proof of Theorem 21.2.3, at the following conclusion.
Corollary 21.6.1
If (21.6.7) and the conditions of Theorem 21.6.2 hold, then the stationary density p(y) satisfies the equation

$$-\frac{d}{dy}\bigl(a(y)\,p(y)\bigr)+\frac{1}{2}\,\frac{d^{2}}{dy^{2}}\bigl(b^{2}(y)\,p(y)\bigr)=0 $$
(which is obtained from (21.6.5) if we put \(\frac {\partial p}{\partial t} =0 \)).
Example 21.6.1
The Ornstein–Uhlenbeck process

$$\xi(t)=e^{at}\biggl(\xi(0)+\sigma\int_{0}^{t}e^{-au}\,dw(u)\biggr), $$
where w(u) is the standard Wiener process, is a homogeneous diffusion process with the transition density

$$p(t;x,y)=\frac{1}{\sqrt{2\pi}\,\sigma_{t}}\exp\biggl\{-\frac{(y-xe^{at})^{2}}{2\sigma_{t}^{2}}\biggr\},\qquad \sigma_{t}^{2}:=\sigma^{2}\,\frac{e^{2at}-1}{2a}. \qquad(21.6.8) $$
We leave it to the reader to verify that this process has coefficients a(x)=ax, b(x)=σ=const, and that function (21.6.8) satisfies the forward and backward equations. For a<0, there exists a stationary process (the definition is given in the next chapter)

$$\xi(t)=\sigma\int_{-\infty}^{t}e^{a(t-u)}\,dw(u), $$
of which the density (which does not depend on t) is equal to

$$p(y)=\sqrt{\frac{|a|}{\pi\sigma^{2}}}\,\exp\biggl\{-\frac{|a|y^{2}}{\sigma^{2}}\biggr\}. $$
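The stationary distribution is normal with mean 0 and variance σ 2/(2|a|), which can be checked by simulation. The sketch below (illustrative parameters a=−1, σ=1, so the stationary variance is 0.5) samples the process using its exact one-step transition, i.e. the transition density of the example with t=h:

```python
import math
import random

# Stationary variance check for the Ornstein-Uhlenbeck process with
# a = -1, sigma = 1: stationary variance sigma^2 / (2|a|) = 0.5.
# One step of length h uses the exact transition:
# xi(t+h) = xi(t) e^{ah} + normal(0, sigma^2 (e^{2ah} - 1)/(2a)).

random.seed(5)
a, sigma, h, n = -1.0, 1.0, 0.05, 200000
rho = math.exp(a * h)
noise_sd = math.sqrt(sigma ** 2 * (math.exp(2 * a * h) - 1) / (2 * a))

x, burn, acc = 0.0, 2000, 0.0
for i in range(n + burn):
    x = rho * x + noise_sd * random.gauss(0.0, 1.0)
    if i >= burn:          # discard burn-in before averaging
        acc += x * x
print(round(acc / n, 2))   # approx sigma^2 / (2|a|) = 0.5
```

The time average of ξ 2 converges to the stationary second moment by ergodicity; the burn-in removes the dependence on the starting point.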
To conclude this section, we consider a problem important for various applications: finding the probability that the trajectory of a diffusion process stays within a given strip. For simplicity’s sake we confine ourselves to considering this problem for the Wiener process. Let c>0 and d<0.
Put

$$U(t;x,B):=\mathbf{P}\bigl(x+w(t)\in B;\ d<x+w(u)<c\ \mbox{for all}\ u\le t\bigr),\qquad B\subset(d,c). $$
Leaving out the verification of the fact that the function U is twice continuously differentiable, we will only prove the following proposition.
Theorem 21.6.3
The function U satisfies Eq. (21.6.1) with the initial condition

$$U(0;x,B)=\mathbf{1}(x\in B),\qquad x\in(d,c), \qquad(21.6.9) $$
and boundary conditions

$$U(t;c,B)=U(t;d,B)=0. \qquad(21.6.10) $$
Proof
First of all note that the function U(t;x,B) for x∈(d,c) satisfies conditions (1)–(3) imposed on the transition function P(t;x,B). Indeed, consider, for instance, property (1).
We have to verify that

$$\lim_{\varDelta\to0}\frac{1}{\varDelta}\int(y-x)\,U(\varDelta;x,dy)=a(x) \qquad(21.6.11) $$
(with a(x)=0 in our case). But U(t,x,B)=P(t;x,B)−V(t;x,B), where
and
The first probability in the brackets is given, as we know (see (20.2.1) and Theorem 19.2.2), by the value

$$2\bigl(1-\varPhi\bigl((c-x)/\sqrt{\varDelta}\bigr)\bigr). $$
For any x<c and k>0, it is o(Δ k). The same holds for the second probability. Therefore (21.6.11) is proved. In the same way one can verify properties (2) and (3).
Further, since, by the total probability formula, for x∈(d,c),

$$U(t+\varDelta;x,B)=\int_{d}^{c}U(\varDelta;x,dy)\,U(t;y,B), $$
using an expansion of the form (21.6.3) for the function U, we obtain in the same way as in (21.6.4) that
This implies that \(\frac{\partial U}{ \partial t}\) exists and that Eq. (21.6.1) holds for the function U.
That the boundary and initial conditions are met is obvious. The theorem is proved. □
The reader can verify that the function u(t;x,z), playing the role of the fundamental solution to the boundary problem (21.6.9)–(21.6.10) (the function u satisfies (21.6.1) with the boundary conditions (21.6.10) and the initial condition degenerating into the δ-function), is equal to

$$u(t;x,z)=\frac{1}{\sqrt{2\pi t}}\sum_{k=-\infty}^{\infty}\bigl[e^{-(z-x-2k(c-d))^{2}/(2t)}-e^{-(z+x-2d-2k(c-d))^{2}/(2t)}\bigr]. $$
This expression can also be obtained directly from probabilistic considerations (see, e.g., [32]).
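The probability of staying inside the strip can indeed be computed both ways. The sketch below (an illustration with assumed parameters) integrates the method-of-images series for the absorbing-boundary density over (d,c) — an assumption of this sketch, consistent with the boundary conditions (21.6.10) — and compares the result with a crude Monte Carlo estimate using a fine time step:

```python
import math
import random

# P(d < x + w(u) < c for all u <= t), computed (i) from the
# method-of-images series integrated in z over (d, c) and
# (ii) by Monte Carlo over discretised Brownian paths.

def phi(v):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def stay_prob(x, d, c, t, kmax=10):
    L = c - d
    s = 0.0
    for k in range(-kmax, kmax + 1):
        m1 = x + 2 * k * L           # direct images
        m2 = 2 * d - x + 2 * k * L   # reflected images
        s += (phi((c - m1) / math.sqrt(t)) - phi((d - m1) / math.sqrt(t))
              - phi((c - m2) / math.sqrt(t)) + phi((d - m2) / math.sqrt(t)))
    return s

random.seed(6)
d, c, x0, t = -1.0, 1.0, 0.0, 1.0
exact = stay_prob(x0, d, c, t)

n_paths, n_steps = 2000, 2000
sd = math.sqrt(t / n_steps)
survived = 0
for _ in range(n_paths):
    x = x0
    for _ in range(n_steps):
        x += sd * random.gauss(0.0, 1.0)
        if x <= d or x >= c:
            break
    else:
        survived += 1
print(round(exact, 3), round(survived / n_paths, 3))  # both approx 0.37
```

The Monte Carlo estimate slightly overstates the survival probability, since a discretised path can jump over the boundary between grid points; the bias shrinks as the step is refined.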
References
Skorokhod, A.V.: Random Processes with Independent Increments. Kluwer Academic, Dordrecht (1991)
© 2013 Springer-Verlag London
Borovkov, A.A. (2013). Markov Processes. In: Probability Theory. Universitext. Springer, London. https://doi.org/10.1007/978-1-4471-5201-9_21