
In the early twentieth century, Markov (1856–1922) introduced in [67] a new class of models called Markov chains: sequences of dependent random variables that make it possible to capture dependencies over time. Since then, the theory of Markov chains has developed significantly, as reflected in the achievements of Kolmogorov, Feller, Doob, Dynkin, and many others. The significance of the extensive theory of Markov chains and of its continuous-time variant, Markov processes, is that it can be successfully applied to modeling the behavior of many systems in, for example, physics, biology, and economics, where the outcome of one experiment can affect the outcome of subsequent experiments. The terminology is not consistent in the literature, and many authors use the same name (Markov chain) for both the discrete and the continuous case. We also apply this terminology.

Heuristically, the property that characterizes Markov chains can be expressed by the so-called memoryless notion (Markov property) as follows: a Markov chain is a stochastic process for which future behavior, given the past and the present, depends only on the present and not on the past.

This chapter presents a brief introduction to the theory of discrete-time Markov chains (DTMCs) and to their continuous-time variant, continuous-time Markov chains (CTMCs), which will be applied to the modeling and analysis of queueing systems. Note that DTMCs and CTMCs taking values in a countable set have many similar properties; however, the characteristics of the sample paths in the continuous-time case differ essentially from those in the discrete-time case.

We limit ourselves here to the definition of Markov processes with countable state space and to their basic properties, in discrete time \(\mathcal{T} =\{ 0,1,\ldots \}\) and in continuous time \(\mathcal{T} = [0,\infty )\). In connection with the classical results discussed in this chapter, we refer mainly to the works [35, 36].

Consider a discrete-time or continuous-time stochastic process \(X = ({X}_{t},\ t \in \mathcal{T} )\) given on a probability space \((\Omega ,\mathcal{A},P)\) and taking values in a countable set, called the state space, \(\mathcal{X} =\{ {x}_{0},{x}_{1},\ldots \}\). The state space \(\mathcal{X}\) is called finite if it consists of a finite number of elements. The sample path of a discrete-time process with discrete sample space is defined in the space of sequences \(\mathcal{S} =\{ {x}_{{k}_{0}},{x}_{{k}_{1}},\ldots \}\), \({x}_{{k}_{i}} \in \mathcal{X}\), while it is an element of the space of all functions \(\mathcal{S} =\{ {x}_{t} :\ {x}_{t} \in \mathcal{X},\ t \geq 0\}\) in continuous-time cases.

We say that the process is in the state \(x \in \mathcal{X}\) at the time \(t \in \mathcal{T}\) if \({X}_{t} = x\). The process starts from a state \({x}_{0} \in \mathcal{X}\) determined by the distribution of the random variable \({X}_{0}\), which is the initial distribution of the process. If there exists a state \({x}_{0} \in \mathcal{X}\) for which \(\mathbf{P}\left ({X}_{0} = {x}_{0}\right ) = 1\), then the state \({x}_{0}\) is called the initial state. The state of the process can change from time to time, and these changes in state are known as transitions. The probabilities of these state changes are called transition probabilities, which, together with the initial distribution, determine the statistical behavior of the process.

If we denote by \({\mathcal{B}}_{X}\) the σ-algebra of all subsets of the state space \(\mathcal{X}\), then the pair \((\mathcal{X},{\mathcal{B}}_{X})\) is a measurable space and the relation \(\{{X}_{t} \in A\} \in \mathcal{A}\) holds for all \(t \in \mathcal{T}\) and \(A \in {\mathcal{B}}_{X}\).

Definition 3.1.

A stochastic process \(({X}_{t},t \in \mathcal{T} )\) with the discrete state space \(\mathcal{X}\) is called a Markov chain if for every nonnegative integer n and for all \({t}_{0} < \ldots < {t}_{n} < {t}_{n+1},\)  \({t}_{i} \in \mathcal{T} ,\ \ {x}_{0},\ldots ,{x}_{n+1} \in \mathcal{X}\)

$$\mathbf{P}\left ({X}_{{t}_{n+1}} = {x}_{n+1}\ \vert \ {X}_{{t}_{0}} = {x}_{0},\ldots ,{X}_{{t}_{n}} = {x}_{n}\right ) = \mathbf{P}\left ({X}_{{t}_{n+1}} = {x}_{n+1}\ \vert \ {X}_{{t}_{n}} = {x}_{n}\right ),$$
(3.1)

provided that this conditional probability exists. Let \(\ x,y \in \mathcal{X},\ s \leq t,\ s,t \in \mathcal{T}\); then the function

$${p}_{x,y}(s,t) = \mathbf{P}\left ({X}_{t} = y\ \vert \ {X}_{s} = x\right )$$

is called a transition probability function of a Markov chain. If the equation \({p}_{x,y}(s,t) = {p}_{x,y}(t - s)\) holds for all \(x,y \in \mathcal{X},\ s \leq t,\ s,t \in \mathcal{T}\), then the Markov chain is called (time) homogeneous; otherwise it is known as inhomogeneous.

In both discrete- and continuous-time cases, this definition expresses the aforementioned memoryless property of a Markov chain, and it ensures that the transition probabilities depend only on the present state \({X}_{s}\), not on how the present state was reached. We start with a discussion of DTMCs.

1 Discrete-Time Markov Chains with Discrete State Space

Let \(X = ({X}_{t},\ t \in \mathcal{T} )\), \(\mathcal{T} =\{ 0,1,\ldots \}\), be a Markov chain given on a probability space \((\Omega ,\mathcal{A},P)\) and taking values in a finite or countably infinite set \(\mathcal{X}\). It is conventional to denote the finite state space by the set \(\mathcal{X} =\{ 0,1,\ldots ,K\}\) \((0 < K < \infty )\) and the countably infinite one by \(\mathcal{X} =\{ 0,1,\ldots \}\). This notation is quite natural for queueing systems, and it causes no difficulty as long as the elements of \(\mathcal{X}\) serve only to distinguish the states; otherwise, the state space is chosen based on practical requirements. Assume that the events \(\{{X}_{t} = i\},\ i \in \mathcal{X}\), are disjoint for all \(t \in \mathcal{T}\).

In the discrete-time case we can give an alternative definition of a Markov chain instead of Eq. (3.1).

Definition 3.2.

A discrete-time stochastic process X with state space \(\mathcal{X}\) is called a Markov chain if for every \(n = 0,1,\ldots \) and for all states \({i}_{0},\ldots ,{i}_{n+1} \in \mathcal{X}\)

$$\begin{array}{rcl}{ p}_{{i}_{n},{i}_{n+1}}(n,n + 1)& =& \mathbf{P}\left ({X}_{n+1} = {i}_{n+1}\ \vert \ {X}_{0} = {i}_{0},\ldots ,{X}_{n} = {i}_{n}\right ) \\ & =& \mathbf{P}\left ({X}_{n+1} = {i}_{n+1}\ \vert \ {X}_{n} = {i}_{n}\right ), \end{array}$$
(3.2)

provided that this conditional probability exists. The probability

$${p}_{i,j}(n,n + 1) = \mathbf{P}\left ({X}_{n+1} = j\ \vert \ {X}_{n} = i\right ),\ i,j \in \mathcal{X}\text{ , }n = 0,1,\ldots ,$$

is called a one-step transition probability, which is the probability of a transition from a state i to a state j in a single step from time n to time n + 1.

Relation (3.2) is simpler than Eq. (3.1), but it is easily checked that the two are equivalent. From a practical point of view, we may set the transition probability \({p}_{i,j}(s,t) = 0\) whenever the probability of the event \(\{{X}_{s} = i\}\) equals 0: if \(\mathbf{P}\{{X}_{s} = i\} = 0\) holds, then the sample path arrives at the state i at time s with probability 0, and therefore the quantity \({p}_{i,j}(s,t)\) can be defined freely in this case.

Definition 3.3.

We say that a stochastic process X with state space \(\mathcal{X}\) is a Markov chain of order m (or a Markov chain with memory m) if for every \(n = 1,2,\ldots \) and for arbitrary states \({i}_{k} \in \mathcal{X},\) \(k = 0,\ldots ,n + m\),

$$\begin{array}{rcl} & \mathbf{P}\left ({X}_{n+m} = {i}_{n+m}\ \vert \ {X}_{0} = {i}_{0},\ldots ,{X}_{n+m-1} = {i}_{n+m-1}\right ) & \\ & \quad = \mathbf{P}\left ({X}_{n+m} = {i}_{n+m}\ \vert \ {X}_{n} = {i}_{n},\ldots ,{X}_{n+m-1} = {i}_{n+m-1}\right ),& \\ \end{array}$$

provided that a conditional probability exists.

It is not difficult to verify that an m-order Markov chain can be represented as a first-order one if we introduce a new m-dimensional process as follows. Define the vector-valued process \(Y = ({Y }_{0},{Y }_{1},\ldots ),\)

$${Y }_{n} = ({X}_{n},\ldots ,{X}_{n+m-1}),\ n = 0,1,\ldots ,$$

with state space

$${\mathcal{X}}^{{\prime}} =\{ ({k}_{ 1},\ldots ,{k}_{m}) :\ {k}_{1},\ldots ,{k}_{m} \in \mathcal{X}\}.$$

Then the process Y is a first-order Markov chain because

$$\begin{array}{rcl} & & \mathbf{P}\left ({Y }_{n+1} = ({i}_{n+1},\ldots ,{i}_{n+m})\ \vert \ {Y }_{0} = ({i}_{0},\ldots ,{i}_{m-1}),\ldots ,{Y }_{n} = ({i}_{n},\ldots ,{i}_{n+m-1})\right ) \\ & & \quad = \mathbf{P}\left ({X}_{n+m} = {i}_{n+m},\ldots ,{X}_{n+1} = {i}_{n+1}\ \vert \ {X}_{0} = {i}_{0},\ldots ,{X}_{n+m-1} = {i}_{n+m-1}\right ) \\ & & \quad = \mathbf{P}\left ({X}_{n+m} = {i}_{n+m},\ldots ,{X}_{n+1} = {i}_{n+1}\ \vert \ {X}_{n} = {i}_{n},\ldots ,{X}_{n+m-1} = {i}_{n+m-1}\right ) \\ & & \quad = \mathbf{P}\left ({Y }_{n+1} = ({i}_{n+1},\ldots ,{i}_{n+m})\ \vert \ {Y }_{n} = ({i}_{n},\ldots ,{i}_{n+m-1})\right ).\end{array}$$

This is why we consider only first-order Markov chains and why, later on, we will write only Markov chain instead of Markov chain of first order.
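
To make the construction concrete, the following Python sketch (our own illustration; the two-state second-order chain and all names in it are arbitrary choices, not taken from the text) builds the first-order transition matrix of the pair process Y from the second-order transition probabilities.

```python
import numpy as np

# Illustrative second-order chain on states {0, 1} (all values chosen arbitrarily):
# P(X_{n+2} = j | X_n = i0, X_{n+1} = i1) = p2[i0, i1, j]
p2 = np.array([[[0.7, 0.3], [0.4, 0.6]],
               [[0.2, 0.8], [0.5, 0.5]]])

# First-order chain Y_n = (X_n, X_{n+1}) on the four pairs (i0, i1).
pairs = [(a, b) for a in range(2) for b in range(2)]
P = np.zeros((4, 4))
for r, (i0, i1) in enumerate(pairs):
    for c, (j0, j1) in enumerate(pairs):
        # A transition (i0, i1) -> (j0, j1) is possible only if the pairs overlap: j0 == i1.
        if j0 == i1:
            P[r, c] = p2[i0, i1, j1]

print(P)               # the transition probability matrix of Y
print(P.sum(axis=1))   # every row sums to 1, so P is stochastic
```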

In the theory of DTMCs, the initial distribution

$$P = ({p}_{i},\ i \in \mathcal{X}),\text{ where }{p}_{i} = \mathbf{P}\left ({X}_{0} = i\right ),$$

and the transition probabilities [see Eq. (3.2)]

$${p}_{ij}(n,n + 1),\ i,j \in \mathcal{X},n = 0,1,\ldots ,$$

play a fundamental role because the statistical behavior of a Markov chain is completely determined by them (Theorem 3.4).

The states i and j, which play a role in Definition 3.2, can be identical, which means that the process can remain in the same state at the next time point. We say that a Markov chain X is (time) homogeneous if the transition probabilities do not depend on time shifting, that is,

$${p}_{ij} = \mathbf{P}\left ({X}_{n+1} = j\ \vert \ {X}_{n} = i\right ) = \mathbf{P}\left ({X}_{1} = j\ \vert \ {X}_{0} = i\right ),\ \ i,j \in \mathcal{X},n = 0,1,\ldots .$$

If a Markov chain is not homogeneous, then it is called inhomogeneous.

1.1 Homogeneous Markov Chains

From a practical point of view, the class of homogeneous Markov chains plays a significant role; therefore, in this chapter we will investigate the properties of this class of processes. However, many results for homogeneous cases remain valid in the inhomogeneous case, too.

By definition, for a homogeneous Markov chain the one-step transition probability (or simply transition probability) \({p}_{i,j},\ i,j \in \mathcal{X}\), equals the probability that, starting from the initial state \({X}_{0} = i\) at time 0, the process will be in the state j at the next time point 1, and this probability does not change if we take the transition probability at an arbitrary time \(n = 1,2,\ldots \),

$${p}_{ij} = \mathbf{P}\{{X}_{1} = j\ \vert \ {X}_{0} = i\} = \mathbf{P}\{{X}_{n+1} = j\ \vert \ {X}_{n} = i\}.$$

The transition probabilities satisfy the equation

$$\sum\limits_{j\in \mathcal{X}}{p}_{ij} = 1.$$

This equation expresses the obvious fact that, starting in a state i, at the next time point the process certainly moves to some state \(j \in \mathcal{X}\). The following theorem states that the initial distribution and the transition probabilities determine the finite-dimensional distributions of a homogeneous Markov chain; as a consequence, a Markov chain is specified in a statistical sense by its state space, initial distribution, and transition probabilities.

Theorem 3.4.

The finite-dimensional distributions of a Markov chain X are uniquely determined by the initial distribution and the transition probabilities and

$$\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{n} = {i}_{n}\right ) = {p}_{{i}_{n-1}{i}_{n}}{p}_{{i}_{n-2}{i}_{n-1}} \cdot \ldots \cdot {p}_{{i}_{0}{i}_{1}}{p}_{{i}_{0}}.$$
(3.3)

Proof.

Let n be a positive integer, and let \({i}_{0},\ldots ,{i}_{n} \in \mathcal{X}\). First, assume that \(\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{n} = {i}_{n}\right ) > 0\). By the definition of conditional probability,

$$\begin{array}{rcl} & & \mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{n} = {i}_{n}\right ) \\ & & \quad = \mathbf{P}\left ({X}_{n} = {i}_{n}\ \vert \ {X}_{0} = {i}_{0},\ldots ,{X}_{n-1} = {i}_{n-1}\right )\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{n-1} = {i}_{n-1}\right ) = \ldots \\ & & \quad = \mathbf{P}\left ({X}_{n} = {i}_{n}\ \vert \ {X}_{0} = {i}_{0},\ldots ,{X}_{n-1} = {i}_{n-1}\right ) \\ & & \qquad \times \mathbf{P}\left ({X}_{n-1} = {i}_{n-1}\ \vert \ {X}_{0} = {i}_{0},\ldots ,{X}_{n-2} = {i}_{n-2}\right ) \cdot \ldots \cdot \mathbf{P}\left ({X}_{1} = {i}_{1}\ \vert \ {X}_{0} = {i}_{0}\right ) \\ & & \qquad \qquad \,\mathbf{P}\left ({X}_{0} = {i}_{0}\right ).\end{array}$$

Using the Markov property we can rewrite this formula in the form

$$\begin{array}{rcl} & & \mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{n} = {i}_{n}\right ) \\ & & \quad = \mathbf{P}\left ({X}_{n} = {i}_{n}\ \vert \ {X}_{n-1} = {i}_{n-1}\right ) \cdot \ldots \cdot \mathbf{P}\left ({X}_{1} = {i}_{1}\ \vert \ {X}_{0} = {i}_{0}\right )\mathbf{P}\left ({X}_{0} = {i}_{0}\right ) \\ & & \quad = {p}_{{i}_{n-1}{i}_{n}}{p}_{{i}_{n-2}{i}_{n-1}} \cdot \ldots \cdot {p}_{{i}_{0}{i}_{1}}{p}_{{i}_{0}}.\end{array}$$

If \(\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{n} = {i}_{n}\right ) = 0\), then either \(\mathbf{P}\left ({X}_{0} = {i}_{0}\right ) = {p}_{{i}_{0}} = 0\) or there exists an index m, \(0 \leq m \leq n - 1\), for which

$$\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{m} = {i}_{m}\right ) > 0\text{ and }\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{m+1} = {i}_{m+1}\right ) = 0.$$

Consequently,

$$\begin{array}{rcl} & & \mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{m+1} = {i}_{m+1}\right ) \\ & & \quad = \mathbf{P}\left ({X}_{m+1} = {i}_{m+1}\ \vert \ {X}_{0} = {i}_{0},\ldots ,{X}_{m} = {i}_{m}\right )\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{m} = {i}_{m}\right ) \\ & & \quad = {p}_{{i}_{m}{i}_{m+1}}\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{m} = {i}_{m}\right ), \\ \end{array}$$

and therefore \({p}_{{i}_{m}{i}_{m+1}} = 0\). This means that the product \({p}_{{i}_{n-1}{i}_{n}}{p}_{{i}_{n-2}{i}_{n-1}} \cdot \ldots \cdot {p}_{{i}_{0}{i}_{1}}{p}_{{i}_{0}}\) equals 0 in both cases, and so assertion (3.3) of the theorem is true. □ 

Comment 3.5.

From relation (3.3) it immediately follows that for any \({A}_{i} \subset \mathcal{X},\ 0 \leq i \leq n\), the probability \(\mathbf{P}\left ({X}_{0} \in {A}_{0},\ldots ,{X}_{n} \in {A}_{n}\right )\) can be given in the form

$$\mathbf{P}\left ({X}_{0} \in {A}_{0},\ldots ,{X}_{n} \in {A}_{n}\right ) =\sum\limits_{{i}_{0}\in {A}_{0}}\ldots \sum\limits_{{i}_{n}\in {A}_{n}}\mathbf{P}\left ({X}_{0} = {i}_{0},\ldots ,{X}_{n} = {i}_{n}\right ),$$

where the probabilities are determined by relation  (3.3) , that is, with the help of the initial distribution and the transition probabilities.
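
As a small numerical illustration of Theorem 3.4 and Comment 3.5 (the chain, the matrix, and the path below are arbitrary illustrative choices, not taken from the text), the probability of a finite trajectory can be computed directly as the product in Eq. (3.3):

```python
import numpy as np

# Illustrative three-state homogeneous Markov chain (values chosen arbitrarily).
p0 = np.array([0.5, 0.3, 0.2])           # initial distribution (p_i)
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2],
              [0.3, 0.3, 0.4]])           # one-step transition probabilities (p_ij)

def path_probability(path, p0, P):
    """P(X_0 = i_0, ..., X_n = i_n) = p_{i_0} p_{i_0 i_1} ... p_{i_{n-1} i_n}, Eq. (3.3)."""
    prob = p0[path[0]]
    for i, j in zip(path, path[1:]):
        prob *= P[i, j]
    return prob

print(path_probability([0, 1, 1, 2], p0, P))   # 0.5 * 0.6 * 0.4 * 0.2 = 0.024
```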

The following remark clarifies an essential property of the homogeneous Markov chain, and on that basis limit theorems can be proved. This property relates the behavior of Markov chains to renewal and regenerative processes, which we will discuss later on in Sects. 4.1 and 4.2.

Comment 3.6.

From the memoryless property of a Markov chain X it follows that we can divide the time axis into disjoint parts on which the behavior of the process is mutually independent and follows the same probabilistic rules. We define the limits of these independent parts by the time instants when the process visits the state \({i}_{0} \in \mathcal{X}\) .

Formally, we define the sequence of random time points \(0 \leq {\tau }_{1} < {\tau }_{2} < \ldots \) by the condition \({X}_{{\tau }_{n}} = {i}_{0}\) , n = 1,2,…, and \({X}_{s}\neq {i}_{0}\) if \(s\not\in \{{\tau }_{1},{\tau }_{2},\ldots \}\) . In this way \(0 \leq {\tau }_{1} < {\tau }_{2} < \ldots \) are the times of the first, second, etc. visits to the state \({i}_{0}\) , and \({i}_{0}\) is not visited between \({\tau }_{n}\) and \({\tau }_{n+1}\) , n = 1,2,…. We define \({Y }_{n}\) and \({Z}_{n,k}\) by \({Y }_{n} = {\tau }_{n+1} - {\tau }_{n}\) and \({Z}_{n,k} = {X}_{{\tau }_{n}+k},\ 0 \leq k < {Y }_{n}\) . Thus \({Y }_{n}\) is the time between the nth and the (n + 1)th visits to \({i}_{0}\) , and \({Z}_{n,k}\) is the state of the process k steps after the nth visit to \({i}_{0}\) , given that the next visit to \({i}_{0}\) occurs after \({\tau }_{n} + k\) . Using the memoryless property of the Markov chain X we obtain that the random vectors \(({Y }_{n},\ {Z}_{n,k},\ 0 \leq k < {Y }_{n}),\ n = 1,2,\ldots \) , are independent and their stochastic behaviors are identical. This fact ensures that the process is regenerative (Sect.  4.2 ).

In many cases, the study of Markov chains will be made simpler by the use of transition probability matrices.

Definition 3.7.

The matrices associated with the transition probabilities of a Markov chain X with finite or countably infinite state space \(\mathcal{X}\) are

$$\mathbf{\Pi } = \left [\begin{array}{cccc} {p}_{00} & {p}_{01} & \cdots & {p}_{0N} \\ {p}_{10} & {p}_{11} & \cdots & {p}_{1N}\\ \vdots & \vdots & \ddots & \vdots \\ {p}_{N0} & {p}_{N1} & \cdots &{p}_{NN} \end{array} \right ]\text{ and }\mathbf{\Pi } = \left [\begin{array}{cccc} {p}_{00} & {p}_{01} & {p}_{02} & \cdots \\ {p}_{10} & {p}_{11} & {p}_{12} & \cdots \\ {p}_{20} & {p}_{21} & {p}_{22} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right ].$$

These matrices are called (one-step) transition probability matrices.

A matrix with nonnegative entries \(\mathbf{A} ={ \left [{a}_{ij}\right ]}_{i,j\in \mathcal{X}}\) is called a stochastic matrix if for every row the sum of row elements equals 1. Then all transition probability matrices are stochastic ones:

  1. (a)

    The elements of \(\Pi \) are obviously nonnegative,

    $${p}_{ij} \geq 0,\ i,j \in \mathcal{X}.$$
  2. (b)

    For every i the sum of the ith row elements of \(\Pi \) equals 1,

    $$\sum\limits_{j\in \mathcal{X}}{p}_{ij} = 1,\ i \in \mathcal{X}.$$

The first of the following three examples shows that a sequence of independent and identically distributed discrete random variables is a homogeneous Markov chain. The second one shows that the sequence of partial sums of these random variables also constitutes a homogeneous Markov chain. If in the second case the random variables are independent but not identically distributed, then the defined sequence is an inhomogeneous Markov chain. The third example describes the stochastic behavior of a random walk on the real number line; in this case it is reasonable to choose the state space to be the set of all integers, that is, \(\mathcal{X} =\{ 0,\pm 1,\pm 2,\ldots \}\).

Let \({Z}_{0},{Z}_{1},\ldots \) be a sequence of independent and identically distributed random variables with common distribution

$$\mathbf{P}\left ({Z}_{m} = k\right ) = {p}_{k},\ {p}_{k} \geq 0,\ k = 0,1,\ldots ,\ m = 0,1,\ldots \,.$$

Example 3.8.

Define the discrete-time stochastic process X with the relation \({X}_{n} = {Z}_{n},\ n = 0,1,\ldots \). Then X is a homogeneous Markov chain with initial distribution \(\mathbf{P}\left ({X}_{0} = k\right ) = {p}_{k},\ k = 0,1,\ldots \) and transition probability matrix

$$\mathbf{\Pi } = \left [\begin{array}{cccc} {p}_{0} & {p}_{1} & {p}_{2} & \cdots \\ {p}_{ 0} & {p}_{1} & {p}_{2} & \cdots \\ {p}_{ 0} & {p}_{1} & {p}_{2} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right ].$$

Example 3.9.

Consider the process \({X}_{n} = {Z}_{1} + \ldots + {Z}_{n},\ n = 0,1,\ldots \), with the initial distribution \(\mathbf{P}\left ({X}_{0} = 0\right ) = 1\), i.e., the initial state is 0. The one-step transition probabilities are

$$\begin{array}{rcl}{ p}_{ij}(n,n + 1)& =& \mathbf{P}\left ({X}_{n+1} = j\ \vert \ {X}_{n} = i\right ) \\ & =& \mathbf{P}\left ({Z}_{1} + \ldots + {Z}_{n+1} = j\ \vert \ {Z}_{1} + \ldots + {Z}_{n} = i\right ) \\ & =& \mathbf{P}\left ({Z}_{n+1} = j - i\right ) = \left \{\begin{array}{c} {p}_{j-i},\text{ if }j \geq i, \\ \ \ 0,\text{ if }\ j < i. \end{array} \right. \end{array}$$

This means that the process X is a homogeneous Markov chain with the transition probability matrix

$$\mathbf{\Pi } = \left [\begin{array}{cccccc} {p}_{0} & {p}_{1} & {p}_{2} & {p}_{3} & {p}_{4} & \cdots \\ 0 &{p}_{0} & {p}_{1} & {p}_{2} & {p}_{3} & \cdots \\ 0 & 0 &{p}_{0} & {p}_{1} & {p}_{2} & \cdots \\ 0 & 0 & 0 &{p}_{0} & {p}_{1} & \cdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots \end{array} \right ].$$

Example 3.10.

Now let

$$\mathbf{P}\left ({Z}_{i} = +1\right ) = p,\ \mathbf{P}\left ({Z}_{i} = -1\right ) = 1 - p\ \ \ (0 < p < 1),\ i = 1,2,\ldots ,$$

be the common distribution of the sequence of independent random variables \({Z}_{1},{Z}_{2},\ldots \), and define the process \({X}_{n} = {Z}_{1} + \ldots + {Z}_{n},\ n = 1,2,\ldots \). Let \(\mathbf{P}\left ({X}_{0} = 0\right ) = 1\) be the initial distribution of the process X. Then the process X is a homogeneous Markov chain with initial state \({X}_{0} = 0\) and transition probabilities

$$\begin{array}{rcl}{ p}_{ij}(n,n + 1)& =& \mathbf{P}\left ({X}_{n+1} = j\ \vert \ {X}_{n} = i\right ) \\ & =& \mathbf{P}\left ({Z}_{1} + \ldots + {Z}_{n+1} = j\ \vert \ {Z}_{1} + \ldots + {Z}_{n} = i\right ) \\ & =& \mathbf{P}\left ({Z}_{n+1} = j - i\right ) = \left \{\begin{array}{c} \ \ \ p,\ \ \ \text{ if }j = i + 1,\ \\ 1 - p,\text{ if }j = i - 1, \\ \ \ \text{ }0,\text{ if }\left \vert i - j\right \vert \neq 1. \end{array} \right. \end{array}$$

The process X describes the random walk on the number line: starting from the origin, at each step it moves one unit to the right with probability p and one unit to the left with probability (1 − p), these moves being independent of each other. The case \(p = 1/2\) corresponds to the symmetric random walk.
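
A minimal simulation of Example 3.10 may be helpful; the sketch below (our own illustration, with arbitrary parameter values) generates one sample path of the random walk consistent with the transition probabilities above.

```python
import random

def random_walk(n_steps, p=0.5, seed=42):
    """One sample path of the random walk of Example 3.10, started from the origin."""
    rng = random.Random(seed)
    x, path = 0, [0]
    for _ in range(n_steps):
        x += 1 if rng.random() < p else -1   # +1 with probability p, -1 with probability 1-p
        path.append(x)
    return path

print(random_walk(10, p=0.5))
```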

Figure 3.1 demonstrates the transitions of the random walk, while Fig. 3.2 shows the transitions of a Markov chain with a finite state space.

Fig. 3.1 Random walk

Fig. 3.2 Markov chain with finite state space

1.2 The m-Step Transition Probabilities

Let X be a DTMC with discrete state space \(\mathcal{X}\). Denote by

$${p}_{ij}(s,t) = \mathbf{P}\left ({X}_{t} = j\ \vert \ {X}_{s} = i\right )$$

the transition probabilities of X and by

$$\Pi (s,t) ={ \left [{p}_{ij}(s,t)\right ]}_{i,j\in \mathcal{X}},\ 0 \leq s \leq t < \infty ,$$

the transition probability matrices. We set for s = t

$${ p}_{ij}(s,s) = \left \{\begin{array}{c} 1,\text{ if }i = j,\\ 0, \text{ if } i\neq j. \end{array} \right.$$

If the Markov chain X is homogeneous, then the transition probability \({p}_{ij}(s,t)\) depends only on the difference t − s. Thus, using the notation \(t = s + m\), we have

$${p}_{ij}(s,s + m) = {p}_{ij}(m),\ s,m = 0,1,\ldots ,\ i,j \in \mathcal{X}.$$

Definition 3.11.

The quantities \({p}_{ij}(m),\ m = 0,1,\ldots ,\ i,j \in \mathcal{X}\), are called the m-step transition probabilities of the Markov chain X, and the matrix \(\Pi (m) ={ \left [{p}_{ij}(m)\right ]}_{i,j\in \mathcal{X}}\) associated with them is called an m-step transition probability matrix.

Theorem 3.12 (Chapman–Kolmogorov equation). 

For all nonnegative integers r and s, the (r + s)-step transition probabilities of a homogeneous Markov chain satisfy the equation

$${p}_{ij}(r + s) =\sum\limits_{k\in \mathcal{X}}{p}_{ik}(r){p}_{kj}(s).$$
(3.4)

Proof.

Assume the initial state of the process is i, that is, the process starts from the state i at the time point 0. First we note that the relation

$${p}_{ik}(r) = \mathbf{P}\left ({X}_{r} = k\ \vert \ {X}_{0} = i\right ) = \frac{\mathbf{P}\left ({X}_{0} = i,{X}_{r} = k\right )} {\mathbf{P}\left ({X}_{0} = i\right )} = 0$$

holds for some state k if and only if \(\mathbf{P}\left ({X}_{0} = i,{X}_{r} = k\right ) = 0\). On the other hand, since \(\{{X}_{r} = k\},\ k \in \mathcal{X}\) form a complete system of events, \(\sum\limits_{k\in \mathcal{X}}\mathbf{P}\left ({X}_{r} = k\right ) = 1\), and, in accordance with the definitions of the (r + s)-step transition probability and the conditional probability, we obtain

$$\begin{array}{rcl}{ p}_{ij}(r + s)& =& \mathbf{P}\left ({X}_{r+s} = j\ \vert \ {X}_{0} = i\right ) \\ & =& \frac{\mathbf{P}\left ({X}_{r+s} = j,\ {X}_{0} = i\right )} {\mathbf{P}\left ({X}_{0} = i\right )} =\sum\limits_{k\in \mathcal{X}}\frac{\mathbf{P}\left ({X}_{r+s} = j,{X}_{0} = i,{X}_{r} = k\right )} {\mathbf{P}\left ({X}_{0} = i\right )} \\ & =& \sum\limits_{k\in \mathcal{X}}{\mathcal{I}}_{\left \{{p}_{ik}\neq 0\right \}}\frac{\mathbf{P}\left ({X}_{0} = i,{X}_{r} = k\right )} {\mathbf{P}\left ({X}_{0} = i\right )} \frac{\mathbf{P}\left ({X}_{r+s} = j,{X}_{0} = i,{X}_{r} = k\right )} {\mathbf{P}\left ({X}_{0} = i,{X}_{r} = k\right )} \\ & =& \sum\limits_{k\in \mathcal{X}}{\mathcal{I}}_{\left \{{p}_{ik}\neq 0\right \}}\mathbf{P}\left ({X}_{r} = k\ \vert \ {X}_{0} = i\right )\mathbf{P}\left ({X}_{r+s} = j\ \vert \ {X}_{r} = k,{X}_{0} = i\right ) \\ & =& \sum\limits_{k\in \mathcal{X}}{\mathcal{I}}_{\left \{{p}_{ik}\neq 0\right \}}{p}_{ik}(0,r){p}_{kj}(r,r + s) =\sum\limits_{k\in \mathcal{X}}{p}_{ik}(r){p}_{kj}(s).\end{array}$$

 □ 

If we use the matrix notation \(\mathbf{\Pi }(s,t) ={ \left [{p}_{ij}(s,t)\right ]}_{i,j\in \mathcal{X}}\), then the Chapman–Kolmogorov equation can be rewritten in the matrix form

$$\mathbf{\Pi }(s,t) = \mathbf{\Pi }(s,r)\mathbf{\Pi }(r,t),$$

where s, r, and t are integers satisfying \(0 \leq s \leq r \leq t\). Successively repeating this relation, for any integer \(n \geq 1\) we have

$$\mathbf{\Pi }(0,n) = \mathbf{\Pi }(0,1)\mathbf{\Pi }(1,n) = \ldots = \mathbf{\Pi }(0,1)\mathbf{\Pi }(1,2) \cdot \ldots \cdot \mathbf{\Pi }(n - 1,n).$$

Consequently, the m-step transition probability matrix of a homogeneous Markov chain can be given in the form

$$\mathbf{\Pi }(m) ={ \mathbf{\Pi }}^{m},$$

where \(\mathbf{\Pi } = \mathbf{\Pi }(0,1)\) is the (one-step) transition probability matrix of the Markov chain.
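
Numerically, the m-step matrix is simply the mth matrix power. The sketch below (with an arbitrary illustrative matrix of our own choosing) computes \(\mathbf{\Pi }(5) ={ \mathbf{\Pi }}^{5}\) and checks one instance of the Chapman–Kolmogorov equation.

```python
import numpy as np

# Illustrative one-step transition probability matrix (values chosen arbitrarily).
P = np.array([[0.2, 0.8, 0.0],
              [0.5, 0.0, 0.5],
              [0.3, 0.3, 0.4]])

P5 = np.linalg.matrix_power(P, 5)     # 5-step transition probabilities Pi(5) = Pi^5
print(P5)

# One instance of the Chapman-Kolmogorov equation: Pi(2 + 3) = Pi(2) Pi(3)
print(np.allclose(P5, np.linalg.matrix_power(P, 2) @ np.linalg.matrix_power(P, 3)))
```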

1.3 Classification of States of Homogeneous Markov Chains

The behavior of a Markov chain and its asymptotic properties essentially depend on the transition probabilities, which reflect the connections among the different states.

Denote by \({P}_{i}(t) = \mathbf{P}\left ({X}_{t} = i\right ),i \in \mathcal{X}\) the distribution of the Markov chain X at the time \(t \geq 0\). One of the most important questions in the theory of Markov chains concerns the conditions under which a limit distribution exists for all initial states \({X}_{0} = k \in \mathcal{X}\),

$$\mathop{\lim }\limits_{t \rightarrow \infty }P(t) = \pi = ({\pi }_{i},i \in \mathcal{X}),$$

of the time-dependent distribution \(P(t)=\left ({P}_{i}(t),i \in \mathcal{X}\right )\), where \({\pi }_{i}\geq 0,\ \sum\limits_{i\in \mathcal{X}}{\pi }_{i} = 1\). In the answer to this question, the arithmetic properties of the transition probabilities play an important role.

To demonstrate this fact, consider the case where the sample space \(\mathcal{X}\) can be divided into two disjoint (nonempty) sets \({\mathcal{X}}_{1}\) and \({\mathcal{X}}_{2}\) such that

$${p}_{ij} = {p}_{ji} = 0,\text{ for all }i \in {\mathcal{X}}_{1}\text{ and }j \in {\mathcal{X}}_{2}.$$

Obviously, if \({X}_{0} = {i}_{0} \in {\mathcal{X}}_{1}\) is the initial state, then the relation \({X}_{t} \in {\mathcal{X}}_{1}\) is valid for all \(t \geq 0\), and in the opposite case, \({X}_{t} \in {\mathcal{X}}_{2}\) for all \(t \geq 0\) holds if the initial state i 0 satisfies the condition \({i}_{0} \in {\mathcal{X}}_{2}\). This means that in this case we can in fact consider two Markov chains \(({\mathcal{X}}_{k},({P}_{i}(0),\ i \in {\mathcal{X}}_{k}),{\Pi }_{k})\), k = 1, 2, that can be investigated independently of each other.

Definition 3.13.

The state \(j \in \mathcal{X}\) is accessible from the state \(i \in \mathcal{X}\) (denoted by \(i \rightarrow j\)) if there exists a positive integer m such that \({p}_{ij}(m) > 0\). If the states \(i,j \in \mathcal{X}\) are mutually accessible from each other, then we say that they communicate (denoted by \(i\longleftrightarrow j\)).

\({p}_{ii}(0) = 1,\ i \in \mathcal{X}\) represents the assumption that “every state is accessible in 0 steps from itself.” If the state \(j \in \mathcal{X}\) is not accessible from the state \(i \in \mathcal{X}\) (denoted by \(i \nrightarrow j\)), then \({p}_{ij}(m) = 0,\ m \geq 1\). It is easy to check that \(i\longleftrightarrow j\) is an equivalence relation: it is reflexive, transitive, and symmetric. Furthermore, if the states i and j do not communicate, then either \({p}_{ij}(m) = 0,\ m \geq 1\), or \({p}_{ji}(m) = 0\), \(m \geq 1\). If a state i satisfies the condition \({p}_{ii} = {p}_{ii}(1) = 1\), then the state i is called absorbing. This means that if the process visits an absorbing state at time t, then it remains there forever and no more state transitions occur.

If the state space \(\mathcal{X}\) does not contain states i and j such that \(i \rightarrow j\) but \(j \nrightarrow i\), then \(\mathcal{X}\) can be given as a union of finitely or countably many disjoint sets

$$\mathcal{X} = {\mathcal{X}}_{1} \cup {\mathcal{X}}_{2} \cup \ldots ,$$

where for every k the states of \({\mathcal{X}}_{k}\) communicate with each other, while for every k, n, \(k\neq n\), the states of \({\mathcal{X}}_{k}\) are not accessible from the states of \({\mathcal{X}}_{n}\).

Definition 3.14.

A set of states is called irreducible if all pairs of its elements communicate.

In the theory of Markov chains, irreducible classes play an important role because they can be independently analyzed.

Definition 3.15.

A Markov chain is called irreducible if all pairs of its states communicate.

Clearly, if a Markov chain is irreducible, then it consists of only one irreducible class of states, that is, for every \(i,j \in \mathcal{X}\) there exists an integer \(m \geq 1\) (depending on i and j) such that \({p}_{ij}(m) > 0\).
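
Accessibility and communication can be checked mechanically from the transition probability matrix: j is accessible from i if and only if there is a directed path from i to j in the graph of positive one-step probabilities. The sketch below (illustrative matrix and helper names of our own choosing) computes the accessibility relation and tests irreducibility.

```python
import numpy as np

def accessibility(P):
    """Boolean matrix A with A[i, j] = True iff j is accessible from i (in >= 0 steps)."""
    n = P.shape[0]
    A = (P > 0) | np.eye(n, dtype=bool)          # p_ii(0) = 1: every state reaches itself
    for _ in range(n):                           # paths of length <= n suffice
        A = A | ((A.astype(int) @ A.astype(int)) > 0)
    return A

def is_irreducible(P):
    A = accessibility(P)
    return bool((A & A.T).all())                 # all pairs of states communicate

# Illustrative reducible chain: state 2 is absorbing (values chosen arbitrarily).
P = np.array([[0.5, 0.4, 0.1],
              [0.3, 0.6, 0.1],
              [0.0, 0.0, 1.0]])
print(is_irreducible(P))    # False: 0 and 1 lead to 2, but 2 does not lead back
```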

Definition 3.16.

For every i denote by d(i) the greatest common divisor of the integers \(m \geq 1\) for which \({p}_{ii}(m) > 0\). If \({p}_{ii}(m) = 0\) for every m, then we set d(i) = 0. The number d(i) is called the period of the state i. If d(i) = 1 for every state, then the Markov chain is called aperiodic.

Example 3.17 (Periodic Markov chain). 

Consider the random walk on the number line demonstrated earlier in Example 3.10. Starting from an arbitrary state i we can return to state i with positive probabilities in steps \(2,4,\ldots \) only. It is clear that in this case, \({p}_{ii}(2k) > 0\) and \({p}_{ii}(2(k - 1) + 1) = 0\) for every \(i \in \mathcal{X}\) and \(k = 1,2,\ldots \); therefore, d(i) = 2. At the same time, the Markov chain is obviously irreducible.
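
The period d(i) of Definition 3.16 can also be computed mechanically as the greatest common divisor of the step counts m (up to some cutoff) with \({p}_{ii}(m) > 0\). The sketch below (our own illustration) recovers d = 2 for a deterministic two-state "flip" chain and d = 1 for a chain with a self-loop.

```python
import numpy as np
from math import gcd

def period(P, i, max_steps=50):
    """gcd of the step counts m <= max_steps with p_ii(m) > 0 (returns 0 if no return occurs)."""
    d, Pm = 0, np.eye(P.shape[0])
    for m in range(1, max_steps + 1):
        Pm = Pm @ P                      # Pm now holds the m-step probabilities Pi(m)
        if Pm[i, i] > 0:
            d = gcd(d, m)
    return d

flip = np.array([[0.0, 1.0],             # deterministic 0 -> 1 -> 0 -> ...: period 2
                 [1.0, 0.0]])
loop = np.array([[0.5, 0.5],             # self-loop at state 0: aperiodic
                 [1.0, 0.0]])
print(period(flip, 0), period(loop, 0))  # 2 1
```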

Theorem 3.18.

Let X be a homogeneous Markov chain with state space \(\mathcal{X}\) , and let \({\mathcal{X}}^{{\prime}}\subset \mathcal{X}\) be a nonempty irreducible class. Then for every \(i,j \in {\mathcal{X}}^{{\prime}}\) , the periods of i and j are the same, i.e., d(i) = d(j).

Proof.

Let \(i,j \in {\mathcal{X}}^{{\prime}}\), \(i\neq j\), be two arbitrary states. Since \({\mathcal{X}}^{{\prime}}\) is an irreducible class, there exist \(t,s \geq 1\) integers such that the inequalities \({p}_{ij}(t) > 0\) and \({p}_{ji}(s) > 0\) hold. From this, by the Chapman–Kolmogorov equation, we obtain

$${p}_{ii}(t + s) \geq {p}_{ij}(t){p}_{ji}(s) > 0\text{ and }{p}_{jj}(t + s) \geq {p}_{ji}(s){p}_{ij}(t) > 0;$$

therefore, the numbers d(i) and d(j) differ from 0. Choose arbitrarily an integer \(m \geq 1\) such that \({p}_{ii}(m) > 0\). Repeatedly applying the Chapman–Kolmogorov equation, we have for any \(k \geq 1\)

$${p}_{jj}(t + s + km) \geq {p}_{ji}(s){p}_{ii}(km){p}_{ij}(t) \geq {p}_{ji}(s){\left ({p}_{ii}(m)\right )}^{k}{p}_{ij}(t) > 0.$$

Thus, by the definition of the period of the state j, d(j) is a divisor of both \((t + s + m)\) and \((t + s + 2m)\), and hence it is also a divisor of their difference \((t + s + 2m) - (t + s + m) = m\). From this it immediately follows that d(j) is a divisor of every m for which \({p}_{ii}(m) > 0\), and thus it is a divisor of d(i); therefore, \(d(j) \leq d(i)\). Changing the roles of i and j we get the reverse inequality \(d(j) \geq d(i)\), and consequently d(j) = d(i). □ 

Notice that from this theorem it follows that the states of an irreducible class have a common period \(d({\mathcal{X}}^{{\prime}})\) called the period of the class. As a consequence, we have the following assertion.

Corollary 3.19.

If the Markov chain X is homogeneous and irreducible with state space \(\mathcal{X}\) , then every state has the same period \(d = d(\mathcal{X}) > 0\) and is periodic or aperiodic depending on d > 1 or d = 1, respectively.

The main property of the set of step counts k for which the probability of returning to a state i in k steps is positive, i.e., \({p}_{ii}(k) > 0\), is given by the following assertion.

Theorem 3.20.

Let X be a homogeneous irreducible Markov chain with state space \(\mathcal{X}\) . Then for every state \(i \in \mathcal{X}\) there exists an integer M i such that \({p}_{ii}(d(i)m) > 0\) if \(m \geq {M}_{i}\) .

Proof.

By the previous theorem, \(d(i) \geq 1\). Let \({m}_{1},\ldots ,{m}_{L}\) be distinct positive integers such that, on the one hand, \({p}_{ii}({m}_{k}) > 0\), \(1 \leq k \leq L\), and, on the other hand, d(i) is the greatest common divisor of the integers \({m}_{1},\ldots ,{m}_{L}\). By a well-known result of number theory, there exists an integer \({M}_{i}\) such that for every integer \(m \geq {M}_{i}\) the equation \(md(i) = {r}_{1}{m}_{1} + \ldots + {r}_{L}{m}_{L}\) has a solution in nonnegative integers \({r}_{1},\ldots ,{r}_{L}\). Applying this fact and the Chapman–Kolmogorov equation we obtain

$${p}_{ii}(md(i)) \geq {\left ({p}_{ii}({m}_{1})\right )}^{{r}_{1} } \cdot \ldots \cdot {\left ({p}_{ii}({m}_{L})\right )}^{{r}_{L} } > 0,$$

and consequently the assertion of the theorem is true. □ 

Consider now a homogeneous irreducible Markov chain with period \(d(\mathcal{X}) > 1\). We show that the transitions among the states have a cyclic structure, as demonstrated by the random walk on the number line (Example 3.28): if the walk starts from state 0, then the process can take only even integers at even steps and only odd integers at odd steps. The cyclic property in this case means that even-numbered states are followed by odd-numbered states and odd-numbered states are followed by even-numbered ones. This division of the states is now generalized to Markov chains with arbitrary period d.

Let \({i}_{0} \in \mathcal{X}\) be an arbitrarily fixed state, and define the sets

$${\mathcal{X}}_{k} =\{ j \in \mathcal{X} :\ {p}_{{i}_{0}j}(k + md) > 0,\text{ for some }m \geq 0\}\text{ , }k = 0,1,\ldots ,d - 1.$$

That is, \({\mathcal{X}}_{k}\) is the set of states that are accessible from \({i}_{0}\) in k + md (m = 0, 1, …) steps.

Theorem 3.21.

The sets \({\mathcal{X}}_{0},\ldots ,{\mathcal{X}}_{d-1}\) are disjoint, \(\mathcal{X} = {\mathcal{X}}_{0} \cup \ldots \cup {\mathcal{X}}_{d-1}\) , and the Markov chain allows for only the following cyclic transitions among the sets \({\mathcal{X}}_{k}\) :

$${\mathcal{X}}_{0} \rightarrow {\mathcal{X}}_{1} \rightarrow \ldots \rightarrow {\mathcal{X}}_{d-1} \rightarrow {\mathcal{X}}_{0}.$$
(3.5)

Proof.

First we prove that the sets \({\mathcal{X}}_{0},\ldots ,{\mathcal{X}}_{d-1}\) are disjoint and their union is \(\mathcal{X}\). To the contrary, assume that there exist a state \(j \in \mathcal{X}\) and integers \({k}_{1},{k}_{2},{m}_{1},{m}_{2}\) such that \(0 \leq {k}_{1} < {k}_{2} \leq d - 1,\) \({m}_{1},{m}_{2} \geq 1,\) \({p}_{{i}_{0}j}({k}_{1} + {m}_{1}d) > 0\), and \({p}_{{i}_{0}j}({k}_{2} + {m}_{2}d) > 0\). Since the Markov chain is irreducible, there exists an integer \(K \geq 1\) such that \({p}_{j{i}_{0}}(K) > 0\). Using the Chapman–Kolmogorov equation we have

$$\begin{array}{rcl}{ p}_{{i}_{0}{i}_{0}}({k}_{1} + {m}_{1}d + K)& \geq & {p}_{{i}_{0}j}({k}_{1} + {m}_{1}d){p}_{j{i}_{0}}(K) > 0, \\ {p}_{{i}_{0}{i}_{0}}({k}_{2} + {m}_{2}d + K)& \geq & {p}_{{i}_{0}j}({k}_{2} + {m}_{2}d){p}_{j{i}_{0}}(K) >0. \end{array}$$

By the definition of the period d, d is a divisor of both \(({k}_{1} + {m}_{1}d + K)\) and \(({k}_{2} + {m}_{2}d + K)\); thus it is also a divisor of their difference, that is, of \(({k}_{2} - {k}_{1}) + ({m}_{2} - {m}_{1})d\). Consequently, d is a divisor of the difference \(({k}_{2} - {k}_{1})\), which is a contradiction because \(0 < {k}_{2} - {k}_{1} \leq d - 1\). Since, by irreducibility, every state \(i \in \mathcal{X}\) is accessible from the state \({i}_{0}\), we also have \(\mathcal{X} = {\mathcal{X}}_{0} \cup \ldots \cup {\mathcal{X}}_{d-1}\).

We now verify that for every \(k,\ 0 \leq k \leq d - 1,\ i \in {\mathcal{X}}_{k}\) and \(j \in \mathcal{X}\) such that \({p}_{ij} > 0\), the relation \(j \in {\mathcal{X}}_{K}\), \(0 \leq K < d\), is true, where

$$K = \left \{\begin{array}{c} k + 1,\ \text{ if }0 \leq k < d - 1,\\ \text{ } 0,\ \ \ \ \ \ \text{ if } k = d - 1.\text{ } \end{array} \right.$$

This property guarantees the transitions between the states in (3.5).

Since \(i \in {\mathcal{X}}_{k}\), by the definition of the sets \({\mathcal{X}}_{k}\) there exists an integer m ≥ 0 such that \({p}_{{i}_{0}i}(k + md) > 0\). From this, by the Chapman–Kolmogorov equation, we have

$${p}_{{i}_{0}j}(k + 1 + md) \geq {p}_{{i}_{0}i}(k + md){p}_{ij} > 0.$$

In view of the fact that

$$k+1+md = \left \{\begin{array}{ll} K + md, &\text{ if }\ 0 \leq k < d - 1,\\ 0 + (m + 1)d, &\text{ if }\ k = d - 1, \end{array} \right.$$

from the definition of \({\mathcal{X}}_{K}\) follows the relation \(j \in {\mathcal{X}}_{K}\). □ 

As a consequence of Theorem 3.21, we have the next important corollary, which allows us to consider an aperiodic Markov chain instead of a periodic one.

Corollary 3.22.

Theorem  3.21 states that starting from a state of \({\mathcal{X}}_{k}\) , \(k = 0,1,\ldots ,d - 1\) , after exactly d steps the process returns to a state of \({\mathcal{X}}_{k}\) . If we define the quantities

$${p}_{ij}^{(k)} = \mathbf{P}\left ({X}_{d} = j\ \vert \ {X}_{0} = i\right ),\ i,j \in {\mathcal{X}}_{k},$$

then \(\sum\limits_{j\in {\mathcal{X}}_{k}}{p}_{ij}^{(k)} = 1,\ i \in {\mathcal{X}}_{k}\) follows. This means that the matrices \({\mathbf{P}}^{(k)} ={ \left [{p}_{ij}^{(k)}\right ]}_{i,j\in {\mathcal{X}}_{k}}\) are stochastic; they can be interpreted as one-step transition probability matrices, and consequently the processes

$${Y }^{(k)} = ({Y }_{ 0},{Y }_{1},\ldots ),\ k = 0,1,\ldots ,d - 1,$$

with the state space \({\mathcal{X}}_{k}\) and transition probability matrix \({\mathbf{P}}^{(k)}\) , are homogeneous and irreducible Markov chains, and so, instead of the original chain, d homogeneous irreducible Markov chains can be considered independently.

If the states of the Markov chain are numbered according to the sets \({\mathcal{X}}_{k}\), \(k = 0,1,\ldots ,d - 1\), then the transition probability matrix has the following block-cyclic structure, where the block \({\mathbf{\Pi }}_{k,k+1}\) collects the transition probabilities from the states of \({\mathcal{X}}_{k}\) to the states of \({\mathcal{X}}_{k+1}\) (indices taken modulo d) and all other blocks are zero:

$$\mathbf{\Pi } = \left [\begin{array}{ccccc} \mathbf{0} &{\mathbf{\Pi }}_{0,1} & \mathbf{0} & \cdots & \mathbf{0}\\ \mathbf{0} & \mathbf{0} &{\mathbf{\Pi }}_{1,2} & \cdots & \mathbf{0}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots &{\mathbf{\Pi }}_{d-2,d-1}\\ {\mathbf{\Pi }}_{d-1,0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \end{array} \right ].$$

1.4 Recurrent Markov Chains

We consider the question of what conditions ensure the existence of limit theorems for homogeneous aperiodic Markov chains, that is, under what conditions does there exist the limit distribution \(\pi = ({\pi }_{i},i \in \mathcal{X}),\) \(({\pi }_{i} \geq 0,\ \sum\limits_{i\in \mathcal{X}}{\pi }_{i} = 1)\), such that, independently of the initial distribution \(({p}_{i},i \in \mathcal{X})\), the limit is

$$\mathop{\lim }\limits_{n \rightarrow \infty }{P}_{i}(n) =\mathop{\lim }\limits_{ n \rightarrow \infty }\mathbf{P}\left ({X}_{n} = i\right ) = {\pi }_{i},\ i \in \mathcal{X}?$$

To provide an answer to this question, it is necessary to consider some quantities such as the probability and the expected value of returning to a given state of a Markov chain or arriving at a state j from another state i. Let \(i,j \in \mathcal{X}\) be two arbitrary states, and introduce the following notations:

$$\begin{array}{rcl} {T}_{ij}& =& \inf \{n :\ n \geq 1,\ {X}_{n} = j\ \vert \ {X}_{0} = i\}, \\ {f}_{ij}(0)& =& 0,\ \\ {f}_{ij}(1)& =& \mathbf{P}\left ({X}_{1} = j\ \vert \ {X}_{0} = i\right ), \\ {f}_{ij}(n)& =& \mathbf{P}\left ({X}_{1}\neq j,{X}_{2}\neq j,\ldots ,{X}_{n-1}\neq j,{X}_{n} = j\ \vert \ {X}_{0} = i\right ),\ n = 2,3,\ldots .\end{array}$$

If \(i\neq j\), then the quantity \({f}_{ij}(n) = \mathbf{P}\{{T}_{ij} = n\}\) is the first hit (or first passage) probability of the state j from i, that is, the probability that, starting from the state i at time 0, the process reaches the state j for the first time in exactly n steps (at time n). If i = j, then the quantity \({f}_{ii}(n)\) is the probability of the first return to the state i in n steps.

Denote \({f}_{ij} =\sum\limits_{k=1}^{\infty }{f}_{ij}(k),\ i,j \in \mathcal{X}\). Obviously, the quantity \({f}_{ij}\) is the probability that the Markov chain, starting from the state i at time 0, ever arrives at the state j, that is, \({f}_{ij} = \mathbf{P}\{{T}_{ij} < \infty \}\).

Definition 3.23.

A state i is called recurrent if the process returns to the state i with probability 1, that is, \({f}_{ii} = \mathbf{P}\{{T}_{ii} < \infty \} = 1\). If \({f}_{ii} < 1\), then the state i is called transient.

From the definition it follows that if i is a transient state, then the process, with positive probability, never returns to the state i. The following theorem describes the connection between the return probabilities and the m-step transition probabilities of a Markov chain in the form of a so-called discrete renewal equation.

Theorem 3.24.

For every \(i,j \in \mathcal{X},\ n = 1,2,\ldots \) ,

$${p}_{ij}(n) =\sum\limits_{k=1}^{n}{f}_{ ij}(k){p}_{jj}(n - k).$$
(3.6)

Proof.

By the definition \({p}_{jj}(0) = 1\), in the case n = 1 we have \({p}_{ij}(1) = {f}_{ij}(1){p}_{jj}(0) = {f}_{ij}(1)\). Now let \(n \geq 2\). Using conditional probability and the Markov property we get

$$\begin{array}{rcl} \mathbf{P}\left ({X}_{n}\,=\,j,{X}_{1}\,=\,j\ \vert \ {X}_{0}\,=\,i\right )& =& \mathbf{P}\left ({X}_{n}\,=\,j\ \vert \ {X}_{1}\,=\,j,{X}_{0}\,=\,i\right )\mathbf{P}\left ({X}_{1}\,=\,j\ \vert \ {X}_{0}\,=\,i\right ) \\ & =& {p}_{jj}(n - 1){p}_{ij}(1) = {f}_{ij}(1){p}_{jj}(n - 1).\end{array}$$

Similarly, we obtain

$$\begin{array}{rcl} & & \mathbf{P}\left ({X}_{n} = j,{X}_{k} = j,{X}_{m}\neq j,1 \leq m \leq k - 1\ \vert \ {X}_{0} = i\right ) \\ & & \quad = {f}_{ij}(k){p}_{jj}(n - k),\ \ n = 1,2,\ldots .\end{array}$$

On the basis of the last two equations, it follows that

$$\begin{array}{rcl}{ p}_{ij}(n)& =& \mathbf{P}\left ({X}_{n} = j,{X}_{1} = j\ \vert \ {X}_{0} = i\right ) \\ & & +\sum\limits_{k=2}^{n}\mathbf{P}\left ({X}_{ n} = j,{X}_{k} = j,{X}_{m}\neq j,1 \leq m \leq k - 1\ \vert \ {X}_{0} = i\right ) \\ & =& {f}_{ij}(1){p}_{jj}(n - 1) +\sum\limits_{k=2}^{n}{f}_{ ij}(k){p}_{jj}(n - k),\ \ n = 1,2,\ldots .\end{array}$$

 □ 
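
Since \({p}_{jj}(0) = 1\), Eq. (3.6) can be solved step by step for the first-passage probabilities: \({f}_{ij}(n) = {p}_{ij}(n) -\sum\limits_{k=1}^{n-1}{f}_{ij}(k){p}_{jj}(n - k)\). The sketch below (illustrative matrix and function names of our own choosing) computes them this way.

```python
import numpy as np

def first_passage_probs(P, i, j, n_max):
    """f_ij(1), ..., f_ij(n_max) obtained by inverting the discrete renewal equation (3.6)."""
    powers = [np.eye(P.shape[0])]            # n-step matrices Pi(0), Pi(1), ..., Pi(n_max)
    for _ in range(n_max):
        powers.append(powers[-1] @ P)
    f = np.zeros(n_max + 1)
    for n in range(1, n_max + 1):
        f[n] = powers[n][i, j] - sum(f[k] * powers[n - k][j, j] for k in range(1, n))
    return f[1:]

# Illustrative chain (values chosen arbitrarily).
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0]])
print(first_passage_probs(P, 0, 2, 6))       # f_02(n) = 0, 0.5, 0.25, 0.125, ...
```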

The notion of the recurrence of a state is defined by the return probabilities, but the following theorem makes it possible to provide a condition for it with the use of n-step transition probabilities \({p}_{ii}(n)\) and to classify the Markov chains.

Theorem 3.25.

  1. (a)

    The state \(i \in \mathcal{X}\) is recurrent if and only if

    $$\sum\limits_{n=1}^{\infty }{p}_{ ii}(n) = \infty.$$
  2. (b)

    If i and j are communicating states and i is recurrent, then j is also recurrent.

  3. (c)

    If a state \(j \in \mathcal{X}\) is transient, then for arbitrary \(i \in \mathcal{X}\)

    $$\sum\limits_{n=1}^{\infty }{p}_{ ij}(n) < \infty \text{ and consequently }\mathop{\lim }\limits_{n \rightarrow \infty }{p}_{ij}(n) = 0.$$

Proof.

  1. (a)

    By the definition \({p}_{ii}(0) = 1\) and using relation (3.6) of the preceding theorem we obtain

    $$\begin{array}{rcl} \sum\limits_{n=1}^{\infty }{p}_{ ii}(n)& =& \sum\limits_{n=1}^{\infty }\sum\limits_{k=1}^{n}{f}_{ ii}(k){p}_{ii}(n - k) =\sum\limits_{k=1}^{\infty }\sum\limits_{n=k}^{\infty }{f}_{ ii}(k){p}_{ii}(n - k) \\ & =& \sum\limits_{k=1}^{\infty }{f}_{ ii}(k)\left ({p}_{ii}(0) +\sum\limits_{n=1}^{\infty }{p}_{ ii}(n)\right ).\end{array}$$

    From this equation, if the sum \(\sum\limits_{n=1}^{\infty }{p}_{ii}(n)\) is finite, then we get

    $${f}_{ii} ={ \left (1 +\sum\limits_{n=1}^{\infty }{p}_{ ii}(n)\right )}^{-1}\sum\limits_{n=1}^{\infty }{p}_{ ii}(n) < 1;$$

    consequently, i is not a recurrent state. If \(\sum\limits_{n=1}^{\infty }{p}_{ii}(n) = \infty \), then obviously \(\mathop{\lim }\limits_{N \rightarrow \infty }\sum\limits_{n=1}^{N}{p}_{ii}(n) = \infty \). Since for all positive integers N the relation

    $$\begin{array}{rcl} \sum\limits_{n=1}^{N}{p}_{ ii}(n)& =& \sum\limits_{n=1}^{N}\sum\limits_{k=1}^{n}{f}_{ ii}(k){p}_{ii}(n - k) \\ & =& \sum\limits_{k=1}^{N}\sum\limits_{n=k}^{N}{f}_{ ii}(k){p}_{ii}(n - k) \leq \sum\limits_{k=1}^{N}{f}_{ ii}(k)\sum\limits_{n=0}^{N}{p}_{ ii}(n) \\ & \leq &\left (1 +\sum\limits_{n=1}^{N}{p}_{ ii}(n)\right )\sum\limits_{k=1}^{N}{f}_{ ii}(k) \\ \end{array}$$

    holds, from the limit \(\sum\limits_{n=1}^{N}{p}_{ii}(n) \rightarrow \infty \)

    $$1\geq {f}_{ii}\,=\,\sum\limits_{k=1}^{\infty }{f}_{ ii}(k)\geq \sum\limits_{k=1}^{N}{f}_{ ii}(k)\geq {\left (1+\sum\limits_{k=1}^{N}{p}_{ ii}(k)\right )}^{-1}\sum\limits_{k=1}^{N}{p}_{ ii}(k)\rightarrow 1,\ N\,\rightarrow \,\infty $$

    follows. Consequently, f ii  = 1, and thus the state i is recurrent.

  2. (b)

    Since the states i and j communicate, there exist integers \(n,m \geq 1\) such that \({p}_{ij}(m) > 0\) and \({p}_{ji}(n) > 0\). By the Chapman–Kolmogorov equation for every integer \(k \geq 1\),

    $$\begin{array}{rcl} {p}_{ii}(m + k + n)& \geq & {p}_{ij}(m){p}_{jj}(k){p}_{ji}(n), \\ {p}_{jj}(m + k + n)& \geq & {p}_{ji}(n){p}_{ii}(k){p}_{ij}(m).\end{array}$$

    From this

    $$\begin{array}{rcl} \sum\limits_{k=1}^{\infty }{p}_{ ii}(k)& \geq &\sum\limits_{k=1}^{\infty }{p}_{ ii}(m + n + k) \geq {p}_{ij}(m){p}_{ji}(n)\sum\limits_{k=1}^{\infty }{p}_{ jj}(k), \\ \sum\limits_{k=1}^{\infty }{p}_{ jj}(k)& \geq &\sum\limits_{k=1}^{\infty }{p}_{ jj}(m + n + k) \geq {p}_{ij}(m){p}_{ji}(n)\sum\limits_{k=1}^{\infty }{p}_{ ii}(k).\end{array}$$

    Both series \(\sum\limits_{k=1}^{\infty }{p}_{ii}(k)\) and \(\sum\limits_{k=1}^{\infty }{p}_{jj}(k)\) are simultaneously convergent or divergent because \({p}_{ij}(m) > 0\) and \({p}_{ji}(n) > 0\); thus, by assertion (a) of the theorem, the states i and j are recurrent or transient at the same time.

  3. (c)

    Applying the discrete renewal Eq. (3.6) and result (a), assertion (c) immediately follows.

 □ 

Definition 3.26.

A Markov chain is called recurrent or transient if every state is recurrent or transient.

Comment 3.27.

Using the n-step transition probabilities \({p}_{ii}(n)\) , a simple formula can be given for the expected value of the number of returns to a state \(i \in \mathcal{X}\) . Let X 0 = i be the initial state of the Markov chain. The expected value of the return number is expressed as

$$\begin{array}{rcl} \mathbf{E}\left (\sum\limits_{k=1}^{\infty }{\mathcal{I}}_{\left \{{ X}_{k}=i\right \}}\vert {X}_{0} = i\right )& =& \sum\limits_{k=1}^{\infty }\mathbf{E}\left ({\mathcal{I}}_{\left \{{ X}_{k}=i\right \}}\vert {X}_{0} = i\right ) \\ & =& \sum\limits_{k=1}^{\infty }\mathbf{P}\left ({X}_{ k} = i\ \vert \ {X}_{0} = i\right ) =\sum\limits_{k=1}^{\infty }{p}_{ ii}(k).\end{array}$$

The assertion of Theorem  3.25 can be interpreted in another way: a state \(i \in \mathcal{X}\) is recurrent if and only if the expected value of the number of returns equals infinity.

Example 3.28 (Recurrent Markov chain). 

Consider the random walk process \(X = ({X}_{n},\ n = 0,1,\ldots )\) described in Example 3.10. Starting from the origin, at each step the process moves one unit to the right with probability p and one unit to the left with probability (1 − p), independently of the other steps. We proved earlier that the process X is a homogeneous, irreducible, and periodic Markov chain with period 2. Here we discuss the conditions under which this Markov chain is recurrent.

By the condition X 0 = 0, it is clear that \({p}_{00}(2k + 1) = 0,\ k = 0,1,\ldots \). The process can return in 2k steps to the state 0 only if it moves, in some way, k times to the left and k times to the right, the probability of which is

$${p}_{00}(2k) = \left ({ 2k \atop k} \right ){p}^{k}{(1 - p)}^{k} = \frac{(2k)!} {k!k!} {[p(1 - p)]}^{k}.$$

We use the well-known Stirling formula, which provides the following bounds for k! (see p. 616 of [5]):

$$\sqrt{2\pi }{k}^{k+1/2}{\mathrm{e}}^{-k} < k! < \sqrt{2\pi }{k}^{k+1/2}{\mathrm{e}}^{-k}\left (1 + \frac{1} {4k}\right );$$

hence

$$k! \approx {\left (\frac{k} {\mathrm{e}} \right )}^{k}\sqrt{2\pi k};$$

and thus we have

$${p}_{00}(2k) \approx {\left (\frac{2k} {\mathrm{e}} \right )}^{2k}\sqrt{2\pi (2k)}{\left ({\left (\frac{k} {\mathrm{e}} \right )}^{k}\sqrt{2\pi k}\right )}^{-2}{[p(1 - p)]}^{k} = \frac{{[4p(1 - p)]}^{k}} {\sqrt{\pi k}}.$$

By the inequality between the arithmetic and geometric means, the factor \(4p(1 - p)\) appearing in the numerator has the upper bound

$$4\left [p(1 - p)\right ] \leq 4{\left [\frac{p + (1 - p)} {2} \right ]}^{2} = 1,$$

where equality holds if and only if \(p = 1 - p\), that is, \(p = 1/2\). In all other cases the product is less than 1; consequently, the sum of the return probabilities \({p}_{00}(2k)\) is divergent if and only if \(p = 1/2\) (symmetric random walk); otherwise it is convergent. As a consequence of Theorem 3.25, we obtain that the state 0, and with it every state of the Markov chain, is recurrent if and only if \(p = 1/2\).

Note that a similar result is valid if we consider the random walk with integer coordinates in the plane. It can be verified that the state (0, 0) is recurrent only in the case of the symmetric random walk, i.e., when the probabilities of the movements left, right, up, and down are each \(1/4\). In addition, if a random walk is defined in a similar way in a higher-dimensional (\(\geq 3\)) space, then the Markov chain is no longer recurrent even in the symmetric case.
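
The divergence criterion of Theorem 3.25 can be illustrated numerically for the one-dimensional walk (our own illustration): the partial sums of \({p}_{00}(2k)\) grow without bound for \(p = 1/2\) and stay bounded for \(p\neq 1/2\).

```python
def sum_return_probs(p, k_max):
    """Partial sum of p_00(2k) = C(2k, k) [p(1-p)]^k for k = 1, ..., k_max."""
    total, term, q = 0.0, 1.0, p * (1 - p)
    for k in range(1, k_max + 1):
        term *= q * (2 * k) * (2 * k - 1) / (k * k)   # p_00(2k) from p_00(2k - 2)
        total += term
    return total

for p in (0.5, 0.6):
    print(p, [round(sum_return_probs(p, k), 3) for k in (10, 100, 1000)])
# p = 0.5: the partial sums keep growing (roughly 2*sqrt(k/pi));
# p = 0.6: they approach a finite limit, so the walk is transient
```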

2 Fundamental Limit Theorem of Homogeneous Markov Chains

2.1 Positive Recurrent and Null Recurrent Markov Chains

Let X be a homogeneous Markov chain with the finite (\(N < \infty \)) or countably infinite (\(N = \infty \)) state space \(\mathcal{X} =\{ 0,1,\ldots ,N\}\) and (one-step) transition probability matrix \(\mathbf{\Pi } ={ \left [{p}_{ij}\right ]}_{i,j\in \mathcal{X}}\). Let \(P = ({p}_{i} = \mathbf{P}\left ({X}_{0} = i\right ),\ i \in \mathcal{X})\) be the initial distribution. Denote by \(P(n) = ({P}_{i}(n) = \mathbf{P}\left ({X}_{n} = i\right ),\ i \in \mathcal{X}),\) \(n = 0,1,\ldots \), the time-dependent distribution of the Markov chain; then \(P(0) = P.\)

The main question to be investigated here concerns the conditions under which there exists a limit distribution of m-step transition probabilities

$$\mathop{\lim }\limits_{m \rightarrow \infty }{p}_{ij}(m) = {\pi }_{j},\text{ where }{\pi }_{j} \geq 0\text{ and }\sum\limits_{i\in \mathcal{X}}{\pi }_{i} = 1$$

and how it can be determined. The answer is closely related to the behavior of the recurrent states i of a Markov chain. Note that the condition of recurrence \({f}_{ii} =\sum\limits_{k=1}^{\infty }{f}_{ii}(k) = 1\) does not ensure the existence of a limit distribution. The main characteristics are the expected values of the return times, \({\mu }_{i} = \mathbf{E}\left ({T}_{ii}\right ) =\sum\limits_{k=1}^{\infty }k{f}_{ii}(k)\), and the recurrent states will be classified according to whether or not the \({\mu }_{i}\) are finite, because the condition \({\mu }_{i} < \infty ,\ i \in \mathcal{X}\), guarantees the existence of a limit distribution.

Definition 3.29.

A recurrent state \(i \in \mathcal{X}\) is called positive recurrent (or nonnull recurrent) if the return time has a finite expected value \({\mu }_{i}\); otherwise, if \({\mu }_{i} = \infty \), then it is called null recurrent.

Theorem 3.30.

Let X be a homogeneous, irreducible, aperiodic, and recurrent Markov chain. Then for all states \(i,j \in \mathcal{X}\) ,

$$\mathop{\lim }\limits_{m \rightarrow \infty }{p}_{ij}(m) = \frac{1} {{\mu }_{j}}.$$

Note that this theorem not only gives the limit of the m-step transition probabilities with the help of the expected value of the return times, but it interprets the notion of positive and null recurrence. By definition, a recurrent state j is positive recurrent if \(1/{\mu }_{j} > 0\) and null recurrent if \(1/{\mu }_{j} = 0\) (here and subsequently, we write \(1/\infty = 0\)). The assertion given in the theorem is closely related to the discrete renewal Eq. (3.6), and using it we can prove a limit theorem, as the following lemma shows (see [29] and Chap. XIII of [31]).
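
A small numerical check of Theorem 3.30 may be helpful (our own illustration, with an arbitrary irreducible aperiodic chain): the rows of \({\mathbf{\Pi }}^{m}\) converge for large m, and the limit value in column j can be compared with \(1/{\mu }_{j}\), where \({\mu }_{j}\) is estimated here by simulating return times.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative irreducible aperiodic chain (values chosen arbitrarily).
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2],
              [0.3, 0.3, 0.4]])

# lim p_ij(m): for large m all rows of Pi^m are (numerically) identical.
limit_row = np.linalg.matrix_power(P, 200)[0]

def mean_return_time(P, j, n_returns=5000):
    """Estimate mu_j = E(T_jj) by simulating return cycles that start and end in j."""
    total, state = 0, j
    for _ in range(n_returns):
        steps = 0
        while True:
            state = rng.choice(len(P), p=P[state])
            steps += 1
            if state == j:
                break
        total += steps
    return total / n_returns

for j in range(3):
    print(j, limit_row[j], 1 / mean_return_time(P, j))   # the two values nearly agree
```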

Lemma 3.31 (Erdős, Feller, Pollard). 

Let \(({q}_{i},\ i \geq 0)\) be an arbitrary distribution on the natural numbers, i.e., \({q}_{i} \geq 0,\ \sum\limits_{i=0}^{\infty }{q}_{i} = 1\) . Assume that the distribution \(({q}_{i},\ i \geq 0)\) is nonlattice, that is, the greatest common divisor of the indices i with \({q}_{i} > 0\) equals 1. If the sequence \(\{{v}_{n},\ n \geq 0\}\) satisfies the discrete renewal equation

$${v}_{0} = 1,\ \ {v}_{n} =\sum\limits_{k=1}^{n}{q}_{ k}{v}_{n-k},\ n \geq 1,$$

then

$$\mathop{\lim }\limits_{n \rightarrow \infty }{v}_{n} = \frac{1} {\mu },$$

where \(\mu =\sum\limits_{k=1}^{\infty }k{q}_{k}\) and \(\frac{1} {\mu } = 0\) if \(\mu = \infty \) .

The proof of Theorem 3.30 uses the following result from analysis.

Lemma 3.32.

Assume that the sequence \(({q}_{0},{q}_{1},\ldots )\) of nonnegative real numbers satisfies the condition \(\sum\limits_{i=0}^{\infty }{q}_{i} = 1\) . If the sequence of real numbers \(({w}_{n},\ n \geq 0)\) is convergent, \(\mathop{\lim }\limits_{n \rightarrow \infty }{w}_{n} = w\) , then

$$\mathop{\lim }\limits_{n \rightarrow \infty }\sum\limits_{k=0}^{n}{q}_{k}{w}_{n-k} = w.$$

Proof.

Since the sequence \(\{{w}_{n}\}\) is convergent, it is bounded; thus there exists a number W such that \(\vert {w}_{n}\vert \leq W,\ n \geq 0\). From the conditions \(\mathop{\lim }\limits_{n \rightarrow \infty }{w}_{n} = w\) and \(\sum\limits_{i=0}^{\infty }{q}_{i} = 1\) it follows that for any \(\epsilon > 0\) there exist integers \(N(\epsilon )\) and \(K(\epsilon )\) such that

$$\vert {w}_{n} - w\vert < \epsilon \text{ for }n \geq N(\epsilon )\text{ and }\sum\limits_{k=K(\epsilon )}^{\infty }{q}_{k} < \epsilon.$$

It is easy to check that, with \(n(\epsilon ) =\max (N(\epsilon ),K(\epsilon ))\), for every \(n \geq N(\epsilon ) + n(\epsilon )\),

$$\begin{array}{rcl} \left \vert \sum\limits_{k=0}^{n}{q}_{k}{w}_{n-k} - w\right \vert & \leq & \left \vert \sum\limits_{k=0}^{n}{q}_{k}{w}_{n-k} -\sum\limits_{k=0}^{n}{q}_{k}w\right \vert + \sum\limits_{k=n+1}^{\infty }{q}_{k}\vert w\vert \\ & \leq & \sum\limits_{k=0}^{n(\epsilon )}{q}_{k}\vert {w}_{n-k} - w\vert +\sum\limits_{k=n(\epsilon )+1}^{n}{q}_{k}\vert {w}_{n-k} - w\vert +\sum\limits_{k=n+1}^{\infty }{q}_{k}\vert w\vert \\ &\leq & \sum\limits_{k=0}^{n(\epsilon )}{q}_{k}\epsilon +\sum\limits_{k=n(\epsilon )+1}^{n}{q}_{k}(W + \vert w\vert ) +\sum\limits_{k=n+1}^{\infty }{q}_{k}\vert w\vert \\ &\leq & \epsilon + \epsilon (W + \vert w\vert ) + \epsilon \vert w\vert = \epsilon (1 + W + 2\vert w\vert ).\end{array}$$

Since \(\epsilon > 0\) can be chosen arbitrarily small, we get the convergence \(\sum\limits_{k=0}^{n}{q}_{k}{w}_{n-k} \rightarrow w\) as \(n \rightarrow \infty \). □ 

Proof (Theorem 3.30). 

  1. (a)

    We first prove the assertion for the case i = j. By the discrete renewal equation

    $${p}_{ii}(0) = 1,\ \ \ {p}_{ii}(n) =\sum\limits_{k=1}^{n}{f}_{ ii}(k){p}_{ii}(n - k),\ \ n = 1,2,\ldots ,$$

    where, since the state i is recurrent, \({f}_{ii} =\sum\limits_{k=1}^{\infty }{f}_{ii}(k) = 1\) (\({f}_{ii}(k) \geq 0\)). Using the assertion of Lemma 3.31 we have

    $$\mathop{\lim }\limits_{n \rightarrow \infty }{p}_{ii}(n) = \frac{1} {{\mu }_{i}}.$$
  2. (b)

    Now let \(i\neq j\), and apply Lemma 3.32. Since the Markov chain is irreducible and recurrent, \({f}_{ij} =\sum\limits_{k=1}^{\infty }{f}_{ij}(k) = 1\) (\({f}_{ij}(k) \geq 0\)). Then, as \(n \rightarrow \infty \),

    $$\mathop{\lim }\limits_{n \rightarrow \infty }{p}_{ij}(n) =\mathop{\lim }\limits_{ n \rightarrow \infty }\sum\limits_{k=1}^{n}{f}_{ ij}(k){p}_{jj}(n - k) =\sum\limits_{k=1}^{\infty }{f}_{ ij}(k) \frac{1} {{\mu }_{j}} = \frac{1} {{\mu }_{j}}.$$

     □ 

Similar results can easily be proven in the periodic case. Let X be a homogeneous, irreducible, and recurrent Markov chain with period d > 1. Then the state space \(\mathcal{X}\) can be decomposed into disjoint subsets \({\mathcal{X}}_{0},\ldots ,{\mathcal{X}}_{d-1}\) (see Theorem 3.21) such that the Markov chain allows only for cyclic transitions between the sets \({\mathcal{X}}_{i}\): \({\mathcal{X}}_{0} \rightarrow {\mathcal{X}}_{1} \rightarrow \ldots \rightarrow {\mathcal{X}}_{d-1} \rightarrow {\mathcal{X}}_{0}\). Let \(0 \leq k,\ m \leq d - 1\) be arbitrarily fixed integers; then, starting from a state \(i \in {\mathcal{X}}_{k}\), the process arrives at a state of \({\mathcal{X}}_{m}\) in exactly

$$\mathcal{l} = \left \{\begin{array}{c} \ \ m - k,\text{ if }k < m,\\ m - k + d, \text{ if } m \leq k, \end{array} \right.$$

steps. From this it follows that \({p}_{ij}(s) = 0\) if \(s -\mathcal{l}\) is not divisible by d.

Theorem 3.33.

Let X be a homogeneous, irreducible, and recurrent Markov chain with period d > 1 and \(i \in {\mathcal{X}}_{k}\) , \(j \in {\mathcal{X}}_{m}\) arbitrarily fixed states. Then

$$\mathop{\lim }\limits_{n \rightarrow \infty }{p}_{ij}(\mathcal{l} + nd) = \frac{d} {{\mu }_{j}},$$

where \({\mu }_{j} =\sum\limits_{k=1}^{\infty }k\ {f}_{jj}(k) =\sum\limits_{r=1}^{\infty }rd\ {f}_{jj}(rd).\)

Proof.

First assume k = m, and consider the transition probabilities \({p}_{ij}(nd)\) for \(i,j \in {\mathcal{X}}_{k}\). This is equivalent (see Conclusion 3.22 on the cyclic transitions of a Markov chain) to investigating the Markov chain \(\overline{X}\) with state space \({\mathcal{X}}_{k}\) and (one-step) transition probability matrix \(\overline{\Pi } = {[{\overline{p}}_{ij}]}_{i,j\in {\mathcal{X}}_{k}}\), \({\overline{p}}_{ij} = {p}_{ij}(d),\) \(i,j \in {\mathcal{X}}_{k}\). Obviously, the Markov chain \(\overline{X}\) obtained from X in this way is homogeneous, irreducible, recurrent, and aperiodic. Using Lemma 3.31 we obtain

$$\mathop{\lim }\limits_{n \rightarrow \infty }{\overline{p}}_{ii}(n) =\mathop{\lim }\limits_{ n \rightarrow \infty }{p}_{ii}(nd) = \frac{1} {\sum\limits_{k=1}^{\infty }k{f}_{ii}(kd)} = \frac{d} {\sum\limits_{k=1}^{\infty }kd{f}_{ii}(kd)} = \frac{d} {\sum\limits_{k=1}^{\infty }k{f}_{ii}(k)} = \frac{d} {{\mu }_{i}},$$

where \({f}_{ii}(r) = 0\) if \(r\neq d,2d,\ldots \).

Assume now that \(k\neq m\). Then \({f}_{ij}(s) = 0\) and \({p}_{ij}(s) = 0\) if \(s\neq \mathcal{l} + nd,\ n \geq 0\); moreover, since the chain is irreducible and recurrent,

$${f}_{ij} =\sum\limits_{s=1}^{\infty }{f}_{ ij}(s) =\sum\limits_{r=0}^{\infty }{f}_{ ij}(\mathcal{l} + rd) = 1;$$

then

$${ p}_{ij}(\mathcal{l}+nd) =\sum\limits_{k=1}^{\mathcal{l}+nd}{f}_{ ij}(k){p}_{jj}(\mathcal{l}+nd-k) =\sum\limits_{r=0}^{n}{f}_{ ij}(\mathcal{l}+rd){p}_{jj}((n-r)d) \rightarrow \frac{d} {{\mu }_{j}},\ \ n \rightarrow \infty.$$

 □ 

Theorem 3.34.

If the homogeneous Markov chain X is irreducible and has a positive recurrent state \(i \in \mathcal{X}\) , then all its states are positive recurrent.

Proof.

Let \(j \in \mathcal{X}\) be arbitrary. Since the Markov chain is irreducible, there exist integers \(s,t > 0\) such that \({p}_{ij}(s) > 0,\ {p}_{ji}(t) > 0\). Denote by d the period of the Markov chain; it is well defined because \({p}_{ii}(s + t) \geq {p}_{ij}(s){p}_{ji}(t) > 0\). Moreover,

$$\begin{array}{rcl} {p}_{ii}(s + nd + t)& \geq & {p}_{ij}(s){p}_{jj}(nd){p}_{ji}(t), \\ {p}_{jj}(s + nd + t)& \geq & {p}_{ji}(t){p}_{ii}(nd){p}_{ij}(s).\end{array}$$

Applying Theorem 3.33 and taking the limit as \(n \rightarrow \infty \) we have

$$\frac{1} {{\mu }_{i}} \geq {p}_{ij}(s) \frac{1} {{\mu }_{j}}{p}_{ji}(t),\ \ \frac{1} {{\mu }_{j}} \geq {p}_{ij}(s) \frac{1} {{\mu }_{i}}{p}_{ji}(t);$$

thus

$$\frac{1} {{\mu }_{i}} \geq {p}_{ij}(s){p}_{ji}(t) \frac{1} {{\mu }_{j}} \geq {[{p}_{ij}(s){p}_{ji}(t)]}^{2} \frac{1} {{\mu }_{i}}.$$

From these inequalities it immediately follows that if the state i is positive recurrent, then j is also positive recurrent. □ 

Summing up the results derived previously, we can state the following theorem.

Theorem 3.35.

Let X be a homogeneous irreducible Markov chain; then

  1. 1.

    All states are aperiodic or all states are periodic with the same period,

  2. 2.

    All states are transient or all states are recurrent, and in the latter case

    • All are positive recurrent or all are null recurrent.

2.2 Stationary Distribution of Markov Chains

Retaining the notations introduced previously, \(P(n) = ({P}_{i}(n) = \mathbf{P}\left ({X}_{n} = i\right ),\ i \in \mathcal{X})\) denotes the distribution of a Markov chain depending on the time \(n \geq 0\). Then \(P(0) = ({P}_{i}(0) = {p}_{i},\ i \in \mathcal{X})\) is the initial distribution.

Definition 3.36.

Let \(\pi = ({\pi }_{i},\ i \in \mathcal{X})\) be a distribution, i.e., \({\pi }_{i} \geq 0\) and \(\sum\limits_{i\in \mathcal{X}}{\pi }_{i} = 1\). π is called a stationary distribution of the Markov chain X if by choosing \(P(0) = \pi \) as the initial distribution, the distribution of the process does not depend on time, that is,

$$P(n) = \pi ,\ n \geq 0.$$

A stationary distribution is also called an equilibrium distribution of a chain.

With Markov chains, the main problem is the existence and determination of stationary distributions. Theorem 3.30 dealt with the convergence of the n-step transition probabilities as \(n \rightarrow \infty \); when these limits exist and are positive, they give the stationary distribution of the chain. The proofs of the following results are not too difficult but consist of many technical steps [35, 36], and so we omit them here.

Theorem 3.37.

Let X be a homogeneous, irreducible, recurrent, and aperiodic Markov chain. Then the following assertions hold:

  1. (A)

    The limit

    $${\pi }_{i} =\mathop{\lim }\limits_{ n \rightarrow \infty }{P}_{i}(n) = \frac{1} {{\mu }_{i}},\ i \in \mathcal{X},$$

    exists and does not depend on the initial distribution.

  2. (B)

    If all states are recurrent null states, then the stationary distribution does not exist and \({\pi }_{i} = 0\) for all \(i \in \mathcal{X}\).

  3. (C)

    If all states are positive recurrent, then the stationary distribution \(\pi = ({\pi }_{i},\ i \in \mathcal{X})\) does exist and \({\pi }_{i} = 1/{\mu }_{i} > 0\) for all \(i \in \mathcal{X}\) and \(P(n) \rightarrow \pi ,\) as \(n \rightarrow \infty \) . The stationary distribution is unique and satisfies the system of linear equations

    $$\sum\limits_{i\in \mathcal{X}}{\pi }_{i} = 1,$$
    (3.7)
    $${\pi }_{i} =\sum\limits_{j\in \mathcal{X}}{\pi }_{j}{p}_{ji},\ i \in \mathcal{X}.$$
    (3.8)

Comment 3.38.

Since the Markov chain is irreducible, it is enough to require in part (C) the existence of a single positive recurrent state, because from the existence of such a state and the irreducibility of the Markov chain it follows that all states are positive recurrent.

Equation  (3.8) of Theorem  3.37 can be rewritten in the more concise form \(\pi = \pi \mathbf{\Pi }\) , where \(\Pi \) is the one-step transition probability matrix of the chain.

The initial distribution does not play a role in Eqs.  (3.7) and (3.8) ; therefore, when the stationary distribution π exists, it does not depend on the initial distribution, only on the transition probability matrix \(\Pi \) .

Given that the stationary distribution π exists, it can easily be proven that \(\pi \) satisfies the system of linear equations (3.8), and at the same time this leads to an iterative method of solution [see Eq. (3.9) below]. This iterative procedure for determining the stationary distribution can be applied to chains with finite state spaces.

The time-dependent distribution \(P(n) = ({P}_{0}(n),{P}_{1}(n),\ldots )\) satisfies, for all \(n = 1,2,\ldots \), the equation

$$P(n) = P(n - 1)\mathbf{\Pi }.$$
(3.9)

Repeating this equation n times, we have

$$P(n) = P(0){\mathbf{\Pi }}^{n},\ n = 0,1,\ldots .$$

Since it is assumed that the stationary distribution π exists, we can write

$$\pi =\mathop{\lim }\limits_{ n \rightarrow \infty }P(n);$$

thus from the equation

$$\mathop{\lim }\limits_{n \rightarrow \infty }P(n) =\mathop{\lim }\limits_{ n \rightarrow \infty }P(n - 1)\mathbf{\Pi }$$

it follows that

$$\pi = \pi \mathbf{\Pi }.$$
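
As a numerical illustration of Eqs. (3.7)–(3.9), the following Python sketch computes the stationary distribution of a finite-state chain in two ways: by solving the linear system \(\pi = \pi \Pi \), \(\sum_{i}{\pi }_{i} = 1\), and by iterating Eq. (3.9). The three-state transition matrix and the use of NumPy are assumptions made only for this illustration.

import numpy as np

# Hypothetical 3-state transition probability matrix (rows sum to 1)
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
n = P.shape[0]

# Direct solution: (P^T - I) pi = 0 together with the normalization sum(pi) = 1
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.zeros(n + 1)
b[-1] = 1.0
pi_direct = np.linalg.lstsq(A, b, rcond=None)[0]

# Iterative solution P(n) = P(n-1) Pi, cf. Eq. (3.9), from an arbitrary P(0)
pi_iter = np.array([1.0, 0.0, 0.0])
for _ in range(1000):
    pi_iter = pi_iter @ P

print(pi_direct)
print(pi_iter)   # both vectors approximate the stationary distribution

For an irreducible, aperiodic, finite-state chain both computations give the same vector, in accordance with Theorem 3.37.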

Definition 3.39.

A state i of an irreducible homogeneous Markov chain X is called ergodic if the state i is aperiodic and positive recurrent, i.e., \(d(i) = 1,\ {\mu }_{i} < \infty \). If all states of the chain are ergodic, then the Markov chain is called ergodic.

Here we define the ergodic property only for Markov chains. This property can be defined for much more general stochastic processes as well.

By Theorem 3.37, a homogeneous, irreducible, aperiodic, positive recurrent Markov chain is always ergodic. Since an irreducible Markov chain with finite state space is positive recurrent, the following statement is also true.

Theorem 3.40.

A homogeneous, irreducible, aperiodic Markov chain with finite state space is ergodic.

In practical applications, the equilibrium distributions of Markov chains play an essential role. In what follows, we give two theorems without proofs whose conditions ensure the existence of the stationary distribution of a homogeneous, irreducible, aperiodic Markov chain X with state space \(\mathcal{X} =\{ 0,1,\ldots \}\). The third theorem gives an upper bound for the convergence rate to the stationary distribution of the iterative procedure (3.9).

Theorem 3.41 (Klimov [56]). 

If there exists a function \(g(i),\ i \in \mathcal{X}\) , a state \({i}_{0} \in \mathcal{X}\) , and a positive constant \(\epsilon \) such that the relations

$$\begin{array}{rcl} \mathbf{E}\left (g({X}_{n+1})\ \vert \ {X}_{n} = i\right )& \leq & g(i) - \epsilon ,\ \ i \geq {i}_{0},\ n \geq 0, \\ \mathbf{E}\left (g({X}_{n+1})\ \vert \ {X}_{n} = i\right )& <& \infty ,\ \ i \geq 0,\ n \geq 0, \\ \end{array}$$

hold, then the chain X is ergodic.

Theorem 3.42 (Foster [33]). 

Assume that there exist constants a,b > 0 and \(\mathcal{l} \geq 0\) such that the inequalities

$$\begin{array}{rcl} \mathbf{E}\left ({X}_{n+1}\ \vert \ {X}_{n} = i\right )& \leq & a,\ \ i \leq \mathcal{l}, \\ \mathbf{E}\left ({X}_{n+1}\ \vert \ {X}_{n} = i\right )& \leq & i - b,\ \ i > \mathcal{l}, \\ \end{array}$$

are valid. Then the Markov chain X is ergodic.

Theorem 3.43 (Bernstein [10]). 

Assume that there exist a state \({i}_{0} \in \mathcal{X}\) and a constant \(\lambda > 0\) such that for all \(i \in \mathcal{X}\) the inequality \({p}_{i{i}_{0}} \geq \lambda \) holds. Then

$$\mathop{\lim }\limits_{n \rightarrow \infty }{p}_{ij}(n) = {\pi }_{j},\ i,j \in \mathcal{X},$$

where \(\pi = ({\pi }_{i},i \in \mathcal{X})\) denotes the stationary distribution of the Markov chain; moreover,

$$\sum\limits_{j\in \mathcal{X}}\left \vert {p}_{ij}(n) - {\pi }_{j}\right \vert \leq 2{(1 - \lambda )}^{n},\ n \geq 1.$$
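
The geometric bound of Theorem 3.43 can be checked numerically on a small example. The sketch below uses a hypothetical three-state transition matrix whose column \({i}_{0}\) is bounded below by \(\lambda \) (NumPy assumed) and compares \(\max_{i}\sum_{j}\vert {p}_{ij}(n) - {\pi }_{j}\vert \) with the bound \(2{(1 - \lambda )}^{n}\).

import numpy as np

# Hypothetical transition matrix; every entry of column i0 is at least lam
P = np.array([[0.4, 0.4, 0.2],
              [0.3, 0.5, 0.2],
              [0.2, 0.5, 0.3]])
i0 = 1
lam = P[:, i0].min()                 # here lam = 0.4

# Stationary distribution: left eigenvector of P for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

Pn = np.eye(3)
for n in range(1, 11):
    Pn = Pn @ P
    dist = np.abs(Pn - pi).sum(axis=1).max()   # max_i sum_j |p_ij(n) - pi_j|
    print(n, dist, 2 * (1 - lam) ** n)         # dist stays below the bound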

2.3 Ergodic Theorems for Markov Chains

Let X be a homogeneous, irreducible, and positive recurrent Markov chain with state space \(\mathcal{X} =\{ 0,1,\ldots \}\) and i a fixed state. Define the time spent in the state i and the corresponding relative frequency on the time interval [0, T] as follows:

$$\begin{array}{rcl}{ S}_{i}(T)& =& \sum\limits_{n=0}^{T}{\mathcal{I}}_{\left \{{ X}_{n}=i\right \}}, \\ {\overline{S}}_{i}(T)& =& \frac{1} {T}\sum\limits_{n=0}^{T}{\mathcal{I}}_{\left \{{ X}_{n}=i\right \}} = \frac{1} {T}{S}_{i}(T).\end{array}$$

Let us consider when and in what sense there exists a limit of the relative frequencies \({\overline{S}}_{i}(T)\) as \(T \rightarrow \infty \) and, if it exists, how it can be determined. This problem has, in particular, practical importance when applying simulation methods. To clarify the stochastic background of the problem, we introduce the following notations.

Assume that a process starts at time 0 from the state i. Let \(0 = {T}_{0}^{(i)} < {T}_{1}^{(i)} < {T}_{2}^{(i)} < \ldots \) be the sequence of the consecutive random time points when a Markov chain arrives at the state i, that is, \({T}_{k}^{(i)},\ k = 1,2,\ldots \), are the return time points to the state i of the chain. This means that

$$X({T}_{n}^{(i)}) = i,\ n = 0,1,\ldots \text{ and }X(k)\neq i,\text{ if }k\neq {T}_{ 0}^{(i)},{T}_{ 1}^{(i)},\ldots .$$

Denote by

$${\tau }_{k}^{(i)} = {T}_{ k}^{(i)} - {T}_{ k-1}^{(i)},\ k = 1,2,\ldots ,$$

the time length between the return time points. Since the Markov chain has the memoryless property, these random variables are independent; moreover, from the homogeneity of the Markov chain it follows that \({\tau }_{n}^{(i)},n \geq 1\), are also identically distributed. The common distribution of these random variables \({\tau }_{n}^{(i)}\) is the distribution of the return times from the state i to i, namely, \(({f}_{ii}(n),n \geq 1)\).

Heuristically, it is clear that when the return time has a finite expected value \({\mu }_{i}\), then during the time T the process returns to the state i on average \(T/{\mu }_{i}\) times. This means that the quantity \({\overline{S}}_{i}(T)\) fluctuates around the value \(1/{\mu }_{i}\) and has the same limit as \(T \rightarrow \infty \). This result can be given in exact mathematical form on the basis of the law of large numbers as follows.

Theorem 3.44.

If X is an ergodic Markov chain, then, with probability 1,

$$\mathop{\lim }\limits_{T \rightarrow \infty }{\overline{S}}_{i}(T) = \frac{1} {{\mu }_{i}},\quad i \in \mathcal{X}.$$
(3.10)

If the Markov property is satisfied, then not only are the return times independent and identically distributed, but the stochastic behaviors of the process on the return periods are identical as well. This fact allows us to prove results for an ergodic Markov chain that are more general than Eq. (3.10).

Theorem 3.45.

Let X be an ergodic Markov chain and \(g(i),\ i \in \mathcal{X}\) , be a real-valued function such that \(\sum\limits_{i\in \mathcal{X}}{\pi }_{i}\vert g(i)\ \vert \ < \infty \) . Then the convergence

$$\mathop{\lim }\limits_{T \rightarrow \infty }\frac{1} {T}\sum\limits_{n=1}^{T}g({X}_{ n}) =\sum\limits_{i\in \mathcal{X}}{\pi }_{i}g(i)$$

is true with probability 1, where \({\pi }_{i},i \in \mathcal{X}\) , denotes the stationary distribution of the Markov chain, which exists under the given condition.
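
The following Python sketch illustrates Theorem 3.45 by simulation; the three-state chain and the function g are hypothetical, chosen only for the example. The time average of \(g({X}_{n})\) along one long trajectory is compared with \(\sum_{i}{\pi }_{i}g(i)\).

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ergodic chain and test function g
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
g = np.array([1.0, -2.0, 5.0])

# Stationary distribution: left eigenvector of P for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

# One long trajectory and its time average of g
T = 200_000
x, total = 0, 0.0
for _ in range(T):
    x = rng.choice(3, p=P[x])
    total += g[x]

print(total / T)   # time average (1/T) sum g(X_n)
print(pi @ g)      # stationary average sum pi_i g(i); the two should be close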

2.4 Estimation of Transition Probabilities

In modeling ergodic Markov chains an important question is to estimate the transition probabilities by the observation of the chain. The relative frequencies give corresponding estimates of the probabilities because by Theorem 3.44 they tend to them with probability 1 under the given conditions. Note that from the heuristic approach discussed previously it follows under quite general conditions that not only can the law of large numbers be derived for the relative frequencies, but the central limit theorems can as well.

Consider now the estimate of transition probabilities with the maximum likelihood method. Let X be an ergodic Markov chain with finite state space \(\mathcal{X} =\{ 0,1,\ldots ,N\}\) and with the (one-step) transition probability matrix \(\mathbf{\Pi } = {({p}_{ij})}_{i,j\in \mathcal{X}}\). Assume that we have an observation of n elements \({X}_{1} = {i}_{1},\ldots ,{X}_{n} = {i}_{n}\) starting from the initial state \({X}_{0} = {i}_{0}\), and we will estimate the entries of the matrix Π. By the Markov property, the conditional likelihood function can be given in the form

$$\mathbf{P}\left ({X}_{1} = {i}_{1},\ldots ,{X}_{n} = {i}_{n}\ \vert \ {X}_{0} = {i}_{0}\right ) = {p}_{{i}_{0}{i}_{1}}\ldots {p}_{{i}_{n-1}{i}_{n}}.$$

Denote by \({n}_{ij},\ i,j \in \mathcal{X}\), the number of one-step transitions from the state i to j in the sample path \({i}_{0},{i}_{1},\ldots ,{i}_{n}\), and let \({0}^{0} = 1,\ 0/0 = 0\). Then the conditional likelihood function given the initial state \({X}_{0} = {i}_{0}\) is

$$L({i}_{1},\ldots ,{i}_{n};\mathbf{\Pi }\ \vert \ {i}_{0}) =\prod\limits_{i=0}^{N}\left (\prod\limits_{j=0}^{N}{p}_{ ij}^{{n}_{ij} }\right ).$$
(3.11)

Applying the maximum likelihood method, maximize the expression in \({p}_{ij}\) under the conditions

$${p}_{ij} \geq 0,\ i,j \in \mathcal{X},\ \sum\limits_{j\in \mathcal{X}}{p}_{ij} = 1,\ i \in \mathcal{X}.$$

It is clear that the products in parentheses in Eq. (3.11) for different i have no variables in common; therefore, the maximization problem can be solved by means of N + 1 separate, but similar, optimization problems:

$$\max \left \{\prod\limits_{j=0}^{N}{p}_{ ij}^{{n}_{ij} } :\ {p}_{ij} \geq 0,\ \sum\limits_{j\in \mathcal{X}}{p}_{ij} = 1\right \},\ \ i = 0,1,\ldots ,N.$$

Obviously it is enough to solve it only for one state i since the others can be derived analogously to that one.

Let \(i \in \mathcal{X}\) be a fixed state, and denote \({n}_{i} =\sum\limits_{j\in \mathcal{X}}{n}_{ij}\). Apply the Lagrange multiplier method; then for every \(m = 0,\ldots ,N\),

$$\frac{\partial } {\partial {p}_{im}}\left (\prod\limits_{j=0}^{N}{p}_{ ij}^{{n}_{ij} } + \lambda ({p}_{i0} + {p}_{i1} + \ldots + {p}_{iN} - 1)\right ) = \frac{{n}_{im}} {{p}_{im}} \prod\limits_{j=0}^{N}{p}_{ ij}^{{n}_{ij} }+\lambda = 0;$$

consequently, for a constant \({\lambda }_{i}\) we have

$$\frac{{n}_{im}} {{p}_{im}} = -\lambda \prod\limits_{j=0}^{N}{p}_{ ij}^{-{n}_{ij} } = {\lambda }_{i},\ m = 0,\ldots ,N.$$

From this it follows that the equations

$${n}_{im} = {\lambda }_{i}{p}_{im},\ \ m = 0,\ldots ,N,$$

hold; then

$$\sum\limits_{m=0}^{N}{n}_{ im} = {n}_{i} = {\lambda }_{i}\sum\limits_{m=0}^{N}{p}_{ im} = {\lambda }_{i}.$$

These relations lead to the conditional maximum likelihood estimates for the transition probabilities \({p}_{im}\) as follows:

$$\widehat{{p}}_{im} = \frac{{n}_{im}} {{\lambda }_{i}} = \frac{{n}_{im}} {{n}_{i}} ,\ 0 \leq i,m \leq N.$$

It can be verified that these estimates \(\widehat{{p}}_{im}\) converge to p im with probability 1 as \(n \rightarrow \infty \).
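
A minimal simulation sketch of the estimation procedure above is given below; the "true" transition matrix is hypothetical and is used only to generate the observed path (NumPy assumed). The relative frequencies \(\widehat{{p}}_{ij} = {n}_{ij}/{n}_{i}\) computed from a long sample path should be close to the true transition probabilities.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "true" chain that generates the observation X_0, ..., X_n
P_true = np.array([[0.7, 0.2, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.1, 0.8]])

n = 100_000
path = np.empty(n + 1, dtype=int)
path[0] = 0
for t in range(n):
    path[t + 1] = rng.choice(3, p=P_true[path[t]])

# Count one-step transitions n_ij and normalize each row: p_hat_ij = n_ij / n_i
counts = np.zeros((3, 3))
for a, b in zip(path[:-1], path[1:]):
    counts[a, b] += 1
P_hat = counts / counts.sum(axis=1, keepdims=True)

print(P_hat)   # close to P_true for a long observation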

3 Continuous-Time Markov Chains

As in the case of DTMCs, we assume that the state space \(\mathcal{X}\) is a finite set \(\{0,1,\ldots ,N\}\) or a countably infinite set \(\{0,1,\ldots \}\), and that the time parameter varies in \(\mathcal{T} = [0,\infty )\). According to the general definition (3.1), a process \(X = ({X}_{t},\ t \geq 0)\) is said to be a CTMC with state space \(\mathcal{X}\) if for every positive integer n and \(0 \leq {t}_{0} < {t}_{1} < \ldots < {t}_{n}\), \({i}_{0},\ldots ,{i}_{n} \in \mathcal{X}\), the equation

$$\begin{array}{rcl} & \mathbf{P}\left ({X}_{{t}_{n}} = {i}_{n}\ \vert \ {X}_{{t}_{n-1}} = {i}_{n-1},\ldots ,{X}_{{t}_{0}} = {i}_{0}\right ) & \\ & \quad = \mathbf{P}\left ({X}_{{t}_{n}} = {i}_{n}\ \vert \ {X}_{{t}_{n-1}} = {i}_{n-1}\right ) = {p}_{{i}_{n-1},{i}_{n}}({t}_{n-1},{t}_{n})& \\ \end{array}$$

holds, provided that the conditional probabilities exist. The Markov chain X is (time) homogeneous if the transition probability function \({p}_{ij}(s,t)\) satisfies the condition \({p}_{ij}(s,t) = {p}_{ij}(t - s)\) for all \(i,j \in \mathcal{X}\), \(0 \leq s \leq t\). Denote by \(\Pi (s,t) = \left [{p}_{ij}(s,t),i,j \in \mathcal{X}\right ]\) the transition probability matrix.

In the case of a CTMC the time index \(t \in [a,b]\) can take uncountably many values for arbitrary \(0 \leq a < b < \infty \); therefore, the collection of random variables \({X}_{t},\ t \in (a,b]\), is also uncountable. When we consider questions concerning the sample paths of the chain, these circumstances can lead to measurability problems (discussed later). However, the Markov processes that will be investigated later are the so-called stepwise processes, and they ensure the necessary measurability property.

We will deal mainly with the part of the theory that is relevant to queueing theory, and we touch upon only a few questions in the general case, indicating the source of the measurability problems. A discussion of jump processes, which are more general than stepwise Markov chains, can be found in [36, Chap. III].

If the Markov chain \(\{{X}_{t},\ t \geq 0\}\) is homogeneous, then the transition probability functions \({p}_{ij}(s,t)\) can be given in a simpler form:

$${p}_{ij}(s,s + t) = {p}_{ij}(t),\ i,j \in \mathcal{X},\ s,t \geq 0,$$

and thus the matrix form of transition probabilities is

$$\Pi (s,s + t) = \Pi (t),\ s,t \geq 0.$$

As was done previously, denote by

$$P(t) = ({P}_{0}(t),{P}_{1}(t),\ldots ),\ t \geq 0,$$

the time-dependent distribution of the chain, where \({P}_{i}(t) = \mathbf{P}\left ({X}_{t} = i\right ),\ i \in \mathcal{X}\); then P(0) means the initial distribution, while if there exists a state \(k \in \mathcal{X}\) such that \(\mathbf{P}\left ({X}_{0} = k\right ) = 1\), then k is the initial state.

3.1 Characterization of Homogeneous Continuous-Time Markov Chains

We now deal with the main properties of homogeneous CTMCs. Similarly to the discrete-time case, the transition probabilities satisfy the following conditions.

  1. (A)

    \({p}_{ij}(s) \geq 0,\ s \geq 0,\ \ {p}_{ij}(0) = {\delta }_{ij},\ i,j \in \mathcal{X}\), where \({\delta }_{ij}\) is the Kronecker \(\delta \)-function (which equals 1 if i = j and 0 if \(i\neq j\)).

  2. (B)

    \(\sum\limits_{j\in \mathcal{X}}{p}_{ij}(s) = 1,\ s \geq 0,\ i \in \mathcal{X}\).

  3. (C)

    \({p}_{ij}(s + t) =\sum\limits_{k\in \mathcal{X}}{p}_{ik}(s){p}_{kj}(t),\ s,t \geq 0,\ i,j \in \mathcal{X}\).

    An additional condition is needed for our considerations.

  4. (D)

    The transition probabilities of the Markov chain X satisfy the conditions

    $$\mathop{\lim }\limits_{h \rightarrow 0 +} {p}_{ij}(h) = {p}_{ij}(0) = {\delta }_{ij},\ i,j \in \mathcal{X}.$$
    (3.12)

Comment 3.46.

Condition (B) expresses that \(\Pi (s),\ s \geq 0\) , is a stochastic matrix. We will not consider the so-called killed Markov chains, which are defined only on a random lifetime [0,τ] that is finite with probability 1, i.e., \(\mathbf{P}\{\tau < \infty \} = 1\). It should be noted that condition (B) ensures that the chain is defined on the whole interval \([0,\infty )\) because the process will certainly be in some state \(i \in \mathcal{X}\) at any time \(s \geq 0\).

Condition (C) is the Chapman–Kolmogorov equation related to the continuous-time case. It can be given in matrix form as follows:

$$\text{ }\Pi (s + t) = \Pi (s)\Pi (t)\text{ , }s,t \geq 0.$$

Similarly to the discrete-time case, the time-dependent distribution of the chain satisfies the equation

$$P(s + t) = P(s)\Pi (t),\text{ }s,t \geq 0,$$

and thus for all t > 0

$$P(t) = P(0)\Pi (t).$$

The last relation means that the initial distribution and the transition probabilities uniquely determine the distribution of the chain at all time points \(t \geq 0\) .

Instead of (D) it is enough to assume that the condition

$$\mathop{\lim }\limits_{h \rightarrow 0 +} {p}_{ii}(h) = 1,\ i \in \mathcal{X},$$

holds, because for every \(i,j \in \mathcal{X}\) , \(i\neq j\) , the relation

$$0 \leq {p}_{ij}(h) \leq \sum\limits_{j\neq i}{p}_{ij}(h) = 1 - {p}_{ii}(h) \rightarrow 0,\ h \rightarrow 0+,$$

is true.

Under conditions (A)–(D), the following relations are valid.

Theorem 3.47.

The transition probabilities \({p}_{ij}(t),\ 0 \leq t < \infty ,\ i\neq j\) , are uniformly continuous.

Proof.

Using conditions (A)–(D) we obtain

$$\begin{array}{rcl} \left \vert {p}_{ij}(t + h) - {p}_{ij}(t)\right \vert & =& \left \vert \sum\limits_{k\in \mathcal{X}}{p}_{ik}(h){p}_{kj}(t) -\sum\limits_{k\in \mathcal{X}}{\delta }_{ik}{p}_{kj}(t)\right \vert \\ & \leq & \sum\limits_{k\in \mathcal{X}}\left \vert {p}_{ik}(h) - {\delta }_{ik}\right \vert {p}_{kj}(t) \\ & \leq & 1-{p}_{ii}(h)+\sum\limits_{k\neq i}{p}_{ik}(h) = 2(1-{p}_{ii}(h)) \rightarrow 0,\ \ h \rightarrow 0 +. \end{array}$$

 □ 

Theorem 3.48 ( [36, p. 200]). 

For all \(i,j \in \mathcal{X}\) , \(i\neq j\) , the finite limit

$${q}_{ij} =\mathop{\lim }\limits_{ h \rightarrow 0 +} \frac{{p}_{ij}(h)} {h}$$

exists.

For every \(i \in \mathcal{X}\) there exists a finite or infinite limit

$${q}_{i} =\mathop{\lim }\limits_{ h \rightarrow 0 +} \frac{1 - {p}_{ii}(h)} {h} = -{p}_{ii}^{{\prime}}(0).$$

The quantities q ij and q i are the most important characteristics of a homogeneous continuous-time Markov chain. Subsequently we will also use the notation \({q}_{ii} = -{q}_{i}\), \(i \in \mathcal{X}\), and interpret the meaning of these quantities.

Definition 3.49.

The quantity q ij is called the transition rate or intensity from the state i to the state j, while \({q}_{i}\) is called the transition rate from the state i.

We classify the states according to whether or not the rate \({q}_{i}\) is finite. If \({q}_{i} < \infty \), then i is called a stable state, while if \({q}_{i} = +\infty \), then we say that i is an instantaneous state. Note that there exist Markov chains with the property \({q}_{i} = +\infty \) [36, pp. 207–210].

Definition 3.50.

A stable noninstantaneous state i is called regular if

$$\sum\limits_{i\neq j}{q}_{ij} = -{q}_{ii} = {q}_{i},$$

and a Markov chain is locally regular if all its states are regular.

Corollary 3.51.

As a consequence of Theorem  3.48 , we obtain that locally regular Markov chains satisfy the following asymptotic properties as \(h \rightarrow 0+\) :

$$\begin{array}{rcl} & & \mathbf{P}\left ({X}_{t+h}\neq i\ \vert \ {X}_{t} = i\right ) = {q}_{i}h + o\left (h\right ),\ \\ & & \mathbf{P}\left ({X}_{t+h} = i\ \vert \ {X}_{t} = i\right ) = 1 - {q}_{i}h + o\left (h\right ),\ \\ & & \mathbf{P}\left ({X}_{t+h} = j\ \vert \ {X}_{t} = i\right ) = {q}_{ij}h + o\left (h\right ),\ \ j\neq i.\end{array}$$

From Theorem  3.48 it also follows that Markov chains with a finite state space are locally regular because all \({q}_{ij},\ i\neq j\) , are finite and, consequently, all \({q}_{i}\) are also finite.

The condition

$$q =\mathop{\sup }\limits_{ i \in \mathcal{X}}\ {q}_{i} < \infty $$
(3.13)

will play an important role in our subsequent investigations. We introduce the notation

$$Q ={ \left [{q}_{ij}\right ]}_{i,j\in \mathcal{X}} ={ \left [{p}_{ij}^{{\prime}}(0)\right ]}_{ i,j\in \mathcal{X}} = {\Pi }^{{\prime}}(0)$$

for locally regular Markov chains. Recall that

$$\mathop{\lim }\limits_{t \rightarrow 0 +} \Pi (t) = \Pi (0) = I,$$
(3.14)

where I is the identity matrix with suitable dimension.

Definition 3.52.

The matrix \(Q\) is called a rate or infinitesimal matrix of a continuous-time Markov chain.

The following assertions hold for all locally regular Markov chains under the initial condition (3.14) [36, pp. 204–206].

Theorem 3.53.

The transition probabilities of a locally regular Markov chain satisfy the Kolmogorov backward differential equation

$${\Pi }^{{\prime}}(t) = Q\ \Pi (t),\ t \geq 0\ \ \ \ \ \ \ \text{ (I).}$$

If condition (3.13) is fulfilled, then the Kolmogorov forward differential equation

$${\Pi }^{{\prime}}(t) = \Pi (t)\ Q,\ t \geq 0\ \ \ \ \ \ \ \text{ (II)}$$

is valid. Under condition (3.13) the differential Eqs. (I) and (II), referred to as the first and second systems of Kolmogorov equations, have unique solutions.
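
For a finite state space the solution of both systems with the initial condition (3.14) is the matrix exponential \(\Pi (t) = {\mathrm{e}}^{Qt}\). The following sketch (a hypothetical three-state generator; NumPy and SciPy assumed) computes \(\Pi (t)\) in this way and checks the backward and forward equations by a finite-difference approximation.

import numpy as np
from scipy.linalg import expm

# Hypothetical generator: off-diagonal entries are the rates q_ij, rows sum to 0
Q = np.array([[-3.0,  2.0,  1.0],
              [ 4.0, -6.0,  2.0],
              [ 0.5,  0.5, -1.0]])

t, h = 0.7, 1e-6
Pt = expm(Q * t)                      # Pi(t) = exp(Qt), with Pi(0) = I

dP = (expm(Q * (t + h)) - Pt) / h     # finite-difference approximation of Pi'(t)
print(np.max(np.abs(dP - Q @ Pt)))    # ~0: backward equation (I)
print(np.max(np.abs(dP - Pt @ Q)))    # ~0: forward equation (II)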

3.2 Stepwise Markov Chains

The results of Theorem 3.53 are related to the analytical properties of transition probabilities and do not deal with the stochastic behavior of sample paths. In this part we investigate the so-called stepwise Markov chains and their sample paths. We introduce the embedded Markov chain and consider the transition probabilities and holding times. In the remaining part of this chapter we assume that the Markov chain is locally regular and condition (3.13) holds.

Definition 3.54.

A Markov chain X is a jump process if for any \(t \geq 0\) there exists a random time \(\Delta = \Delta (t,\omega ) > 0\) such that

$${X}_{s} = {X}_{t},\ \ \text{ if}\ \ s \in [t,t + \Delta ).$$

In the definition, \(\Delta \) can be interpreted as the remaining time the process stays in the state X(t), and the definition requires that this time be positive.

Definition 3.55.

We say that a Markov chain has a jump at time \({t}_{0} > 0\) if there exists a monotonically increasing sequence \({t}_{1},{t}_{2},\ldots \) such that \({t}_{n} \rightarrow {t}_{0},\ n \rightarrow \infty \) and at the same time \({X}_{{t}_{n}}\neq {X}_{{t}_{0}},\ n = 1,2,\ldots \). A Markov chain is called a stepwise process if it is a jump process and the number of jumps is finite for all sample paths on all finite intervals [0, t].

It should be noted that a stepwise process is continuous from the right and has a limit from the left at all jumping points.

Denote by \(({\tau }_{0} =)0 < {\tau }_{1} < {\tau }_{2} < \ldots \) the sequence of consecutive jumping points; then every finite time interval contains at most finitely many jumping points. Between two jumping points the state of the process does not change, and this time is called the holding time.

Definition 3.56.

A stepwise Markov chain is called regular if the sequence of holding times \({\zeta }_{k} = {\tau }_{k+1} - {\tau }_{k},\ k = 0,1,\ldots \), satisfies the condition

$$\mathbf{P}\left (\sum\limits_{k=0}^{\infty }{\zeta }_{ k} = \infty \right ) = 1.$$

By the definition of stepwise process, we have

$${X}_{s} \equiv {X}_{{\tau }_{i}},\ s \in [{\tau }_{i},{\tau }_{i+1}),\ i = 0,1,\ldots .$$

Denote by \({Y }_{k} = {X}_{{\tau }_{k}},\ k = 0,1,\ldots \), the states of the process at the consecutive jumping points, and define for \(i\neq j\)

$${ \pi }_{ij} = \left \{\begin{array}{c} \frac{{q}_{ij}} {{q}_{i}} ,\ \ \text{ if}\ \ {q}_{i} > 0, \\ \ \ 0,\ \ \ \text{ if}\ \ {q}_{i} = 0. \end{array} \right.$$
(3.15)

In addition, let

$${\pi }_{ii} = 1 -\sum\limits_{j\neq i}{\pi }_{ij}.$$
(3.16)

By the Markov property, the process \(({Y }_{k},\ k \geq 0)\), is a discrete-time homogeneous Markov chain with the state space \(\mathcal{X} =\{ 0,1,\ldots \}\) and the transition probabilities

$$\mathbf{P}\left ({Y }_{n+1} = j\ \vert \ {Y }_{n} = i\right ) = {\pi }_{ij},\ i,j \in \mathcal{X},\ n \geq 0.$$

The process \(({Y }_{k},\ k \geq 0)\) is called an embedded Markov chain of the continuous-time stepwise Markov chain X.

Note that the condition \({q}_{i} = 0\) corresponds to the case where i is an absorbing state, and in other cases the holding times for arbitrary state i have an exponential distribution with parameter q i whose density function is \({q}_{i}\ {\mathrm{e}}^{-{q}_{i}x},\ x > 0\).

3.3 Construction of Stepwise Markov Chains

The construction presented here also provides a method for simulating stepwise Markov chains. We construct a CTMC \(\{{X}_{t},\ t \geq 0\}\) with initial distribution \(P(0) = ({P}_{0}(0),{P}_{1}(0),\ldots )\) and transition probability matrix \(\Pi (t) = \left [{p}_{ij}(t)\right ],\ t \geq 0\), satisfying condition (3.13).

Using notations (3.15) and (3.16), define the random time intervals with lengths \({S}_{0},{S}_{1},\ldots \), nonnegative integer-valued random variables \({K}_{0},{K}_{1},\ldots \), and the random jumping points \({\tau }_{m} = {S}_{0} + \ldots + {S}_{m-1},\) \(m = 1,2,\ldots \), by the following procedure.

  1. (a)

    Generate a random variable K 0 with distribution P(0) [i.e., \(\mathbf{P}\left ({K}_{0} = k\right ) = {P}_{k}(0),\ k \in \mathcal{X}\)] and a random variable S 0 distributed exponentially with parameter \({q}_{{K}_{0}}\) conditionally dependent on K 0. Define \({X}_{t} = {K}_{0}\)  if  \(0 \leq t < {S}_{0}\).

  2. (b)

    In the mth step (\(m = 1,2,\ldots \)) generate a random variable K m with distribution \({P}^{(m)} = ({\pi }_{{K}_{m-1},j},\ j \in \mathcal{X})\) and a random variable S m distributed exponentially with the parameter \({q}_{{K}_{m}}\). Define \({X}_{t} = {K}_{m}\)  if  \({\tau }_{m} \leq t < {\tau }_{m+1},\ m = 1,2,\ldots \).

Then the stochastic process \(\{{X}_{t},t \geq 0\}\) is a stepwise Markov chain with initial distribution P(0) and transition probability matrix \(\Pi (t),\ t \geq 0\).
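
A direct implementation of steps (a) and (b) yields a simple simulation procedure. The following Python sketch (with a hypothetical three-state generator and initial distribution; NumPy assumed) generates one sample path of a stepwise Markov chain up to a given time horizon, using the embedded-chain probabilities (3.15) and exponentially distributed holding times.

import numpy as np

rng = np.random.default_rng(2)

# Hypothetical generator Q and initial distribution P(0)
Q = np.array([[-3.0,  2.0,  1.0],
              [ 4.0, -6.0,  2.0],
              [ 0.5,  0.5, -1.0]])
P0 = np.array([0.5, 0.5, 0.0])

q = -np.diag(Q)                        # holding-time rates q_i (all positive here)
Pi_emb = Q.copy()
np.fill_diagonal(Pi_emb, 0.0)
Pi_emb = Pi_emb / q[:, None]           # embedded-chain probabilities pi_ij = q_ij / q_i

def simulate(t_max):
    # Returns the jump times and the states visited up to t_max
    k = rng.choice(len(P0), p=P0)              # step (a): K_0 ~ P(0)
    t, times, states = 0.0, [0.0], [k]
    while True:
        t += rng.exponential(1.0 / q[k])       # holding time ~ Exp(q_k)
        if t >= t_max:
            return times, states
        k = rng.choice(len(P0), p=Pi_emb[k])   # step (b): next state via pi_ij
        times.append(t)
        states.append(k)

print(simulate(5.0))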

3.4 Some Properties of the Sample Path of Continuous-Time Markov Chains

When considering the sample paths of CTMCs, problems arise that cannot occur in the case of discrete-time chains. For example, \({X}_{t},\ t \geq 0\), are random variables; therefore, \(\left \{{X}_{t} \leq x\right \} \in \mathcal{A}\) is an event for all t ≥ 0 and \(x \in \mathbb{R}\). But at the same time, for example, the set

$$\bigcap\limits_{a\leq t<b}\{\omega \in \Omega : {X}_{t}(\omega ) \leq x\}$$

is not necessarily an event (an element of \(\mathcal{A}\)). This question is closely connected to the separability property of the processes (see, for example, [35, Chap. III]). The essential question is whether there exists a countable, everywhere dense subset \(\mathcal{S}\subset [0,\infty )\) such that the statistical behavior of the process X can be characterized by the countable set of random variables \({X}_{t},\ t \in \mathcal{S}\). The notion of separability is given in general by the following definition.

Definition 3.57.

A process \(X = ({X}_{t},\ t \geq 0)\) is called separable if there exists an event \(N \in \mathcal{A}\)  with probability 0 and a countable subset \(\mathcal{S} =\{ {r}_{i},\ i = 1,2,\ldots \}\) of \({\mathbb{R}}_{+} = [0,\infty )\), everywhere dense in \({\mathbb{R}}_{+}\), such that for any open set \(G \subset {\mathbb{R}}_{+}\) and for any closed set \(F \subset \mathcal{X}\) the sets \(\{\omega :\ {X}_{{r}_{i}} \in F,\ {r}_{i} \in G\}\) and \(\{\omega :\ {X}_{t} \in F,\ t \in G\}\) can differ only on a subset of N.

With the help of transition probabilities one can easily give a simple condition that ensures the continuity in probability of the process and, at the same time, the separability property.

Definition 3.58.

A stochastic process \(({X}_{t},\ t \geq 0)\) is called continuous in probability (or stochastically continuous) at the point \({t}_{0} \geq 0\) if for all positive numbers \(\epsilon \) the convergence

$$\mathop{\lim }\limits_{t \rightarrow {t}_{0}}\mathbf{P}\left (\vert {X}_{t} - {X}_{{t}_{0}}\ \vert \ > \epsilon \right ) = 0$$

holds. A process is said to be continuous in probability if it is continuous in probability everywhere.

Theorem 3.59.

If a Markov chain X is locally regular and condition (3.13) is satisfied, then it is continuous in probability.

Proof.

First we check that

$$\delta (h) =\mathop{\sup }\limits_{ k \in \mathcal{X}}(1 - {p}_{kk}(h)) \rightarrow 0,\ h \rightarrow 0 +.$$
(3.17)

Since by the relation in [36, p. 201]

$$\frac{1 - {p}_{kk}(h)} {h} \leq \mathop{\lim }\limits_{ h \rightarrow 0 +} \frac{1 - {p}_{kk}(h)} {h} = {q}_{k} \leq q,$$

then

$$\mathop{\sup }\limits_{k \in \mathcal{X}}(1 - {p}_{kk}(h)) \leq qh \rightarrow 0,\ h \rightarrow 0 +.$$

It is not difficult to see that for arbitrary \(u,h \geq 0\) and \(\epsilon > 0\) we have

$$\begin{array}{rcl} \mathbf{P}\left (\vert {X}_{u+h} - {X}_{u}\ \vert \ > \epsilon \right )& \leq & \mathbf{P}\left (\vert {X}_{u+h} - {X}_{u}\ \vert \ > 0\right ) \\ & =& \sum\limits_{k\in \mathcal{X}}\mathbf{P}\left (\vert {X}_{u+h} - {X}_{u}\ \vert \ > 0\ \vert \ {X}_{u} = k\right )\mathbf{P}\left ({X}_{u} = k\right ) \\ & =& \sum\limits_{k\in \mathcal{X}}\left [1 -\mathbf{P}\left (\vert {X}_{u+h} - {X}_{u}\ \vert \ = 0\ \vert \ {X}_{u} = k\right )\right ]\mathbf{P}\left ({X}_{u} = k\right ) \\ & =& \sum\limits_{k\in \mathcal{X}}(1 - {p}_{kk}(h))\mathbf{P}\left ({X}_{u} = k\right ) \leq \delta (h) \rightarrow 0,\ h \rightarrow 0+,\\ \end{array}$$

which is exactly the continuity in probability of the chain X. □ 

Definition 3.60.

The stochastic processes \(({X}_{t},\ t \geq 0)\) and \(({X}_{t}^{{\prime}},\ t \geq 0)\), given on the same probability space, are said to be equivalent if

$$\mathbf{P}\left ({X}_{t} = {X}_{t}^{{\prime}}\right ) = 1,\ t \geq 0.$$

The following theorem ensures that under the condition of continuity in probability, one can consider the separable version of the original process.

Theorem 3.61.

If a process \(({X}_{t},\ t \geq 0)\) is continuous in probability, then there exists a continuous-in-probability separable version \(({X}_{t}^{{\prime}},\ t \geq 0)\) that is stochastically equivalent to \(({X}_{t},\ t \geq 0)\) .

Theorem 3.62.

If a Markov chain satisfies condition (3.13) , then there exists a separable and stochastically equivalent version of this Markov chain.

Proof.

From Theorem 3.59 it follows that the Markov chain is continuous in probability; therefore, as a consequence of Theorem 3.61, we have the assertion of the present theorem. □ 

We assume later on that condition (3.13) is fulfilled because this condition with Theorem 3.62 guarantees that the Markov chain has a stochastically equivalent separable version. Assuming that condition (3.13) holds, one can bypass the measurability problems that can arise in the case of CTMCs, and the holding times are positive for all states.

Theorem 3.63.

If a homogeneous Markov chain X satisfies condition (3.13) , then X has an equivalent stepwise version.

Proof.

Since condition (3.13) implies Eq. (3.17), by the theorem of [36, p. 281] there exists a stepwise version of the Markov chain that is equivalent to the original Markov chain. □ 

3.5 Poisson Process as Continuous-Time Markov Chain

Theorem 3.64.

Let \(({N}_{t},\ t \geq 0)\) be a homogeneous Poisson process with intensity rate \(\lambda \) , N 0 = 0. Then the process N t is a homogeneous Markov chain.

Proof.

Choose arbitrarily a positive integer n, integers \(0 \leq {i}_{1} \leq \ldots \leq {i}_{n+1}\), and real numbers \({t}_{0} = 0 < {t}_{1} < \ldots < {t}_{n+1}\). It can be seen that

$$\begin{array}{rcl} & & \mathbf{P}\left ({N}_{{t}_{n+1}} = {i}_{n+1}\ \vert \ {N}_{{t}_{n}} = {i}_{n},\ldots ,{N}_{{t}_{1}} = {i}_{1}\right ) \\ & & \quad = \frac{\mathbf{P}\left ({N}_{{t}_{n+1}} = {i}_{n+1},{N}_{{t}_{n}} = {i}_{n},\ldots ,{N}_{{t}_{1}} = {i}_{1}\right )} {\mathbf{P}\left ({N}_{{t}_{n}} = {i}_{n},\ldots ,{N}_{{t}_{1}} = {i}_{1}\right )} \\ & & \quad = \frac{\mathbf{P}\left ({N}_{{t}_{n+1}} - {N}_{{t}_{n}} = {i}_{n+1} - {i}_{n},\ldots ,{N}_{{t}_{2}} - {N}_{{t}_{1}} = {i}_{2} - {i}_{1},{N}_{{t}_{1}} = {i}_{1}\right )} {\mathbf{P}\left ({N}_{{t}_{n}} - {N}_{{t}_{n-1}} = {i}_{n} - {i}_{n-1},\ldots ,{N}_{{t}_{2}} - {N}_{{t}_{1}} = {i}_{2} - {i}_{1},{N}_{{t}_{1}} = {i}_{1}\right )}.\end{array}$$

Since the increments of the Poisson process are independent, the last fraction can be written in the form

$$\begin{array}{rcl} & & \frac{\mathbf{P}\left ({N}_{{t}_{n+1}} - {N}_{{t}_{n}} = {i}_{n+1} - {i}_{n}\right ) \cdot \ldots \cdot \mathbf{P}\left ({N}_{{t}_{2}} - {N}_{{t}_{1}} = {i}_{2} - {i}_{1}\right )\mathbf{P}\left ({N}_{{t}_{1}} = {i}_{1}\right )} {\mathbf{P}\left ({N}_{{t}_{n}} - {N}_{{t}_{n-1}} = {i}_{n} - {i}_{n-1}\right ) \cdot \ldots \cdot \mathbf{P}\left ({N}_{{t}_{2}} - {N}_{{t}_{1}} = {i}_{2} - {i}_{1}\right )\mathbf{P}\left ({N}_{{t}_{1}} = {i}_{1}\right )} \\ & & \quad = \mathbf{P}\left ({N}_{{t}_{n+1}} - {N}_{{t}_{n}} = {i}_{n+1} - {i}_{n}\right ).\end{array}$$

From the independence of the increments \({N}_{{t}_{n+1}} - {N}_{{t}_{n}}\) and \({N}_{{t}_{n}} = {N}_{{t}_{n}} - {N}_{0}\) it follows that the events \(\left \{{N}_{{t}_{n+1}} - {N}_{{t}_{n}} = {i}_{n+1} - {i}_{n}\right \}\) and \(\left \{{N}_{{t}_{n}} = {i}_{n}\right \}\) are also independent, and thus

$$\begin{array}{rcl} \mathbf{P}\left ({N}_{{t}_{n+1}} - {N}_{{t}_{n}} = {i}_{n+1} - {i}_{n}\right )& =& \mathbf{P}\left ({N}_{{t}_{n+1}} - {N}_{{t}_{n}} = {i}_{n+1} - {i}_{n}\ \vert {N}_{{t}_{n}} = {i}_{n}\right ) \\ & =& \mathbf{P}\left ({N}_{{t}_{n+1}} = {i}_{n+1}\vert {N}_{{t}_{n}} = {i}_{n}\right ), \\ \end{array}$$

and finally we have

$$\mathbf{P}\left ({N}_{{t}_{n+1}} = {i}_{n+1}\vert {N}_{{t}_{n}} = {i}_{n},\ldots ,{N}_{{t}_{1}} = {i}_{1}\right ) = \mathbf{P}\left ({N}_{{t}_{n+1}} = {i}_{n+1}\vert {N}_{{t}_{n}} = {i}_{n}\right ).$$

 □ 

It is easy to determine the rate matrix of a homogeneous Poisson process with intensity \(\lambda \). Clearly, the transition probability of the process is

$${p}_{ij}(h) = \mathbf{P}\left ({N}_{t+h} = j\ \vert \ {N}_{t} = i\right ) = \mathbf{P}\left ({N}_{h} = j - i\right ) = \frac{{(\lambda h)}^{j-i}} {(j - i)!}{ \mathrm{e}}^{-\lambda h},\ \ j \geq i,$$

and

$${p}_{ij}(h) \equiv 0,\ \ j < i.$$

If j < i, then obviously \({q}_{ij} \equiv 0\). Let now \(i < j\); then

$${ q}_{ij} =\mathop{\lim }\limits_{ h \rightarrow 0+}\frac{{p}_{ij}(h)} {h} =\mathop{\lim }\limits_{ h \rightarrow 0+}\frac{1} {h} \frac{{(\lambda h)}^{j-i}} {(j - i)!}{ \mathrm{e}}^{-\lambda h} = \left \{\begin{array}{c} \lambda ,\ \ \text{ if}\ \ j = i + 1, \\ 0,\ \ \text{ if}\ \ j > i + 1. \end{array} \right.$$

Finally, let i = j. Using L'Hôpital's rule we obtain

$${q}_{i} =\mathop{\lim }\limits_{ h \rightarrow 0 +} \frac{1 - {p}_{ii}(h)} {h} =\mathop{\lim }\limits_{ h \rightarrow 0 +} \frac{1 -{\mathrm{e}}^{-\lambda h}} {h} = \lambda.$$

Thus, summing up the obtained results, we have the rate matrix

$$Q = \left [\begin{array}{ccccc} - \lambda & \lambda & 0 & 0 &\cdot \\ 0 & - \lambda & \lambda &0 &\cdot \\ 0 & 0 & - \lambda &\lambda &\cdot \\ \cdot & \cdot & \cdot & \cdot &\cdot \end{array} \right ].$$
(3.18)

The Poisson process is locally regular because for all \(i \in \mathcal{X}\)

$$\sum\limits_{j\neq i}{q}_{ij} = \lambda = {q}_{i} < \infty.$$
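
The rates in Eq. (3.18) can also be recovered numerically from the transition probabilities: for a small h the quotients \(({p}_{ij}(h) - {\delta }_{ij})/h\) are already close to the entries of Q. A minimal sketch (plain Python; the intensity is chosen arbitrarily):

from math import exp, factorial

lam = 2.0     # intensity of the Poisson process
h = 1e-4      # small time step

def p(h, k):
    # transition probability p_{i,i+k}(h) of the Poisson process, k >= 0
    return (lam * h) ** k / factorial(k) * exp(-lam * h)

print((p(h, 0) - 1) / h)   # approximately -lambda (diagonal entry of Q)
print(p(h, 1) / h)         # approximately  lambda (entry q_{i,i+1})
print(p(h, 2) / h)         # approximately  0      (entries q_{ij}, j > i + 1)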

3.6 Reversible Markov Chains

Definition 3.65.

A discrete-time Markov process is called reversible if for every state i, j the equation

$${\pi }_{i}{p}_{ij} = {\pi }_{j}{p}_{ji}$$

holds, where \({\pi }_{i}\) is the equilibrium probability of the states \(i \in \mathcal{X}\).

The equation of the definition is usually called a local (or detailed) balance condition because of its similarity to the (global) balance Eq. (3.8) or, more precisely, to its form

$$\sum\limits_{j\in \mathcal{X}}{\pi }_{i}{p}_{ij} =\sum\limits_{j\in \mathcal{X}}{\pi }_{j}{p}_{ji},\ i \in \mathcal{X}.$$

The notion of the reversibility of Markov chains originates from the fact that if the initial distribution of the chain equals the stationary one, then the forward and reverse conditional transition probabilities are identical, that is,

$$\mathbf{P}\left ({X}_{n} = i\mid {X}_{n+1} = j\right ) = \mathbf{P}\left ({X}_{n+1} = i\mid {X}_{n} = j\right ).$$

Indeed,

$$\begin{array}{rcl} \mathbf{P}\left ({X}_{n} = i\mid {X}_{n+1} = j\right )& =& \frac{\mathbf{P}\left ({X}_{n} = i,{X}_{n+1} = j\right )} {\mathbf{P}\left ({X}_{n+1} = j\right )} \\ & =& \frac{\mathbf{P}\left ({X}_{n} = i\right )\mathbf{P}\left ({X}_{n+1} = j\mid {X}_{n} = i\right )} {\mathbf{P}\left ({X}_{n+1} = j\right )} \\ & =& \frac{{\pi }_{i}{p}_{ij}} {{\pi }_{j}} = \frac{{\pi }_{j}{p}_{ji}} {{\pi }_{j}} = {p}_{ji} \\ & =& \mathbf{P}\left ({X}_{n+1} = i\mid {X}_{n} = j\right ).\end{array}$$

In the case of CTMCs, an analogous definition applies.

Definition 3.66.

A CTMC is called reversible if for all pairs i, j of states the equation

$${\pi }_{i}{q}_{ij} = {\pi }_{j}{q}_{ji}$$

holds, where \({\pi }_{i}\) is the equilibrium probability of the state \(i \in \mathcal{X}\).

The reversibility property and the local balance equations are often valid for Markov chains describing the processes in queueing networks (Sect. 10.1); in consequence the equilibrium probabilities can be computed in a simple, so-called product form.
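
The local balance condition is easy to verify numerically for a given chain. The following sketch uses a hypothetical birth-death-type (tridiagonal) transition matrix on four states, a class of chains known to be reversible (NumPy assumed): it computes the equilibrium distribution and checks that \({\pi }_{i}{p}_{ij} = {\pi }_{j}{p}_{ji}\) for all pairs of states.

import numpy as np

# Hypothetical birth-death-type DTMC on {0, 1, 2, 3}
P = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.3, 0.2, 0.5, 0.0],
              [0.0, 0.3, 0.2, 0.5],
              [0.0, 0.0, 0.3, 0.7]])

# Equilibrium distribution: left eigenvector of P for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

# Detailed balance: the matrix with entries pi_i p_ij should be symmetric
F = pi[:, None] * P
print(np.max(np.abs(F - F.T)))   # ~0 confirms reversibility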

4 Birth-Death Processes

Definition 3.67.

The right-continuous stochastic process \(\{\nu (t),\,\,t \geq 0\}\) is a birth-death process if

  1. 1.

    Its set of states is \(I =\{ 0,1,2,\ldots \}\) [that is, \(\nu (t) \in I\)];

  2. 2.

    The sojourn time in the state \(k \in I,k > 0\), is exponentially distributed with the parameter

    $${\alpha }_{k} = {a}_{k} + {b}_{k},\quad {a}_{k},{b}_{k} \geq 0,k > 0,$$

    and it is independent of the trajectory before arriving at the state k;

  3. 3.

    After the state \(k \in I\), \(k \geq 1\), the process visits the state k + 1 with probability \({p}_{k} = \frac{{a}_{k}} {{\alpha }_{k}}\) and state k − 1 with probability \({q}_{k} = 1 - {p}_{k} = \frac{{b}_{k}} {{\alpha }_{k}}\);

  4. 4.

    For the state 0 we consider the following two cases:

    • The process stays an exponentially distributed amount of time in state 0 with parameter \({\alpha }_{0} = {a}_{0} > 0\) and after that visits state 1 (with probability p 0 = 1).

    • Once the process arrives at state 0 it remains there forever (\({q}_{0} = 1,{p}_{0} = 0\)).

\({P}_{k}(0) = \mathbf{P}\left (\nu (0) = k\right ) = {\varphi }_{k},\;k \in I\), denotes the initial distribution of the process.

If \(\{\nu (t),\;t \geq 0\}\) is a birth-death process, then it is an infinite-state continuous-time (time-homogeneous) Markov chain. The parameters a k and b k are referred to as the birth rate and the death rate in the state k, respectively, and k is referred to as the population. The special case where \({b}_{k} \equiv 0\) is referred to as the birth process and where \({a}_{k} \equiv 0\) as the death process.

Let \({T}_{0} = 0 < {T}_{1} < {T}_{2} < \ldots \) denote the time instants of the population changes (birth and death). The discrete-time \(\{{\nu }_{n},\;n \geq 0\}\) process, where \({\nu }_{n} = \nu ({T}_{n})\) is the population after the nth change in population [nth jump of ν(t)], is referred to as the Markov chain embedded in the population changes of \(\{\nu (t),\;t \geq 0\}\). The state-transition probability matrix of the embedded Markov chain is

$$\left [\begin{array}{cccccc} {q}_{0} & {p}_{0} & 0 & 0 & 0 &\cdots \\ {q}_{1} & 0 &{p}_{1} & 0 & 0 &\cdots \\ 0 &{q}_{2} & 0 &{p}_{2} & 0 &\cdots \\ 0 & 0 &{q}_{3} & 0 &{p}_{3} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{array} \right ]\cdot$$

4.1 Some Properties of Birth-Death Processes

The transient state probability, its Laplace transform, and the initial probabilities for \(k \geq 0\), \(t \geq 0\), and Re s > 0 are denoted by

$${P}_{k}(t) = \mathbf{P}\left (\nu (t) = k\right ),\quad {p}_{k}^{{_\ast}}(s) =\int\limits_{0}^{\infty }{\mathrm{e}}^{-st}{P}_{ k}(t)\,\mathrm{d}t,\quad {P}_{k}(0) = \mathbf{P}\left (\nu (0) = k\right ) = {\varphi }_{k}.$$

In special cases, the following theorems hold [69].

Theorem 3.68.

If \({p}_{0} = 1,\;0 < {p}_{k} < 1,\;k \geq 1\) , then the following statements hold:

  1. 1.

    \({P}_{k}(t)\) satisfies the following ordinary differential equations:

    $$\begin{array}{rcl} {P}_{0}^{{\prime}}(t)& =& -{a}_{ 0}{P}_{0}(t) + {b}_{1}{P}_{1}(t), \\ {P}_{k}^{{\prime}}(t)& =& {a}_{ k-1}{P}_{k-1}(t) - ({a}_{k} + {b}_{k}){P}_{k}(t) + {b}_{k+1}{P}_{k+1}(t),\quad k \geq 1.\end{array}$$
  2. 2.

    For \({\varphi }_{k},\,k \geq 0\) , and \(Re\ s > 0\) the following linear system defines \({p}_{k}^{{_\ast}}(s)\) :

    $$\begin{array}{rcl} s{p}_{0}^{{_\ast}}(s) - {\varphi }_{ 0}& =& -{a}_{0}{p}_{0}^{{_\ast}}(s) + {b}_{ 1}{p}_{1}^{{_\ast}}(s), \\ s{p}_{k}^{{_\ast}}(s) - {\varphi }_{ k}& =& {a}_{k-1}{p}_{k-1}^{{_\ast}}(s) - ({a}_{ k} + {b}_{k}){p}_{k}^{{_\ast}}(s) + {b}_{ k+1}{p}_{k+1}^{{_\ast}}(s),\quad k \geq 1.\end{array}$$
  3. 3.

    For \(k \geq 0\) the limits

    $$\lim_{t\rightarrow \infty }{P}_{k}(t) = {\pi }_{k}$$

    exist and are independent of the initial distribution of the process. Let \({\rho }_{0} = 1\) and \({\rho }_{k} = \frac{{a}_{0}{a}_{1}\cdots {a}_{k-1}} {{b}_{1}{b}_{2}\cdots {b}_{k}} ,\;k \geq 1\) . If

    $$\sum\limits_{k=0}^{\infty }{\rho }_{ k} < \infty ,$$
    (3.19)

    then \({\pi }_{k} > 0,\;k \geq 0\) , and

    $${\pi }_{0} ={ \left (\sum\limits_{j=0}^{\infty }{\rho }_{ j}\right )}^{-1},$$
    (3.20)
    $${\pi }_{k} = {\rho }_{k}{\pi }_{0}.$$
    (3.21)

    Otherwise \({\pi }_{k} = 0\) for all \(k \geq 0\) .

Theorem 3.69 (Finite birth-death process). 

Let the state space of \(\nu (t)\) be \(\{0,1,2,\ldots ,n\}\) , p 0 = 1, \(0 < {p}_{k} < 1\) , for \(1 \leq k \leq n - 1\) and p n = 0; then the following statements hold:

  1. 1.

    \({P}_{k}(t)\) satisfies the following ordinary differential equations:

    $$\begin{array}{rcl} {P}_{0}^{{\prime}}(t)& =& -{a}_{ 0}{P}_{0}(t) + {b}_{1}{P}_{1}(t), \\ {P}_{k}^{{\prime}}(t)& =& {a}_{ k-1}{P}_{k-1}(t) - ({a}_{k} + {b}_{k}){P}_{k}(t) + {b}_{k+1}{P}_{k+1}(t),\quad 1 \leq k \leq n - 1, \\ {P}_{n}^{{\prime}}(t)& =& {a}_{ n-1}{P}_{n-1}(t) - {b}_{n}{P}_{n}(t).\end{array}$$
  2. 2.

    If the initial distribution of the process is \({\varphi }_{k} = \mathbf{P}\left (\nu (0) = k\right ),\;0 \leq k \leq n\) , then for \(Re\ s > 0\) the Laplace transforms of the transient state probabilities \({p}_{k}^{{_\ast}}(s)\) satisfy

    $$\begin{array}{rcl} s{p}_{0}^{{_\ast}}(s) - {\varphi }_{ 0}& =& -{a}_{0}{p}_{0}^{{_\ast}}(s) + {b}_{ 1}{p}_{1}^{{_\ast}}(s), \\ s{p}_{k}^{{_\ast}}(s) - {\varphi }_{ k}& =& {a}_{k-1}{p}_{k-1}^{{_\ast}}(s) - ({a}_{ k} + {b}_{k}){p}_{k}^{{_\ast}}(s) + {b}_{ k+1}{p}_{k+1}^{{_\ast}}(s),\;1 \leq k \leq n - 1, \\ s{p}_{n}^{{_\ast}}(s) - {\varphi }_{ n}& =& {a}_{n-1}{p}_{n-1}^{{_\ast}}(s) - {b}_{ n}{p}_{n}^{{_\ast}}(s).\end{array}$$
  3. 3.

    For \(0 \leq k \leq n\) the limit

    $$\lim_{t\rightarrow \infty }{P}_{k}(t) = {\pi }_{k} > 0$$

    exists and is independent of the initial distribution:

    $${\pi }_{j} = {\rho }_{j}{\pi }_{0},\quad {\pi }_{0} ={ \left (\sum\limits_{j=0}^{n}{\rho }_{ j}\right )}^{-1},$$

    where

    $${\rho }_{0} = 1,\quad {\rho }_{j} = \frac{{a}_{0}{a}_{1}\cdots {a}_{j-1}} {{b}_{1}{b}_{2}\cdots {b}_{j}} ,\;1 \leq j \leq n.$$

Theorem 3.70.

The following equations hold.

  1. 1.

    Let \({p}_{0} = 0,\;0 < {p}_{k} < 1,\;k \geq 1\) ; then for \({P}_{k}(t)\) we have

    $$\begin{array}{rcl} {P}_{0}^{{\prime}}(t)& =& {b}_{ 1}{P}_{1}(t), \\ {P}_{1}^{{\prime}}(t)& =& -({a}_{ 1} + {b}_{1}){P}_{1}(t) + {b}_{2}{P}_{2}(t), \\ {P}_{k}^{{\prime}}(t)& =& {a}_{ k-1}{P}_{k-1}(t) - ({a}_{k} + {b}_{k}){P}_{k}(t) + {b}_{k+1}{P}_{k+1}(t),\;k \geq 2, \\ \end{array}$$

    and for \(Re\ s > 0\) and the initial distribution \({\varphi }_{k},\;k \geq 0\) , we have

    $$\begin{array}{rcl} s{p}_{0}^{{_\ast}}(s) - {\varphi }_{ 0}& =& {b}_{1}{p}_{1}^{{_\ast}}(s), \\ s{p}_{1}^{{_\ast}}(s) - {\varphi }_{ 1}& =& -({a}_{1} + {b}_{1}){p}_{1}^{{_\ast}}(s) + {b}_{ 2}{p}_{2}^{{_\ast}}(s), \\ s{p}_{k}^{{_\ast}}(s) - {\varphi }_{ k}& =& {a}_{k-1}{p}_{k-1}^{{_\ast}}(s) - ({a}_{ k} + {b}_{k}){p}_{k}^{{_\ast}}(s) + {b}_{ k+1}{p}_{k+1}^{{_\ast}}(s),\;k \geq 2. \end{array}$$
  2. 2.

    Let \(\nu (t) \in \{ 0,1,2,\ldots ,n\}\) , p 0 = 0, \(0 < {p}_{k} < 1\) if \(1 \leq k \leq n - 1\) , and p n = 0; then for P k (t) we have

    $$\begin{array}{rcl}{ P}_{0}^{{\prime}}(t)& =& {b}_{ 1}{P}_{1}(t), \\ {P}_{1}^{{\prime}}(t)& =& -({a}_{ 1} + {b}_{1}){P}_{1}(t) + {b}_{2}{P}_{2}(t), \\ \end{array}$$
    $$\begin{array}{rcl} {P}_{k}^{{\prime}}(t)& =& {a}_{ k-1}{P}_{k-1}(t) - ({a}_{k} + {b}_{k}){P}_{k}(t) + {b}_{k+1}{P}_{k+1}(t),\quad 2 \leq k \leq n - 1, \\ {P}_{n}^{{\prime}}(t)& =& {a}_{ n-1}{P}_{n-1}(t) - {b}_{n}{P}_{n}(t), \\ \end{array}$$

    and for \({p}_{k}^{{_\ast}}(s)\) , \(Re\ s > 0\) , we have [ \({\varphi }_{k} = \mathbf{P}\left (\nu (0) = k\right ),\;0 \leq k \leq n\) ]

    $$\begin{array}{rcl} s{p}_{0}^{{_\ast}}(s) - {\varphi }_{ 0}& =& {b}_{1}{p}_{1}^{{_\ast}}(s), \\ s{p}_{1}^{{_\ast}}(s) - {\varphi }_{ 1}& =& -({a}_{1} + {b}_{1}){p}_{1}^{{_\ast}}(s) + {b}_{ 2}{p}_{2}^{{_\ast}}(s), \\ s{p}_{k}^{{_\ast}}(s) - {\varphi }_{ k}& =& {a}_{k-1}{p}_{k-1}^{{_\ast}}(s) - ({a}_{ k}+{b}_{k}){p}_{k}^{{_\ast}}(s)+{b}_{ k+1}{p}_{k+1}^{{_\ast}}(s),\quad 2\,\leq \,k\,\leq \,n - 1, \\ s{p}_{n}^{{_\ast}}(s) - {\varphi }_{ n}& =& {a}_{n-1}{p}_{n-1}^{{_\ast}}(s) - {b}_{ n}{p}_{n}^{{_\ast}}(s).\end{array}$$

Comment 3.71.

In Theorems  3.683.70 the differential equations for P j (t) are indeed the Kolmogorov (forward) differential equations for the given systems. The equations for \({p}_{j}^{{_\ast}}(s)\) can be obtained from the related differential equations for P j (t) using

$$\int\limits_{0}^{\infty }{\mathrm{e}}^{-st}{P}_{ j}^{{\prime}}(t)\,\mathrm{d}t = s{p}_{ j}^{{_\ast}}(s) - {P}_{ j}(0).$$

In Theorem 3.70 state \(0\) is an absorbing state. In this way, the theorem allows one to compute the parameters of the busy period of birth-death Markov chains starting from state k (\({\varphi }_{k} = 1\)), where the busy period is the time to reach state 0 (which commonly represents the idle state of a system, where the server is not working, in contrast to the i > 0 states, where the server is commonly busy). Let \({\Pi }_{k}\) denote the length of the busy period starting from state k; then

$${\Pi }_{k}(t) = \mathbf{P}\left ({\Pi }_{k} \leq t\right ) = \mathbf{P}\left (\nu (t) = 0\right ) = {P}_{0}(t)$$

defines the distribution of the length of the busy period, and from Theorem 3.70.1 we have

$${\Pi }_{k}^{{\prime}}(t) = {P}_{ 0}^{{\prime}}(t) = {b}_{ 1}{P}_{1}(t),$$

from which the Laplace–Stieltjes transform of the distribution of \({\Pi }_{k}(t)\), \({\pi }_{k}(s)\), is

$$\begin{array}{rcl}{ \pi }_{k}(s)& =& \int\limits_{0}^{\infty }{\mathrm{e}}^{-st}\,\mathrm{d}{\Pi }_{ k}(t) =\int\limits_{0}^{\infty }{\mathrm{e}}^{-st}{\Pi }_{ k}^{{\prime}}(t)\,\mathrm{d}t \\ & =& \int\limits_{0}^{\infty }{\mathrm{e}}^{-st}{b}_{ 1}{P}_{1}(t)\,\mathrm{d}t = {b}_{1}{p}_{1}^{{_\ast}}(s).\end{array}$$

If the arrival intensity is constant in all states, i.e., \({a}_{k} = \lambda > 0\) (\(\forall k \geq 0\)), then the arrival process is a Poisson process with rate \(\lambda \). Further results on the properties of special birth-death processes can be found, e.g., in [48].
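
As an illustration of Eqs. (3.19)–(3.21), the following sketch computes the stationary distribution of a birth-death process with constant rates \({a}_{k} = \lambda \), \({b}_{k} = \mu \) and \(\lambda < \mu \) (an M/M/1-type example; the rates and the truncation level are chosen only for the illustration, NumPy assumed) and compares it with the geometric form \({\pi }_{k} = (1 - \lambda /\mu ){(\lambda /\mu )}^{k}\) valid in this special case.

import numpy as np

lam, mu = 1.0, 2.0    # constant birth and death rates; condition (3.19) holds since lam < mu
K = 200               # truncation level for the numerical sums

# rho_k = (a_0 a_1 ... a_{k-1}) / (b_1 b_2 ... b_k) = (lam/mu)^k in this case
rho = (lam / mu) ** np.arange(K)
pi0 = 1.0 / rho.sum()       # Eq. (3.20), truncated at K
pi = rho * pi0              # Eq. (3.21)

r = lam / mu
print(pi[:5])                                  # numerically computed probabilities
print([(1 - r) * r ** k for k in range(5)])    # geometric form (1 - r) r^k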

5 Exercises

Exercise 3.1.

Compute the probability that a CTMC with the generator matrix \(\left (\begin{array}{ccc} - 1& 0.5 & 0.5\\ 1 & - 2 & 1 \\ 1 & 0 & - 1\\ \end{array} \right )\) stays in state 1 after the second state transition if the initial distribution is (0. 5, 0. 5, 0).

Exercise 3.2.

Compute the stationary distribution of a CTMC with the generator matrix \(\left (\begin{array}{ccc} - 3& 3 &0\\ 4 & - 4 &0 \\ 0 & 0 &0\\ \end{array} \right )\) if the initial distribution is (0. 5, 0, 0. 5).

Exercise 3.3.

Z n and Y n , \(n = 1,2,\ldots \), are discrete independent random variables. \(\mathbf{P}\left ({Z}_{n} = 0\right ) = 1 - p\), \(\mathbf{P}\left ({Z}_{n} = 1\right ) = p\) and \(\mathbf{P}\left ({Y }_{n} = 0\right ) = 1 - q\), \(\mathbf{P}\left ({Y }_{n} = 1\right ) = q\). Define the transition probability matrix of the DTMC X n if

$${X}_{n+1} = {({X}_{n} - {Y }_{n})}^{+} + {Z}_{ n},$$

where \({(x)}^{+} =\max (x,0)\). This equation is commonly referred to as the evolution equation of a DTMC.

Exercise 3.4.

X n , \(n = 1,2,\ldots \), is a DTMC with the transition probability matrix \(P = \left (\begin{array}{ccc} 3/6&1/6&2/6 \\ 3/4& 0 &1/4 \\ 0 &1/3&2/3\\ \end{array} \right )\). Compute \(\mathbf{E}\left ({X}_{0}{X}_{1}\right )\) and \(corr({X}_{0},{X}_{1})\) if the initial distribution is (0. 5, 0, 0. 5) and the state space is S = { 0, 1, 2}.

Exercise 3.5.

The generator of a CTMC is defined by

$${q}_{0j} = \left \{\begin{array}{ll} \frac{1} {3} & \mbox{ if }j = 1, \\ \frac{1} {3} & \mbox{ if }j = 2, \\ -\frac{2} {3} & \mbox{ if }j = 0,\\ 0 &\mbox{ otherwise}; \\ \end{array} \right.\ \ \ {q}_{ij} = \left \{\begin{array}{ll} \frac{1} {3i} &\mbox{ if }j = i + 1, \\ \frac{1} {3i} &\mbox{ if }j = i + 2, \\ - \frac{2} {3i} - \mu &\mbox{ if }j = i,\\ \mu &\mbox{ if } j = i - 1, \\ 0 &\mbox{ otherwise},\\ \end{array} \right.\mbox{ for }i = 1,2,\ldots.$$

Evaluate the properties of this Markov chain using, e.g., the Foster theorem.

Exercise 3.6.

Show examples of

  • Reducible

  • Periodic (and irreducible)

  • Transient (and irreducible)

DTMCs. Evaluate \(\lim_{n\rightarrow \infty }\mathbf{P}\left ({X}_{n} = i\right )\) for these DTMCs, where i is a state of the Markov chain.

Exercise 3.7.

Two players, A and B, play with dice according to the following rule. They throw the dice, and if the number is 1, then A gets £2 from B; if the number is 2 or 3, then A gets £1 from B; and if the number is greater than 3, then B gets £1 from A. At the beginning of the game both A and B have £3. The game lasts until one of the players can no longer pay. What is the probability that A wins?

Exercise 3.8.

Two players, A and B, play with dice according to the following rule. They throw the dice, and if the number is 1, then A gets £2 from B; if the number is 2 or 3, then A gets £1 from B; and if the number is greater than 3, then B gets £1 from A. At the beginning of the game both A and B have £3. If one of them cannot pay the required amount, then he must give all his money to the other player and the game goes on. What is the expected amount of money A will have after a very long run? What is the probability that B will not be able to pay the required amount in the next step of the game after a very long run?

Exercise 3.9.

There are two machines, A and B, at a production site. Their failure times are exponentially distributed with the parameters λ A and \({\lambda }_{B}\), respectively. Their repair times are also exponentially distributed with the parameters \({\mu }_{A}\) and \({\mu }_{B}\), respectively. A single repairman is associated with the two machines; he can work on only one machine at a time. Compute the probability that at least one of the machines works.

Exercise 3.10.

Let \(X = ({X}_{0},{X}_{1},\ldots )\) be a two-state Markov chain with the state space \(\mathcal{X} =\{ 0,1\}\) and with the probability transition matrix \(P = \left [\begin{array}{cc} a &1 - a\\ 1 - b & b \end{array} \right ]\), where 0 < a, b < 1. Prove that \({P}^{n} = \frac{1} {2-a-b}\Pi + \frac{{(a+b-1)}^{n}} {2-a-b} (I - P)\), where \(\Pi = \left [\begin{array}{cc} 1 - b&1 - a\\ 1 - b &1 - a \end{array} \right ]\) and \(I = \left [\begin{array}{cc} 1&0\\ 0 &1 \end{array} \right ]\).